Project and Thesis Ideas

Project and Thesis Ideas - Jim Diamond

(Last updated September 4, 2017)

The topics listed here are tagged as follows:

-> H - suitable as an honours thesis;

-> M - suitable as an masters thesis;

-> P - suitable as a COMP 4983 project; and

-> A - possibly suitable as any, by adjusting the scope.

If you would like to find out more about any of these projects, send me an e-mail message and we can discuss the project via e-mail or arrange a meeting in person, as desired.

P: NOTE: this is a two-person project, due to the amount of work required. If you are interested, *you* need to find another project student that you can work with *from beginning to end*, and be prepared to live with your decision.

I have not been able to locate a good standalone calendar program for Linux. (And by calendar program I mean a program with which you can schedule your appointments, classes, and other events, not something which shows you the days of a month.) The Thunderbird calendar program Lightning isn't bad, except for the fact that it is bloatware (300 MB of RAM (that's RSS, not VSS, for those of you who paid attention in OS class) to show a few appointments?!). The KDE program Korganizer isn't too bad, except that it depends on a bunch of associated KDE machinery, which turns it into bloatware. There is (was) a stand-alone program called Sunbird which eventually turned into the Lightning plugin for Thunderbird; it is pretty good from both the UI and functionality points of view, but the code base is a bit archaic and difficult to compile, thus fixing its shortcomings is not an easy task.

I would like to have a calendar program which (at least) combines the best features of these calendar programs and is designed in such a way that it is possible to maintain and improve the program. My first idea on the "team" aspect of this project would be that one person designs and implements the user interface (probably in Qt5 or gtk+3) and the other person designs and implements the back-end capability (in C or (*gag*) C++). The back-end job is probably larger, so the UI person might have to help out a bit there. The program would have to be compatible with standards such as ical and CalDav, and would need to be able to use remote calander servers as well as to be able to function intelligently when off-line.

If you and a friend are interested in this come and talk to me for more details. Before doing that, you should try out Korganizer and Lightning for a while to understand what you are signing up for.

P: I have been working on a project with the School of Music to network pianos together to provide the ability to give piano lessons or examinations remotely (see http://musicpath.acadiau.ca or this TV commercial). I have an implementation which uses a Java GUI and an implementation of the rest of the system in C, currently running on Linux. There has been some interest in an implementation of this system for M$-windows. This is not a trivial task, but should be a reasonable amount of work for someone proficient in developing C code in that environment.

A port of this program under cygwin has been mostly done, but I would prefer a native windows version. However, the lessons learned from the cygwin port should facilitate the "native" port.

P: I have been working on a project with the School of Music (are you getting a "deja vu" feeling right about now?) to network pianos together to provide the ability to give piano lessons or examinations remotely (see http://musicpath.acadiau.ca or this TV commercial). I have an implementation which uses a Java GUI and an implementation of the rest of the system in C, currently running on Linux. Every time Oracle brings out a new version of Java, something in the GUI breaks or otherwise doesn't work well, and I have become tired of this game. I would like a re-implementation of the GUI in QT5 or GTK3 to fix current bugs and avoid new ones when Java 9 rolls out.

If this is of potential interest to you, come and see me and I will show you the current GUI and give you an overview of the program which the GUI controls.

A: I have had two honours students develop systems for creating drawings of graphs ("Grapha" and then "Graphic"). They can be found by visiting http://graph-drawing.acadiau.ca. Both of these systems allow the user to pick a graph in a known "family", adjust some parameters, and create an aesthetically-pleasing drawing of the graph. However, one might like to take a description of a graph and create a drawing of it which meets some aesthetic criteria. There are a number of systems out there now, all of which do some things well and others not so well.

I would like to develop an "all-singing, all-dancing" graph drawing system. This would involve taking the work done by the previous students as well as collecting together open-source graph drawing programs and amalgamating them into a unified system. This is likely to be an on-going project, so a key point of the design would be extensibility.

This project is probably too large for a single 4983 project, but if two (or three?) 4983 students wanted to work together on it, that would be OK. It would make a large thesis project, but a proficient programmer could make a good start at this.

If you would like to find out a bit more, you can first d/l and compile Graphic and play around with it a bit, and then you can e-mail me and I'll send you a list of improvements/additions I'd like done. Also, the above web page links to the two respective honours theses, so you can read a bit more about the problem and these two solutions.

A: Review and implement approximation algorithms for NP-complete graph theoretical problems. This is very open-ended, and could be a compilation of algorithms (packaged in a coherent way) for COMP 4983, up to inventing, analyzing and/or testing new algorithms for an honours or master's thesis. Graph colouring is one such problem that I am interested in.

P: Data compression testbed. Create a library of data compression algorithms and a driver program to try them out. This could involve collecting good open-source implementations of compression algorithms and implementing some special-purpose compression algorithms that I can suggest.

P: Enhance an open-source project. The enhancements could be removing bugs or spurious warning messages, or by adding features to the project. I used to say that "gnumeric", "abiword", and "korganizer" were possibilities that interest me, but they have matured a bit, and I am always open to other (mutually interesting) options. As an archaic example, abiword used to handle (some) password-protected documents that OpenOffice.org 1.0.1 did not, but OpenOffice handled some things that abiword didn't. Another possibility is the "JES" system which is currently used by COMP 1113 and COMP 1893; there are a number of issues just waiting for someone to solve them and contribute code. (See https://github.com/gatech-csl/jes/.)

A: I am open to ideas for projects or theses in any area of graph theory, particularly graph theoretic algorithms. (I am open to co-supervision, when appropriate; I have co-supervised one honours thesis and one masters thesis with Dr. Nancy Clarke of the mathematics department.)

A: I could possibly be persuaded to supervise or co-supervise the right project in the area of cryptography. (I have co-supervised two honours theses with Dr. Jeff Hooper of the mathematics department.)

H or M: Lempel-Ziv data compression uses a greedy approach when choosing the next string to output; that is, it tries to match the next portion of the input with the longest string currently in the dictionary. This is not an optimal approach (in general, depending on how the output token for each match is encoded), although it is efficient (from a CPU time point of view).

A previous honours student (using a dynamic programing approach) showed that the optimal set of tokens could be generated when the token encoding is known a prioiri, and that this same technique could be employed to improve the compression when the token encoding is only known as or after the tokens are generated.

This project was a proof of concept and is very compute-inefficient. It might be interesting to review the technique and to see if it can be implemented efficiently, or whether this idea can be employed in other LZ-style (or maybe even non-LZ-style) compression algorithms.

H (or part of a master's thesis): When compressing a sound file, typically the file is divided into a sequence of relatively small blocks (e.g., 2K sound samples per block). These blocks are then compressed independently of each other. There are offsetting constraints which work against making the blocks either too short or too long.

A master's student examined the issue of dynamically adjusting each block's length to improve the overall compression ratio. Some gains were achieved with the technique, which then suggested comparing her results to the best possible. A "better than brute force, but still hugely expensive" algorithm was implemented. This algorithm is prohibitively slow for daily use, but finds the optimal blocking (given a few constraints). (The BFI (brute force and ignorance) algorithm would be exponential in the size of the sound file, and a dynamic programming approach was used to reduce the compute time O(N²), where N is the largest number of blocks possible.

Someone who is interested in algorithms could try to discover a more efficient way of finding the optimal blocking. This would require learning a tiny bit about audio data compression, but would mostly be an investigation into finding a better algorithm for this problem.

H or M: The "standard" audio compression algorithms use a linear function to predict the next audio sample, based upon the previous N samples, for some relatively small N. Linear functions are used because there is a relatively straightforward way to compute a linear function which has some desirable properties. However, there is no reason why other approaches might not work. Someone interested in numerical computation could look at techniques for generating non-linear functions with these desirable properties. Alternatively, someone interested in AI might investigate the use of machine learning techniques for developing prediction functions.

A: I am open to other ideas in the area of data compression. Text and audio compression are my main interests, but I could possibly be persuaded to supervise a project in video or image compression as well.

H or M: You might think you know what time it is, but don't be so sure! There is a (Unix) program called "ntpdate" which will set your system's clock to an approximation of the current time, using some "time server" as a time standard. There is also a program called "ntpd" which runs continuously, attempting to keep your system clock synchronized with one or more time servers. The documentation of ntpd suggests that very small changes are made to the system clock, which might imply that the system clock should asymptotically approach the same time as the server's clock, with very small variations. The behaviour of this program on Linux seems to be a bit peculiar: some quick tests (in 2003 or 2004, admittedly) indicated that the system time seems to repeatedly slowly drift away from the server time and then take a relatively large jump to get back. Investigating this phenomenon would require being able to get the system time of a remote computer with high accuracy (which is problematic because of network and OS delays) and then to track the differences between these two clocks. Having characterized the differences, it would be interesting to see if ntpd and/or Linux could be modified to avoid having large jumps in the client's system clock.

P: Linux kernels have a number of tools for configuring the system for your particular hardware choices. My favorite of the lot, xconfig, has (at least) one annoying "feature". Namely, if you are not allowed to select option A because you need to select or de-select some combination of other options, it does not provide you with any easy way to turn those other options on or off. Currently, it displays a Boolean formula indicating A's dependencies, and you then have to figure out (a) which ones you need to adjust, and (b) which screen you need to navigate to in order to adjust these options.

This project would probably build on the existing code and GUI to give the user the chance to easily select or de-select whichever options he chooses. Of course, changing one dependency might cause other dependencies to need changing, so a tree of dependencies might be created. This information has to be displayed to the user in some human-friendly way.

In principle, this xconfig replacement could be re-written from scratch, but that might be too much work for a 4983 project. Also, anyone considering this should realize that this project has to be done in C or C++; don't even suggest Java unless you're trying to make me laugh MAO. And if you don't know why I'd laugh, this project is not for you.

P: "The Great Canadian Rip-Off". In 2005, when evaluating a new text book for COMP 2213, I visited the publisher's US web site and saw that the suggested list price was US$75, which was about $89 at the time. Later, I was amazed to learn that the publisher (Oxford University Press) was charging $150 in Canada. After I sent an unambiguous e-mail to the publisher's representative expressing my displeasure (i.e., a "nastygram"), they lowered their price (but apparently only at Acadia!), but they still were charging way too much for the book in comparison to the US price.

Some textbooks have Canadian prices which are equivalent (plus or minus a few percent) to the US price. I'm interested in taking a survey of a large number of books and comparing the Canadian and US prices. One way to do this is to pick two or three large book sellers in both countries and check the prices on their web sites. This comparison should be done a number of times over a few weeks (or more) to avoid being misled by promotions.

This project would need to develop some software which would automatically retrieve prices for books from a number of web sites. A minimal version of the project would then store these prices and, after the survey was done a few times, report on which books had the worst price discrepancies. A better implementation would increase the automation of the project so that this project could run more-or-less independently and create periodic reports identifying the worst cases it finds. Better yet would be to identify publishers who are frequently found to be over-charging Canadians.

Normally I am vehemently opposed to 4983 projects being done by more than one person. In this case, it is possible that a really slick implementation is too much work for one person. I would accept a team of two people, but we would need to meet and discuss the ground rules. However, I would expect a two-person team to make an all-singing, all-dancing version of this project.

Keywords: rip-off, highway robbery, shameless, profiteering.

A: I am open to supervising other project and thesis topics. If you have something in mind come and see me to confirm whether or not it is something suitable for me to supervise.

Back to Jim Diamond's home page

Back to the Acadia School of Computer Science projects page

File translated from T_EX by T_TH, version 4.03.
On 4 Sep 2017, 15:19.