Entries for June 2005
Knoware, Part I
In my last post I mentioned that I made my decision to continue with my KDE project, Knoware, for the Summer of Code. For those who missed the full description (read it if you get confused), Knoware is a program that will use Bayesian networks to find patterns between Linux system configurations and bug report subscriptions. In this entry I'll provide a little background as well as some more details, plans, and open issues. Since the project has barely started yet, I'm open to input from any direction.
First, some background. I actually envisioned Knoware as a Windows program in high school more than five years ago. At the time I didn't even know Bayes nets existed, just that computers must have the processing power to analyze the thousands of statistics and arrive at a solution. I pictured the interface much like a chat program, since collaboratively working through problem solutions will play a prominent role in the program's success. I even came up with a business plan for it: make it free for users, but provide custom statistics reports to hardware and software companies at a cost (remember, Knoware will know which products are conflicting with one another, the distribution of models and driver versions, and such). I look back on this now and still see it as plausible, but I'm very glad to be developing for a platform I use and enjoy now. More on other aspects of my original vision, like the interface, shortly.
You may have realized that the mention of Windows brings up an interesting point: conceptually, Knoware has very little to do with KDE or even the Linux platform. It could easily be made to work with Windows or Mac OS, and focus on their respective bugs. If it turns out to work as well as expected, you can be sure versions will pop up on other platforms.
The current limitation to Linux and KDE may actually be a benefit in finding accurate statistics. Sure, simultaneously opening up the program to Windows and Mac users would result in a larger database. The problem is that the bug reports and system configurations would be much less focused. The extra input does no good if it's for a completely different problem and coming from a completely different system. If all input configurations have Linux and KDE in common, this will help the users focus on more of the same problems instead of completely unrelated ones.
The issue has been raised that it may be hard to gain enough users initially, making it difficult to draw useful conclusions. I'm pretty confident when it comes to this issue. Take for example the site KDE-Forum.org. This forum has more than 8,000 registered users, and this is surely a small fraction of KDE users — the English-speaking vocal minority. If just 2.5% of that vocal minority uses Knoware to report a specific problem, that's 200 configurations to analyze. I'm no expert on Bayes nets (yet), but to me that sounds like a fairly significant data set. And in reality, of course, I picture many more users giving Knoware a try.
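The back-of-the-envelope estimate above is easy to play with in a few lines of Python. This is only a sketch: the function name is made up for illustration, and the extra participation rates are hypothetical values rather than measurements.

```python
def expected_reports(registered_users, participation_rate):
    """Estimate how many configurations would be submitted for one problem."""
    return round(registered_users * participation_rate)

# Figures from the paragraph above: 8,000 forum users, 2.5% participation.
print(expected_reports(8000, 0.025))  # 200

# Hypothetical alternative rates, just to see how the data set scales.
for rate in (0.01, 0.025, 0.05):
    print(f"{rate:.1%} -> {expected_reports(8000, rate)} configurations")
```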
One aspect that will help draw in participants is the user interface. My mentor for this project told me to concentrate on this area first, since it will ultimately be the first thing to get a user's attention. In my proposal, I scheduled the user interface toward the end of the summer, so already I've learned something new (about KDE development, at least). Earlier I mentioned that in the past I pictured the user interface much like a chat program. This has changed significantly in my mind, since collaborating with other participants is something the user will only be doing some of the time. The two other main tasks will be scanning — that is, gathering information about the user's system — and browsing for relevant bugs.
Scanning the system is an automated process, but requires the user to specify the level of granularity they will provide, since there is a chance that this information will be made available to other users. It's also possible that the user will need to input some information manually, in case the scanner can't determine an important aspect of the user's system.
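To make the granularity idea concrete, here is a minimal sketch of how a scan result might be trimmed before it is shared. Everything in it is an assumption for illustration: the level names, the field names, the sample values, and the `shareable` helper are hypothetical and not part of any existing Knoware code.

```python
# Hypothetical granularity levels, from least to most revealing.
LEVELS = {"minimal": 0, "standard": 1, "detailed": 2}

# Hypothetical scan result: field -> (lowest level at which it is shared, value).
SCAN = {
    "kernel": (0, "2.6.11"),
    "kde_version": (0, "3.4"),
    "gpu_model": (1, "ExampleCard 512MB"),
    "installed_packages": (2, ["kdelibs", "python"]),
}

def shareable(scan, level, manual=None):
    """Return only the fields the user agreed to share at the given level.

    `manual` holds values the user typed in for things the scanner
    could not determine; they are added on top of the scanned fields.
    """
    allowed = LEVELS[level]
    profile = {
        field: value
        for field, (min_level, value) in scan.items()
        if min_level <= allowed
    }
    profile.update(manual or {})
    return profile

print(shareable(SCAN, "minimal"))
print(shareable(SCAN, "standard", manual={"sound_card": "ExampleSound 16"}))
```

Keeping the sharing level next to each field means the scanner can always collect everything locally, while the user decides how much of it ever leaves the machine.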
Browsing for bugs will hopefully be made easy through searching, live filtering, tagging, and other new trends in navigation. Since the detailed bug reports will need to be created at some point, this process is not just read-only. The interface should therefore assist the user in proper naming and categorization of the problem.
Lastly, I'll discuss possible integration with existing projects. As far as I know, there's nothing out there right now (publicly, at least) that aims to accomplish the same thing as Knoware. However, there are a few projects that can assist in its goal. KDE has an existing bug tracking system using Bugzilla, and a graphical front-end for it called KBugBuster. While I hope to use more metadata to make bug browsing easier, there's no sense in completely reinventing the wheel, so this is a likely project for me to turn to for integration. Additionally, KDE has a crash handler that tries to make crashes a little more friendly. While crashes are only a tiny fraction of the problems I encounter, I won't rule out the possibility of submitting crash reports to Knoware through this program (much like the Windows crash handler).
Hopefully that puts to rest any questions or doubts out there. As I said, input is welcome during any phase of development.
Summer of Code: Pre-Project Developments
Since the spring semester ended and my number of commitments promptly fell close to zero, my e-mail inbox has been mostly empty. But over the past few days, it's been a challenge to keep up with the swarm of traffic going on due to Google's Summer of Code program. This is a dream come true for people like me who check their e-mail every 30 seconds.
In a previous posting I mentioned that I was leaning towards my KDE proposal as my selected Summer of Code project. At the time of writing, I had already been in contact with the KDE folks for a couple of days, and hadn't received a single message from the Python guys. The selection process inevitably requires crushing some dreams, and the lack of contact with the nice folks at the PSF made my early decision easier.
What a surprise it was, then, when I received a message from one of the KDE mentors telling me that they had already been exchanging messages with the Python guys, who really wanted to see the proposal brought to life. Google had apparently instructed the organizations to 'fight' among themselves for duplicate acceptances. After finally receiving messages from Ian Bicking and David Ascher (this is not to their discredit — the KDE guys were just fast), I wasn't so sure which project to select anymore.
On one hand, I felt a little selfish selecting my own project over the Python one simply for the reason that it excited me more, even with more risk involved. On the other hand, I think a lot of other people are qualified to work on the Python project, and I also think Knoware will directly benefit more people in the end. Since my decision was not swayed, I hope it doesn't look like I've just been stubborn all along — I looked very closely at the goals laid out in my project schedule, and I see no reason not to continue. Furthermore, I plan to use Python for a significant part of my KDE project, and it directly lines up with another proposal accepted by the PSF. So there's something in it for the Python guys as well. Who'd have thought that after all this time, I'd have to turn down the organization and community that has been the biggest part of my life as a programmer for the past year?
Patty brought up a good point last night. After about a month of being unemployed and exploring possible ways to make money, it's funny that now I'm stressing out over my options of doing something completely awesome and doing something just slightly less awesome. Funny how things change...
I'm sending my final decision out to the mentoring organizations shortly. This means that the PSF will get to choose another proposal, which must be exciting for them as well as one lucky participant.
Happenings
Word on the street is no one from outside the cwru.edu domain can connect to Blog@Case pages right now. What's the deal, admins? I've had multiple people tell me the domain won't resolve at different times today.
Edit: When I said 'no one,' I meant 'some people.' And the issue still stands, as far as I can tell.
Anyway, something extremely cool happened to me earlier. I got an e-mail from Stephan Deibel over at Wingware, makers of the best damn Python IDE around. I use it whenever I can — problem is, it's not free software, so I'm always stuck with their trial version (which expires). Stephan's e-mail said they were offering free 3-OS Wing IDE Professional licenses to folks whose Python Software Foundation proposals were selected for Summer of Code, and that these licenses would even last beyond the summer. Even after letting him know that I was most likely accepting the KDE bounty instead of the Python one, he still gave me a free license! A quick trip over to the Wingware store reveals that the package they're offering is worth $395!
So I'd like to thank Stephan and the rest of the folks over at Wingware. You guys truly care about the Python community. Besides the Summer of Code program, they also offer educational and open source discounts for various types of development.
Edit 2: Well what do you know, I just realized that Stephan is also on the board of directors of the Python Software Foundation, and is its chairman.
Summer of Code Projects
As I mentioned in my last posting, I have the option of participating in the development of either of two Summer of Code proposals. Before I reveal the projects, it might help to put them into context with some information about the mentoring organizations: KDE and the Python Software Foundation. KDE is a desktop environment for Linux with a very complete software suite. I've tried out nearly two dozen window managers and desktop environments, and KDE definitely has something special going on. (Okay, for my brief GNOME-bashing on Alex's blog, I must say that I've learned to appreciate GNOME as well. I'm actually using it right now to write this up.) Python is my favorite programming language — what more can I say? I'm extremely productive with it.
On to the projects. First, the KDE proposal. On the KDE bounties page there's an entry for a "visionary application addition," awarded to innovative proposals; this was the bounty I was selected for. The idea is for a program called Knoware where users voluntarily contribute information about their system's configuration (hardware, installed packages, etc.) and then subscribe to bugs or problems they're experiencing (it doesn't have to be a bug; any fixable annoyance counts). This information is kept on a server, which uses a system called Bayes Nets (short for Bayesian Networks: here are short and long descriptions, but basically, inductive logic) to identify patterns between bug subscriptions and system configurations. If enough people participate, Knoware could quickly track down bugs that would otherwise take forever to identify. In my experience, a vast number of problems in Linux arise from the practically unlimited number of possible hardware configurations people have. Often people turn to forums for tech support, where users generally don't provide enough information about their problem and others spend forever probing them about their system. There are a lot more details I could go into about this project, but I'll save them for a later posting. For example, the client isn't just a passive application that gathers system information. It also networks you with other users who are subscribed to similar problems, or have similar hardware configurations, so that you can collaboratively work through possible solutions (and log your progress for others to see).
If that description confused you, consider this example: everyone who subscribes to the 'Program Y crashes when I do this' bug has an 'X' brand graphics card. Knoware will recognize this pattern and then examine more details like the card model, specifications, drivers, etc. For instance, it could eventually determine that only 512MB 2nd generation cards with driver version 2.5 cause this crash, and this gives developers a starting point for fixing the problem.
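Here is a toy version of that example in Python. To be clear about what it is: this is plain conditional-frequency counting, not a real Bayesian network, and it only illustrates the kind of pattern the server would be looking for. The reports, field names, and the `bug_rate_by_feature` function are all made up for illustration.

```python
from collections import Counter

# Hypothetical reports: (system configuration, set of subscribed bugs).
reports = [
    ({"gpu_brand": "X", "gpu_driver": "2.5"}, {"program-y-crash"}),
    ({"gpu_brand": "X", "gpu_driver": "2.5"}, {"program-y-crash"}),
    ({"gpu_brand": "X", "gpu_driver": "2.4"}, set()),
    ({"gpu_brand": "Z", "gpu_driver": "1.0"}, set()),
    ({"gpu_brand": "Z", "gpu_driver": "1.1"}, set()),
]

def bug_rate_by_feature(reports, bug):
    """Estimate P(bug | feature == value) for every feature value by counting."""
    seen = Counter()      # how many systems have this (feature, value)
    with_bug = Counter()  # how many of those are subscribed to the bug
    for config, bugs in reports:
        for item in config.items():
            seen[item] += 1
            if bug in bugs:
                with_bug[item] += 1
    return {item: with_bug[item] / seen[item] for item in seen}

rates = bug_rate_by_feature(reports, "program-y-crash")

# List the feature values most strongly associated with the bug first.
for item, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(item, round(rate, 2))
```

In this made-up data, driver version 2.5 comes out on top, ahead of the 'X' brand as a whole, which is the "narrowing down" step described above. A real Bayes net would also model how the features depend on one another instead of treating each one separately.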
My other submission was picked up from the Python Web Programming Ideas page. I proposed the addition of a comments and annotation system to the Python standard reference, much like what is seen in PHP's documentation. Oddly, I've had multiple conversations with people who agree that the PHP documentation is often made worse by the comments system. My proposal included the use of Ajax to provide abilities like live editing and filtering, a rating or categorization system to prevent abuse, and other additions. Okay, the comments thing is kind of boring, but the annotation part I am excited about, and they are two different things. If you take a look at PHP's documentation, you see that all the user-contributed notes are at the bottom of the page, like any other comments system. Annotation, on the other hand, is meant to sit alongside a specific block of text on the page. For example, many textbooks feature in-margin annotation to identify terms or concepts. Since it can be very easy to disrupt the flow of a page with extraneous text, especially on a web page, I'm looking forward to trying out different solutions such as CSS tooltips, in-margin annotation, and in-place footnote links. Assuming, of course, that I accept this bounty.
Right now I'm leaning heavily towards the Knoware proposal. My reasoning is that it's my own idea and I have a strong interest in seeing it done right and by my own hand. I'll go into more detail about Knoware and other Summer of Code happenings in future postings.
Google Summer of Code!
So the responses have finally been sent out to all of Google's Summer of Code applicants. After more than a week of stressing out, two (out of three) of my proposals were accepted! Decisions, decisions...
More info lata.
