The Eudyptula Challenge is a series of programming tasks designed to get you up to speed on Linux kernel programming. When I first heard about it, it immediately intrigued me. I’ve written a few production Linux kernel modules in my time, mostly device drivers, so I started the challenge today.
Real-time (or near real-time) data pipelines are all the rage these days, and they are becoming key components of many SaaS platforms; I’ve built one myself. SaaS analytics, operations, and business-intelligence systems often involve moving large amounts of data, received over the public Internet, into complex backend systems, and managing that incoming flow of data is crucial.
When running a large real-time processing system, monitoring is critical. But it does more than let you keep an eye on your system: during development it allows you to test hypotheses about how the system works and how it performs when certain parameters are changed, and it takes the guesswork out of working with dynamic systems.
Storm, a distributed real-time computation framework open-sourced by Twitter, is one such system, and it comes with a spout that allows messages to be streamed from a Kafka broker.
Cassandra is an open-source, distributed database, informally known as a NoSQL database. It is designed to store large amounts of data, offer high write performance, and provide fault tolerance. I recently needed some hands-on experience with Cassandra, and, being relatively new to Java programming, I wanted a simple set-up with which I could experiment.
My experience with Fedora 15 was not what I had hoped. The ATI graphics driver was particularly problematic (regular minute-long hangs due to spinlock issues), so I decided to try a completely different distribution. I went with Kubuntu 11.04, and performance has been excellent.
I came to Django development from much lower-level work: embedded software, device drivers, and system software. What has impressed me most about Django (and Python in general) is the way it guides you toward doing the right thing in terms of code construction. The framework and the language naturally make you think about better ways to express your designs.
expect is a tool, built on Tcl, that lets you automate tasks that would otherwise mean tedious repetition at the command line. Many tools come with a command-line interface but don’t lend themselves well to scripting; telnet is the classic example. With expect, you can script these tools as easily as you can with bash.
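As an illustration, here is a minimal expect script for that classic telnet case. The host (a documentation-only IP), user name, password, and prompt are all invented placeholders:

```tcl
#!/usr/bin/expect -f
# Automate a telnet session: wait for each prompt, then send the response.
# Host, credentials, and the "$ " shell prompt below are placeholders.
set timeout 10
spawn telnet 192.0.2.10
expect "login:"
send "admin\r"
expect "Password:"
send "secret\r"
expect "$ "
send "uptime\r"
expect "$ "
send "exit\r"
expect eof
```

Because expect waits for each pattern before sending the next line, the script tolerates variable response times, and the timeout guards against an unresponsive session hanging forever.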
Valgrind comprises a suite of very useful tools for detecting problems in your programs. I first came across it a couple of years back and find it excellent. In particular I use its memory checker, Memcheck, which helps you catch errors such as memory leaks and invalid accesses. In my experience these types of errors sometimes indicate logic errors, not just places where you’ve forgotten to free previously allocated memory, which is another reason why it is such a great tool.
I cannot praise the revision-control tool git highly enough, and I often use it as a buffer between SVN and me. Much of my professional workflow involves fixing a bug here and a bug there: lots and lots of small changes across many different branches. git is the perfect tool for this kind of work. And it is fast.
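A sketch of that branch-per-fix workflow, with repository, branch names, and identity all invented; the git-svn commands at the end are how the SVN buffering works:

```shell
# One throwaway repo to demonstrate the branch-per-fix workflow.
git init demo
cd demo
git config user.email "me@example.com"   # placeholder identity
git config user.name  "Me"
git commit --allow-empty -m "baseline"

# Each small fix lives on its own topic branch:
git checkout -b fix/null-deref
echo "guard added" > fix.txt
git add fix.txt
git commit -m "guard against NULL input"

git checkout -                   # back to the main line
git checkout -b fix/off-by-one   # next fix, next branch

git branch                       # both topic branches, ready to merge or rebase

# With an SVN upstream, git-svn keeps the two in sync:
#   git svn clone <svn-url>   # import the SVN history into a git repo
#   git svn rebase            # pull in new SVN revisions
#   git svn dcommit           # push local commits back to SVN
cd ..
```

Branches in git are cheap local pointers, so juggling a dozen in-flight fixes costs nothing, which is precisely what makes it a comfortable buffer in front of a slower SVN server.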
I got around to installing Yellow Dog 6.1 from a DVD of the full distro. The installation went OK, and the installer fired up in graphical mode. However, it proceeded to create the swap partition almost immediately because of low-memory concerns.
In between bouts of Wipeout HD, I net-installed the English-language 64-bit PowerPC build of Fedora Core 12 on my 80GB PS3. Installation with PetitBoot didn’t present any problems, though audio didn’t seem to work. However, FC12 is quite slow on my PS3, so I’m not going to use it; it seems to be paging to disk a lot.
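That suspicion is easy to confirm on any Linux box (this is a generic check, nothing PS3-specific) by looking at swap usage in /proc/meminfo:

```shell
# If SwapFree is well below SwapTotal, the system is dipping into swap.
grep -E 'SwapTotal|SwapFree' /proc/meminfo

# For live paging activity, watch the si/so (swap-in/swap-out) columns:
#   vmstat 1
```

Sustained non-zero si/so values while the desktop feels sluggish is the classic signature of a machine, like a 256MB PS3, that simply doesn’t have enough RAM for the workload.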