This past week I had the opportunity to speak, with my colleague Jim Nisbet, at AWS re:Invent 2013. Titled “Unmeltable Infrastructure at Scale: Using Apache Kafka, Twitter Storm, and Elastic Search on AWS“, Jim and I described the architecture of Loggly’s next-generation log aggregation and analytics Infrastructure, which went live 3 months ago, and runs on AWS EC2.
Loggly recently held an elasticsearch meetup, which was a great success. One question that was repeatedly asked was how to ensure elasticsearch does not suffer a partition — known as a split-brain. This can be a particular problem in AWS EC2, where the network is subject to interruptions. It can also happen if the elasticsearch master node performs long garbage collection cycles.
One configuration that is very effective at preventing this problem is described in this post.
After 14 months of hard work, the next generation of Loggly has been released. It’s been a great time to be part of the Software Infrastructure team at Loggly and we have put together a superb log aggregation & real-time analytics platform.
As technical lead at Loggly, responsibility for a well-engineered infrastructure ends with me. And one way to ensure the system is designed and implemented well is to stay as close as possible to the code, ensuring that the team and I write quality software.
But it can be difficult to complete the design and implementation of the features I am responsible for, ensure that what the team produces is well-implemented, and understand every line of code — there is only so much time in the day.
When running a large real-time processing system, monitoring is critical. But it does more than allow you to keep an eye on your system. During development it allows you test hypotheses about how it works, how it performs when certain parameters are changed, and takes the guessing out of working with dynamic systems.
The Boost ASIO Library is a wonderful piece of software. I’ve built high-performance event-driven IO C++ programs that just scream — it works very well. However, there is one subtlety when it comes to timers — specifically when it comes to cancelling expired timers.
Cassandra is an open-source, distributed database, informally known as a NoSQL database. It is designed to store large amounts of data, offer high-write performance, and provide fault-tolerance. I recently needed some hands-on experience with Cassandra, and being relatively new to Java programming, needed a simple set-up with which I would experiment.
I finally moved to mutt for my Loggly e-mail (which runs on Google Mail). After moving from e-mail client to e-mail client, I was keen to give it a try — the minimalist design and speed really appealed.
It took a little while to get it just right, but it’s up and running now. I’m pretty happy with it so far, and might consider using it for my personal Yahoo! Mail.
You can find my .muttrc file here.
After almost 5 years at Riverbed Technology, it’s time for new challenges. I’ve started a new development position at Loggly in San Francisco, helping to build their Cloud-based Logging-as-a-Service platform.
I spent significant time at building systems that needed comprehensive logging support. But it’s something that developers don’t need to worry about — let others do it for you.
Dogfood testing is an effective way to increase testing, and get valuable feedback, on one’s products. It can be especially effective in the earlier stages of a product’s development, when the user base can be small. Having a forgiving — and sometimes captive — audience provides very useful feedback.
My experience with Fedora 15 was not as I had hoped. The ATI graphics driver was particularly problematic (regular minute-long hangs due to spinlock issues) so I decided to try a completely different distribution. I decided to go with Kubuntu 11.04 and performance has been excellent.
CPU emulation, particularly of older processors, is an interesting topic.
While emulation source code for various CPU cores is easily available, I wanted to better understand how to interface the emulated CPU with my host machine. Therefore I decided to write a simple example of a host system for an emulated MOS Technology 6502 microprocessor.
The goal would be to have the emulated 6502 write “Hello, world” to the console of my linux desktop machine.
While the install went OK, I am not (yet) convinced it was worth the hassle. While it’s nice to pick up bug fixes, I haven’t noticed much change to the feature set.
Software developers, such as myself, can get very involved with a single part of the new software on which we are working, sometimes losing perspective. Helping customers deploy and test pre-release software helps us to design better products when we are reminded we need to create a solution, not just a box running some clever software.
You can read my latest post here.
I came to Django development from much lower-level development — embedded software, device drivers, and system software. What has impressed me most about Django (and python in general) is the manner in which it guides you to do the right thing in terms of code construction. The framework and language naturally make you think about better ways to express your designs.