Loggly recently held an elasticsearch meetup, which was a great success. One question that was repeatedly asked was how to ensure elasticsearch does not suffer a partition — known as a split-brain. This can be a particular problem in AWS EC2, where the network is subject to interruptions. It can also happen if the elasticsearch master node performs long garbage collection cycles.
One configuration that is very effective at preventing this problem is described in this post.
Continue reading Avoiding elasticsearch split-brain
After 14 months of hard work, the next generation of Loggly has been released. It’s been a great time to be part of the Software Infrastructure team at Loggly and we have put together a superb log aggregation & real-time analytics platform.
We used a combination of custom log Collectors, Apache Kafka, Twitter Storm, ElasticSearch, and lots of secret sauce. You can find more details about the technology stack from my Loggly blog post.
As technical lead at Loggly, responsibility for a well-engineered infrastructure ends with me. And one way to ensure the system is designed and implemented well is to stay as close as possible to the code, ensuring that the team and I write quality software.
But it can be difficult to complete the design and implementation of the features I am responsible for, ensure that what the team produces is well-implemented, and understand every line of code — there is only so much time in the day.
Continue reading Technical Leadership through Testing
I have written another post for the Loggly blog — all about our guidelines for choosing and integrating open-source software and technology in your next project.
Check it out here.
I recently wrote my first post for the Loggly blog. It illustrates why host machines are often the worst place to store the logs those machines are generating.
You can check it out here.
When running a large real-time processing system, monitoring is critical. But it does more than allow you to keep an eye on your system. During development it allows you to test hypotheses about how the system works and how it performs when certain parameters are changed, and it takes the guessing out of working with dynamic systems.
Storm, a real-time computational framework open-sourced by Twitter, is such a system and comes with a Spout, allowing messages to be streamed from a Kafka Broker.
Continue reading Monitoring Storm Kafka Spouts using Python
The Boost ASIO Library is a wonderful piece of software. I’ve built high-performance event-driven IO C++ programs that just scream — it works very well. However, there is one subtlety when it comes to timers — specifically when it comes to cancelling expired timers.
Continue reading Boost ASIO timers — errors are never enough
Cassandra is an open-source, distributed database, informally known as a NoSQL database. It is designed to store large amounts of data, offer high write performance, and provide fault-tolerance. I recently needed some hands-on experience with Cassandra, and being relatively new to Java programming, needed a simple set-up with which I could experiment.
I needed some C++ code to generate Type-1 time-based UUIDs. The Boost libraries, while offering support for other types, don’t have support for time-based UUIDs.
A cut of my code can be found on GitHub.
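The C++ code itself is not reproduced here. As a rough illustration of what a type-1 UUID contains, the Python sketch below packs a 60-bit timestamp (100-nanosecond intervals since 15 October 1582), a 14-bit clock sequence, and a 48-bit node ID into the RFC 4122 layout. The function name time_based_uuid and its parameters are made up for this example and are not taken from my code.

```python
import uuid

# 100-ns intervals between the Gregorian epoch (1582-10-15) and the Unix epoch.
GREGORIAN_OFFSET = 0x01B21DD213814000


def time_based_uuid(unix_ns, clock_seq, node):
    """Build a version-1 UUID from explicit inputs (illustrative helper).

    unix_ns   -- nanoseconds since the Unix epoch
    clock_seq -- 14-bit clock sequence
    node      -- 48-bit node identifier (typically a MAC address)
    """
    timestamp = unix_ns // 100 + GREGORIAN_OFFSET  # 60-bit count of 100-ns ticks

    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    # Top 4 bits carry the version number (1 = time-based).
    time_hi_version = ((timestamp >> 48) & 0x0FFF) | (1 << 12)

    clock_seq_low = clock_seq & 0xFF
    # Top 2 bits carry the RFC 4122 variant (binary 10).
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80

    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


if __name__ == "__main__":
    print(time_based_uuid(0, 0x1234, 0x0123456789AB))
```

A real generator also has to guarantee uniqueness when two UUIDs are requested within the same 100-ns tick, which this sketch does not attempt.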
I finally moved to mutt for my Loggly e-mail (which runs on Google Mail). After moving from e-mail client to e-mail client, I was keen to give it a try — the minimalist design and speed really appealed.
It took a little while to get it just right, but it’s up and running now. I’m pretty happy with it so far, and might consider using it for my personal Yahoo! Mail.
You can find my .muttrc file here.
After almost 5 years at Riverbed Technology, it’s time for new challenges. I’ve started a new development position at Loggly in San Francisco, helping to build their Cloud-based Logging-as-a-Service platform.
I have spent significant time building systems that needed comprehensive logging support. But it’s something that developers don’t need to worry about: let others do it for you.
Why not check out Loggly for your logging and monitoring needs? And if you like building scalable, distributed software systems, why not join us?
Dogfood testing is an effective way to increase testing of, and get valuable feedback on, one’s products. It can be especially effective in the earlier stages of a product’s development, when the user base can be small. Having a forgiving, and sometimes captive, audience provides very useful feedback.
I just wrote a post for the Riverbed Blog about Dogfood testing during development of the Riverbed Cloud Portal. You can check it out here.
My experience with Fedora 15 was not as I had hoped. The ATI graphics driver was particularly problematic (regular minute-long hangs due to spinlock issues) so I decided to try a completely different distribution. I decided to go with Kubuntu 11.04 and performance has been excellent.
Continue reading Kubuntu 11.04 on the Chembook 2370VA
Another post for the Riverbed Technology Blog, on the value of looking back.
You can read it here.
CPU emulation, particularly of older processors, is an interesting topic.
While emulation source code for various CPU cores is easily available, I wanted to better understand how to interface the emulated CPU with my host machine. Therefore I decided to write a simple example of a host system for an emulated MOS Technology 6502 microprocessor.
The goal would be to have the emulated 6502 write “Hello, world” to the console of my Linux desktop machine.
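As a rough sketch of the interfacing idea, and not the code from the post: the host owns the memory map and intercepts writes to one address, treating each byte written there as a character for the console. Everything below is an assumption made for illustration, including the port address 0xF001, the load address 0x0600, the Host class, and the seven-opcode interpreter standing in for a real 6502 core. BRK is treated as "halt", which a real 6502 does not do.

```python
import io
import sys

CONSOLE_PORT = 0xF001  # hypothetical memory-mapped console output address
LOAD_ADDR = 0x0600     # hypothetical address where the program is loaded


class Host:
    """64 KB of RAM plus one write-only console port."""

    def __init__(self, out=sys.stdout):
        self.ram = bytearray(0x10000)
        self.out = out

    def load(self, addr, data):
        self.ram[addr:addr + len(data)] = data

    def read(self, addr):
        return self.ram[addr]

    def write(self, addr, value):
        if addr == CONSOLE_PORT:
            self.out.write(chr(value))  # the host side of the interface
        else:
            self.ram[addr] = value


def run(host, pc):
    """Interpret a tiny subset of 6502 opcodes until BRK."""
    a = x = 0
    zero = False

    def fetch():
        nonlocal pc
        value = host.read(pc)
        pc = (pc + 1) & 0xFFFF
        return value

    def fetch16():
        lo = fetch()
        return lo | (fetch() << 8)  # operands are little-endian

    while True:
        op = fetch()
        if op == 0x00:      # BRK: treated as halt in this sketch
            return
        elif op == 0xA2:    # LDX #imm
            x = fetch()
            zero = x == 0
        elif op == 0xBD:    # LDA abs,X
            a = host.read((fetch16() + x) & 0xFFFF)
            zero = a == 0
        elif op == 0xF0:    # BEQ rel (signed 8-bit offset)
            off = fetch()
            if zero:
                pc = (pc + (off - 256 if off >= 0x80 else off)) & 0xFFFF
        elif op == 0x8D:    # STA abs
            host.write(fetch16(), a)
        elif op == 0xE8:    # INX
            x = (x + 1) & 0xFF
            zero = x == 0
        elif op == 0x4C:    # JMP abs
            pc = fetch16()
        else:
            raise ValueError(f"unsupported opcode {op:#04x}")


def hello_program():
    """Machine code that copies a zero-terminated string to CONSOLE_PORT."""
    code = bytes([
        0xA2, 0x00,        # 0600 LDX #$00
        0xBD, 0x0F, 0x06,  # 0602 LDA $060F,X  (next message byte)
        0xF0, 0x07,        # 0605 BEQ $060E    (stop at the terminator)
        0x8D, 0x01, 0xF0,  # 0607 STA $F001    (write to the console port)
        0xE8,              # 060A INX
        0x4C, 0x02, 0x06,  # 060B JMP $0602
        0x00,              # 060E BRK
    ])
    return code + b"Hello, world\n\x00"  # message starts at $060F


def run_hello():
    """Run the program against an in-memory console and return its output."""
    buf = io.StringIO()
    host = Host(out=buf)
    host.load(LOAD_ADDR, hello_program())
    run(host, LOAD_ADDR)
    return buf.getvalue()


if __name__ == "__main__":
    sys.stdout.write(run_hello())
```

Routing every store through Host.write keeps the CPU core unaware of the console: the core only ever sees memory, and the host decides which addresses have side effects.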
Continue reading A simple host system for a 6502 emulator
It’s been almost 18 months since I installed Fedora Core 12, so I decided to move on to Fedora 15.
While the install went OK, I am not (yet) convinced it was worth the hassle. While it’s nice to pick up bug fixes, I haven’t noticed much change to the feature set.
Continue reading Fedora Core 15 on the Chembook 2370VA
Another post, written by me, for the Riverbed Technology blog — this time about the value of Alpha and Beta testing.
Software developers, such as myself, can get very involved with a single part of the new software on which we are working, sometimes losing perspective. Helping customers deploy and test pre-release software helps us to design better products when we are reminded we need to create a solution, not just a box running some clever software.
You can read my latest post here.
I recently wrote an entry for the Riverbed Technology blog, describing an interesting collaborative development experience I had with the AWS EC2 Cloud.
You can read it here.
I came to Django development from much lower-level development: embedded software, device drivers, and system software. What has impressed me most about Django (and Python in general) is the manner in which it guides you to do the right thing in terms of code construction. The framework and language naturally make you think about better ways to express your designs.
Continue reading My guidelines for reusable Django applications
I really like having inline source when using gdb. Code Complete, by Steve McConnell, has an entire chapter explaining how you should proactively step through all code you write, not just when you’re actively debugging an issue. Having followed this practice for a few years now, I can testify that it increases your productivity enormously. I simply can’t imagine not doing so before committing any code.
Continue reading gdb, inline source, and stepping through your code