Many software engineers never write design documents. Design documentation takes time, and implementations often proceed so far without it that, if documentation happens at all, it becomes an act of recording what has already been done, a tedious task at the best of times.
Many software engineers argue “the code exists, it’s running, it’s working, let’s move on and build the next thing.”
Continue reading Why you should write software design documents
My father worked for many years in QA at Beckman, an American medical instruments firm. His job was to ensure that newly manufactured centrifuge rotors would hold up when spun at thousands of RPM. He used to tell me that the Beckman philosophy could be summarised in one sentence: “There is no substitute for quality”.
Continue reading Always thinking of the next guy
After 2 years at Loggly, tomorrow I start a new role at Jut. While I will miss the team at Loggly very much, and the wonderful product we built during my time there, I’m looking forward to working again with some old colleagues from Riverbed Technology.
I came across a very readable paper on distributed systems: Distributed systems for fun and profit. I recommend it for anyone interested in learning more about distributed systems and the challenges involved in designing, building, and operating them.
Packt recently asked me to review their new publication Mastering ElasticSearch by Rafał Kuć and Marek Rogoziński. Since most of my experience with elasticsearch has been from a systems point of view (index management, cluster maintenance, indexing performance), I paid most attention to the chapters about those parts of elasticsearch.
Continue reading Book Review: Mastering ElasticSearch
AWS have posted online the video of the talk Jim Nisbet and I gave at AWS re:Invent 2013. In it, Jim and I describe the system we built at Loggly, which uses Apache Kafka, Twitter Storm, and elasticsearch to provide a high-performance log aggregation and analytics SaaS solution, running on AWS EC2.
Continue reading Infrastructure at Scale: Apache Kafka, Twitter Storm and elasticsearch
Loggly recently held an elasticsearch meetup, which was a great success. One question that was repeatedly asked was how to ensure elasticsearch does not suffer a partition — known as a split-brain.
This can be a particular problem in AWS EC2, where the network is subject to interruptions. It can also happen if the elasticsearch master node performs long garbage collection cycles.
Continue reading Avoiding elasticsearch split-brain
After 14 months of hard work, the next generation of Loggly has been released. It’s been a great time to be part of the Software Infrastructure team at Loggly and we have put together a superb log aggregation & real-time analytics platform.
We used a combination of custom log Collectors, Apache Kafka, Twitter Storm, ElasticSearch, and lots of secret sauce. You can find more details about the technology stack in my Loggly blog post.
As technical lead at Loggly, responsibility for a well-engineered infrastructure ends with me. And one way to ensure the system is designed and implemented well is to stay as close as possible to the code, ensuring that the team and I write quality software.
But it can be difficult to complete the design and implementation of the features I am responsible for, ensure that what the team produces is well-implemented, and understand every line of code — there is only so much time in the day.
Continue reading Technical Leadership through Testing