Monitoring — the measurement of your system, the gathering of telemetry, and alerting when it behaves anomalously — is key to running large-scale, modern computer systems. But what many developers today don’t realise is that monitoring can be a key part of your design cycle too.
I’ve been programming for many years, and have spent most of the last few years managing development teams. I’ve written plenty of closed source software, and for a time made my living writing open source software too.
One thing stands out: a shared code base does not a software team make.
Slack: Where work happens
Something is happening at companies that use Slack. Slack, the company, may claim it’s work, but it’s less and less productive work, and it’s having a destructive affect upon my own field of software development.
Some fellow developers, using Go for the first time, recently asked me how to organise a Go project and for some high-level guidance on programming using the language.
I thought the most effective way to answer this question was to build a simple Go HTTP service, that provides a key-value store. It also includes a README, outlining my most important guidelines for Go programming. You can check it out here.
Programming a database is fascinating work. I’ve been deeply involved with developing open source databases for the past two years and programming a database is possibly the most instructive project one can ever complete as a software developer.
What’s really striking however, is how much my attitude towards databases has changed over the past 6 years. From a state of disinterest, I’ve come to think of these systems as a pinnacle of software engineering.
“Bad money drives out good.”
When is the last time you spoke with your fellow developer? I mean actually spoke? Or was it just over Slack?
In the last post we examined the design and implementation of Ekanite, a system for indexing log data, and making that data available for search in near-real-time. Is this final post let’s see Ekanite in action.
In the previous post I outlined some of the high-level requirements for a system that indexed log data, and makes that data available for search, all in near-real-time. Satisfying these requirements involves making trade-offs, and sometimes there are no easy answers.
For the past few years, I’ve been building indexing and search systems, for various types of data, and often at scale. It’s fascinating work — only at scale does O(n) really come alive. Developing embedded systems teaches you how computers really work, but working on search systems and databases teaches you that algorithms really do matter.
“Run into an obstacle in what you’re working on? Hmm, I wonder what’s new online. Better check.”
If you haven’t already, you should start reading Paul Graham’s essays. In one on philosophy, Graham believes that many of the answers provided by philosophy are useless because “…of how little effect they have”. By that standard another of his essays is of high utility because it has affected the way I program. John Stuart Mill would be pleased.
I recently came across a talk on YouTube titled History of Software Engineering, given by Paolo Perrotta. Normally I find online videos to have a low information-to-time ratio, but this one was excellent. It’s not too long, with plenty of humour, and makes many serious points that resonated with me.
Continue reading History of Software Engineering
Bjarne Stroustrup has another very interesting paper on his website. Titled Software Development for Infrastructure, it discusses some key ideas for building software that has “…more stringent correctness, reliability, efficiency, and maintainability requirements than non-essential applications.” It is not a long paper, but offers useful observations and guidelines for building such software systems.
Real-time — or near real-time — data pipelines are all the rage these days. I’ve built one myself, and they are becoming key components of many SaaS platforms. SaaS Analytics, Operations, and Business Intelligence systems often involve moving large amounts of data, received over the public Internet, into complex backend systems. And managing the incoming flow of data to these pipelines is key.
In my last blog post I explained why writing design documents is such a powerful approach to building well-engineered systems. But what should one document? When it comes to software, if one documents too much, the content of the documentation can become inaccurate very quickly, and inaccurate documentation is quickly ignored.
Many software engineers never write design documents. Design documentation takes time, and implementations often proceed so far without any documentation that if it happens, it’s an act of recording what has been done — a tedious task at the best times.
Many software engineers argue “the code exists, it’s running, it’s working, let’s move on and build the next thing.”