It’s been 18 months since the first commit to my first significant Go project — syslog-gollector. After an initial burst of activity to create a functional Syslog Collector that streamed to Apache Kafka, the source code hadn’t been updated much since. But today I received a report that it no longer built, so I spent some time porting the code to the latest Shopify Sarama framework.
It was amusing to see how naive much of my early Go code was.
Continue reading Revisiting syslog-gollector
I recently presented at the InfluxDB San Francisco Meetup, on InfluxDB and the Raft consensus protocol. My talk was about the fundamental problems of distributed systems, and how InfluxDB uses Raft to solve these issues.
Continue reading InfluxDB and the Raft consensus protocol
This is the last part of a 3-part series “Designing and building a search system for log data”. Be sure to check out part 1 and part 2.
In the last post we examined the design and implementation of Ekanite, a system for indexing log data, and making that data available for search in near-real-time. In this final post let’s see Ekanite in action.
Continue reading Designing a search system for log data — part 3
This is the second part of a 3-part series “Designing and building a search system for log data”. Be sure to check out part 1. Part 3 follows this post.
In the previous post I outlined some of the high-level requirements for a system that indexes log data, and makes that data available for search, all in near-real-time. Satisfying these requirements involves making trade-offs, and sometimes there are no easy answers.
Continue reading Designing a search system for log data — part 2
This is the first part of a 3-part series “Designing and building a search system for log data”. Part 2 is here, and part 3 is here.
For the past few years, I’ve been building indexing and search systems, for various types of data, and often at scale. It’s fascinating work — only at scale does O(n) really come alive. Developing embedded systems teaches you how computers really work, but working on search systems and databases teaches you that algorithms really do matter.
Continue reading Designing a search system for log data — part 1
When you’d like to contribute to an open-source project it can be difficult to know where to start. Check out my latest post for the InfluxDB blog, explaining how we on the Core team have curated a set of issues, hopefully making it easy for potential contributors to start.
Another post for the InfluxDB blog — on testing the storage engines within InfluxDB.
You can check it out here.
Hashicorp provide a nice implementation of the Raft consensus protocol, and it’s at the heart of InfluxDB (amongst other systems). I wanted to experiment with a simple system built using this particular Raft implementation, so was inspired by raftd to build hraftd.
Continue reading Building a distributed key-value store using Raft
“Run into an obstacle in what you’re working on? Hmm, I wonder what’s new online. Better check.”
If you haven’t already, you should start reading Paul Graham’s essays. In one on philosophy, Graham believes that many of the answers provided by philosophy are useless because “…of how little effect they have”. By that standard another of his essays is of high utility because it has affected the way I program. John Stuart Mill would be pleased.
Continue reading Coding like it’s 1999
I’ve written my first post for the InfluxDB blog. In it I discuss the new statistics and monitoring system built into InfluxDB, starting with the 0.9.4 release. Functionality like this is critical when it comes to running a distributed database like InfluxDB.
You can check it out here.
It’s been 418 days since my first GitHub commit of Go code. In that time I’ve written a Syslog-to-Kafka producer, a Raft-based distributed SQLite database, a near real-time log search system, and become a core developer of InfluxDB.
Continue reading 400 days of Go
This past week I attended Gophercon 2015, in Denver, CO. It was also a chance to get together with the rest of the InfluxDB team. And because the Go community is still relatively young and small, it was a great chance to meet, in person, some of the best people working with Go today.
Continue reading Gophercon 2015
The first version of the 0.9.0 series of InfluxDB has been released. It’s alpha-quality software but all of us on the InfluxDB team are very excited to see the software reach this stage.
You can read more about the release in this blog post.
Search is everywhere. Once you’ve built search systems, you see its potential application in many places. So when I came across bleve, an open-source search library written in Go, I was interested in learning more about its feature set and its indexing performance. And I could see immediately one might be able to shard it to improve performance.
Continue reading Increasing bleve indexing performance with sharding
Bjarne Stroustrup has another very interesting paper on his website. Titled Software Development for Infrastructure, it discusses some key ideas for building software that has “…more stringent correctness, reliability, efficiency, and maintainability requirements than non-essential applications.” It is not a long paper, but offers useful observations and guidelines for building such software systems.
Continue reading Software Development for Infrastructure
Real-time — or near real-time — data pipelines are all the rage these days. I’ve built one myself, and they are becoming key components of many SaaS platforms. SaaS Analytics, Operations, and Business Intelligence systems often involve moving large amounts of data, received over the public Internet, into complex backend systems. And managing the incoming flow of data to these pipelines is key.
Continue reading Drop, Throttle, or Buffer
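As a rough illustration of the three options named in that title, here is a minimal Go sketch. It is not taken from the post itself: the `offerOrDrop` helper, the queue size, and the event names are all hypothetical. A buffered channel plays the role of the buffer, a non-blocking send drops events once that buffer is full, and a plain blocking send would throttle the producer instead.

```go
package main

import "fmt"

// offerOrDrop is a hypothetical helper: it attempts a non-blocking send
// on a bounded queue and reports false if the queue was full, in which
// case the event is dropped.
func offerOrDrop(queue chan<- string, event string) bool {
	select {
	case queue <- event:
		return true
	default:
		return false
	}
}

func main() {
	// Buffer: the channel absorbs a burst of up to 2 events.
	queue := make(chan string, 2)

	accepted, dropped := 0, 0
	for i := 0; i < 5; i++ {
		// Drop: once the buffer is full, further events are discarded.
		// Throttle (not shown): a plain `queue <- event` would block the
		// producer here until a consumer made room.
		if offerOrDrop(queue, fmt.Sprintf("event-%d", i)) {
			accepted++
		} else {
			dropped++
		}
	}

	// Nothing consumes from the queue in this sketch, so only the
	// first 2 events fit.
	fmt.Println(accepted, dropped) // prints "2 3"
}
```

Which strategy fits depends on whether losing data or slowing the sender is the lesser evil for a given pipeline.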
Tomorrow I join the team at InfluxDB, something I’m really excited about. I’m really looking forward to coding in Go full-time — it’s a language with real promise, a nice clean tool chain, and a very active community.
Continue reading Measure Everything
Something just doesn’t feel right about node.js.
After coding in it for almost a year, it’s been fun, but I’ve decided it’s just a waypoint to somewhere better.
Continue reading Is node.js just a stopgap?
SQLite is a “self-contained, serverless, zero-configuration, transactional SQL database engine”. However, it doesn’t come with replication built in, so if you want to store mission-critical data in it, you’d better back it up. The usual approach is to continually copy the SQLite file on every change.
I wanted SQLite, I wanted it distributed, and I really wanted a more elegant solution for replication. So rqlite was born.
Continue reading Replicating SQLite using Raft Consensus
So far coding in Go has been fun. It comes with nice functionality that lets you know that the Go team really have been writing system software (useful stuff like this, and this). And then I read about the Go Memory Model, and had my consciousness raised.
Continue reading Wow, the Go Memory Model really threw me