rqlite provides robust replication for SQLite databases using the Raft consensus protocol. Coded in Go it ensures that all changes made to the leader SQLite database are replicated to all other nodes in the cluster, providing fault-tolerance and reliability.
It’s been 18 months since development of rqlite first started and it’s time for version 2.
Continue reading rqlite – replicated SQLite with new Raft consensus and API
I’ve started replacing go-raft within rqlite with the implementation from Hashicorp. go-raft is no longer maintained, and I’ve good experience with the Hashicorp code, due to my work with InfluxDB and hraftd. I’m also going to change the API, so it’s more useful. The existing implementation and API has been tagged as v1.0, so it’s still available.
You can follow the work on this branch, and I hope to merge it to master in the near future.
I recently presented at the InfluxDB San Francisco Meetup, on InfluxDB and the Raft consensus protocol. My talk was about the fundamental problems of distributed systems, and how InfluxDB uses Raft to solve these issues.
Continue reading InfluxDB and the Raft consensus protocol
Hashicorp provide a nice implementation of the Raft consensus protocol, and it’s at the heart of InfluxDB (amongst other systems). I wanted to experiment with a simple system built using this particular Raft implementation, so was inspired by raftd to built hraftd.
Continue reading Building a distributed key-value store using Raft
The first version of the 0.9.0 series of InfluxDB has been released. It’s alpha-quality software but all of us on the InfluxDB team are very excited to see the software reach this stage.
You can read more about the release on this blog post.
Packt recently asked me to review their new publication Cassandra High Availability, written by Robbie Strickland. I’ve worked with Cassandra in the past — early designs of Loggly‘s 2nd generation Log analytics platform used Cassandra as its authoritative store for log data, but we ended up pulling it and using elasticsearch as both the store and search engine.
Continue reading Book Review: Cassandra High Availability
Tomorrow I join the team at InfluxDB, something I’m really excited about. I’m really looking forward to coding in Go full-time — it’s a language with real promise, a nice clean tool chain, and a very active community.
Continue reading Measure Everything
SQLite is a “self-contained, serverless, zero-configuration, transactional SQL database engine”. However, it doesn’t come with replication built in, so if you want to store mission-critical data in it, you better back it up. The usual approach is to continually copy the SQLite file on every change.
I wanted SQLite, I wanted it distributed, and I really wanted a more elegant solution for replication. So rqlite was born.
Continue reading Replicating SQLite using Raft Consensus
The creator of the network monitoring system Riemann, Kyle Kingsbury, has put together a comprehensive series of blog posts, on the fault-tolerance, high-availability, and general correctness of number of database and storage technologies. Of the technologies discussed I am most familiar with — elasticsearch and Apache Kafka — I found the posts to be a great read.
If you haven’t read them yet, you should check them out on his site.
Over 16 years, I’ve written software up-and-down the entire stack. Earliest in my career I wrote boot ROM software for specialized embedded devices. This kind of programming taught me so much about how computers really work.
Continue reading A Prayer for Distributed Systems Developers
This blog describes working with InfluxDB 0.8. InfluxDB 0.8 is no longer supported, and has been superseded by the 1.0 release.
I recently came across InfluxDB — it’s a time-series database built on LevelDB. It’s designed to support horizontal as well as vertical scaling and, best of all, it’s not written in Java — it’s written in Go. I was intrigued to say the least.
Continue reading InfluxDB and Grafana HOWTO
I came across a very readable paper on distributed systems — Distributed systems for fun and profit. I recommend it for anyone interested in learning more about distributed systems, and the challenges involved with designing, building, and operating distributed systems.
AWS have posted the video online of Jim Nisbet’s and my talk at AWS:reinvent 2013. In it, Jim and I describe the system we built at Loggly, which uses Apache Kafka, Twitter Storm, and elasticseach, to build a high-performance log aggregation and analytics SaaS solution, running on AWS EC2.
Continue reading Infrastructure at Scale: Apache Kafka, Twitter Storm and elasticsearch
Loggly recently held an elasticsearch meetup, which was a great success. One question that was repeatedly asked was how to ensure elasticsearch does not suffer a partition — known as a split-brain. This can be a particular problem in AWS EC2, where the network is subject to interruptions. It can also happen if the elasticsearch master node performs long garbage collection cycles.
One configuration that is very effective at preventing this problem is described in this post.
Continue reading Avoiding elasticsearch split-brain