Tag Archives: distributed systems

Book Review: Cassandra High Availability

Packt recently asked me to review their new publication Cassandra High Availability, written by Robbie Strickland.

I’ve worked with Cassandra in the past. Early designs of Loggly’s second-generation log analytics platform used Cassandra as the authoritative store for log data, but we ended up pulling it and using elasticsearch as both the store and the search engine.

Replicating SQLite using Raft Consensus

SQLite is a “self-contained, serverless, zero-configuration, transactional SQL database engine”. However, it doesn’t come with replication built in, so if you want to store mission-critical data in it, you’d better back it up. The usual approach is to continually copy the SQLite file on every change.
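As a rough illustration of that copy-on-every-change approach, here is a minimal Python sketch. The function names and file paths are hypothetical, and it uses the standard library’s `sqlite3` online backup call rather than a raw file copy, so treat it as one possible way to do this, not a recommended design.

```python
import sqlite3


def backup_database(source_path, backup_path):
    """Copy the database at source_path into backup_path (hypothetical helper)."""
    source = sqlite3.connect(source_path)
    backup = sqlite3.connect(backup_path)
    try:
        # Connection.backup (Python 3.7+) wraps SQLite's online backup API,
        # which copies a consistent snapshot of the source database.
        source.backup(backup)
    finally:
        backup.close()
        source.close()


def write_and_backup(source_path, backup_path, sql, params=()):
    """Apply one change to the live database, then refresh the backup copy."""
    conn = sqlite3.connect(source_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
    finally:
        conn.close()
    # Assumption: every change triggers a full copy, as described above.
    backup_database(source_path, backup_path)
```

Every write pays for a full copy of the database, and the backup still lives wherever you put it, which is part of why this approach feels inelegant.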

I wanted SQLite, I wanted it distributed, and I really wanted a more elegant solution for replication. So rqlite was born.

Call me Definitely

The creator of the network monitoring system Riemann, Kyle Kingsbury, has put together a comprehensive series of blog posts on the fault-tolerance, high-availability, and general correctness of a number of database and storage technologies. I found the posts to be a great read, particularly those covering the technologies I am most familiar with, elasticsearch and Apache Kafka.

If you haven’t read them yet, you should check them out on his site.

InfluxDB and Grafana HOWTO

This post describes working with InfluxDB 0.8. InfluxDB 0.8 is no longer supported, and has been superseded by the 1.0 release.

I recently came across InfluxDB, a time-series database built on LevelDB. It’s designed to support horizontal as well as vertical scaling and, best of all, it’s written in Go, not Java. I was intrigued, to say the least.

Infrastructure at Scale: Apache Kafka, Twitter Storm and elasticsearch

AWS have posted online the video of the talk Jim Nisbet and I gave at AWS re:Invent 2013. In it, Jim and I describe the system we built at Loggly, which uses Apache Kafka, Twitter Storm, and elasticsearch to provide a high-performance log aggregation and analytics SaaS solution running on AWS EC2.
