Tag Archives: operations

Come for the Features, Stay for the Uptime

I’ve been thinking  a lot recently about what makes computer services and products sticky — what makes users and customers come back again and again to what you’ve built. There are lots of ways to summarize it, but when it comes to systems that help technical people run their own systems, they come for the features, but they stay for the uptime.

Continue reading Come for the Features, Stay for the Uptime

InfluxDB and Grafana HOWTO

This blog describes working with InfluxDB 0.8. InfluxDB 0.8 is no longer supported, and has been superseded by the 1.0 release.

grafanaI recently came across InfluxDB — it’s a time-series database built on LevelDB. It’s designed to support horizontal as well as vertical scaling and, best of all, it’s not written in Java — it’s written in Go. I was intrigued to say the least.

Continue reading InfluxDB and Grafana HOWTO

What I wish I’d been told about the JVM

Java_logoJava is the predominant language of Big Data technologies. HBase, Lucene, elasticsearch, Cassandra – all are written in Java and, of course, run inside a Java Virtual Machine (JVM). There are some other important Big Data technologies, while not written in Java, also run inside a JVM. Examples include Apache Storm, which is written in Clojure, and Apache Kafka, which is written in Scala. This makes basic knowledge of the JVM quite important when it comes to deploying and operating Big Data technologies.

Continue reading What I wish I’d been told about the JVM

Avoiding elasticsearch split-brain

elasticsearchLoggly recently held an elasticsearch meetup, which was a great success. One question that was repeatedly asked was how to ensure elasticsearch does not suffer a partition — known as a split-brain. This can be a particular problem in AWS EC2, where the network is subject to interruptions. It can also happen if the elasticsearch master node performs long garbage collection cycles.

One configuration that is very effective at preventing this problem is described in this post.

Continue reading Avoiding elasticsearch split-brain

Monitoring Storm Kafka Spouts using Python

kafka-logo-tallWhen running a large real-time processing system, monitoring is critical. But it does more than allow you to keep an eye on your system. During development it allows you test hypotheses about how it works, how it performs when certain parameters are changed, and takes the guessing out of working with dynamic systems.

Storm, a real-time computational framework open-sourced by Twitter, is such a system and comes with a Spout, allowing messages to be streamed from a Kafka Broker.

Continue reading Monitoring Storm Kafka Spouts using Python