Tag Archives: operations

Designing a search system for log data — part 1

November 22, 2015 Philip O'Toole

This is the first part of a 3-part series “Designing and building a search system for log data”. Part 2 is here, and part 3 is here.

For the past few years I’ve been building indexing and search systems for various types of data, often at scale. It’s fascinating work: only at scale does O(n) really come alive. Developing embedded systems teaches you how computers really work, but working on search systems and databases teaches you that algorithms really do matter.

Who watches the watchers?

September 22, 2015 Philip O'Toole

I’ve written my first post for the InfluxDB blog. In it I discuss the new statistics and monitoring system built into InfluxDB, starting with the 0.9.4 release. Functionality like this is critical when it comes to running a distributed database like InfluxDB.

You can check it out here.

400 days of Go

September 1, 2015 Philip O'Toole 33 Comments

It’s been 418 days since my first GitHub commit of Go code. In that time I’ve written a Syslog-to-Kafka producer, a Raft-based distributed SQLite database, a near real-time log search system, and become a core developer of InfluxDB.

Running services is hard

August 12, 2015 Philip O'Toole

I’ve recently been thinking about why running Services is particularly hard. By Services I mean Software-as-a-Service (SaaS) platforms. Over the years I’ve written software for many different systems, including embedded software, web services, databases, and distributed systems. But being involved in designing and running a SaaS platform was difficult in a whole new way: running Services is hard work.

InfluxDB and Grafana HOWTO

June 9, 2014 Philip O'Toole 14 Comments

This post describes working with InfluxDB 0.8. InfluxDB 0.8 is no longer supported, and has been superseded by the 1.0 release.

I recently came across InfluxDB, a time-series database built on LevelDB. It’s designed to support horizontal as well as vertical scaling and, best of all, it’s not written in Java: it’s written in Go. I was intrigued, to say the least.

What I wish I’d been told about the JVM

April 16, 2014 Philip O'Toole

Java is the predominant language of Big Data technologies. HBase, Lucene, Elasticsearch, Cassandra – all are written in Java and, of course, run inside a Java Virtual Machine (JVM). Some other important Big Data technologies, while not written in Java, also run inside a JVM.

Examples include Apache Storm, which is written in Clojure, and Apache Kafka, which is written in Scala. This makes basic knowledge of the JVM quite important when it comes to deploying and operating Big Data technologies.

If you love your logs, set them free

March 21, 2013 Philip O'Toole

I recently wrote my first post for the Loggly blog. It illustrates why host machines are often the worst place to store the logs those machines are generating.

You can check it out here.