Tag Archives: operations

Designing a search system for log data — part 3

This is the last part of a 3-part series “Designing and building a search system for log data”. Be sure to check out part 1 and part 2.

ekanite-cubeIn the last post we examined the design and implementation of Ekanite, a system for indexing log data, and making that data available for search in near-real-time. Is this final post let’s see Ekanite in action.

Continue reading Designing a search system for log data — part 3

Designing a search system for log data — part 2

This is the second part of a 3-part series “Designing and building a search system for log data”. Be sure to check out part 1. Part 3 follows this post.

ekanite-cubeIn the previous post I outlined some of the high-level requirements for a system that indexed log data,  and makes that data available for search, all in near-real-time. Satisfying these requirements involves making trade-offs, and sometimes there are no easy answers.

Continue reading Designing a search system for log data — part 2

Designing a search system for log data — part 1

This is the first part of a 3-part series “Designing and building a search system for log data”. Part 2 is here, and part 3 is here.

ekanite-cubeFor the past few years, I’ve been building indexing and search systems, for various types of data, and often at scale. It’s fascinating work — only at scale does O(n) really come alive. Developing embedded systems teaches you how computers really work, but working on search systems and databases teaches you that algorithms really do matter.

Continue reading Designing a search system for log data — part 1

Running services is hard

I’ve recently been thinking about why running Services is particularly hard. By Services I mean Software-as-a-Service platforms. During the years, I’ve written software for many different systems — embedded software, web services, databases, and distributed systems, but being involved with designing and running a SaaS platform was difficult in a whole new way: running Services is hard work.

Continue reading Running services is hard

Drop, Throttle, or Buffer

Real-time — or near real-time — data pipelines are all the rage these days.  I’ve built one myself, and they are becoming key components of many SaaS platforms. SaaS Analytics, Operations, and Business Intelligence systems often involve moving large amounts of data, received over the public Internet, into complex backend systems. And managing the incoming flow of data to these pipelines is key.

Continue reading Drop, Throttle, or Buffer

Come for the Features, Stay for the Uptime

I’ve been thinking  a lot recently about what makes computer services and products sticky — what makes users and customers come back again and again to what you’ve built. There are lots of ways to summarize it, but when it comes to systems that help technical people run their own systems, they come for the features, but they stay for the uptime.

Continue reading Come for the Features, Stay for the Uptime

InfluxDB and Grafana HOWTO

This blog describes working with InfluxDB 0.8. InfluxDB 0.8 is no longer supported, and has been superseded by the 1.0 release.

grafanaI recently came across InfluxDB — it’s a time-series database built on LevelDB. It’s designed to support horizontal as well as vertical scaling and, best of all, it’s not written in Java — it’s written in Go. I was intrigued to say the least.

Continue reading InfluxDB and Grafana HOWTO

What I wish I’d been told about the JVM

Java_logoJava is the predominant language of Big Data technologies. HBase, Lucene, elasticsearch, Cassandra – all are written in Java and, of course, run inside a Java Virtual Machine (JVM). There are some other important Big Data technologies, while not written in Java, also run inside a JVM.

Examples include Apache Storm, which is written in Clojure, and Apache Kafka, which is written in Scala. This makes basic knowledge of the JVM quite important when it comes to deploying and operating Big Data technologies.

Continue reading What I wish I’d been told about the JVM