This is the last part of a 3-part series “Designing and building a search system for log data”. Be sure to check out part 1 and part 2.
In the last post we examined the design and implementation of Ekanite, a system for indexing log data and making that data available for search in near-real-time. In this final post, let's see Ekanite in action.
Downloading and running
Starting Ekanite
ekanited -datadir $HOME/ekanite
Once launched, Ekanite listens on three TCP ports:
- At http://localhost:9951/debug/vars it makes simple statistics and diagnostic information available.
- On TCP port 5514 it accepts log data.
- On TCP port 9950 it listens for queries.
All of these ports are configurable.
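The diagnostics endpoint can also be read programmatically. The following is a minimal sketch in Go, assuming the endpoint returns a single JSON object in the style of Go's standard expvar package. The function names fetchVars and decodeVars are hypothetical and are not part of Ekanite.

```go
package main

import (
	"encoding/json"
	"io"
	"net/http"
)

// decodeVars parses a diagnostics response body into a map keyed by
// variable name. Values stay as raw JSON because their shape is not
// known in advance (assumption: the body is one JSON object).
func decodeVars(body []byte) (map[string]json.RawMessage, error) {
	vars := make(map[string]json.RawMessage)
	err := json.Unmarshal(body, &vars)
	return vars, err
}

// fetchVars downloads and decodes the diagnostics page at url,
// for example http://localhost:9951/debug/vars.
func fetchVars(url string) (map[string]json.RawMessage, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return decodeVars(body)
}
```

Calling fetchVars("http://localhost:9951/debug/vars") against a running instance should return one entry per published variable.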
Index some log lines
curl https://raw.githubusercontent.com/ekanite/ekanite/master/test_resources/logs1k.txt.bz2 | bunzip2 | netcat localhost 5514
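The same thing can be done from a program instead of netcat. This is a rough sketch in Go, assuming Ekanite accepts newline-terminated log lines on its input port exactly as the pipeline above delivers them. The helper names frame and sendLogs are made up for illustration.

```go
package main

import (
	"net"
	"strings"
)

// frame joins log lines into one payload in which every line ends with
// exactly one newline, mirroring what the netcat pipeline sends.
func frame(lines []string) []byte {
	var b strings.Builder
	for _, l := range lines {
		b.WriteString(strings.TrimRight(l, "\r\n"))
		b.WriteByte('\n')
	}
	return []byte(b.String())
}

// sendLogs opens a TCP connection to addr (for example "localhost:5514")
// and writes all lines in a single call.
func sendLogs(addr string, lines []string) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()

	_, err = conn.Write(frame(lines))
	return err
}
```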
Search
$ telnet 127.0.0.1 9950
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
login
<134>0 2015-05-05T23:50:17.025568+00:00 fisher apache-access - - 65.98.59.154 - - [05/May/2015:23:50:12 +0000] "GET /wp-login.php HTTP/1.0" 200 206 "-" "-"
<134>0 2015-05-06T01:24:41.232890+00:00 fisher apache-access - - 104.140.83.221 - - [06/May/2015:01:24:40 +0000] "GET /wp-login.php?action=register HTTP/1.0" 200 206 "https://www.philipotoole.com/" "Opera/9.80 (Windows NT 6.2; Win64; x64) Presto/2.12.388 Version/12.17"
<134>0 2015-05-06T04:20:49.008609+00:00 fisher apache-access - - 193.104.41.186 - - [06/May/2015:04:20:46 +0000] "POST /wp-login.php HTTP/1.1" 200 206 "-" "Opera 10.00"
login -GET
<134>0 2015-05-06T04:20:49.008609+00:00 fisher apache-access - - 193.104.41.186 - - [06/May/2015:04:20:46 +0000] "POST /wp-login.php HTTP/1.1" 200 206 "-" "Opera 10.00"
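The telnet session above can be reproduced with a small client. The sketch below, in Go, assumes only what the session shows: a query is one line of text, and matching log lines come back one per line. How the server marks the end of a result set is not shown in this post, so the sketch simply reads until a deadline expires. The names search and splitHits are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"strings"
	"time"
)

// splitHits turns a raw response into one string per matching log line,
// dropping blank lines and trailing carriage returns.
func splitHits(raw string) []string {
	var hits []string
	for _, line := range strings.Split(raw, "\n") {
		if line = strings.TrimSpace(line); line != "" {
			hits = append(hits, line)
		}
	}
	return hits
}

// search sends query to the server at addr (for example "127.0.0.1:9950")
// and collects whatever arrives within the wait period.
func search(addr, query string, wait time.Duration) ([]string, error) {
	conn, err := net.DialTimeout("tcp", addr, wait)
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	if _, err := fmt.Fprintf(conn, "%s\n", query); err != nil {
		return nil, err
	}

	// Assumption: the end-of-results marker is unknown, so a read
	// timeout is treated as the normal end of the response.
	if err := conn.SetReadDeadline(time.Now().Add(wait)); err != nil {
		return nil, err
	}
	raw, err := io.ReadAll(conn)
	var netErr net.Error
	if err != nil && !(errors.As(err, &netErr) && netErr.Timeout()) {
		return nil, err
	}
	return splitHits(string(raw)), nil
}
```

For example, search("127.0.0.1:9950", "login -GET", 2*time.Second) issues the second query from the session above.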
Enhancing Ekanite
Ekanite could be enhanced and improved in many, many ways. Major improvements would include:
- More sophisticated query syntax would allow, for example, search on specific fields of the log data. This would involve building a parser, which would run in the query server.
- A more sophisticated query syntax would also allow the system to accept time-bounded queries, such as a search of only the last hour. Because less data needs to be examined, such searches would be correspondingly faster.
- Ekanite has not been tuned for performance. Since Ekanite is a demonstration system, its main goal is functionality, but significant performance improvements are available, and Go comes with an extensive set of performance and profiling tools to help find them. For example, a state machine could parse the log lines instead of regular expressions.
- Reducing the number of memory allocations the code makes would minimize the impact of garbage collection.
- Ekanite uses bleve with its default storage engine, which is BoltDB. Better storage efficiency and indexing throughput may be achieved with other engines such as LevelDB. This would complicate the build process, however, as LevelDB is written in C++.
- Fix the bugs!
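To make the state-machine suggestion above concrete, here is a rough sketch in Go of a hand-written parser for the header of the sample log lines shown earlier. It is not Ekanite's parser. The Header type, its field names, and the assumption that every line carries exactly these space-separated fields are inferred from the sample lines alone.

```go
package main

import "errors"

// Header holds the fields visible in the sample lines. Names are illustrative.
type Header struct {
	Priority, Version                            int
	Timestamp, Host, App, ProcID, MsgID, Message string
}

type state int

const (
	stOpen     state = iota // expecting '<'
	stPriority              // digits terminated by '>'
	stVersion               // digits terminated by a space
	stField                 // space-delimited tokens
)

var errMalformed = errors.New("malformed log line")

// parseLine walks the line once, byte by byte, with no regular expressions.
func parseLine(line string) (Header, error) {
	var h Header
	fields := []*string{&h.Timestamp, &h.Host, &h.App, &h.ProcID, &h.MsgID}
	st, start, n := stOpen, 0, 0
	for i := 0; i < len(line); i++ {
		c := line[i]
		switch st {
		case stOpen:
			if c != '<' {
				return h, errMalformed
			}
			st = stPriority
		case stPriority:
			switch {
			case c >= '0' && c <= '9':
				h.Priority = h.Priority*10 + int(c-'0')
			case c == '>' && i > 1: // at least one digit seen
				st = stVersion
			default:
				return h, errMalformed
			}
		case stVersion:
			switch {
			case c >= '0' && c <= '9':
				h.Version = h.Version*10 + int(c-'0')
			case c == ' ' && line[i-1] != '>': // at least one digit seen
				st, start = stField, i+1
			default:
				return h, errMalformed
			}
		case stField:
			if c != ' ' {
				continue
			}
			if i == start { // empty token
				return h, errMalformed
			}
			*fields[n] = line[start:i]
			n++
			start = i + 1
			if n == len(fields) {
				// Everything after the last header field is the message.
				h.Message = line[start:]
				return h, nil
			}
		}
	}
	return h, errMalformed
}
```

Field values are slices of the input string, so the parser itself allocates very little, which also speaks to the point about memory allocations above.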
FIN
Indexing and search systems are fascinating, and I encourage you to check out the bleve and Ekanite source code.
Hopefully this series of posts has been helpful in understanding the various requirements, trade-offs, design, and implementation of one particular type of these systems.