7 years of open-source database development: lessons learned

It was April 9th 2016, and I tagged my first official release of rqlite, two years after I actually started coding it. Since then there have been 58 releases, 277 closed issues, 416 closed pull requests, 32,785 insertions, and 1,954 deletions, and 100 files have changed.

What is rqlite?

rqlite is a lightweight, open-source, distributed relational database written in Go, with SQLite as its storage engine. I started it for fun, but it’s since become something more serious. So what have I learned about open-source database development these past 7 years?

One feature at a time

Once I tried to rewrite the HTTP serving layer and replace the Raft consensus subsystem. At the same time.

It was too much. The second-system effect kicked in, and I abandoned the work. That development effort cost me many weeks before I realised the implementation was becoming bloated. I learned a valuable lesson: keep your changes small, and focus on one feature at a time.

Try to plot an incremental path to any new design and implementation. Make releases as regularly as you can, and get your changes back into the mainline as quickly as possible. And be wary of large rewrites that lack clear intermediate deliverables and don’t meet an actual need.

Creativity is erratic and unpredictable

I’ve added some of the most important features in a single weekend.

The best times have often been contiguous, high-intensity, multi-day periods where I could see the shape of the new system in my mind before I wrote a single line of code. I once completely redesigned and re-implemented the HTTP API in a weekend. It resulted in rqlite version 2.0, and it was way better.

Another weekend I migrated the Raft log to use Protobuf encoding instead of JSON. The following week I added compression — and it all just worked beautifully.

But sometimes months went by when I did nothing. That is still the pattern today. I often wonder how much progress I could have made if I had worked solidly on the database for a year.

The importance of testing

I’m convinced that it’s the extensive test coverage that has kept the quality of the code high. I’ve received reports from users that rqlite instances have run for more than a year without ever needing to be restarted.

I adhere pretty closely to the testing pyramid philosophy. Write your test cases as close as possible to the actual code — it makes a world of difference. Do not ignore test failures or try to code around them. Tests don’t fail for magical reasons — they are telling you that you don’t fully understand what you’ve built.

Keep your integration testing for smoke tests, to make sure your database actually starts and that you haven’t missed anything basic. Use an end-to-end test only when there is no way to exercise the code without a full instance of the software running. And even then, it may be telling you something about your implementation: that your software is not modular enough, or that your interfaces are insufficiently orthogonal.

Unit testing has been key. Without excellent coverage at the unit test level your software will never be high quality.

Go has stood the test of time

I’ve always been impressed with Go — it’s long been my favourite programming language. And I’ve been productive. After almost 7 years I still enjoy it. Months pass between periods of rqlite development, and I find I still haven’t forgotten my Go usage style and patterns when I return to the code.

Publicity is hard

I write my database for the challenge. Can I create an interesting system that maintains a clean design and coherent implementation? And keep the quality high? I think I can, and doing so shows I can. That’s usually enough. But it’s also gratifying when people use it.

But publicity is hard. It’s been on Hacker News a few times. I’ve spoken about it at Meetups. It’s taken 7 years to gain 8,000 Stars on GitHub. Is this good? I don’t know. Should I care? I don’t know that either.

Programming is therapeutic

I manage programmers for a living. It’s a fascinating job, but it’s not the same as doing the coding yourself. Programming as a team activity requires agreeing on coding style, bug resolution policies, code reviews, and feature prioritization. Building software as a team requires a lot of non-coding activities.

So working on your own project is liberating. You decide on the coding style. You decide on the features. You decide which bugs to fix. You don’t go to meetings.

It also shows why crossing into the world of multiple developers slows progress. When it comes to coherency and clarity of design, nothing beats the single mind, the single vision.

7 years, and so much left to do

rqlite has existed for 7 years, and there is still so much to do.

Transactions, better client libraries, proper Kubernetes support, performance — the appeal of software is that it’s infinitely extensible, and can always be improved. I’m pretty certain it’ll never get to a place where I will say “it’s done”.

Instead, like an old soldier, it’ll probably just fade away.
