I’ve recently been thinking about why running Services is particularly hard. By Services I mean Software-as-a-Service platforms. Over the years, I’ve written software for many different systems: embedded software, web services, databases, and distributed systems. But being involved with designing and running a SaaS platform was difficult in a whole new way: running Services is hard work.
There are two main reasons for this: one is obvious, and the other only becomes apparent after you’ve actually done it yourself.
The technical reasons are apparent, even to those who haven’t built these systems. Services often comprise complex systems, with many heterogeneous components, such as databases, virtual machines, varying network topologies, caches, and load balancers. Clustered systems are becoming more and more prevalent, as a way to provide reliability and fault tolerance.
But with these systems comes a complexity that is hard to handle, often resulting in emergent behaviour, unforeseen at design-time. The complexity of modern computer systems is beginning to get ahead of the Technical Operations teams’ ability to run those systems.
When you write software, you usually hope that software will have a user base. That user base downloads your software, provisions the resources, installs it, and runs it. And since all software has bugs, it will fail on them from time to time. But since your users are running it, they still feel like they have control. This control may be illusory since they often must wait for fixes from you, but they can experiment with their systems, try changing something that might alleviate the issue and may even find a workaround.
However, the emotional reaction to a service failure can be very different. The users, your customers, have far less control over the system. The overwhelming emotion is one of frustration, which can quickly cause your customers to lose faith in your service. The invective that may come your way is often stunning. This puts enormous pressure on Technical Operations teams and developers, and is the real reason why running services is difficult.
As someone who is passionate about engineering, I find that running Services has a virtuous side: if you want to be successful, you must practise superior engineering. Without it, the failures may overwhelm you, your Operations team, and finally the system itself. So if you’re running, or are going to run, a service, embrace the challenge, but remember: running services is hard work.