As a consultant, I tend to work with a variety of clients and teams all across the product maturity spectrum.
Some are just starting; maybe they have an MVP, maybe they are still building it. Others have existed in their space for years. Typically, when I get called into projects, the product maturity is at one extreme of the spectrum. DevOps maturity, on the other hand, tends to follow a different distribution, with most DevOps programs somewhere past just starting but not quite mature.
In this not-quite-mature stage of DevOps, there is usually some sort of CI/CD pipeline, automated tests, linters, and an emphasis on automation. Logs, metrics, and backups are taken automatically and centralized. There may be basic dashboards and saved queries. Documentation may or may not be adequate for helping to debug problems. If there were a DevOps checklist, all the boxes would be checked. At this point, many teams stop building.
Perhaps there was never a plan to mature beyond that level. People love to say “premature optimization is the root of all evil”, and they have a point: the basics come first. If one of my clients didn’t have automated backups, the first thing I’d do is enable them, at a minimum by taking snapshots at the VM level. If they didn’t have any logging set up at all, the first step would be to enable it and maybe push the logs to centralized storage. Once a team reaches that level of maturity, they have a working system. Backups don’t usually need to be restored that often, and a team can get by looking at the logs only when something needs to be debugged. When the team isn’t getting woken up in the middle of the night, there is less incentive to continue maturing.
Another reason DevOps engineers may shy away from maturing their program is that after a certain level it becomes “less fun”. The rate of building slows down and the focus shifts to maintenance and metrics. Metrics are decidedly Not Cool; they’re what non-tech people and management types use to justify their paychecks. Still, once the product reaches a certain level and developers stop making large process changes, defining metrics and direction will help to continuously improve the program and keep it from getting out of date.
In a mature DevOps program, much more time is spent optimizing than implementing. Logs have long been centralized, and now most of the effort is spent looking for new indicators that something may be wrong. Linters may have more custom checks than out-of-the-box code formatting and style checks. Mature DevOps is about continuously building processes and responding to change by continuously optimizing and improving.
Mature DevOps programs allow errors to be caught earlier and make them easier to debug. In an early-stage DevOps program, an administrator may only get alerted if the entire application is down and may not notice if a small component is broken. As the program matures, the team may start, for example, to track the rough number of “404 page not found” responses returned, alerting when the metric crosses a certain threshold. Later, the team can get even more granular, perhaps reporting when a 404 is returned for a page that had not previously returned one.
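To make the two levels of 404 alerting concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Request` record, the `check_404s` helper, the `known_good_paths` set, and the 5% default threshold are hypothetical, and a real setup would more likely express this as queries and alert rules in whatever log or metrics platform the team already uses.

```python
from typing import Iterable, List, NamedTuple, Set


class Request(NamedTuple):
    """One parsed access-log entry (hypothetical shape)."""
    path: str
    status: int


def check_404s(
    window: Iterable[Request],
    known_good_paths: Set[str],
    max_404_ratio: float = 0.05,  # assumed threshold, tune per application
) -> List[str]:
    """Return alert messages for one time window of requests."""
    requests = list(window)
    not_found = [r for r in requests if r.status == 404]
    alerts = []

    # Coarse check: share of 404 responses in the window crosses a threshold.
    ratio = len(not_found) / len(requests) if requests else 0.0
    if ratio > max_404_ratio:
        alerts.append(
            f"404 ratio {ratio:.1%} exceeds threshold {max_404_ratio:.1%}"
        )

    # Granular check: a path that used to be served successfully now 404s.
    newly_broken = sorted({r.path for r in not_found} & known_good_paths)
    for path in newly_broken:
        alerts.append(f"{path} previously served successfully but now returns 404")

    return alerts
```

The granular check depends on keeping `known_good_paths` up to date, for example by recording every path that has returned a 200 in the recent past. That bookkeeping is exactly the kind of ongoing maintenance work that separates a mature program from a checked box.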
Mature DevOps helps reduce tech debt and improve efficiency. A basic DevOps program may only track the overall page load time or memory usage. As a result, by the time many small, individually insignificant changes add up to something noticeable, it has already become infeasible to go back and identify and fix them. If code changes are reviewed for their performance impact on individual components, then adverse changes can be identified earlier in the process, increasing the overall health of the application and extending the amount of time before the inevitable “major refactor”.
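As a rough illustration of reviewing changes per component rather than in aggregate, the Python sketch below compares a candidate build's timings against a baseline. The `find_regressions` helper, the component names, and the 10% tolerance are all hypothetical; how the timings get collected (benchmarks in CI, tracing, profiling) is left out and would depend on the application.

```python
from typing import Dict


def find_regressions(
    baseline_ms: Dict[str, float],
    candidate_ms: Dict[str, float],
    tolerance: float = 0.10,  # assumed: flag anything more than 10% slower
) -> Dict[str, float]:
    """Map each regressed component to its relative slowdown (0.3 means 30% slower)."""
    regressions = {}
    for component, before in baseline_ms.items():
        after = candidate_ms.get(component)
        # Skip components with no candidate measurement or an unusable baseline.
        if after is None or before <= 0:
            continue
        change = (after - before) / before
        if change > tolerance:
            regressions[component] = change
    return regressions


# Hypothetical per-component timings in milliseconds.
baseline = {"auth": 20.0, "search": 100.0, "render": 50.0}
candidate = {"auth": 21.0, "search": 130.0, "render": 50.0}

for name, change in find_regressions(baseline, candidate).items():
    print(f"{name} is {change:.0%} slower than baseline")
```

Run as part of code review, a check like this points at the one component that got slower instead of leaving the team to notice a slightly worse overall page load months later.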
If DevOps is your thing, and you want the opportunity to build out advanced DevOps programs for different types of clients, RunAsCloud is hiring. Reach out to email@example.com to learn more!