Much of my time over the past couple years has been spent working with
DevOps tools such as Ansible, Puppet, Vagrant, and Packer, rather than
the traditional programming I did previously. Since I work in the
defense industry, there are some unique challenges applying these tools
and DevOps techniques. I had a recent conversation with a colleague about
these challenges and whether it DevOps is worthwhile in our business.
Of course, I think it is, and I tried to explain some of the reasoning.
When most people think about DevOps, they envision an automated pipeline
that runs from development all the way into production. The joint
Development and Operations team maintains all of the software and the
configuration for the system in source code form. Changes run through
a test and review process, as automated as possible. Changes that get
through that process are promoted to production automatically. Ideally
production itself is automated, with idempotent configuration (a fancy
way of saying we build up servers or containers from scratch rather than
upgrading). If possible, the architecture is such that we have many
copies of each piece of the system, so we can automatically update them
on a rolling basis, with the ability to roll back if problems appear.
This description of things doesn’t have much to do with the defense
industry, at least not my part of it. We generally build systems that
run not on the Internet, or even on some classified wide-area network,
but on platforms like aircraft, ships, submarines, or vehicles.
Updates to those systems happen only at scheduled times and they
undergo substantial testing commensurate with the difficulty of
installing a new version. (If you have a week-long window to install
an update before a six-month deployment, it’s worth the expense of
testing to be sure that there won’t be any show-stopping problems during
that six months.)
So for the majority of defense systems, there’s no DevOps Pipeline from
development to production. So what good are all these DevOps tools and
techniques? There are still a number of benefits to be had:
- Controlled Configuration. Obviously one of the most important pieces of
being sure your installed system will perform is being sure it’s the same
as the one you exhaustively tested. There is a process called Physical
Configuration Audit (PCA) that is used to ensure that the “as-built” system
matches the one that went through testing. This PCA can be a remarkably
detailed look down into the individual versions of every piece of installed
software. Automating the installation using DevOps can mean that an inspection
of the configuration source code plus a successful run equals a successful PCA.
- Fast Rebuilds. In order to improve the realism of a full system test, test
systems in the defense industry tend to mirror production, even to the extent of
including expensive custom equipment or simulation. A test system is expensive
enough that it is never possible to have as many as the engineering team would
like. Also, it is very common to have to support patches and bug fixes for
multiple versions in the field while also supporting testing for the next version.
It is critical to be able to get a system into the right configuration for the
next test quickly and reliably.
- Fast Installation. Software updates are never the only thing going on during
the brief maintenance window, and they typically aren’t the most important.
It’s often necessary to squeeze the software installation into a few hour window.
Not only does automating the installation process speed it up, it also reduces
the chance that an error will be made.
- Better Development Testing. The items above are important for the whole program,
but this one is especially critical to me as a software engineer. The cost of
test systems means that no software engineer gets as much time using one as
desired. At the same time, it’s possible to buy lots of regular servers for much
less than the cost of one full test system. Historically, we added a lot of
separate configuration to our systems in order to be able to test them in a
“development” environment. In addition to being extra work, this meant we were
testing in a different environment, missing some issues that only show up on
the real system. With tools like Vagrant, we can now use development servers to
build server and network environments that are much closer to production, and with
tools like Puppet and Ansible, we can use the exact same code to configure those
development systems that we use to install in production. The result is everyone
on the team getting a personal test system that is a close match to production.
If you’re in a business where it doesn’t seem like DevOps is a good solution,
hopefully one or more of these advantages sounds like something worth having.
If DevOps makes sense in cases where installation means carrying a DVD onto an
airplane, it probably makes sense lots of other places too.