Whether you’re a TDD zealot, or you just occasionally write a quick script to reproduce some bug, it’s a rare coder who doesn’t see value in some sort of automated testing. Yet, somehow, in all of the new-age “Infrastructure as Code” mania, we appear to have forgotten this, and the tools that are commonly used for implementing “Infrastructure as Code” have absolutely woeful support for testing your infrastructure code. I believe this has to change.
At present, the state of the art in testing system automation code appears to be, “spin up a test system, run the manifest/state/whatever, and then use something like serverspec or testinfra to SSH in and make sure everything looks OK”. It’s automated, at least, but it isn’t exactly a quick process. Many people don’t even apply that degree of rigour to their configuration management code, and rely on manual testing, or even just “doing it live!”, to shake out the bugs they’ve introduced.
Speed in testing is essential. As the edit-build-test-debug cycle gets longer, frustration grows exponentially. If it takes two minutes to get a “something went wrong” report out of my tests, I’m not going to run them very often. If I’m not running my tests very often, then I’m not likely to write tests much, and suddenly… oops. Everything’s on fire. In “traditional” software development, the unit tests are the backbone of the “fast feedback” cycle. You’re running thousands of tests per second, ideally, and an entire test run might take 5-10 seconds. That’s the sweet spot, where you can code and test in a rapid cycle of ever-increasing awesomeness.
Interestingly, when I ask the users of most infrastructure management systems about unit testing, they either get a blank look on their face, or, at best, point me in the direction of the likes of Test Kitchen, which is described quite clearly as an integration platform, not a unit testing platform.
Puppet has rspec-puppet, which is a pretty solid unit testing framework for Puppet manifests – although it isn’t widely used. Others, though… nobody seems to have any ideas. The “blank look” is near-universal.
If “infrastructure developers” want to be taken seriously, we need to learn a little about what’s involved in the “development” part of the title we’ve bestowed upon ourselves. This means knowing what the various types of testing are, and having tools which enable and encourage that testing. It also means things like release management, documentation, modularity and reusability, and considering backwards compatibility.
All of this needs to apply to everything that is within the remit of the infrastructure developer. You don’t get to hand-wave any of this away just because “it’s just configuration!” This goes double when your “just configuration!” is a hundred lines of YAML interspersed with templating directives (SaltStack, I’m looking at you).
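To make “unit testing your configuration” a little more concrete, here is a minimal sketch in Python of the kind of test I have in mind. It isn’t taken from any real tool: the `render_sshd_config` function, its parameters, and the template are all hypothetical stand-ins for whatever your system uses to turn data into config. The point is only that rendering is a pure function, so it can be checked in milliseconds without spinning up a machine or SSHing anywhere.

```python
from string import Template

# Hypothetical template, standing in for a templated state or manifest file.
SSHD_TEMPLATE = Template(
    "Port $port\n"
    "PermitRootLogin $permit_root_login\n"
    "PasswordAuthentication $password_auth\n"
)


def render_sshd_config(port=22, allow_root=False, allow_passwords=False):
    """Render sshd_config text from plain data. Pure function, no I/O."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return SSHD_TEMPLATE.substitute(
        port=port,
        permit_root_login="yes" if allow_root else "no",
        password_auth="yes" if allow_passwords else "no",
    )


# Plain test functions: a runner such as pytest would collect these by name,
# but they can also be called directly.
def test_defaults_are_locked_down():
    lines = render_sshd_config().splitlines()
    assert "PermitRootLogin no" in lines
    assert "PasswordAuthentication no" in lines


def test_custom_port():
    assert "Port 2222" in render_sshd_config(port=2222).splitlines()


def test_rejects_invalid_port():
    try:
        render_sshd_config(port=70000)
    except ValueError:
        return
    raise AssertionError("expected ValueError for an out-of-range port")
```

Tests like these say nothing about whether sshd actually starts with that config, so the slow spin-up-and-SSH checks still have a job to do. They just stop being the only feedback you get.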