Wednesday, August 13, 2014

Economies of Scale

Let's walk through the typical software development cycle for a new product.  Needs are identified, a team is put together, functionality is built, and the product is deployed.  As business needs evolve, new features are developed, existing features are changed, and the system generally grows in size.  As the system grows, testing becomes a bottleneck for most companies.  In fact, companies that let this go on for too long end up with testing cycles that dwarf the development portion of the delivery process.  Why?  Testing needs to cover new features as well as existing ones, so the testing effort is theoretically equal to all past testing efforts plus the effort required to test whatever is new.  Done manually, this obviously becomes cumbersome quickly.
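To put some toy numbers on that (the 5-person-day figure is made up purely for illustration, the shape of the curve is the point):

# Toy model of manual regression cost: every cycle you re-test
# everything shipped so far plus the new work.
EFFORT_PER_RELEASE = 5  # hypothetical person-days of testing per release's new features

cumulative = 0
for release in range(1, 11):
    cycle_effort = EFFORT_PER_RELEASE * release
    cumulative += cycle_effort
    print(f"Release {release:2}: {cycle_effort:3} person-days this cycle, {cumulative:3} total so far")

By release 10, a single testing cycle costs 50 person-days, ten times what the first one did, and the cumulative bill is 275.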

Enter test automation.  The argument quickly becomes clear to organizations: more of the testing process must be automated.  It's a process that is repeated over and over again, so why would a company pay a tester to perform the exact same task every cycle?  The logic is straightforward.  It seems to me that this is why test automation practices have gained a much larger foothold in companies over the last several years.
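To make that concrete, here's a minimal sketch of what the automated version of one of those repeated tasks might look like.  The discount function and its business rule are hypothetical stand-ins for whatever your testers check by hand each cycle:

import unittest

def discount(order_total):
    """Hypothetical business rule: 10% off orders of $100 or more."""
    return order_total * 0.9 if order_total >= 100 else order_total

class DiscountRegressionTests(unittest.TestCase):
    # The same checks a tester would repeat manually every cycle,
    # now executed identically in milliseconds on every run.
    def test_no_discount_below_threshold(self):
        self.assertEqual(discount(99), 99)

    def test_discount_applied_at_threshold(self):
        self.assertEqual(discount(100), 90.0)

if __name__ == "__main__":
    unittest.main()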

The Infrastructure as Code movement is much newer, however, and it seems to me that the argument for test automation (above) is not as applicable in the ops space.  Consider the same product development cycle described above, but from an ops perspective.  It might go something like this: needs are identified, teams request new infrastructure to host the new solution(s), the infrastructure is provisioned, and the new resources are folded into existing support processes.

"Infrastructure is provisioned".  Once.  Granted, in the case of Amazon, Netflix, Facebook, etc, this wouldn't hold true.  Everything they build is scaled massively.  That's not the case for a lot of companies, especially those in the Information Technology Dark Side.  Infrastructure is provisioned once and that's it.*  There is no economy of scale.

Now there is still an argument; it's just one that I think is harder to make and back up with concrete businessy goodness.  Testing?  Manual regression testing takes one month; our automation suite will take four minutes.  Boom.  Infrastructure?  Welllll, the automation is a repeatable, reliable process that will significantly improve our confidence in the provisioning and subsequent change process.  "That's all well and good, but you're telling me we need to write all this... what did you call it... infrastructure codeage?  That seems like a lot of work to stand up these 2 servers.  We'll just hammer them out."

I wholeheartedly believe that the real value in infrastructure as code + automation is the repeatable, reliable process that results in consistency.  The extremely beneficial by-product is that the result can scale across as many nodes as you want.  To me this is akin to the TDD argument: a lot of very smart people have pointed out that the real value in TDD is the improved code design, and the fact that the resulting tests form a regression suite is just icing on the cake.  But neither of those real reasons is an easy argument to make to management.
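To make the consistency point concrete, here's a minimal sketch of the idempotent, desired-state idea behind the tooling.  Everything here (the package list, the dpkg/apt commands assuming a Debian-style host) is a stand-in for what a real tool like Puppet, Chef, or Ansible would do for you:

import subprocess

# Desired state captured as data; the package list is hypothetical.
DESIRED_PACKAGES = ["nginx", "ntp"]

def ensure_package(name):
    """Install a package only if it's missing (idempotent on Debian-style hosts)."""
    installed = subprocess.run(
        ["dpkg", "-s", name],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL,
    ).returncode == 0
    if not installed:
        subprocess.run(["apt-get", "install", "-y", name], check=True)

def converge():
    # Running this once or a hundred times yields the same machine.
    # That sameness is the consistency argued for above.
    for package in DESIRED_PACKAGES:
        ensure_package(package)

if __name__ == "__main__":
    converge()

Run it against two servers or two hundred and you get the same result, which is exactly where the scaling by-product comes from.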

Cost-focused organizations and managers want economies of scale because they affect the bottom line.  It seems to me that test automation is the easier argument to make.  What am I missing?  What's the business case for infrastructure as code and automation?  How do you frame it up in a way that connects to concrete business value?



* You could certainly make the argument that subsequent changes to the infrastructure should be vetted through an automated test suite, similar to the process I described for application code.  That's fair, and I'm sure people do it.  That just feels even less tangible to me right now.
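If you did go that route, the tests might be as small as a post-change smoke check.  A sketch, with a hypothetical host name:

import socket
import unittest

class WebServerSmokeTest(unittest.TestCase):
    # Hypothetical post-change check: after any infrastructure change,
    # the web server should still answer on port 80.
    def test_port_80_is_listening(self):
        with socket.create_connection(("web-01.example.internal", 80), timeout=5):
            pass  # successfully connecting is the assertion

if __name__ == "__main__":
    unittest.main()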
