The DevOps movement has recently been presented as a way of introducing “anti-fragility” into IT operations (http://gigaom.com/2013/04/21/great-devops-anti-fragility-and-complexity-resources/). Next generation of cloud-native systems, e.g. Netflix, are also declared as “anti-fragile” (http://perfcap.blogspot.co.il/2013/01/looking-back-at-2012-with-pointers-to.html).
The concept of anti-fragility was coined by Nassim Taleb in his book carrying the same name, and literally means a property directly opposite to fragility. While fragile objects, e.g. Venetian glass, can easily brake and have to be treated with care, anti-fragile objects and systems virtually flourish from mishandling – not only do they come back to where they were before the shock, but reach a new level of strength, performance or whatever properties are vital for their survival. Under normal conditions anti-fragility is not observed in mechanical objects. It is observable, though up to a limit, in more complex structures such as living organisms, eco-systems, society, politics, and business.
For DevOps and Continuous Delivery the idea of anti-fragility looks appealing: by automating manual processes and delivering new functionality in small pieces we expose system to a larger number of smaller shocks, each one with controlled and predictable damage from potential failure. By applying relentlessly root cause analysis we eliminate fragile elements from the system ensuring that the same problem will never happen again. However just eliminating fragility does not automatically guarantee introduction of anti-fragility. The former merely means making the system more robust. Where the big, albeit rare, potential payoffs come from? Also, by introducing more and more automation and protective measures against failures we encountered in the past how could we ensure we do not introduce fragility backdoor in the longer term? The problem is exacerbated by the fact that failures in complex systems too often do not have a single “root cause”. How could we ensure we do not introduce a cure which is worse than original decease? Blindly copying externally visible practices of successful companies without deep understanding of underlying mechanisms and proper analysis of particular context is seldom a good idea.
In this presentation Asher will be taking a deeper look at the nature of Complex Adaptive Systems, Highly-Optimized Tolerance, asymmetric pay-off structures, and a concept of barbell (introduced in the Taleb’s book). Then I will suggest a way to connect these abstract concepts to more concrete ideas of Technology Development Lifecycle, Disruptive Innovation, Product Development Flow and Lean Start Up. That would in turn allow positioning DevOps and Continuous Delivery practices more accurately in the context of a larger body of knowledge hopefully obtaining some deeper insight into potential gains and pitfalls.
Asher Sterkin, Cisco Systems
Distinguished Engineer at Cisco Video Systems, Israel (former NDS Technologies, the last position was VP Technologies, totally 18 years tenure). Member of a New Initiatives Group within the Office of CTO of Service Provider Video Technology Group (SPVTG). Areas of specialization: Software Architecture, Domain Driven Design, Software Development Process, Technology Radar (most interesting findings are published at https://plus.google.com/100923793849881426374/posts/p/pub). Evangelizing adoption of DevOps and Cloud Technologies since 2011 (at that time as a member of NDS Chief Technology Advisor Group). More detailed profile could be found here.