I have a fondness for philosophy. I’m about three classes short of a degree, and every few years I tell myself one day I’ll finish it. I’ve passed my fondness on to my oldest, who did get a degree in philosophy to complement his degrees in computer science and data science. Yes, our text conversations are often quite interesting, thanks for asking.
Thus, I am very familiar with what is known in statistics—and logic—as a post hoc fallacy, from which we get the saying “correlation is not causation.” This is the logical error of assuming that if event Y followed event X, event Y must have been caused by event X. The most famous callout of this fallacy came from Bobby Henderson, who illustrated the absurdity of assuming causation from correlation with his chart demonstrating that global warming was caused by the declining number of pirates in the world.
Yeah, that doesn’t make sense, but neither do a lot of charts that people draw causality from. Just because two data points are mapped against each other does not mean one caused the other. In many cases, it doesn’t even make logical sense to correlate the two. After all, pirates and global warming? No one actually takes that seriously.
But it’s an important point to make as we dive into the question of the relationship between SRE operations and cloud repatriation.
To be clear, I’m not suggesting that adopting SRE practices causes cloud repatriation. But I am suggesting that there is a close and meaningful relationship between the two. The fact that Google—a cloud provider—created SRE as a practice is not a mistake. The model, mindset, and skillsets associated with SRE are integral to successfully operating cloud infrastructure and services.
Public cloud repatriation itself is a somewhat taboo topic in certain circles. Consider the controversy raised by Andreessen Horowitz when it published “The Cost of Cloud, a Trillion Dollar Paradox” and suggested companies were repatriating from the cloud and realizing significant cost savings as a result. Some would have you believe it’s not happening, but there’s enough data and anecdotal evidence to indicate that yes, it is.
For our 2021 report we asked the market about public cloud repatriation. A mere 13% had repatriated apps and another 14% were planning to. One year later, those figures were 37% and 30%, respectively, lifting the combined total 40 percentage points, from 27% to 67%. This is not an anomaly, as there are multiple credible analyst firms reporting similar results. Interestingly, the rate of repatriation is not uniform across regions. APCJ and LATAM are both far less likely to repatriate than EMEA and NA.
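For anyone who wants to check my math, here is a minimal Python sketch of that year-over-year comparison. It uses only the four survey percentages quoted above. The variable and function names are mine, and it assumes the "repatriated" and "planning to" groups do not overlap, so their shares can simply be added.

```python
# Survey shares quoted above, in percent of respondents.
# Assumption: "repatriated" and "planning to" are non-overlapping groups.
repatriated = {2021: 13, 2022: 37}
planning = {2021: 14, 2022: 30}


def combined(year):
    """Share that had repatriated or planned to, in percent."""
    return repatriated[year] + planning[year]


# Difference between two percentages is measured in percentage points.
point_change = combined(2022) - combined(2021)

# Growth relative to the starting share is a different number entirely.
relative_change = point_change / combined(2021) * 100

print(f"combined 2021: {combined(2021)}%")
print(f"combined 2022: {combined(2022)}%")
print(f"change: {point_change} points ({relative_change:.0f}% relative growth)")
```

The distinction matters: a 40-point jump is not a 40% increase. Measured against the starting share, the combined total more than doubled.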
I maintain that companies are repatriating apps from the public cloud and the question isn’t ‘are they?’ but rather ‘how many workloads are they pulling out—and where are they going?’ That’s a question we’ll try to answer next year when we complete our State of Application Strategy 2023 research.
For now, we’ve been digging into a possible enabler of repatriation—SRE operations. Because even if the increasing cost of cloud is a driver of the desire to repatriate, if you don’t have the skills to operate as efficiently elsewhere—and thus benefit from lower cost—then why would you repatriate?
And we posit that it is SRE operational practices and skills that enable companies to repatriate and maintain the efficiency and cost savings needed to justify the decision—whether they’re moving those workloads to another public cloud, on-premises, or to the edge.
On the surface, there is a strong correlation between the adoption and application of SRE practices and cloud repatriation. It seems to indicate that organizations able to operate in a cloud-like manner (that is, those that have adopted SRE practices) effectively pick up their toys (apps) and go home (on-premises or elsewhere) because they can.
Cut another way, only 4% of organizations that have not adopted SRE practices have repatriated apps from the public cloud. A whopping 73% of those who have adopted SRE practices have also repatriated apps.
Of course, adopting practices does not necessarily mean applying practices. So, we looked at how organizations are actually operating applications, systems, and infrastructure. Specifically, we looked at the percentage of their operations that use SRE practices. Perhaps unsurprisingly, that generated similar results.
Of those who operate 0% of their apps, systems, and infrastructure using SRE practices, 81% are not repatriating. Conversely, of those who use SRE practices for 76%–99% of apps, systems, and infrastructure operations, 54% have repatriated. Repatriation appears to begin picking up steam once organizations use SRE practices to operate more than one-quarter (25%) of their apps, systems, and infrastructure.
Remember I noted that APCJ and LATAM were far less likely to repatriate? They’re also far less likely to be leveraging SRE practices to operate their apps, systems, and infrastructure. In fact, over one-quarter of organizations in LATAM (26%) and APCJ (29%) were operating ZERO percent of apps, systems, and infrastructure using SRE practices. In EMEA? That’s only 5%. And in NA, even lower at 2%.
There appears to be an inarguable correlation between organizations embracing SRE as an operational practice and public cloud repatriation rates. But is it a meaningful relationship or merely a curious coincidence?
I’m going to argue, because this is my blog, that it’s a meaningful relationship.
The practices and skillsets associated with SRE are wholly suited to operating a cloudy environment—at scale. As I said before, it’s no mistake that it was Google who created SRE and has literally written the book on it. And I’ve said before (and I’ll say it again)—the value of cloud is in its operational model, which can dramatically lower the cost per transaction—whether measured by HTTP exchanges or customer sessions. That enables cost-efficient scale of applications and digital services.
The use of automation and practices that tend to focus on meaningful incidents rather than non-disruptive occurrences provides cost-efficient scale of the people (and thus their expertise) who are tasked with maintaining a high level of availability and performance.
The adoption and use of SRE practices enable organizations to efficiently scale operations, whether in the public cloud, on-premises, or at the edge. And what the data tells us is that organizations appear to be using that capability to do just that.
To learn more about modernizing architecture—and adopting SRE operations—to serve a digital business, you can dive into our new O’Reilly book, Enterprise Architecture for Digital Business.