AIOps Is Booming But ROI Remains Elusive


According to Forbes, the recent AWS US-EAST-1 data center failure knocked out platforms from Snapchat and Reddit to Fortnite and major financial apps for several hours, reigniting enterprise interest in AIOps for faster recovery. A 2025 New Relic report puts the median cost of a major outage at nearly $2 million per hour, making faster detection financially critical. While 87% of organizations say their AIOps investments have met or exceeded expectations, only 12% have achieved full enterprise-wide deployment due to data quality and integration challenges. Gaurav Toshniwal, CEO of Sherlocks.ai, explains that AIOps value comes from cutting alert noise and speeding fixes, while Riverbed’s global survey shows persistent barriers slowing broader adoption across IT operations.


The attribution problem

Here’s the thing about AIOps ROI: everyone wants it, but nobody can quite pin it down. When performance improves after rolling out AI tools, is it because of the AI itself? Or better data? Or improved workflows? Most companies can’t easily separate these factors. Toshniwal says his company tackles this by benchmarking mean time to detect (MTTD) and mean time to resolve (MTTR) before and after deployment, and by tracking what percentage of issues are automatically triaged or resolved through its recommendations.

But that level of precision is rare. Basically, we’re in that awkward phase where everyone’s buying the tools but few can definitively prove they’re working. And when you’re dealing with million-dollar-per-hour outage costs, that uncertainty becomes expensive fast.
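To make the before-and-after benchmarking described above concrete, here is a minimal Python sketch. It is not Sherlocks.ai's method: the Incident fields, the helper names, and the sample numbers are all invented for illustration, and it assumes both MTTD and MTTR are measured from the moment the fault began (some teams measure MTTR from detection instead). A comparison like this shows whether the numbers moved, but on its own it cannot say whether the AI, better data, or improved workflows moved them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    started: datetime            # when the fault began
    detected: datetime           # when monitoring or a person noticed it
    resolved: datetime           # when service was restored
    auto_triaged: bool = False   # True if tooling triaged or resolved it


def minutes(delta: timedelta) -> float:
    return delta.total_seconds() / 60


def summarize(incidents: list[Incident]) -> dict[str, float]:
    """MTTD and MTTR in minutes, plus the share of auto-triaged incidents."""
    return {
        "mttd_min": mean(minutes(i.detected - i.started) for i in incidents),
        "mttr_min": mean(minutes(i.resolved - i.started) for i in incidents),
        "auto_triage_rate": sum(i.auto_triaged for i in incidents) / len(incidents),
    }


def percent_change(before: float, after: float) -> float:
    """Negative values mean the metric got smaller (faster) after rollout."""
    return 100 * (after - before) / before


def incident(detect_min: int, resolve_min: int, auto: bool = False) -> Incident:
    """Build a made-up incident from minute offsets (illustration only)."""
    t0 = datetime(2025, 1, 1, 12, 0)
    return Incident(t0, t0 + timedelta(minutes=detect_min),
                    t0 + timedelta(minutes=resolve_min), auto)


# Invented sample data, not real measurements.
before = summarize([incident(20, 120), incident(40, 180)])
after = summarize([incident(5, 60, auto=True), incident(15, 90)])

for key in ("mttd_min", "mttr_min"):
    print(key, round(percent_change(before[key], after[key]), 1))
print("auto_triage_rate", after["auto_triage_rate"])
```

In practice the incident records would come from a ticketing or paging system, and the two windows would need to be long enough that a single unusual outage does not dominate the averages.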

Startups vs enterprises

The ROI story changes completely depending on company size. Startups deploying rapidly with small teams see value faster—automation lets them maintain reliability without adding heavy operational layers. But large enterprises? That’s a different ballgame.

Legacy systems, overlapping vendor tools, and dependence on a few key engineers make everything harder to measure. Toshniwal notes the real enterprise value comes from turning implicit knowledge into explicit, reusable intelligence. When your star engineer leaves, does their troubleshooting expertise go with them? AIOps at scale becomes less about automation and more about preserving institutional knowledge.

The accountability push

After the AWS outage, even major financial institutions started rethinking how they track performance and risk. Christer Holloman noted in Forbes that multi-cloud strategies are gaining traction as risk mitigation. The message is clear: with downtime costs climbing, executives want evidence their tech investments actually add business value.

Toshniwal thinks we need a “reliability scorecard” that tracks detection speed, fix times, how often updates cause failures, and avoided downtime. Consistent benchmarks would bring much-needed transparency to the AIOps market. Because let’s be honest—without clear results, this whole category risks being dismissed as vendor hype.
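One way to picture such a scorecard is as a small record with the four measures Toshniwal lists. The sketch below is only an assumption about what it might contain: the ReliabilityScorecard class, its field names, and the sample values are hypothetical, and the cost-per-hour figure is passed in as a parameter rather than taken from any real measurement.

```python
from dataclasses import dataclass


@dataclass
class ReliabilityScorecard:
    mttd_minutes: float             # detection speed
    mttr_minutes: float             # fix time
    deployments: int                # updates shipped in the period
    failed_deployments: int         # updates that caused a failure
    avoided_downtime_hours: float   # an estimate; hard to measure directly

    @property
    def change_failure_rate(self) -> float:
        """Share of updates that caused a failure (0.0 if nothing shipped)."""
        if self.deployments == 0:
            return 0.0
        return self.failed_deployments / self.deployments

    def avoided_cost(self, cost_per_hour: float) -> float:
        """Estimated savings from downtime that did not happen."""
        return self.avoided_downtime_hours * cost_per_hour


# Invented values for one reporting period.
card = ReliabilityScorecard(
    mttd_minutes=10,
    mttr_minutes=75,
    deployments=200,
    failed_deployments=12,
    avoided_downtime_hours=1.5,
)

print(f"change failure rate: {card.change_failure_rate:.1%}")
print(f"avoided cost: ${card.avoided_cost(cost_per_hour=2_000_000):,.0f}")
```

The avoided-downtime field is the weakest link: it is a counterfactual estimate, so any dollar figure derived from it inherits that uncertainty.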

The next frontier

We’re at a turning point for AIOps. The experimentation phase is giving way to disciplined proof-seeking. As Alois Reitbauer, chief technology strategist at Dynatrace, noted, observability is shifting from reporting telemetry about application health to informing the decisions that run the business.

The next decade won’t be about seeing systems more clearly—we’ve mostly solved that. It’ll be about understanding them deeply enough to predict and prevent failures entirely. Reliability becomes the clearest sign that data, not guesswork, runs your operations. And in a world where one data center failure can take down half the internet, that’s not just nice to have—it’s business-critical.
