From resilience to survivability: How AI forces a rethink of business continuity


Artificial intelligence is forcing companies to change almost every aspect of their business. From operations to hiring to sales and training, change is happening faster than ever. One aspect of this change that has flown under the radar is the need for companies to rethink their business continuity plans.

AI is pressuring enterprises to move beyond traditional ideas of resilience and toward architectures and operating models that assume continuous, systemic disruption and can keep the business running anyway. For information technology leaders, that means business continuity shifts from a documentation and disaster recovery exercise to an operating discipline.

Equinix Inc.’s recent announcement, “Resilience Isn’t Enough: The New Rules of Business Continuity,” argues that redundancy and failover are no longer sufficient as disruptions become systemic. The company highlights research indicating that Global 2000 firms now incur roughly $400 billion in downtime costs annually, with an average cost of about $540,000 per hour, underscoring how far the continuity problem now reaches across the business. I expect that as AI becomes more embedded in organizations and productivity grows, the cost of downtime will also increase.

The post defines “operational survivability” and introduces Zscaler Inc.’s Business Continuity Cloud, running on Equinix infrastructure, as an example of “architectural independence.” This fault-isolated, parallel environment has separate deployment pipelines, network paths, domains and routing, and is designed to remain operational when the primary stack cannot. It is positioned not as a cold backup or secondary region, but as a continuously operating, logically separate control and data plane that preserves zero-trust policies, user experience and compliance even when primary environments or teams are degraded.

Why AI changes the continuity conversation

The Equinix post calls AI a “force multiplier” for continuity risk. As enterprises scale AI from pilots to production, workloads become more distributed, latency-sensitive and deeply embedded in real-time operations. When AI services fail, organizations don’t just lose compute; they lose the decision systems that now drive logistics, fraud detection, customer experiences and revenue-critical processes.

Beyond that, several trends are converging:

  • AI workloads are highly interconnected. Model training and inference typically span multiple clouds, data stores and networks, increasing the likelihood of hidden shared dependencies.
  • AI raises the stakes for latency. Generative and analytical workloads increasingly sit in the transaction path, so degradation translates directly into user-visible impact, not just slower reports.
  • AI is reshaping the threat landscape. Adversaries are using AI to automate and scale attacks, accelerate the discovery of misconfigurations, and generate more convincing social engineering, increasing both the frequency and complexity of incidents IT must address.

In this environment, continuity and resilience need to be AI-aware in two directions: Protect AI as a critical dependency and use AI to build more adaptive continuity capabilities.

From resilience to architectural independence

Traditionally, resilience has meant building robust systems with improved redundancy, clustering, backup data centers and disaster recovery (DR) processes to restore service after an outage. That is necessary but not sufficient, because primary and backup environments often share invisible dependencies, such as cloud regions, identity providers, control planes or operations teams.

The “architectural independence” idea pushes continuity a step further:

  • Separate blast radii: Parallel environments are designed so that failures in one stack don’t automatically propagate to the other, using distinct infrastructure footprints, network paths and domains.
  • Independence at multiple layers: Though physical infrastructure is important, so are deployment pipelines, change windows, supporting systems and even operational teams. These can be decoupled to avoid common-mode failure.
  • Always-on posture: Instead of a standby environment waiting for failover, independent environments run concurrently, making cutover effectively transparent to users and endpoints and avoiding risky manual reconfiguration. This has obvious economic benefits over having a parallel system on continuous “standby.”

In practice, this means IT leaders need to look past legacy “N+1 in the same cloud” thinking and consider independence by provider, platform and even organizational control.

AI as both a risk and a resilience engine

AI is not just another workload to protect; it is also a tool for transforming how continuity is managed.

Risk factors

  • New dependencies: Cloud-hosted AI platforms, third-party models and external data feeds introduce fresh supply chain and concentration risk, particularly when multiple critical processes depend on the same provider.
  • Model and data integrity: Model hallucinations, corrupted training data or poisoning attacks can turn AI-driven decisions into a continuity risk of their own, especially in automated operations.
  • Regulatory uncertainty: Emerging AI regulations can force rapid operational changes, affecting which models and data can be used and where they can run.

Opportunities

  • Predictive continuity: AI systems can analyze telemetry and external signals, such as infrastructure metrics, weather, geopolitical events and supply chain data, to forecast disruptions before they hit.
  • Self-healing operations: Agentic AI can link anomaly detection directly to automated remediation, enabling infrastructure that can reconfigure, scale or isolate components autonomously.
  • Smarter testing: AI-driven chaos engineering and simulation let teams explore a much broader set of failure scenarios, including AI-specific ones, than manual tabletop exercises allow.

The implication is that continuity strategies that ignore AI, either as an asset or as a source of risk, are already obsolete.

Guidance for IT and operations leaders

For an IT audience that lives this every day, the question is how to turn these ideas into tangible next steps. Several lessons can be learned from both Equinix’s announcement and the broader industry work around AI-first resilience:

Map your AI-era blast radius

Architectural independence can’t be built if you don’t know where dependencies concentrate.

  • Inventory critical AI-enabled business services, including where models run, what data they consume and which clouds, colocation sites and networks they traverse.
  • Identify shared dependencies between “primary” and “backup” paths — identity providers, DNS, control planes, observability stacks, CI/CD pipelines and operations teams.

Use that map to pinpoint where a single misconfiguration, regional outage or vendor issue could take out both sides of your current DR design.
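
As a rough illustration of the mapping exercise, the Python sketch below treats each continuity path as a set of dependencies and reports what any two paths have in common. The path names and component labels (`idp-a`, `cicd-main` and so on) are hypothetical placeholders, not taken from Equinix or Zscaler; a real inventory would come from a CMDB or similar source.

```python
from itertools import combinations

# Hypothetical inventory: each continuity path mapped to the components it relies on.
PATHS = {
    "primary": {"idp-a", "dns-a", "cloud-x-region-1", "cicd-main", "ops-team-1"},
    "backup": {"idp-a", "dns-b", "cloud-x-region-2", "cicd-main", "ops-team-1"},
}


def shared_dependencies(paths):
    """Return, for each pair of paths, the dependencies they have in common."""
    shared = {}
    for (name_a, deps_a), (name_b, deps_b) in combinations(sorted(paths.items()), 2):
        common = deps_a & deps_b
        if common:
            shared[(name_a, name_b)] = common
    return shared


overlap = shared_dependencies(PATHS)
for pair, deps in overlap.items():
    # Each item printed here is a potential common-mode failure point.
    print(pair, sorted(deps))
```

In this made-up inventory the "backup" path still shares an identity provider, a CI/CD pipeline and an operations team with the primary, which is exactly the kind of hidden coupling the map is meant to surface.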

Design for independence, not just redundancy

Once you understand shared dependencies, refactor continuity architectures to prioritize independence.

  • Separate the control and data planes where feasible and consider using neutral interconnection infrastructure to decouple connectivity from any single cloud’s fate.
  • Where you rely heavily on a single security or connectivity provider, explore continuous parallel environments, similar in spirit to Zscaler’s Business Continuity Cloud, that run on distinct infrastructure and network paths.

This doesn’t mean duplicating everything; it means making deliberate choices about which layers must be independent for true survivability.

Make AI part of your continuity toolkit

AI should be as integral to your continuity strategy as backup and monitoring.

  • Build or adopt AI-driven anomaly detection across infrastructure, network, application and security telemetry to spot precursors to outages earlier.
  • Start with “human-in-the-loop” automation, letting AI recommend remediation actions and gradually moving to fully automated runbooks where risk is low and patterns are well understood.

The goal is to shorten the path from detection to action, while keeping humans firmly in charge of high-impact decisions.
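
One way to picture the human-in-the-loop pattern is a detector that flags anomalies and a gate that only auto-executes remediations marked as low risk. The sketch below is a simplified illustration that assumes a basic z-score check in place of a real AI model; the runbook names and the risk flags are hypothetical.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it is more than `threshold` standard deviations from the history mean."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    spread = stdev(history)
    if spread == 0:
        return latest != history[0]
    return abs(latest - mean(history)) / spread > threshold


# Hypothetical runbook catalogue: action name -> whether it is low-risk enough to automate.
RUNBOOKS = {
    "restart_inference_pod": True,
    "fail_over_region": False,
}


def decide(action, anomalous):
    """Return 'none', 'auto' or 'needs_approval' for a proposed remediation."""
    if not anomalous:
        return "none"
    # Unknown actions default to requiring a human decision.
    return "auto" if RUNBOOKS.get(action, False) else "needs_approval"


latency_ms = [102, 98, 101, 99, 100, 103, 97]
spike = is_anomalous(latency_ms, 180)
print(decide("restart_inference_pod", spike))
print(decide("fail_over_region", spike))
```

Moving an action from approval to automation then becomes a deliberate change to the catalogue rather than a side effect of the detector improving.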

Treat AI itself as a continuity risk domain

Business continuity professionals need to add AI to their impact analyses and tabletop exercises.

  • Include AI platform and model failures in business impact assessments: What happens if your primary model endpoints are unavailable for an hour, a day or a week?
  • Evaluate third-party AI providers through the same continuity and resilience lens you apply to core software-as-a-service and cloud services, including their own backup, failover and incident response capabilities.
  • Establish clear governance for the use of AI in continuity processes, including model validation, data quality checks and escalation paths when AI outputs conflict with expert judgment.

This is especially important as more operational decisions in areas such as security, logistics and IT operations are delegated to AI systems.
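
The hour, day and week question can be turned into a first-pass impact table. The sketch below assumes, purely for illustration, that impact grows linearly with outage duration; the service names and hourly figures are hypothetical placeholders, not data from the Equinix research.

```python
# Hypothetical services and the hourly business impact (in dollars) if their
# model endpoint is unavailable. Placeholder figures for illustration only.
HOURLY_IMPACT = {
    "fraud-scoring": 50_000,
    "support-chatbot": 5_000,
}

# Outage scenarios, expressed in hours.
OUTAGE_HOURS = {"hour": 1, "day": 24, "week": 24 * 7}


def impact_table(hourly_impact, outage_hours):
    """Estimate impact per service and scenario, assuming linear growth over time."""
    return {
        service: {label: rate * hours for label, hours in outage_hours.items()}
        for service, rate in hourly_impact.items()
    }


table = impact_table(HOURLY_IMPACT, OUTAGE_HOURS)
for service, row in table.items():
    print(service, row)
```

Real impact is rarely linear, since contractual penalties and customer churn tend to kick in at thresholds, so a table like this is a starting point for the tabletop discussion rather than an answer.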

Evolve the operating model for autonomous resilience

Finally, continuity in an AI-driven world is as much an operating model challenge as a technology one.

  • Build a unified observability backbone, so AI has the data it needs to reason across application, infrastructure, network and security domains.
  • Shift teams from manual incident response toward engineering autonomous guardrails and recovery behaviors, measuring success by mean time to detect, mitigate and learn, not just by traditional uptime metrics.
  • Embed continuity considerations into platform engineering and AI platform teams so resilience properties are designed in from the start, not bolted on later.
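
The time-based measures mentioned above can be computed from ordinary incident records. The sketch below is a minimal example with hypothetical incident data; it assumes time to mitigate is measured from detection rather than from the start of the fault, which is one of several conventions in use.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: when the fault started, was detected and was mitigated.
INCIDENTS = [
    {"start": "2025-01-10T08:00", "detected": "2025-01-10T08:10", "mitigated": "2025-01-10T08:40"},
    {"start": "2025-02-03T14:00", "detected": "2025-02-03T14:20", "mitigated": "2025-02-03T15:20"},
]


def minutes_between(earlier, later):
    """Elapsed minutes between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(later) - datetime.fromisoformat(earlier)
    return delta.total_seconds() / 60


def mean_times(incidents):
    """Mean time to detect (from start) and to mitigate (from detection), in minutes."""
    return {
        "mttd_minutes": mean(minutes_between(i["start"], i["detected"]) for i in incidents),
        "mttm_minutes": mean(minutes_between(i["detected"], i["mitigated"]) for i in incidents),
    }


metrics = mean_times(INCIDENTS)
print(metrics)
```

Tracking these figures per service over time shows whether automation is actually shortening the path from detection to action.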

Equinix’s emphasis on “operational survivability” captures the mindset shift: Assume disruption, assume AI as both dependency and tool, and engineer your environment so the business keeps running anyway.

Zeus Kerravala is a principal analyst at ZK Research, a division of Kerravala Consulting. He wrote this article for SiliconANGLE. 

Image: wal_172619/Pixabay
