What is Cold Redundancy?

Published in Disaster Recovery Strategy 3 mins read

Cold redundancy is a disaster recovery strategy in which a backup system or machine is available but does not continuously mirror or back up data from the primary system. If the primary system fails, any data created or modified since the last manual or scheduled backup is lost. The redundant machine exists, but it is not kept synchronized with the active system.

Key Characteristics of Cold Redundancy

  • No Continuous Data Backup: The most defining characteristic is the absence of real-time or near real-time data synchronization between the primary and the backup system.
  • Backup Machine Exists: There is a separate machine, typically kept offline or dormant, that is intended to take over if the primary fails.
  • Data Loss Upon Failure: Because there is no continuous backup, data created or changed after the last manual or scheduled backup is lost when the primary system fails.
  • Manual Activation: The switch-over to the redundant system typically requires manual intervention to bring it online and restore data from the last available backup.
  • Lower Cost: Compared to more robust redundancy methods, cold redundancy generally involves lower operational costs and less complex setup.
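The data-loss characteristic in the list above comes down to simple arithmetic: whatever changed between the last backup and the moment of failure is gone. The sketch below illustrates this under assumed conditions. The function names and timestamps are hypothetical and are not taken from any particular backup tool.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Return the span of work lost when the primary fails.

    With cold redundancy, everything written after the last manual or
    scheduled backup is unrecoverable.
    """
    if failure < last_backup:
        raise ValueError("failure time precedes the last backup")
    return failure - last_backup


def worst_case_loss(backup_interval: timedelta) -> timedelta:
    """Worst case: the primary fails just before the next scheduled backup."""
    return backup_interval


# Illustrative values: a nightly backup at 02:00 and a failure at 17:30.
lost = data_loss_window(
    last_backup=datetime(2024, 1, 1, 2, 0),
    failure=datetime(2024, 1, 1, 17, 30),
)
print(lost)
```

Shortening the backup interval narrows the window, but it never reaches zero without some form of continuous synchronization.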

Where is Cold Redundancy Used?

Cold redundancy is often employed for systems or data where:

  • Data Change Rate is Low: Applications with infrequent data updates or where some data loss is acceptable.
  • Recovery Time Objective (RTO) is Flexible: The business can tolerate a longer downtime period because bringing the cold backup online and restoring data takes time.
  • Cost is a Primary Concern: It provides a basic level of resilience without the high investment required for continuous synchronization.
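One rough way to check the criteria above is to compare what a cold setup can deliver against what the business tolerates. The sketch below is a simplified heuristic, not a formal sizing method. It also assumes a Recovery Point Objective (RPO), the maximum tolerable data loss, which this article does not name explicitly but which pairs with RTO in standard disaster recovery planning. All names and figures are hypothetical.

```python
from datetime import timedelta


def cold_redundancy_fits(
    backup_interval: timedelta,
    restore_time: timedelta,
    rpo: timedelta,
    rto: timedelta,
) -> bool:
    """Return True if a cold setup can meet both objectives.

    backup_interval: time between manual or scheduled backups
                     (the worst-case data loss).
    restore_time:    estimated time to activate the backup machine
                     and restore the last backup.
    rpo:             maximum tolerable data loss (assumed objective).
    rto:             maximum tolerable downtime.
    """
    return backup_interval <= rpo and restore_time <= rto


# Illustrative: nightly backups, a four-hour restore, lenient objectives.
print(cold_redundancy_fits(
    backup_interval=timedelta(hours=24),
    restore_time=timedelta(hours=4),
    rpo=timedelta(hours=24),
    rto=timedelta(hours=8),
))
```

If either comparison fails, the system is likely a candidate for a more continuously synchronized approach instead.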

A practical example of cold redundancy is the protection of cluster management information. The cluster may handle its workload data in a more resilient way, while its own configuration and management data is not continuously replicated. If the primary management node fails, that configuration is restored on a backup node from the last saved point, and any changes made after that point are lost.
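The management-node scenario can be sketched in a few lines. This is a minimal illustration under assumed conditions, not how any specific cluster manager works: the class names, the JSON snapshot file, and the "nodes" setting are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


class PrimaryNode:
    """Hypothetical primary management node holding cluster configuration."""

    def __init__(self) -> None:
        self.config: dict = {}

    def set(self, key: str, value) -> None:
        self.config[key] = value

    def save_snapshot(self, path: Path) -> None:
        # A manual or scheduled save; nothing is replicated between saves.
        path.write_text(json.dumps(self.config))


class ColdStandbyNode:
    """Hypothetical backup node that stays dormant until activated."""

    def __init__(self) -> None:
        self.config: dict = {}
        self.online = False

    def activate(self, snapshot_path: Path) -> None:
        # Manual step: restore the last saved configuration, then go online.
        self.config = json.loads(snapshot_path.read_text())
        self.online = True


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = Path(tmp) / "cluster_config.json"

        primary = PrimaryNode()
        primary.set("nodes", 3)
        primary.save_snapshot(snapshot)
        primary.set("nodes", 5)  # changed after the snapshot, never saved

        # The primary fails here; an operator brings the standby online.
        standby = ColdStandbyNode()
        standby.activate(snapshot)
        return standby.config


print(demo())  # {'nodes': 3}
```

The standby comes up with the configuration as of the last snapshot, so the later change to five nodes is missing. This is the data-loss trade-off that cold redundancy accepts.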

Cold vs. Hot Redundancy

To better understand cold redundancy, it's helpful to contrast it with hot redundancy, which represents the opposite end of the spectrum in terms of data synchronization and availability.

| Feature | Cold Redundancy | Hot Redundancy |
|---|---|---|
| Data Synchronization | No continuous backing up of data. Data is not synchronized in real-time. | Continuous, real-time data synchronization. Data is mirrored constantly. |
| Backup System State | Offline or dormant; needs activation and data restoration. | Online and active or standby; ready for immediate failover. |
| Data Loss on Failure | Significant data loss occurs (data since last non-continuous backup). | Minimal to no data loss (only data in transit at moment of failure). |
| Recovery Time Objective | Longer (minutes to hours/days); requires manual intervention and data restoration. | Shorter (seconds to minutes); often automatic failover. |
| Complexity & Cost | Lower complexity, lower cost. | Higher complexity, higher cost (hardware, software, network bandwidth). |
| Ideal Use Cases | Less critical data, archival systems, cluster management information. | Mission-critical applications, high-transaction databases, high-availability services. |

Cold redundancy provides a foundational layer of protection against hardware failures but comes with the inherent risk of data loss and longer recovery times. It's a strategic choice for specific types of data or systems where these trade-offs are acceptable.