Recovering from Ransomware Attacks: The Magic of an Immutable Backup Architecture

Ransomware has been a hot topic in cybersecurity for years, with many stories of organizations that can no longer access their business-critical data after attackers have encrypted production files and storage devices. While cybersecurity teams have invested in a myriad of protection tools, extortionists continue to find new mechanisms to encrypt organizations’ data.

Backups are one of the most – if not the most – important defences against ransomware. But advanced ransomware is now targeting backups – modifying or completely wiping them out – eliminating that last line of defence and driving large ransom payouts.

Rubrik’s immutable filesystem natively prevents unauthorized access to or deletion of backups, allowing IT teams to quickly restore to the most recent clean state with minimal business disruption. This blog walks you through Rubrik’s immutable architecture and the security controls that can harden your data against cyberattacks.

The Effects of Ransomware

Ransomware is designed to encrypt your data so that it is no longer usable. Often, this means encrypting data held on primary storage to overwhelm IT and force massive recovery efforts from tape or other archives. Additionally, lower-level encryption of the Master Boot Record (MBR) or other operating-system-level encryption is used to prevent booting and other common operations. For virtualized environments, the shared storage that hosts virtual machines, such as NFS-backed datastores, is a primary target. This can effectively bring down critical services in an organization. The attackers then demand a ransom to unlock the data so that services can be resumed.

How Rubrik Has Helped Customers

Several Rubrik customers have survived a ransomware attack by using Rubrik’s immutable backups and instant recoveries as part of a defence-in-depth strategy.

For example, the City of Durham in North Carolina detected a ransomware attack on Friday, March 6, and its leaders credited their quick response to Rubrik’s backup solution. Durham Mayor Steve Schewell said, “The city can be assured that our backups are very good because they’re immutable. [This means that] they could not be consumed by ransomware.”

As a result, they were able to quickly restore critical city services, including access to 911. In addition, Kerry Goode, Durham CIO, emphasized that core business systems, including ones that manage payroll, were back online by the start of the business week. 

Kern Medical Center discovered a large ransomware attack in June 2019 when users reported they couldn’t access their systems. They were able to recover 100% of the impacted systems protected by Rubrik within minutes, including recovering their business-critical electronic medical record system. CTO Craig Witmer said, “After the incident, we were so impressed that we moved more of our legacy systems to Rubrik and are fully confident that Rubrik’s immutable backups will protect us from future incidents.”

How Do You Recover from a Ransomware Attack?

Data backups can be an effective way to restore data that has been locked/encrypted by the attack. However, what if your backup data is also encrypted or deleted by a ransomware attack? How do you ensure that your backup data is not vulnerable to these attacks? 

The Key Is Immutable Backups

While primary storage systems need to be open and available for client systems, your backup data should be immutable. This means that once data has been written it cannot be read, modified or deleted by clients on your network. Immutability can help to ensure recovery when production systems are compromised.

This goes well beyond simple file permissions, folder ACLs or storage protocols. The concept of immutability needs to be baked into the backup architecture so that no security exposure can tamper with the backups. 

How Rubrik Is Designed for Immutability

Rubrik uses an immutable architecture by combining an immutable filesystem with a zero trust cluster design in which operations can only be performed through authenticated APIs.

Rubrik’s approach is in contrast with general purpose storage that uses standard protocols such as NFS or SMB to advertise their availability to a wide assortment of clients. Data management solutions using general purpose storage can potentially have limited or ineffective means for securely transacting data and, in some cases, leave files in their native format while allowing clients to read the backup data directly. This puts an extra burden on the customer to secure the storage independent of their data management solution.

An Immutable Distributed Filesystem

Rubrik constructed Atlas, an immutable Filesystem in Userspace (FUSE) that is largely POSIX-compliant. This provides tight controls over which applications can exchange information, how each data exchange is transacted and how data is arranged across physical and logical devices. Atlas is custom designed to be a distributed and immutable file system for writing and reading data for other Rubrik services.

Immutability is provided across two layers: the logical layer (Patch Files, Patch Blocks) and the physical layer (Stripes, Chunks). The dynamics between these two layers will be explained further in the next few sections.

The Logical Layer

All customer data brought into the system is written into a proprietary sparse file called a Patch File. These are append-only files (AOFs), meaning that your data can only be added to the Patch File while it is marked as being open. All of the customer snapshot and journal data is held within Atlas, which enforces the use of Patch Files in the underlying directory structure. This powerful filesystem will refuse writes at the API level that are not append-only, such as situations in which the write offset value does not equal the file size.
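
The append-only rule described above can be illustrated with a minimal in-memory sketch. This is not Atlas code: the class name `AppendOnlyFile`, its methods and the `AppendOnlyError` exception are hypothetical names chosen for illustration. The only behaviour taken from the description is that a write is refused when the file is no longer open or when the write offset does not equal the current file size.

```python
class AppendOnlyError(Exception):
    """Raised when a write would modify existing data or target a closed file."""


class AppendOnlyFile:
    """In-memory stand-in for an append-only file (hypothetical, not Atlas code)."""

    def __init__(self):
        self._data = bytearray()
        self._open = True

    @property
    def size(self):
        return len(self._data)

    def write(self, offset, payload):
        # Writes are only accepted while the file is marked as open.
        if not self._open:
            raise AppendOnlyError("file is closed; no further writes allowed")
        # Append-only rule: a write must start exactly at the current end of file.
        if offset != self.size:
            raise AppendOnlyError(
                f"write offset {offset} does not equal file size {self.size}"
            )
        self._data.extend(payload)
        return self.size

    def close(self):
        # After closing, the contents can be read but never extended or changed.
        self._open = False

    def read(self, offset, length):
        return bytes(self._data[offset:offset + length])
```

In this sketch, an attempt to overwrite bytes at offset 0 of a non-empty file raises `AppendOnlyError`, which is the kind of in-place modification ransomware relies on.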

Atlas has total control over how and where customer data is written.

If your backup data has been modified, then it’s essentially worthless. Rubrik addresses this by generating a checksum for each Patch Block within a Patch File. These checksums are written to a Fingerprint File stored alongside the Patch File. Rubrik always performs a fingerprint check before committing any data transformations, and validation is forced during read operations, so any alteration of the original data is detected before it is used.

To counter a ransomware attack, the original, validated data must be restored from backup. Rubrik routinely verifies the Patch Blocks against their checksums to ensure data integrity at the logical Patch Block level. Patch Files are not exposed to any external systems or customer administrator accounts. Together, these measures ensure that what is restored is exactly what you originally stored in the backup.
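
As a rough illustration of per-block fingerprinting, the sketch below computes one checksum per fixed-size block and refuses to return data whose blocks no longer match. The function names, the use of SHA-256 and the tiny block size are all assumptions made for the example; the actual checksum algorithm, Patch Block size and Fingerprint File format are not described here.

```python
import hashlib

BLOCK_SIZE = 4  # tiny illustrative value; the real Patch Block size is not stated


class IntegrityError(Exception):
    """Raised when stored data no longer matches its recorded fingerprints."""


def fingerprint(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def corrupted_blocks(data, fingerprints, block_size=BLOCK_SIZE):
    """Return the indices of blocks whose checksum no longer matches."""
    current = fingerprint(data, block_size)
    if len(current) != len(fingerprints):
        raise IntegrityError("block count differs from fingerprint count")
    return [i for i, (a, b) in enumerate(zip(current, fingerprints)) if a != b]


def verified_read(data, fingerprints, block_size=BLOCK_SIZE):
    """Return data only if every block matches its recorded fingerprint."""
    bad = corrupted_blocks(data, fingerprints, block_size)
    if bad:
        raise IntegrityError(f"checksum mismatch in blocks {bad}")
    return data
```

Keeping a checksum per block rather than per file means a mismatch can be narrowed down to the affected block instead of invalidating the whole file.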

In a traditional approach, administrative access is granted to the filesystem, especially when using general purpose storage, which presents further confidentiality and integrity challenges and gives “Leakware” another attack vector. In addition, many traditional solutions simply restore whatever data is located in the backup folder or volume without performing validation and other due diligence on the data.

The Physical Layer

While the logical layer focuses on data integrity at the file level, the physical layer focuses on writing customer data across the immutable cluster to achieve data integrity and data resiliency. To do this, Patch Files are logically divided into fixed-length segments called Stripes. As each Stripe is written, the AOF computes a Stripe-level checksum, which it stores in that Stripe’s Stripe Metadata.

Stripes are further divided into physical Chunks stored on physical disks held within the Rubrik cluster. Activities such as replication and erasure coding occur at the Chunk level. Just as with Patch Files, as each Chunk is written, a Chunk checksum is computed and stored in the Stripe Metadata alongside the list of chunks. These checksums are periodically recomputed as part of Atlas’ background scan by reading the physical Chunks and comparing against the checksums in the Stripe Metadata. Additionally, if a data rebuild is needed, the resiliency provided by erasure coding is automatically leveraged in the background.
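
The background scan and rebuild can be sketched in a few lines. The example below is a simplification under stated assumptions: it uses a single XOR parity chunk as a stand-in for erasure coding (the actual coding scheme is not specified in this post), CRC32 as the chunk checksum, and hypothetical function names `write_stripe` and `scan_and_repair`.

```python
import zlib
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def write_stripe(stripe, n_data):
    """Split a stripe into n_data equal chunks plus one XOR parity chunk.

    Returns the chunks and a metadata list holding one checksum per chunk.
    """
    if n_data < 1 or len(stripe) % n_data:
        raise ValueError("stripe length must be a multiple of n_data")
    size = len(stripe) // n_data
    chunks = [stripe[i * size:(i + 1) * size] for i in range(n_data)]
    chunks.append(reduce(xor_bytes, chunks))    # parity chunk (assumed scheme)
    metadata = [zlib.crc32(c) for c in chunks]  # per-chunk checksums
    return chunks, metadata


def scan_and_repair(chunks, metadata):
    """Recompute chunk checksums and rebuild a single bad chunk in place.

    Returns the list of chunk indices that were found to be corrupted.
    """
    bad = [i for i, c in enumerate(chunks) if zlib.crc32(c) != metadata[i]]
    if len(bad) > 1:
        raise RuntimeError("single parity cannot rebuild more than one chunk")
    for i in bad:
        others = [c for j, c in enumerate(chunks) if j != i]
        rebuilt = reduce(xor_bytes, others)  # XOR of the rest recovers chunk i
        if zlib.crc32(rebuilt) != metadata[i]:
            raise RuntimeError("rebuilt chunk failed checksum validation")
        chunks[i] = rebuilt
    return bad
```

A production erasure code tolerates more simultaneous failures than single parity does, but the flow is the same: detect a bad chunk by checksum, reconstruct it from the surviving chunks, and validate the result against the stored checksum.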

Zero Trust Cluster Design

Traditional approaches to cluster security often rely on a “full trust” model in which all members of the cluster are able to communicate with one another. In some cases, this includes root-level authority, no mutual authentication checks and the ability to read or modify customer data that is held within the filesystem. This creates a weak point in a defence-in-depth architecture; if backup data can be compromised, there is no path to restoration when disruption occurs.

Secured Cluster Communications

Each cluster has some number of nodes that need to communicate with one another. This means there is a need to validate each node that wants to exchange data. In a Rubrik cluster, all intra-node and inter-cluster communication, as well as communication with external applications, uses the TLS protocol with certificate-based mutual authentication.

Rubrik does not use insecure protocols, such as NFS or SMB, to relay information within the cluster; all communication is performed through secure and trusted channels. In fact, all their internal communications use TLS 1.2 with strong cipher suites and Perfect Forward Secrecy (PFS).
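
For readers unfamiliar with mutual TLS, the sketch below shows how a server-side context that demands client certificates might be configured with Python’s standard `ssl` module. It is not Rubrik’s configuration: the function name `make_node_context`, the file-path parameters and the `ECDHE+AESGCM` cipher string are assumptions chosen to reflect the properties named above (TLS 1.2 as the floor, strong ciphers, forward secrecy, certificate-based mutual authentication).

```python
import ssl


def make_node_context(certfile=None, keyfile=None, cafile=None):
    """Build a server-side TLS context that requires client certificates.

    The file arguments are hypothetical paths; when omitted, the context is
    still configured but cannot complete a handshake.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse anything older
    ctx.verify_mode = ssl.CERT_REQUIRED           # peer must present a certificate
    # Restrict TLS 1.2 suites to ECDHE key exchange (forward secrecy) with AES-GCM.
    ctx.set_ciphers("ECDHE+AESGCM")
    if cafile:
        ctx.load_verify_locations(cafile=cafile)  # CA that signs node certificates
    if certfile:
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return ctx
```

With `verify_mode` set to `ssl.CERT_REQUIRED` on the server side, a peer that cannot present a certificate signed by the trusted CA fails the handshake before any application data is exchanged.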

Each Rubrik cluster shipped to a customer uses strong, randomized passwords on a per-node basis. There is no concept of an “admin/admin” style of default local authentication that is easily searchable on the web and would add an attack vector.

Systems Hardening Standards

Numerous other elements are in place to protect the integrity of the system through internal hardening standards. Here are a few that help combat ransomware:

Authenticated APIs

Rubrik adopted an API-first design as part of the architecture. Authentication is required for all endpoints that are used to operate the solution and can be handled via credentials or a secure token. This includes environments using Rubrik’s Role-Based Access Control (RBAC) or Multi-tenancy features to logically divide the roles, features and resources that are under management. Rubrik’s CLI, SDKs and other tools consume the API and are held to the same security requirements.

API endpoints that control the underlying behaviour of the system require an additional level of authorization that can only be supplied from a certified technical support engineer. This prevents a malicious actor from being able to alter the behaviour of a Rubrik cluster.
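
The general pattern of authenticating every request and then authorizing it by role can be sketched as follows. Everything in this example is hypothetical and is not the Rubrik API: the token format, the signing key, the endpoint names and the `support` role standing in for the additional authorization level are all assumptions made for illustration.

```python
import hashlib
import hmac

SECRET = b"demo-signing-key"  # hypothetical key; never hard-code one in practice

# Hypothetical endpoints mapped to the roles allowed to call them.
PERMISSIONS = {
    "GET /snapshots": {"viewer", "admin"},
    "POST /restore": {"admin"},
    "POST /internal/config": {"support"},  # system behaviour: extra authorization
}


def _sign(payload, secret):
    return hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(user, role, secret=SECRET):
    """Return a signed token of the form user:role:signature."""
    payload = f"{user}:{role}"
    return f"{payload}:{_sign(payload, secret)}"


def authenticate(token, secret=SECRET):
    """Return (user, role) for a valid token, otherwise None."""
    parts = token.split(":")
    if len(parts) != 3:
        return None
    user, role, sig = parts
    expected = _sign(f"{user}:{role}", secret)
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    return user, role


def handle(endpoint, token):
    """Return an HTTP-style status code for a request to endpoint."""
    identity = authenticate(token)
    if identity is None:
        return 401  # unauthenticated: no endpoint is reachable anonymously
    if endpoint not in PERMISSIONS:
        return 404
    _, role = identity
    if role not in PERMISSIONS[endpoint]:
        return 403  # authenticated, but the role lacks permission
    return 200
```

In this sketch even an `admin` token receives a 403 from `POST /internal/config`, mirroring the idea that endpoints controlling system behaviour need a separate level of authorization.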

Conclusion

Numerous resources on the Internet advocate for a Defence in Depth strategy. This combines efforts across employee education and enablement, rapid deployment of patches and a solid backup and recovery plan.

This post described how Rubrik combines data immutability with a zero trust cluster design to protect and recover data. Rubrik also helps organizations strengthen their ransomware response strategy with an application called Radar, which increases visibility into the scope of an attack. This allows organizations to quickly pinpoint which applications and files were impacted and where they reside to further minimize business impact. Learn more about Radar.

Many customers need to be able to reliably recover from ransomware attacks to ensure minimal downtime of their critical services. A product with a truly immutable architecture provides customers the peace of mind that, when they need to, they can always access the data to recover from such attacks.

To learn more, schedule an in-person or virtual meeting with CDW and Rubrik today.