Archiving is a challenge well suited to the cloud, and businesses today are rapidly adopting the cloud for its convenience. But archiving and cloud services can have an element of lock-in, and your ability to work with your data once it is archived in the cloud will be constrained by the interfaces the service provides. Here are the six things you need to know before getting into cloud archiving:
#1 – Where in the cloud?
Back in my days at ZANTAZ, potential customers commonly received guided tours of the private cloud datacenter. I don't think you should expect such a tour if your cloud archiving solution is based on public cloud – Microsoft Azure, Amazon, or Google Cloud Platform – but you should have a say in the datacenter region(s) your cloud archive configuration will be deployed into.
It's usually an obvious decision, but some coordination is needed if your organization has multiple large offices, or locations that require data sovereignty. Your cloud archive vendor should have no problem deploying into any region. And in larger deployments involving multiple datacenters, you'll want to look for a cloud archiving service that supports federation across the multiple regions so that you have one configuration at the top to manage.
#2 – Will you have noisy neighbors?
Is your cloud archive subscription provisioned in a multi-tenant architecture, or will you be configured with single tenancy? Multi-tenancy is the modus operandi for the majority of SaaS applications, but I don’t like it for enterprise cloud archiving. That’s because the noisy neighbor scenario is too likely. Imagine another customer with a 700-terabyte archive running a massive content indexing job that degrades performance for you and others. No thanks.
Single tenancy is the way to go for enterprise SaaS cloud archiving. It's more expensive because it means subnetting dedicated resources for each customer, but that’s how you ensure performance consistency. It’s also how you avoid commingling data in the same resources. You might have data security standards, compliance regulations, or other reasons that make commingling taboo for your business. Single tenancy puts your cloud archive in a dedicated environment – avoiding noisy neighbors and commingling.
#3 – What if you decide to cancel and need to get your data out?
As you look ahead to cloud archiving, you need to watch out for vendors that put hefty cancellation fees in their agreements. And you need to understand the process for getting your data back.
It’s unfortunate, but many IT leaders have endured hard lessons when it comes to getting out of older archive systems. Legacy archives weren’t designed with bulk extraction in mind (that’s why there’s a niche market of companies specializing in archive migrations today).
Tony Redmond recently called for an “open archive API” that would help organizations move their data between archiving providers. I like the idea, but I also think cloud archiving solutions need to include native features that make it intuitive for a customer to perform self-service bulk extraction. As a customer, you should be able to pull any amount of your data out whenever you need, and not be reliant on the vendor or some custom development or third-party tool using an arcane API.
As for costs and logistics with bulk extraction, expect to incur some data transfer (aka egress) fees at a minimum. Data transfer costs can be significant if you're pulling out a large-scale archive (but still much less expensive than specialized migration software and services). I recently wrote a blog post analyzing the issue of data transfer costs in a cloud archive.
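To get a rough feel for the egress math, here is a minimal back-of-the-envelope sketch. The function name and the per-gigabyte rate are assumptions made up for illustration; real rates vary by provider, region, and volume tier, so plug in the numbers from your own provider's price list.

```python
def estimate_egress_cost(archive_tb: float, price_per_gb: float) -> float:
    """Estimate the data transfer (egress) cost of pulling an archive out.

    archive_tb   -- archive size in decimal terabytes (1 TB = 1000 GB)
    price_per_gb -- flat per-gigabyte egress rate; an assumed input, since
                    real pricing is usually tiered
    """
    archive_gb = archive_tb * 1000
    return archive_gb * price_per_gb


# Hypothetical rate, for illustration only. Not a quote from any provider.
ASSUMED_PRICE_PER_GB = 0.05

cost = estimate_egress_cost(100, ASSUMED_PRICE_PER_GB)
print(f"Estimated egress cost for a 100 TB archive: ${cost:,.2f}")
```

A flat rate keeps the sketch simple. If your provider uses tiered pricing, the same idea applies per tier.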
Obviously a large archive is going to take some time to extract from the cloud. See if your cloud archive vendor offers an option to extract to a basic cloud storage container, or maybe they can physically ship the data to you and perform the extraction and shortcut rehydration behind your firewall. In all of this, there may be an element of professional services, but look for cloud archiving that has most of this sorted out in software (native features in a simple interface).
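How much time is "some time"? The sketch below estimates it from archive size and network throughput. The function name, the 1 Gbps link, and the 70% efficiency factor are all assumptions for illustration. Sustained throughput in practice depends on your network, the vendor's extraction pipeline, and any throttling.

```python
SECONDS_PER_DAY = 86_400


def estimate_transfer_days(archive_tb: float,
                           link_mbps: float,
                           efficiency: float = 0.7) -> float:
    """Estimate the days needed to move an archive over a network link.

    archive_tb -- archive size in decimal terabytes
    link_mbps  -- nominal link speed in megabits per second
    efficiency -- assumed fraction of the link actually sustained (0 to 1]
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    total_bits = archive_tb * 1e12 * 8            # terabytes -> bits
    effective_bps = link_mbps * 1e6 * efficiency  # sustained bits per second
    return total_bits / effective_bps / SECONDS_PER_DAY


# Hypothetical scenario: 100 TB over a 1 Gbps link at 70% sustained efficiency.
days = estimate_transfer_days(100, 1000)
print(f"Roughly {days:.1f} days of continuous transfer")
```

If the estimate runs into weeks or months, that is a signal to ask about the physical shipment option.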
#4 – How will you meet eDiscovery obligations?
I believe that basic eDiscovery requirements like placing data on legal hold, creating and managing cases, and performing collection should be entry-level features in any cloud service. Unfortunately, that's not reality: You usually have to be at the vendor’s top tier pricing package to access their eDiscovery features (if they offer them).
With a cloud archive that’s based purely on a consumption pricing model, your eDiscovery costs won’t come from an arbitrary pricing tier, but you should expect to see additional costs associated with content indexing to support any eDiscovery searching. If you have a large archive, look for search-as-a-service that can be deployed on-demand and controlled through indexing policies so that you can have search when you need it, and index only the targeted portions of your archive.
#5 – What can you do to manage your data?
Data management is the red-headed stepchild of IT. And the records management community has long been MIA. Even if your organization doesn't have its information management policies nailed down, you should still look for a solid suite of data management controls in your cloud archive. The need to manage your data can arise at any time, for various reasons, so look for things like the ability to manage retention policies, identify private and sensitive data, apply tagging policies, and even impose data loss prevention (DLP) controls.
#6 – Can you easily report on user and system activities?
Last but not least, consider your needs for auditing data (aka activity intelligence). You might need it for an investigation, an HR issue, or for security reasons. Whatever the reason, many IT professionals have had to access and export a detailed history of activities by user, folder, or item. Many products have been developed over the years to help with this, but unfortunately they won’t be able to tell you anything about the data in your cloud archive. That’s why your cloud archive solution needs to natively provide activity auditing. Auditing should be an always-on feature – something that you don’t need to think about. And the audit data should be exposed through a self-service interface, with easy controls for filtering and exporting.
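To make "filtering and exporting" concrete, here is a minimal sketch of what that self-service capability boils down to. Everything in it is hypothetical: the `AuditEvent` fields, the function names, and the sample records are made up for illustration and do not reflect any particular vendor's audit schema or API.

```python
import csv
import io
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass
class AuditEvent:
    """Hypothetical audit record; real schemas will differ by vendor."""
    timestamp: datetime
    user: str
    action: str
    folder: str
    item: str


def filter_events(events: Iterable[AuditEvent],
                  user: Optional[str] = None,
                  folder: Optional[str] = None,
                  item: Optional[str] = None) -> List[AuditEvent]:
    """Keep events matching every filter that was supplied."""
    return [
        e for e in events
        if (user is None or e.user == user)
        and (folder is None or e.folder == folder)
        and (item is None or e.item == item)
    ]


def export_csv(events: Iterable[AuditEvent]) -> str:
    """Serialize events to CSV text with ISO 8601 timestamps."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["timestamp", "user", "action", "folder", "item"])
    writer.writeheader()
    for event in events:
        row = asdict(event)
        row["timestamp"] = event.timestamp.isoformat()
        writer.writerow(row)
    return buffer.getvalue()


# Made-up sample data for illustration.
sample = [
    AuditEvent(datetime(2024, 1, 5, 9, 30), "alice", "view", "/hr", "policy.docx"),
    AuditEvent(datetime(2024, 1, 5, 9, 45), "bob", "export", "/legal", "case-12.pst"),
    AuditEvent(datetime(2024, 1, 6, 14, 0), "alice", "delete", "/hr", "draft.docx"),
]

print(export_csv(filter_events(sample, user="alice")))
```

The point is not the code itself. A query like "everything alice did in /hr" should be a few clicks in the vendor's interface, not a custom development project.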
Cloud archiving should be convenient for you. There shouldn’t be surprises. And your data should never be held hostage.
Let’s learn from the hardships of legacy archiving and make modern cloud archiving a great leap forward.