This post discusses what not to do when using the cloud for unstructured data archival storage.
The great sea change is here. IT teams are moving to cloud computing and storage for their enterprise archiving needs.
As an IT leader, you know the advantages of leveraging the cloud for archiving the costly hoards of barely used unstructured data sitting on primary storage. But when it comes to a new-style cloud archive, there are various ways to make it happen – and various ways to feel pain.
Regardless of the type of corporate data, or whether you’re looking at solutions based on the public cloud or a vendor’s private cloud, here are the four worst cloud archive practices I've encountered – and their best practice replacements.
1. Forgetting eDiscovery
Ever hear of drive shipping? It’s the worst-case scenario when your cloud storage solution lacks self-service eDiscovery. Without the right tools in the cloud, drive shipping becomes the only way to meet your eDiscovery obligations when a pressing legal claim hits. Imagine all your data in the cloud being physically shipped back to you on drives, which you must then process yourself, along with the added chain-of-custody complexity.
Not nice. You’d think things would be more advanced than this in 2016, but they aren't in many cloud archive offerings.
Even if litigation is rare for your organization, you should make sure that your cloud archive includes the ability to index data, perform custodian and keyword searches, manage legal holds, and perform self-service targeted collection. You may never use them, but if you ever do need them, these features save you from much pain and anguish.
Remember, the cloud offers incredible cost savings, but a nasty eDiscovery scenario without the right tools will incur shockingly high costs that negate your cloud cost savings.
2. Getting locked in
I’ve been told that many cloud archiving vendors have term commitments and cancellation fees in their service agreements. To me, that’s a huge red flag.
If the solution is great, it should be available on a pay-as-you-go (PAYG) model. I think that speaks volumes. It says, “We’re confident you’ll like our service, and we believe it's unlikely you'll want to leave.”
There are other ways of getting locked in too. For instance, some Cloud NAS solutions don’t make it easy to get your data out in bulk – you have to go through them to do it. Being at the mercy of the vendor – and their support policies and professional services fees – is another element of lock-in to avoid.
3. Not having an exit strategy
Archiving is usually a long-term thing. But even if we’re talking 20 years from now, you should know what the end of the road looks like before you get into it.
In other words, can you easily migrate out when it’s time to move on?
The market right now is rife with archive migration projects – organizations getting out of legacy archive solutions that they’ve been using for the past 10-12 years. The painful lesson is that you need to pay for specialty migration software and services to get it done – with a staggering cost per terabyte.
It isn’t just with cloud archiving – this applies to any cloud service you might use.
You need to be in full control of your data. As you look to move your archive strategy to the cloud, make certain that your cloud archive solution empowers you with self-service abilities to perform bulk extraction on demand – and the data should be exactly what it was before archiving. Also make sure you understand all cost factors associated with exiting.
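One way to check that extracted data is "exactly what it was before archiving" is to compare checksums taken before and after. The Python sketch below is an illustration under my own assumptions: it assumes you can see both the original files and the extracted files as local directories, and the function names are hypothetical rather than part of any archive product.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report files that went missing, appeared unexpectedly, or changed content."""
    shared = before.keys() & after.keys()
    return {
        "missing": sorted(set(before) - set(after)),
        "unexpected": sorted(set(after) - set(before)),
        "altered": sorted(k for k in shared if before[k] != after[k]),
    }
```

Build a manifest before you archive, keep it somewhere you control, and run the comparison on a trial extraction. If all three lists come back empty, the round trip preserved your files byte for byte. Note that this checks file contents only, not metadata such as timestamps or permissions.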
4. Cutting corners on security and data governance
Most cloud storage offerings are going to put your data in the dark. That is, you don't have easy visibility into what you're storing.
Dark data is a serious problem because it drives up cost and risk, and may even make it hard to prove compliance.
Think through some of the security scenarios you might be faced with. Perhaps you need to scan for PII? Interrogate access rights? Maybe there’s a need to audit user activity? Make sure you have visibility and controls to meet a variety of business scenarios.
If your cloud archive doesn’t make such things a simple task, you should think twice.
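As a rough idea of what a PII scan involves, here is a small Python sketch. The two patterns (email addresses and US Social Security number formats) are my own simplified assumptions; real scanners use far more patterns plus validation and context, so treat this as an illustration of the kind of visibility to ask for, not a production detector.

```python
import re

# Deliberately simple, illustrative patterns; they will miss cases and
# produce false positives that a real PII scanner would handle.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_text(text: str) -> dict[str, int]:
    """Count matches for each PII category in one piece of text."""
    return {name: len(pattern.findall(text)) for name, pattern in PII_PATTERNS.items()}


def flag_documents(docs: dict[str, str]) -> dict[str, dict[str, int]]:
    """Return only the documents with at least one hit, with counts per category."""
    flagged = {}
    for doc_id, text in docs.items():
        hits = {name: count for name, count in scan_text(text).items() if count}
        if hits:
            flagged[doc_id] = hits
    return flagged
```

If your archive can't answer even this kind of question about its own contents, the data is dark.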
Here’s another example: Imagine your chief general counsel wakes up one day and decides retention policies are important. Your cloud storage needs to give you more than a delete button: you probably want to preview the effect of a retention policy before running it, and you need an audit report to show the deletion is defensible. If it's dark data, good luck executing what should be a very simple data governance task.
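The preview-then-delete-with-audit flow can be sketched in a few lines of Python. This is a hypothetical model, not any product's interface: the ArchivedItem fields, the function names, and the choice of "last modified" as the retention clock are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ArchivedItem:
    """Hypothetical archive record with just enough metadata for retention."""
    item_id: str
    last_modified: date
    on_legal_hold: bool = False


def preview_retention(items, retention_days, today):
    """Dry run: list what the policy WOULD delete, without deleting anything."""
    cutoff = today - timedelta(days=retention_days)
    return [i for i in items if i.last_modified < cutoff and not i.on_legal_hold]


def apply_retention(items, retention_days, today, approved_by):
    """Delete expired items and return (remaining items, audit records)."""
    expired = preview_retention(items, retention_days, today)
    expired_ids = {i.item_id for i in expired}
    remaining = [i for i in items if i.item_id not in expired_ids]
    audit = [
        {
            "item_id": i.item_id,
            "last_modified": i.last_modified.isoformat(),
            "deleted_on": today.isoformat(),
            "policy_days": retention_days,
            "approved_by": approved_by,
        }
        for i in expired
    ]
    return remaining, audit
```

Two design choices matter here. The delete step reuses the preview logic, so what you reviewed is what gets removed, and items under legal hold are never eligible no matter how old they are.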
As a best practice, you should look for a cloud archive solution that intrinsically meets the majority of your security and data governance needs. Layering these things on after the fact can be painful. I’m a firm believer that things like analytics, data loss prevention, auditing, RBAC, and other data governance features need to be deeper in the stack – closer to your data near the point of storage. This concept is a must-have in cloud services, especially archiving.