What is drive shipping?
Drive shipping is a method of uploading data to, or downloading data from, the cloud. It is the ideal option when transferring data over a network connection is impractical because of slow network bandwidth, time sensitivity, or other constraints.
As we will explore, drive shipping allows the safe transfer of large amounts of data to and from cloud data centers via disk drives.
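To see why a network transfer can be prohibitive, here is a rough back-of-envelope sketch. The 100 TB figure matches the Data Box capacity discussed below; the link speed and efficiency values are illustrative assumptions, not numbers from this article.

```python
# Back-of-envelope comparison of network upload time vs. drive shipping.
# The bandwidth and sustained-efficiency figures are illustrative assumptions.

def upload_days(data_tb: float, bandwidth_mbps: float, efficiency: float = 0.8) -> float:
    """Days needed to push `data_tb` terabytes over a `bandwidth_mbps`
    link running at the given sustained efficiency."""
    bits = data_tb * 1e12 * 8                       # terabytes -> bits
    seconds = bits / (bandwidth_mbps * 1e6 * efficiency)
    return seconds / 86_400                         # seconds -> days

# 100 TB over a dedicated 100 Mbps link at 80% sustained throughput:
print(f"{upload_days(100, 100):.0f} days")          # roughly 116 days
```

Against a timeline of months for the network transfer, shipping a drive and absorbing a processing window of days is the clear winner.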
What is the Azure Data Box?
The Azure Data Box is an appliance that streamlines drive shipping, making the movement of large amounts of data both easier and more secure.
The Azure Data Box public preview was announced at the 2017 Ignite Conference.
General availability is reported to be the end of Q2 2018.
At HubStor, we are excited about it because Data Box provides greater flexibility and less complexity than the existing Azure Import/Export Service option, as we will explore.
Here's how the Azure Data Box process will work: The Data Box rental is ordered through the Azure Portal. You are provided a password for encryption upon ordering.
When it arrives, you connect it to your network. Data Box has built-in administration capabilities that let you specify an IP address via direct assignment or via DHCP.
Once the device is mounted, you can pack 100 TB of data into the current form factor via the SMB or CIFS protocols. (It was noted during the Ignite conference that later designs of the Azure Data Box will have petabyte capacity.)
Equipped with an e-paper display that functions as a shipping label, the highly durable box requires no additional packaging for shipping. When you are done, you simply schedule a pickup and off it goes, back to Microsoft.
Once received at the selected Azure region, the data will be uploaded to an existing storage account. Once uploaded to Azure, the rental box will be securely erased.
The Data Box is convenient both for importing data into and exporting data from Azure data centers.
Amazon and Google options for drive shipping
All the big players in the industry are working to make offline transport of large amounts of data to their respective data centers easier.
Here is a brief look at what Google and Amazon are offering in this regard:
- Amazon -- Amazon Web Services offers the Snowball Edge service, which transfers up to 100 TB of data. AWS also offers the Snowmobile service, capable of transferring up to 100 PB in a shipping container that requires a tractor trailer to transport.
- Google -- The Google Transfer Appliance is available in a 100 TB configuration or a larger configuration with 4.8 times that capacity (480 TB).
Current drive shipping for Microsoft Azure
Microsoft's current data import service works with your existing Azure subscription and storage accounts: you create an import or export job and ship disk drive(s) to one of the supported public Azure regions. The service allows the transfer of block blobs, page blobs, or files.
At present, the Azure Import/Export Service drive shipping option is supported by all public Azure accounts. The current service is quick: processing takes approximately 7 to 10 days once the drive(s) are received.
It is also cost effective at $80.00 per drive processed as part of each job. There is no transaction cost for importing data into blob storage; however, standard egress charges apply when data is exported from blob storage. You pay the cost of shipping to the Azure data center, and when Microsoft returns your drives, the return shipping is billed to the carrier account you provided when you created the job.
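The pricing above reduces to simple arithmetic: a flat $80.00 per drive processed, with carrier shipping billed separately in both directions. A minimal sketch of the math (the function name and the shipping parameter are this sketch's own, not part of any Azure pricing API):

```python
# Rough import-job cost estimate based on the pricing cited above:
# $80.00 per drive processed, no transaction cost for importing into
# blob storage. Carrier shipping is billed separately, so it is left
# as a caller-supplied parameter.

DRIVE_FEE_USD = 80.00

def import_job_cost(num_drives: int, shipping_usd: float = 0.0) -> float:
    """Total cost of one import job: per-drive processing fee plus
    whatever round-trip carrier shipping you have quoted separately."""
    return num_drives * DRIVE_FEE_USD + shipping_usd

print(import_job_cost(10))   # a full 10-drive job: 800.0 before shipping
```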
The WAImportExport tool must be downloaded and run on your local server to copy data to your drive(s). It encrypts the data with AES-128 BitLocker encryption (this can be increased to AES-256 by manually encrypting the drive with BitLocker before the data is copied). On an export job, the encryption key is provided in the Azure portal. The tool also generates the drive journal files required for the job, which contain basic information such as the drive serial number and storage account name.
You can ship up to 10 disk drives of any capacity per job. Each drive is associated with a single job, and each job is associated with a single storage account. If you have multiple jobs, you can ship any number of drives in a single shipment to the Azure data center, and there is no limit on the number of jobs that can be created. The maximum block blob size is approximately 4.768 terabytes (TB); the maximum page blob size is 1 TB.
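The limits above lend themselves to a pre-flight check before you prepare drives. The following sketch encodes them; the function name and structure are illustrative only and are not part of any Azure SDK.

```python
# Illustrative pre-flight check against the service limits described
# above: at most 10 drives per job, block blobs up to ~4.768 TB, and
# page blobs up to 1 TB. This helper is a sketch, not an Azure API.

MAX_DRIVES_PER_JOB = 10
MAX_BLOCK_BLOB_TB = 4.768
MAX_PAGE_BLOB_TB = 1.0

def validate_job(num_drives: int, blob_sizes_tb: list[float], blob_kind: str) -> list[str]:
    """Return a list of limit violations for a planned import job."""
    problems = []
    if num_drives > MAX_DRIVES_PER_JOB:
        problems.append(
            f"{num_drives} drives exceeds the {MAX_DRIVES_PER_JOB}-drive job limit"
        )
    limit = MAX_BLOCK_BLOB_TB if blob_kind == "block" else MAX_PAGE_BLOB_TB
    for size in blob_sizes_tb:
        if size > limit:
            problems.append(f"{size} TB {blob_kind} blob exceeds the {limit} TB limit")
    return problems

print(validate_job(12, [0.5, 2.0], "page"))   # flags both violations
```

Remember that each job maps to exactly one storage account, so data destined for multiple accounts must be split across multiple jobs.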
Using the Azure Portal, users can create jobs and track their status.
Some of the known limitations of the current service are:
- Each import/export job can only be to or from a single storage account; a job cannot span multiple storage accounts.
- The service does not support premium storage accounts.
- You can import block blobs, page blobs, and files. You can only export block blobs, page blobs, or append blobs from Azure storage; you cannot export files.
- Drives are limited to 2.5-inch solid state drives (SSD), or 2.5-inch and 3.5-inch SATA II or III hard disk drives (HDD). The data volume must be formatted with the New Technology File System (NTFS). External HDDs with built-in USB adapters are not supported, and the disk inside the casing of an external HDD cannot be used. Drives that do not conform will be returned.
- For import jobs, only the first data volume on the drive is processed.
- The WAImportExport tool must be used to prepare and copy data to the drives.
- The WAImportExport tool is only compatible with 64-bit Windows operating systems.
- Drives can only be shipped to supported Azure locations. If your region is not supported, an alternative shipping location will be provided when you create the job.
Azure Regions that Support Drive Shipping
Drive shipping and intelligent storage tiering
Regardless of the drive shipping method you use, source metadata and Access Control List (ACL) permissions are not retained during the transfer of data into the cloud.
Uniquely, HubStor has patent-pending features that re-establish access rights, metadata, and folder-structure synchronization for the imported content.
When HubStor clients have busy network connections and large volumes of data to archive into the cloud, they will use drive shipping to expedite the seeding of data into their cloud archive. When the drive-shipped data is loaded into a storage account in Azure, HubStor performs ingestion into the archive.
Next, with the blobs archived, an on-premises instance of the HubStor Connector Service runs using the 'Blobless Archive' setting. This setting tells HubStor that the blobs are already in the cloud -- all it needs to do is scan the source content on-premises to match up the original metadata and ACLs, as well as apply whatever storage tiering policy you might want.
Putting it all together, you get the best of both worlds: Fast and secure data ingestion to the cloud without impacting your network, and the end result of a hybrid storage architecture wherein users and applications can conveniently recall cloud-archived data from within the original directory and storage appliance.
The Azure Data Box at the 2017 Ignite Conference
The new Azure Data Box from Microsoft will be a significant improvement for drive shipping to the cloud because it simplifies the process and limits the opportunities for errors and mishandling of the physical media.
Many of HubStor’s customers have utilized the Azure Import/Export Service to dramatically cut down the network bandwidth required to archive large volumes of legacy data to the cloud. With the improvements to the drive shipping process brought about by the Azure Data Box, we expect to see wider adoption of this cloud archiving approach.
Once the Azure Data Box is generally available, I will give an update on pricing and discuss user experiences.