To recap, here is one plan for using an off-the-shelf cloud storage solution such as MSFT SkyDrive or Google Drive for private data storage, while minimizing any access to personal information by the service provider:
- Create a virtual hard disk (VHD) image
- Move the VHD disk image into a directory synced to the cloud backup solution
- Mount the disk as local drive and format it using NTFS
- Enable BitLocker-To-Go full disk encryption on the disk, preferably using smart cards to avoid exposing hashes of user-chosen passphrases to the cloud
- Copy/move any files desired for backup into the mounted virtual drive. The contents will be stored encrypted on the virtual disk, which is itself backed by a single massive VHD file.
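The disk-creation steps above can also be scripted rather than clicked through. The following is a minimal sketch, not a definitive recipe: it only builds the text of a diskpart script, and the path, size, drive letter and volume label are illustrative assumptions. It also assumes the VHD is created directly inside the synced directory, folding the "create" and "move" steps into one. Enabling BitLocker-To-Go is left as a separate step, for instance through the BitLocker control panel or the manage-bde tool.

```python
def build_create_vhd_script(vhd_path, size_mb, letter, label="Backup"):
    """Return diskpart script text that creates, attaches and formats a VHD.

    All arguments are illustrative; nothing here is prescribed by the post.
    """
    lines = [
        # Create a dynamically expanding virtual disk of at most size_mb megabytes.
        f'create vdisk file="{vhd_path}" maximum={size_mb} type=expandable',
        f'select vdisk file="{vhd_path}"',
        # Attach (mount) the virtual disk so it can be partitioned.
        "attach vdisk",
        "create partition primary",
        # Quick-format the new partition as NTFS and give it a drive letter.
        f'format fs=ntfs label="{label}" quick',
        f"assign letter={letter}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical location inside a synced folder.
    print(build_create_vhd_script(r"C:\Sync\vault.vhd", 1024, "V"))
```

The resulting text can be saved to a file and passed to diskpart with its /s option from an elevated command prompt.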
The cloud provider only receives a single massive VHD file for backup. If they were curious, they could infer that the file is a virtual hard disk, that it is protected by BitLocker, and even which protection option is in use, such as passphrase versus smart card. But without access to the decryption keys, they would not have visibility into the internal filesystem embedded in the virtual drive. Depending on implementation, they could infer how much of the disk is in use. They can also observe how much data changed between different snapshots of the VHD, which can indirectly hint at filesystem structure. For example, if the user is constantly editing one particular file, the sectors on the virtual disk corresponding to that file would change while other areas of the disk remain identical. While the service provider cannot determine the contents of those sectors, they can conclude those particular sectors were modified.
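To make the snapshot comparison concrete, here is a small sketch of what an observer holding two versions of the encrypted image could compute without any keys. The function name and the block size are assumptions chosen for illustration; a real provider might compare at a different granularity.

```python
def changed_blocks(old: bytes, new: bytes, block_size: int = 4096) -> list[int]:
    """Return the indexes of fixed-size blocks that differ between two snapshots."""
    n_blocks = (max(len(old), len(new)) + block_size - 1) // block_size
    changed = []
    for i in range(n_blocks):
        start = i * block_size
        # Ciphertext is compared as opaque bytes; no decryption is involved.
        if old[start:start + block_size] != new[start:start + block_size]:
            changed.append(i)
    return changed


# Toy example: an 8-block "disk" where a single byte in block 3 is modified.
before = bytearray(8 * 512)
after = bytearray(before)
after[3 * 512 + 10] = 0xFF
print(changed_blocks(bytes(before), bytes(after), block_size=512))  # [3]
```

The output reveals where the write landed, but nothing about what was written.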
Accessing the data from another computer is easier than the original setup. Assuming the VHD file has been replicated to a local directory compliments of the sync client from the cloud provider, there are two additional steps:
- Mount the VHD as local drive
- Unlock the BL2G volume by providing necessary credentials for decryption
Because Windows 7 does not natively recognize the VHD extension, the first step requires either a visit to the Disk Management console (under Computer Management) or scripting the diskpart utility.
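For the scripted route, a sketch along the following lines may help. It assumes Windows, an elevated prompt and diskpart on the PATH; the helper names and the example path are made up for illustration. The detach script is the counterpart needed when the disk has to be released again.

```python
import os
import subprocess
import tempfile


def build_attach_script(vhd_path):
    """diskpart commands that mount an existing VHD."""
    return f'select vdisk file="{vhd_path}"\nattach vdisk\n'


def build_detach_script(vhd_path):
    """diskpart commands that unmount a previously attached VHD."""
    return f'select vdisk file="{vhd_path}"\ndetach vdisk\n'


def run_diskpart(script_text):
    """Write the script to a temporary file and run `diskpart /s` on it.

    Windows only, and it needs administrator rights.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write(script_text)
        script_path = handle.name
    try:
        subprocess.run(["diskpart", "/s", script_path], check=True)
    finally:
        os.remove(script_path)


if __name__ == "__main__":
    # Hypothetical path; only prints the script instead of executing it.
    print(build_attach_script(r"C:\Sync\vault.vhd"))
```

Unlocking the BL2G volume remains a separate step after the disk is attached.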
Windows 8 provides a more seamless experience. Not only does it allow mounting a VHD with a simple double-click, it also recognizes BL2G encryption. A desktop notification walks the user through the steps of unlocking the drive.
What is the catch? There are two significant limitations that relegate this experiment to the realm of proof-of-concept:
- Mounted VHD disk images are locked for exclusive access by the OS. Backup to the cloud cannot proceed in the background while the user is still accessing content from that virtual disk. Instead it must be unmounted (using “Eject” from the UI, scripting with diskpart or programmatically) to allow synchronization agents to gain read access again.
- Many cloud storage systems do not support differential backup. The entire VHD image is uploaded to the cloud each time, even when only a small fraction of the file has changed. Small changes are the common case, since filesystems try to keep the different parts of a file clustered together. If the user changes one file on the virtual disk, the write touches the sectors associated with that file, along with sectors containing filesystem metadata, but the majority of the disk image is not altered. Yet the inefficiency of uploading the entire file each time means that the total size of the VHD is constrained by upstream bandwidth.
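A back-of-the-envelope calculation shows how quickly the bandwidth constraint bites. The sizes and link speeds below are illustrative assumptions, not measurements, and the estimate ignores protocol overhead and any compression by the sync client.

```python
def full_upload_hours(vhd_size_gb: float, upstream_mbps: float) -> float:
    """Hours needed to upload the whole image once at a given upstream rate.

    Uses decimal units: 1 GB = 10**9 bytes, 1 Mbps = 10**6 bits per second.
    """
    total_bits = vhd_size_gb * 8 * 1000**3
    seconds = total_bits / (upstream_mbps * 1_000_000)
    return seconds / 3600


if __name__ == "__main__":
    for size_gb, mbps in [(1, 1), (10, 1), (10, 5)]:
        hours = full_upload_hours(size_gb, mbps)
        print(f"{size_gb} GB at {mbps} Mbps: {hours:.1f} hours")
```

Under these assumptions a 10 GB image on a 1 Mbps uplink takes roughly 22 hours per upload, regardless of how little actually changed.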
The second problem is aggravated by the first one, since the VHD must be offline and inaccessible until the backup completes. Worse, the black-box nature of the synchronization process makes it difficult to infer when it has completed. That makes automation tricky. In principle we could schedule a periodic task to run when the computer is not in use: detach the virtual disk, wait for cloud backup to complete and reattach it. Even if this could be tolerated (any application that had open files on that volume may behave unpredictably), the problem remains that it is not possible to determine when the backup is finished, short of low-level hacks such as monitoring the open file handles in the sync client process for references to the VHD.
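The shape of such a scheduled task can be sketched as below. This is only an outline under stated assumptions: detach, attach and sync_finished are hypothetical callables supplied by the caller, and sync_finished in particular stands in for exactly the piece that has no clean implementation, since the sync client does not report completion.

```python
import time


def backup_cycle(detach, attach, sync_finished,
                 poll_seconds=30.0, timeout_seconds=3600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Detach the VHD, wait until the sync appears done, then reattach it.

    Returns True if sync_finished() reported completion before the timeout,
    False otherwise. The disk is reattached in either case.
    """
    detach()
    deadline = clock() + timeout_seconds
    try:
        while not sync_finished():
            if clock() >= deadline:
                return False  # gave up waiting; the backup state is unknown
            sleep(poll_seconds)
        return True
    finally:
        # Always bring the volume back, even on timeout or error.
        attach()


if __name__ == "__main__":
    # Dry run with stand-in callables; no disk is touched.
    answers = iter([False, False, True])
    done = backup_cycle(
        detach=lambda: print("detach"),
        attach=lambda: print("attach"),
        sync_finished=lambda: next(answers),
        poll_seconds=0.0,
    )
    print("completed:", done)
```

The timeout exists because a stalled or silent sync client would otherwise leave the volume detached indefinitely.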
In a follow-up post, we will look at an alternative approach that allows for incremental backup, with security still provided by BL2G full-disk encryption.