Shedding Light on Storage Encryption

I’ve been noticing some fundamental misunderstandings around storage encryption – I see this most when dealing with XtremIO, although plenty of platforms support it (VNX2 and VMAX).  I hope this blog post will help someone who is missing the bigger picture, and maybe help them make a better decision based on tradeoffs.  This is not going to be a heavily technical post, but is intended to shed some light on the topic from a strategic angle.

Hopefully you already know, but encryption at a high level is a way to turn data into gibberish that is unreadable except by an entity authorized to read it.  The types of storage encryption I’m going to talk about are Data At Rest Encryption (often abbreviated DARE or D@RE), in-flight encryption, and host-based encryption.  I’m talking in this post mainly about SAN (block) storage, but these concepts also apply to NAS (file) storage.  In fact, in-flight encryption is probably way more useful on a NAS array given the inherent security of FC fabrics.  But then you add iSCSI into the mix, and it gets cloudier.

Before I start: security is a tool, and it can be used wisely or poorly, with corresponding results.  Encryption is security.  Not all security, and not all encryption, is great.  Consider the idea of cryptographic erasure, by which data is “deleted” merely because it is encrypted and nobody has the key.  Ransomware thrives on this.  You are looking at a server with all your files on it, but without the key they may as well be deleted.  Choosing a security feature for no good business reason other than “security is great” is probably a mistake that is going to cause you headaches.


Here is a diagram with 3 zones of encryption.  Notice that host-based encryption overlaps the other two – that is not a mistake as we will see shortly.

Data At Rest Encryption

D@RE of late typically refers to a storage array’s ability to encrypt data at the point of entry (write) and decrypt on exit (read).  Sometimes this is done with ASICs on an array or I/O module, but it is often done with Self Encrypting Drives (SEDs).  However, the abstract concept of D@RE is simply that data is encrypted “at rest,” or while it is sitting on disk, on the storage array.

This might seem like a dumb question, but it is a CRUCIAL one that I’ve seen either not asked or answered incorrectly time and time again: what is the purpose of D@RE?  The point of D@RE is to prevent physical hardware theft from compromising data security.  So, if I nefariously steal a drive out of your array, or a shelf of drives out of your array, and come up with some way to attach them to another system and read them, I will get nothing but gibberish.

Now, keep in mind that this problem is typically far more of an issue on a small server system than it is on a storage array.  A small server might just have a handful of drives associated with it, while a storage array might have hundreds, or thousands.  And those drives are going to be in some form of RAID protection which leverages striping.  So even without D@RE the odds of a single disk holding meaningful data are small, though admittedly the risk is still there.

More to the point, D@RE does not prevent anyone from accessing data on the array itself.  I’ve heard allusions to this idea that “don’t worry about hackers, we’ve got D@RE” which couldn’t be more wrong, unless you think hackers are walking out of your data center with physical hardware.  If the hackers are intercepting wire transmissions, or they have broken into servers with SAN access, they have access to your data.  And if your array is doing the encryption and someone manages to steal the entire array (controllers and all) they will also have access to your data.

D@RE at the array level is also one of the easiest to deal with from a management perspective, because usually you just let the array handle everything including the encryption keys.  This is mostly just a “turn it on and let it run” solution.  You don’t notice it and generally don’t see any fallout, like performance degradation, from it.

In-Flight Encryption

In-flight encryption is referring to data being encrypted over the wire.  So your host issues a write to a SAN LUN, and that traverses your SAN network and lands on your storage array.  If data is encrypted “in-flight,” then it is encrypted throughout (at least) the switching.

Usually this is accomplished with FC fabric switches that are capable of encryption.  So the switch that sees a transmission on an F port will encrypt it, and then transmit it encrypted along all E ports (ISLs) and then decrypt it when it leaves another F port.  So the data is encrypted in-flight, but not at rest on the array.  Generally we are still talking about ASICs here so performance is not impacted.

Again let’s ask, what is the purpose of in-flight encryption?  In-flight encryption is intended to prevent someone who is sniffing network traffic (meaning they are somehow intercepting the data transmissions, or a copy of the data transmissions, over the network) from being able to decipher data.

For local FC networks this is (in my opinion) not often needed.  FC networks tend to be very secure overall and not really vulnerable to sniffing.  However, for IP based or WAN based communication, or even stretched fabrics, it might be sensible to look into something like this.

Also keep in mind that because data is decrypted before being written to the array, it does not provide the physical security that D@RE does, nor does it prevent anyone from accessing data in general.  You also sometimes have the option of not decrypting when writing to the array.  So essentially the data is encrypted when leaving the host, and written encrypted on the array itself.  It is only decrypted when the host issues a read for it and it exits the F port that host is attached to. This results in you having D@RE as well with those same benefits.  A real kicker here becomes key management, because in-flight encryption can be removed at any time without issue.  You can remove or disable in-flight encryption and not see any change in data because at the ends it is unencrypted.  However, if the data is written encrypted on the array, then you MUST have those keys to read that data.  If you had some kind of disaster that compromised your switches and keys, you would have a big array full of cryptographically erased data.

Host Based Encryption

Finally, host-based encryption is any software or feature that encrypts LUNs or files on the server itself.  So data that is going to be written to files (whether SAN based or local files) is encrypted in memory before the write actually takes place.

Host-based encryption ends up giving you both in-flight encryption and D@RE as well.  So when we ask the question, what is the purpose of host-based encryption?, we get the benefits we saw from in-flight and D@RE, as well as another one.  That is the idea that even with the same hardware setup, no other host can read your data.  So if I were to forklift your array, fabric switches, and get an identical server (hardware, OS, software) and hook it up, I wouldn’t be able to read your data.  Depending on the setup, if a hacker compromises the server itself in your data center, they may not be able to read the data either.

So why even bother with the other kinds of encryption?  Well for one, generally host-based encryption does incur a performance hit because it isn’t using ASICs.  Some systems might be able to handle this but many won’t be able to.  Unlike D@RE or in-flight, there will be a measurable degradation when using this method.  Another reason is that key management again becomes huge here.  Poor key management and a server having a hardware failure can lead to that data being unreadable by anyone.  And generally your backups will be useless in this situation as well because you have backups of encrypted data that you can’t read without the original keys.

And frankly, usually D@RE is good enough.  If you have a security issue where host-based encryption is going to be a benefit, usually someone already has the keys to the kingdom in your environment.

Closing Thoughts

Hopefully that cleared up the types of encryption and where they operate.

Another question I see is “can I use one or more at the same time?”  The answer is yes, with caveats.  There is nothing that prevents you from using even all 3 at the same time, even though it wouldn’t really make any sense.  Generally you want to avoid overlapping because you are encrypting data that is already encrypted which is a waste of resources.  So a sensible pairing might be D@RE on the array and in-flight on your switching.

A final HUGELY important note – and what really prompted me to write this post – is to make sure you fully understand the effect of encryption on all of your systems.  I have seen this come up in a discussion about XtremIO using D@RE paired with host-based encryption.  The question was “will it work?” but the question should have been “should we do this?”  Will it work?  Sure, there is nothing problematic about host-based encryption and XtremIO D@RE interacting, other than the XtremIO system encrypting already encrypted data.  What is problematic, though, is the fact that encrypted data does not compress, and most encrypted data won’t dedupe either…or at least not anywhere close to the level of unencrypted data.  And XtremIO generally relies on its fantastic inline compression and dedupe features to fit a lot of data on a small footprint. XtremIO’s D@RE happens behind the compression and deduplication, so there is no issue.  However host-based encryption will happen ahead of the dedupe/compression and will absolutely destroy your savings. So if you wanted to use the system like this, I would ask, how was it sized?  Was it sized with assumptions about good compression and dedupe ratios?  Or was it sized assuming no space savings?  And, does the extra money you will be spending for the host-based encryption product and the extra money you will be spending on the additional required storage justify the business problem you were trying to solve?  Or was there even a business problem at all?  A better fit would probably be something like a tiered VNX2 and FAST cache which could easily handle a lot of raw capacity and use the flash where it helps the most.
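To make the “encrypted data does not compress” point concrete, here is a minimal sketch using only the Python standard library.  It is not how XtremIO measures anything; it just compares how zlib handles repetitive data versus random bytes.  The random bytes from os.urandom are an assumption standing in for ciphertext (well-encrypted data looks statistically random), and the sample log line is made up.

```python
import os
import zlib

BLOCK = 1024 * 1024  # 1 MiB sample size

# Highly repetitive "plaintext": one made-up log line repeated to fill the block.
line = b"2015-07-01 INFO request served in 12ms\n"
plaintext = (line * (BLOCK // len(line) + 1))[:BLOCK]

# ASSUMPTION: random bytes stand in for ciphertext, since good encryption
# output is statistically indistinguishable from random data.
ciphertext_like = os.urandom(BLOCK)


def ratio(data: bytes) -> float:
    """Return original size divided by zlib-compressed size."""
    return len(data) / len(zlib.compress(data, 6))


print(f"plaintext compression ratio:       {ratio(plaintext):.1f}:1")
print(f"ciphertext-like compression ratio: {ratio(ciphertext_like):.2f}:1")
```

The repetitive block shrinks dramatically while the random block does not shrink at all, which is the same reason an array sitting behind host-based encryption sees its space savings evaporate.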

Again, security is a tool, so choose the tools you need, use them judiciously, and make sure you fully understand their impact (end-to-end) in your environment.

EMC Recoverpoint and XtremIO Part 4 – Recovery and Summary

In this final post we are going to cover a simple recovery, as well as do a quick summary.  I’ll throw in a few bonus details for free.


Our CG has been running now for over 48 hours with our configuration – 48 hours Required Protection Window, 48 max snaps, one snap per hour.  Notice below that I have exactly (or just under, depending on how you measure) a 48 hour protection window.  I have one snap per hour for 48 hours and that is what is retained.  This is because of how I constructed my settings!


If I reduce my Required Protection Window to 24 hours, notice that IMMEDIATELY the snaps past 24 hours are nuked:


The distribution of snaps in this case wouldn’t be different because of how the CG is constructed (one snap per hour, 48 max snaps, 24 hour protection window = 1 snap per hour for 24 hours), but again notice that the Required Protection Window is much more than just an alerting setting in RP+XtremIO.
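The arithmetic behind that observation can be sketched in a few lines of Python.  This is only a simplified model of the behavior I observed, not EMC’s documented pruning algorithm, and the function name and parameters are my own invention for illustration.

```python
def expected_snaps(window_hours: int, max_snaps: int, interval_minutes: int) -> int:
    """Simplified model: how many periodic snaps are retained for a copy.

    ASSUMPTION: snaps older than the Required Protection Window are pruned,
    and the total never exceeds the configured max snaps.
    """
    snaps_in_window = (window_hours * 60) // interval_minutes
    return min(max_snaps, snaps_in_window)


# Original settings: 48 hour window, 48 max snaps, one snap per 60 minutes.
print(expected_snaps(48, 48, 60))

# After reducing the Required Protection Window to 24 hours.
print(expected_snaps(24, 48, 60))
```

Under this model the first case keeps 48 snaps and the second keeps 24, which matches what the snapshot list showed after I changed the window.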

Alright, back to our recovery example.  Someone dumb like myself ignored all the “Important” naming and decided to delete that VM.


Even worse, they decided to just delete the entire datastore afterwards.


But lucky for us we have RP protection enabled.  I’m going to head to RP and use the Test a Copy and Recover Production button.


I’ll choose my replica volume:


Then I decide I don’t want to use the latest image because I’m worried that the deletion actually exists in that snapshot.  I choose one hour prior to the latest snap.  Quick note: see that virtual access is not even available now?  That’s because with snap based promotion there is no need for it.  Snaps are instantly promoted to the actual replica LUN, so physical access is always available and always immediate no matter how old the image.


After I hit next, it spins up the Test a Copy screen.  Now normally I might want to map this LUN to a host and actually check it to make sure that this is a valid copy.  In this case because, say, I’ve tracked the bad user’s steps through vCenter logging, I know exactly when I need to recover.  An important note though: as you’ll see in a second, all snapshots taken AFTER your recovery image will be deleted!  But again, because I’m a real maverick I just tell it to go ahead and do the production recovery.


It gives me a warning that prod is going to be overwritten, and that data transfer will be paused.  It doesn’t warn you about the snapshot deletion but this has historically been RP behavior.


On the host side I do a rescan, and there’s my datastore.  It is unmounted at the moment so I’ll choose to mount it.


Next, because I deleted that VM I need to browse the datastore and import the VMX file back into vCenter.


And just like that I’ve recovered my VM.  Easy as pie!


Now, notice that I recovered using the 2:25 snap, and below this is now my snapshot list.  The 3:25 and the 2:25 snap that I used are both deleted.  This is actually kind of interesting because an awesome feature of XtremIO is that all snaps (even snaps of snaps) are independent entities; intermediate snaps can be deleted with no consequence.  So in this case I don’t necessarily think this deletion of all subsequent snaps is a requirement, however it certainly makes logical sense that they should be deleted to avoid confusion.  I don’t want a snapshot of bad data hanging around in my environment.



In summary, it looks like this snap recovery is fantastic as long as you take the time to understand the behavior.  Like most things, planning is essential to ensure you get a good balance of your required protection and capacity savings.  I hope for some more detailed breakdowns from EMC on the behavior of the snapshot pruning policies, and the full impact that settings like Required Protection Window have in the environment.

Also, don’t underestimate the 8,192 max snaps+vols for a single XMS system, especially if you are managing multiple clusters per XMS!  If I had to guess I would guess that this value will be bumped up in a future release considering these new factors, but in the meantime make sure you don’t overrun your environment.  Remember, you can still use a single XMS per cluster in order to sort of artificially inflate your snap ceiling.

Bonus Deets!

A couple of things of note.

First, in my last post I stated that I had noticed a bug with settings not “sticking.”  After I talked with a customer, he indicated this doesn’t have to do with the settings (the values) but with the process itself.  Something about the order is important here.  And now I believe this to be true, because if I recreate a CG with those same busted settings, it works every time!  I can’t get it to break. 🙂  I still believe this to be a bug, so just double check your CG settings after creating.

Second, keep in mind that today XtremIO dashboard settings display your provisioned capacity based on volumes and snapshots on the system, with no regard for who created those snaps.  So you can imagine with a snap based recovery tool, things get out of hand quickly. I’m talking about 1.4PB (no typo – PETAbytes) “provisioned” on a 20TB brick!


While this is definitely a testament to the power (or insanity?) of thin provisioning, I’m trying to put in a feature request to get this fixed in the future because it really messes with the dashboard relevance.  But for the moment just note that for anything you protect with RP:

  • On the Production side, you will see a 2x factor of provisioning.  So if you protected 30TB of LUNs, your provisioned space (from those LUNs) will be 60TB.
  • On the Replica side, you will see a hilarious factor of provisioning, depending on how many snaps you are keeping.
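As a rough back-of-the-envelope sketch of those two bullets in Python: the 2x production factor comes from what I saw on the dashboard, but the replica formula is purely an assumption on my part (the replica volume plus one full-size “provisioned” entry per retained snap).  The function name is hypothetical and the real dashboard math may differ.

```python
def provisioned_tb(protected_tb: float, retained_snaps: int) -> tuple[float, float]:
    """Estimate dashboard 'provisioned' capacity for RP-protected LUNs.

    Returns (production_side_tb, replica_side_tb).
    """
    # Observed: the production side shows a 2x factor of provisioning.
    production = protected_tb * 2
    # ASSUMPTION: the replica side counts the replica volume once plus
    # one full-size entry for every retained snapshot.
    replica = protected_tb * (1 + retained_snaps)
    return production, replica


prod_tb, replica_tb = provisioned_tb(30, 48)
print(f"production side: {prod_tb} TB provisioned")
print(f"replica side:    {replica_tb} TB provisioned")
```

Even if the exact replica multiplier is off, the point stands: the displayed number grows with every snap you keep, so it says very little about real consumption.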

I hope this series has been useful – I’m really excited about this new technology pairing!

EMC Recoverpoint and XtremIO Part 3 – Come CG With Me

In this post we are going to configure a local consistency group within XtremIO, armed with our knowledge of the CG settings.  I want to configure one snap per hour for 48 hours, 48 max snaps.

Because I’m working with local protection, I have to have the full featured licensing (/EX) instead of the basic (/SE) that only covers remote protection.  Note: these licenses are different than normal /SE and /EX RP licenses!  If you have an existing VNX with standard /SE, then XtremIO with /SE won’t do anything for you!

I have also already configured the system itself, so I’ve presented the 3GB repository volume, configured RP, and added this XtremIO cluster into RP.

All that’s left now is to present storage and protect!  I’ve got a 100GB production LUN I want to protect.  I have actually already presented this LUN to ESX, created a datastore, and created a very important 80GB Eager Zero Thick VM on it.


First things first, I need to create a replica for my production LUN – this must be the exact same size as the production LUN, although again this is always my recommendation with RP anyway.  I also need to create some journal volumes.  Because this isn’t a distributed CG, I’ll be using the minimum 10GB sizing.  Lucky for us creating volumes on XtremIO is easy peasy.  Just a reminder – you must use 512 byte blocks instead of 4K, but you are likely using that already anyway due to lack of 4K support.


Next I need to map the volume.  If you haven’t seen the new volume screen in XtremIO 4.0, it is a little different.  Honestly I kind of like the old one, which was a bit more visual, but I’m sure I’ll come to love this one too.  I select all 4 volumes and hit the Create/Modify Mapping button.  Side note: notice that even though this is an Eager Zero’d VM, there is only 7.1MB used on the volume highlighted below.  How?  At first I thought this was the inline deduplication, but XtremIO does a lot of cool things, and one neat thing it does is discard all full-zero block writes coming into the box!  So EZTs don’t actually inflate your LUNs.


Next I choose the Recoverpoint Initiator group (the one that has ALL my RP initiators in it) and map the volume.  LUN IDs have never really been that important when dealing with RP, although in remote protection it can be nice to try to keep the local and remote LUN IDs matching up.  Trying to make both host LUN IDs and RP LUN IDs match up is a really painful process, especially in larger environments, for (IMO) no real benefit.  But if you want to take that up, I won’t stop you, Sisyphus!

Notice I also get a warning because it recognizes that the Production LUN is already mapped to an existing ESX host.  That’s OK though, because I know with RP this is just fine.


Alright now into Recoverpoint.  Just like always I go into Protection and choose Protect Volumes.


These screens are going to look pretty familiar to you if you’ve used RP before.  On this one, for me typically CG Name = LUN name or something like it, Production name is ProdCopy or something similar, and then choose your RPA cluster.  Just like always, it is EXTREMELY important to choose the right source and destinations, especially with remote replication.  RP will happily replicate a bunch of nothing into your production LUN if you get it backwards!  I choose my prod LUN and then I hit modify policies.


In modify policy, like normal I choose the Host OS (BTW I’ll happily buy a beer for anyone who can really tell me what this setting does…I always set it but have no idea what bearing it really has!) and now I set the maximum number of snaps.  This setting controls how many total snapshots the CG will maintain for the given copy.  If you haven’t worked with RP before this can be a little confusing because this setting is for the “production copy” and then we’ll set the same setting for the “replica copy.”  This allows you to have different settings in a failover situation, but most of the time I keep these identical to avoid confusion.  Anywho, we want 48 max snaps so that’s what I enter.


I hit Next and now deal with the production journal.  As usual I select that journal I created and then I hit modify policy.


More familiar settings here, and because I want a 48 hour protection window, that’s what I set.  Again based on my experience this is an important setting if you only want to protect over a specific period of time…otherwise it will spread your snaps out over 30 days.  Notice that snapshot consolidation is greyed out – you can’t even set it anymore.  That’s because the new snapshot pruning policy has effectively taken its place!


After hitting next, now I choose the replica copy.  Pretty standard fare here, but a couple of interesting items in the center – this is where you configure the snap settings.  Notice again that there is no synchronous replication; instead you choose periodic or continuous snaps.  In our case I choose periodic and a rate of one per 60 minutes.  Again I’ll stress, especially in a remote situation it is really important to choose the right RPA cluster!  Naming your LUNs with “replica” in the name helps here, since you can see all volume names in Recoverpoint.


In modify policies again we set that host OS and a max snap count of 48 (same thing we set on the production side).  Note: don’t skip over the last part of this post where I show you that sometimes this setting doesn’t apply!


In case you haven’t seen the interface to choose a matching replica, it looks like this.  You just choose the partner in the list at the bottom for every production LUN in the top pane.  No different from normal RP.


Next, we choose the replica journal and modify policies.


Once again setting the required protection window of 48 hours like we did on the production side.


Next we get a summary screen.  Because this is local it is kind of boring, but with remote replication I use this opportunity to again verify that I chose the production site and the remote site correctly.


After we finish up, the CG is displayed like normal, except it goes into “Snap Idle” when it isn’t doing anything active.


One thing I noticed the other day (and why I specifically chose these settings for this example) is that for some reason the replica copy policy settings aren’t getting set correctly sometimes.  See here, right after I finished up this example the replica copy policy OS and max snaps aren’t what I specified.  The production is fine.  I’ll assume this is a bug until told otherwise, but just a reminder to go back through and verify these settings when you finish up.  If they are wrong you can just fix them and apply.


Back in XtremIO, notice that the replica is now (more or less) the same size as the production volume as far as used space.  Based on my testing this is because the data existed on the prod copy before I configured the CG.  If I configure the CG on a blank LUN and then go in and do stuff, nothing happens on the replica LUN by default because it isn’t rolling like it used to.  Go snaps!


I’ll let this run for a couple of days and then finish up with a production recovery and a summary.

EMC RecoverPoint and XtremIO Part 1 – Initial Findings and Requirements

Back in the saddle again after a long post drought!  I’ve been busy lately working on some training activities with Pluralsight, as well as dealing with a company merger.  I’m no longer with Varrow, as Varrow was acquired by Sirius Computer Solutions.  I’ve also been enjoying time with my son, who is about to turn 1 year old – hard to believe!

Over the past couple of weeks, I’ve been involved in some XtremIO and Recoverpoint deployments.  RP+XtremIO just released not too long ago and it has been a bit of a learning curve – not with the product itself, but with the new methodology.  I wanted to lay out some details in case anyone is looking at this solution.

There is a good whitepaper called Recoverpoint Deploying with XtremIO Tech Notes.  It does a good job of laying out the functionality, but for me at least it still missed some important details – or maybe just didn’t phrase them so I could understand.

First, great news, from a functional standpoint this solution is roughly the same as all other RP implementations.  The same familiar interface is there, you create CGs and can do things like test a copy, recover production, and failover.  So if you are familiar with Recoverpoint protection operationally there is not a lot of difference.

Under the covers, things are hugely different.  I’m going to talk about the snap based replication a little later, and probably in part 2 as well.

First, the actual deployment is roughly the same.  Don’t forget your code requirements:

  • RecoverPoint 4.1.2 or later
  • XtremIO 4.0 or later

RP is deployed with Deployment Manager as usual, and XtremIO is configured as usual. 3GB repository volume (as usual!).

RP to XtremIO zoning is simple – everything to everything.  A single zone with all RP ports and all XtremIO ports from a single cluster in each fabric.

With the new 4.0 code, a single XtremIO Management Server (XMS) can manage multiple clusters.  Even though it would probably work, I would use a single zone per fabric for each cluster regardless of whether it is in the same XMS or not. More on the multi-cluster XMS with Recoverpoint later…
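If you script your zoning, the “everything to everything, one zone per fabric per cluster” layout can be sketched like this in Python.  This only builds zone membership lists; it does not talk to any switch.  The function name, the zone naming scheme, and every WWPN below are hypothetical placeholders.

```python
def build_zones(rp_ports: dict[str, list[str]],
                xio_ports: dict[str, dict[str, list[str]]]) -> dict[str, list[str]]:
    """Build one zone per fabric per XtremIO cluster.

    rp_ports:  {fabric_name: [RPA port WWPNs on that fabric]}
    xio_ports: {cluster_name: {fabric_name: [XtremIO port WWPNs on that fabric]}}
    Each zone holds ALL RP ports and ALL ports of one cluster in one fabric.
    """
    zones = {}
    for cluster, by_fabric in xio_ports.items():
        for fabric, wwpns in by_fabric.items():
            zone_name = f"RP_{cluster}_{fabric}"  # hypothetical naming scheme
            zones[zone_name] = sorted(rp_ports.get(fabric, []) + wwpns)
    return zones


# Hypothetical WWPNs, for illustration only.
rp = {"fabA": ["50:01:24:80:00:00:00:01"], "fabB": ["50:01:24:80:00:00:00:02"]}
xio = {
    "xio1": {"fabA": ["51:4f:0c:50:00:00:00:01"], "fabB": ["51:4f:0c:50:00:00:00:02"]},
    "xio2": {"fabA": ["51:4f:0c:50:00:00:01:01"], "fabB": ["51:4f:0c:50:00:00:01:02"]},
}

for name, members in build_zones(rp, xio).items():
    print(name, members)
```

With two clusters on two fabrics this yields four zones, which is the “single zone per fabric for each cluster” layout I prefer even when both clusters share an XMS.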

When you go to add XtremIO arrays into RP, you’ll use the XMS IP, and then a new rp_user account.  I’m not sure what the default password here is, so I just reset the password using the CLI.  If you have pre-zoned, you just select the XtremIO array from the list, give the XMS IP and rp_user creds.  If you haven’t pre-zoned, you also have to enter the XtremIO serial.


Here is the “I didn’t zone already” screen.  If you did pre-zone, you’ll see your serial in the list at the top and don’t need to enter it below.  Port 443 is required to be open between RP, XMS, and SCs.  Port 11111 is required between RP and SCs.  Usually this is in the same data center so not a huge deal.
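If you want to sanity check those ports before adding the array, a generic TCP connect test is enough.  Here is a minimal Python sketch using only the standard library.  The host names are hypothetical placeholders, and the check only means something when run from a machine on the same network path as the RPAs, which is an assumption about your environment.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical targets: 443 to the XMS and SCs, 11111 to the SCs.
TARGETS = [
    ("xms.example.com", 443),
    ("xio-sc1.example.com", 443),
    ("xio-sc1.example.com", 11111),
]


def check_all(targets: list[tuple[str, int]]) -> dict[tuple[str, int], bool]:
    """Run port_open against every (host, port) pair and collect the results."""
    return {(host, port): port_open(host, port) for host, port in targets}
```

Calling check_all(TARGETS) with your real addresses swapped in returns a dictionary of reachable and unreachable endpoints, so a blocked firewall rule shows up before RP complains about it.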

Once the arrays have been added in and your RP cluster is configured like you want it, the rest is again same as usual.

  1. Create initiator group on XtremIO for Recoverpoint with all RP initiators.
  2. Create journal volumes, production volumes, and replica volumes.
  3. Present them to RecoverPoint
  4. Configure consistency groups.  Here there are important things to understand about the snap-based protection schemes, that I’ll go over later.

One important change due to the snap based recovery – no recovery data is stored in journals, only metadata related to snapshots!  Because of this, journals need to be as small as possible – 10GB for normal CGs, 40GB for distributed.  They won’t use all this space but we don’t care (assuming your jvols are on XtremIO) because XtremIO is thin anyway.  Similarly, each CG only needs one journal, as your protection window is not defined by your total journal capacity.

RP with XtremIO licensing is pretty simple.  You can either buy a basic (“/SE”) or full (“/EX”) license for your brick size.  Either way you can protect as much capacity as you can create, which is nice considering XtremIO is thin and does inline dedupe/compression.  Essentially basic just gives you remote protection, XtremIO to XtremIO only, only 1 remote copy.  Full adds in the ability to do local, as well as go from anything to anything (e.g. XtremIO to VNX, or VMAX to XtremIO), and a 3rd copy (so production, local, and remote, or production and two remote).  Obviously you need EX or CL licensing for the other arrays if you are doing multiple types.  Just a point of clarification here, the “SE” and “EX” for XtremIO are different than normal.  So if you have a VNX with /SE licensing, you can’t use it with /SE (or even /EX) XtremIO licensing. 

If you are using iSCSI with XtremIO, you can still do RP in direct attach mode, similar to what we do on VNX iSCSI.  Essentially you will direct attach up to two bricks to your RP cluster, which must have exactly 2 nodes.  I would imagine (though not confirmed) that you could have more than two bricks, but would only attach two of them to RPAs.  vRPA is not currently supported – this remains a Clariion/VNX/VNXe only product.

I’m going to cover some details about the snap based protection in the next post, but in the meantime know that because it IS all snap based and there is no data in the journal to “roll” to, image access is always direct and always near instantaneous.  It doesn’t matter if you are trying to access an image from 1 minute ago that has 4KB worth of changes, or an image from a week ago with 400GB worth of changes.  This part is very cool, as there is no need to worry about rolling.  There is also no need to worry about the undo log for image access – with traditional Recoverpoint you were “gently encouraged” 🙂 to not stay in image access for a long time, because as the writes piled up, eventually replication would halt.  And there was a specific capacity for the undo log.

Instead now the only capacity based limit you are concerned about is the physical capacity on the XtremIO brick itself.

Allegedly Site Recovery Manager is supported but I didn’t do any testing with that.

RP only supports volumes from XtremIO that use 512 as the logical block size, not the 4K block size.  There is so little support for 4K block size right now that I’m still strongly discouraging anyone from using it, unless they have tons of sign off and have done tons of testing.  But if you are using 4K block size, then you won’t be able to use RP protection.  Just to clarify, this is the setting I’m talking about – this is unrelated to FS block sizing a-la NTFS or anything of that nature.


A few other random caveats:

  • If one volume at a copy is on an XtremIO array, then all volumes at that copy must be on that XtremIO array.  So for a given single copy (all the volumes in a copy), you can’t split them between array types or even clusters due to snapshotting.
  • There must be a match between production and the replica size, although I always recommend this anyway.
  • Resize for volumes is unfortunately back to the old way.  Remove both prod/replica volumes from the CG, resize, then re-add.  Hopefully a dynamic resize will be available at some point.

In the next post I’m going to talk about some things I know and some things I’ve observed during testing with the snapshotting behavior, but I wanted to call out a specific limitation right now and will probably hammer on it later – there is an 8,192 limit of total volumes + snapshots per XMS irrespective of Recoverpoint.  This sounds like a ton, but each production volume you protect will have (at times) two snapshots associated with it.  Each replica volume will have max_snaps + 1 snapshots associated with it.  Because this is a per XMS limitation and not a per cluster limitation, depending on exactly how many volumes you have and how many snapshots you want to keep, you may still want a single XMS per cluster in a multi-cluster configuration.
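To get a feel for how quickly that ceiling approaches, here is a rough budgeting sketch in Python.  The per-volume counts come straight from the numbers above (two snaps per production volume, max_snaps + 1 per replica volume).  Everything else, including the function names and the idea of lumping journals and unprotected volumes into a single other_objects figure, is my own simplification and not an EMC sizing tool.

```python
XMS_OBJECT_LIMIT = 8192  # total volumes + snapshots per XMS


def objects_per_protected_volume(max_snaps: int, replica_on_same_xms: bool = True) -> int:
    """Worst-case volume + snapshot count consumed by one protected volume."""
    production = 1 + 2  # the volume itself plus (at times) two snapshots
    # The replica only counts against this XMS if it lives on a cluster it manages.
    replica = 1 + (max_snaps + 1) if replica_on_same_xms else 0
    return production + replica


def max_protected_volumes(max_snaps: int, other_objects: int = 0,
                          replica_on_same_xms: bool = True) -> int:
    """How many volumes can be protected before hitting the XMS limit.

    other_objects: journals, repository, and unprotected volumes (ASSUMPTION:
    you count these yourself and pass them in).
    """
    per_volume = objects_per_protected_volume(max_snaps, replica_on_same_xms)
    return (XMS_OBJECT_LIMIT - other_objects) // per_volume


print(objects_per_protected_volume(48))
print(max_protected_volumes(48))
```

With 48 max snaps and local replicas on the same XMS, each protected volume can consume 53 objects, so the ceiling lands at roughly 154 protected volumes before counting journals or anything else on the system.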

More to come!