Monday, 23 January 2012

Use of Remote BLOB storage in SharePoint 2010 to reduce backup overheads

King of the snappy and interesting blog post titles, that's me...

Our current 2007 storage topology is, to be frank, not fit for purpose. Too much content sits inside a single site collection, which means a single database, and understandably the chaps who run the storage infrastructure, especially the backup and disaster recovery setup, aren't happy at having to back it all up overnight, every night.

Part of the purpose of our ongoing migration work is to look at this setup and see if there's anything we can do to make it a little more sane and manageable for backup. One option is to look at persisting all of the main document storage (which is to be split more sensibly across a multitude of content silo databases) via Remote BLOB (Binary Large OBject) Storage (RBS). RBS is new to SP 2010, though not to the world of SQL, which has had the FILESTREAM option that carries out this cleverness in the background since SQL 2008.

On the surface this would be a nice neat solution for our situation and looking at the planning documentation makes me think that there are scenarios where it could work nicely, but you obviously have to be very careful about how you use it as there are some key limitations to be aware of.

Key considerations

For our context, the critical issues to note about RBS enabled databases are:

  • 200GB limit per database (well, ok, not actually a 200GB limit, but in our situation, as this article explains, we would have to carefully benchmark our IOPS to go past 200GB up to the max 4TB enabled by SP 2010 SP1)
  • No encryption supported, even transparent DB encryption
  • RBS is bad for situations where you're writing lots of small BLOBs (< 256KB); it's good for situations where you have fewer, larger BLOBs (> 256KB)
  • Database mirroring is not supported for FILESTREAM enabled DBs: you cannot have FILESTREAM filegroups on the principal mirroring server
Obviously your issues may differ and the fit between your data environment and RBS may be better or worse. 
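
To make that checklist a bit more concrete, here's a rough Python sketch of how you might screen a content database against the four points above. The thresholds come straight from the list; the `ContentDatabase` fields and the `rbs_concerns` function are names I've made up for illustration, not anything SharePoint or SQL Server provides.

```python
from dataclasses import dataclass
from typing import List

KB = 1024
GB = 1024 ** 3

# Thresholds taken from the list above; everything else is hypothetical.
SIZE_REVIEW_LIMIT = 200 * GB       # beyond this, benchmark IOPS first
SIZE_HARD_LIMIT = 4 * 1024 * GB    # 4TB maximum with SP 2010 SP1
SMALL_BLOB_LIMIT = 256 * KB        # BLOBs under this size suit RBS poorly


@dataclass
class ContentDatabase:
    """Hypothetical description of one content database."""
    name: str
    size_bytes: int
    typical_blob_bytes: int
    needs_encryption: bool
    needs_mirroring: bool


def rbs_concerns(db: ContentDatabase) -> List[str]:
    """Return which of the listed considerations apply to this database."""
    concerns = []
    if db.size_bytes > SIZE_HARD_LIMIT:
        concerns.append("over the 4TB maximum")
    elif db.size_bytes > SIZE_REVIEW_LIMIT:
        concerns.append("over 200GB: benchmark IOPS first")
    if db.needs_encryption:
        concerns.append("encryption not supported")
    if db.typical_blob_bytes < SMALL_BLOB_LIMIT:
        concerns.append("mostly small BLOBs (< 256KB)")
    if db.needs_mirroring:
        concerns.append("database mirroring not supported with FILESTREAM")
    return concerns


if __name__ == "__main__":
    docs = ContentDatabase("docs", 300 * GB, 2048 * KB, False, True)
    for concern in rbs_concerns(docs):
        print(f"{docs.name}: {concern}")
```

An empty list doesn't mean RBS is the right answer, only that none of the four red flags above has been raised.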

No database mirroring? Erk!

The last issue suddenly set off alarm bells for me, as we're intending to use a mirrored DB setup as the SQL backend for our SP infrastructure. The limitation itself makes perfect sense: mirroring works by sending transaction logs directly from the principal to the mirror server, BLOBs don't exist in transaction logs, and the mirror server cannot (by definition) claim access to the principal server's contents. So we're at a bit of an impasse: we can hardly claim to be improving the resilience and DR strategy of our SP infrastructure if we have to ditch DB mirroring as an option.

Alternatives to mirroring

There are some alternative routes available for DR/resilience. FILESTREAM does work in concert with fail-over clustering in SQL 2008 R2, but clustering is less performant than mirroring when it comes to providing resilience (even though it's probably quicker in general operation than mirroring, as it doesn't have a principal 'waiting' for the mirror to catch up). Recovery in fail-over clustering is slower and there is no load-balancing of effort. It's for this reason that most big orgs use mirrored clusters, giving them the best balance of immediate switch-over and good overall DR.

We're not in that situation, and in any case it wouldn't resolve the FILESTREAM issue, only compound it, so at present I really don't know what the best option is going to be. Log shipping also works with FILESTREAM, but I'm not keen on that option as, again, fail-over is slow and, even worse, manual.
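
For what it's worth, the trade-offs above can be laid out as a small lookup. This is only a Python sketch of the options as I've characterised them in this post; the `Option` type, the `filestream_compatible` helper and the wording of the notes are mine, not anything from SQL Server.

```python
from typing import List, NamedTuple


class Option(NamedTuple):
    """One resilience option, as described in this post (hypothetical type)."""
    name: str
    works_with_filestream: bool
    notes: str


# Characterisations are taken from the discussion above, not from any API.
OPTIONS = [
    Option("database mirroring", False, "quick switch-over"),
    Option("fail-over clustering", True, "slower recovery, no load-balancing"),
    Option("log shipping", True, "slow and manual fail-over"),
]


def filestream_compatible(options: List[Option] = OPTIONS) -> List[str]:
    """Names of the options that can be combined with FILESTREAM."""
    return [o.name for o in options if o.works_with_filestream]


if __name__ == "__main__":
    for option in OPTIONS:
        marker = "yes" if option.works_with_filestream else "no"
        print(f"{option.name}: FILESTREAM {marker} ({option.notes})")
```

Filtering on FILESTREAM support leaves exactly the two options I'm least keen on, which is the whole problem in a nutshell.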

Conclusions
 
So, this leaves us (me) in something of a dilemma. We can't take advantage of FILESTREAM, which is required for RBS, unless we ditch the idea of database mirroring as part of our resilience and DR model. The alternatives to mirroring have their own drawbacks, and I'm not sure I'm happy to compromise on them.

Interestingly (though of no help here as it's in RC0 still), SQL 2012, 'Denali', brings a new option to the table in the form of AlwaysOn Availability Groups, which do combine with FILESTREAM (though whether SharePoint will support this is not yet obvious).

In summary then, I'm likely going to need to talk very nicely to my systems backup guys and hope they don't baulk at the prospect of not being able to do differential backups of nice neat files sat on disk...

- rob 
