So, I haven't mentioned it yet, but we're (finally!) getting around to upgrading our infrastructure from MOSS 2007 to SP2010.
It's been a very long time coming and it's providing an opportunity to have a really good look at the SP farms we already have, how they're structured and to correct a few less-than-best-practices that we've fallen into.
We're working with an excellent consultant to try and get the most out of this transition across a number of areas. I want this to be an opportunity seized rather than suffered, and I always try to view an experienced external perspective as an opportunity to take on board deficiencies, learn best practice and implement solid plans for ensuring the long-term resilience and scalability of the SharePoint platform at my organisation.
Anyway, one of the things that he's mentioned in passing is a new feature in SP2010 called 'Document Sets'. Having done some reading around the subject (see how they work, how they compare to folders and what they really are), I think I'm almost sold on the idea of ditching the current storage architecture, which is heavily reliant on folders for content segregation, in favour of the document set approach. I wanted to take the time to explain why folders were the answer for me in 2007 so that I can better explain why document sets seem to be a better option when shifting to 2010.
The existing environment uses folders
The prevailing view in the SharePoint community (and I'd guess in any context where good document management systems are in place) is that folders are dead and metadata is king. I completely agree with this sentiment, which is why I used folders in our 2007 environment.
Wait, what? Well, for me folders in 2007 were never about content organisation; that is to say, they were not a useful route for information discovery (except in some very rigid circumstances). Rather, they were a low-fuss way around one of the most annoying limitations of 2007, the well-known limit of 2,000 items in a container. Folders are one of the quickest ways of getting round this, and the architecture I put in place to observe this recommendation ensures that no user can accidentally navigate to a page where they experience a horrendous slowdown as they try to view 10,000 documents. So for me, folders were not dead, in fact they were very much alive; not as organisers of content but as a way of segregating it. All access to content is via indexed metadata properties and, essentially, customised data view web parts that manage the display of searched content.
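To make the segregation idea concrete, here is a minimal sketch in plain Python (not the SharePoint object model) of how a numeric record ID could be mapped to a fixed-range folder so that no container goes past 2,000 items. The function name, the figure of 20 documents per record and the folder naming scheme are all assumptions made for illustration; this is not a description of the scheme actually deployed.

```python
ITEM_LIMIT = 2000  # the per-container figure discussed above


def folder_for(record_id: int, docs_per_record: int = 20) -> str:
    """Return a hypothetical folder name for a record ID.

    Each folder covers a fixed range of IDs, sized so that the
    expected number of documents stays within ITEM_LIMIT.
    docs_per_record is an assumed upper estimate, not a measured value.
    """
    if record_id < 0:
        raise ValueError("record_id must be non-negative")
    if not 1 <= docs_per_record <= ITEM_LIMIT:
        raise ValueError("docs_per_record must be between 1 and ITEM_LIMIT")

    records_per_folder = ITEM_LIMIT // docs_per_record
    start = (record_id // records_per_folder) * records_per_folder
    end = start + records_per_folder - 1
    return f"{start:07d}-{end:07d}"


if __name__ == "__main__":
    print(folder_for(1234567))
```

Because users only ever reach documents through metadata queries, the folder names never need to mean anything to them; they only need to be predictable for whatever code files the documents.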
Document sets as potential folder alternative
Now that document sets have come on the scene I feel I have a potentially better tool at my disposal for organising content related to the metadata 'objects' in existence. I'm especially interested in the 'Welcome Page' concept as it looks as though it would be a very neat way of managing some of the functionality I've already put in place in the 2007 infrastructure, by displaying other related LOB data with the documents in the set. Anything that makes the rolling up of these sorts of 'role-based' views easier is definitely a step in the right direction in my book. The content routing feature would also hopefully take the place of my existing 'uploads triage' system, allowing users to upload docs that are automatically routed to the correct document set. It's a bit of a fudge by all accounts, but it could be extremely useful; though it would necessitate some pre-validation of metadata to ensure that users can't enter invalid IDs, or IDs that are valid in format but don't refer to a real applicant/student.
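The two-stage pre-validation mentioned above (is the ID well formed, and does it refer to a real person?) could look something like the following sketch. It is plain Python rather than SharePoint code, the seven-digit ID format is an assumption, and the `exists` callback is a hypothetical stand-in for a lookup against the student records system.

```python
import re
from typing import Callable, Tuple

# Assumed ID format for illustration only: exactly seven ASCII digits.
ID_PATTERN = re.compile(r"[0-9]{7}")


def validate_id(raw: str, exists: Callable[[str], bool]) -> Tuple[bool, str]:
    """Check an ID before a document is routed.

    Stage 1 rejects IDs that are badly formed.
    Stage 2 rejects IDs that are well formed but unknown, using the
    caller-supplied `exists` lookup (hypothetical; it stands in for
    a query against the real applicant/student records).
    """
    candidate = raw.strip()
    if not ID_PATTERN.fullmatch(candidate):
        return False, "badly formed ID"
    if not exists(candidate):
        return False, "no applicant or student with this ID"
    return True, "ok"


if __name__ == "__main__":
    known_ids = {"1234567"}  # stand-in for the records system
    for attempt in (" 1234567 ", "12345", "7654321"):
        print(repr(attempt), validate_id(attempt, known_ids.__contains__))
```

Keeping the existence check behind a callback means the format rule can be tested on its own, and the lookup can be swapped for whatever the LOB system actually exposes.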
Allotting a document set per student and per applicant, or even generating a document set at the applicant stage that then gets translated into a document set for the student, will give a lot of flexibility when it comes to managing the display, location and eventual archiving of student documents.
Document set limitations
However, there do appear to be a few fairly critical limitations related to document sets that may make this a tricky transition.
One thought I'd immediately had with document sets was: how about an overarching set that has subsets relating to the applicant and student phases of the student life-cycle. Well, no dice, you can't nest document sets (well, ok, I can see why, they shouldn't just become a folder-like entity, but still, it would've been nice here).
I'm also not sure how document sets will work in the archiving situation, as document sets can't be declared as in-place records; you have to push them into the records centre first and then declare them. I wasn't envisaging users having to manually declare an applicant or student file as a record, but what if they do want to? It would mean additional training and guidance at the very least.
As with any new feature, it's always useful to make sure you do a wide review of available 'literature' on it to get a 'warts and all' view of it. Then you must spend time in advance of even double-clicking on your IDE planning through how you can use it to best advantage in your own environment. For me I think the most likely outcome would be three document set types, one related to direct applications, one to undergraduate applications and another to students. It would be possible to break student document sets into course types (i.e. undergrad, postgrad taught, postgrad research) but we get plenty of students who have multiple stays that cut across these boundaries and even within a given stay a student can have both taught and research components to their studies covering multiple independent courses.
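The trade-off between one 'student' set and a set per course type can be sketched with a toy model. This is plain Python with hypothetical class and function names, not SharePoint content type definitions; it simply counts how many document sets one person would end up with if student sets were split by course type.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class SetType(Enum):
    """The three document set types considered above."""
    DIRECT_APPLICATION = "direct application"
    UNDERGRADUATE_APPLICATION = "undergraduate application"
    STUDENT = "student"


@dataclass
class Stay:
    # A single stay can mix course types, e.g. taught and research.
    course_types: List[str]


@dataclass
class StudentRecord:
    student_id: str
    stays: List[Stay] = field(default_factory=list)


def sets_needed_if_split_by_course_type(record: StudentRecord) -> int:
    """Count the distinct course types across all of a student's stays."""
    return len({ct for stay in record.stays for ct in stay.course_types})


if __name__ == "__main__":
    record = StudentRecord(
        student_id="1234567",  # made-up ID
        stays=[
            Stay(["undergrad"]),
            Stay(["postgrad taught", "postgrad research"]),
        ],
    )
    print(sets_needed_if_split_by_course_type(record))
```

With a single `SetType.STUDENT` set the same person has exactly one set regardless of how many stays or course types they accumulate, which is the argument for not splitting further.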
Where to go from here?
This is still something of an open question though. Whilst I like very much some of the commonality of content you get with document sets, there are advantages to going 'lightweight' and letting the metadata do all of the talking at the document level, especially when it comes to dicing up those documents for display in a UI.
I already have interfaces set up for managing the relationship between the applicant and student documents an individual has with us, including switching between the different relationships depending upon the role of the end user. My suspicion is that this may not be as easy with document sets (though my gut tells me it's perfectly possible) or that it might not be as intuitive. I'm also concerned that some of the limitations of document sets might appear to be ok right now, but could cause problems further down the line.
If I can get my head around how a single document set could usefully represent the whole life cycle of applicant <-> student <-> alumnus then I'd be making some progress, but I'm not sure they're actually that well suited to that kind of 'object'. They seem specifically targeted at the legal community, who like to keep case files, and whilst you could treat the life cycle as a 'case' I don't think that this would necessarily serve the users in the best way.