You may recall the last two blog posts (‘What Do We Really Mean By The Term ‘Content’?’ and ‘The Perils Of The ‘Garbage In’ School Of Content Migration’) on the topic of content migration services, where we discussed what ‘content’ actually is, what forms it tends to reside in, what steps tend to be taken to preserve it and so on. Now we need to discuss two important aspects of the migration itself, partly to detail how we can help you but also to underline the importance of the concepts.
Let’s start with data enhancement. Documents that live in a file system often have very little information associated with them apart from their file name. It can be useful to extend that so each document also carries a title, a subject and so on. The other ‘data’ to add is where the document sits within the folder structure. Think of a ‘finances’ folder containing ‘expenses’, with files classified by day or month.
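As a rough illustration of this kind of enhancement, the sketch below derives a few metadata fields from a file's position in a folder tree. The function name, the field names and the example path are all hypothetical, and a real migration would map folders to whatever taxonomy the business agrees on.

```python
from pathlib import PurePosixPath


def metadata_from_path(path, root):
    """Derive simple metadata from where a file sits in the folder tree.

    Hypothetical helper: the field names are illustrative assumptions,
    not the schema of any particular content management system.
    """
    relative = PurePosixPath(path).relative_to(root)
    folders = list(relative.parts[:-1])  # every folder between root and file
    return {
        "name": relative.name,
        # Turn a file name such as "taxi_receipt" into a readable title.
        "title": relative.stem.replace("_", " ").replace("-", " ").title(),
        "folders": folders,
        # Assumption: the top-level folder is a reasonable 'subject'.
        "subject": folders[0] if folders else None,
    }


example = metadata_from_path(
    "/shared/finances/expenses/2015/march/taxi_receipt.pdf", "/shared"
)
print(example)
```

Here the folder names ‘finances’, ‘expenses’, ‘2015’ and ‘march’ become searchable data attached to the document, rather than something the user has to click through.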
A common problem is that, while the folder structure has probably served the organisation well over time, information held this way always has to be actively extracted. You search by working through the folder hierarchy, a hierarchy that has evolved over the years and may now be many folders deep after 10 years or so of continuous service. That extraction process can end up being quite an onerous task.
In contrast, the most widely accepted and popular method for finding information today is to Google it. I talked last time about how we can add a ‘Google’ front end to your content. Now I am going to start explaining what that really means.
The Google school of keeping ‘score’
In Google, you type a few words into a text field, hit return and get a list of results that can run to thousands of pages. In this paradigm, you get the information you want by scanning the first page of results.
Very rarely will anyone go past that first page of results. That means that for the corporate data and content migration task, you need to ensure that an appropriate taxonomy is added to the content during migration, so that the first page shows the most relevant results to the end user.
That’s why we emphasise the importance of data enhancement. We can work with you to ensure that when we put information into the system it carries the necessary data taxonomy and not only the content. (All systems will now typically search on the content itself; that is how Google does its indexing, plus a few keywords.) We provide this as part of the migration process: the business helps clean up the data prior to the move and can add useful information to the data associated with each document. This provides the end user with a much richer search capability.
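To make the effect of that extra data concrete, here is a toy ranking sketch. It is not how Google or any particular search product scores results; it simply assumes that a query term found in a document's metadata counts for more than the same term found in its body text. The document names, field names and the weight of 3 are invented for the example.

```python
def score(doc, query_terms, metadata_weight=3):
    """Count query-term matches, weighting metadata hits above content hits.

    The weight is an arbitrary assumption chosen for illustration.
    """
    content_words = doc["content"].lower().split()
    metadata_words = " ".join(doc["metadata"].values()).lower().split()
    total = 0
    for term in query_terms:
        total += content_words.count(term)
        total += metadata_weight * metadata_words.count(term)
    return total


def search(docs, query, limit=10):
    """Return the names of the best-scoring documents: the 'first page'."""
    terms = query.lower().split()
    scored = [(score(doc, terms), doc) for doc in docs]
    hits = [(s, doc) for s, doc in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc["name"] for _, doc in hits[:limit]]


docs = [
    {
        "name": "notes.txt",
        "content": "we discussed expenses briefly in the meeting",
        "metadata": {},  # migrated with no enhancement
    },
    {
        "name": "march_claims.xls",
        "content": "taxi 12.50 hotel 90.00",
        "metadata": {"subject": "finances", "category": "expenses"},
    },
]

results = search(docs, "expenses")
print(results)
```

In this sketch the spreadsheet never mentions the word ‘expenses’ in its content, yet it ranks first for that query because the term was added as metadata during migration. Without the enhancement it would not appear in the results at all.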
Next, we’ll look at de-duplication of content and data.