SynApps Solutions’ joint chief exec James Paton offers some useful tips on avoiding common problems with moving to the cloud
I’d like to pass on some friendly and useful advice to anyone looking at sizing up a new Enterprise Content Management (ECM) system in the cloud.
Where do things go wrong? It usually comes down to how customers size their new system. Quite often, they approach sizing an ECM solution the same way they would approach sizing a website.
What do I mean by that? Well, with a website you make sure there’s enough capacity by looking at how frequently the site is hit, effectively using the number of page impressions as your metric.
Say you have 10,000 users. By using this website-driven logic, you assume you’re going to have 10,000 users hitting the website at the same time, so that’s the metric you use for building the new content management system.
But it doesn’t work like that. While there may be some busy periods that you have to account for, what you actually need to size the system on is concurrent users. When you come to load test your ECM solution, applying that website-driven logic to the solution and to the way users access it generally brings it to its knees.
When sizing an ECM solution you need to consider carefully how your user population accesses the content and how frequently users need to be present in the system. Quite often the key consideration for scaling ECM is not how frequently users access pages, but how many users are logged in at any one time. Each logged-in user has a memory footprint, and that puts load on the infrastructure’s memory even when the user is doing nothing: frequently a user logs into the solution, performs a search to retrieve content, then opens it and reads it, so their ECM session sits idle. When scaling an ECM solution, you have to make sure you have the capacity for those users.
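As a rough illustration of the memory point, the sketch below multiplies the number of logged-in sessions by a per-session footprint. The function name and every figure in it are hypothetical placeholders, not measurements from any ECM product; the only idea taken from the text is that idle sessions count towards memory just as active ones do.

```python
def session_memory_mb(active_sessions: int, idle_sessions: int,
                      mb_per_session: float) -> float:
    """Memory held by all logged-in sessions, idle ones included."""
    return (active_sessions + idle_sessions) * mb_per_session


# Placeholder figures for illustration only (assumed, not measured).
print(session_memory_mb(active_sessions=100, idle_sessions=900,
                        mb_per_session=2.0))  # 2000.0
```

The per-session footprint would have to come from your own vendor guidance or load testing.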
The 10-15% rule of thumb for scaling
Think of it another way. Imagine you had 9,990 people in the company who only ever read one document, and ten people who wanted to read 10,000 documents each time they logged in. I exaggerate to make the point, but your system needs to be able to deal with this kind of asymmetric demand. You need to distinguish between heavy and light users: a light user might access one document and a heavy user 50. The user who accesses one document hits the ECM server only twice, causing no activity to speak of, whereas the heavy user hitting 50 documents generates a lot more. The gap between a user retrieving a document and searching for the next one is often called the think time.
Having empirical data to back up your user concurrency figure is extremely valuable, but many organisations do not know what their user concurrency rate is likely to be. As a rule of thumb, they can take the concurrent user base to be 10-15% of the total user population. This group can be further split into heavy and light users, allowing some realistic calculations to be made on the sizing requirements for the ECM solution.
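The rule of thumb above is simple arithmetic, sketched here for the 10,000-user example. The 10-15% band comes from the text; the function names and the 20% heavy-user share are assumptions made purely for illustration, and your own split should come from looking at the roles involved.

```python
def concurrent_user_range(total_users: int,
                          low: float = 0.10,
                          high: float = 0.15) -> tuple[int, int]:
    """Low and high estimates of concurrently logged-in users (10-15% rule)."""
    return round(total_users * low), round(total_users * high)


def split_by_profile(concurrent_users: int, heavy_share: float) -> tuple[int, int]:
    """Split concurrent users into (heavy, light); heavy_share is an assumption."""
    heavy = round(concurrent_users * heavy_share)
    return heavy, concurrent_users - heavy


low, high = concurrent_user_range(10_000)
print(low, high)                                 # 1000 1500
print(split_by_profile(high, heavy_share=0.20))  # (300, 1200)
```

Sizing against the upper end of the band is the more cautious choice when no empirical concurrency data is available.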
One example should suffice. In an insurer, underwriters look at documents on a case-by-case basis, i.e. if a claim comes in they need to go and refer to the document once. By this logic, they are light users. But that insurer’s back-office finance and accounts team would be classed as heavy users, because they are accessing the system all the time. It’s therefore important to look carefully at the roles performed by the teams that will be using the system. There is no scientific formula for it, but a rule of thumb we find fairly consistent is that 10 to 15 per cent of users want to access an ECM system concurrently, so if you look at what that top 10 to 15 per cent want to do in terms of access you should be safe.
In fact, scoping out the likely balance between light and heavy users is very sensible from your point of view. Light users are barely going to use the system and will have a small footprint; if you have a lot of light users who typically only want to access one document per hour, your server requirement gets a lot smaller. That will in turn reduce the cost of ownership of the application, as it cuts down a lot on the infrastructure overhead. Equally, this process may help identify burst requirements, such as end-of-quarter scenarios.
When it comes to sizing an ECM solution, the user usage profile is often the most misunderstood aspect of the architecture. There are many other factors to consider, for example existing content volumes, new content created and/or bulk ingestion of migrated content. Ultimately, many organisations initially treat ECM like a website, which you cannot do, or you’ll end up spending much more on infrastructure than real-world situations actually require.
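To see how the light/heavy balance feeds into server load, here is a minimal sketch that turns a user mix into an hourly hit count. It assumes two server hits per document (a search plus a retrieval, as in the light-user example earlier), one document per hour for a light user and 50 per hour for a heavy user. The per-hour framing for heavy users, the function name and the user counts are illustrative assumptions, not figures from a real deployment.

```python
HITS_PER_DOCUMENT = 2  # assumed: one search plus one retrieval per document


def hits_per_hour(light_users: int, heavy_users: int,
                  light_docs_per_hour: int = 1,
                  heavy_docs_per_hour: int = 50) -> int:
    """Estimated ECM server hits per hour for a given user mix."""
    documents = (light_users * light_docs_per_hour
                 + heavy_users * heavy_docs_per_hour)
    return HITS_PER_DOCUMENT * documents


# Illustrative mix: 1,200 light and 300 heavy concurrent users.
total = hits_per_hour(light_users=1_200, heavy_users=300)
print(total)         # 32400
print(total / 3600)  # 9.0 hits per second on average
```

In this example the 300 heavy users account for 30,000 of the 32,400 hourly hits, which is why the balance matters so much. An average like this also hides bursts, so end-of-quarter peaks would need to be modelled separately.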
James is Joint CEO for SynApps Solutions UK