While your structured content sources such as databases or ERPs might have already separated out such information for easier identification, unstructured data sources such as documents and portals pose a challenge. You might be tempted to go to your search centre and ask it to return all the documents that have PII (Personally Identifiable Information) in them.
After all, you have spent tens of thousands of pounds on setting up your ECM. However, as you can imagine, this is not possible unless your documents (content) have “declared” that they contain PII, and so made themselves visible to search queries looking for PII documents.
How exactly do you achieve this? The key prerequisite to effective information findability through search or eDiscovery services is classifying and tagging the documents accurately using metadata. Metadata is data about data, and content can use it to “describe” itself. You would define a taxonomy to apply across your enterprise, and then implement a process for creating or editing content (for example, uploading a document) so that every user associates the document with the right tags from the taxonomy.
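To make the idea concrete, here is a minimal sketch in Python of how a taxonomy, tagging, and tag-based search fit together. It is an illustration only, not the API of any particular ECM product: the names `TAXONOMY`, `Document`, `tag_document`, and `search_by_tag`, as well as the sample tags and file names, are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical taxonomy: the controlled vocabulary of tags the enterprise allows.
TAXONOMY = {"PII", "Financial", "Contract", "Public"}


@dataclass
class Document:
    """A document plus the metadata (tags) it uses to describe itself."""
    name: str
    tags: set = field(default_factory=set)


def tag_document(doc, *tags):
    """Attach tags to a document, rejecting any tag outside the taxonomy."""
    unknown = set(tags) - TAXONOMY
    if unknown:
        raise ValueError(f"Tags not in taxonomy: {sorted(unknown)}")
    doc.tags.update(tags)


def search_by_tag(docs, tag):
    """Return the names of documents that have declared the given tag."""
    return [doc.name for doc in docs if tag in doc.tags]


# Illustrative library: the third document contains PII but was never tagged.
library = [
    Document("payroll.xlsx"),
    Document("press_release.docx"),
    Document("untagged_cv.pdf"),
]
tag_document(library[0], "PII", "Financial")
tag_document(library[1], "Public")

pii_docs = search_by_tag(library, "PII")
print(pii_docs)
```

Note that `untagged_cv.pdf` never shows up in the PII search, whatever it actually contains: search can only find what the metadata has declared.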
Very well, then, what is the challenge, you may ask! The challenge is the very process of users associating metadata with the documents. This process depends on the ability of humans to read, comprehend, and describe the documents accurately. A Hoovers report estimates that the average cost of manually tagging one item runs from $4 to $7.
And this does not even take into account the fact that this human-heavy process is error-prone and often results in mis-tagged content (no two humans comprehend content in quite the same way). IDC estimates that it costs you $180 to recreate a document that is not tagged correctly and can’t be found.
If you are an enterprise with 500GB of documents, manual tagging alone translates to a cost of between $3.2 million and $5.6 million (at $4 to $7 per item, that corresponds to roughly 800,000 documents).
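The arithmetic behind that range can be checked with a short sketch. The document count is an assumption: the text does not state it, but 800,000 is the figure implied by dividing $3.2 million by $4 and $5.6 million by $7. The function name `tagging_cost_range` is likewise made up for illustration.

```python
# Assumption: 500GB is taken to hold about 800,000 documents. This count is
# inferred from the quoted totals, not stated in the text.
DOCUMENT_COUNT = 800_000

# Per-item manual tagging cost in dollars, as quoted from the Hoovers estimate.
COST_PER_ITEM_LOW = 4
COST_PER_ITEM_HIGH = 7


def tagging_cost_range(document_count, cost_low, cost_high):
    """Return the (low, high) total cost of manually tagging every document."""
    return document_count * cost_low, document_count * cost_high


low_total, high_total = tagging_cost_range(
    DOCUMENT_COUNT, COST_PER_ITEM_LOW, COST_PER_ITEM_HIGH
)
print(f"${low_total:,} to ${high_total:,}")  # prints "$3,200,000 to $5,600,000"
```

Changing `DOCUMENT_COUNT` to match your own repository gives a rough estimate for your enterprise under the same per-item costs.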