Gartner defines dark data as the information assets that organizations collect, process and store during regular business activities, but generally fail to use for other purposes. Across enterprises, a bottomless pit of dark data is building up, and enterprises risk not only leaving that data underutilized but also violating compliance requirements.
Documents are a major source of such dark data. Over time, many documents are authored and stored. Apart from the few that are accessed or updated, enterprises are typically unaware of what the rest contain. An IDC report indicates that 85% of documents are never retrieved in searches, which pushes them further into the dark.
Gartner predicts that by 2020, fewer than 10% of organizations will find value in “dark data”. In other words, more than 90% of organizations will find it difficult to extract value from their dark data and to stay clear of the lurking compliance risks.
More than 90% of organizations will struggle to illuminate their dark data.
Illuminating dark documents is much easier today with the rise of Artificial Intelligence. If enterprises leverage AI to make their documents discoverable through search and other means, they can start understanding the content. AI bots can crawl through all your documents, understand them, and classify and tag them automatically, so that they become easily discoverable through search platforms. For example, your dark data (documents) can contain a huge amount of Personally Identifiable Information (PII). If your Information Manager could filter for all the documents that contain PII, complying with the GDPR would be a breeze. Today, that is typically not possible.
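To make the tag-then-filter idea concrete, here is a minimal Python sketch. It is an illustration only: two simple regular expressions stand in for the AI classifier, and the names PII_PATTERNS, tag_document, build_catalog and filter_pii, as well as the sample documents, are hypothetical. Real PII detection needs far broader coverage than this.

```python
import re

# Hypothetical, deliberately simple patterns standing in for an AI classifier.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def tag_document(text):
    """Return the sorted list of PII tags whose pattern matches the text."""
    return sorted(
        name for name, pattern in PII_PATTERNS.items() if pattern.search(text)
    )


def build_catalog(documents):
    """Map each document name to its tags (empty list when nothing matched)."""
    return {name: tag_document(text) for name, text in documents.items()}


def filter_pii(catalog):
    """Return the names of documents that carry at least one PII tag."""
    return sorted(name for name, tags in catalog.items() if tags)


# Made-up sample documents, used only to exercise the functions above.
documents = {
    "minutes.txt": "Quarterly review held on site. Action items were assigned.",
    "contacts.txt": "Reach Jane at jane.doe@example.com or +1 (555) 010-9999.",
}

catalog = build_catalog(documents)
print(filter_pii(catalog))
```

Once every document carries tags, the filter an Information Manager needs is a simple lookup over the catalog rather than a manual review of each file.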
Gartner’s top recommendation to address this challenge is to purchase a file analysis product to get a picture of the data demographics, emphasizing redundant, outdated and trivial data along with sensitive and personally identifiable information.
Acuvate Lexico is one such tool: it auto-generates metadata through lexical analysis and associates it with the documents. With Lexico, you can classify a document as “PII-containing” and archive or catalog all such documents by creating governance policies and controls. You can then simply refer to this archive or catalog whenever you need to retrieve PII documents.
By analyzing, classifying and tagging all your documents accurately and automatically, you illuminate all your dark documents.