By Danny Molhoek, General Manager at Lexmark UK & Ireland

According to Gartner, around 80 per cent of the data within a business is considered to be unstructured. But what does this mean, and what is the data that enterprises are struggling to classify?

There is still an enormous amount of enterprise information that sits in text documents and presentations, graphics, email, audio, video, web pages and in various office software. Classifying data as unstructured doesn't mean it lacks any structure — rather it means that it doesn't fit in a database or as part of the enterprise relational data model.

Information stored outside a database accounts for the lion's share of all enterprise data. As such, there is a glut of unstructured data and content that has left knowledge workers disconnected from the information they need to be productive. For example, this can take the form of loan applications or insurance claims, student testing or job application forms. Regardless of the source, this generates paper-based information that must be captured and brought into business systems, but frequently remains unstructured.


Not only is there a lot of unstructured data throughout organisations, but the creation of new data has never been more distributed, and the velocity of this information moving around networks and into storage is accelerating.

Managing unstructured data is extremely important if a business wants to improve efficiency and reduce storage and compliance costs. This task has been painfully difficult because of the time, resources and overhead required to manually enter, process and classify this immense volume of data. Not only is the process time-consuming, it is error-prone too.

The job isn’t going to get easier by itself. Despite our best efforts to corral the unstructured beast, this kind of data continues to grow. Gartner analysts predict unstructured data will grow a whopping 800 per cent over the next five years.

This presents a real challenge for organisations that want to automate and improve their ability to understand their business, anticipate what’s coming and act quickly on risk and opportunity. Enterprises that don't yet have the right software to deal with these forthcoming content management issues had better start planning to do something about them.

The 'Digitise Everything' approach

Even worse, much of our enterprise process knowledge exists only as unstructured data in the heads of workers, with no systematic approach to capture, management, communication, measurement and improvement. When the work activities themselves are unstructured, the day-to-day behaviour of workers lacks cohesiveness and efficiency.

For many companies, the knee-jerk reaction to this combination is to try to digitise everything in sight and store it for safekeeping. But this does nothing to address the issue of unstructured data. In fact, it can even make the situation worse: unless this information is properly catalogued and tagged, all the added content has to be manually trawled through every time someone wants to find something.

In today’s information-driven environment, the way you process critical data can make or break your business. Get it right and you’re golden. Make one wrong entry and you could lose time, money and customers. That’s why it’s critical to get the right information, to the right people, at the right time, in the right format.

Turning the tables on unstructured data

The vital element is introducing intelligence and automation into the capture process. Now, instead of just being added to the pile of unstructured data, documents are accurately sorted based on their content, critical information is lifted out based on its context, then validated and seamlessly passed into the core business applications and workflows.
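To make the sort, lift, validate and pass-on sequence described above more concrete, here is a deliberately simplified Python sketch. Everything in it is an assumption made for illustration: the document types, the keyword lists, the field labels such as "Policy No:" and the function names are all hypothetical, and a production capture system would rely on trained classifiers and OCR rather than keyword counts and regular expressions.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical keyword lists used to sort documents by content.
DOC_TYPE_KEYWORDS = {
    "loan_application": ["loan", "borrower"],
    "insurance_claim": ["claim", "policy"],
}

# Hypothetical patterns used to lift fields out based on their context
# (the label that precedes each value).
FIELD_PATTERNS = {
    "loan_application": {
        "applicant": re.compile(r"Applicant:\s*(.+)"),
        "amount": re.compile(r"Amount:\s*([\d,.]+)"),
    },
    "insurance_claim": {
        "policy_number": re.compile(r"Policy No:\s*([A-Z]{2}\d{6})"),
        "claimant": re.compile(r"Claimant:\s*(.+)"),
    },
}


@dataclass
class CaptureResult:
    """Structured record produced from one unstructured document."""
    doc_type: Optional[str]
    fields: dict = field(default_factory=dict)
    errors: list = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        # Only error-free records should reach downstream systems.
        return self.doc_type is not None and not self.errors


def classify(text: str) -> Optional[str]:
    """Pick the document type whose keywords occur most often."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(word) for word in words)
        for doc_type, words in DOC_TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def capture(text: str) -> CaptureResult:
    """Classify, extract and validate a single document."""
    doc_type = classify(text)
    if doc_type is None:
        return CaptureResult(None, errors=["unrecognised document type"])
    result = CaptureResult(doc_type)
    for name, pattern in FIELD_PATTERNS[doc_type].items():
        match = pattern.search(text)
        if match:
            result.fields[name] = match.group(1).strip()
        else:
            # A missing field fails validation instead of passing silently.
            result.errors.append(f"missing field: {name}")
    return result


sample = (
    "Insurance Claim Form\n"
    "Policy No: AB123456\n"
    "Claimant: Jane Smith\n"
    "Description: water damage to kitchen\n"
)
record = capture(sample)
print(record.doc_type, record.fields, record.is_valid)
```

In this sketch, a record that passes validation could be handed straight to a workflow or business application, while one with errors would be routed to a person for review rather than joining the unstructured pile.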

Intelligent capture with high levels of accuracy can suddenly turn data from unstructured to structured. As such, it becomes part of a data model that can be indexed and integrated.

Furthermore, this process can be done at the point of capture, even in distributed environments. That makes it possible to create an end-to-end solution for information capture, processing, distribution, storage and retrieval, one that addresses the issue of unstructured data in an accurate, automated and intelligent way.


When this approach is implemented properly and seamlessly, it becomes possible to transform costly, error-prone processes into streamlined, revenue-generating and value-added operations. By automating these processes, staff can spend more time focusing on what really matters: growing your business and improving customer satisfaction.

Similarly, by reducing the amount of unstructured information within the business, it becomes easier to derive added value from that content. Not only can processes and workflows be more automated, but it also becomes possible to find, view and analyse content across organisational silos.