What is unstructured data?

“Unstructured data” is a popular focus within information security and governance circles – but what is it? What are the security and privacy concerns associated with this type of information? In this post, we’ll define unstructured data, explain where it comes from, and outline the challenges of keeping sensitive and regulated unstructured data secure and compliant with privacy requirements.

In general terms, unstructured data is any data that is file-based rather than organized into rows and columns in a database. Common examples include Microsoft Office documents, images, videos, PDFs, emails and text files. The widely used term “Big Data” often describes massive repositories of such data that are typically analyzed using specialized software. However, unstructured data can also be analyzed by humans as part of manual business processes.
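To make the definition concrete, here is a minimal Python sketch that takes inventory of a folder by file type. The extension list, the category labels, and the function names `classify` and `inventory` are illustrative assumptions, not part of any particular product, and a file extension is only a rough hint about what a file really contains.

```python
from pathlib import Path

# Hypothetical mapping of file extensions to the categories named above.
# Extend or adjust it for your own environment.
UNSTRUCTURED_EXTENSIONS = {
    ".docx": "office document",
    ".xlsx": "office document",
    ".pptx": "office document",
    ".pdf": "pdf",
    ".jpg": "image",
    ".png": "image",
    ".mp4": "video",
    ".eml": "email",
    ".msg": "email",
    ".txt": "text file",
}


def classify(filename: str) -> str:
    """Return the assumed category for a file name, or 'unknown'."""
    suffix = Path(filename).suffix.lower()
    return UNSTRUCTURED_EXTENSIONS.get(suffix, "unknown")


def inventory(root: str) -> dict[str, int]:
    """Count recognized unstructured files under a directory, by category."""
    counts: dict[str, int] = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            kind = classify(path.name)
            if kind != "unknown":
                counts[kind] = counts.get(kind, 0) + 1
    return counts
```

Calling `inventory` on a shared drive or a user's home folder gives a rough first picture of how much file-based data is sitting outside any database.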

The amount of unstructured data stored by businesses is massive and growing. Consider all the documents and spreadsheets that a typical employee reads or creates during the course of a normal workday. For example, think of how many documents you submit to a lender as part of the home loan underwriting process – and how many documents that financial institution shares with you. Now consider how many home loans that business originates each week, month and year. That’s a lot of unstructured data! That’s also a lot of Personally Identifiable Information (PII) and other private data.

Structured data is typically stored in a database of tables, neatly arranged in rows and columns. Business applications that rely on databases are relatively easy to find and secure. Unstructured data, by contrast, is rarely well organized and tends to be unconsolidated, making it difficult to locate. Further complicating efforts to secure it, access to unstructured data is almost always at the discretion of the employee who created it. Needless to say, most businesses are not doing enough to locate sensitive unstructured data and keep it confidential.

Technical solutions exist for discovering unstructured data, encrypting it, managing access to it, and controlling how it is shared. Historically, however, no single solution could do all of these things. Lacking a unified platform for securing this kind of information, many businesses have simply delayed efforts to address the problem. Fully integrated solutions for discovering, securing, and controlling unstructured data now exist, and with data privacy regulation on the rise (e.g., the EU’s GDPR and HIPAA in the US), businesses need to act quickly to close this gap in their security and compliance strategy.
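As a rough illustration of the “discovery” step, the Python sketch below scans a piece of text for two patterns that often indicate PII: strings shaped like US Social Security numbers and email addresses. The pattern set and the function name `find_pii` are assumptions made for this example. Commercial discovery tools are likely to use far more robust detection, and simple patterns like these will produce both false positives and misses.

```python
import re

# Illustrative patterns only; real PII detection needs validation,
# context checks, and many more data types.
PII_PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def find_pii(text: str) -> dict[str, list[str]]:
    """Return every match for each pattern that appears in the text."""
    hits: dict[str, list[str]] = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[label] = matches
    return hits
```

Running `find_pii` over the extracted text of each document in a file share would flag candidates for review, which could then be encrypted or have their access restricted.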

October 30, 2017