Finding the Embedded Meaning in Unstructured Data

If you are implementing learning technology, you probably need data, and a lot of it. Maybe you need course data for an LMS, or you need content for a content management system or a learning portal.

The data you need is in the heads of learning folk. But when you get that data it is usually unstructured: merged cells, inconsistent hierarchies, superfluous columns. You roll your eyes once again and IM your tech buddies about what idiots your stakeholders are. But this doesn’t help you at all. It will take many meetings to tease out the data you need. By then you will be frustrated and at risk of missing your deadlines, and your stakeholders will have lost faith in your process.

You need to let go of your smugness about understanding the need for structured data. You don’t have time for it.

I was staring at one of these unstructured data sets one day and I had an epiphany. Once I let go of my frustration and tried to see things from my stakeholders’ point of view, I realized something important. Learning folk aren’t taught data structure, but they are rewarded for building meaning into the visual format of documents. How can you blame them for expecting that the meaning implied in formatting is transferable to systems?

All of those shifts in metadata and hierarchy tell a story. A computer program can’t read that story, but as a human you can. It is your job to find that story and interpret it in a way that can be replicated in the system.
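Once you have read the story yourself, you can often write your interpretation down as a rule. Here is a minimal sketch in Python of one common case, assuming the spreadsheet used merged cells to imply hierarchy, so a blank cell means "same as the row above." The column names (Program, Course, Lesson), the sample rows, and the function name fill_merged_cells are all hypothetical, made up for illustration. Your stakeholders' sheet will tell its own story.

```python
def fill_merged_cells(rows, hierarchy_columns):
    """Copy each hierarchy value down into the blank cells beneath it.

    rows: list of dicts, one per spreadsheet row (hypothetical shape).
    hierarchy_columns: column names ordered from parent to child.
    """
    filled = []
    last_seen = {}
    for row in rows:
        new_row = dict(row)
        for i, column in enumerate(hierarchy_columns):
            value = (row.get(column) or "").strip()
            if value:
                last_seen[column] = value
                # A new parent value means the old children no longer apply.
                for child in hierarchy_columns[i + 1:]:
                    last_seen.pop(child, None)
            # Blank cells inherit the value above; a real gap stays empty.
            new_row[column] = last_seen.get(column, "")
        filled.append(new_row)
    return filled


# Hypothetical rows as they might come out of a sheet with merged cells.
rows = [
    {"Program": "Onboarding", "Course": "Welcome", "Lesson": "Meet the team"},
    {"Program": "", "Course": "", "Lesson": "Tools overview"},
    {"Program": "", "Course": "Compliance", "Lesson": "Code of conduct"},
    {"Program": "Leadership", "Course": "", "Lesson": "Kickoff"},
]

for row in fill_merged_cells(rows, ["Program", "Course", "Lesson"]):
    print(row)
```

Notice that the last row keeps an empty Course. The rule refuses to carry "Compliance" over into a new program, so what is left is a short, specific question for your stakeholders instead of another meeting.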

It may take longer to extract but it sure beats countless frustrating meetings.


