How good is your data? Time to look under the hood

Case study: A data company

A common problem our clients express is around the lack of visibility or documentation currently in place for their data...

1. I don't know where my data comes from or what it contains

2. I'm not sure how complete or accurate the data is

3. What are the processes or governance currently in place on my data?

Does this sound familiar?

It's the inside that counts

From our experience, we've noticed a few key reasons why you might be asking some of these questions:

  1. Complex and undocumented data mappings or processes, resulting from layers of “band aid solutions” and legacy systems. It is common for businesses to prioritise quick tactical wins over more effective long-term solutions, creating years of built-up technical debt.

  2. Key personnel departing from the business, taking with them years of valuable knowledge

  3. Breakdown of communication and collaboration between key teams within an organisation; this is also known as the "silo" effect

If any of the above is present, it is time to look under the hood to see if your data is as good as you think it is. After all, rubbish in, rubbish out, right?

We’ve worked with a number of clients over the years to help them document and understand the quality, reliability and compliance of their core data assets. One particular client is a major data provider within Australia. Although we were originally hired to assist with an analytics use case, we quickly became involved in the assessment and improvement of their data foundations.

Sources of data

Do you know where your data ultimately comes from? What happens if one of your source systems shuts down for a day?

We helped our client develop a living document containing information around their data sources, including:

  • Original source systems

  • Vendor names

  • Description of data (data dictionary)

  • Frequency data is provided
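A register like the one above can start very simply. As a minimal sketch (the field names and the example entry are our own illustration, not the client's actual register):

```python
from dataclasses import dataclass

@dataclass
class DataSource:
    """One entry in a living register of data sources."""
    name: str         # original source system
    vendor: str       # vendor name
    description: str  # short data-dictionary summary
    frequency: str    # how often the data is provided

# Hypothetical example entry
register = [
    DataSource(
        name="crm_export",
        vendor="ExampleVendor",
        description="Customer contact records: id, name, postcode",
        frequency="daily",
    ),
]

def sources_by_frequency(freq):
    """List the names of sources delivered at a given cadence."""
    return [s.name for s in register if s.frequency == freq]
```

Even a register this small answers the "where does my data come from?" question, and it grows naturally as sources are added.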

In addition to this, we ensured that there was continual reporting and monitoring of the source data. This matters because any compromise to the source data needs to be identified early, to prevent or minimise the impact on downstream processes. As an example, do you have a process in place to flag if a vendor sends you the same file two months in a row?
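One simple way to catch a repeated delivery is to fingerprint each file's contents with a checksum. This is a sketch of the idea only (the function names and the checksum approach are our own illustration, not the client's actual process):

```python
import hashlib

def file_fingerprint(data):
    """Hash a delivery's contents so identical files can be spotted."""
    return hashlib.sha256(data).hexdigest()

def is_repeat_delivery(new_file, previous_fingerprints):
    """Flag a file whose contents match an earlier delivery byte for byte."""
    return file_fingerprint(new_file) in previous_fingerprints

# Hypothetical example: the vendor re-sends last month's file unchanged
seen = {file_fingerprint(b"id,value\n1,100\n")}
print(is_repeat_delivery(b"id,value\n1,100\n", seen))
```

A checksum only detects byte-identical files; a vendor re-sending last month's data with a new timestamp column would need a content-level check instead.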

Processes of data manipulation and data quality

What are the different stages of change that are made to your data? Are you converting minutes to hours, or removing certain records - if so, why?

Data manipulation is generally required to improve the quality of the data (e.g. fixing dates or typos), enrich existing data (e.g. through data matching) or implement business rules. These rules need to be reviewed and updated from time to time to ensure that the quality and relevance of your data assets do not falter. Documentation of these processes and decisions plays a crucial role in maintaining data quality. Good documentation not only reduces the time taken to onboard new staff members, but also ensures that core IP and knowledge are retained within the business. Additionally, it captures the business decisions which have been made and the key people accountable for them.

As an example, our client had experienced a large turnover of staff in which many of the original developers and decision makers had left the company. This meant that process errors or bugs requiring prompt rectification were difficult to address, as there was very little documentation in place. Another key control for maintaining high-quality data is the ability to quickly identify and rectify abnormal behaviour or data errors. We helped our client develop an internal dashboard to track core metrics and monitor their data assets on a daily basis.
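The checks behind such a dashboard can be very plain: compare each day's batch against expected thresholds and raise an alert when something looks abnormal. A minimal sketch, with invented metric names and thresholds (the client's actual metrics were specific to their business):

```python
def check_daily_metrics(rows, expected_min_rows=1, max_null_rate=0.05):
    """Return a list of alerts for a daily batch of records (dicts).

    Thresholds here are placeholders; in practice they are tuned
    per data asset from historical behaviour.
    """
    alerts = []
    if len(rows) < expected_min_rows:
        alerts.append("row count below expected minimum")
    null_count = sum(1 for row in rows for value in row.values() if value is None)
    total_values = sum(len(row) for row in rows) or 1  # avoid division by zero
    if null_count / total_values > max_null_rate:
        alerts.append("null rate above threshold")
    return alerts
```

Row counts and null rates are a starting point; the same pattern extends to duplicate rates, value distributions, and delivery lateness.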

Data privacy and governance

Scrutiny on data security and privacy is more intense than ever; according to the Office of the Australian Information Commissioner, Australian companies reported over 800 data breaches in 2018. For our client, as a data business, this was of the utmost priority.

We helped our client work through existing and new data suppression requirements to ensure that they remained compliant with both legislative and contractual obligations. Again, this was highly reliant on good documentation and controls. As a data company, our client needed to ensure that strict governance was applied not only to internal access, but also to what data could be accessed by external users.
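In code, suppression often comes down to stripping restricted fields before data leaves the business. A minimal sketch of the idea (the field names and the internal/external split are our own illustration; real suppression rules come from the legislation and contracts themselves):

```python
# Hypothetical suppression list; real lists are driven by legislative
# and contractual requirements and must be kept under review.
SUPPRESSED_FIELDS = {"email", "phone"}

def apply_suppression(record, audience):
    """Strip suppressed fields before data is released to external users."""
    if audience == "internal":
        return dict(record)  # internal users are still subject to access controls
    return {key: value for key, value in record.items()
            if key not in SUPPRESSED_FIELDS}
```

Keeping the suppression list in one place, rather than scattered through extract scripts, is what makes it auditable when requirements change.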

Downstream dependencies

Communication is key - particularly across different teams. What might seem like a trivial change to the underlying data might have a tenfold impact on a revenue generating product.

In the case of our client, we identified the key products and teams that depended on their core data, and established a protocol of formal communications to these stakeholders covering any key changes or errors, time frames, and potential impacts. This enabled the stakeholders to assess the impact and have more proactive conversations with their own clients.

Never stop improving

Documenting and monitoring your data should never be a one-off activity. Whilst we helped our client put the foundations for ongoing monitoring in place, these will quickly become stale and irrelevant if not continuously maintained and reviewed. Good processes link directly to the trustworthiness and reputation of any company.


About Us

EdgeRed is a boutique data and analytics consultancy specialising in delivering high quality outcomes for our clients. Drop us a note and we'll be happy to have a chat regarding your data and analytics needs.

#data #analytics