Feedback

How to enhance your data quality and make friends

03 December 2019      Daniel Kidd, Deputy Academic Registrar


THE PROLOGUE

I’m sat in an airport reflecting, and one of the many things I dislike about myself – ranking just below my dreadful hair – is how awkward I am when I meet new people. I guess it’s a defence mechanism (or maybe it is something worse), but I tend to feign indifference to the situation and try to act overly casual. One way in which this manifests itself is that I struggle to say my own name when introducing myself (goodness that’s even more ridiculous in writing) – it tends to come out garbled and I become almost apologetic that the person I’m meeting has to process the information. I’ve just recently experienced this on a trip to visit my friend in Dubai (I mention this here purely to boast – yes I do have a friend), and waiting on the marina before a boat trip I started to meet his other friends for the first time:


PERSON A: “Hi, I’m Mike”

ME: “Hi, it’s Dan”

PERSON A: “Hi Stan”


A shrug of my shoulders followed, and I just assumed I was suffering from heat exhaustion.


PERSON B: “Hi, my name is Jonathan”

ME: “Hey, it’s Dan”

PERSON B: “Good to meet you Stan”

ME: “No, it’s Dan”

PERSON B: “Ok Stan”

ME: “It’s Dan”

PERSON B: “Cool, see you on the boat Stan”


PERSON C: “Hello, I’m Rachel”

ME: “I’m Stan, good to meet you”


…and so it is, the mistake is normalised, and the error becomes truth.


In the sphere of data quality the same pattern often follows – a mistake, a challenge, then reluctant acceptance.


THE MISTAKE


The mistake should always be expected: imperfect human beings trying to adhere to principles of perfect data. The mistake can and will happen to anyone – I’ve long tired of telling stories of when I incremented the NUMHUS for every single student transfer over the course of a whole year, or closed my eyes as I hit F6 on a SITS table without truly knowing whether what I had done would break everything.

The mistake is coming, and you should plan for it. I should have foreseen it, it’s happened countless times before (for 3 years a colleague from the then TDA called me Darren), and I should have been prepared. A strategy, I needed a strategy of how best to reduce the likelihood of error. Damn it, why didn’t I print off a name badge? People are less likely to make the mistake if they can see it written down, if they have something to follow. It’s dull when auditors talk about procedural documentation, but they do so for a reason, and that’s because it supports repeatable processes with consistent known outcomes. It’ll take time and resource, but it’s an investment you must make to support good data management and provide resilience for the future… you know, for when that stressed-looking member of staff who is twitching and muttering about IRIS and Minerva finally follows their dream to be a social scientist. Once you’ve written up the procedural documentation, maybe get on to writing up a data quality strategy too.

Or just say “Hey, it’s Daniel”. It literally would have been that simple. It was the abbreviation of ‘Dan’ that caused the mistake. Abbreviating, and language generally, presents challenges around data. Language quickly becomes idiosyncratic, institutionalised, and lazy. Establish a data dictionary, maintain it, and use it – language around data should be deliberate and precise. Of course, if I had more friends to lose, I could have sat them all down on the marina, in 35-degree heat, and run a 3-hour training session on my name. It would have got the job done and reduced the likelihood of the error recurring, but to be successful I would have needed to make clear the value in the training (me having a more enjoyable time) in order for them to take time out of their busy boat trip.
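By way of illustration only (the structure and field names below are my own invention, not a standard or anything we run at Wolverhampton), a data dictionary entry need not be elaborate. Even a lightweight, structured record that everyone consults before abbreviating or reinterpreting a term does the job:

# Illustrative sketch of a minimal data dictionary entry. Field names are
# invented for the example; the point is that the canonical term and its
# precise meaning are written down somewhere everyone can check.
from dataclasses import dataclass, field

@dataclass
class DictionaryEntry:
    term: str                  # the canonical name, used everywhere, unabbreviated
    definition: str            # the precise, agreed meaning
    data_lead: str             # who is accountable for the definition
    allowed_values: list[str] = field(default_factory=list)  # valid codes, where applicable

preferred_forename = DictionaryEntry(
    term="Preferred forename",
    definition="The forename a person asks to be addressed by, e.g. 'Daniel' rather than 'Dan'.",
    data_lead="Student Records",
)

The value is not in the tooling; a spreadsheet would do just as well, provided it is maintained and actually consulted.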


THE CHALLENGE


The challenge is a challenge. For staff to feel confident enough to challenge data error, and to be resolute in re-challenging, there needs to be a culture of trust and continuous improvement, supported by symbiotic relationships. Making mistakes is okay; letting them go unchallenged is not.

The concept of ownership in data is a good thing (it is integral to good data governance), however it can also prove to be a barrier as we feel less empowered to challenge or question how data is being managed and processed within its ownership domain. The very term ‘ownership’ is problematic for me: it suggests possession, autocracy, and control. I understand why we use ‘owners’, however I favour pushing the view that the ownership of data rests with the organisation as a whole, and thereafter we trade in more inclusive and less divisive terms such as ‘data leads’ or ‘data custodians’. Maybe I’m being overly cautious here, however we need to ensure that our language does not reflect a deeper cultural problem or indeed that language isn’t driving an unwanted culture. My experience tells me that too often the challenging of data error is an informal event, a desk-side conversation, with little or no follow-up. As such where the response is “it’s right”, “we’ve always done it like that”, or “sorry, who are you?”, the matter tends to be closed, or (perhaps worse) the issue is just fixed there and then without a wider discussion with stakeholders about how that change might impact onward processing and use. There is a role for self-challenge too; how often do we stop and ask ourselves if we have got this right, or seek advice from others within or outside the organisation?

There are mechanisms that can be deployed here to better support the challenge. I spoke recently at the HESPA Data Governance conference about the work we are doing at the University of Wolverhampton to implement an enterprise-wide non-conformance process. In simple terms, the process aims to record significant or widespread data errors that are likely to recur, assess the cause, and suggest mitigations to prevent a repeat. The process involves a multi-departmental investigation team sitting down with the relevant data lead to understand the cause (system, process or people), and then agree an immediate remedial action and a longer-term mitigation. It provides an opportunity to learn from mistakes, and improve organisational memory and data management capabilities moving forward. Interestingly, given my earlier point about language and culture, whereas I had named the process ‘non-conformance’, the information and data governance committee (who oversee and approve the recommendations) offered an excellent suggestion of being more nuanced and language-sensitive by calling it something like a ‘data improvement process’. A rough sketch of what a single record might capture follows the example below.

Applied to my own experience, I guess in practice it would work like this:


Cause: a nervous disposition (people-based), a lack of enunciation (system-based), and people not listening properly (process-based)

Remedial action: A WhatsApp message to the boat party group ‘It has become apparent that a number of you sunbathing at the front of the boat believe my name to be Stan. It is not. My name is Dan, so if any of you feel compelled to talk to me later I will be hunched over the toilet below deck’.  

Mitigation: Future training, name badges to be worn, and change my name to ‘Ray’
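Joking aside, the record behind such a process does not need to be complicated. Here is a rough sketch of what a single entry might capture; the field names and categories are assumptions for illustration, not the actual Wolverhampton process:

# Illustrative sketch only: one possible shape for a data improvement /
# non-conformance record. Field names and categories are invented here.
from dataclasses import dataclass, field
from enum import Enum

class Cause(Enum):
    SYSTEM = "system"
    PROCESS = "process"
    PEOPLE = "people"

@dataclass
class DataImprovementRecord:
    description: str               # what went wrong, in plain language
    data_lead: str                 # the lead/custodian for the affected domain
    causes: list[Cause]            # one error can have several causes
    remedial_action: str           # the immediate fix
    mitigation: str                # the longer-term change to prevent recurrence
    stakeholders_notified: list[str] = field(default_factory=list)

record = DataImprovementRecord(
    description="New acquaintances recorded my name as 'Stan' rather than 'Dan'.",
    data_lead="Me",
    causes=[Cause.PEOPLE, Cause.PROCESS],
    remedial_action="Correct the name with everyone on the boat.",
    mitigation="Introduce myself as 'Daniel'; wear a name badge.",
    stakeholders_notified=["Mike", "Jonathan", "Rachel"],
)

What matters is that each record is reviewed by the investigation team and the data lead together, so the fix and the mitigation are agreed rather than applied quietly at a desk.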


THE RELUCTANT ACCEPTANCE


Why is having a process for challenging and reviewing data error so important? Once the data error is accepted and normalised it becomes very difficult to unpick and correct in the future – the process or reasoning behind the error becomes subsumed and embedded into standard practice. Why do we accept the error? Well, it’s just easier, isn’t it… otherwise we run the risk of becoming the owner of the issue, responsible for identifying it, fixing it, and ensuring it stays fixed – and frankly, who has the time for that. Resisting acceptance needs the open data culture I described previously, recognised expertise in the organisation, and structures such as ‘non-conformance’ to avoid premature closure. In my time working in training and consultancy at HESA, I would on occasion encounter resistance to the idea that there was an error in the way data was being reported. At times I would stand with the coding manual projected behind me, arguing back and forth about an incorrect use of FUNDCOMP=3. Data is personal, and persuading people that they might have been doing it wrong all of these years is not easy. Stop and reflect; do not accept that your data is what your data is. Honestly, I’ve seen things you people wouldn’t believe. Data and systems on fire over the shoulder of analysts. I watched C sharp written in the dark when it’s all too late. All that data will be lost in time, like tears in rain. Time to fly.


POST EDIT


At the University of Wolverhampton we have implemented, or are in the process of implementing, everything I’ve suggested in this blog. We see these measures as integral to realising our ambition of data being viewed and managed as an asset. If you need support with enhancing your data quality, help is out there, and of course HESPA is always a good place to start.


Stray Kidd


