Friday, October 23, 2009

Disturbances in the Force

So, if we know how to make things better in terms of data quality and we're motivated to do so, what's stopping us? A word of caution: what you're about to read may be harmful to your health.

Maybe you're old enough to have lived through the Watergate fiasco and can remember the facts coming to light one by one in the press until they eventually began to make a complete picture. Maybe you remember the Hollywood version, All the President's Men, in which the whole picture was produced in a little over two hours rather than months. Or maybe the whole thing is in the same category as the Crimean War for you and is nothing more than a question on a pop quiz in one of your least favorite subjects.

I'd like to suggest for your consideration that if we want to track down why we are having such a difficult time accomplishing something that we all claim to want, we need look no further than the paragraph above to get all the answers we need.

First, let's imagine that data quality is like truth in government. It's a good thing and we would like to assume that we have it. If, in fact, we do not have truth in government (or data quality), who benefits? The answer is that those who believe they can or will be blamed for the status quo have an interest in covering up the problems and subverting efforts to get at the facts that would provide the complete picture. This is especially true if they are responsible for the problems. Even if a person's only identifiable responsibility is being the supervisor or manager of the function(s) that own the troubled processes, that person may still elect to resist and subvert in order to avoid becoming responsible for the fix.

If we want to avoid this situation, we should absolutely avoid any questions that seem directed at why or who or even how. We should avoid, to the extent possible, any investigation into the past. Try to keep all discussions focused on process-based causes that might be producing the effects you are seeing. Do not zoom in on isolated instances; look for trends instead. Remember, your goal is not prosecution but consistent quality.

In the words of Bob Woodward's source, Deep Throat, "Follow the money." The programmer-champion will struggle against this repeatedly. There is a perception that implementing integrity checking at the point of input represents added cost. Like any other complex process, system development should seek to minimize total cost of ownership rather than any single cost line item. If it takes an extra day of programmer time to ensure 99.99% consistency of integrity in the database, and we thereby avoid dedicating multiple full-time staff to data cleanup, this is a net cost reduction.
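To make "integrity checking at the point of input" concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `CustomerRecord` type, its three fields, and the three rules are hypothetical and do not come from any particular system. The point is only that a record is checked before it is stored, so the problem is caught by the program rather than by cleanup staff later.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CustomerRecord:
    # Hypothetical fields, chosen only for illustration.
    customer_id: str
    email: str
    signup_date: date


def validate(record: CustomerRecord, today: date) -> list[str]:
    """Return a list of integrity problems; an empty list means the record may be stored."""
    problems = []
    # Rule 1 (assumed): every record needs a non-blank identifier.
    if not record.customer_id.strip():
        problems.append("customer_id is blank")
    # Rule 2 (assumed): a crude shape check, exactly one "@" with text on both sides.
    email = record.email
    if email.count("@") != 1 or email.startswith("@") or email.endswith("@"):
        problems.append("email is not in name@domain form")
    # Rule 3 (assumed): nobody signs up in the future.
    if record.signup_date > today:
        problems.append("signup_date is in the future")
    return problems
```

An input screen or load job would call `validate` and refuse to save any record for which the returned list is not empty, showing the messages to whoever is entering the data. The rules themselves are the cheap part; the real work is agreeing on what they should be.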

Our system design and project management processes may not be mature enough to assign dollar values to this, but it should be easy to determine how much money we are spending on fixing poor quality data every month (or year) and then to amend the design and development processes so that a fraction of that amount is devoted to prevention.
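The comparison itself is simple arithmetic, sketched below in Python. The function name `total_cost` and every dollar figure are made up for illustration; they are not measurements from any real project. Substitute what you actually spend on cleanup and what the extra development time would actually cost.

```python
def total_cost(one_time_build_cost: float, monthly_cleanup_cost: float, months: int) -> float:
    """Total cost of ownership over a period: up-front work plus recurring cleanup."""
    return one_time_build_cost + monthly_cleanup_cost * months


# Hypothetical figures only, not real data.
MONTHS = 12
without_checks = total_cost(one_time_build_cost=0, monthly_cleanup_cost=10_000, months=MONTHS)
with_checks = total_cost(one_time_build_cost=800, monthly_cleanup_cost=1_000, months=MONTHS)

print(f"Without input checks: ${without_checks:,.0f}")
print(f"With input checks:    ${with_checks:,.0f}")
print(f"Difference:           ${without_checks - with_checks:,.0f}")
```

With these invented numbers the up-front cost is a small fraction of one month of cleanup. Your own figures will differ, which is exactly why the exercise is worth doing with real ones.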

The final perspective to be extracted from our example is that a short attention span provides little hope of even recognizing that a problem exists, let alone understanding it well enough to develop a mitigation strategy. Data quality (and truth in government) requires that everyone be involved. People are capable of recognizing self interest within the corporate interest, and enough people will be motivated to act that the ball will be kept moving, but the media of the early 1970s is not the media of 2009. In the 70s there was an interest in the truth that perhaps doesn't exist today. In your corporate environment, you may find it easier to maintain a constant pressure of communication directed at a single theme. The widespread motivation will not be produced by a single appeal surrounded by banners and fanfare and free cake. A communications campaign must be designed for the long haul with continuous refreshing of the message.

Today, not even one percent of your workforce is ready to grapple with the issue of data quality. You are going to have to break it down in multiple variations and start with the concept of data itself. What is data? You'll be surprised at what you uncover when you go out to talk to people about their data. Stay tuned for some samples.

1 comment:

  1. I like where this is going. In talking to a coworker about DQ yesterday, the topic of discussion was: what's important to the customer (and who are your customers for each field)? They may not care about the quality of fields they don't use (though those fields may be important to a different customer).

    Identifying costs and time associated with quality improvements is important to getting management to devote the resources to making the improvements--so homework is important!
