Tuesday, October 5, 2010

Whose Job Is It, Anyway?

There are several reasons why you are reading this blog post. Leaving out all the self-aggrandizing ones, let's focus on those who actually have the title question in their minds. You may have been drawn here by an interest in BI (business intelligence), the latest name for "reporting." It is likely that you have had some bad experiences involving reports or dashboards, or mash-ups or some other information display/access effort.

You spent what seemed like way too much time getting to an understanding of
  • how this was to be used
  • the kind(s) of content that would be useful/acceptable
  • how the information should be arranged/displayed

If you are part of a really accomplished organization, you may also have had seemingly endless discussions concerning

  • How "bad" data would be recognized
  • How "bad" data would be handled
  • Remediation or cleansing processes to reduce the incidence of "bad" data

And, finally, if your organization is in the six-sigma population

  • What is "bad" or poor quality data?
  • Where does it come from?
  • What does it cost?
  • Where should we devote our efforts?
  • What kinds of efforts offer the greatest ROI?

If you follow the various discussion forums concerning data quality, you will find one question popping up with regularity: "Who is responsible for Data Quality?" I asked myself why the question is asked. What prompts the question? It seems not to matter whether the organization has a reputation for quality, nor whether it has a history with data quality, nor whether the questioner is experienced or inexperienced, executive, manager, or front-line production. Why?

Having talked with some of these folks and researched the situations of several others and then simply meditated on this over an extended period of time, I have come up with a few likely scenarios.

  1. The questioner knows the answer but wants either validation or a sufficient number of "right" answers from people who are likely to be respected.
  2. The questioner has encountered roadblocks from unexpected directions and is dealing with surprise and disappointment.
  3. The questioner is curious about what others are doing.

What I have NOT seen is any evidence that the questioner is sincerely trying to determine how best to attack the problem of poor data quality. It would be easy to assume from these same discussion forums and conversations that nearly everyone knows what they're doing and has either solved the problem or is well on their way to a solution. When we learn enough about human nature we understand that these people are all feeding their egos (or rather that their egos are feeding themselves, since much of this is unconscious) and that they are taking an instance of limited or small-scale success and projecting it into an eventual enterprise-level solution. I have done this myself.

I know this--that data quality, or any other variety of quality for that matter, will not bend to advanced degrees, nor to mastery of data and information design concepts, nor to any product or set of products, nor to any effort by marketers, nor even to the best-designed procedures, methods, governance structures, architectures...

The ONLY way to data quality lies through the hearts and minds of each and every person in your organization. Each of them has the power to subvert any plan, procedure or method; to render ineffective any tool or product; and to humble the greatest of egos.

The bottom line is that it must be everyone's job, but of course that isn't a satisfying answer, because the very short list of things that everyone believes are important includes things like breathing, eating, elimination of wastes, procreation, maybe community, relationship, acknowledgement... This list of universally agreed-upon important things will never spontaneously include data quality. In fact, if a poll were conducted in the boardroom, it is unlikely that data quality would appear on the list of things important to the company.

Don't misunderstand--the quality, security and accessibility of your data assets are at least as important to the continued health and well-being of your company as those of your capital assets. The problem lies in the fact that data isn't real and tangible. If bad data smelled bad or rusted or developed crumbling holes, or if it resigned and went elsewhere where it was more appreciated, or was subjected to audits by outside entities, or showed up on a P&L or balance sheet where it was reviewed by prospective data contributors--THEN it might get some attention.

It is true that anyone can produce an example of data but virtually no one--even those responsible for collecting and storing its instances--will understand that data is something other than what they are holding or pointing to or storing. But I digress.

If we can't accept the answer that data quality is everyone's job, then we need to move on to identify the person or corporate function who will be accountable for the quality of the company's data. It's not possible here to put a name to this accountable party. What we can do, though, is itemize some of the skills and abilities required to help point the way to your unique name.

First and most important, let's agree that what we are talking about here is cultural change and cultural change, more than any other kind of change, requires leadership. Already we see that a corporate function can never be accountable although you can tape a job title on the person's door when you identify him/her. This leader will be able to move freely across the company and will be able to give everyone the feeling that they have been heard. This DQL (data quality leader) will be conversant with principles and practices of quality improvement. The DQL must be completely comfortable with the nature of data and will not be confused by the display of samples. Attributes of the data asset as a whole will be the focus of all of the DQL's efforts. S/he may well choose to shine the spotlight on a segment of the population and may delegate someone more familiar with that segment to assume the leadership of this effort. That surrogate DQL will also deal only with population attributes (metrics).

The DQL will never fall into the trap of confusing examples with anything else. An example may be representative or it may be an anomaly. Only the population metrics allow us to tell the difference.
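
To make the distinction concrete, here is a minimal Python sketch of judging an example against the population it came from. Everything in it is an assumption made up for illustration: the order amounts, the function names `population_metrics` and `is_anomaly`, and the three-standard-deviation threshold. It is one simple way to compute population metrics, not a recommended method or tool.

```python
from statistics import mean, stdev


def population_metrics(values):
    """Summarize a whole column: completeness plus mean and spread of present values."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "completeness": len(present) / len(values),
        "mean": mean(present),
        "stdev": stdev(present),
    }


def is_anomaly(example, metrics, threshold=3.0):
    """Flag an example lying more than `threshold` standard deviations from the mean.

    The threshold of 3.0 is an arbitrary illustrative choice.
    """
    return abs(example - metrics["mean"]) > threshold * metrics["stdev"]


# Hypothetical order amounts; None marks a missing value.
order_amounts = [102.0, 98.5, 101.2, None, 99.8, 100.4, 97.9, 103.1, None, 100.0]
metrics = population_metrics(order_amounts)

print(metrics["completeness"])       # share of records that have a value at all
print(is_anomaly(100.4, metrics))    # a record close to the rest of the population
print(is_anomaly(1004.0, metrics))   # e.g. a misplaced decimal point
```

Looking at the single value 1004.0 on its own tells you nothing. Only the mean, the spread and the completeness of the whole population reveal whether that value is typical or an outlier.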

Further attempts at guiding your choice may be counter-productive if you begin to feel manipulated or otherwise used.

A final caution concerns those characteristics that will render a person unsuitable. Just as you might wonder about a carpenter who feels compelled to talk interminably about his hammer or his saw, the person who leads with the name of a tool, tool vendor, methodology, author, book, etc., is unlikely to be the one you're looking for. The last thing that the leader/agent of cultural change needs is to divert any attention away from the primary focus. Products and tools may be useful for producing the population metrics discussed above, but beyond that should be well in the background and completely invisible to the majority of those you are attempting to influence.

Those who confuse files or [mega/tera]bytes with data are likewise unsuitable, as are those who confuse a spreadsheet, chart or report with data.

I hope I haven't made it seem like an impossibility. Talk with people about this and over time you'll begin to get a feel for what to look for and what to avoid.