There have been some great discussion threads on the IAIDQ LinkedIn group recently. One thread that attracted a lot of attention started with a question from a PhD candidate in Data Quality. It simply asked whether there is an accepted definition of data quality. 200 replies later, most people would say, "No." More recently a thread began by bemoaning the fact that there is no accepted definition of Data Governance. A lively discussion followed that continues even now. Yet another refers to an article on five reasons to cleanse downstream instead of preventing upstream.
I am about to shed some light, however feeble, on the subject. Allow me to start by admitting that I am a person who likes to do the analysis necessary to solve a problem. Though my patience is improving with practice, those who know me will back me up when I say that I want to solve a problem ONCE.
Previous posts here have explained the abstract nature of data and all that implies in terms of getting people on board when it comes to doing something about quality. People will listen to or read a horror story about some preventable data issue that cost Company ABC $umpteen million. They will nod sagely and say something like, "They should have seen that coming." They are simply unable to see that their own company is engaged in the exact same practices and completely at risk for the $umpteen million.
Friends, it isn't just our favorite whipping boys, Management. There is no more recognition within IT than there is in the boardroom. Our boxes-and-wires friends think of data in terms of DASD and RAID configurations or bandwidth and throughput. Our developer pals don't really think of data at all except as the fuel that activates their code. Architects appear to be concerned with the storage and throughput views overlaid with an access management filter. They seem more concerned with making developers and DBAs happy than with the quality of the asset.
Enter Data Governance, which in most instances wants to be about definitions, rules and "enforcement." Often Data Governance tries to heap another thick layer, called metadata, on top of all the data that is already being mismanaged in the organization. It's often the case that Data Governance fails to practice what it preaches.
Here's the revelation: Data Governance isn't about data. Data Governance is about process. It is the means to the Data Quality end. I have already said that Data Governance is that part of corporate governance that is dedicated to stewarding the corporation's data asset. It is exactly analogous to the role of Finance/Accounting with respect to the capital asset. Unfortunately, Finance has two things going for it that Data Governance doesn't have: GAAP and audits.
Generally Accepted Accounting Principles (GAAP) is a set of guidelines for money management processes that are, as the name implies, accepted and USED nationally and internationally. The use of these practices ensures that processes will be auditable. The audit process verifies that GAAP was used and, if there were exceptions, that they were clearly noted with enough information to allow the results to be brought back into alignment with GAAP. The underlying theme is that if the processes were sound, then the result is believable.
Imagine if every company of any size whatsoever were able to devise and use its own bookkeeping structure and process. There could never be a stock market. Equity trading would be too risky for anyone and all business would essentially be sole proprietorships. Moreover, there would be no chance of oversight by outside bodies (Government).
This is a picture of the situation with respect to data today. When will it get better? Data Governance has no power to make the situation better. Without an externally defined data management framework and periodic audits by independent auditors, there will be no improvement. In the meantime, if data quality metrics improve, it's only because some particularly strong and charismatic personality is present.
No one questions the need for accounting or the rigor of accounting procedures. Actually, the same can be said for data governance and data management procedures. The difference is that in the case of money, the lack of questioning results in compliance, while in the case of data it results in apathy or confusion.
Does the data world have something like GAAP that could become the necessary process infrastructure to support data management audits? I don't see it. Data is still too personal, too subjective, too misunderstood to attract the attention of researchers. Data management is a black box to virtually everyone and they like it that way.
People prefer to cleanse downstream data because their customers feel their pain being relieved. Happy customers are the goal, after all. The bonus is that cleansing provides an unending source of employment for those doing the cleansing. It's win-win! People aren't going to be highly motivated to change a win-win scenario any time soon.