
Friday, March 29, 2013

Jon Stewart Don't Know Squat About Data

This past Wednesday (Mar 27, 2013), the Daily Show included a segment on the backlogs within the VA and found the cause to be
  • The reliance on paper to transfer medical records between the DoD and the VA
So far, so good.  Good investigative reporting.  If they had only left it at that and allowed viewers to form their own conclusions...

Instead, Jon Stewart made a joke about validating his preconceived ideas concerning Republican responsibility.  He played a clip of a Republican member of the Defense Appropriations committee comparing AHLTA (the DoD healthcare information system) and VistA (the VA's Healthcare Information umbrella) to Play Station and X-Box which, though both can use the same TV, cannot talk to one another.  AND THEN Mr. Stewart suggested that the parent of the household, if he/she wanted to minimize contention and confusion, could impose a single solution on the household.  A photo of President Obama was displayed during this, clearly indicating that all the blame could be laid at the President's feet.

Nothing could be further from the truth.  The VA has been assembling the components of VistA since 1987  and it was fully formed by the late 1990s.  AHLTA was introduced amidst much fanfare at the beginning of 2004.  The birthing process for AHLTA overlapped the adolescence and maturation of VistA, indicating a conscious decision (pre-2004) by the DoD to ignore the VA's efforts.

This entire mess is too complex in its origins and effects to discuss in a single blog entry so I'll content myself with one last piece of exculpatory evidence on behalf of anyone who has assumed elected office since 2004.  The only way to move information from AHLTA to VistA is
  1. Generate paper from AHLTA
  2. Send the paper to the VA
  3. Enter the information from the paper into VistA
This is also the reason why the DoD can't simply replace AHLTA with VistA. 

There is a lot more to be said on this subject but, among other things, it seems clear that this is but one (albeit very painful) example of the inability of our health care "system" to adopt anything approaching a standardized view of the enterprise.  Interoperability is something to be wished for but which cannot exist in an environment in which healthcare systems vendors are the X-Box, the Play Station, and Wii.

Thursday, February 14, 2013

Data is like Scat

We (data people) like to talk about data as if it were the most important thing in the world when, in reality, it is exactly like turds.  Yes, that's what I said, merde, gavno, scheiss, shit. 

Of course, it could still be the most important thing in the world but whether it is or is not depends on many things, not the least of which is what we're trying to do.  Data is the byproduct of process much as turds are the byproduct of digestion.

For naturalists and scientists, castings or animal feces provide important information on a wide range of subjects including the animal's diet, where it has been, and its general health.  For a tracker, a different set of information is deducible from the spoor.  The tracker tells which species, how recent, whether the animal is moving or is likely to be nearby and other things that may help him earn a bonus.

No one says, "This is quality shit." or "We need a governance program to improve the quality of our shit."  They simply learn what they can from it and move on.  The kind of value available changes over time but even petrified feces have a story to tell.

If we presume to be data experts, maybe we should be focused on extracting value from our data and understanding the kinds of value that can be extracted as well as how to recognize the data that will tell us what we need to know.

Even the most inconsistent set of data has a story to tell and we should listen to that story instead of wailing about the story we wanted to hear.  Because data is the byproduct of process, inconsistent data should tell us that the process that produced it is inconsistent.  If this is the case, how can we expect consistent data unless we can create a consistent process?

The bottom line is that when we focus our efforts on data quality we are misunderstanding the world we live and work in.  We are creating additional and entirely unnecessary complexity.  How does this happen?  The cause lies in the storage of data for input to and output from computer systems.  We have allowed ourselves to institutionalize the cart before the horse.  What computer systems require is consistent input.  A system can be designed to deal with quality issues as long as they are predictable.  A system (process) will always produce output of a consistency equivalent to its input.
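To make the point about predictable quality issues concrete, here is a minimal sketch in Python.  Everything in it is an assumption made up for illustration: the `KNOWN_FORMATS` list, the `parse_date` name and the idea of a feed that writes dates three different ways.  The input is messy, but it is messy in a predictable way, so the system can be designed to cope.

```python
from datetime import date, datetime

# Hypothetical: the only date layouts this imaginary feed is known to produce.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def parse_date(raw: str) -> date:
    """Normalize a date string that follows one of the known layouts."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # not this layout, try the next one
    # An unknown layout is not "bad data" to be scrubbed; it says the
    # upstream process has changed or is inconsistent.
    raise ValueError(f"unrecognized date layout: {raw!r}")


print(parse_date("2013-03-29"))
print(parse_date(" 03/29/2013 "))
print(parse_date("20130329"))
```

All three calls yield the same date.  A layout outside the known set raises an error rather than being guessed at, which points attention back at the process that produced it.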

If we were pursuing consistency instead of quality, our disagreements would be fewer, our stress would be lower and our impact would be greater.  Consistency is a readily understood concept while quality will always be the most elusive of quarries.

Sunday, September 9, 2012

Reach the Unreachable Goal

Information quality, like many information and technology subspecialties, seems to drift in and out of focus.  Sometimes I find it difficult to understand why the subspecialty exists.

A recent epiphany concerning words and language has sent me on an outward-bound trajectory.  I used to chafe at the "definition" of quality in a data context--the one that defines quality as "fit for purpose."  I am now ready to break with it completely. 

My epiphany had to do with the fact that meaning is not packaged up in words and phrases.  Rather meaning is hiding behind and beneath them.  Well-chosen words can be used as markers to stake out the boundaries of meaning or even, if we're careful, to constrain meaning much like a fence.  When we focus on the stakes or the enclosure we risk losing the meaning that's inside.

We have all sat in high school (or college) literature classes and debated what the author or poet meant--what was the meaning that he or she had captured within the fence of language they had constructed.  What we have failed to see in our information technology context is that business leaders, managers, consultants and others are exactly like those authors (only usually not nearly as careful in their use of language).  We always have to ask what they meant and when we investigate, we invariably find that they didn't really know.

For more than 30 years we have labored to constrain the use of terms and encourage (or even enforce) the standardization of terminology.  The issue seems to have taken on even more importance with the widespread adoption of newer reporting (business intelligence) technology that brings the data full circle.  The leaders who were unable to stay focused long enough to generate good requirements are now launching their BI desktop and seeing the result.

I think that if we are honest with ourselves we will have to admit that, though we now have better titles for what we do and we are getting paid better to do it, our goal is not attainable.  We are being held accountable for The Quality of Data because that's how we have labeled ourselves.  We haven't even learned from the tribulations suffered by the other quality disciplines.  At least we should begin calling it data quality control or data quality assurance.  We have to focus on improvement in processes rather than improvement in data.

Look inside the fence I have built here and see if you can find some meaning that will help you.

Friday, June 22, 2012

Language, Information, Data and Quality

Hat in a Puddle
Language is a slippery thing.  For many years I conducted my life in the firm belief that if I were only precise enough in my selection of vocabulary, clear enough in my choice of syntax, I could convey an idea without ambiguity to any audience.

I have frequently been disappointed with the result. There is a force at work that allows people in the audience to navigate their own way through the meaning of my language.
Those in the data and information industry are accustomed to thinking of meaning as the semantic of the data.  Nothing could be further from the truth.

The dictionaries are full of semantics.  We can choose from a rich set of words (semantic tokens) to describe the situation represented in the picture above.  Note that we can change the meaning of the set of semantic tokens by several non-verbal (without words) methods including tone and inflection.  For example "hat in a puddle" has a different meaning than "hat in a puddle" or "hat in a puddle."

When we talk about semantics, we mean the meaning denoted by the words.  Alas, as humans we must also deal with the connoted meaning that each of us associates with the words.  Words and collections of words invoke in us memories, hopes, desires that are not part of the semantic but are part of the meaning.

The situation is very much like the puddle above.  Most people will accept the puddle at "face value" and simply avoid it so as not to get wet or muddy.  Others will make assumptions and develop expectations based on their personal experience with puddles.  Some of these will not change course, especially if their experience and their current situation allow them to expect that they won't get muddy.  Note that a new dimension was just introduced--a temporal dimension that allows us to react differently now than we might have an hour ago.

Appearance is Not Meaning
All Meaning Not Apparent
A man walking along a road saw a hat in a puddle and recognized the hat as one usually worn by his neighbor.  He thought to pick it up and return it.  When he picked up the hat, however, he saw the face of his neighbor.  He asked whether the neighbor needed help.  "I'm all right.  I've got my horse under me," was the reply.

The face value of words (and appearances) is accepted by most people and used to support decisions of all kinds.

Poets understand that meaning is not conveyed by words.  "Wait a second!" you say, "Poetry is composed of words."  We're both correct.  The meaning of a poem (or a story) is created by all the images, memories, hopes, dreams and desires that those words evoke in us.  This is why everyone who makes rhymes is NOT a poet and why everyone who has a command of vocabulary and syntax is NOT an acclaimed author.

This is the world in which we attempt to improve data quality.  While we may aspire to improve the quality of the semantics, it seems clear that we will never influence the quality of meaning.  This is, perhaps, what "fit for intended purpose" tries to convey.  What if the semantic tokens were musical notes instead?  What if they were colors or smells?  Would we be as confident?

What if we ceased our attempts to control the perceptions of an audience and instead created ways for our audience to explore the boundary between semantics and meaning?

Monday, May 14, 2012

Whole Cloth or Patchwork

Data Quality is too big to conquer. 

Customer mailing addresses or patient demographics are big enough challenges for most.  What is the common thread that, once followed, will allow for a holistic approach to data and information quality? 

The picture at right (captured from Wikipedia and source unknown though the file name was in German) gives an idea of what I mean.  The weft (or woof) is the continuous thread while the warp is comprised of individual threads such as tools, methods, metadata, process, culture, management, governance and so on.

The whole cloth can smother any initiative while the individual threads of the warp provide gainful employment and career advancement for (at least) thousands.  Anyone with thread or yarn experience knows that, wound on spool or into an organized and well-designed ball, it is useful and can be applied effectively to many purposes.  Off the spool and out of control it is trash that must be thrown away.

If the whole cloth is data quality, then what is the weft that makes it more than a collection of individual fibers?  When we wish to weave data quality, we don't get to choose our warps.  We have to take everything as we find it and somehow turn it into the blanket that everyone wants.  Sometimes we find that we have a small patch of that blanket but no way to merge it with other patches because the weft is too specific and insufficient.

Sometimes we find a way to take several such patches and turn them into a quilt.  Rarely, the quilt provides the security needed.

Some questions that occur to me:
  • Can data quality be a patchwork?
  • What are the candidates for the weft that will bring all the variations of warp together into a consistent fabric?
  • Can we content ourselves with becoming masters of a specific warp thread?  If enough such masters emerge, can they collaboratively create data quality?
  • Is computer science part of the warp?  How about MIS?  Psychology?  Anthropology?  Or are these kinds of things that can be twisted into the yarn of the weft?
  • How do management, governance, leadership, software development, system design and architecture, data design and architecture, process design and architecture... fit?
  • How do (data) modeling, metadata management and various other data-specific technologies fit?
  • How do commercial products and tools fit?
I'm on the lookout for new questions but if I ever come across an answer, I'll certainly scoop it up and add it to my basket.

Sunday, November 13, 2011


We don't value now.  We talk about it all the time.  We use it for emphasis.  We mostly use it to separate the past from the future.

Often when we talk about now we have only the fuzziest of notions, the flimsiest of definitions in mind.  "Now is the time for men and women of conscience..."  The concept of now in this instance could mean anything from a generation to a session of Congress to a particular crisis.  Sometimes we use now to mean very soon or as soon as humanly possible (the future) as in, "I need it now!"  Occasionally we use it to bound some time in the past as in, "Now in those days..."

Now has power that, for most people, is unrealized because it is unrecognized.  For our purposes now is the moment of decision.  If we are able to grasp that moment, that now, and use it, we can change our own life and the lives of those around us.  We can use now to create a new future.

In the gap between stimulus and response there is a piece of eternity--now--in which we can decide what the future will look like and decide on the response that will launch that future.  Please be aware of now and use it as it is intended to be used. 

When we are present in our life we are conscious of each now and we use them to create a future that matches our vision.  What is your vision, your ideal?  What will you use to direct the decisions you make in each of your nows?

Wednesday, November 9, 2011

All Governance (like Politics) is Local

Tip O'Neill, the Speaker of the House of Representatives during the Carter and Reagan years, is famously said to have offered the advice that "All politics is local."  If there is anyone out there who doesn't understand that Data Governance is politics then wait.

If we're to gain any advantage from the former Speaker's wisdom, we're going to have to pull it apart and take a look at all the pieces.  Clearly he wasn't denying the existence of national and even international politics.  He had participated in politics at every possible level so what did he mean and how can we benefit?

First of all the context (which is always eliminated from "sound bites") is that of successful politics.  Which of us doesn't dream of successful data governance?  If we can accept that DG is political rather than technological or administrative or managerial, then we're ready to make use of political wisdom in our quest for successful data governance.

Successful politics is about getting enough people to come with you so that you can accomplish a vision.  Because we're human, we look for shortcuts.  We start by assuming that if we can convince the right person then that person will bring everyone else along.  So we start with our elevator speech in case we find ourselves confined with an influential person for any period of time. 

We also adopt the position that money will equate to support.  We pursue funding which requires approval at the executive level.  In short, we focus much if not most of our efforts on the critical few in the blind hope that all others are followers.

For sheep this may work.  Substantial research has been done on flock or herd behavior in an attempt to understand how humans are influenced to move one way or the other.  We have all seen a flock of starlings or sparrows or a school of fish suddenly change direction--apparently with a single will.  What magic would get people to act that way?  Leaving aside the question of goal or vision, which may or may not involve the common good, if we could master this magical force, think of all the effort that could be put to better use.

I have read some of this research and at the risk of oversimplification the answer lies not in identifying the leader but in identifying the first followers.  When one bird or fish or wildebeest, in motion, changes direction it may be for any reason or no reason at all.  If no one comes with them, they will very quickly rejoin the mass.  If another individual comes along then two going in the same direction exert some "gravitational" attraction that acts to influence others in the vicinity.

In the human world we divide people into leaders and followers.  More generally, we try to create leaders by assigning titles or creating org charts.  As mentioned above, we intend to leverage leadership by devoting our efforts to affecting their path, trusting that they will bring with them enough followers to make our effort successful.  The problem with all of this is that titles do not confer leadership. 

What lesson can we learn then from Tip's advice?  My own take is that, rather than search for a leader, we might better be a leader, campaigning locally and helping our neighbors and those in need.  When we have one or more others with us because they are benefiting from the relationship we become much more effective in changing the direction of the herd.  Tip understood that grand political movements arise from individual voters recognizing common goals. No legislation is effective when the governed choose not to obey.  Devote your efforts locally and pay attention to what your neighbors in the next block are saying.

Politics is local.