Data Integrity Fundamentals
The term “Data Integrity” essentially asks, “Is the information reliable?” This is an obvious concern in healthcare, particularly as the industry moves from relying on complete documents to sometimes less than complete “data feeds” when making patient care decisions. There are multiple vulnerability points in the life cycle of health data:
- Point of capture from clinical records
- Point of storage in a local database
- Point of retrieval from a local database to be shared with another system
- Point of import of data into another system
- Point of reconciliation when data from disparate sources of input are not in agreement
The diagram below shows the “Four Pillars of Data Integrity in Healthcare”:
- Concept-level specificity: This refers to how the information is stored in the database, usually in the form of a code (e.g., an ICD-9-CM code). In healthcare, patient care information stored in databases is often in the form of claims data (e.g., ICD, CPT, HCPCS). This type of data is often not specific enough to accurately represent clinical information. For example, there is no code in ICD-9-CM that matches the concept of pelvic pain, so providers must choose another code with either a broader or a related meaning (e.g., right lower quadrant abdominal pain). The use of claims data in U.S. healthcare is ubiquitous and is currently a major source of data integrity compromise. Unfortunately, this most often occurs at the point of data entry, making it very difficult to correct these errors once they have occurred.
- Completeness: Even when information is accurately tied to a code at the concept level, the information may not be complete. To illustrate, suppose a patient is referred to a neurologist for the evaluation of “Possible multiple sclerosis” and the neurologist’s impression is “Doubt multiple sclerosis.” The modifying term “doubt” has considerable clinical relevance. The vast majority of health information technology systems can capture a code (e.g., ICD-9-CM) that represents “Multiple Sclerosis.” However, very few of these systems can also capture and store the modifying term “doubt.” This has patient safety implications: if the patient were seen by another provider and the only information conveyed was the diagnosis of multiple sclerosis, that provider might not pursue other causes of the patient’s symptoms, not knowing that the specialist actually felt the diagnosis of multiple sclerosis was in doubt.
- Temporal Accuracy: Healthcare information can rapidly become out of date as changes are made to the patient’s medications, new test results arrive, etc. This is compounded by the fact that information is stored in systems that are not synchronized with one another, such as the hospital electronic health record (EHR) database, the primary care provider’s EHR database, the EHR databases of specialists, pharmacy databases, data entered and maintained by the patient, and numerous other sources. Data that is retrieved locally or received from another system may be out of date, and this can also create patient safety concerns. Healthcare providers and HIT stakeholders need to make every effort to validate that the information being used to make medical decisions is current.
- Highest Authority Authorship: This refers to situations where there is conflicting information contributed by different specialists or from other sources. Certain sources of information may have a higher level of authority and reliability than others. As providers create problem lists in EHRs, they may enter information that is less specific and accurate than what already exists in the record. For example, a patient with recurrent headaches may be diagnosed as having a temporomandibular joint disorder (TMJ syndrome) by an ENT specialist. The patient may subsequently be seen by another provider (e.g., an orthopedist) whose staff enters “Migraine Headaches” into the patient’s problem list based on what the patient reports. Because this is the most recent source of information, there is a possibility that the inaccurate but more recent diagnosis of migraine headaches will replace the accurate diagnosis of TMJ syndrome. For this reason, additional context about the source of information is needed when attempts are made to reconcile data in patient records.
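As a rough illustration of the last three pillars, the sketch below stores a certainty modifier, a recording date, and the source alongside each problem-list entry, and reconciles conflicting entries by source authority before recency. It is only a minimal sketch under stated assumptions: the names `Certainty`, `ProblemEntry`, `is_stale`, and `reconcile`, the numeric authority ranks, the one-year staleness window, and the dates are all hypothetical and do not come from any particular EHR system or standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Certainty(Enum):
    """Modifier that a bare diagnosis code would lose (hypothetical values)."""
    CONFIRMED = "confirmed"
    POSSIBLE = "possible"
    DOUBTED = "doubted"


@dataclass(frozen=True)
class ProblemEntry:
    description: str      # clinical concept as the provider stated it
    certainty: Certainty  # e.g. DOUBTED for "Doubt multiple sclerosis"
    recorded_on: date     # when the entry was made (temporal accuracy)
    source: str           # who authored the entry
    authority: int        # assumed rank: higher means more authoritative


def is_stale(entry, today, max_age=timedelta(days=365)):
    """Flag entries older than an assumed review window."""
    return today - entry.recorded_on > max_age


def reconcile(entries):
    """Prefer the most authoritative source; use recency only as a tie-breaker."""
    return max(entries, key=lambda e: (e.authority, e.recorded_on))


# Illustrative entries based on the TMJ example; dates and ranks are made up.
tmj = ProblemEntry("Temporomandibular joint disorder", Certainty.CONFIRMED,
                   date(2012, 3, 1), "ENT specialist", authority=3)
migraine = ProblemEntry("Migraine headaches", Certainty.POSSIBLE,
                        date(2012, 9, 15), "Patient-reported, entered by staff",
                        authority=1)

chosen = reconcile([migraine, tmj])
print(chosen.description)  # prints "Temporomandibular joint disorder"
```

Ranking by authority first means the more recent, patient-reported migraine entry does not displace the specialist's diagnosis. A real system would also need a defensible way to assign authority ranks, which this sketch simply assumes.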
In summary, protecting the integrity of data is one of the most challenging areas of health information technology, but it has a direct impact on patient safety and on the quality of data that can be used for population health management and clinical research.
The information presented above represents the opinions of Michael Stearns, MD
©Michael Stearns, all rights reserved