Electronic Health Records in Four Community Physician Practices: Impact on Quality and Cost of Care
- Affiliations of the authors: UnitedHealthcare, (DB, KR, LGS), Edina, MN; Center for Health Care Policy and Evaluation/Ingenix (YB, WPW, RH), Edina, MN; The Lewin Group (WPW), Falls Church, VA; Duval County (FL) Health Department (RH), Jacksonville, FL
- Correspondence and reprints: Dawn Bazarko, RN, MPH, UnitedHealthcare, 5901 Lincoln Drive, Edina, MN 55346 e-mail: < >
- Received 12 April 2006
- Accepted 9 February 2007
Health information technology (HIT) is widely seen as a way to increase the quality and lower the cost of care. Advocates for HIT suggest that technology (particularly clinical decision support) increases guideline adherence in practice, which improves health status and, especially over the long term, lowers utilization and cost.
Electronic health (or medical) records (EHRs) may be the most frequently discussed form of HIT. The term “EHR” is used to include a wide range of functionalities in some discussions and a much narrower range in others. The Institute of Medicine (IOM) described eight “core” functionalities.1 Prominent among these are
- health information and data storage,
- management of results from laboratory and imaging tests,
- electronic ordering (e.g., prescription drugs and referrals),
- clinical decision support (e.g., guideline reminders), and
- administrative processes such as billing.
HIT adoption and diffusion have garnered proponents across the political spectrum, most notably Republican Newt Gingrich and Democrat Hillary Clinton, and the Bush Administration appointed a National HIT Coordinator to promote HIT.
Nonetheless, few studies have systematically analyzed the costs and benefits of HIT, and even fewer empirical studies of HIT adoption have been published in the peer-reviewed literature. The cost and quality impact of HIT adoption is unclear. Hillestad et al2 estimated that EHRs could save the country approximately $40 billion a year during the adoption period, through reduced hospitalization rates and length of stay, as well as more appropriate pharmaceutical utilization and care for chronic illness. Conversely, given the significant underuse of effective care,3 EHRs could also increase costs (particularly in the short term) by identifying and ensuring delivery of effective preventive and chronic care services, or by addressing long-standing workflow issues. One recent notable study reported an increase in mortality associated with Computerized Physician Order Entry (CPOE) in pediatric hospitals.4
Moreover, the published literature on the impact of applying information technology in ambulatory care settings is particularly scant, with most studies pertaining to computer-based clinical reminder systems.5 6 7 EHRs have also been shown to help identify adverse drug effects.8 Four primary studies9 10 11 12 each limited their analysis to a single practice, and none assessed impact on cost. Three of the four practice sites were university affiliated and the fourth was affiliated with a research foundation, raising questions about the representativeness of the practice sites.
The current study analyzes the impact of implementing EHRs in four community-based private practice settings, which are more representative of practices nationally than those previously studied. Further, this study employed a novel methodology to assess the impact on both clinical quality and costs by utilizing the database of a large managed care organization. In this study, we report on the impact of EHR adoption for patients with at least one of four conditions (diabetes, hyperlipidemia, selected heart conditions, and hypertension), analyzing both guideline adherence (quality) and cost to payers (payment to providers, adjusted for casemix and payment rate), before and after these practices adopted EHRs.
In order to study a series of “natural experiments” among private medical practices, we utilized the database of a large national managed care organization (MCO). The data in this MCO’s data warehouse were prepared for analysis and de-identified for analytic purposes. Each enrollee was assigned a single ID number for research purposes, which both linked across multiple operational IDs and protected the enrollee’s identity in the data warehouse. The study received an exemption from the Western Institutional Review Board, because the study analyzed only existing data and did so in such a way that individual patients could not be identified.
The database included all claims—facility, professional, and pharmacy—for this MCO’s commercial enrollees, and included variables that are typical across payer systems. Using this analytic foundation, we then initiated recruitment of study and control practices.
Recruitment of Practices
Eligible study practices had the following characteristics: significant volume of the MCO commercial membership (at least 10% of patient volume), representation in the MCO’s database at least one year prior to EHR implementation, care for patients with the chronic conditions of interest, and willingness to participate.
We asked the MCO’s market medical directors in various parts of the country to suggest study and control practices. The medical directors, who are knowledgeable about the practices in their area, provided us with contacts at 23 potential study practices, all of which were contacted via phone to conduct an initial assessment of appropriateness. Based on this screen (or because of a failure to respond to several contacts), nine groups were eliminated.
Discussions were held with the remaining fourteen groups to obtain greater detail on their HIT implementation timeline and level of HIT. These discussions were often with a staff member with in-depth knowledge of the practice’s IT. Given a practice’s HIT timeline, we checked data availability in the data warehouse. These two steps eliminated another seven practices. Because these practices are long-standing business partners of the MCO, only practices that signed a participation agreement were included in the study. Due to internal competing priorities, three practices did not agree to participate. The four participating practices are described in Table 1. All practices had multiple office sites. The second (B) was a primary care practice; the others were multispecialty practices.
The first three practices implemented EHRs that maintained diagnostic data, laboratory test results, and (except for practice B) imaging test results. They also had e-prescribing capabilities and decision support functionalities for certain conditions. Practice D did not have a complete EHR; rather it enhanced its practice management software by adding features such as limited e-prescribing capabilities and by accessing data from a local hospital network (inpatient, ER, laboratory, and imaging results plus consultant notes).
Using input from the local medical directors and the data warehouse, we identified practices located in one (or more) of the counties served by a study practice. For each candidate practice, we obtained counts of its claims for the relevant specialties in the study practice’s before and after periods, retaining only those practices with a substantial number of claims.
Project staff members contacted each of the potential control practices via phone. Only those confirming no EHR implementation were used as control practices. Because of the large number of potential control practices, we were able to exclude practices that had any EHR use (e.g., clinical decision support or management of results from laboratory and imaging tests) or were questionable in that regard. Given that our information needs were substantially less than for study practices, we did not request an agreement to participate.
A number of techniques were used to control for practice characteristics. The same time period was defined for a study practice and its control practices. While not controlling directly for the specialty of the practice, we did so indirectly: only physicians in the specialties relevant to the condition being examined were included (see below). Because of our study-control-before-after design (see below), each practice served as its own control; thus, practice-level characteristics such as size and socioeconomic patient mix are controlled for.
Both quantitative and qualitative data collection and analysis were utilized in this study. This approach involved not only claims analysis to understand changes over time in quality and cost but also collection of information about the intangibles of EHR implementation.
In the qualitative study, we initially met via telephone and later visited each of the study practices to understand all of the intangibles that could not be gleaned from the data analysis. We discussed technical functionality including: diagnosis, pharmacy, lab, radiology, population management, decision support, reminders, and patient education capabilities. We also discussed more broadly the practice culture and adoption of the EHR, including whether or not physician champions were identified, physician and staff scheduling during implementation, and barriers they faced during implementation. The practices approached the implementation of their EHR in different manners. (For practice-specific details on implementation, see the appendix.)
The remainder of this study focuses on our quantitative analysis.
The methodology outlined below looks at change in quality and cost of care for selected medical conditions likely to be affected by adoption of EHRs. Changes in outcome variables were assessed by measuring differences in casemix-adjusted “episodes of care” in two time periods, “before” and “after” EHR adoption, in both study and control practices, matched for time frame and local geography. The unit of analysis is the episode of care, defined using Episode Treatment Groups (ETGs)™. Because these pertain to chronic conditions, as a practical matter the unit of analysis was a 12-month period.
The medical conditions selected for analysis had the following characteristics: related to the National Committee for Quality Assurance (NCQA) physician recognition programs and involved care that is complex enough to benefit from EHR tools. In addition, they were chronic, largely treated in the office, not defined in terms of a procedure, and prevalent. Given these criteria, the following conditions were selected: diabetes, hyperlipidemia, selected heart conditions, and hypertension, for which there are 11 ETGs. See Table 2 (below).
Once practices and targeted medical conditions were identified, we analyzed the episodes of care attributed to each practice. We analyzed each practice’s quality and cost, as measured by guideline adherence and cost adjusted for casemix. The two software packages (see the next two subsections) for these metrics differ in several ways: The selected conditions are slightly different, the number of episodes for a condition differs, and the method for identifying the responsible physician differs (see below). For both metrics, the unit of analysis is the episode of care.
From the MCO’s claims database, two files were extracted: (1) claims with such fields as enrollee ID, date of service, the practice’s tax ID, and physician ID; and (2) a physician file with ID and specialty. For each tax ID and for the before and after periods, we initially extracted all physician claims for the specialties relevant to our study, namely, family practice, general internal medicine, cardiology, and endocrinology. All the claims—facility, professional, and pharmacy—for each enrollee ID in this extraction were extracted, regardless of provider.
Cost Measure (Episode Treatment Groups™)
Episode Treatment Groups™ (ETGs) software was used to measure case-mix adjusted cost (i.e., payment to providers) at the episode level. The software uses data-mining logic on an enrollee’s claims over time to construct episodes of care. An episode may have a beginning and end date, although episodes of chronic care are often open-ended. Each episode is classified into one of 558 episode groups that are homogeneous in terms of co-morbidities and other characteristics of a patient’s condition. An enrollee may have overlapping episodes of different ETGs, and may have multiple, non-overlapping episodes of the same ETG. For each episode, the ETG software yields both an expected cost (the mean for the ETG) and the actual cost.
The first step in constructing an episode involves identifying an “anchor record” that represents a clinician directly evaluating or treating a patient. Other claims (e.g., for tests or prescription drugs) can then be linked to the anchor records. The linkage rules depend on the ETG.
This approach has advantages and disadvantages. An advantage is that the impact of comorbidities on cost is mitigated. If the dependent variable were the entire cost of care over a 12-month period (regardless of the condition being treated), one would need to control for comorbidities. Two disadvantages stem from the fact that untreated hypertension, for instance, leads to major conditions such as stroke, usually in subsequent years. First, the episodes are defined narrowly, so the hypertension ETG does not include the major conditions resulting from untreated hypertension. Second, by capturing cost over one-year periods, our study design misses these long-term effects. Our measurement of cost is therefore necessarily short term.
Because payment rates changed at different rates for different practices, we “re-priced” claims. This involves calculating the mean allowed amount for each CPT code and attaching the appropriate mean to each claim, taking into account the quantity of services and distinguishing between professional and technical components for radiology services. This process was not applied to facility claims using revenue center codes, because those constitute only a fraction of one percent of the cost. It was also not applied to pharmacy claims, because pharmacy allowed charges were the same for enrollees in study and control practices.
Because average cost varies substantially by ETG group, our metric is the ratio of actual cost to expected cost.
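The re-pricing and cost-normalization steps above can be sketched as follows. The record layout, field names, and dollar figures are illustrative assumptions, not the study's actual data schema; the sketch only shows the logic of attaching a CPT-level mean price to each claim and expressing episode cost as a ratio to its ETG's expected cost.

```python
from collections import defaultdict

# Illustrative claims; field names and amounts are assumptions, not the study's schema.
claims = [
    {"cpt": "99213", "allowed": 70.0, "qty": 1},
    {"cpt": "99213", "allowed": 90.0, "qty": 1},
    {"cpt": "80061", "allowed": 40.0, "qty": 1},
    {"cpt": "80061", "allowed": 44.0, "qty": 1},
]

# Step 1: mean allowed amount per CPT code, pooled across practices and periods.
sums = defaultdict(float)
counts = defaultdict(int)
for c in claims:
    sums[c["cpt"]] += c["allowed"]
    counts[c["cpt"]] += 1
mean_allowed = {cpt: sums[cpt] / counts[cpt] for cpt in sums}

# Step 2: "re-price" each claim with the CPT mean, scaled by quantity of services.
for c in claims:
    c["repriced"] = mean_allowed[c["cpt"]] * c["qty"]

# Step 3: the case-mix-adjusted metric is the ratio of actual to expected episode cost,
# where the expected cost is the mean for the episode's ETG.
def cost_ratio(actual_cost, expected_cost):
    return actual_cost / expected_cost
```

In the study's design, the radiology professional/technical split would add a second grouping key at Step 1; it is omitted here for brevity.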
The responsible physician is the one with the largest costs for management or surgery. The episode is linked to that physician’s practice.
Quality Measures (EBM Connect™)
EBM Connect™ is software that computes compliance with evidence-based guidelines. Produced by Ingenix, it was released in May 2004. The software translates guidelines from English text into algorithms that assess guideline compliance from administrative data for at least 20 conditions. In this study, it was used to assess adherence to guidelines for four chronic conditions: adult diabetes, coronary artery disease, hypertension, and hyperlipidemia.
Conceptually, EBM Connect identifies patients who are under treatment for a certain condition (e.g., hyperlipidemia) and then determines whether they received certain services, tests, or prescription drugs [e.g., annually receive a low density lipoprotein (LDL) cholesterol test]. We used the default parameter of including only patients enrolled for the entire 12-month period. Some guidelines pertain only to a subset of patients with a given condition (e.g., those taking a certain drug). Only guidelines that a physician could reasonably be held responsible for were used in this analysis.
The unit of analysis was a guideline for a patient in a given 12-month period. The formulas for aggregation to higher levels of analysis were straightforward: each guideline was given equal weight within a condition, each condition was given equal weight within a site and within a study-control-before-after cell, and each site was given equal weight within a cell.
For guidelines within a condition, adherence may be correlated; for instance, patients may receive low- and high-density lipoprotein (LDL and HDL) cholesterol tests together. To avoid overestimating t-values for adherence rates across several guidelines for a condition, standard errors are calculated using the number of patients with a condition, for a given site and cell.
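The equal-weighting scheme can be illustrated with a small sketch. The condition names, guideline names, and adherence rates below are hypothetical, not the study's figures; only the weighting logic follows the text.

```python
from statistics import mean

# Hypothetical per-guideline adherence rates for one site within one
# study-control-before-after cell, nested condition -> guideline -> rate.
site_rates = {
    "hypertension":   {"bp_measured": 0.90, "med_adjusted": 0.70},
    "hyperlipidemia": {"ldl_tested":  0.80, "hdl_tested":  0.78},
}

# Each guideline gets equal weight within its condition...
condition_scores = {cond: mean(rates.values()) for cond, rates in site_rates.items()}

# ...and each condition gets equal weight within the site (and within the cell).
site_score = mean(condition_scores.values())
```

Because adherence across guidelines for one condition can be correlated (e.g., LDL and HDL tests ordered together), the standard error for a condition-level score should use the number of patients, not the number of patient-guideline observations, as the text notes.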
EBM Connect imputes a primary care physician (PCP). For instance, if a physician with a primary care specialty has given the most recent physical examination, he or she would be considered the PCP. If not, another algorithm is used. Then the episode is linked to this PCP’s practice.
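The PCP-imputation rule described above can be sketched as follows. The visit-record structure, the specialty set, and the fallback behavior are assumptions; the software's actual algorithm is proprietary and only its first rule is described in the text.

```python
# Hedged sketch of PCP imputation: attribute the patient to the primary care
# physician who performed the most recent physical examination. Field names
# and the specialty list are illustrative assumptions.
PCP_SPECIALTIES = {"family practice", "general internal medicine"}

def impute_pcp(visits):
    exams = [v for v in visits
             if v["service"] == "physical_exam" and v["specialty"] in PCP_SPECIALTIES]
    if exams:
        # ISO date strings sort chronologically, so max() finds the latest exam.
        return max(exams, key=lambda v: v["date"])["physician_id"]
    return None  # the software falls back to another algorithm, not modeled here

visits = [
    {"service": "physical_exam", "specialty": "family practice",
     "date": "2003-05-01", "physician_id": "P1"},
    {"service": "physical_exam", "specialty": "cardiology",
     "date": "2003-09-01", "physician_id": "P2"},
]
```

Here the cardiology exam is ignored even though it is more recent, because only primary-care specialties qualify under the first rule.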
Definition of Time Periods
Unfortunately for analytic purposes, adoption of EHR is not a precise event, either in time or scope. In an idealized design, practices would implement all aspects of EHR functionality over a short time period, followed by a defined learning phase, rapidly leading to a stable post-implementation phase. Reality, of course, is less analytically convenient: Practices implementing EHR tend to do so in steps: perhaps implementing a registry for all patients in one year, then setting up a decision support system two years later. An EHR system was considered to be “implemented” when a majority of the components had been implemented at each of a practice’s sites. However, it may take several years for the physicians, staff, and patients to learn to take full advantage of it.
We specify different periods for different study practices in order to maximize our study population. For each one, periods are defined in light of information on the timing of their implementation. To ensure comparability, control practices are assigned the same periods as their study practice.
We measured the impact of EHR implementation using a study-control-before-after design. The impact of EHR implementation was measured as the difference between (a) the change in casemix-adjusted cost or quality for study practices and (b) the change for control practices. As long as practice characteristics such as size and ownership type do not change over time, each practice’s performance in the before period implicitly controls for the impact of its characteristics in the after period.
To test the significance of this difference of differences, one typically assumes that the four means (e.g., the mean for study practices in the before period) are independent, so that there is no covariance among their standard errors. The standard error of the difference of changes is then calculated by squaring the standard error of each of the four means, summing the squares, and taking the square root of the sum.
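The difference-of-differences estimate and its standard error can be written compactly as below. The adherence rates and standard errors in the example are hypothetical, not the study's results.

```python
from math import sqrt

def did_impact(study_before, study_after, ctrl_before, ctrl_after):
    """EHR effect: change in the study practices minus change in the controls."""
    return (study_after - study_before) - (ctrl_after - ctrl_before)

def did_se(se_sb, se_sa, se_cb, se_ca):
    """Standard error of the difference of changes, assuming the four means
    are independent: the square root of the sum of squared standard errors."""
    return sqrt(se_sb**2 + se_sa**2 + se_cb**2 + se_ca**2)

# Hypothetical adherence rates and standard errors:
impact = did_impact(0.60, 0.68, 0.62, 0.68)        # (0.08) - (0.06) = 0.02
t_value = impact / did_se(0.01, 0.01, 0.01, 0.01)  # SE = 0.02
```

As long as practice characteristics are stable over time, the before-period terms net out each practice's fixed characteristics, which is the rationale for the design.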
In sum, we used several methods to control for potential confounding effects.
Casemix and price levels were controlled for by adjusting the dependent variable.
Location and time periods were controlled for through the selection of control practices and their time periods.
Practice characteristics such as size were controlled for through the study-control-before-after design.
As Table 2 shows, the database has more than 40,000 episodes in each of the two periods. Almost half of the episodes pertain to hypertension, a quarter to hyperlipidemia, 16 percent to diabetes, and 11 percent to heart conditions. The mean cost in the before period ranged from $476 for benign hypertension without comorbidity (the most prevalent condition) to $3,787 for coronary heart disease without acute myocardial infarction (the least prevalent condition).
Table 3 reports that there were 56 practices, including four study practices and 52 control practices. The study practices had a sixth of the episodes (summed across the two periods). Site B had 13 percent of the episodes and Site C had 40 percent, with the other two sites in between. Overall, cost per episode increased 2.9 percent, but it increased about 20 percent in site A, reflecting that implementation there started earlier and went longer than elsewhere. However, absolute changes are of little analytic interest, because they include changes in the prices of drugs but not physician services. Of greater analytic interest are the relative changes.
In aggregate, episode costs increased in the study practices by 0.4 percentage points faster than in control practices (t = 0.21). In exploratory regression analysis that controls for physician-specific effects via a random effect model,13 the results continue to be insignificant (t = 0.46), with a minimum detectable difference of 3.5 percent of the mean. The effect of the EHR implementation on cost is essentially zero.
In three of the four sites, cost per episode increased faster in the practices that implemented EHR. In sites B and D, cost per episode for the study practice increased roughly 6 percent faster; in site A, it increased less than 1 percent faster; and in site C it increased about 3 percent slower.
Tables 4 and 5 present results on the impact of EHR implementation on quality, as measured by guideline adherence. Measured across all conditions, guidelines, and sites, adherence increased 8 percentage points for study practices and 6 percentage points for controls. The differential change, which represents the impact of EHR implementation, is 2 percentage points. This differential change varies considerably across sites. Note also that, although the adherence rate in the after period varies little across sites (for study and control practices separately), the adherence rate in the before period is substantially lower in site A, which has the earliest implementation.
Table 5 disaggregates the EHR impact by condition and guideline. Although the EHR impact is insignificant across conditions as a whole, it is significant for hypertension and hyperlipidemia. The minimum detectable difference is less than 2.5 percentage points for these two conditions because of their high prevalence, but it is 5 percentage points for diabetes and 12 points for coronary artery disease.
This suggests that large numbers of episodes may be necessary to show relatively small differences in quality of care, especially for a relatively rare event such as myocardial infarction.
We identified four practices that met the criteria for study inclusion. The difficulty of identifying practices using EHRs illustrates the continued slow rate of adoption and the low level of EHR use in most geographic areas. Further, there is no single data repository that holds information about physician practices’ EHR use, so grass-roots recruitment is required to identify EHR users and their degree of implementation.
Across the practices studied, the implementation of the EHR did not consistently follow a prescribed method, and its use was highly variable (see the appendix). Practices purchased different software tools and approached training, tool use, and cultural adoption in varying ways. Practices also varied in the degree to which the clinical decision support capabilities of the EHR were “turned on.” As a result, the full effects of EHRs, such as evidence-based guidelines, patient reminders, and disease management and patient activation capabilities, may not be realized, which is consistent with findings in O’Connor et al.12
Applying software tools to evaluate cost per episode and adherence to treatment guidelines, we did not find a statistically significant difference in costs per episode between the study and control groups. Similarly, the impact of EHRs on adherence to evidence-based guidelines (quality) was not statistically significant, except for a modest impact on hypertension and hyperlipidemia. This may reflect the fact that practices can have some of the functionalities of an EHR (e.g., clinical reminders) without having a full EHR.14
In contrast to our expectation, cost to the payer increased faster in three out of the four practices than in their controls. In one of those three practices, an explicit motivation for EHR was to more completely capture the services provided in each visit. Overall, however, cost to the payer was unaffected by the EHR.
Whereas one issue pertains to payer costs, another pertains to the provider’s cost of delivering the service. EHR-using practices report savings associated with reduced medical record staff and infrastructure, but they also report concomitant loss of productivity and scheduling disruption associated with implementation. The practice that could be considered “best practice” in terms of physician and staff training, tool use, and cultural change reduced scheduling during the initial implementation period and had heavy involvement by physician champions. Overall, study practices do not report cost savings due to EHR use and cite the economic issues associated with implementation. The financial implications of EHR implementation and the question of “who pays” are real and outstanding issues facing practices, particularly smaller ones, which comprise the majority of organized physician groups in the United States.15
This study has a number of strengths. First, it evaluates the impact of EHR implementation in multiple sites (in three states). Relative to the literature, which has usually analyzed university-affiliated practices, our results are more representative of practices nationally.
Second, this analysis examined community practice patterns in study and control practices, a feasible approach due to the investigators’ access to a large MCO’s claims database. As this database yielded almost 100,000 episodes, it allowed detection of relatively small differences in episode costs and/or guideline adherence—generally in the 2.5 to 5.0% range. Third, this study involved use of widely available, industry-recognized software tools, EBM Connect and Episode Treatment Groups, to evaluate episodes of care utilizing casemix-adjusted methodologies; these analytic tools can be used for subsequent research studies.
Finally, this study combines qualitative data collection with quantitative analysis. Onsite practice discussions on EHR implementation and application provided insights into gaps in use of clinical decision support tools. The presence or absence of the application of the full capabilities of EHR cannot be easily ascertained through claims-based analyses only.
This study has several limitations. One is that it evaluates EHR implementation in only four study practices recruited in a non-random manner, practices that may have unique organizational, cultural, clinical, or provider characteristics that led to EHR adoption. They certainly are above average in terms of size. Within these practices, EHR implementation varied in terms of time duration, process, and scope. The before-after study design restricted the number of practices meeting study inclusion eligibility criteria, and claims volume criteria within these time periods eliminated several EHR practices from consideration and may have affected the overall findings. Moreover, claims volume criteria constrained the analysis to conditions that, while common, may not be the clinical conditions most amenable to impact by EHR adoption. Finally, this study analyzed episode costs and guideline adherence in the short term. One year of data (following a transition period) is sufficient to measure the impact of clinical reminder systems on guideline adherence. However, the long-term impact of better guideline adherence on cost and clinical outcomes could not be assessed within the study timeframe.
While HIT adoption has been shown to be a component of addressing the well-documented challenges of overuse, underuse, and misuse of healthcare services,16 more research is needed to understand the nuances of EHR implementation and the cultural and technology barriers to adoption, particularly in the “typical” US healthcare practice. The impact of introducing new technology into complex workflow is not well understood, and cannot be automatically equated with improved clinical quality for patients or lower cost for payers. Additional research is also necessary to study the impact on quality and cost over a longer time horizon.
To facilitate further research in this area, we offer an approach—the application of commercially-available software to a large database—to assess both risk adjusted episode cost and guideline adherence across large numbers of practices. Further studies utilizing databases of large national MCOs could shed even further light on the complex process of HIT adoption and the resultant impact on costs, quality, and medical care.
Appendix on EHR Capabilities and Implementation by Study Practices
Site A implemented a fully integrated EHR, including e-prescribing with its in-house pharmacy as well as laboratory and radiology results that were transmitted electronically. The system had point-of-care clinical decision support in the form of health maintenance reminders. Patient outreach reminders were available in the system but not used at the time. The practice also utilized the population management capabilities of the EHR to manage patients with diabetes and hypertension as well as for mammograms and childhood immunizations. At the time of our study, Site A was planning for integration with its hospital system.
Site A took a methodical approach to implementing HIT. They implemented the technology across their six clinics over the course of two years. Physician schedules were reduced to 50% for the first two weeks and then to 75% for the subsequent two weeks while the physicians and staff became comfortable with the new technology. The site also had two physician champions that worked alongside their colleagues as the system was being brought up in each practice to assist the physicians with troubleshooting.
Site B implemented an integrated EHR, including laboratory results and system-generated faxes sent to local pharmacies. Radiology was not integrated at the time due to cost. The EHR had enabled point-of-care reminder capability and a mailing capability for patient outreach was in use. Reporting from the system was largely manual, and at the time, population management reporting had only been developed for diabetes. The site also utilized patient trend data to manage individual patients’ chronic disease.
Site B rolled out its EHR to its eight clinics at two-week intervals. It did not reduce physician work schedules; rather, it asked physicians and staff to migrate their patients over time by starting new patients on the EHR, moving patients with specific diseases, or targeting a number of patients to enter each day. The CEO of the practice set up a central training lab where all physicians and staff were trained. Physicians went through the most extensive training and held biweekly round tables to share what they had learned about their use of the technology and best practices.
Site C implemented an integrated EHR, but had not enabled all of its capabilities. System-generated faxes were used for prescriptions (an interim step toward e-prescribing), and laboratory and radiology results were also fully integrated. The site was not using the point-of-care clinical decision support capabilities for the conditions of interest in this study. The patient outreach capability was only enabled for mammography, not for the chronic diseases of interest in this study. At the time, the site was not using the population management capabilities of the system or many of the reminder capabilities for chronic disease management. The site had also utilized the letter generation capability and the patient education library available through the technology.
Site C implemented HIT without a reduction in work schedules. Its physicians were still adjusting to the technology two years after implementation. Most of the physicians were still dictating rather than using the templating features of the technology, even though transcription costs were charged directly to the physicians. According to the site, each specialty used the HIT to a different degree.
Site D enhanced its practice management system to include a disease registry, which allows for individual and population reporting for patients with chronic diseases. Quarterly practice and physician quality scorecards were produced for diabetes, cardiac conditions, adult preventive care, and childhood immunizations. The site relied heavily on the data and analytic capabilities of their central office to drive clinical improvements. The site used paper generated checklists as its point-of-care decision support, which was not supported through its EHR. The site also used a reminder system for patient outreach. The site had integrated a majority of its laboratory and radiology results from its affiliated medical center.
The implementation at site D occurred across the multiple practices over time. The site indicated that data entry on the front end was time consuming, estimating that it took six months to get the majority of data entered.
Unless otherwise noted, usage was throughout each practice.
The authors thank the participating study practices for their contribution to this study. Their time and candidness have been invaluable. We are grateful for the assistance of the MCO’s Market Medical Directors; their expert knowledge of their markets helped make this study successful. Also, we would like to thank Mona Shah, MS, for project coordination, Erik Hokenson for project support including study practice recruitment, evaluation, and literature review and Carol Calvin, RN, MS, for her guidance, study practice recruitment, and control group verification.
Funding for this project was provided by the Agency for Healthcare Research and Quality, Contract # 290-00-0012, to the Center for Health Care Policy and Evaluation/Ingenix.