Evaluating Healthcare Information Technology Outside of Academia: Observations from the National Resource Center for Healthcare Information Technology at the Agency for Healthcare Research and Quality
- a Brigham and Women's Hospital, Boston, MA
- b Harvard Medical School, Boston, MA
- c National Opinion Research Center at the University of Chicago, Office in Bethesda, MD
- d Indiana University School of Medicine, Indianapolis, IN
- e Regenstrief Institute, Inc., Indianapolis, IN
- f National Resource Center for Health Information Technology, Agency for Healthcare Research and Quality, Rockville, MD
- Correspondence: Eric Poon, MD, MPH, Division of General Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital, 3/F 1620 Tremont Street, Boston, MA 02120 (Email: ).
- Received 13 October 2008
- Accepted 28 May 2009
The National Resource Center for Health Information Technology (NRC) was formed in the fall of 2004 as part of the Agency for Healthcare Research and Quality (AHRQ) health IT portfolio to support its grantees. One of the core functions of the NRC was to assist grantees in their efforts to evaluate health IT. This manuscript highlights some common challenges experienced by health IT project teams at nonacademic institutions, including inappropriately scoped and resourced evaluation efforts, inappropriate choice of metrics, inadequate planning for data collection and analysis, and lack of consideration of qualitative methodologies. Many of these challenges can be avoided or overcome. The strategies adopted by various AHRQ grantees and the lessons learned from their projects should become part of the toolset for current and future implementers of health IT as the nation moves rapidly towards its widespread adoption.
Information technology has significant potential to improve patient safety, organizational efficiency, and patient satisfaction within healthcare.1 2 3 4 5 6 7 8 Since the late 1990s, major initiatives have been proposed to promote the adoption of health information technology (health IT).9 10 11 12 13 With congressional support, the Agency for Healthcare Research and Quality (AHRQ) initiated an extensive portfolio of health IT projects in 2004 to plan for, implement, and evaluate the use of health IT around the nation.14
The AHRQ National Resource Center for Health Information Technology (NRC) was formed in the fall of 2004 as part of the AHRQ health IT portfolio to support the implementation and evaluation efforts of AHRQ health IT grantees and to share knowledge and findings derived from the real-world laboratory created by the portfolio. The NRC conducted a needs assessment of health IT grantees funded between 2004 and 2005 through review of their applications, site visits with grantees, and consultations with AHRQ representatives and national health IT experts. During this assessment it became clear that many grantees would need assistance in evaluating the health IT they were about to implement and in measuring its impact and value. The Value and Evaluation (V&E) group within the NRC thus targeted its activities to assist grantees in this endeavor. This manuscript summarizes the activities of the V&E team during the early evaluation efforts between the fall of 2004 and the fall of 2006. It describes the common issues that grantees encountered in their evaluation efforts and the lessons learned in providing assistance in evaluation methods to those who are inexperienced in the field.
The NRC wanted to share its experience with the larger informatics community for two main reasons. First, the evaluation challenges experienced by grantees outside of large academic centers likely reflect the challenges facing the spectrum of healthcare institutions implementing health IT. Second, our experience suggests that many of these evaluation challenges can be overcome. Therefore, the strategies the grantees implemented with the assistance of NRC should become part of the toolset for current and future implementers of health IT.
The AHRQ Health IT Portfolio 2004–2006
The AHRQ initiative on health IT is a key element of the nation's 10-year strategy to bring health care into the 21st century by advancing the use of information technology. At the end of 2006, the AHRQ initiative included more than $166 million in grants and contracts in 41 states nationwide to support and stimulate investment in health IT, especially in rural and underserved areas. Through these and other projects, AHRQ and its partners aim to identify challenges to health IT adoption and use, to develop solutions and best practices for making health IT work, and to develop tools that will help hospitals and clinicians successfully incorporate new IT.15 A major component of AHRQ's health IT initiative is a virtual nationwide learning laboratory of more than 100 hospitals, physician practices, research institutes, nursing homes, and collaboratives immersed in developing and testing new health IT applications that will change the way Americans experience health care.
Characteristics of Grantees and Contracts within the AHRQ Health IT Portfolio
As of 2006, the AHRQ health IT portfolio consisted of 6 major types of projects:
- planning projects, which focused on increasing grantees' readiness to adopt health IT,
- implementation projects, in which grantees focused on the implementation of health IT in various settings,
- value projects, which called for grantees to define the value of health IT,
- State and Regional Demonstration (SRD) projects, which involve electronic data exchange at the regional level,
- e-prescribing pilots to implement and test new e-prescribing standards, and
- the Health Information Security and Privacy Collaboration (HISPC).
The most common care setting in the portfolio was ambulatory (72%), followed by inpatient (56%), emergency (18%), and community health centers (15%). More than 16 unique technologies existed across the portfolio. Forty-six percent of the projects involved technology designed to exchange health information among disparate systems; other technologies include clinical decision support (40%), electronic health records (33%), and computerized physician order entry (22%).16
Activities of the NRC Value and Evaluation Group
Formation of the Value and Evaluation Group
In the formative days of the NRC, staff at both AHRQ and the NRC recognized that the nation would only achieve the full value from the significant investment in the health IT portfolio if each grantee was able to document the impact of their health IT implementation; sharing quantitative and qualitative lessons learned became a major goal. Due to Congressional budgetary direction to fund rural and small health care providers, AHRQ decided to award a large number of health IT grants to health care organizations with significant potential for success but little experience in traditional academic activities such as evaluation, publication, and so forth. Although evaluation is critical in all the AHRQ funded health IT projects, there was a recognition that a more diverse group of grantees could result in a broader national perspective on the obstacles and enablers faced by health care organizations beginning to implement health IT. The intent was to offset the initial lack of traditional evaluation experience by focused efforts on the part of the NRC.
National experts involved in awarding the AHRQ grants recognized that many grantees, in particular those in the health IT planning and implementation portfolios, lacked necessary experience in evaluation. This was borne out when reviewing initial grant applications. Many, while meritorious in other ways, lacked details on how the grantees would evaluate their health IT projects. In addition, during the formal needs assessment conducted by the NRC, assistance with evaluation was one of the six most common types of requests received from the grantees. As a result, the NRC targeted the activities of the Value and Evaluation (V&E) group on two specific needs: (i) assistance to grantees in the domain of evaluation, and (ii) gathering lessons learned from grantees so the value of the entire portfolio would be greater than the “sum of its parts”.
Educational and Outreach Activities
One of the first formal activities conducted by the V&E team was to examine the evaluation plans that implementation grantees were asked to submit post-award. The results of this examination confirmed the need to address the knowledge and experience gap in evaluation for a significant number of grantees. The V&E team recognized that different educational and outreach methods would suit different grantees; thus the team engaged in a variety of methods to support grantees in their evaluation efforts.
The V&E team developed 1-hour tutorials on the basics of evaluating health IT, delivered via teleconference and again during the AHRQ health IT annual meeting. These didactic sessions provided an overview of the components of an evaluation plan and discussed nuances that even experienced health services researchers might overlook.
Development of evaluation toolkits
The V&E team developed a written toolkit in an effort to help grantees develop their evaluation plans. The toolkit itself underwent several iterations, moving from one that had a series of tables, metrics, and specific examples of projects to one that provided more guidance by outlining the steps involved in the creation of an evaluation plan in the format of a workbook.17
Steps in the workbook included asking the grantees to identify the goals of the health IT implementation and the evaluation itself, to identify key stakeholders, and to identify what could be measured to determine if stated goals had been achieved. The expanded workbook version asked its users to prioritize the various measures using a combination of both the importance and the feasibility of each measure. In response to a request for a toolkit focused on data exchange, the V&E team created a second toolkit to address the needs of projects that focused on health information exchange.
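The prioritization step described above can be illustrated with a minimal sketch. The workbook's actual scoring scheme is not reproduced here; the 1-to-5 scales, the feasibility floor, the use of a simple product of importance and feasibility, and the example measures are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Measure:
    name: str
    importance: int   # hypothetical scale: 1 (low) to 5 (high)
    feasibility: int  # hypothetical scale: 1 (hard to collect) to 5 (easy to collect)


def prioritize(measures, min_feasibility=3):
    """Drop measures below a feasibility floor, then rank by importance x feasibility."""
    feasible = [m for m in measures if m.feasibility >= min_feasibility]
    return sorted(feasible, key=lambda m: m.importance * m.feasibility, reverse=True)


# Hypothetical candidate measures and ratings, for illustration only.
candidates = [
    Measure("Inpatient mortality", importance=5, feasibility=1),
    Measure("Medication orders with allergy check", importance=4, feasibility=4),
    Measure("Clinician satisfaction survey", importance=3, feasibility=5),
    Measure("Adverse drug events", importance=5, feasibility=2),
]

ranked = prioritize(candidates)
for m in ranked:
    print(m.name, m.importance * m.feasibility)
```

In this sketch the two high-importance but hard-to-collect measures fall below the feasibility floor, leaving a short list that a small project team could realistically measure. In practice the ratings would come from discussion with stakeholders rather than from fixed numbers.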
Development of workshop curricula and case studies
Recognizing that some grantees might benefit from more intensive assistance in creating their evaluation plans, the V&E team also held two case-based workshops. Two plans initially submitted to AHRQ were reviewed in-depth and specific feedback was given iteratively as members of the V&E team guided the grantees to focus on evaluation measures that were both important to their particular stakeholders and also feasible to measure. Each workshop consisted of brief didactic sessions on evaluation, covering both quantitative and qualitative approaches, followed by the presentation of the two cases in which the grantees discussed the iterative process of improving their evaluation plans. Time was also set aside for NRC experts to address each grantee's questions on evaluation.
To extend the availability of health IT evaluation experts to grantees who could not attend the workshops, the V&E team offered a series of office-hours via teleconference so that grantees could have their questions answered.
Structured evaluation of evaluation plans in 2005
During the second year of the V&E team activities, an additional 15 implementation grants were awarded. Partly out of concern that these grantees might not have fully defined their evaluation strategy within their initial applications, AHRQ requested evaluation plans from each of these new implementation grantees. Based on the issues encountered during the first year of activities conducted by the V&E group, the team developed and piloted an evaluation plan critique instrument (available upon request). This allowed for the formal evaluation of the evaluation plans submitted by the second group of implementation grantees. During this round the State and Regional Demonstration (SRD) projects were also asked to submit their evaluation plans. The critique instrument sought to determine whether (i) the goals of the implementation were well articulated, (ii) the goals of the evaluation were well stated, (iii) the data collection and analysis plans for quantitative and qualitative measures were appropriate, and (iv) the plan presented was feasible. Each plan received a score of 1 (highest) through 5 (lowest) for each domain. An overall rating of the evaluation plan on a scale of 1 (highest) through 5 (lowest) was also given. Each plan was scored independently by two members of the V&E team using this instrument and then discussed within the team. Scores and critiques were then forwarded to each grantee, and grantees were given the opportunity to attend further office hours to address comments raised by the V&E team.
Quantitative Evaluation of Evaluation Plans
Results of the evaluation of the evaluation plans submitted by 15 AHRQ grantees in 2006 appear in Table 1. Overall, these evaluation plans were able to articulate the goals of the health IT implementation and its evaluation (average implementation goal score = 1.2, SD = 0.45; average evaluation goal score = 1.5, SD = 1.0). These 15 implementation grantees scored in the reasonable range for measure selection (average score = 2.3, SD = 1.2) and study design (average score = 2.4, SD = 0.8). The feasibility score, the quantitative measures score, and the qualitative measures score were less impressive, averaging 2.9, 3.3, and 3.3, respectively. The average overall score was 2.7 (SD = 0.8), indicating that on average these 15 implementation grantees provided evaluation plans that fell between “good” and “fair”.
Common Issues with Evaluating Health IT Outside of Academia
Several common themes emerged in the activities of the V&E group. The findings, discussed in detail below, are likely to reflect not just the group of grantees with whom the V&E group interacted directly, but also many others who are committed to evaluating health IT projects but do not have direct evaluation experience themselves or access to those who do.
Leaving Evaluation as an ‘After-thought’
We found that many projects had not planned on evaluating the health IT they were tasked with implementing, and others had vague notions about evaluation. This issue has been addressed within the AHRQ health IT portfolio through the work of the V&E group and subsequent funding announcements, but it likely reflects the general level of misunderstanding of evaluation work beyond the academic setting. Most projects were able to specify the goals for the implementation of health IT and the success criteria for the project, but did not specify how the project team would measure whether the goals or success criteria, if identified, would be met. Other implementation projects did acknowledge the need for evaluation by including specific personnel with the training and experience needed to carry out an evaluation, but in the vast majority of cases, did not allocate sufficient resources for it. Despite these concerns, the NRC, with the assistance of AHRQ, was able to help grantees develop realistic evaluation plans. In our experience, it is important to help health IT project teams understand the benefits of evaluation and how evaluation can facilitate the overall implementation process. Once that is achieved, health IT implementers should incorporate the evaluation efforts as a key component of their overall project plan.
As project teams worked on evaluation plans, many took advantage of the sample measures listed in the earliest version of the Evaluation Toolkit and provided long lists of outcome measures with which to evaluate their implementation. A common phrase in the critique of the evaluation plans by NRC reviewers was that the plans were “overly ambitious,” as many project teams failed to recognize the significant resources needed to effectively carry out their extensive evaluation plans. These teams either lacked the financial resources to support appropriate staff to execute the evaluation plans, or lacked access to the appropriate experts (e.g., statisticians) to guide the evaluation. Another concern of the V&E team was the possibility of false positives if too many outcomes were examined.
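The false-positive concern can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that each outcome is tested independently at the same significance level, which real evaluation measures rarely satisfy exactly, and it shows the Bonferroni adjustment as one common (and conservative) remedy rather than as the approach the V&E team prescribed. The function names are hypothetical.

```python
def familywise_error_rate(n_outcomes, alpha=0.05):
    """Chance of at least one false positive across independent tests run at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_outcomes


def bonferroni_threshold(n_outcomes, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_outcomes


# Assumption: outcomes are independent and each is tested at alpha = 0.05.
for k in (1, 5, 10, 20):
    print(k, round(familywise_error_rate(k), 3), round(bonferroni_threshold(k), 4))
```

Under these assumptions, a plan that tests 20 outcomes has roughly a 64% chance of producing at least one spurious "significant" result, which is one reason a shorter, focused list of measures is easier to defend.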
Subsequent versions of the toolkit, which laid out a framework for choosing among the many possible evaluation metrics, resulted in more realistic evaluation plans. Once the project teams explicitly assessed the feasibility and the importance of each desired evaluation metric in light of stakeholders' goals and the resources available, they were able to focus their energies on metrics that were feasible to collect data on and would yield information meaningful to their local stakeholders. This was to be expected as the toolkit became more prescriptive and AHRQ and the NRC strongly encouraged the second group of implementation grantees to use it. Future health IT implementers should leverage this “lesson learned” and plan for evaluation efforts that address the primary needs of key stakeholders without taking valuable resources away from the IT implementation project.
Mismatch between Evaluating Metrics Chosen and the Health IT Being Implemented
Some project teams chose evaluation metrics without understanding whether each was relevant to the specific health IT implementation and the implementation environment. For example, if a stand-alone inpatient pharmacy system is being implemented, should the rate of pneumococcal vaccine administration be chosen as an evaluation metric? Appropriate use of pneumococcal vaccines is a practice that is well supported by evidence in the literature; it represents an important quality improvement goal for many hospitals. In deciding whether this metric is appropriate for the implementation of a stand-alone pharmacy system, one should determine whether such a system would actually affect the rate of pneumococcal vaccine administration. In theory, pharmacists could remind physicians to prescribe the pneumococcal vaccine for eligible patients. However, if the pharmacy system is not integrated with the patient's outpatient electronic health record (EHR), then a pharmacist would have to take the initiative to review a patient's outpatient EHR for vaccine status. In such circumstances, most busy pharmacists would not be able to overcome the many barriers to improving the rate of pneumococcal vaccine use. No one can guarantee that any particular measure will be affected by health IT, but project teams need to focus their limited resources on metrics that are likely to reflect an impact of their implementation. Health IT implementers need to think through workflow and cultural issues in conjunction with the health IT being implemented to formulate appropriate and testable hypotheses and choose the right metrics to test those hypotheses directly.
Chasing Rare Events without Adequate Statistical Power
Health IT has the potential to affect significant patient outcomes, such as mortality and adverse drug events. To detect changes in these relatively rare events, a high volume of observations must be made. In some cases, project teams did not have the resources to collect adequate data, and in others, particularly in rural areas, it would take many years to make a sufficient number of observations. In such cases, the advice from the V&E team was to select measures for which they would have sufficient statistical power, even if it meant focusing on process rather than outcome measures. Understanding the need for sufficient statistical power will be critical to future evaluation of health IT implementation as more and more such implementations are built around improving quality of care.
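A rough power calculation shows why rare outcomes are so demanding. The sketch below uses the standard normal-approximation formula for comparing two independent proportions; it is a planning aid under stated assumptions, not a substitute for a statistician's review. The function name and the example rates (an adverse event falling from 2% to 1%, and a process measure rising from 60% to 75%) are hypothetical and are not taken from any grantee's data.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p_before, p_after, alpha=0.05, power=0.80):
    """Approximate observations needed per group to detect p_before vs. p_after.

    Normal approximation for a two-sided test of two independent proportions.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_before * (1 - p_before) + p_after * (1 - p_after)
    effect = p_before - p_after
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical rates, for illustration only.
rare_outcome = sample_size_two_proportions(0.02, 0.01)     # e.g., an adverse event rate
process_measure = sample_size_two_proportions(0.60, 0.75)  # e.g., a documentation rate

print("rare outcome, per group:", rare_outcome)
print("process measure, per group:", process_measure)
```

With these assumed rates, the rare outcome requires on the order of a few thousand observations per group, while the process measure requires on the order of one to two hundred. A small or rural site may reach the latter in months and the former only after years, which is the trade-off behind the advice above.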
Limitations of Data Available
It is often possible to use data collected for another purpose to support evaluation efforts. The V&E team encouraged this practice, especially if project teams had limited resources to devote to evaluation. Common sources included billing data, quality improvement data, and data used for external reporting. However, these data sources may have limitations that deserve consideration. For example, billing data may not adequately or accurately capture the care given, unless clinicians are going to be incrementally reimbursed for a specific activity. In some cases, quality improvement data and data collected for external reporting may represent an insufficient sample for statistical inference, limiting generalizability of the findings. These challenges are not insurmountable, but they require mitigation strategies, including data validation and consideration of statistical power, before these data sources can be used for evaluation. One technique the NRC often suggested, which is applicable to most future implementations, is to pilot data collection and analysis efforts early so that midcourse corrections are possible should initial assumptions about the quality of data and feasibility of data collection methods prove to be incorrect.
Improper Comparison Group
To demonstrate that the health IT being implemented has an impact on the metrics chosen, data must be collected on a valid comparison group. When the energy of the project teams is directed at the new technology, it becomes easy to forget to do so. In our experience, even if evaluation resources are limited, it is possible to capture at least baseline data through low-cost methods such as surveys or through data that have been collected for other purposes such as billing. By collecting data using the same methodology before and after the implementation of health IT, implementers are at least able to conduct a valid before-and-after study to measure the impact of health IT.
Many health IT project teams wanted to follow the gold standard of study design by conducting randomized controlled trials. Some of them realized that logistically it was not possible to do so because the community implementing health IT did not find it acceptable to delay implementation even for a short time period for a randomly chosen subset of the community. In these cases, valid comparison groups on which to collect data could still be identified. For example, project teams could identify another community that was not implementing any similar form of health IT and use it as a “control” community. If data could be collected in both the grantee's community and in the “control” community before and after the implementation of health IT, then the change in outcome over time could be compared between the two communities to determine if the health IT affected the outcome. In other cases, communities were planning to roll out the health IT in a staggered fashion over the course of months to years across different sites within the community. In these cases, project teams could collect outcomes data before and after the rollout of health IT in each site, and data collected in this fashion could be used to support time series analyses. Alternative approaches to the traditional randomized controlled trial can frequently satisfy the needs of the health IT implementers and their key stakeholders.
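The comparison of change over time between an implementing community and a “control” community is often called a difference-in-differences estimate. The sketch below shows only the arithmetic; the rates are hypothetical, the function name is an assumption, and a real analysis would also need a measure of uncertainty and a check that the two communities were trending similarly before the implementation.

```python
def difference_in_differences(impl_before, impl_after, ctrl_before, ctrl_after):
    """Change in the implementing community minus change in the control community."""
    return (impl_after - impl_before) - (ctrl_after - ctrl_before)


# Hypothetical rates of a process measure (fraction of eligible patients), before and after.
estimate = difference_in_differences(
    impl_before=0.60, impl_after=0.75,  # community implementing health IT
    ctrl_before=0.58, ctrl_after=0.63,  # "control" community without similar health IT
)
print(round(estimate, 2))
```

In this example the implementing community improved by 15 percentage points, but 5 points of improvement also occurred in the control community, so only about 10 points would be attributed to the health IT. A simple before-and-after study in the implementing community alone would have overstated the effect.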
Insufficient Details on Data Collection and Analysis
Details are important, and the process of developing an evaluation plan offers the opportunity to define them. At a minimum, the plan should consider how the data needed to support the chosen metrics will be collected, the population on which the data will be collected, and when the data will be collected. If these details are not thought through, it is easy to “over-promise” on the number of measures to be collected. The plan should also discuss how the data collected will be analyzed, and statistical power calculations should be part of the plan. Our experience at the NRC suggests that these gaps in the evaluation planning can be addressed through access to evaluation plan templates and remote mentorship.
Exclusive Focus on Quantitative Methods
Data collected using qualitative methodologies may be as illustrative of lessons learned as data collected through quantitative methodologies, if not more so. While quantitative methodologies are powerful and efficient at capturing healthcare outcomes, qualitative methodologies are often superior at capturing the “why's” and the “how to's”. To that end, the NRC has continued to encourage the use of qualitative methodologies, such as focus groups, semi-structured interviews, and surveys, to capture the lessons learned, barriers encountered, and the success factors in each project. The V&E team discovered that many project teams were not familiar with these methods or were sometimes reluctant to use them, believing that findings from qualitative methodologies are not concrete and are difficult to disseminate, particularly in peer-reviewed journals. Because qualitative methodologies offer unique tools for health IT evaluation, they should not be discounted because of misperceptions or lack of expertise. In some cases, this expertise can be identified in fields outside health IT, such as sociology and anthropology, although experts in those areas are likely to need assistance in gaining health IT domain knowledge.
Lessons Learned in Providing Evaluation Assistance to the Inexperienced
We found that many health IT project teams have little experience in the complexities of evaluating health information technology. We believe that our grantee sample is reasonably representative of those implementing health IT outside of academic centers throughout the United States. Although experts from disciplines outside of health IT, such as epidemiology, biostatistics, and program evaluation, can contribute to health IT evaluation efforts, they will need to be educated to appreciate the challenges unique to health IT (such as the limitations of electronic data entered by clinicians). Future efforts by national organizations, such as AMIA and HIMSS, to build the talent pool for health IT (for example, through the AMIA 10 × 10 effort) should prominently include in their curricula the basics of health IT evaluation.
Beyond the lack of knowledge of how to conduct health IT evaluations, we also discovered a lack of appreciation for the importance of evaluation. Within academic informatics circles, it is generally accepted that formative evaluation ensures that the project meets established benchmarks and provides for mid-project redirection.18 This may not be understood outside academia. IT professionals may emphasize project delivery and implementation of software but not the evaluation of its effectiveness. Evaluation may be thought of as a “luxury”. Organizations such as AHRQ, HIMSS, and AMIA may need to expend significant effort to demonstrate the value of evaluation in health IT design and implementation. These organizations may need to collaborate with vendors and healthcare organizations so that metrics can be more broadly used to evaluate whether the goals of health IT projects have been met.
We were encouraged by the progress many of the grantees made in response to NRC's educational efforts and AHRQ's growing emphasis on solid evaluation. Many of the issues identified in this manuscript are correctable, especially if caught in the early phases of the project. The NRC V&E group also learned that the assistance provided must be matched to the experience level of the recipients. For example, in the hands of the uninitiated, an early version of the evaluation toolkit that offered an extensive list of evaluation metrics without any guidance on how to pick them may have invited grantees to submit evaluation proposals that were unfocused and over-scoped. Our experience suggests that project teams with very little experience with evaluation may require more intensive assistance as they formulate their evaluation plans.
As the saying goes, “the perfect is the enemy of the good”. In the face of finite evaluation resources, the NRC and the AHRQ grantees have discovered together that practicing the “art of the good enough” is intellectually challenging, and the fruits of such labor rewarding. As the examples in this manuscript illustrate, the gold standard of randomized controlled trials and high-impact metrics such as patient mortality may appear as obvious first choices. Nonetheless, when lofty goals are unachievable, trying to attain them at the expense of more modest ones may be detrimental to the overall evaluation effort. Organizations such as AMIA, HIMSS, and AHRQ should facilitate the evaluation efforts of organizations with modest resources by developing and promoting low-cost evaluation methodologies in health IT and providing vehicles to share their results. Funding agencies should ensure that evaluation efforts are at least moderately resourced and explicitly considered in the funding decision making process.
There are limitations to our experience and hence to the viewpoints expressed here. First, while the AHRQ portfolio is diverse, it does not represent the entire field of health IT. It is perhaps reasonable to assume that AHRQ grantees may already be more sophisticated than the average implementer of health IT, and it is likely that the relatively basic set of issues encountered by the portfolio will be encountered by others. Second, the criteria used to evaluate the quality of evaluation plans are somewhat different from earlier attempts that focused on research published in peer-reviewed journals.19 While we agree on the importance and the validity of the earlier criteria, the set of basic issues uncovered by the V&E team highlights the need to address rudimentary evaluation challenges before addressing the more nuanced ones, such as sophisticated study designs, regression modeling, and statistical clustering. Third, members of the V&E team might have brought personal biases to this work.
Evaluation is important because the value proposition of health IT continues to be challenged. A review article19 showed that a preponderance of evidence has emerged only from four academic health centers, and the impact and value of health IT in the community remains an open question. Until this is determined, it may be difficult for individual healthcare organizations to justify the often enormous investments needed for health IT as opposed to other pressing issues. This may change as the national health care reform initiatives and funding opportunities focus on widespread adoption of health IT. If it does, the need for a health IT evaluation component in implementation projects will become even more critical. In addition, better evidence may spur ancillary payers to contribute to the upfront investments needed for health IT and prompt greater demand from healthcare consumers. Evaluation will likely allow lessons learned to be translated into more efficient implementation and change management strategies for future adopters of health IT, thus decreasing the financial and organizational burden of implementation. Making evaluation methodologies more accessible and more widely used by a wide range of healthcare organizations that are implementing health IT should be a priority for those interested in accelerating the adoption of IT in the United States.
The authors thank the members of the V&E team: Davis Bu, MD, MSc, Karen Cheung, MPH, Daniel Gaylin, MA, Adil Moiduddin, MA, Anita Samarth, B Eng, Jan Walker, RN, MBA, Jon White, MD, Atif Zafar, MD.
This work was supported in part by the AHRQ National Resource Center for Health IT, contract number 290-04-0016.
The viewpoints expressed in this manuscript reflect the experience of the coauthors who worked on the NRC V&E group. Members of the AHRQ HIT leadership team were asked to review the manuscript draft and their comments have been incorporated into the present manuscript.