The use of count data models in biomedical informatics evaluation research
- 1Division of Health Policy and Management, University of Minnesota, Minneapolis, Minnesota, USA
- 2Institute for Health Informatics, University of Minnesota, Minneapolis, Minnesota, USA
- Correspondence to Jing Du, Division of Health Policy and Management, 420 Delaware Street SE, Minneapolis, MN 55455, USA; duxxx031{at}umn.edu
- Received 17 March 2011
- Accepted 5 June 2011
- Published Online First 29 June 2011
Abstract
Objectives Studies on the impact and value of health information technology (HIT) have often focused on outcome measures that are counts of events, such as hospital admissions or laboratory tests per patient. These measures, with their highly skewed distributions (high frequency of 0s and 1s), are more appropriately analyzed with count data models than with the much more frequently used variations of ordinary least squares (OLS). Using a statistical procedure that does not properly fit the distribution of the data can cause significant findings to be overlooked. The objective of this paper is to encourage greater use of count data models by demonstrating their utility with an example based on the authors' current work.
Target audience Researchers conducting impact and outcome studies related to HIT.
Scope We review and discuss count data models and illustrate their value in comparison to OLS using an example from a study of the impact of an electronic health record (EHR) on laboratory test orders. In this example, the best count data model reveals significant relationships that OLS does not detect. We conclude that comprehensive model checking is highly recommended to identify the most appropriate analytic model when the dependent variable is a count. This strategy can lead to more valid and precise findings in HIT evaluation studies.
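The kind of model checking recommended above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis: the data are made up (they are not from the EHR study), the function names are hypothetical, and the models are intercept-only. The sketch computes three simple diagnostics for a count outcome: the variance-to-mean ratio (values well above 1 suggest overdispersion, which points toward negative binomial regression), the observed share of zeros against the share a Poisson model would predict (an excess points toward hurdle or zero-inflated models), and an AIC comparison between a Poisson fit and the normal-error model that underlies OLS. Comparing a discrete likelihood with a continuous density by AIC is a rough heuristic here, not a formal test.

```python
import math


def poisson_loglik(counts, mu):
    """Log-likelihood of the counts under Poisson(mu); requires mu > 0."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)


def normal_loglik(values, mu, var):
    """Log-likelihood under Normal(mu, var), the error model behind OLS."""
    n = len(values)
    squared_dev = sum((v - mu) ** 2 for v in values)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_dev / (2 * var)


def aic(loglik, n_params):
    """Akaike information criterion; lower values indicate a better fit."""
    return 2 * n_params - 2 * loglik


def check_counts(counts):
    """Return simple diagnostics for a list of non-negative integer counts."""
    n = len(counts)
    m = sum(counts) / n                         # MLE of the mean for both models
    var = sum((y - m) ** 2 for y in counts) / n  # MLE of the normal variance
    return {
        "mean": m,
        "dispersion_ratio": var / m,             # about 1 under a Poisson model
        "observed_zero_share": counts.count(0) / n,
        "poisson_zero_share": math.exp(-m),      # P(Y = 0) under Poisson(m)
        "aic_poisson": aic(poisson_loglik(counts, m), 1),
        "aic_normal": aic(normal_loglik(counts, m, var), 2),
    }


# Hypothetical laboratory test counts for 20 patients (illustrative only).
tests_per_patient = [0] * 10 + [1] * 5 + [2] * 3 + [4, 8]

diagnostics = check_counts(tests_per_patient)
for name, value in diagnostics.items():
    print(f"{name}: {value:.3f}")
```

With data like these, a dispersion ratio above 1 and more zeros than the Poisson model predicts would be reasons to fit and compare negative binomial, hurdle, and zero-inflated models before settling on one, which this sketch does not attempt.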
- Poisson regression
- negative binomial regression
- hurdle regression
- zero-inflated regression
- count data model
- health IT adoption
- people and organizational issues
- measuring/improving patient safety and reducing medical errors
- classical experimental and quasi-experimental study methods (lab and field)
- methods for integration of information from disparate sources
- supporting practice at a distance (telehealth)
- data models
- data exchange
- communication and integration across care settings (inter- and intra-enterprise)
- human-computer interaction and human-centered computing
Footnotes
- Funding This project was funded in part under grant number UC1 HS16155 from the Agency for Healthcare Research and Quality, Department of Health and Human Services.
- Competing interests None.
- Provenance and peer review Not commissioned; externally peer reviewed.









