Using the time and motion method to study clinical work processes and workflow: methodological inconsistencies and a call for standardized research
- 1School of Public Health, Department of Health Management and Policy, University of Michigan, Ann Arbor, Michigan, USA
- 2School of Information, University of Michigan, Ann Arbor, Michigan, USA
- 3College of Medicine, University of Florida, Gainesville, Florida, USA
- 4Department of Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
- 5Comprehensive Cancer Center, University of Michigan, Ann Arbor, Michigan, USA
- Correspondence to Dr Kai Zheng, Department of Health Management and Policy, School of Public Health, University of Michigan, M3531 SPH II, 109 South Observatory Street, Ann Arbor, MI 48109-2029, USA
- Received 1 January 2011
- Accepted 1 April 2011
- Published Online First 27 April 2011
Objective To identify ways for improving the consistency of design, conduct, and results reporting of time and motion (T&M) research in health informatics.
Materials and methods We analyzed the commonalities and divergences of empirical studies published 1990–2010 that have applied the T&M approach to examine the impact of health IT implementation on clinical work processes and workflow. The analysis led to the development of a suggested ‘checklist’ intended to help future T&M research produce compatible and comparable results. We call this checklist STAMP (Suggested Time And Motion Procedures).
Results STAMP outlines a minimum set of 29 data/information elements organized into eight key areas, plus three supplemental elements contained in an ‘Ancillary Data’ area, that researchers may consider collecting and reporting in their future T&M endeavors.
Discussion T&M is generally regarded as the most reliable approach for assessing the impact of health IT implementation on clinical work. However, there exist considerable inconsistencies in how previous T&M studies were conducted and/or how their results were reported, many of which do not seem necessary yet can have a significant impact on the quality of research and the generalizability of results. Therefore, we believe it is time to call for standards that can help improve the consistency of T&M research in health informatics. This study represents an initial attempt.
Conclusion We developed a suggested checklist to improve the methodological and results reporting consistency of T&M research, so that meaningful insights can be derived from across-study synthesis and health informatics, as a field, will be able to accumulate knowledge from these studies.
- Time and motion studies (F02.784.412.846.707)
- workflow (L01.906.893)
- health information technology (L01.700)
- medical informatics applications (L01.700.508)
- collaborative technologies
- personal health records and self-care systems
- developing/using clinical decision support (other than diagnostic) and guideline systems
- systems supporting patient-provider interaction
- human-computer interaction and human-centered computing
- improving healthcare workflow and process efficiency
- system implementation and management issues
- social/organizational study
- qualitative/ethnographic field study
- cognitive study (including experiments emphasizing verbal protocol analysis and usability)
- methods for integration of information from disparate sources
- information storage and retrieval (text and images)
- data exchange
- integration across care settings (inter- and intra-enterprise)
- visualization of data and knowledge
- developing/using computerized provider order entry
While there has been a widely acknowledged consensus that health IT implementation often introduces radical changes to clinical work processes and workflow,1–4 it remains unclear what these changes are and how they impact actual clinical efficiency, team coordination, and ultimately quality of care and patient safety. Developing such an understanding requires rigorously conducted research that can generate compatible and comparable results to inform effective technology designs and implementation approaches.
Methods for studying changes to work processes and workflow vary widely depending on research contexts and research objectives.4 In this paper, we focus on quantitative approaches for measuring the impacts of such changes by means of quantifying clinicians' time utilization and delineating how their time is allocated to different types of clinical and non-clinical activities. Among several approaches commonly used to date, time and motion (T&M), which involves continuous and independent observation of clinicians' work, is generally regarded as the most reliable method compared to alternative methods such as work sampling and time efficiency questionnaires.5 6 In recent years, informatics research utilizing T&M has grown substantially thanks to a series of pioneering papers7 8 that led to a widely-used T&M data acquisition tool made available through the Agency for Healthcare Research and Quality.9
The proliferation of T&M studies creates a great opportunity to aggregate their results for broader learning and discovery. Unfortunately, in our prior work, we found that the design, conduct, and results reporting of existing T&M studies vary to a considerable degree, making cross-study synthesis unnecessarily difficult, or even impossible.10 This observation motivated the present paper, which aimed to identify key methodological considerations and results reporting requirements in order to improve the consistency of T&M research. Toward this goal, we first reviewed existing empirical studies that have applied the T&M method to assess the impact of health IT implementation. Then, we distilled the results into a suggested ‘checklist’ intended to help standardize the research design, conduct, and results reporting of future T&M studies, which we refer to as STAMP (Suggested Time And Motion Procedures). To the best of our knowledge, this is the first such checklist. Based on STAMP, we also analyzed the existing T&M studies identified to delineate key areas in which methodological and results reporting inconsistencies have occurred most frequently and where additional attention is most needed.
Implementation of health IT systems such as electronic health records and computerized prescriber order entry inevitably changes established clinical work processes and workflow.1–4 Some of the changes are intended, in order to reengineer existing operations to both accommodate and take full advantage of the capabilities provided by electronic systems; others, however, may be unintended due to software defects, problematic implementation processes, or policy oversights, many of which could be associated with adverse consequences on time efficiency and patient safety.11–14
Developing a better understanding of such changes is therefore important, and is often achieved by quantifying and comparing clinicians' time efficiency before and after health IT adoption (or with and without), for instance by calculating how clinicians' time is redistributed among various types of clinical and non-clinical activities.7 8 10 While T&M is best suited for this task, the results of T&M studies are highly sensitive to nuances in research design and study conduct, because: (1) human observers are the sole source of conventional T&M data; their ability to discern complex clinician activities and record them in a timely manner can therefore strongly influence the reliability and replicability of research outputs; and (2) the sample size of T&M studies is usually small due to high resource demands for conducting independent and continuous field observations, which exacerbates the potential effect of observer biases and imposes a higher requirement on subject selection and subject–observer assignment. Further, the way in which T&M data are recorded, such as how tasks are defined and the granularity of the classification of the tasks, can also be critical to whether the results will be useful in systematic reviews and meta-analyses.
As a result, we believe that it is time to identify ways to standardize T&M research so that future T&M studies can consistently produce compatible and comparable results. Seeking ways to standardize research design, conduct, and results reporting is crucial to enabling collective knowledge accumulation.15 16 This is particularly true in health sciences where novel discoveries may require generations of endeavors before the results can be applied at the bedside. Notable existing efforts toward research standardization in health sciences include STARE-HI (STAtement on the Reporting of Evaluation studies in Health Informatics),17 CONSORT (Consolidated Standards of Reporting Trials),18 QUOROM (QUality Of Reporting Of Meta-analyses),19 and STARD (STAndards for the Reporting of Diagnostic accuracy),20 among others. Many of these standards are created or supported by the U.S. National Library of Medicine and the EQUATOR Network (Enhancing the QUAlity and Transparency Of health Research), ‘an international initiative that seeks to enhance reliability and value of medical research literature by promoting transparent and accurate reporting of research studies.’21 22
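To make the notion of time redistribution concrete, the following is a minimal sketch of how the share of observed time per activity category might be computed and compared across study phases. The function name, the record layout (category, duration in seconds), and the sample values are assumptions made for illustration only; they are not data or tooling from any of the studies discussed.

```python
from collections import defaultdict


def time_allocation(records):
    """Return the share of observed time spent in each activity category.

    `records` is an iterable of (category, duration_seconds) pairs,
    a hypothetical layout for a T&M observation log.
    """
    totals = defaultdict(float)
    for category, seconds in records:
        totals[category] += seconds
    observed = sum(totals.values())
    if observed == 0:
        return {}
    return {category: seconds / observed for category, seconds in totals.items()}


# Hypothetical observation logs; the numbers are invented for illustration.
before = [
    ("direct patient care", 600),
    ("documentation", 300),
    ("direct patient care", 300),
]
after = [
    ("direct patient care", 500),
    ("documentation", 500),
]

shares_before = time_allocation(before)
shares_after = time_allocation(after)

# Change in the share of time per category (post minus pre).
shift = {
    category: shares_after.get(category, 0.0) - shares_before.get(category, 0.0)
    for category in set(shares_before) | set(shares_after)
}
```

In this invented example, documentation rises from 25% to 50% of observed time. Whether such a shift is meaningful depends on the design issues discussed in this paper, such as who was observed, by whom, and which periods were left unobserved.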
As the focus of this research is on informatics studies conducted in health and healthcare domains, we limited the literature search to PubMed/MEDLINE only. The search was also restricted to empirical studies published in English within the last 20 years (1990–2010). Further, we applied two salient inclusion/exclusion criteria in study selection, namely: (1) the quantitative method employed by the study must be T&M, and the T&M method used must conform to the following definition: ‘independent and continuous observation of clinicians' work to record the time required to perform a series of clinical or non-clinical activities.’5 6 With this definition, we excluded work sampling research, ‘time’ studies for collecting efficiency measures on isolated events (eg, medication turnaround time), and ‘motion’ studies that do not collect time-based data; and (2) the main purpose of the study must be to examine the impact of health IT implementation on clinical work processes or workflow. Therefore, empirical studies outside this context, such as those evaluating standalone medical devices or other types of interventions (eg, management or clinical protocols), were also excluded.
The PubMed/MEDLINE search was performed on July 26, 2010. Three MeSH terms, ‘time and motion studies,’ ‘health information technology,’ and ‘medical informatics applications,’ were used to assist in the search, in addition to various combinations of the following search terms: ‘time,’ ‘motion,’ ‘time utilization,’ ‘time efficiency,’ ‘informatics,’ and ‘health IT.’ The initial search resulted in a total of 204 articles. Thirty-seven were excluded immediately because they were not in English or were in the form of conference abstracts, posters, or commentaries or reviews. The two faculty authors (KZ and DAH) each screened a random half set of the abstracts of the remaining papers; 141 were determined as not meeting the inclusion criteria. Then, all three authors read the full text of the 26 papers that remained; 12 more were determined to not meet the criteria, leaving 14 papers included in the final review.7 8 23–34 Table 1 summarizes the reasons for excluding the 153 papers.
Further, we extended the study pool by analyzing the citations of the 14 papers identified, as well as by incorporating a few new publications that came to our attention after the initial literature screening. This led to the inclusion of an additional 10 papers that met the aforementioned criteria; six of them were published in 2010.10 35–43
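The screening counts reported above can be reconciled with simple arithmetic. The variable names below are ours, introduced only for this check; the numbers are those stated in the text.

```python
# Counts as reported in the text.
initial_hits = 204
excluded_on_format = 37      # non-English, abstracts, posters, commentaries, reviews
excluded_on_abstract = 141   # did not meet inclusion criteria at abstract screening
excluded_on_full_text = 12   # did not meet inclusion criteria at full-text review
added_later = 10             # citation analysis and newly noticed publications

abstracts_screened = initial_hits - excluded_on_format
full_texts_read = abstracts_screened - excluded_on_abstract
included_from_search = full_texts_read - excluded_on_full_text
excluded_by_criteria = excluded_on_abstract + excluded_on_full_text
total_reviewed = included_from_search + added_later
```

The intermediate values agree with the text: 26 full-text papers, 14 papers retained from the search, 153 papers excluded by the inclusion criteria, and 24 papers reviewed in total.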
Next, we analyzed each of these 24 papers in-depth and iteratively developed a ‘study feature’ schema to capture the commonalities of how current T&M research is conducted and reported. The schema includes (1) basic information such as empirical setting, type of health IT system studied, and data collection tool used, and (2) other less obvious study properties such as whether the same clinician subjects were observed across different phases in multistage studies, and how transition periods between consecutive tasks were recorded and analyzed. The results were distilled into the checklist for standardizing T&M research that we propose in this paper. Then, we anatomized the design, conduct, and results reporting of each of the 24 studies to identify common divergences in existing T&M research, which point to key areas where additional attention is needed.
A complete list of the papers we reviewed is provided in appendix 1 as an online supplement. Appendix 2 (also available online) presents the study feature schema. As shown in appendix 2, about two thirds of these papers were published in the past 5 years (2005–2010), over half employed a prospective before and after design, and the majority were conducted in urban academic medical centers to evaluate the implementation of homegrown or commercial electronic health record systems or computerized prescriber order entry systems (or ePrescribing systems used in ambulatory settings).
Proposed STAMP checklist
The checklist that we propose in this paper, STAMP, was developed based on the study feature schema shown in appendix 2. Comprising 29 main items organized into eight key areas, STAMP outlines a minimum set of data/information elements that researchers may consider collecting and reporting in their future T&M endeavors.
In a ninth area called ‘Ancillary Data,’ we also introduced three additional data elements, namely interruption, interaction, and location. These elements have not been rigorously studied among the papers we reviewed, but are found in related literature and can be valuable additions to enrich T&M analyses evaluating the impact of health IT implementation.44–47 First, interruptions, a pervasive phenomenon in healthcare, have a particularly detrimental impact on the quality and efficiency of clinical work. Their frequency and/or effects may diminish because of health IT adoption, or could escalate instead as a result. Second, patterns of interpersonal interaction—embodying communication, cooperation, and coordination—are an integral part of clinical work, which may also be substantially altered by the adoption of health IT. The basic elements of interpersonal interaction activities, such as with whom and via what method, can be readily recorded as part of T&M observations. Finally, clinicians' spatial movements in the patient care area may allude to how the physical layout of a hospital/clinic (eg, location of computer terminals) or characteristics of the implementation (eg, use of portable computing devices) may affect workflow efficiency. While none of the existing T&M studies had recorded information regarding the location where the activities took place, we believe that adding this information would incur very little extra cost, yet the results can greatly enhance our understanding of clinical work processes and workflow.
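As one possible illustration of how the three ancillary elements could be attached to an ordinary T&M observation, the sketch below defines a record type with optional interruption, interaction, and location fields. The class name, field names, and sample values are hypothetical; STAMP does not prescribe any particular data format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Observation:
    """One timed activity in a T&M session (hypothetical record layout)."""

    subject_id: str
    observer_id: str
    task: str
    start_s: float  # seconds since the session began
    end_s: float
    # Ancillary data: all optional, so existing logs remain valid.
    interrupted: bool = False
    interaction_with: Optional[str] = None    # e.g. "nurse", "patient"
    interaction_method: Optional[str] = None  # e.g. "face to face", "phone"
    location: Optional[str] = None            # e.g. "ward hallway"

    def __post_init__(self):
        # Reject records whose end precedes their start.
        if self.end_s < self.start_s:
            raise ValueError("end_s must not be earlier than start_s")

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


# Invented example record.
example = Observation(
    subject_id="S01",
    observer_id="O1",
    task="Talking to Patient",
    start_s=0.0,
    end_s=95.0,
    interrupted=True,
    interaction_with="patient",
    interaction_method="face to face",
    location="patient room",
)
```

Keeping the ancillary fields optional reflects the point made above that recording them should add little cost to an observation protocol that already captures task and time.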
The 29 main items of STAMP, in addition to the three extra items, are presented in table 2. In figure 1, we depict STAMP as a flow diagram and provide it in a cheat-sheet format in appendix 3 in the online supplement.
Even though not all of these data/information elements are universally applicable, we believe that a majority of them are and, where applicable, they should be collected and reported in an explicit manner. The checklist therefore may also serve as an initial step toward standardized T&M research by inviting future studies to incorporate these critical methodological considerations into their design and conduct. Reasonable deviations are possible depending on the specific research contexts and objectives; however, the justification for such deviations should be provided.
Divergences in existing T&M research
Based on STAMP, we analyzed the variability in how existing T&M studies were conducted and/or the way in which their results were reported. Table 3 presents the results. As shown in the table, observer training and methods for collecting field data are two areas where inconsistencies have occurred most often.
Below, we discuss three salient issues observed from table 3 that we deem most crucial to T&M research. We believe these issues are, in general, reasonably under researchers' control (ie, can be avoided despite practical constraints) and addressing them will not incur a dramatically increased demand for research resources.
First, while human observers play a central role in T&M research, the existing studies tend to under-report the preparedness of the human observers involved. In addition, very few studies provided information regarding whether pilot observation sessions were conducted and whether calibration was attempted to improve consistency across multiple observers (if applicable). These issues could pose severe threats to the validity of T&M research: while there are exceptions, capturing the subtleties of clinical work may exceed the ability of average research assistants, especially if they do not possess a prior clinical background and/or experience in conducting observational studies in clinical settings. For example, use of the data collection tool published by the Agency for Healthcare Research and Quality requires human observers to be able to distinguish the following pairs of activities: ‘Paper Writing: Orders’ versus ‘Paper Writing: Forms’ and ‘Procedures: Lab Test’ versus ‘Procedures: Phlebotomy.’7–9 The nuances between these activities, however, may not be obvious to research personnel who do not have adequate training and relevant backgrounds. Unfortunately, many of the existing studies employed temporary student assistants to collect field data, and more importantly, a significant proportion of the studies (16 out of 24) disclosed very little information about who the observers were and how they were trained.
Second, more than half of the studies (15 out of 24) utilized a prospective before and after design, yet five of them did not report whether the pre- and the post-stages involved the same cohort of clinician subjects. Because the sample size of T&M studies is usually small, conclusions drawn based on pre–post comparison without a within-subject control design could be very questionable. Further, most of these studies were vague with respect to whether the field observations were conducted by the same observer(s) across different study phases. If not, then this calls into question whether the pre–post differences revealed by the study might be principally introduced by observer biases rather than by the intervention studied. While it should be acknowledged that it can be practically difficult to ensure subject and observer continuity due to many uncontrollable factors (eg, trainee rotation and staff turnover), we deem it important for T&M studies to report compromises made, even if inevitable, so that the results can be more meaningfully interpreted and used in research synthesis.
Third, 14 out of the 24 studies did not provide adequate detail regarding non-observed periods, which may include lunchtime, the time window after a shift change, and when clinician subjects temporarily rounded off the study site. Given that a significant amount of clinical activity may take place during such periods (eg, clinical documentation completed in offices outside the patient care area and deferred documentation completed after the shift), not taking them into account could result in serious validity issues. For example, it may be reasonable to assume that, compared to paper-based operations, offsite or off-duty documentation activities occur more often in a digital environment because of the ‘access anywhere and anytime’ nature of electronic data. Therefore, comparison of ‘documentation time’ versus ‘time spent on direct patient care,’ before and after health IT implementation, may produce misleading results if offsite or off-duty documentation activities are not captured as part of the post-implementation observations. Similarly, more than half of the studies did not report how between-task transition periods were handled, that is, whether the ‘stopwatch’ was paused when the clinician was about to finish a task and move on to the next one. Consider the following scenario: a clinician was talking to a hospitalized patient, then walked out of the room to grab the chart folder located by the door of the ward—should the time incurred for performing this transition be counted toward the ‘Talking to Patient’ activity, or the subsequent ‘Patient Chart Reading’ activity, or a standalone ‘Walking’ activity? We deem that T&M studies should consider and report such nuances: the duration of each individual incident may be short and seemingly trivial, but the cumulative amount could be substantial and may have a significant impact on research outputs.
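The transition scenario above can be sketched in code to show how the choice of attribution rule changes the reported totals. This is a minimal illustration under stated assumptions: the `TRANSITION` marker, the three policy names, and the durations are invented, and every transition is assumed to sit between two recorded tasks.

```python
TRANSITION = "TRANSITION"  # hypothetical marker for between-task time


def total_time_by_task(events, policy):
    """Sum seconds per task, attributing transition time according to `policy`.

    `events` is a chronological list of (label, seconds) pairs.
    `policy` is "previous", "next", or "standalone" (counted as "Walking").
    """
    if policy not in ("previous", "next", "standalone"):
        raise ValueError(f"unknown policy: {policy}")
    totals = {}
    for i, (label, seconds) in enumerate(events):
        target = label
        if label == TRANSITION:
            # A transition must have a recorded task on both sides.
            if i == 0 or i == len(events) - 1 or events[i - 1][0] == TRANSITION:
                raise ValueError("transition must sit between two tasks")
            if policy == "standalone":
                target = "Walking"
            elif policy == "previous":
                target = events[i - 1][0]
            else:
                target = events[i + 1][0]
        totals[target] = totals.get(target, 0) + seconds
    return totals


# Invented durations for the scenario described in the text.
events = [
    ("Talking to Patient", 120),
    (TRANSITION, 15),
    ("Patient Chart Reading", 60),
]

by_previous = total_time_by_task(events, "previous")
by_next = total_time_by_task(events, "next")
by_standalone = total_time_by_task(events, "standalone")
```

With these invented numbers, ‘Talking to Patient’ totals 135 seconds under the first rule and 120 seconds under the other two, which is the kind of difference that accumulates over a full observation session.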
In addition to the issues mentioned above, our review of the literature also raised two other concerns. First, as shown in table 1, which reports common reasons for paper exclusion, the term ‘time and motion’ may have been overused as it appears frequently in studies that did not actually use the T&M method according to the prevalent definition5 6; examples include work sampling studies, studies on turnaround time, and questionnaire surveys. This issue could create chaos in knowledge accumulation and cause unnecessary difficulties for researchers conducting systematic reviews and meta-analyses. Second, there does not seem to be a standard way to train T&M observers or to calibrate inter-observer disagreements (if applicable). This fact may exacerbate potential observer biases and undermine the validity of cross-study research synthesis. Developing a standard observer training set, possibly in the form of video recordings of typical clinical sessions annotated by T&M experts, would therefore be highly valuable.
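One way inter-observer calibration could be quantified is with a chance-corrected agreement statistic. The sketch below uses Cohen's kappa, which is our choice for illustration and is not a method named by the studies reviewed. It assumes that two observers have coded the same sequence of activity segments; the labels are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two observers who coded the same segments."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both observers must code the same non-empty segments")
    n = len(labels_a)
    # Proportion of segments on which the two observers agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each observer's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1:
        # Both observers used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented codings of four segments by two observers.
observer_1 = ["Orders", "Forms", "Orders", "Forms"]
observer_2 = ["Orders", "Forms", "Forms", "Forms"]

kappa = cohens_kappa(observer_1, observer_2)
```

Here the observers agree on three of four segments, but half of that agreement would be expected by chance, giving a kappa of 0.5. Agreement on labels alone does not cover disagreement about when a task starts and ends, which would need a separate check.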
The literature review results also suggest several general methodological and results reporting issues that are not necessarily unique to T&M research. Because many informatics evaluation studies feature a prospective before and after design, the right time to embark on post-intervention research activities can be a crucial decision, which should ideally occur after the ‘burn-in’ effect is diminished so that stable and sustainable user behaviors can be observed. However, in the health informatics research community, there does not seem to be a common consensus regarding the definition of ‘intervention maturity,’ or readily available methods that can be used to determine if intervention maturity has been reached. Among the pre–post T&M studies that we reviewed, this time point was by and large arbitrarily determined, and the range varied widely from immediately post-implementation to 3 months later, 6 months later, etc. Further, one-fifth of these studies did not report at all when their post-implementation data collection activities started, and very few studies explicitly mentioned whether all research subjects had been equally exposed to the intervention since the day it was introduced into the empirical environment.
Another area that deserves closer attention is the identification of pertinent measures. Individual characteristics, for example, can be influential factors moderating clinicians' acceptance of health IT systems.50 However, the decision about which individual characteristics to collect and analyze seems to be arbitrary: gender, age, level of training, medical specialty, prior experiences, etc, frequently appeared in the studies we reviewed but in assorted combinations, and very few studies provided justifications as to why some of these variables were considered while others were not. Further, additional research is needed on the selection of appropriate measures to quantify clinical work processes and workflow.4 Aggregated average amount of time spent on performing certain tasks (eg, examining patients) or certain task groups (eg, direct patient care) is the most commonly used measure to date. As we demonstrated in our prior work, this measure may not be adequate to capture the level of granularity needed for revealing the true impact of health IT implementation on workflow (eg, workflow fragmentation and task switching frequency).10 Further, researchers have demonstrated that nuances in how measures such as time on task are calculated, in the context of interruptions, can have a significant impact on research results and conclusions.47 Finally, analytical methods (eg, statistical tests and regression analyses) are another area in which a greater consensus would be very beneficial. Among the T&M studies we reviewed, analytical methods varied considerably even for similar data collected under similar conditions.
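The limitation of aggregated time per task can be illustrated with a small sketch. The measures below (number of task switches, switches per hour, and mean length of a continuous task episode) are simplified definitions chosen for illustration; they are not necessarily the definitions used in the prior work cited above, and the two sessions are invented.

```python
def fragmentation_measures(sequence):
    """Summarize how fragmented a chronological (task, seconds) sequence is."""
    # Merge consecutive records of the same task into one continuous episode.
    episodes = []
    for task, seconds in sequence:
        if episodes and episodes[-1][0] == task:
            episodes[-1] = (task, episodes[-1][1] + seconds)
        else:
            episodes.append((task, seconds))
    total = sum(seconds for _, seconds in episodes)
    switches = max(len(episodes) - 1, 0)
    return {
        "switches": switches,
        "switches_per_hour": switches / (total / 3600) if total else 0.0,
        "mean_episode_s": total / len(episodes) if episodes else 0.0,
    }


# Two invented one-hour sessions with identical time per task:
# 1800 s of documentation and 1800 s of direct patient care each.
blocked = [("documentation", 1800), ("direct patient care", 1800)]
fragmented = [("documentation", 600), ("direct patient care", 600)] * 3

blocked_summary = fragmentation_measures(blocked)
fragmented_summary = fragmentation_measures(fragmented)
```

Both sessions would look the same in a table of average time per task, yet the second involves five task switches instead of one and episodes a third as long.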
It should be acknowledged that the suggested checklist presented in this paper, STAMP, was developed through reviewing the empirical studies that we were able to identify. While we believe its data/information elements adequately cover most of the critical aspects of T&M research, they may not be comprehensive given that our literature search may not have included all relevant studies, and the studies we included may not have contained all possible T&M research features. Further, the checklist only reflects the current best-known knowledge of how T&M studies should be conducted. Therefore, it needs to be constantly updated in order to accommodate novel field observation techniques and technologies, such as automated activity capture and recognition using radio-frequency identification tags and video/audio recording devices.44 51 In the future, we will evaluate the adoption of STAMP by the research community, keep a watch on emerging methods and thus new requirements for results reporting, and accordingly adjust and update STAMP's structure and the data/information elements it encompasses.
We reviewed existing empirical studies that have applied the T&M approach to evaluate the impact of health IT implementation on clinical work processes and workflow. Based on the results, we developed a suggested ‘checklist’ intended to help standardize the design, conduct, and results reporting of future T&M studies. We believe that the resulting checklist, STAMP (Suggested Time And Motion Procedures), may contribute to improving the methodological and results reporting consistency of T&M research, so that meaningful insights can be derived from across-study synthesis and health informatics, as a field, will be able to accumulate knowledge from these studies.
Funding This project was supported in part by Grant # UL1RR024986 received from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH) and NIH Roadmap for Medical Research.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.