Handheld vs. Laptop Computers for Electronic Data Collection in Clinical Research: A Crossover Randomized Trial
- Guy Haller, MD, MSc, PhDa,b,f,
- Dagmar M Haller, MD, PhDd,g,
- Delphine S Courvoisier, MSc, PhDb,
- Christian Lovis, MD, MPHc,e
- aDepartment of Anesthesiology, Geneva University Hospitals, University of Geneva, Geneva, Switzerland
- bDivision of Clinical Epidemiology, Geneva University Hospitals, University of Geneva, Geneva, Switzerland
- cDivision of Medical Informatics, Unit of Clinical Informatics, Geneva University Hospitals, University of Geneva, Geneva, Switzerland
- dDepartment of Community Medicine and Primary Care, Geneva University Hospitals-University of Geneva Faculty of Medicine, Geneva, Switzerland
- eUniversity of Geneva, Geneva, Switzerland
- fDepartment of Epidemiology and Preventive Medicine, Monash University, Melbourne, Australia
- gDepartment of General Practice, the University of Melbourne, Australia
- Correspondence: Guy Haller, MD, MSc, PhD, Unit of Clinical Epidemiology, Department of Anesthesiology, Geneva University Hospitals, 24 Rue Micheli-du-Crest, 1211 Geneva 14, Switzerland.
- Received 20 October 2008
- Accepted 2 June 2009
Objective To compare users' speed, number of entry errors and satisfaction in using two current devices for electronic data collection in clinical research: handheld and laptop computers.
Design The authors performed a randomized cross-over trial using 160 different paper-based questionnaires, representing altogether 45,440 variables. Four data coders were instructed to record the content of these questionnaires either on a laptop or on a handheld computer, according to a random, predefined and equally balanced sequence. Instructions on the kind of device to be used were provided to data coders in individual sealed and opaque envelopes. Study conditions were controlled and the data entry process was performed in a quiet environment.
Measurements The authors compared the duration of the data recording process, the number of errors and users' satisfaction with the two devices. The authors divided errors into two separate categories, typing and missing data errors. The original paper-based questionnaire was used as a gold-standard.
Results The overall duration of the recording process was significantly reduced (2.0 versus 3.3 min) when data were recorded on the laptop computer (p < 0.001). Data accuracy also improved. There were 5.8 typing errors per 1,000 entries with the laptop compared to 8.4 per 1,000 with the handheld computer (p < 0.001). The difference was even more important for missing data which decreased from 22.8 to 2.9 per 1,000 entries when a laptop was used (p < 0.001). Users found the laptop easier, faster and more satisfying to use than the handheld computer.
Conclusions Despite the increasing use of handheld computers for electronic data collection in clinical research, these devices should be used with caution. They double the duration of the data entry process and significantly increase the risk of typing errors and missing data. This may become a particularly crucial issue in studies where these devices are provided to patients or healthcare workers who are unfamiliar with computer technologies, for self-reporting or research data collection.
Large amounts of data are collected, stored and processed in clinical research. With computer technologies, this information can be captured directly in an electronic format, increasingly replacing paper-based data records.1 2 Electronic data offer the advantages of improved data quality and consistency through the use of automated validation procedures and data range checks. They can integrate different kinds of formats (images, texts, physiological signals), which can easily be transferred over long distances through wireless networks. Recent advances in hardware and software technologies allow such data to be collected on increasingly smaller portable devices such as laptops and handheld computers. This is particularly convenient for studies performed at the patient's bedside, or in practice or home environments. It is currently unknown which of the two devices is better suited to electronic data collection in clinical research. This cross-over randomized controlled trial assesses users' accuracy, efficacy and satisfaction in using the two devices.
Handheld computing devices such as personal digital assistants (PDAs) and smartphones are used by more than 50% of physicians in OECD countries3 4 and by 75% of United States residents.5 Their extended functionalities associated with easy touch input on display screens or miniature keyboards make them very popular in busy clinical and academic environments. Handheld computers are used to access medical literature, display electronic pharmacopeias, track patients, or prescribe drugs.6 In classrooms, they are used to download lecture materials, images or multimedia files, and as polling tools.7 8 9 10 11 As researchers are progressively turning to electronic data collection methods, handhelds are increasingly used in clinical research to record and process data. They are particularly convenient for field studies and self-reporting data collection processes. Gupta et al. report the use of handheld computers to perform a survey on more than 99,598 tobacco users in Mumbai, India.12 The device was found to be a particularly convenient tool to collect data directly in the study field of a densely populated city. Lal et al. used handheld computers for data collection in burn patients.13 Handheld computers were found to be 23% faster and 58% more accurate than paper and pencil recording. Their multiple functionalities associated with user-friendly touch screen technologies make them a particularly attractive alternative to paper-based diaries or questionnaires for patients' self-reporting use, particularly among children and young adults.14 15 16 The electronic format of handheld computers allows the capture and recording not only of text data but also of virtual electrocardiograms, electrochemical data and photographs. These can be encrypted and transmitted to a central database management system through a wireless connection to a local area network (LAN) or the Internet.17 18 19 Since 2000, more than 40,000 handhelds have been sold in 48 countries for use in clinical trials.17
Data quality is a crucial factor in clinical research. An increasing number of treatments, diagnostic strategies, or clinical guidelines are based on evidence, the best of which comes from randomized trials.20 Time, with its financial correlates, is also increasingly of the essence in such trials. If the collected data are inaccurate or missing, conclusions will be biased and the scientific evidence subsequently misleading. There are many examples of publication retractions due to data management errors.21 Consequences can be serious, as even retracted articles are still cited and misleading results still used to guide clinical practice.22
Despite the above-cited advantages, some authors suggest that the use of handhelds could negatively impact data quality. The small screen size along with the peculiarities of text entry on handhelds (character recognition or on-screen keyboards) could make the data entry process slower and more prone to errors than with other electronic data collection tools such as desktop or laptop computers.23 24 As laptops are becoming increasingly cheap and handy, these devices represent an alternative to handheld computers for electronic data collection in research. Laptops are portable devices, usable in a natural environment, which also have wireless network facilities allowing data to be transferred quickly and efficiently over long distances.
Research Question and Objectives
It is currently unknown which of the two portable devices (laptop or handheld computer) is faster, more accurate, and preferred by users. The purpose of this randomized cross-over trial was to compare users' speed, number of entry errors, and satisfaction in using the two different devices.
Following University Hospitals Human Research and Ethics Committee's exemption, we recruited four study volunteers through web advertisements at the Hospital and University of Geneva. Participants needed to have at least 1 year of regular data recording and typing experience with a laptop or desktop computer. They also needed to be reasonably familiar with handheld computers and have a good general knowledge of information technologies. We excluded participants aged over 55 years or who had uncorrected visual impairments.
Laptop and Handheld Interface Design
We used a common commercially available laptop, the Dell® Latitude 860 (Dell, Inc). The database interface we used was the program EpiData (version 2.1, EpiData Association, Odense, DK). This program is widely used as it is freely available on the Internet and offers all the usual features of commercial databases (data entry forms, input masks, validation rules, automatic filters) to ensure data consistency and completeness.
For the handheld computer, we chose the Palm® Tungsten E2 (PalmSource, Inc, Sunnyvale, CA), also widely available on the market. Because there is no version of EpiData for handheld computers (Palm OS or Pocket PC), we used HanDBase professional® (version 3.0, DDH-softwares, Inc, Wellington, FL), a commercial database package for Palm Pilot handhelds. This system is characterized by its flexibility and interoperability. Data collected on a handheld computer can be synchronized to a desktop computer and exported as CSV (comma-separated values), Microsoft Access, or Stata tables. The HanDBase professional® package also allows the implementation of a number of filters, pull-down menus and authorized values. Forms with buttons, checkboxes, pop-up lists and automated date and number entry can be used to enter data.
For both devices, we developed a form that was graphically as close as possible to the layout of the written questionnaire (see Figures 1 and 2). For the PDA, we designed low-level dialogue boxes to minimize the risk of text overload, a critical issue for 3-inch PDA screens. We used tabbing sequences as much as possible, with options set within windows integrated within dialogue boxes. We also standardized controls and positioned buttons in a logical sequence, as close as possible to the initial written questionnaire. This contributed to making the handheld a flexible and user-friendly device.
Prior to the study, the overall data collection procedure was pilot tested by one of the coauthors (DH) on 126 paper-based questionnaires, randomly allocated to be recorded on the Palm® Tungsten E2 handheld or on the Dell® Latitude 860 laptop. The handheld data entry form and the computer-user screen interface were then finalized, taking into account minor problems identified in the pilot. The pilot study also allowed the measurement of errors for future sample size calculation and the estimation of the training required for users to become familiar with the data entry process on both devices.
We used a standard research paper-based questionnaire which had been developed for a study of young people attending general practices in Victoria (Australia).25 The questionnaire contained three different sections representing altogether 71 different fields. These included questions on sociodemographic data, past medical history, Kessler's scale of emotional distress (K10) and the SF12 quality of life questionnaire.26 With the exception of sociodemographic questions, most answers were rated on 5-point Likert scales or 10-point visual analogue scales. A code number was printed next to each answer option on the paper-based form. The same number was used to code answers in the electronic format.
The study took place between October 2007 and February 2008. Participants first attended a 1-hour information session in which the purpose of the study and the overall procedure were explained. This was followed by a 2-hour training session where participants were able to become familiar with both data entry forms, specific characteristics of the computerized devices and study requirements. During this session each participant was asked to record 5 paper-based questionnaires representing 355 fields on each device. This had been found in a pilot study to be the minimum number of questionnaires required for participants to become equally familiar and confident with the two devices tested. This had been established by measuring the duration of the data entry process for each questionnaire. When this duration reached a steady state (2.3 min for the laptop and 3.1 min for the PDA after 2 × 5 questionnaires recorded by DH), it was considered that the top of the learning curve had been reached.
Each participant then received 160 paper-based questionnaires to be recorded in an electronic format (11,360 fields per participant, 45,440 fields altogether for the four coders). Written instructions about the overall study procedure were also provided. Participants were asked to record all the fields of these questionnaires either on a laptop or on a handheld computer, according to a random and equally balanced data recording sequence. The random recording sequence was generated by computerized block randomization. Instructions on the kind of device (handheld or laptop) to be used for each paper-based questionnaire were provided to participants in individual sealed and opaque envelopes. These were opened by the data coder just before the data entry of the questionnaire. Participants were instructed to perform the study in a quiet location (at home or at work), to avoid recording all data during the same session and to rigorously keep to the data entry order defined by the envelopes. The study flowchart is provided in Figure 3.
At the end of each questionnaire recording process, participants were asked to complete a short form to indicate the time of the day, the duration of data entry and the position of this entry in the sequence of recordings of the day's data entry session. Participants were also required to describe noise, light conditions, and interruptions during the data entry process using a self-administered 5-level Likert scale (very poor to excellent). Each participant also received an electronic stopwatch to measure recording duration. They were instructed to start the stopwatch just before activating the “NEW RECORD” button and to stop it immediately after having clicked on the “SAVE RECORD/OK” button. At the end of the study we asked participants to complete an additional short form to assess acceptability and satisfaction of using both devices (handheld and laptop).
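As an illustration of computerized block randomization, the following minimal Python sketch produces an equally balanced device sequence for 160 questionnaires. The block size of 4, the seed, and the function name `block_randomize` are assumptions made for this example; the software and block size actually used in the study are not stated.

```python
import random


def block_randomize(n_items, block_size=4, seed=0):
    """Return a device allocation that is balanced within every block.

    block_size and seed are illustrative assumptions, not study parameters.
    """
    if block_size % 2 or n_items % block_size:
        raise ValueError("block_size must be even and divide n_items")
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_items // block_size):
        # Each block holds the same number of laptop and handheld slots.
        block = ["laptop", "handheld"] * (block_size // 2)
        rng.shuffle(block)
        sequence.extend(block)
    return sequence


# One allocation per questionnaire, e.g. to be sealed in numbered envelopes.
allocation = block_randomize(160)
```

Because every block is balanced, the full sequence contains 80 laptop and 80 handheld entries, and the two devices never drift far apart at any point in the sequence.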
Accuracy of the two devices was assessed by comparing each item recorded on HanDBase® and EpiData electronic databases with the original item from the paper-based questionnaire. We made a distinction between two types of errors: typing and missing data errors. Typing errors were defined as data recorded in the electronic database that did not correspond to information provided on the original handwritten questionnaire. Missing data were defined as missing values, including in fields where the coder should have used a specific code for the value “missing” (in this study the number 9).
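The two error definitions can be sketched as a simple field-by-field comparison. This is a hypothetical illustration only: the data layout (dictionaries keyed by field name, `None` for a blank field) and the function name `classify_errors` are assumptions, and the study's assessment was carried out by human assessors, not by this code.

```python
MISSING_CODE = 9  # code for the value "missing", as stated in the text


def classify_errors(gold, entered):
    """Count typing and missing-data errors for one questionnaire.

    gold:    field -> value on the paper form (MISSING_CODE if unanswered)
    entered: field -> value in the electronic record, None if left blank
    """
    typing = missing = 0
    for field, expected in gold.items():
        value = entered.get(field)
        if value is None:
            # Blank field, including cases where code 9 should have been used.
            missing += 1
        elif value != expected:
            # Recorded value does not match the paper questionnaire.
            typing += 1
    return typing, missing


# Invented example: q2 is mistyped, q3 was left blank instead of coded 9.
gold = {"q1": 3, "q2": 5, "q3": MISSING_CODE, "q4": 1}
entered = {"q1": 3, "q2": 4, "q3": None, "q4": 1}
typing_errors, missing_errors = classify_errors(gold, entered)
```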
Efficacy was measured by determining the overall duration of the data entry process on both devices. Participants were asked to start the stopwatch at the opening of a new patient form on the HanDBase® and EpiData databases and to stop time measurement when they ticked or pressed on “save full patient record”, at the end of the paper-based questionnaire data entry process.
Users' satisfaction was measured on a 12-item form designed to assess participants' preferences between the two devices. The survey explored three dimensions of users' satisfaction and preferences: perceived presentation/use, learning, and handiness. A seven-point Likert scale was used to rate participants' answers.
Possible confounding factors such as coders' characteristics, time of the day, number of previous questionnaires entered within the session, position of the entry in the sequence of recordings within a session, available light, interruptions and noise were also measured.
Descriptive summaries of confounding factors (i.e., conditions of data entry) included means (± SD) or medians with ranges, depending on distribution, for continuous variables. They were compared by the paired Student's t test or, if not normally distributed, the Wilcoxon signed-rank test. For categorical variables we used frequencies and proportions.
Possible associations between duration of data entry for each paper-based questionnaire and the device used (handheld or laptop), adjusted for conditions of data entry, were examined using multilevel linear models (MLM). To obtain a normal distribution of the dependent variable, we used the log of duration of data entry. Questionnaires were nested within periods of data entry, themselves nested within coders.
Number of errors and number of missing entries were examined using generalized linear multilevel models (GLMM). Number of errors and number of missing entries both have a zero-inflated Poisson distribution, i.e., they have too many zeros (more than half the questionnaires were entered without any errors or missing data) but then follow a classical Poisson distribution. Hence, we conducted the analysis in two steps. A first analysis investigated the influence of the independent variables on the occurrence of at least one error (0 vs. ≥ 1 errors), specifying a logit link for the dependent variable. A second analysis investigated, among data records that had at least one error, the differences in number of errors due to the independent variables, specifying a Poisson distribution of the dependent variable. The independent variables were the device used and the confounding factors (i.e., noise, light, interruptions, number of paper-based questionnaires recorded during the same round, position of the questionnaire in the sequence). A p value < 0.05 was considered statistically significant. We performed all analyses using the statistical software R, version 2.7.2, with the nlme and glmmML packages.27
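The logic of the two-step analysis can be illustrated descriptively: first the proportion of records with at least one error, then the number of errors among records that have any. The Python sketch below is not the multilevel model fitted in R; it only shows the decomposition, with invented error counts and a hypothetical helper `two_part_summary`.

```python
def two_part_summary(error_counts):
    """Split per-questionnaire error counts into the two analysis steps.

    Returns (proportion of records with >= 1 error,
             mean number of errors among records with >= 1 error).
    """
    positive = [c for c in error_counts if c > 0]
    prop_any = len(positive) / len(error_counts)
    mean_if_any = sum(positive) / len(positive) if positive else 0.0
    return prop_any, mean_if_any


# Invented counts, for illustration only (not study data).
handheld_counts = [0, 2, 1, 0, 3, 0, 1, 0]
laptop_counts = [0, 0, 0, 4, 0, 0, 0, 0]

handheld_summary = two_part_summary(handheld_counts)
laptop_summary = two_part_summary(laptop_counts)
```

In this invented example the laptop has fewer records with any error (step 1) but more errors per affected record (step 2), which is the kind of pattern a single Poisson model on the raw counts would blur.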
Because the accuracy of data entry on handheld versus laptop computers had not previously been assessed, we performed a pilot study to inform the sample size calculation. One data enterer recorded 63 questionnaires (4,473 fields) on a laptop and 63 questionnaires (4,473 fields) on a PDA. The mean difference in recording errors between the two devices was 0.003 and its standard deviation 0.018. A total of 567 questionnaires (40,257 field entries) was therefore found to be necessary in this two-intervention crossover study to have a probability of 80% that the study would detect a treatment difference of 0.003 U (± 0.018) at a two-sided significance level of 5%. To allow for possible dropouts or missing data, sample size was increased by at least 10%. The final sample size was therefore set at 640 questionnaires, or 160 (11,360 field entries) for each of the four data coders. Calculations were performed on the PASS software (PASS/NCSS 2000, NCSS Corporation, Kaysville, UT).
The four participants were young adults (age range: 18–30 years), and two were women. All had at least 1 year of formal training in computing technologies and regular practice in computer use and typing. All were familiar with a handheld computer but only one participant was a regular user.
Data were more frequently recorded during night-time (20:00–08:00) than during daytime (08:00–20:00). However, this was the case for both the handheld and laptop data entry modes and there was no significant difference between the two devices. There was also no difference between the two devices regarding the number of data entry sessions (periods) needed by coders to record all the data. The level of interruptions, the lighting, and noise conditions during the data entry process were also similar between the two groups. These results are summarized in Table 1.
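The order of magnitude of this calculation can be checked with the standard normal-approximation formula for a paired comparison, n = ((z₁₋α/₂ + z₁₋β) · SD / Δ)². The sketch below is an approximation under that assumption; the PASS software may use a different (for example t-based) method, so small differences from the reported figure are expected. The function name `paired_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

FIELDS_PER_QUESTIONNAIRE = 71  # from the questionnaire description


def paired_sample_size(mean_diff, sd_diff, alpha=0.05, power=0.80):
    """Number of pairs needed to detect mean_diff (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_beta) * sd_diff / mean_diff) ** 2)


# Pilot estimates reported in the text.
pairs = paired_sample_size(mean_diff=0.003, sd_diff=0.018)
questionnaires = 2 * pairs  # one laptop and one handheld record per pair

# Field counts implied by the final sample size in the text.
fields_per_coder = 160 * FIELDS_PER_QUESTIONNAIRE
fields_total = 640 * FIELDS_PER_QUESTIONNAIRE
```

Under these assumptions the approximation gives 283 pairs, that is 566 questionnaires, close to the 567 reported. The field counts follow from 71 fields per questionnaire: 11,360 per coder and 45,440 in total.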
The mean data entry duration for one questionnaire was 2.0 (SD 1.2) minutes on the laptop and 3.3 (SD 1.9) minutes on the handheld (p < 0.001). Differences in data entry duration were significant both for individual coders and for all coders together (see Figure 4).
There was also a significant difference between the two systems in relation to typing errors and missing data errors. The number of typing errors in data entry was 8.4 per 1,000 entries on the handheld and 5.8 per 1,000 entries on the laptop. The proportion of questionnaires recorded with one or more typing errors was 38.8% for the handheld and 21.3% for the laptop computer (p < 0.001). However, when one error had occurred on the laptop, it was followed by a larger number of subsequent errors, with 27.1 per 1,000 versus 21.7 errors per 1,000 entries on the handheld (p < 0.001). Thus, while the laptop favored the occurrence of zero errors, when one typing error had occurred, it was usually followed by an increased number of subsequent errors as compared to data entry on the handheld.
There was a significant difference between the two systems regarding missing data errors: 22.8 per 1,000 entries on the handheld and 2.9 per 1,000 entries on the laptop. The proportion of questionnaires with missing data errors was 65.0% for the handheld and 14.4% for the laptop (p < 0.001). Among the questionnaires which contained at least one missing data error, the number of subsequent missing data errors was 35.1 versus 20.5 per 1,000 entries for the handheld and the laptop respectively (p < 0.001). Thus, missing data errors were more common on the handheld than on the laptop. These results are summarized in Table 2.
Participants expressed higher satisfaction in using the laptop than the handheld. They found the laptop to be easier, faster and friendlier in its use than the handheld (p < 0.001). These results are reported in Table 3.
This study provides good support for the benefits of laptop over handheld computers for electronic data recording. The overall duration of the recording process was significantly reduced (2.0 versus 3.3 min) when data were recorded on the laptop computer. The overall data accuracy also improved when the laptop was used. It reduced typing errors from 8.4 to 5.8 and missing data from 22.8 to 2.9 per 1,000 entries. However, when one error occurred on the laptop, it led to a greater number of additional errors on the next two to twelve following fields. This was most often the case in the central section of the paper-based questionnaire where participants had to record electronically thirteen closely related fields. If the answer to the first or second field was missed, all the following fields were wrongly coded. This was probably due to participants mechanically recording answers with the keyboard without checking on the screen whether they matched the right field. All answers were thus shifted from one field to the next. This could not happen with the handheld computer as data could not be recorded without looking at the screen.
Little is known about the comparative performances of the two devices and no randomized controlled trial to which our study findings could be compared has previously been performed. Most available controlled studies analyzing the benefits of handheld computers used paper records in their control group.15 28 Some authors, however, compared the specific performances of a number of currently available handheld computers. Wright et al.,29 for example, analyzed the accuracy of data recording on four different pocket PCs, comparing text entry with a touchscreen keyboard and an external keyboard. They included participants over 55 years and used early devices such as the Apple Newton® and the Hewlett Packard 360LX®. They found that touchscreen keyboards led to more errors and were more difficult to use than external traditional keyboards. There are several possible reasons for this. First, the authors included older users who were probably less familiar with touchscreen technology and may have had reading difficulties related to the small size of the characters. Secondly, the study assessed the accuracy of full text recording. Most of the time, handheld devices are used to record short information or numbers (codes). Thus, the findings of Wright et al.29 may not truly be generalizable. In addition, these authors did not assess other features of handhelds such as handwriting recognition or the Graffiti alphabet. These features currently represent the primary means of interaction between a user and this type of machine in close imitation of the traditional pen and paper interface, potentially limiting the number of typing errors.30 To make the best use of these features of handheld devices we therefore used a more recent handheld device in our study, the Palm® Tungsten E2. To record data, study participants could use the touchscreen keyboard, the pull-down menus of the HanDBase® database or the Graffiti writing recognition system.
To avoid additional and nonspecific variations between the two devices related to user-interface design, we chose to develop a form that was graphically as close as possible to the layout of the original paper-based questionnaire. We tested and adapted the original layout following a pilot study. We recruited study participants with good knowledge of computing technology and data entry skills. All were 30 years old or younger. Despite this, the handheld computer did not compare favorably to the laptop. Data entry on the handheld was slower, produced more errors and was less satisfying for users.
This may be explained in several ways. First, although we developed and pretested a user-friendly graphical interface on the handheld, the stylus–handheld interaction, be it touchscreen keyboard, pull-down menus, or Graffiti writing recognition, is equivalent to single-finger typing. This cannot be compared to traditional laptop keyboards where both hands and the QWERTY layout are used, a combination widely recognized to increase typing speed.31 32 Secondly, the EpiData electronic database allowed users to go automatically from one field to another by using the “enter” key. Thus data could be easily recorded on the laptop without having to look at both the handwritten questionnaire and the computer screen to enter the next field. This may have increased users' satisfaction and data recording speed. Finally, the size of both devices' screens may have had an impact on the overall performance of the systems tested. The handheld computer screen diagonal is 3 inches, while that of the laptop is 14 inches. To represent the 71 different fields of the original questionnaire in a user-friendly manner on the handheld computer, we had to use several pages. Users could change pages using a pencil command at the bottom of the page. Despite this graphical organization, data entry fields were close to each other, increasing the likelihood for data enterers of missing a field. This may explain why there were 8 times more missing data errors on the handheld than on the laptop computer.
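The "8 times" figure is simple arithmetic on the reported error rates. The short Python sketch below recomputes the handheld-to-laptop ratios from the per-1,000-entry rates given in the Results; the dictionary layout and variable names are illustrative only.

```python
# Error rates per 1,000 entries, as reported in the Results.
rates_per_1000 = {
    "typing": {"handheld": 8.4, "laptop": 5.8},
    "missing": {"handheld": 22.8, "laptop": 2.9},
}

# Ratio of handheld to laptop error rates, rounded to one decimal place.
rate_ratios = {
    kind: round(rates["handheld"] / rates["laptop"], 1)
    for kind, rates in rates_per_1000.items()
}
```

This gives a ratio of about 1.4 for typing errors and about 7.9 for missing data errors, the latter corresponding to the roughly eightfold difference mentioned above.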
There are some limitations to the current study. First, the researchers had knowledge of the study hypothesis and purpose. This may have caused a detection bias towards increased error detection according to the study hypothesis. To minimize this bias, the entire error assessment process was standardized and assessors were blinded to group allocation. The first assessor limited his activity to reading the original value of each field recorded on the handwritten questionnaires while the second assessor checked the corresponding value recorded on the two electronic devices tested. When it was unclear whether a mismatch had to be counted as an error or missing information, the case was discussed between the two assessors until a consensus was reached. To complete the error checking process, we also compared the electronic handheld and laptop records with each other. Any mismatch between the two was reanalyzed and a comparison with the paper-based gold-standard questionnaire performed to identify which of the laptop or handheld records contained the error.
The second type of limitation relates to participants' computer skills. Although all had significant experience with laptop computers and were familiar with handheld computers, only one was a regular user of a Palm® device. This may have biased the results towards better performance with the laptop. However, to minimize this bias, all participants were trained in the use of the handheld computer before the beginning of the study. We also adjusted statistical comparison between the two groups in the GLMM for coders' characteristics.
The third type of limitation of this study relates to the use of only four data coders. Although the design of the study maximized power and allowed us to show significant differences between the two devices, study results may not be entirely generalizable. Furthermore, study participants were highly motivated and had significant experience with computers. Many studies, such as the one by Gupta et al.,12 assess the performance of devices used by non-IT experts, often in natural environments. In our study we purposely avoided natural conditions (i.e., hospitals, medical practices, households) to minimize the confounding effects of fatigue, interruptions, noise, or light conditions, which can impact data coders' performance. Although this reinforced internal validity, it may also have affected the generalizability of our findings. Many clinical research projects based on interviews or questionnaires are performed in ambulatory settings where data recording conditions may be much more chaotic than the ones in our study.
Finally we measured noise, light conditions and interruptions during the data entry process using self-reported perceptions rated on a 5-level Likert scale. This may have affected measurement precision. Future studies should consider the use of direct observations for the measurement of these confounding factors.
Despite these limitations, this is the first study assessing accuracy, efficacy and users' satisfaction of handheld computers compared to laptop computers for electronic data recording in clinical research. Although handheld computers offer the advantage of portability and flexibility compared to laptops, this comes at the cost of more cumbersome and less accurate data processing. It is unclear whether new developments such as haptic feedback in the touchscreen mode or voice-based data entry will improve data processing. Regardless of the model and characteristics of the PDA tested, their restricted size remains a major weakness during the data entry process.7 At a time when governments, health-care organizations, and insurance companies focus on efficacy, the use of handheld technology has to be justified by solid evidence that these devices actually improve the overall quality of medical practice, teaching and more specifically research. Innovations in hardware and software technologies, and more particularly the development of tablet personal computers and ultralight laptops with foldable screens, will increasingly challenge the use of handheld computers in clinical research in the future.33
Despite the promises offered by the portability and plasticity of handheld computers, these devices, when compared to traditional laptops, are slower and less accurate for data recording. This study clearly shows the limitations of using such devices for collecting data in clinical research. It opens new perspectives for the development and use of different devices such as small laptops or tablet PCs for collecting data in clinical research in the future.
The funding required for this project was provided by Geneva University Hospitals. The authors would like to acknowledge the support received for this project. The authors are grateful to Ms Jacqueline Haller, sociologist, who contributed to the assessment of data recording errors. The authors acknowledge the excellent work of the four data coders who participated with enthusiasm in this study: Mr Christopher Chung, Mr Julien Gobeil, Ms Sandra Papillon and Ms Chantal Plomb.