J Am Med Inform Assoc 15:150-157 doi:10.1197/jamia.M2544
  • Focus on Media-Based Biosurveillance
  • Model Formulation

HealthMap: Global Infectious Disease Monitoring through Automated Classification and Visualization of Internet Media Reports

  1. Clark C Freifeld (a),
  2. Kenneth D Mandl (a, b, c),
  3. Ben Y Reis (a, b, c),
  4. John S Brownstein (a, b, c)
  1. (a) Children’s Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, Boston, MA
  2. (b) Division of Emergency Medicine, Children’s Hospital Boston, Boston, MA
  3. (c) Department of Pediatrics, Harvard Medical School, Boston, MA
  1. Correspondence: Clark C. Freifeld, Children’s Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, 300 Longwood Ave., Boston, MA 02115 (e-mail: <clark.freifeld{at}>)
  • Received 29 June 2007
  • Accepted 29 November 2007


Objective Unstructured electronic information sources, such as news reports, are proving to be valuable inputs for public health surveillance. However, staying abreast of current disease outbreaks requires scouring a continually growing number of disparate news sources and alert services, resulting in information overload. Our objective is to address this challenge through the HealthMap Web application, an automated system for querying, filtering, integrating and visualizing unstructured reports on disease outbreaks.

Design This report describes the design principles, software architecture and implementation of HealthMap and discusses key challenges and future plans.

Measurements We describe the process by which HealthMap collects and integrates outbreak data from a variety of sources, including news media (e.g., Google News), expert-curated accounts (e.g., ProMED Mail), and validated official alerts. Through the use of text processing algorithms, the system classifies alerts by location and disease and then overlays them on an interactive geographic map. We measure the accuracy of the classification algorithms based on the level of human curation necessary to correct misclassifications, and examine geographic coverage.
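As a rough illustration of the kind of processing described above, the following minimal sketch tags a headline with a disease and a location by dictionary lookup, then scores accuracy as the share of automatic labels that a curator left unchanged. It is not the HealthMap implementation: the dictionaries, the `Label` type, and the functions `classify` and `accuracy` are all hypothetical names introduced here, and longest-phrase matching is an assumed strategy chosen for simplicity.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical, deliberately tiny dictionaries (illustration only).
DISEASES = {
    "avian influenza": "Avian Influenza",
    "bird flu": "Avian Influenza",
    "cholera": "Cholera",
    "dengue": "Dengue",
}
LOCATIONS = {
    "indonesia": "Indonesia",
    "jakarta": "Indonesia",
    "angola": "Angola",
    "brazil": "Brazil",
}


@dataclass(frozen=True)
class Label:
    disease: Optional[str]
    location: Optional[str]


def _first_match(text, dictionary):
    """Return the category of the longest dictionary phrase found in text."""
    lowered = text.lower()
    for phrase in sorted(dictionary, key=len, reverse=True):
        # Word boundaries avoid matching a phrase inside a longer word.
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            return dictionary[phrase]
    return None


def classify(headline):
    """Assign a disease and a location to one headline (None if unknown)."""
    return Label(_first_match(headline, DISEASES),
                 _first_match(headline, LOCATIONS))


def accuracy(automatic, curated):
    """Fraction of automatic labels that a human curator left unchanged."""
    if not automatic or len(automatic) != len(curated):
        raise ValueError("label lists must be non-empty and of equal length")
    unchanged = sum(a == c for a, c in zip(automatic, curated))
    return unchanged / len(automatic)
```

For example, `classify("Bird flu kills two in Jakarta")` yields a label of Avian Influenza in Indonesia under these toy dictionaries, and `accuracy` compares a list of such labels against the curator-corrected list for the same reports.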

Results As part of the evaluation of the system, we analyzed 778 reports with HealthMap, representing 87 disease categories and 89 countries. The automated classifier performed with 84% accuracy, demonstrating significant usefulness in managing the large volume of information processed by the system. Accuracy was 91% for ProMED alerts, compared with 81% for Google News reports, as ProMED messages follow a more regular structure.

Conclusion HealthMap is a useful free and open resource employing text-processing algorithms to identify important disease outbreak information through a user-friendly interface.


  • This work was supported by R21LM009263-01, 1 R01 LM007677, and N01-LM-3-3515 from the National Library of Medicine, National Institutes of Health, and the Canadian Institutes of Health Research.
