There was no cholera in Haiti until October 2010, when epidemic cholera swept the country. Within 6 months, more than 250,000 people were sickened and 4,000 died. A catastrophic earthquake earlier that year had exacerbated human and environmental risks by displacing millions of people and disrupting public health infrastructure. But there would have been no epidemic without the bacterium, Vibrio cholerae. How did the pathogen enter the picture? An epidemiologic investigation in November 2010 suggested that cholera could have been imported to the island. Genetic analyses of the outbreak strain (first by pulsed-field gel electrophoresis, or PFGE, and later by whole genome sequencing) were undertaken to characterize its virulence, origins, and spread.
The November 2011 theme issue of Emerging Infectious Diseases (EID) presents a year’s worth of lessons learned from the public health response to cholera in Haiti. An article titled “Comparative genomics of Vibrio cholerae from Haiti, Asia, and Africa” discusses limitations of the genetic analyses carried out early in the epidemic. First, PFGE testing indicated that the outbreak strain was clonal, i.e., that the isolates shared a similar genotype and could have stemmed from a single source; however, PFGE could not differentiate the outbreak strain from several isolates obtained from South Asia and Africa within a five-year period. Second, the initial whole genome sequencing study had limited capacity to explore the origins of the outbreak strain because it lacked recent, globally distributed isolates for comparison.
The authors of the Comparative Genomics article conducted a more extensive whole genome sequence analysis, in which they compared 9 outbreak-related isolates with 12 contemporary Asian or African isolates matched on PFGE pattern and 2 non-matching isolates from the Western Hemisphere. All of the outbreak samples were virtually identical, confirming that they derived from a common source. Although the outbreak isolates were genetically related to isolates from both India and Cameroon, none of those comparison isolates was a perfect match.
A study published elsewhere characterized 24 cholera isolates obtained in Nepal between July and November 2010. The PFGE and antimicrobial susceptibility profiles of these isolates were similar to those previously reported for Haitian outbreak-related isolates. Furthermore, a phylogenetic analysis incorporating previously reported whole genome sequence data found that the Haitian isolates were indistinguishable from one of four closely related genetic clusters among the isolates from Nepal.
PFGE is currently the method of choice for subtyping of pathogenic bacteria in outbreak investigations. It is fairly fast, broadly applicable, and comparable among public health laboratories that participate in CDC’s PulseNet, which employs standardized protocols and quality assurance procedures. PFGE helps epidemiologists find outbreak-associated cases and identify environmental sources of infection. However, it is not possible to draw reliable conclusions about the evolutionary relatedness of bacterial isolates from their PFGE “DNA fingerprints” because the targets of this method (restriction fragments with no detailed sequence information) do not contain any phylogenetic information. Thus, it is not surprising that PFGE patterns from the Haitian cholera outbreak isolates matched several others from Asia and Africa, even though they were not genetically identical.
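The point about restriction fragments can be illustrated with a toy example. The Python sketch below is not the PulseNet protocol or any real typing pipeline; it assumes two made-up linear sequences (real V. cholerae chromosomes are circular and millions of bases long), uses the 6-base EcoRI site as a stand-in for the rare-cutting enzymes typically used in PFGE, and treats a "fingerprint" as nothing more than a sorted list of fragment lengths. The function names fragment_lengths and snp_distance are hypothetical helpers introduced for illustration.

```python
# Toy illustration only: made-up sequences, a simplified linear genome,
# and an idealized "gel" that reports exact fragment lengths.

def fragment_lengths(genome: str, site: str) -> list[int]:
    """Cut at the start of every occurrence of `site`; return sorted fragment lengths."""
    cuts = []
    start = genome.find(site)
    while start != -1:
        cuts.append(start)
        start = genome.find(site, start + 1)
    bounds = [0] + cuts + [len(genome)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


SITE = "GAATTC"  # assumption: EcoRI site used as a stand-in restriction site

# Two hypothetical isolates that differ at two bases, both outside the cut sites.
isolate_a = "ACGTACGT" + SITE + "TTGACCA" + SITE + "GGCATCAGT"
isolate_b = "ACGTTCGT" + SITE + "TTGACGA" + SITE + "GGCATCAGT"

pattern_a = fragment_lengths(isolate_a, SITE)
pattern_b = fragment_lengths(isolate_b, SITE)

print(pattern_a == pattern_b)              # True: the fingerprints match
print(snp_distance(isolate_a, isolate_b))  # 2: the sequences do not
```

In this sketch the two isolates yield the same fragment pattern even though they differ at two positions, because neither substitution creates or destroys a cut site. Sequence-level comparison sees those differences; a fragment-length fingerprint does not.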
Whole genome sequencing offers increased capacity for comparing and distinguishing among genotypes of pathogen isolates. This technology is becoming much cheaper and more accessible, and its role in public health investigations is sure to grow. In fact, the Science list of six Areas to Watch in 2012 puts genomic epidemiology fourth, right after the Higgs boson, faster-than-light neutrinos, and stem-cell metabolism.
Increasing the resolution of genetic and genomic methods is, however, only one factor in improving the ability to make inferences from molecular epidemiologic studies. Just as PFGE testing relies on comparison with archived data (as in PulseNet), the public health value of high-resolution genome sequences depends on the availability of appropriate, high-quality sequence data for comparison.
The authors of the Nepal study put it this way:
“Infectious disease tracking requires global-scale information and cooperation. The current study was reliant upon genome analyses performed previously from other international studies. Future investigations will require high-quality genome databases that include representative isolates and metadata from geographically distributed samples, representing both historical and contemporary epidemics…. It is now the charge of the world’s national health agencies and disease researchers to populate these databases with both sequences and rich metadata.”
Advocates of human genome sequencing for personalized medicine should also take note: an individual genome sequence, whether microbial or human, is useful only to the extent that meaningful comparisons with other sequences are possible. Those comparisons depend on the existence, availability, and reliability of genome sequences from other individuals. Using this information to predict, prevent, detect, or control disease also requires the availability of relevant epidemiologic and clinical data. Observing, documenting, and analyzing those data (the “metadata” that make genomic information actionable) is a joint responsibility of laboratory experts, bioinformaticians, and epidemiologists.