{"id":3209,"date":"2015-02-09T14:46:27","date_gmt":"2015-02-09T19:46:27","guid":{"rendered":"http:\/\/blogs.cdc.gov\/genomics\/?p=3209"},"modified":"2024-04-08T16:30:54","modified_gmt":"2024-04-08T20:30:54","slug":"100000-studies","status":"publish","type":"post","link":"https:\/\/blogs.cdc.gov\/genomics\/2015\/02\/09\/100000-studies\/","title":{"rendered":"100,000 Studies: A Milestone for Human Genome Epidemiology (HuGE) and the HuGE Navigator"},"content":{"rendered":"<p><a href=\"https:\/\/blogs.cdc.gov\/genomics\/files\/2015\/02\/2015-02_tmpt_update.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"alignright wp-image-3215 size-full\" src=\"https:\/\/blogs.cdc.gov\/genomics\/wp-content\/uploads\/sites\/20\/2015\/02\/2015-02_tmpt_update.jpg\" alt=\"a HUGE odometer with 100000 on it\" width=\"241\" height=\"136\" \/><\/a><\/p>\n<p>The <a href=\"https:\/\/phgkb.cdc.gov\/PHGKB\/hNHome.action\" target=\"_blank\" rel=\"noopener noreferrer\">HuGE published literature database<\/a> now contains more than 100,000 citations, a milestone reached at the end of 2014. The Office of Public Health Genomics has compiled this database since 2001 via weekly systematic sweeps of <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/\" target=\"_blank\" rel=\"noopener noreferrer\">PubMed<\/a> performed by a single curator. For the first five years, a complex PubMed query was used to identify studies of genotype prevalence, gene-disease association, gene-environment interaction, and the performance characteristics of genetic tests. In 2006, a <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/18227866\/\" target=\"_blank\" rel=\"noopener noreferrer\">data mining approach<\/a> using support vector machines replaced the PubMed query, reducing the time needed for hand curation and improving both sensitivity and specificity. 
The database and a suite of online tools to explore it were re-launched as the <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/18227866\/\" target=\"_blank\" rel=\"noopener noreferrer\">HuGE Navigator<\/a>.<!--more--><\/p>\n<p>Since the first draft of the human genome sequence was announced in 2001, PubMed has added more than one million articles on human genetics and genomics. Human genome epidemiology has grown, too, but studies of genetic variation and disease in populations\u2014i.e., groups of people not defined by family relationships\u2014still account for only a small fraction of the total (Figure 1).<\/p>\n<figure id=\"attachment_3210\" aria-describedby=\"caption-attachment-3210\" style=\"width: 300px\" class=\"wp-caption alignright\"><a href=\"https:\/\/blogs.cdc.gov\/genomics\/files\/2015\/02\/Figure1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-3210 size-medium\" src=\"https:\/\/blogs.cdc.gov\/genomics\/wp-content\/uploads\/sites\/20\/2015\/02\/Figure1-300x199.png\" alt=\"Articles in HuGE published literature database\" width=\"300\" height=\"199\" srcset=\"https:\/\/blogs.cdc.gov\/genomics\/wp-content\/uploads\/sites\/20\/2015\/02\/Figure1-300x199.png 300w, https:\/\/blogs.cdc.gov\/genomics\/wp-content\/uploads\/sites\/20\/2015\/02\/Figure1.png 705w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><figcaption id=\"caption-attachment-3210\" class=\"wp-caption-text\">Articles in HuGE published literature database,<br \/>by year of publication \u2013 2001-2014*<\/figcaption><\/figure>\n<p>A boom in gene discovery followed the introduction of <a href=\"https:\/\/phgkb.cdc.gov\/PHGKB\/searchSummary.action?firstQuery=gwa\" target=\"_blank\" rel=\"noopener noreferrer\">genome-wide association studies (GWAS)<\/a> in 2005; following up on these discoveries to unravel genetic contributions to disease, however, remains extremely challenging. There are no \u201chigh-throughput\u201d shortcuts to understanding. 
Now that it seems clear that <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/19812666\/\" target=\"_blank\" rel=\"noopener noreferrer\">common genetic variants have only small effects<\/a> on disease risk, the field has shifted toward studies of rare variants with large effects. This may look like a return to the pre-Human Genome Project roots of genetic epidemiology; discoveries in this phase, however, are just the next steps toward building the knowledge base for population-level interpretation.<\/p>\n<p>Meta-analysis has become popular as a first step in knowledge synthesis. Concern over the proliferation of poorly conducted meta-analyses, however, led the editors of <a href=\"https:\/\/everyone.plos.org\/2014\/04\/04\/meta-analyses-genetic-association-studies-plos-ones-approach\/\" target=\"_blank\" rel=\"noopener noreferrer\">PLOS ONE<\/a> to establish explicit quality criteria for submitted manuscripts, and the <a href=\"https:\/\/pubmed.ncbi.nlm.nih.gov\/25164421\/\" target=\"_blank\" rel=\"noopener noreferrer\">American Journal of Epidemiology<\/a> has endorsed this approach. Although rigorous meta-analysis can be useful for assessing and refining gene discoveries, it does not suggest next steps. Other methods are needed to integrate genetic data into ways of thinking that can help us understand, prevent, and treat disease. 
Human genome epidemiology must evolve to help meet this challenge.<\/p>\n<figure id=\"attachment_3211\" aria-describedby=\"caption-attachment-3211\" style=\"width: 300px\" class=\"wp-caption alignright\"><a href=\"https:\/\/blogs.cdc.gov\/genomics\/files\/2015\/02\/Figure2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-3211 size-medium\" src=\"https:\/\/blogs.cdc.gov\/genomics\/wp-content\/uploads\/sites\/20\/2015\/02\/Figure2-300x198.png\" alt=\"Countries with authors of articles in HuGE\" width=\"300\" height=\"198\" srcset=\"https:\/\/blogs.cdc.gov\/genomics\/wp-content\/uploads\/sites\/20\/2015\/02\/Figure2-300x198.png 300w, https:\/\/blogs.cdc.gov\/genomics\/wp-content\/uploads\/sites\/20\/2015\/02\/Figure2.png 666w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><figcaption id=\"caption-attachment-3211\" class=\"wp-caption-text\">Countries with authors of articles in HuGE<br \/>published literature database \u2013 2001-2014*<\/figcaption><\/figure>\n<p>On January 5, 2015, the HuGE Navigator completed its transition to a fully automated curation process based on machine learning and data extraction. This method has achieved 90% sensitivity and specificity when tested against the previous, semi-automated process. The HuGE published literature database will continue to be updated weekly with automatic indexing of gene symbols, study type (meta-analysis, GWAS), and category (pharmacogenomics, genetic testing).<\/p>\n<p>Human genome epidemiology is a global enterprise. The first 100,000 articles in the database included authors from 151 countries (Figure 2). The HuGE Navigator will remain online as a freely accessible resource for all who are interested in human genetic variation and population health.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The HuGE published literature database now contains more than 100,000 citations, a milestone reached at the end of 2014. 
The Office of Public Health Genomics has compiled this database since 2001 via weekly systematic sweeps of PubMed performed by a single curator. For the first five years, a complex PubMed query was used to identify<\/p>\n","protected":false},"author":128,"featured_media":3215,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[15972,5236,31872],"tags":[31869,31856,15987,31849],"_links":{"self":[{"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/posts\/3209"}],"collection":[{"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/users\/128"}],"replies":[{"embeddable":true,"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/comments?post=3209"}],"version-history":[{"count":10,"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/posts\/3209\/revisions"}],"predecessor-version":[{"id":6484,"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/posts\/3209\/revisions\/6484"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/media\/3215"}],"wp:attachment":[{"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/media?parent=3209"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/categories?post=3209"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blogs.cdc.gov\/genomics\/wp-json\/wp\/v2\/tags?post=3209"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}