Genomics
Genomics is an interdisciplinary field of
The field also includes studies of intragenomic (within the genome) phenomena such as epistasis (effect of one gene on another), pleiotropy (one gene affecting more than one trait), heterosis (hybrid vigour), and other interactions between loci and alleles within the genome.[9]
History
Etymology
From the Greek ΓΕΝ[10] gen, "gene" (gamma, epsilon, nu, epsilon) meaning "become, create, creation, birth", and subsequent variants: genealogy, genesis, genetics, genic, genomere, genotype, genus etc. While the word genome (from the German Genom, attributed to Hans Winkler) was in use in English as early as 1926,[11] the term genomics was coined by Tom Roderick, a geneticist at the Jackson Laboratory (Bar Harbor, Maine), over beers with Jim Womack, Tom Shows and Stephen O’Brien at a meeting held in Maryland on the mapping of the human genome in 1986,[12] first as the name for a new journal and then as a whole new scientific discipline.[13]
Early sequencing efforts
Following
DNA-sequencing technology developed
In addition to his seminal work on the amino acid sequence of insulin,
Complete genomes
The advent of these technologies resulted in a rapid intensification in the scope and speed of completion of
Most of the microorganisms whose genomes have been completely sequenced are problematic
A rough draft of the human genome was completed by the Human Genome Project in early 2001, creating much fanfare.[41] This project, completed in 2003, sequenced the entire genome for one specific person, and by 2007 this sequence was declared "finished" (less than one error in 20,000 bases and all chromosomes assembled).[41] In the years since then, the genomes of many other individuals have been sequenced, partly under the auspices of the 1000 Genomes Project, which announced the sequencing of 1,092 genomes in October 2012.[42] Completion of this project was made possible by the development of dramatically more efficient sequencing technologies and required the commitment of significant bioinformatics resources from a large international collaboration.[43] The continued analysis of human genomic data has profound political and social repercussions for human societies.[44]
The "omics" revolution
The English-language neologism omics informally refers to a field of study in biology ending in -omics, such as genomics, proteomics or metabolomics. The related suffix -ome is used to address the objects of study of such fields, such as the genome, proteome, or metabolome (lipidome) respectively. The suffix -ome as used in molecular biology refers to a totality of some sort; similarly, omics has come to refer generally to the study of large, comprehensive biological data sets. While the growth in the use of the term has led some scientists (Jonathan Eisen, among others[45]) to claim that it has been oversold,[46] it reflects the change in orientation towards the quantitative analysis of the complete or near-complete assortment of all the constituents of a system.[47] In the study of symbioses, for example, researchers who were once limited to the study of a single gene product can now simultaneously compare the total complement of several types of biological molecules.[48][49]
Genome analysis
After an organism has been selected, genome projects involve three components: the sequencing of DNA, the assembly of that sequence to create a representation of the original chromosome, and the annotation and analysis of that representation.[9]
Sequencing
Historically, sequencing was done in sequencing centers: centralized facilities (ranging from large independent institutions such as the Joint Genome Institute, which sequences dozens of terabases a year, to local molecular biology core facilities) that contain research laboratories with the costly instrumentation and technical support necessary. As sequencing technology continues to improve, however, a new generation of effective, fast-turnaround benchtop sequencers has come within reach of the average academic laboratory.[50][51] On the whole, genome sequencing approaches fall into two broad categories, shotgun and high-throughput (or next-generation) sequencing.[9]
Shotgun sequencing
Shotgun sequencing is a sequencing method designed for analysis of DNA sequences longer than 1000 base pairs, up to and including entire chromosomes.[52] It is named by analogy with the rapidly expanding, quasi-random firing pattern of a shotgun. Since gel electrophoresis sequencing can only be used for fairly short sequences (100 to 1000 base pairs), longer DNA sequences must be broken into random small segments which are then sequenced to obtain reads. Multiple overlapping reads for the target DNA are obtained by performing several rounds of this fragmentation and sequencing. Computer programs then use the overlapping ends of different reads to assemble them into a continuous sequence.[52][53] Shotgun sequencing is a random sampling process, requiring over-sampling to ensure a given nucleotide is represented in the reconstructed sequence; the average number of reads by which a genome is over-sampled is referred to as coverage.[54]
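The notion of coverage described above can be made concrete with a small calculation. The following is a minimal sketch, not part of the cited sources: it assumes reads of uniform length placed uniformly at random, and it uses the Poisson approximation commonly associated with Lander and Waterman to estimate how much of the target is hit by at least one read. The function names are hypothetical.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_length: int) -> float:
    """Average coverage c = N * L / G (total sequenced bases per genome base)."""
    return num_reads * read_length / genome_length


def expected_fraction_covered(coverage: float) -> float:
    """Poisson approximation (an assumption of this sketch): the chance that a
    given base is missed by every read is about exp(-c), so the expected
    fraction of bases covered at least once is 1 - exp(-c)."""
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    # Hypothetical numbers: 50,000 reads of 500 bp over a 5 Mb genome.
    c = mean_coverage(50_000, 500, 5_000_000)
    print(f"coverage: {c:.1f}x")
    print(f"expected fraction covered: {expected_fraction_covered(c):.4f}")
```

Under these assumptions, even fivefold over-sampling is expected to leave a small fraction of bases unsequenced, which is one reason over-sampling is needed.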
For much of its history, the technology underlying shotgun sequencing was the classical chain-termination method or '
High-throughput sequencing
The high demand for low-cost sequencing has driven the development of high-throughput sequencing technologies that parallelize the sequencing process, producing thousands or millions of sequences at once.[58][59] High-throughput sequencing is intended to lower the cost of DNA sequencing beyond what is possible with standard dye-terminator methods. In ultra-high-throughput sequencing, as many as 500,000 sequencing-by-synthesis operations may be run in parallel.[60][61]
The Illumina dye sequencing method is based on reversible dye-terminators and was developed in 1996 at the Geneva Biomedical Research Institute, by
An alternative approach, ion semiconductor sequencing, is based on standard DNA replication chemistry. This technology measures the release of a hydrogen ion each time a base is incorporated. A microwell containing template DNA is flooded with a single
Assembly
Sequence assembly refers to
Assembly approaches
Assembly can be broadly categorized into two approaches: de novo assembly, for genomes which are not similar to any sequenced in the past, and comparative assembly, which uses the existing sequence of a closely related organism as a reference during assembly.
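To illustrate the de novo case, the sketch below merges reads by repeatedly joining the pair with the longest suffix-to-prefix overlap. It is a toy greedy strategy chosen for brevity, not the method of any particular assembler; it assumes error-free reads from a single strand, and the function names and the `min_len` threshold are hypothetical. Repeated sequences can mislead a greedy merge of this kind.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.
    Returns 0 if no overlap of at least `min_len` bases exists."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Greedily merge reads into contigs using exact overlaps only."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    # Drop reads that are wholly contained in another read.
    contigs = [r for r in contigs if not any(r != o and r in o for o in contigs)]
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # no remaining pair overlaps enough to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
    print(greedy_assemble(reads))
```

A comparative assembly would instead place each read against the reference sequence of a related organism, which this sketch does not attempt.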
Finishing
Finished genomes are defined as having a single contiguous sequence with no ambiguities representing each replicon.[67]
Annotation
The DNA sequence assembly alone is of little value without additional analysis. Genome annotation consists of three main steps:
- identifying portions of the genome that do not code for proteins
- identifying elements on the genome, a process called gene prediction, and
- attaching biological information to these elements.
Automatic annotation tools try to perform these steps in silico, as opposed to manual annotation (a.k.a. curation) which involves human expertise and potential experimental verification.[69] Ideally, these approaches co-exist and complement each other in the same annotation pipeline (also see below).
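As a very rough illustration of what an automatic gene-prediction step does in silico, the sketch below scans the three forward reading frames of a sequence for open reading frames, each running from an ATG start codon to the first in-frame stop codon. This is a deliberately simplified stand-in and not how any specific annotation tool works: it assumes the standard stop codons, ignores the reverse strand, introns and statistical gene models, and the function name and `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def find_orfs(seq: str, min_codons: int = 3):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    `start` is the index of the A in ATG, `end` is exclusive and includes the
    stop codon. An ORF is reported only if it has at least `min_codons`
    codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # close the candidate and keep scanning
    return orfs


if __name__ == "__main__":
    example = "CCATGAAATTTGGGTAACC"  # made-up sequence for illustration
    for start, end, frame in find_orfs(example):
        print(frame, start, end, example[start:end])
```

Attaching biological information to the predicted elements, the third step, would typically involve comparing them against curated databases and is outside the scope of this sketch.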
Traditionally, the basic level of annotation is using
Sequencing pipelines and databases
The need for reproducibility and efficient management of the large amount of data associated with genome projects means that computational pipelines have important applications in genomics.[71]
Research areas
Functional genomics
Functional genomics is a field of
A major branch of genomics is still concerned with sequencing the genomes of various organisms, but the knowledge of full genomes has created the possibility for the field of functional genomics, mainly concerned with patterns of gene expression under various conditions. The most important tools here are microarrays and bioinformatics.
Structural genomics
Structural genomics seeks to describe the
Epigenomics
Epigenomics is the study of the complete set of
Metagenomics
Metagenomics is the study of metagenomes,
Model systems
Viruses and bacteriophages
Cyanobacteria
At present there are 24
Applications
Genomics has provided applications in many fields, including
Genomic medicine
Next-generation genomic technologies allow clinicians and biomedical researchers to drastically increase the amount of genomic data collected on large study populations.
Synthetic biology and bioengineering
The growth of genomic knowledge has enabled increasingly sophisticated applications of
Population and conservation genomics
Conservationists can use the information gathered by genomic sequencing in order to better evaluate genetic factors key to species conservation, such as the
See also
- Cognitive genomics
- Computational genomics
- Epigenomics
- Functional genomics
- GeneCalling, an mRNA profiling technology
- Genomics of domestication
- Genetics in fiction
- Glycomics
- Immunomics
- Metagenomics
- Pathogenomics
- Personal genomics
- Proteomics
- Transcriptomics
- Venomics
- Psychogenomics
- Whole genome sequencing
- Thomas Roderick
References
- S2CID 4268222.
- S2CID 15829893.
- PMID 16920639.
- PMID 33692541.
- "WHO definitions of genetics and genomics". World Health Organization. Archived from the original on June 30, 2004.
- ISBN 978-0-321-72412-0.
- ISBN 978-0-02-865606-9.
- PMID 23615871.
- ISBN 978-0-470-08585-1.
- ISBN 978-1-61427-397-4.
- "Genome, n". Oxford English Dictionary (Third ed.). Oxford University Press. 2008. Retrieved 2012-12-01. (subscription required)
- PMID 18166670.
- PMID 35701371.
- PMID 12798815.
- PMID 14299636.
- S2CID 40989800.
- PMID 5330357.
- S2CID 4153893.
- S2CID 4289674.
- S2CID 1634424.
- ISBN 978-0-07-124320-9.
- Sanger F (1980). "Nobel lecture: Determination of nucleotide sequences in DNA" (PDF). Nobelprize.org. Retrieved 2010-10-18.
- S2CID 4206886.
- PMID 14651855.
- PMID 271968.
- PMID 265521.
- Darden L, Tabery J (2010). "Molecular Biology". In Zalta EN (ed.). The Stanford Encyclopedia of Philosophy (Fall 2010 ed.).
- S2CID 4355527. (subscription required)
- PMID 16453699.
- S2CID 4311952.
- S2CID 4271784.
- S2CID 10423613.
- S2CID 211123134. (subscription required)
- "Complete genomes: Viruses". NCBI. 17 November 2011. Retrieved 2011-11-18.
- "Genome Project Statistics". Entrez Genome Project. 7 October 2011. Retrieved 2011-11-18.
- ISSN 0362-4331. Retrieved 2012-12-21.
- PMID 20033048.
- "Human gene number slashed". BBC. 20 October 2004. Retrieved 2012-12-21.
- S2CID 21797344.
- National Human Genome Research Institute (14 July 2004). "Dog Genome Assembled: Canine Genome Now Available to Research Community Worldwide". Genome.gov. Retrieved 2012-01-20.
- ISBN 978-0-465-04333-0.
- PMID 23128226.
- PMID 20981085.
- ISBN 978-0-226-17295-8.
- PMID 23587201.
- ISSN 0099-9660. Retrieved 2013-01-04.
- Scudellari M (1 October 2011). "Data Deluge". The Scientist. Retrieved 2013-01-04.
- PMID 22983030.
- PMID 21835622.
- Baker M (14 September 2012). "Benchtop sequencers ship off" (Blog). Nature News Blog. Retrieved 2012-12-22.
- PMID 22827831.
- PMID 461197.
- PMID 6269069.
- PMID 19482960.
- PMID 1100841.
- PMID 23251337.
- Illumina, Inc. (28 February 2012). An Introduction to Next-Generation Sequencing Technology (PDF). San Diego, California, USA: Illumina, Inc. p. 12. Retrieved 2012-12-28.
- PMID 17449817.
- S2CID 28769137.
- PMID 18832462.
- PMID 19679224.
- US 20050100900, Kawashima EH, Farinelli L, Mayer P, "Method of nucleic acid amplification", published 12 May 2005, issued 26 July 2011, assigned to Solexa Ltd Great Britain.
- PMID 18576944. Archived from the original (PDF) on 2013-05-18. Retrieved 2013-01-04.
- Davies K (2011). "Powering Preventative Medicine". Bio-IT World (September–October).
- "Home". PacBio.
- "Home". Oxford Nanopore Technologies.
- PMID 19815760.
- S2CID 12044602.
- S2CID 20412451. Archived from the original (PDF) on 2013-05-29. Retrieved 2013-01-04.
- PMID 23203987.
- PMID 18720577.
- PMID 17349043.
- PMID 10739263.
- S2CID 5656447.
- ISBN 978-0-393-07005-7.
- PMID 32214243.
- PMID 36224627.
- S2CID 6780101.
- PMID 9733676.
- PMID 17355177.
- ISBN 978-1-904455-54-7.
- ISBN 978-1-904455-87-5.
- PMID 12794192.
- ISBN 978-1-904455-14-1.
- PMID 17062630.
- ISBN 978-1-904455-15-8.
- PMID 21916641.
- PMID 22129254.
- PMID 25059740.
- PMID 20435227.
- PMID 21935354.
- PMID 24618965.
- Robbins R (16 August 2019). "Top U.S. medical centers roll out DNA sequencing clinics for healthy (and often wealthy) clients". STAT News.
- "Two Boston Health Systems Enter the Growing Direct-to-Consumer Gene Sequencing Market by Opening Preventative Genomics Clinics, but Can Patients Afford the Service?". Dark Daily. The Dark Intelligence Group. 3 January 2020.
- "NIH-funded genome centers to accelerate precision medicine discoveries". National Institutes of Health: All of Us Research Program. National Institutes of Health. 25 September 2018.
- ISBN 978-0-465-02175-8.
- S2CID 205064528.
- S2CID 8516357.
- .
- S2CID 10811958.
Further reading
- Lesk AM (2017). Introduction to Genomics (3rd ed.). New York: Oxford University Press. p. 544.
- Stunnenberg HG, Hubner NC (June 2014). "Genomics meets proteomics: identifying the culprits in disease". Human Genetics. 133 (6): 689–700. PMID 24135908.
- Shibata T (October 2012). "Cancer genomics and pathology: all together now". Pathology International. 62 (10): 647–659. S2CID 27886018.
- Roychowdhury S, Chinnaiyan AM (2016). "Translating cancer genomes and transcriptomes for precision oncology". CA. 66 (1): 75–88. PMID 26528881.
- Gladyshev VN, Zhang Y (2013). "Chapter 16 Comparative Genomics Analysis of the Metallomes". In Banci L (ed.). Metallomics and the Cell. Metal Ions in Life Sciences. Vol. 12. Springer. ISSN 1868-0402
External links
- Annual Review of Genomics and Human Genetics Archived 2009-01-18 at the Wayback Machine
- BMC Genomics: A BMC journal on Genomics
- Genomics journal
- Genomics.org: An openfree genomics portal.
- NHGRI: US government's genome institute
- JCVI Comprehensive Microbial Resource
- KoreaGenome.org: The first published Korean genome; the sequence is freely available.
- GenomicsNetwork: Looks at the development and use of the science and technologies of genomics.
- Institute for Genome Sciences: Genomics research.
- MIT OpenCourseWare HST.512 Genomic Medicine A free, self-study course in genomic medicine. Resources include audio lectures and selected lecture notes.
- ENCODE threads explorer Machine learning approaches to genomics. Nature (journal)
- Global map of genomics laboratories
- Genomics: Scitable by nature education