Nasheen Naidoo, Yudi Pawitan, Richie Soong, David N Cooper, Chee-Seng Ku.
Abstract
Substantial progress has been made in human genetics and genomics research in the ten years since the publication of the draft sequence of the human genome in 2001. Findings emanating directly from the Human Genome Project, together with those from follow-on studies, have had an enormous impact on our understanding of the architecture and function of the human genome. Major developments include the cataloguing of genetic variation, the International HapMap Project, and advances in genotyping technologies. These developments were vital for the emergence of genome-wide association studies in the investigation of complex diseases and traits. In parallel, the advent of high-throughput sequencing technologies has ushered in the 'personal genome sequencing' era for both normal and cancer genomes, and made possible large-scale genome sequencing studies such as the 1000 Genomes Project and the International Cancer Genome Consortium. High-throughput sequencing and sequence-capture technologies are also providing new opportunities to study Mendelian disorders through exome sequencing and whole-genome sequencing. This paper reviews these major developments in human genetics and genomics over the past decade.
Year: 2011 PMID: 22155605 PMCID: PMC3525251 DOI: 10.1186/1479-7364-5-6-577
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
Major developments and landmarks in human genetics and genomics, 1977 to date
| Year | Development | References |
|---|---|---|
| 1977 | Sanger dideoxynucleotide/chain termination sequencing method developed | |
| | Mammalian genes shown to contain introns | |
| 1978 | First report of characterisation of gross gene deletions responsible for human inherited disease (α- and β-thalassaemia) by Southern blotting | |
| 1979 | First single base-pair substitution causing a human inherited disease (β-thalassaemia) characterised by DNA sequencing | |
| 1980 | Construction of a genetic linkage map in humans using restriction fragment length polymorphisms | |
| 1990 | Initiation of the Human Genome Project (HGP) | |
| 1992 | Second-generation linkage map of the human genome | |
| 1996 | The Human Gene Mutation Database (HGMD), an attempt to collate known (published) gene lesions responsible for human inherited disease, established and made available at | |
| | Genome-wide association studies (GWAS) approach for genetic studies of complex diseases first proposed | |
| 2001 | Completion of draft DNA sequences of the human genome by the International Human Genome Sequencing Consortium (IHGSC) and Celera Genomics | |
| | International SNP Map Working Group identifies 1.42 million SNPs in the human genome | |
| | Genetic architecture of complex diseases subjected to intense debate | |
| | Linkage disequilibrium (LD) patterns documented between SNPs in regions of the human genome | |
| 2003 | Initiation of the International HapMap Project | |
| | First whole-genome SNP genotyping array: Affymetrix GeneChip 10K | |
| 2004 | IHGSC publishes the 'finished version' of the DNA sequence of the human genome | |
| | Initiation of the ENCODE project | |
| | Discovery of hundreds of copy number variations (CNVs) in the human genome | |
| | Database of Genomic Variants (DGV) established to catalogue CNVs | |
| | First new-generation sequencing (NGS) technology: Roche 454 GS 20 System | |
| 2005 | Completion of the International HapMap Phase I Project | |
| | First proper GWAS using a commercial whole-genome SNP genotyping array | |
| 2005-present | Rapid developments of whole-genome and custom SNP genotyping arrays and technologies | |
| | Rapid developments of sequencing technologies | |
| 2006 | Discovery of more than 1,000 regions of homozygosity > 1 megabase (Mb) in the genomes of outbred populations | |
| | First comprehensive map of CNVs in the HapMap populations | |
| | An initial map of insertion and deletion variants in the human genome | |
| | Illumina sequencing platform commercially marketed | |
| 2007 | The first human diploid genome (Craig Venter's genome) sequenced by the Sanger sequencing method | |
| | Completion of the International HapMap Phase II Project and extension to Phase III | |
| | Genome-wide detection and characterisation of positive selection in human populations | |
| | Completion of the ENCODE project | |
| | Explosion of GWAS publications ('Year of GWAS'), approximately 100 new GWASs | |
| | 'Human Genetic Variation' considered to be the 'Breakthrough of the Year' in 2007 by | |
| | Sequence capture or enrichment methods and technologies developed | |
| | Pervasive transcription documented | |
| | Demonstration of paired-end mapping (PEM) to detect structural variation using NGS technologies | |
| | Demonstration of ChIP-Seq to map transcription factor binding sites | |
| | Demonstration of ChIP-Seq to interrogate histone modifications | |
| | Life Technologies SOLiD sequencing platform commercially marketed | |
| | A community resource project launched to sequence large-insert clones from many individuals, systematically discovering and resolving complex structural variants at the DNA sequence level (The Human Genome Structural Variation Working Group) | |
| 2007-present | Microarray-based methods increasingly supplanted by sequencing-based approaches such as ChIP-Seq, RNA-Seq, Methyl-Seq and CNV-Seq | |
| 2008 | First human diploid genome (James Watson's genome) sequenced by NGS technologies | |
| | First whole cancer genome (acute myeloid leukaemia [AML]) sequenced | |
| | Initiation of the 1000 Genomes Project | |
| | Vast majority of human genes shown to undergo alternative splicing (RNA-Seq) | |
| | Large-scale mapping and sequencing of structural variation using a clone-based method | |
| | Demonstration of depth-of-coverage approach to detect CNVs using NGS technologies | |
| | First GWAS meta-analysis using imputation methods | |
| | The issue of 'missing heritability' in GWASs recognised | |
| 2009 | Feasibility of exome sequencing approach to identify a causal mutation for a Mendelian disorder first demonstrated | |
| | Exome sequencing as a useful tool for diagnostic application demonstrated | |
| | Third-generation sequencing (TGS; single-molecule sequencing) technology introduced: Heliscope Single Molecule Sequencer (Helicos Biosciences) commercially marketed | |
| | First human diploid genome sequenced by TGS technology | |
| | Latest assembly of the human genome released (Genome Reference Consortium, release GRCh37, February 2009); the Genebuild published by Ensembl (database version 56.37a) includes 23,616 protein-coding genes, 6,407 putative RNA genes and 12,346 pseudogenes | |
| | Large intergenic non-coding RNAs (lincRNAs) found to represent a novel category of evolutionarily conserved RNAs | |
| | Direct single-molecule RNA sequencing without prior conversion of RNA to cDNA | |
| | First human DNA methylomes at base resolution | |
| | Comprehensive mapping of long-range chromatin interactions | |
| 2010 | Number of disease-causing/disease-associated germline mutations collated in the Human Gene Mutation Database exceeds 100,000 in > 3,700 different nuclear genes | |
| | More than 17 million SNPs in the human genome catalogued in the SNP Database (dbSNP) | |
| | As of 2nd November 2010, DGV catalogued 66,741 CNVs, 953 inversions and 34,229 insertions and deletions (indels; 100 base pairs (bp) to 1 kilobase (kb)) from 42 published studies | |
| | 1,048 microRNAs found in the human genome | miRBase, Release 16.0: September 2010 |
| | Completion of the International HapMap Phase III Project | |
| | Completion of pilot phase of the 1000 Genomes Project | |
| | Second-generation whole-genome SNP genotyping array (with SNP selection from the 1000 Genomes Project) launched | |
| | Cost of whole-genome sequencing (at several tens-fold sequencing coverage depth) reduced to less than $5,000 | |
| | Metagenomic sequencing of human gut microbes accomplished using NGS technologies | |
| | Exome sequencing study identifies causal mutations and genes for previously unexplained Mendelian disorders | |
| | GWAS meta-analysis involving total sample size of > 249,000 | |
| | Comprehensive mapping of CNVs using high-resolution tiling oligonucleotide microarrays (42 million probes) | |
| | Characterisation of 20 sequenced human genomes to evaluate the prospects for identifying rare functional variants | |
| | Neanderthal genome sequenced | |
| | The genome of an extinct Palaeo-Eskimo sequenced | |
| | Exome sequencing of 200 individuals identifies an excess of low-frequency non-synonymous coding variants | |
| | International Cancer Genome Consortium (ICGC) launched | |
| | Largest GWAS of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls performed | |
| 2011 | As of 13th May 2011, 874 publications and 4,327 SNPs documented in the National Human Genome Research Institute (NHGRI) 'A Catalog of Published Genome-wide Association Studies' | |
| | Comprehensive mapping of copy number variations based on whole-genome DNA sequencing data | |
| | Developments of other TGS technologies, such as single-molecule real-time sequencing and nanopore sequencing, are on the horizon | |
| | New addition to the NGS market: the Ion Torrent Personal Genome Machine (PGM), produced by Life Technologies (Carlsbad, CA) | |
| | Single-cell sequencing to infer tumour evolution | |
Special features of human autosomes 1-22 and the sex chromosomes, including respective lengths, gene number and density
| Chromosome | Chromosome length (bp)ᵃ | Number of known protein-coding genes per chromosomeᵃ | Gene density (genes/Mb) | Special features | Reference |
|---|---|---|---|---|---|
| 1 | 247,249,719 | 2,189 | 8.85 | Largest human chromosome. Rich in disease genes. Huge (~30 Mb) pericentromeric heterochromatic region at 1q12 spans ~5% of the length of the chromosome. Contains clusters of amylase genes (1p21), U1 snRNA genes (1q12-q22) and 5S RNA genes (1q) as well as multiple (~250) tRNA genes | 1 |
| 2 | 242,951,149 | 1,328 | 5.47 | Chromosome 2 (along with chromosome 4) exhibits the lowest recombination rate of all the autosomes. Contains at 2q13 an ancient telomere-telomere fusion junction at the position where two ape chromosomes once fused to give rise to this human chromosome | 2 |
| 3 | 199,501,827 | 1,112 | 5.57 | Lowest rate of segmental duplication of all human chromosomes. Contains several olfactory receptor gene clusters | 3 |
| 4 | 191,273,063 | 797 | 4.17 | Chromosome 4 (along with chromosome 2) exhibits the lowest recombination rate of all the autosomes. Highest percentage of LINE elements among all chromosomes | 2 |
| 5 | 180,857,866 | 903 | 4.99 | Rich in intra-chromosomal duplications. Contains interleukin and protocadherin gene clusters on 5q31 | 4 |
| 6 | 170,899,992 | 1,133 | 6.62 | Harbours the major histocompatibility complex and the largest tRNA gene cluster in the human genome. Contains at least three imprinted genes | 5 |
| 7 | 158,821,424 | 1,023 | 6.44 | Contains the highest number of intra-chromosomal duplications among all human chromosomes. Contains at least six imprinted genes | 6, 7 |
| 8 | 146,274,826 | 747 | 5.11 | Contains a fast-evolving 15 Mb region on distal 8p with genes related to the innate immunity and nervous systems that appear to have evolved under positive selection | 8 |
| 9 | 140,273,252 | 929 | 6.62 | Structurally highly polymorphic. Contains the large (~14 Mb) block of pericentromeric heterochromatin. Contains large numbers of intra- and inter-chromosomal segmental duplications, as well as the largest interferon gene cluster in the human genome (9p22) | 9 |
| 10 | 135,374,737 | 834 | 6.16 | Region of extensive segmental duplication located on 10q11 | 10 |
| 11 | 134,452,384 | 1,385 | 10.30 | Rich in both genes and disease genes. Contains 40% of all olfactory receptor gene clusters. Contains at least nine imprinted genes | 11 |
| 12 | 132,349,534 | 1,080 | 8.16 | Chromosome 12 has a unique history of evolutionary rearrangements that occurred in the rodent and primate lineages. Contains clusters of proline-rich protein and type II keratin genes at 12q13 | 12 |
| 13 | 114,142,980 | 361 | 3.16 | Low gene density in general; contains a central 38 Mb segment where the gene density drops to only 3.1 genes per Mb. This acrocentric chromosome contains ribosomal RNA genes at 13p12 and at least one imprinted gene | 13 |
| 14 | 106,368,585 | 669 | 6.29 | This acrocentric chromosome contains ribosomal RNA genes at 14p12. Contains two 1 Mb regions of crucial importance to the immune system (T cell receptor and immunoglobulin heavy chain genes). Contains serpin gene cluster at 14q32.1 and several regions with imprinted genes | 14 |
| 15 | 100,338,915 | 641 | 6.39 | This acrocentric chromosome contains ribosomal RNA genes at 15p12. Two large clusters of clinically important segmental duplications are located in the proximal and distal regions of 15q. Contains a number of imprinted genes | 15 |
| 16 | 88,827,254 | 925 | 10.41 | Relatively high gene density. Contains a large number of segmental duplications | 16 |
| 17 | 78,774,742 | 1,236 | 15.69 | High gene density. Has undergone extensive intra-chromosomal rearrangement, many of which were probably mediated by segmental duplications. High G + C content of 45% (genome average: 41%) | 17 |
| 18 | 76,117,153 | 295 | 3.88 | Low gene density overall. Contains serpin gene cluster at 18q21.3 | 18 |
| 19 | 63,811,651 | 1,443 | 22.61 | Highest gene density of all human chromosomes. One quarter of the genes on chromosome 19 belong to tandemly arranged gene families, encompassing 25% of the length of the chromosome. High G + C content of 48-49% (genome average: 41%). Repetitive sequences constitute 53-57% of the chromosome, as compared with a genome average of 40-44%. Contains clusters of olfactory receptor genes and cytochrome P450 genes, and multiple clusters of zinc finger genes, and at least two imprinted genes | 19 |
| 20 | 62,435,964 | 617 | 9.88 | Smallest metacentric autosome. Rich in both genes and disease genes. Contains type 2 cystatin gene cluster and at least two imprinted genes | 20 |
| 21 | 46,944,323 | 284 | 6.05 | Smallest human chromosome with fewer genes than any other autosome. This acrocentric chromosome contains ribosomal RNA genes at 21p12 | 21 |
| 22 | 49,691,432 | 519 | 10.44 | This acrocentric chromosome contains ribosomal RNA genes at 22p12. Relatively high gene density. Clusters of segmental duplications at 22q11.2 are associated with several genomic disorders | 22 |
| X | 154,913,754 | 891 | 5.75 | Contains the pseudoautosomal regions, PAR1 and PAR2, at the tips of the short and long arms, respectively. These regions are essential for normal male meiosis and recombination. PAR1 undergoes an obligate crossover with the Y chromosome, thereby giving this region the highest recombination rate in the human genome, at least in males. One X chromosome is subject to inactivation in females. Highly enriched in interspersed repeats and has a low G + C content of 39% (genome average: 41%) | 23 |
| Y | 57,772,954 | 80 | 1.38 | Lowest gene density of all human chromosomes (contains only 82 known genes). Contains the male-specific region which is a mosaic of heterochromatin and euchromatic X-transposed, X-degenerate and ampliconic sequences that make up 30% of the euchromatin. PAR1 undergoes an obligate crossover with the X chromosome. The virtual absence of homologous recombination between the X and the Y chromosomes has led to a gradual degeneration of Y chromosomal genes over evolutionary time. However, the absence of recombination, at least within the extensive non-recombining region of the Y chromosome, has also favoured the evolutionary accumulation of transposable elements on the Y chromosome | 24 |
ᵃ Chromosome lengths and the numbers of genes per chromosome are according to the Ensembl database, version 47.36. The chromosome length corresponds to the length of each chromosome that has been sequenced so far. The number of known protein-coding genes represents a conservative estimate of the likely total number, comprising genes which have been fully annotated. An earlier version of this table was published by Kehrer-Sawatzki and Cooper (25).
1. Gregory, S.G., Barlow, K.F., McLay, K.E., Kaul, R. et al. (2006), 'The DNA sequence and biological annotation of human chromosome 1', Nature Vol. 441, pp. 315-321.
2. Hillier, L.W., Graves, T.A., Fulton, R.S., Fulton, L.A. et al. (2005), 'Generation and annotation of the DNA sequences of human chromosomes 2 and 4', Nature Vol. 434, pp. 724-731.
3. Muzny, D.M., Scherer, S.E., Kaul, R., Wang, J. et al. (2006), 'The DNA sequence, annotation and analysis of human chromosome 3', Nature Vol. 440, pp. 1194-1198.
4. Schmutz, J., Martin, J., Terry, A., Couronne, O. et al. (2004), 'The DNA sequence and comparative analysis of human chromosome 5', Nature Vol. 431, pp. 268-274.
5. Mungall, A.J., Palmer, S.A., Sims, S.K., Edwards, C.A. et al. (2003), 'The DNA sequence and analysis of human chromosome 6', Nature Vol. 425, pp. 805-811.
6. Hillier, L.W., Fulton, R.S., Fulton, L.A., Graves, T.A. et al. (2003), 'The DNA sequence of human chromosome 7', Nature Vol. 424, pp. 157-164.
7. Scherer, S.W., Cheung, J., MacDonald, J.R., Osborne, L.R. et al. (2003), 'Human chromosome 7: DNA sequence and biology', Science Vol. 300, pp. 767-772.
8. Nusbaum, C., Mikkelsen, T.S., Zody, M.C., Asakawa, S. et al. (2006), 'DNA sequence and analysis of human chromosome 8', Nature Vol. 439, pp. 331-335.
9. Humphray, S.J., Oliver, K., Hunt, A.R., Plumb, R.W. et al. (2004), 'DNA sequence and analysis of human chromosome 9', Nature Vol. 429, pp. 369-374.
10. Deloukas, P., Earthrowl, M.E., Grafham, D.V., Rubenfield, M. et al. (2004), 'The DNA sequence and comparative analysis of human chromosome 10', Nature Vol. 429, pp. 375-381.
11. Taylor, T.D., Noguchi, H., Totoki, Y., Toyoda, A. et al. (2006), 'Human chromosome 11 DNA sequence and analysis including novel gene identification', Nature Vol. 440, pp. 497-500.
12. Scherer, S.E., Muzny, D.M., Buhay, C.J., Chen, R. et al. (2006), 'The finished DNA sequence of human chromosome 12', Nature Vol. 440, pp. 346-351.
13. Dunham, A., Matthews, L.H., Burton, J., Ashurst, J.L. et al. (2004), 'The DNA sequence and analysis of human chromosome 13', Nature Vol. 428, pp. 522-528.
14. Heilig, R., Eckenberg, R., Petit, J.L., Fonknechten, N. et al. (2003), 'The DNA sequence and analysis of human chromosome 14', Nature Vol. 421, pp. 601-607.
15. Zody, M.C., Garber, M., Sharpe, T., Young, S.K. et al. (2006), 'Analysis of the DNA sequence and duplication history of human chromosome 15', Nature Vol. 440, pp. 671-675.
16. Martin, J., Han, C., Gordon, L.A., Terry, A. et al. (2004), 'The sequence and analysis of duplication-rich human chromosome 16', Nature Vol. 432, pp. 988-994.
17. Zody, M.C., Garber, M., Adams, D.J., Sharpe, T. et al. (2006), 'DNA sequence of human chromosome 17 and analysis of rearrangement in the human lineage', Nature Vol. 440, pp. 1045-1049.
18. Nusbaum, C., Zody, M.C., Borowsky, M.L., Kamal, M. et al. (2005), 'DNA sequence and analysis of human chromosome 18', Nature Vol. 437, pp. 551-555.
19. Grimwood, J., Gordon, L.A., Olsen, A., Terry, A. et al. (2004), 'The DNA sequence and biology of human chromosome 19', Nature Vol. 428, pp. 529-535.
20. Deloukas, P., Matthews, L.H., Ashurst, J., Burton, J. et al. (2001), 'The DNA sequence and comparative analysis of human chromosome 20', Nature Vol. 414, pp. 865-871.
21. Hattori, M., Fujiyama, A., Taylor, T.D., Watanabe, H. et al. (2000), 'The DNA sequence of human chromosome 21', Nature Vol. 405, pp. 311-319.
22. Dunham, I., Shimizu, N., Roe, B.A., Chissoe, S. et al. (1999), 'The DNA sequence of human chromosome 22', Nature Vol. 402, pp. 489-495.
23. Ross, M.T., Grafham, D.V., Coffey, A.J., Scherer, S. et al. (2005), 'The DNA sequence of the human X chromosome', Nature Vol. 434, pp. 325-337.
24. Skaletsky, H., Kuroda-Kawaguchi, T., Minx, P.J., Cordum, H.S. et al. (2003), 'The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes', Nature Vol. 423, pp. 825-837.
25. Kehrer-Sawatzki, H. and Cooper, D.N. (2008), 'Sequencing the human genome: novel insights into its structure and function', in: Encyclopedia of Life Sciences (ELS), John Wiley & Sons Ltd, Chichester.
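The gene density column in the chromosome table above appears to follow directly from the other two numeric columns. The short Python sketch below illustrates this for four rows copied from the table. It assumes that density is the number of known protein-coding genes divided by the chromosome length in megabases, with 1 Mb taken as 1,000,000 bp; the table does not state the formula explicitly, and the names `CHROMOSOMES` and `gene_density` are hypothetical, chosen only for this illustration.

```python
# Hypothetical illustration: recompute the "Gene density (genes/Mb)" column.
# Lengths (bp) and gene counts are copied from the table above
# (Ensembl version 47.36, per the table footnote).
CHROMOSOMES = {
    "1": (247_249_719, 2_189),
    "17": (78_774_742, 1_236),
    "19": (63_811_651, 1_443),
    "Y": (57_772_954, 80),
}


def gene_density(length_bp: int, n_genes: int) -> float:
    """Return genes per megabase, rounded to two decimals.

    Assumes 1 Mb = 1,000,000 bp (an assumption, not stated in the table).
    """
    return round(n_genes / (length_bp / 1_000_000), 2)


# Map each chromosome name to its recomputed density.
densities = {name: gene_density(*row) for name, row in CHROMOSOMES.items()}

for name, value in densities.items():
    print(f"chromosome {name}: {value:.2f} genes/Mb")
```

Under these assumptions the recomputed values match the tabulated densities for the four rows shown (8.85, 15.69, 22.61 and 1.38 genes/Mb). Note that the Y chromosome density corresponds to the 80 genes listed in the gene-count column.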
Figure 1. Summary of the approaches for identifying disease-associated variants.