Sequence ordinations: a multivariate analysis approach to analysing large sequence data sets.

D G Higgins

Abstract

Ordination is a powerful method for analysing complex data sets but has been largely ignored in sequence analysis. This paper shows how to use principal coordinates analysis to find low-dimensional representations of distance matrices derived from aligned sets of sequences. The method takes a matrix of Euclidean distances between all pairs of sequences and finds a coordinate space where the distances are exactly preserved. The main problem is to find a measure of distance between aligned sequences that is Euclidean. The simplest distance function is the square root of the percentage difference (as measured by identities) between two sequences, where one ignores any positions in the alignment where there is a gap in any sequence. If one does not ignore positions with a gap, the distances cannot be guaranteed to be Euclidean but the deleterious effects are trivial. Two examples of using the method are shown. A set of 226 aligned globins was analysed and the resulting ordination very successfully represents the known patterns of relationship between the sequences. In the other example, a set of 610 aligned 5S rRNA sequences was analysed. Sequence ordinations complement phylogenetic analyses. They should not be viewed as a complete alternative.
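The procedure described in the abstract can be illustrated with a minimal sketch. It assumes NumPy and uses the standard classical scaling formulation of principal coordinates analysis (double-centring the squared distances, then eigendecomposition); the paper's own implementation is not shown in this record and may differ. The function names `sqrt_percent_difference` and `pcoa`, the gap character `-`, and the toy alignment are all hypothetical, chosen only for illustration.

```python
import numpy as np


def sqrt_percent_difference(seqs, gap="-"):
    """Square root of the percentage of differing positions between aligned
    sequences, ignoring every column that has a gap in any sequence."""
    arr = np.array([list(s) for s in seqs])      # shape: (n_seqs, n_columns)
    keep = ~(arr == gap).any(axis=0)             # columns without any gap
    arr = arr[:, keep]
    # Fraction of mismatching positions for every pair of sequences.
    diff = (arr[:, None, :] != arr[None, :, :]).mean(axis=2)
    return np.sqrt(100.0 * diff)


def pcoa(dist):
    """Classical principal coordinates analysis of a distance matrix.
    Returns coordinates (one row per sequence) and all eigenvalues,
    sorted from largest to smallest."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering   # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Keep axes with clearly positive eigenvalues (assumed tolerance).
    tol = 1e-9 * max(eigvals[0], 1.0)
    positive = eigvals > tol
    coords = eigvecs[:, positive] * np.sqrt(eigvals[positive])
    return coords, eigvals


# Toy alignment (made up for illustration, not data from the paper).
toy = ["ACGTAC-T", "ACGTTCGT", "TCGAACGA", "ACCTACGT"]
dist = sqrt_percent_difference(toy)
coords, eigvals = pcoa(dist)
print(coords[:, :2])   # first two principal coordinate axes
```

Because the distance here is Euclidean, the eigenvalues of the double-centred matrix are non-negative (up to rounding), and the full set of coordinates reproduces the input distances. Plotting only the first two or three axes gives the low-dimensional ordination.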

Year:  1992        PMID: 1568121     DOI: 10.1093/bioinformatics/8.1.15

Source DB:  PubMed          Journal:  Comput Appl Biosci        ISSN: 0266-7061


  24 in total

1.  A phylogeny of the land snails (Gastropoda: Pulmonata).

Authors:  C M Wade; P B Mordan; B Clarke
Journal:  Proc Biol Sci       Date:  2001-02-22       Impact factor: 5.349

2.  HIV sequence databases. (Review)

Authors:  Carla Kuiken; Bette Korber; Robert W Shafer
Journal:  AIDS Rev       Date:  2003 Jan-Mar       Impact factor: 2.500

3.  Molecular evolutionary relationships between partulid land snails of the Pacific.

Authors:  S L Goodacre; C M Wade
Journal:  Proc Biol Sci       Date:  2001-01-07       Impact factor: 5.349

4.  Founder virus population related to route of virus transmission: a determinant of intrahost human immunodeficiency virus type 1 evolution?

Authors:  V V Lukashov; J Goudsmit
Journal:  J Virol       Date:  1997-03       Impact factor: 5.103

5.  Predicting ligand-binding function in families of bacterial receptors.

Authors:  J M Johnson; G M Church
Journal:  Proc Natl Acad Sci U S A       Date:  2000-04-11       Impact factor: 11.205

6.  Intrahost human immunodeficiency virus type 1 evolution is related to length of the immunocompetent period.

Authors:  V V Lukashov; C L Kuiken; J Goudsmit
Journal:  J Virol       Date:  1995-11       Impact factor: 5.103

7.  Weighting in sequence space: a comparison of methods in terms of generalized sequences.

Authors:  M Vingron; P R Sibbald
Journal:  Proc Natl Acad Sci U S A       Date:  1993-10-01       Impact factor: 11.205

8.  Modular arrangement of proteins as inferred from analysis of homology.

Authors:  E L Sonnhammer; D Kahn
Journal:  Protein Sci       Date:  1994-03       Impact factor: 6.725

9.  A classification approach for genotyping viral sequences based on multidimensional scaling and linear discriminant analysis.

Authors:  Jiwoong Kim; Yongju Ahn; Kichan Lee; Sung Hee Park; Sangsoo Kim
Journal:  BMC Bioinformatics       Date:  2010-08-21       Impact factor: 3.169

10.  Divergence in codon usage of Lactobacillus species.

Authors:  P H Pouwels; J A Leunissen
Journal:  Nucleic Acids Res       Date:  1994-03-25       Impact factor: 16.971
