A comparison of several similarity indices used in the classification of protein sequences: a multivariate analysis.

C Landès, A Hénaut, J L Risler.

Abstract

The present work describes an attempt to identify reliable criteria that could be used as distance indices between protein sequences. Seven different criteria were tested: i and ii) the alignment scores given by the BESTFIT and FASTA programs; iii) the ratio parameter, i.e. the BESTFIT score divided by the length of the aligned peptides; iv and v) the statistical significance (Z-scores) of the scores calculated by BESTFIT and FASTA, as obtained by comparison with shuffled sequences; vi) the Z-scores provided by the program RELATE, which performs a segment-by-segment comparison of two sequences; and vii) an original distance index calculated by the program DOCMA from all the pairwise dotplots between the sequences. These seven criteria were tested against the amino acid sequences of 39 globins and those of the 20 aminoacyl-tRNA synthetases from E. coli. The distances between the sequences were analyzed by multivariate analysis techniques. The results show that the distances calculated from the scores of the pairwise alignments are not adequately sensitive. The Z-score from RELATE is not selective enough and is too demanding in computer time. Three criteria gave a classification consistent with the known similarities between the sequences in the sets, namely the Z-scores from BESTFIT and FASTA and the multiple dotplot comparison distance index from DOCMA.
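
The shuffled-sequence Z-score named in criteria iv and v can be illustrated with a minimal sketch. This is not the BESTFIT or FASTA procedure: `toy_score` is a hypothetical stand-in for a real alignment score (it counts identical residues at the best ungapped offset), and the number of shuffles, the choice to shuffle only the second sequence, and the fixed random seed are assumptions made for the example.

```python
import random
import statistics


def toy_score(a, b):
    """Hypothetical stand-in for an alignment score: the highest number of
    identical residues over all ungapped offsets of b against a."""
    best = 0
    for offset in range(-len(b) + 1, len(a)):
        matches = 0
        for j, residue in enumerate(b):
            i = offset + j
            if 0 <= i < len(a) and a[i] == residue:
                matches += 1
        best = max(best, matches)
    return best


def shuffle_z_score(a, b, score=toy_score, n_shuffles=100, seed=0):
    """Z-score of score(a, b) relative to scores of a against shuffled
    copies of b (same length and composition, random order)."""
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    observed = score(a, b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        residues = list(b)
        rng.shuffle(residues)
        shuffled_scores.append(score(a, "".join(residues)))
    mean = statistics.mean(shuffled_scores)
    sd = statistics.pstdev(shuffled_scores)
    if sd == 0:
        # No spread among shuffled scores: the Z-score is undefined.
        return float("nan")
    return (observed - mean) / sd


if __name__ == "__main__":
    # Arbitrary illustrative string, not data from the study.
    seq = "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF"
    print(shuffle_z_score(seq, seq))
```

A high Z-score means the observed score lies far above what sequences of the same composition produce by chance, which is why it can separate related from unrelated pairs better than the raw score does.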

Year:  1992        PMID: 1641329      PMCID: PMC334011          DOI: 10.1093/nar/20.14.3631

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


