Zhenglong Gu, Andre Cavalcanti, Feng-Chi Chen, Peter Bouman, Wen-Hsiung Li.
Abstract
We conducted a detailed analysis of duplicate genes in three complete genomes: yeast, Drosophila, and Caenorhabditis elegans. Two proteins were assigned to the same family if (1) their similarity is ≥ I, where I = 30% if L ≥ 150 a.a. and I = 0.01n + 4.8L^(-0.32(1 + exp(-L/1000))) if L < 150 a.a., with n = 6 and L the length of the alignable region, and (2) the length of the alignable region between the two sequences is ≥ 80% of the longer protein. We found it very important to delete isoforms (caused by alternative splicing), identical genes listed under different names, and proteins derived from repetitive elements. We estimated that there were 530, 674, and 1,219 protein families in yeast, Drosophila, and C. elegans, respectively, so, as expected, yeast has the smallest number of duplicate genes. However, among duplicate pairs with fewer than 0.01 substitutions per synonymous site (K(S) < 0.01), Drosophila has only seven pairs, whereas yeast has 58 pairs and nematode has 153 pairs. After considering the possible effects of codon usage bias and gene conversion, these numbers became 6, 55, and 147, respectively. Thus, Drosophila appears to have far fewer young duplicate genes than do yeast and nematode. The larger numbers of duplicate pairs with K(S) < 0.01 in yeast and C. elegans were probably largely caused by block duplications. At any rate, it is clear that the genome of Drosophila melanogaster has undergone few gene duplications in the recent past and has far fewer gene families than C. elegans.
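The two family-membership criteria in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that the bracketed term after L in the abstract's formula is an exponent, that similarity is expressed as a fraction (so 30% is 0.30), and that the function names `identity_threshold` and `same_family` are hypothetical labels introduced here.

```python
import math


def identity_threshold(L, n=6):
    """Minimum similarity I (as a fraction) for an alignable region of L residues.

    Assumption: the term following L in the abstract's formula is an exponent.
    """
    if L >= 150:
        return 0.30
    return 0.01 * n + 4.8 * L ** (-0.32 * (1 + math.exp(-L / 1000)))


def same_family(identity, aligned_len, longer_len, n=6):
    """Apply both criteria from the abstract to one protein pair.

    identity    -- similarity over the alignable region, as a fraction
    aligned_len -- length L of the alignable region, in residues
    longer_len  -- length of the longer of the two proteins, in residues
    """
    passes_identity = identity >= identity_threshold(aligned_len, n)
    passes_coverage = aligned_len >= 0.80 * longer_len
    return passes_identity and passes_coverage
```

Under these assumptions the short-region threshold rises as L shrinks and approaches the flat 30% cutoff near L = 150, so the two branches of the rule join almost continuously. A call such as `same_family(0.35, 200, 240)` checks a pair with 35% similarity over a 200-residue alignable region against a 240-residue longer protein.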
Year: 2002 PMID: 11861885 DOI: 10.1093/oxfordjournals.molbev.a004079
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240