Seung-Bum Lee, Charalambos Kaittanis, Robert K Jansen, Jessica B Hostetler, Luke J Tallon, Christopher D Town, Henry Daniell.
Abstract
BACKGROUND: Cotton (Gossypium hirsutum) is the most important fiber crop and is grown in 90 countries. In 2004-2005, US farmers planted 79% of the 5.7 million hectares of nuclear transgenic cotton. Unfortunately, genetically modified cotton has the potential to hybridize with other cultivated and wild relatives, resulting in geographical restrictions to cultivation. However, chloroplast genetic engineering offers the possibility of containment because of maternal inheritance of transgenes. The complete chloroplast genome of cotton provides essential information required for genetic engineering. In addition, the sequence data were used to assess phylogenetic relationships among the major clades of rosids using cotton and 25 other completely sequenced angiosperm chloroplast genomes.
Year: 2006 PMID: 16553962 PMCID: PMC1513215 DOI: 10.1186/1471-2164-7-61
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Gene map of the Gossypium hirsutum chloroplast genome. The thick lines indicate the extent of the inverted repeats (IRa and IRb), which separate the genome into small (SSC) and large (LSC) single copy regions. Genes on the outside of the map are transcribed in the clockwise direction and genes on the inside of the map are transcribed in the counterclockwise direction. Numbered lines around the map indicate the location of repeated sequences found in the cotton genome (see Table 1 for details). The SSC region is in the reverse orientation relative to tobacco [80]. This does not reflect any differences in gene order for cotton but simply reflects the well-known phenomenon that the SSC exists in two orientations in chloroplast genomes [84].
Figure 2. Histogram showing the number of repeated sequences ≥ 30 bp long with a sequence identity ≥ 90% in the cotton chloroplast genome.
Table 1. Location of identified repeats in the cotton plastid genome. The table includes repeats at least 30 bp in size, with a sequence identity greater than or equal to 90%. IGS = Intergenic spacer. See Fig. 1 for location of repeats on the gene map.
| Repeat number | Size (bp) | Location |
| 1 | 30 | IGS |
| 2 | 30 | IGS |
| 3 | 30 | |
| 4 | 30 | |
| 5 | 31 | IGS |
| 6 | 32 | |
| 7 | 32 | IGS |
| 8 | 32 | IGS |
| 9 | 32 | IGS (4 bp) – |
| 10 | 32 | IGS |
| 11 | 34 | |
| 12 | 34 | IGS |
| 13 | 34 | |
| 14 | 34 | |
| 15 | 34 | |
| 16 | 34 | |
| 17 | 35 | |
| 18 | 36 | |
| 19 | 38 | |
| 20 | 38 | |
| 21 | 38 | |
| 22 | 38 | |
| 23 | 40 | IGS |
| 24 | 43 | IGS |
| 25 | 47 | |
| 26 | 52 | |
| 27 | 58 | IGS |
| 28 | 64 | |
| 29 | 64 | |
| 30 | 72 | |
| 31 | 30 | IGS (2 bp) – |
| 32 | 30 | IGS |
| 33 | 30 | IGS |
| 34 | 31 | IGS |
| 35 | 34 | IGS |
| 36 | 34 | |
| 37 | 34 | |
| 38 | 34 | IGS |
| 39 | 34 | IGS |
| 40 | 34 | |
| 41 | 34 | |
| 42 | 36 | |
| 43 | 38 | IGS, |
| 44 | 38 | |
| 45 | 38 | |
| 46 | 41 | |
| 47 | 41 | IGS |
| 48 | 43 | IGS |
| 49 | 43 | IGS |
| 50 | 48 | IGS |
| 51 | 52 | |
| 52 | 52 | |
| 53 | 64 | |
| 54 | 64 | |
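The size distribution that the histogram caption describes can be tallied directly from the size column of the repeat table. The following is a minimal sketch, assuming the 54 sizes as transcribed from the rows above; the names `REPEAT_SIZES` and `size_histogram` are illustrative and do not come from the original study, and the binning used in the published figure is not known from this text.

```python
from collections import Counter

# Repeat sizes (bp) transcribed from the size column of the table above.
# Repeats 1-30, then repeats 31-54, in table order.
REPEAT_SIZES = (
    [30] * 4 + [31] + [32] * 5 + [34] * 6 + [35] + [36] + [38] * 4
    + [40, 43, 47, 52, 58] + [64] * 2 + [72]
    + [30] * 3 + [31] + [34] * 7 + [36] + [38] * 3 + [41] * 2
    + [43] * 2 + [48] + [52] * 2 + [64] * 2
)


def size_histogram(sizes, min_size=30):
    """Count repeats at each size, keeping only sizes >= min_size (bp)."""
    return Counter(s for s in sizes if s >= min_size)


hist = size_histogram(REPEAT_SIZES)

# Print a simple text histogram, one row per observed repeat size.
for size in sorted(hist):
    print(f"{size:>3} bp | {'#' * hist[size]} ({hist[size]})")
```

Counting one bar per exact size is only one possible choice; grouping sizes into ranges would need only a change to the key used in the `Counter`.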
Table 2. Differences observed by comparison of cotton chloroplast genome sequences with EST sequences obtained by BLAST searches of GenBank.
| Gene size (bp) | Sequence analyzed | Number of variable sites | Variation type | Position | Amino acid change |
| 591 | 228–537 | 5 | A-G | 523 | M-A |
| | | | T-C | 524 | |
| | | | T-A | 528 | I-M |
| | | | T-G | 531 | G-S |
| | | | G-A | 532 | |
| 363 | 76–363 | 1 | T-C | 323 | L-S |
| 354 | 1–354 | 2 | A-G | 263 | K-R |
| | | | C-U | 308 | S-L |
| 282 | 85–282 | 1 | C-U | 89 | S-L |
| 657 | 274–657 | 2 | T-G | 275 | L-R |
| | | | A-C | 302 | K-T |
Coordinates of the sequence analyzed are based on the gene sequence, taking the first base of the initiation codon as bp 1. Each variable position is likewise given relative to the first base of the initiation codon of the gene sequence.
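Because positions are counted from the first base of the initiation codon (bp 1), each variable position maps to a codon number and a position within that codon by simple integer arithmetic. The sketch below illustrates that mapping; the function name `codon_location` is hypothetical and is not part of the original analysis.

```python
def codon_location(nt_position):
    """Map a 1-based nucleotide position (bp 1 = first base of the
    initiation codon) to (codon number, position within codon), both 1-based."""
    if nt_position < 1:
        raise ValueError("positions are 1-based; got %r" % nt_position)
    codon_number = (nt_position - 1) // 3 + 1
    position_in_codon = (nt_position - 1) % 3 + 1
    return codon_number, position_in_codon


# Variable positions listed for the first gene in the table above.
for pos in (523, 524, 528, 531, 532):
    codon, offset = codon_location(pos)
    print(f"position {pos}: codon {codon}, base {offset} of 3")
```

Under this convention, positions 523 and 524 fall in the same codon (first and second base), which is consistent with the two substitutions being reported alongside a single amino acid change.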
Table 3. Taxa included in phylogenetic analyses, with GenBank accession numbers and references. Taxa in bold are those which have not appeared in any previous phylogenetic studies using 61 genes from complete chloroplast genome sequences.
| Taxon | GenBank Accession Numbers | Reference |
| Gymnosperms – Outgroups | ||
| | Wakasugi et al. 1994 [72] | |
| | Leebens-Mack et al. 2005 [27] | |
| Basal Angiosperms | ||
| | Goremykin et al. 2003 [29] | |
| | Leebens-Mack et al. 2005 [27] | |
| | Goremykin et al. 2004 [28] | |
| Monocots | ||
| | Leebens-Mack et al. 2005 [27] | |
| | Hiratsuka et al. 1989 [73] | |
| | Asano et al. 2004 [74] | |
| | Ikeo and Ogihara, unpublished | |
| | Leebens-Mack et al. 2005 [27] | |
| | Leebens-Mack et al. 2005 [27] | |
| | Maier et al. 1995 [75] | |
| Magnoliids | ||
| | Goremykin et al. 2003 [76] | |
| Eudicots | ||
| | Sato et al. 1999 [77] | |
| | Schmitz-Linneweber et al. 2002 [53] | |
| | ||
| | ||
| | ||
| | ||
| | Kato et al. 2000 [79] | |
| | Lin et al., unpublished | |
| | Shinozaki et al. 1986 [80] | |
| | Hupfer et al. 2000 [81] | |
| | Kim and Lee 2004 [82] | |
| | Leebens-Mack et al. 2005 [27] | |
| | ||
| | ||
| | Schmitz-Linneweber et al. 2001 [83] |
Figure 3. Parsimony tree based on 61 chloroplast protein-coding genes. The tree has a length of 49,957, a consistency index of 0.46 (excluding uninformative characters) and a retention index of 0.6. Numbers above nodes indicate the number of changes along each branch and numbers below nodes are bootstrap support values. Taxa in red are those which have not appeared in any previous phylogenetic studies using 61 genes from complete chloroplast genome sequences. Ordinal and higher level group names follow APG II [85]. The maximum likelihood tree has the same topology but is not shown.
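For readers unfamiliar with the two indices quoted in the caption, the sketch below shows the standard ensemble definitions: the consistency index is the minimum possible number of steps divided by the observed tree length, and the retention index is (maximum steps − observed steps) / (maximum steps − minimum steps). The minimum and maximum step counts for the 61-gene data set are not given in this text, so the numbers in the example are purely hypothetical toy values, and the function names are illustrative.

```python
def consistency_index(min_steps, tree_length):
    """Ensemble consistency index: CI = M / S."""
    if tree_length <= 0:
        raise ValueError("tree length must be positive")
    return min_steps / tree_length


def retention_index(min_steps, max_steps, tree_length):
    """Ensemble retention index: RI = (G - S) / (G - M)."""
    if max_steps == min_steps:
        raise ValueError("RI is undefined when max_steps equals min_steps")
    return (max_steps - tree_length) / (max_steps - min_steps)


# Hypothetical toy values (NOT from the cotton data set):
# M = 10 minimum steps, S = 20 observed steps, G = 30 maximum steps.
ci = consistency_index(10, 20)
ri = retention_index(10, 30, 20)
print(f"CI = {ci:.2f}, RI = {ri:.2f}")
```

Both indices equal 1 when the data fit the tree without homoplasy; lower values indicate more homoplasy.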