Yutaka Satou, Ryohei Nakamura, Deli Yu, Reiko Yoshida, Mayuko Hamada, Manabu Fujie, Kanako Hisata, Hiroyuki Takeda, Noriyuki Satoh.
Abstract
Since its initial publication in 2002, the genome of Ciona intestinalis type A (Ciona robusta), the first genome sequence of an invertebrate chordate, has provided a valuable resource for a wide range of biological studies, including developmental biology, evolutionary biology, and neuroscience. The genome assembly was updated in 2008, and it included 68% of the sequence information in 14 pairs of chromosomes. However, a more contiguous genome is required for analyses of higher order genomic structure and of chromosomal evolution. Here, we provide a new genome assembly for an inbred line of this animal, constructed with short and long sequencing reads and Hi-C data. In this latest assembly, over 95% of the 123 Mb of sequence data was included in the chromosomes. Short sequencing reads predicted a genome size of 114-120 Mb; therefore, it is likely that the current assembly contains almost the entire genome, although this estimate of genome size was smaller than previous estimates. Remapping of the Hi-C data onto the new assembly revealed a large inversion in the genome of the inbred line. Moreover, a comparison of this genome assembly with that of Ciona savignyi, a different species in the same genus, revealed many chromosomal inversions between these two Ciona species, suggesting that such inversions have occurred frequently and have contributed to chromosomal evolution of Ciona species. Thus, the present assembly greatly improves an essential resource for genome-wide studies of ascidians.
Keywords: Ciona intestinalis type A (C. robusta); ascidian; chromosomal inversion; genome
Year: 2019 PMID: 31621849 PMCID: PMC6836712 DOI: 10.1093/gbe/evz228
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 2.—An inversion in chromosome 4. (A) Hi-C data mapping on the new HT version of the assembly. Mapping data show an overall high level of consistency, except for a small region in chromosome 4 (shown by B). (B) Hi-C mapping data for chromosome 4 of the HT version of the assembly. Note that the assembly is based on genomic DNA derived from the T-inbred line, and that Hi-C data were obtained from embryos derived from wild-caught animals. A possible inversion is indicated with an arrow. (C) The Nucmer alignment (Kurtz et al. 2004) of chromosome 4 of the HT version of the assembly with the corresponding region of KH/Hi-C linked scaffolds. (D, E) A PCR experiment to confirm the inversion. (D) Three primers were designed, and their locations and orientations are shown by large arrows. Small arrows indicate genes, and the same genes are linked with broken lines. Note that the region indicated by the black line in HT does not have a corresponding region in the KH version. (E) Two sets of primers were used to examine which set gave specific amplification. PCR products were analyzed by agarose gel electrophoresis. The set of For and Test gave specific amplification for F8b, whereas the set of Rev and Test gave specific amplification for the seven wild-caught animals.
Fig. 1.—Assembly of the genome of an inbred strain. (A) The assembly strategy. See supplementary figure S2, Supplementary Material online for details. (B) Hi-C data mapping on the KH version of chromosomes and scaffolds for identification of candidates for misassemblies and linkages. (C) A candidate for an artifactitious inversion site in chromosome 1 of the KH version of the assembly. Note a clear disconnection at the point indicated by arrows. The Hi-C data demonstrate the proximity between the initial ∼200 kb region and the region around 750 kb, suggesting that the initial ∼750 kb region is inverted. (D) A candidate for a possible linkage between chromosome 1 and scaffold KhL24. Chromosomes 7 (E) and 8 (F) are probably partly heterozygous in the animal used for PacBio RSII sequencing. Arrows indicate contigs used for genome assembly. These sequences were aligned with Nucmer and visualized with Mummer Plot (Kurtz et al. 2004). Forward alignments are shown in red and reverse alignments are shown in blue.
Basic Statistics of the Present and Previous Assemblies
| | HT Assembly (Present) | KH Assembly (Previous) |
|---|---|---|
| Total nucleotide length (bp) | 122,951,598 | 112,162,187 |
| Total nucleotide length including “N” length (bp) | 122,991,600 | 115,226,814 |
| Number of chromosomes | 14 | 14 |
| Number of contigs/scaffolds that are not included in chromosomes | 53 | 1,258 |
| N50 (bp) | 8,327,059 | 5,152,901 |
| L50 | 6 | 9 |
| N90 (bp) | 4,872,821 | 40,806 |
| L90 | 13 | 196 |
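
The N50/L50 and N90/L90 rows follow the usual definition: with sequences sorted from longest to shortest, Nx is the length of the sequence at which the running total first reaches x% of the total assembly length, and Lx is the number of sequences needed to get there. The sketch below illustrates that definition on made-up lengths. The function name `nx_lx` and the toy data are hypothetical, and the source does not state exactly which set of sequences the authors used for these statistics.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of sequence lengths.

    fraction=0.5 gives N50/L50, fraction=0.9 gives N90/L90.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest sequence first
    threshold = fraction * sum(ordered)      # share of the assembly to cover
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            # 'length' is Nx; 'count' sequences were needed, so it is Lx
            return length, count


# Toy lengths (illustrative only, not taken from either assembly)
toy_lengths = [100, 80, 60, 40, 20, 10]
n50, l50 = nx_lx(toy_lengths, 0.5)
n90, l90 = nx_lx(toy_lengths, 0.9)
print(n50, l50, n90, l90)
```

A larger Nx with a smaller Lx means the same share of the assembly sits in fewer, longer sequences, which is the direction of change reported in the table for the HT assembly relative to the KH assembly.
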
Evaluation of Present and Previous Versions of the Assemblies and the Gene Model Sets Using BUSCO
| | Found: Complete | Found: Fragmented | Missing |
|---|---|---|---|
| Genome | | | |
| HT assembly (new) | 94.6% | 0.7% | 4.7% |
| KH assembly (old) | 93.0% | 1.2% | 5.8% |
| Gene models | | | |
| KY models (new) | 95.6% | 1.3% | 3.1% |
| KH models (old) | 95.0% | 1.7% | 3.3% |
List of Chromosomes and Unassembled Contigs that Contain Telomeres, 18S/28S RNA Genes, and SL RNA Genes
| | Chromosomes/Unassembled Contigs |
|---|---|
| Chromosomes with telomeres in both ends | Chromosomes 3, 9, 14 |
| Chromosomes with telomeres in either end | Chromosomes 4, 5, 6, 7, 10, 12, 13 |
| Contigs containing 18S/28S RNA genes | UAContigs 2, 6, 7, 13, 17, 22, 28, 31, 32, 33, 34, 36, 38, 39, 41, 47, 49, 51, 53 |
| Chromosomes/contigs containing SL RNA gene clusters | Chromosome 8, UAContigs 11, 12 |
Fig. 3.—Possible inversions between chromosomes of two Ciona species. Genes of the two Ciona species are shown as black lines in the upper (HT chromosomes) and lower rows (C. savignyi reftigs) along genomic regions indicated above and below the rows. Colinear genomic blocks are shown with colored arrows, and 5′-ends of putative orthologous genes are linked by lines of the same color. Putative inversions are indicated with black arrows. Single inversions can explain the gene arrangements in (A) and (C). Two inversions can explain the gene arrangement in (D). (B) A dot plot represents the rank order position of orthologous gene pairs in chromosome 10 in the HT assembly and reftig 37 of C. savignyi. The region shown in a higher-magnification view includes genes shown in (A).
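
On a rank-order dot plot such as the one described for panel (B), a candidate inversion appears as a run of orthologous gene pairs whose rank rises in one genome while it falls in the other. The sketch below is a toy illustration of that idea and not the authors' procedure. The function name `reversed_blocks`, the `min_genes` cutoff, and the example ranks are all hypothetical, and the strict step of exactly one rank is a simplifying assumption that real data with gene gains and losses would not satisfy.

```python
def reversed_blocks(ranks_b, min_genes=3):
    """Find runs of consecutively decreasing ranks.

    ranks_b lists, in genome-A gene order, the rank of each ortholog in
    genome B. Returns (start_index, end_index) pairs, inclusive, for runs
    of at least min_genes genes where each rank is one less than the last.
    """
    blocks = []
    start = 0
    for i in range(1, len(ranks_b) + 1):
        # Extend the current run while the rank keeps dropping by one
        if i < len(ranks_b) and ranks_b[i] == ranks_b[i - 1] - 1:
            continue
        # Run ended at index i - 1; keep it if it is long enough
        if i - start >= min_genes:
            blocks.append((start, i - 1))
        start = i
    return blocks


# Toy example: genes at positions 3..6 appear in reverse order in genome B
toy_ranks = [1, 2, 3, 7, 6, 5, 4, 8, 9]
print(reversed_blocks(toy_ranks))
```

In practice a tolerance for missing or extra orthologs would be needed, since the strict step of one rank breaks a block at every gene without a counterpart in the other genome.
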