"Mind the Gap": Hi-C Technology Boosts Contiguity of the Globe Artichoke Genome in Low-Recombination Regions.

Alberto Acquadro1, Ezio Portis1, Danila Valentino1, Lorenzo Barchi2, Sergio Lanteri1.   

Abstract

Globe artichoke (Cynara cardunculus var. scolymus; 2n = 2x = 34) is grown largely in the Mediterranean region, with Italy the leading world producer; however, over time, its cultivation has spread to the Americas and China. In 2016, we released the first (v1.0) globe artichoke genome sequence (http://www.artichokegenome.unito.it/). Its assembly was generated using ∼133-fold Illumina sequencing data, covering 725 of the 1,084 Mb genome, of which 526 Mb (73%) were anchored to 17 chromosomal pseudomolecules. Based on v1.0 sequencing data, we generated a new genome assembly (v2.0) from a Hi-C (Dovetail) genomic library, which improves the scaffold N50 from 126 kb to 44.8 Mb (∼356-fold increase) and the N90 from 29 kb to 17.8 Mb (∼685-fold increase). While the L90 of the v1.0 sequence included 6,123 scaffolds, that of the new v2.0 includes just 15 super-scaffolds, a number close to the haploid chromosome number of the species. The newly generated super-scaffolds were assigned to pseudomolecules using reciprocal BLAST procedures. The cumulative size of unplaced scaffolds in v2.0 was reduced by 165 Mb, increasing the anchored genome sequence to 94%. The marked improvement is mainly attributable to the ability of the proximity ligation-based approach to deal with both heterochromatic (e.g., peri-centromeric) and euchromatic regions during the assembly procedure, which made it possible to physically locate low-recombination regions. The new high-quality reference genome enhances the taxonomic breadth of the data available for comparative plant genomics and led to a new, accurate gene prediction (28,632 genes), thus promoting the map-based cloning of economically important genes.
Copyright © 2020 Acquadro et al.

Keywords:  Cynara cardunculus; Genomics; HI-C libraries; NGS

Year:  2020        PMID: 32817122      PMCID: PMC7534446          DOI: 10.1534/g3.120.401446

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Globe artichoke (Cynara cardunculus var. scolymus) is native to the Mediterranean region, where it is widely grown for its edible immature inflorescences, with Italy the leading world producer (about 388K tons in 2017) (FAO). Immigrants introduced this crop to the Americas, and more recently its cultivation has spread to the eastern part of the world (e.g., China). C. cardunculus includes two further taxa: the cultivated cardoon (var. altilis), grown for the production of fleshy stems (Portis ), and wild cardoon (var. sylvestris), the progenitor of both cultivated forms (Portis ; Mauro ). The three taxa are exploited for the production of a number of nutraceutically and pharmaceutically active compounds, such as phenylpropanoids (Pandino ) and sesquiterpene lactones (cynaropicrin and grosheimin) (Eljounaidi ); cultivated cardoon in particular is a source of both ligno-cellulosic biomass and seed oil for edible and biofuel uses (Portis ). The continuous evolution of Next Generation Sequencing (NGS) technologies is driving data production and analysis, and massively parallel sequencing has proven revolutionary, shifting the paradigm of genomics to address biological questions at a genome-wide scale (Koboldt ). Today, in the case of relatively small genomes (e.g., bacterial or viral), complete genome sequences can frequently be reconstructed computationally; however, the reconstruction of large and complex eukaryotic genomes, such as those of plants, continues to pose significant challenges (Ghurye and Pop 2019). Short-read technology (e.g., Illumina) is generally combined with long-read sequencing technologies, such as single-molecule real-time sequencing (SMRT, Pacific Biosciences) or nanopore sequencing (Oxford Nanopore Technologies).
Furthermore, with the goal of improving assembly quality, cutting-edge scaffolding technologies such as linked reads (10X Genomics), optical mapping (Bionano Genomics) and proximity ligation methods (Hi-C, Dovetail Genomics) are adopted. Hi-C is a proximity ligation-based method, which relies on the fact that, after fixation, segments of DNA in close proximity in the nucleus are more likely to be ligated together and sequenced as pairs than more distant regions. As a result, the number of read pairs between intra-chromosomal regions is a slowly decreasing function of the genomic distance between them. Furthermore, Hi-C could theoretically score the contact frequency between virtually any pair of genomic loci (Lieberman-Aiden ). Globe artichoke harbors a highly heterozygous genetic background, which hampers the production of a reference assembly. We developed an inbred genotype with 10% residual heterozygosity, from which we released the first globe artichoke genome sequence (Scaglione ). The assembly (v1.0) was generated using ∼133-fold Illumina sequencing data and covered 725 of the 1,084 Mb genome. Through genetic mapping, we anchored 526 Mb (73%) of the genome sequence to 17 chromosomal pseudomolecules, although ∼199 Mb (27%) remained unplaced. More recently, we released an improved annotation (v1.1) of the v1.0 assembly and the genome sequences of four globe artichoke genotypes (Acquadro ), as well as a genotype of cultivated cardoon. Here we report on a new reference genome (v2.0), obtained by sequencing a Hi-C genomic library and assembling the data together with previously generated sequence datasets. This new chromosome-level version is characterized by high contiguity and drastically reduces the number of unplaced scaffolds.

Materials and methods

Hi-C Library preparation, sequencing and assembling

Fresh etiolated leaves of a globe artichoke inbred line (2C), from which we generated the reference genome (Scaglione ), were provided to Dovetail Genomics (https://dovetailgenomics.com). DNA was extracted from leaf samples and used to construct a Hi-C library following the manufacturer's protocols (Putnam ). The Hi-C library was then quality checked through sequencing (2M PE 75 bp reads, Illumina MiSeq), and reads were mapped back to the draft assembly. Afterward, extensive sequencing was performed with an Illumina HiSeq X instrument (PE 150 bp read chemistry). Hi-C data, as well as 20-30X shotgun data (project PRJNA238069), were used in the HiRise pipeline (https://github.com/DovetailGenomics/HiRise_July2015_GR) to scaffold the input assembly (v1.0), adopting standard procedures. BlastN was used to reconcile super-scaffolds with the pseudomolecule nomenclature (Scaglione ).

Gene prediction

The new assembly was masked using RepeatMasker (Smit ) with a combination of homology-based and de novo approaches. After a soft-masking step, gene prediction was performed using Maker-P (Campbell ). Augustus (Stanke ) Hidden Markov Models and SNAP (Bromberg and Rost 2007) gene prediction algorithms were combined with artichoke transcripts available in NCBI and protein alignments as evidence to support prediction. All predicted gene models were filtered to retain only those with an AED ≤ 0.35; the AED (annotation edit distance) measures the concordance between a predicted model and its supporting evidence, with lower AED values indicating more reliable models. For each predicted gene, function was assigned by a BlastP (Altschul ) search against the Uniprot/Swissprot Viridiplantae database (The UniProt Consortium 2014), using default parameters except for the e-value (< 1e-5). The sequences of the predicted proteins were also annotated using InterProScan (v. 5.33-72.0; Jones ) against all the available databases (ProSitePro 2018_02 (Sigrist ), PANTHER-12 (Mi ), Coils-2.2.1 (Lupas ), PIRSF-3.02 (Wu ), Hamap-2018_3 (Lima ), Pfam-32 (Punta ), ProSitePatterns 2018_02 (Sigrist ), SUPERFAMILY-1.75 (de Lima Morais ), ProDom-2006.1 (Bru ), SMART-7.1 (Letunic ), Gene3D-4.2 (Lees ) and TIGRFAM-15 (Haft )). The MIReNA (Mathelier and Carbone 2010) software was used to identify high-confidence miRNA-coding sequences (miRBase release 21 (Kozomara and Griffiths-Jones 2011): high confidence database). A homology search was conducted with known miRNAs from an array of 13 species (plants and algae): Solanum lycopersicum, Solanum tuberosum, Nicotiana tabacum, Vitis vinifera, Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Medicago truncatula, Zea mays, Picea abies, Triticum aestivum, Physcomitrella patens and Chlamydomonas reinhardtii.
MIReNA was run with default parameters and the maximum number of allowed mismatches between known miRNAs and putative miRNAs was set to 10.
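
The AED filtering step described above can be illustrated with a minimal Python sketch. It assumes the gene models are stored in GFF3 and that each mRNA feature carries its AED in an `_AED` attribute, as MAKER output typically does; the function names and the example records are hypothetical and are not taken from the pipeline actually used.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column into a {key: value} dictionary."""
    pairs = (field.split("=", 1) for field in column9.strip().split(";") if "=" in field)
    return {key.strip(): value.strip() for key, value in pairs}


def filter_mrna_by_aed(gff_lines, max_aed=0.35):
    """Return the IDs of mRNA features whose _AED value is <= max_aed."""
    kept = []
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF3 directives/comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue  # only transcript-level records carry the AED here
        attrs = parse_attributes(cols[8])
        if "_AED" in attrs and float(attrs["_AED"]) <= max_aed:
            kept.append(attrs.get("ID"))
    return kept


# Hypothetical records, for illustration only
example = [
    "##gff-version 3",
    "chr1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=mRNA_a;Parent=gene_a;_AED=0.12",
    "chr1\tmaker\tmRNA\t1500\t2400\t.\t-\t.\tID=mRNA_b;Parent=gene_b;_AED=0.48",
    "chr2\tmaker\tmRNA\t300\t1200\t.\t+\t.\tID=mRNA_c;Parent=gene_c;_AED=0.35",
]
kept_ids = filter_mrna_by_aed(example)
print(kept_ids)  # ['mRNA_a', 'mRNA_c']
```

The threshold is inclusive, matching the "AED ≤ 0.35" criterion stated in the text.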

Genome integrity and completeness

The QUAST pipeline (Mikheenko ), which includes the BUSCO software (Simão ), was used to compare the new and previous versions of the genome. The plant dataset (Embryophyta, odb9) was downloaded from BUSCO (Simão ) and manually added to the QUAST pipeline. The different versions of the assembled globe artichoke genome were compared by retrieving co-linear blocks with the LAST aligner (Kiełbasa ). Only blocks with a pairwise identity of at least 99% were plotted using the Circos tool (Krzywinski ).
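
As an illustration of the 99% identity cut-off applied to co-linear blocks, the following Python sketch computes the percent identity of a pairwise alignment and keeps only blocks at or above the threshold. The identity definition used here (matching columns divided by all alignment columns, gaps included) is an assumption, since the text does not state how identity was computed, and the block names and sequences are invented.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"  # gap columns never count as matches
    )
    return 100.0 * matches / len(aligned_a)


def keep_blocks(blocks, min_identity=99.0):
    """Return names of blocks whose pairwise identity is >= min_identity."""
    return [name for name, a, b in blocks if percent_identity(a, b) >= min_identity]


# Hypothetical 200-column alignment blocks
blocks = [
    ("block_1", "ACGT" * 50, "ACGT" * 50),            # identical
    ("block_2", "ACGT" * 50, "ACGT" * 49 + "ACGA"),   # one mismatch
    ("block_3", "ACGT" * 50, "ACGT" * 47 + "AAAA" * 3),  # nine mismatches
]
kept = keep_blocks(blocks)
print(kept)  # ['block_1', 'block_2']
```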

Data availability

Raw reads are publicly available in the NCBI Sequence Read Archive under BioProject PRJNA238069. The reference assembly (v2.0) and annotation data are available for download from http://www.artichokegenome.unito.it.

Results and discussion

Sequencing, assembling and metrics

We developed a new genome assembly (v2.0) using Hi-C technology, which combines proximity ligation and massively parallel sequencing to probe the three-dimensional structure of chromosomes within the nucleus and captures interactions by paired-end sequencing (Putnam ; Ghurye ). A single genomic library was sequenced using Illumina chemistry, generating a total of 156,683,926 paired-end reads (2x150 bp; 47.01 Gbp). Hi-C reads were used in the HiRise assembly pipeline, with the existing genomic scaffolds as starting sequences (Scaglione ), and enabled an accurate assembly of the globe artichoke genome up to the chromosome level (Table 1). In all, 5,023 super-scaffolds were generated, with an average size of 144,578 bp. The largest 18 super-scaffolds were assigned to chromosomes using reciprocal BLAST procedures. The 17 pseudomolecules were reconstructed by also joining together two super-scaffolds (13,663 and 1,119) in chromosome 6.
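
The sequencing yield quoted above follows directly from the read count and read length. The short Python sketch below reproduces the arithmetic; the fold-coverage figure is a derived illustration that assumes the 1,084 Mb genome size given in the text and is not a value reported by the authors.

```python
read_pairs = 156_683_926      # paired-end reads reported in the text
read_length = 150             # bp per read (2x150 bp)
genome_size = 1_084_000_000   # estimated genome size in bp (1,084 Mb)

# Each pair contributes two reads of read_length bases
total_bases = read_pairs * 2 * read_length
yield_gbp = total_bases / 1e9
coverage = total_bases / genome_size  # derived here, not reported in the source

print(f"{total_bases:,} bp = {yield_gbp:.2f} Gbp, about {coverage:.1f}x of the genome")
```

The computed yield rounds to the 47.01 Gbp stated in the text.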
Table 1

– Metrics for the v1.0 (reference) scaffolds, the v1.0 (reference) pseudomolecules, and v2.0 (Hi-C) super-scaffolds

| Metrics | v2.0 (Hi-C) | v1.0 (pseudomolecules) | v1.0 (scaffolds) |
|---|---|---|---|
| Total assembly size | 726,213,971 | 725,337,666 | 725,334,175 |
| Number of contigs/scaffolds | 5,023 | 8,344 | 13,662 |
| Average size | 144,578 | 86,929 | 53,091 |
| N50 | 44,809,927 | 25,947,084 | 125,836 |
| L50 | 7 | 9 | 1,411 |
| N75 | 31,669,976 | 166,465 | 59,381 |
| L75 | 11 | 98 | 3,545 |
| N90 | 23,740,492 | 45,160 | 31,081 |
| L90 | 15 | 1,384 | 5,853 |
| Busco, complete genes (%) | 89.65 | 89.44 | 89.44 |
| Busco, partial genes (%) | 3.06 | 1.98 | 1.98 |
| Busco, overall (%) | 92.71 | 91.42 | 91.42 |

To assess the improvement obtained in the new assembly, a first comparison was made between the Hi-C pseudomolecules (v2.0) and the original scaffolds of v1.0. The N50 value increased from 126 kb to 44.8 Mb (∼356-fold increase), and the N90 reached 17.8 Mb compared to the original v1.0 value of 29 kb (∼685-fold increase). The large improvement of the Hi-C assembly was also highlighted by the L90 value, which dropped from 6,123 scaffolds in v1.0 to just 15 super-scaffolds, a number close to the haploid chromosome number of the species. Similarly remarkable improvements emerged when comparing the Hi-C super-scaffolds with the anchored version of the genome (v1.0, pseudomolecules plus scaffolds) (Figure 1; Table 1). As an example, the N50 value rose from ∼26 Mb in v1.0 to ∼45 Mb in v2.0, while the L90 dropped from 1,384 in v1.0 to 15 in the Hi-C assembly.
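
For readers less familiar with the contiguity metrics discussed here, the following Python sketch shows one common way of computing Nx and Lx: sequences are sorted from longest to shortest and accumulated until x% of the total assembly length is reached; Nx is the length of the last sequence added and Lx is the number of sequences needed. The function name and the `total` parameter are illustrative choices. The example uses the 17 v2.0 pseudomolecule sizes from Table 4 together with the total assembly size from Table 1, on the assumption that none of the unplaced scaffolds is long enough to rank among the first 15 sequences.

```python
def nx_lx(lengths, x, total=None):
    """Return (Nx, Lx) for a collection of sequence lengths.

    total: assembly size to use as the denominator; defaults to sum(lengths).
    """
    if total is None:
        total = sum(lengths)
    cumulative = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        cumulative += length
        # integer comparison avoids floating-point rounding at the threshold
        if cumulative * 100 >= total * x:
            return length, count
    raise ValueError("lengths do not reach the requested fraction of total")


# v2.0 pseudomolecule sizes in bp, chromosomes 1 to 17 (Table 4)
v2_chromosomes = [
    53_988_940, 75_886_343, 69_604_505, 23_740_492, 63_544_927, 24_383_717,
    18_502_611, 44_609_785, 17_815_532, 31_669_976, 34_212_861, 44_809_927,
    44_877_405, 28_499_371, 38_772_909, 30_156_653, 47_245_614,
]
v2_total = 726_213_971  # total v2.0 assembly size (Table 1)

for x in (50, 75, 90):
    print(x, nx_lx(v2_chromosomes, x, total=v2_total))
# 50 (44809927, 7)
# 75 (31669976, 11)
# 90 (23740492, 15)
```

Under these assumptions the sketch returns the N50/L50, N75/L75 and N90/L90 values listed for v2.0 in Table 1.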
Figure 1

- Contiguity improvement across the v1.0 genome (scaffolds), the v1.0 reference genome (pseudomolecules plus unplaced scaffolds) and the v2.0 genome (Hi-C super-scaffolds). Top: Nx statistics, with x varying between 1 and 100. Bottom: cumulative length of the genome as scaffolds/contigs are added.

Focusing on the unanchored portion of the genome (namely Chr0), the ∼199 Mb of unplaced sequence in v1.0, which included 8,327 scaffolds, decreased to less than 34 Mb (5,005 sequences), as ∼165 Mb (∼83%) were assigned to super-scaffolds. On the whole, the percentage of anchored genome increased to ∼94% and chromosome size grew with a mean gain of ∼36% (Table 4). The largest increase was observed in chromosome 14, whose size grew by ∼14 Mb (97%) with respect to v1.0. Some chromosomes showed scattered insertions of the newly anchored scaffolds (1, 2, 6, 9, 10, 12, 13), while in others (3, 4, 5, 7, 8, 11, 14, 15, 16, 17) distinct extensive regions (ranging from 2.9 Mb to 29.3 Mb) were anchored (Figure 2).
Table 4

- Comparison in length between v1.0 (reference) pseudomolecules and v2.0 (Hi-C) super-scaffolds. Number of genes predicted from v1.0 and v2.0 are shown and compared. The number of genes reported in Acquadro (annotation v1.1) predicted on the v1.0 assembly are also shown

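| Chromosome | Size v2.0 (bp) | Size v1.0 (bp) | Δ (bp) | Size ratio (%) v2.0/v1.0 | Genes v2.0 | Genes v1.1 | Genes v1.0 | Gene ratio (%) v2.0/v1.0 |
|---|---|---|---|---|---|---|---|---|
| 1 | 53,988,940 | 49,754,839 | 4,234,101 | 9% | 2,881 | 2,692 | 2,630 | 10% |
| 2 | 75,886,343 | 70,441,430 | 5,444,913 | 8% | 2,696 | 2,502 | 2,351 | 15% |
| 3 | 69,604,505 | 40,297,365 | 29,307,140 | 73% | 2,261 | 1,942 | 1,868 | 21% |
| 4 | 23,740,492 | 20,164,318 | 3,576,174 | 18% | 1,104 | 991 | 962 | 15% |
| 5 | 63,544,927 | 37,196,517 | 26,348,410 | 71% | 1,967 | 1,723 | 1,640 | 20% |
| 6 | 24,383,717 | 20,634,051 | 3,749,666 | 18% | 1,084 | 956 | 903 | 20% |
| 7 | 18,502,611 | 15,568,887 | 2,933,724 | 19% | 1,003 | 933 | 907 | 11% |
| 8 | 44,609,785 | 25,947,084 | 18,662,701 | 72% | 1,529 | 1,250 | 1,196 | 28% |
| 9 | 17,815,532 | 18,344,014 | −528,482 | −3% | 1,061 | 1,047 | 1,006 | 5% |
| 10 | 31,669,976 | 29,133,143 | 2,536,833 | 9% | 1,609 | 1,516 | 1,436 | 12% |
| 11 | 34,212,861 | 22,016,825 | 12,196,036 | 55% | 1,611 | 1,459 | 1,453 | 11% |
| 12 | 44,809,927 | 39,693,055 | 5,116,872 | 13% | 1,590 | 1,473 | 1,404 | 13% |
| 13 | 44,877,405 | 41,551,399 | 3,326,006 | 8% | 2,077 | 1,873 | 1,801 | 15% |
| 14 | 28,499,371 | 14,487,748 | 14,011,623 | 97% | 1,003 | 669 | 646 | 55% |
| 15 | 38,772,909 | 21,275,025 | 17,497,884 | 82% | 1,751 | 1,501 | 1,466 | 19% |
| 16 | 30,156,653 | 21,933,510 | 8,223,143 | 37% | 1,193 | 964 | 949 | 26% |
| 17 | 47,245,614 | 37,737,787 | 9,507,827 | 25% | 1,655 | 1,349 | 1,277 | 30% |
| Unplaced scaffold | 33,892,403 | 199,160,669 | −165,268,266 | −83% | 557 | 3,470 | 2,994 | −81% |
| Chromosomes | 692,321,568 | 526,176,997 | +166,144,571 | 32% | 28,075 | 24,840 | 23,895 | 17% |
| Total assembled | 726,213,971 | 725,337,666 | 876,305 | 0.12% | 28,632 | 28,310 | 26,889 | 6% |
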
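
The per-chromosome size ratios in Table 4 and the mean gain of ∼36% quoted in the text can be recomputed from the reported pseudomolecule lengths. The Python sketch below is one way to do so; the variable names are illustrative, and the mean is taken here as the unweighted average of the 17 per-chromosome percentage gains, which is an assumption about how the reported figure was derived.

```python
# chromosome: (v2.0 size in bp, v1.0 size in bp), transcribed from Table 4
sizes = {
    1: (53_988_940, 49_754_839),   2: (75_886_343, 70_441_430),
    3: (69_604_505, 40_297_365),   4: (23_740_492, 20_164_318),
    5: (63_544_927, 37_196_517),   6: (24_383_717, 20_634_051),
    7: (18_502_611, 15_568_887),   8: (44_609_785, 25_947_084),
    9: (17_815_532, 18_344_014),   10: (31_669_976, 29_133_143),
    11: (34_212_861, 22_016_825),  12: (44_809_927, 39_693_055),
    13: (44_877_405, 41_551_399),  14: (28_499_371, 14_487_748),
    15: (38_772_909, 21_275_025),  16: (30_156_653, 21_933_510),
    17: (47_245_614, 37_737_787),
}

# Percentage size gain of each v2.0 pseudomolecule relative to v1.0
gain = {chrom: 100 * (v2 - v1) / v1 for chrom, (v2, v1) in sizes.items()}
mean_gain = sum(gain.values()) / len(gain)
largest = max(gain, key=gain.get)

anchored_v2 = sum(v2 for v2, _ in sizes.values())
anchored_v1 = sum(v1 for _, v1 in sizes.values())

print(f"mean gain: {mean_gain:.1f}%")
print(f"largest gain: chromosome {largest} ({gain[largest]:.0f}%)")
print(f"anchored sequence: {anchored_v1:,} bp (v1.0) -> {anchored_v2:,} bp (v2.0)")
```

The two anchored totals computed this way agree with the "Chromosomes" row of Table 4.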
Figure 2

Circos plot depicting the syntenic relationships between the chromosomes of the globe artichoke genome (v1.0, pseudomolecules, in red) and the new assembly (v2.0, Hi-C super-scaffolds, in blue). A) chromosomes 1 to 4; B) chromosomes 5 to 8; C) chromosomes 9 to 12; D) chromosomes 13 to 17. Blue dots highlight extended regions in the v2.0 assembly in pericentromeric positions in metacentric/sub-metacentric chromosomes. Red dots highlight extended regions in the v2.0 assembly in pericentromeric positions in acrocentric/telocentric chromosomes.

Genome annotation

In the Hi-C version of the genome, the annotation pipeline predicted 28,632 genes, a higher number than that predicted in v1.0 (26,889; Scaglione ) and very close to the one we recently obtained following the genome reconstruction of globe artichoke genotypes (28,310, v1.1) (Acquadro ). The number of genes on unplaced scaffolds was just 557 (1.9% of the total), which raised the number of genes placed on pseudomolecules (+4,180, 17%). This number (557) is far lower than the number located on Chr0 in the two previous structural annotations, i.e., 2,994 (Scaglione ) and 3,471 (Acquadro ). Following BUSCO (Simão ) analysis, the proportion of represented orthologs in the Hi-C assembly (92.7%) was, as expected, just slightly higher than in the previous version (91.4%), since the contig sequences were essentially unaltered during the assembly process (data not shown). The InterProScan analyses identified at least one IPR domain in about 80% of the predicted proteins, in line with the previous v1.0 and v1.1 annotations. Among the top 20 SUPERFAMILY domains, listed in Table 2, the most abundant in all the genomes was SSF52540 (P-loop containing nucleoside triphosphate hydrolase), which is involved in several UniPathways, including chlorophyll and coenzyme A biosynthesis. The other most abundant superfamilies were SSF56112 (protein kinase-like domain), which acts in signaling and regulatory processes in the eukaryotic cell; SSF52058 (leucine-rich repeat domain, L domain-like), which is related to resistance to pathogens; and SSF48371 (Armadillo-type fold), which plays a role in defense response and translation factor activity. These findings are comparable to both the v1.0 and v1.1 annotations, suggesting that Hi-C had a greater effect on the quality of the genome sequence than on its annotation.
Table 2

- TOP20 Superfamily in the v2 annotation, after Interproscan5 analyses and compared to v1 and v1.1 annotations

| Domain | Description | v2 | v1.1 | v1.0 |
|---|---|---|---|---|
| SSF52540 | P-loop containing nucleoside triphosphate hydrolases | 1,346 | 1,347 | 1,311 |
| SSF56112 | Protein kinase-like (PK-like) | 1,310 | 1,309 | 1,303 |
| SSF52058 | L domain-like | 757 | 806 | 772 |
| SSF57850 | RING/U-box | 530 | 530 | 529 |
| SSF48371 | ARM repeat | 491 | 493 | 481 |
| SSF51735 | NAD(P)-binding Rossmann-fold domains | 441 | 443 | 427 |
| SSF48452 | TPR-like | 404 | 402 | 408 |
| SSF54928 | RNA-binding domain, RBD | 431 | 417 | 401 |
| SSF53474 | alpha/beta-Hydrolases | 390 | 397 | 391 |
| SSF48264 | Cytochrome P450 | 370 | 380 | 373 |
| SSF46689 | Homeodomain-like | 372 | 366 | 372 |
| SSF52047 | RNI-like | 292 | 295 | 296 |
| SSF53335 | S-adenosyl-L-methionine-dependent methyltransferases | 288 | 288 | 289 |
| SSF50978 | WD40 repeat-like | 278 | 281 | 281 |
| SSF52833 | Thioredoxin-like | 271 | 272 | 275 |
| SSF53756 | UDP-Glycosyltransferase/glycogen phosphorylase | 250 | 251 | 241 |
| SSF81383 | F-box domain | 240 | 238 | 241 |
| SSF49503 | Cupredoxins | 226 | 230 | 241 |
| SSF51445 | (Trans)glycosidases | 235 | 238 | 241 |

From a search against the miRBase 21 high confidence database, species-specific miRNAs were predicted. The total number of predicted non-redundant miRNAs was 144 (in 253 genome regions of the reference 2C), in line with what was previously reported for annotation v1.1 (143; Acquadro ). The identified miRNAs belong to 37 families (Table 3), slightly fewer than previously reported (Acquadro ). Nevertheless, the most highly represented miRNA families are shared between the two annotations and are conserved in many taxonomic groups, as already noted in previous studies (Cuperus ; Chávez Montes ; Barchi ).
Table 3

- miRNA families in the v2.0 annotation compared to v1.1 annotation

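| miRNA family | Annotation v2.0 | Annotation v1.1 |
|---|---|---|
| 156 | 14 | 15 |
| 7699 | 13 | 14 |
| 166 | 18 | 13 |
| 172 | 7 | 9 |
| 399 | 10 | 8 |
| 396 | 8 | 7 |
| 169 | 10 | 6 |
| 393 | 3 | 6 |
| 160 | 4 | 5 |
| 164 | 3 | 5 |
| 171 | 8 | 5 |
| 167 | 3 | 3 |
| 168 | 4 | 3 |
| 319 | 9 | 3 |
| 394 | 3 | 3 |
| 159 | 3 | 2 |
| 390 | 1 | 2 |
| 403 | 2 | 2 |
| 444 | 1 | 2 |
| 479 | 0 | 2 |
| 1030 | 0 | 2 |
| 1446 | 1 | 2 |
| 2630 | 3 | 2 |
| 157 | 1 | 1 |
| 397 | 1 | 1 |
| 398 | 1 | 1 |
| 408 | 0 | 1 |
| 530 | 1 | 1 |
| 824 | 0 | 1 |
| 837 | 1 | 1 |
| 902 | 0 | 1 |
| 1155 | 1 | 1 |
| 2079 | 0 | 1 |
| 2651 | 1 | 1 |
| 2657 | 0 | 1 |
| 2658 | 1 | 1 |
| 2673 | 0 | 1 |
| 2680 | 0 | 1 |
| 3633 | 0 | 1 |
| 4414 | 1 | 1 |
| 5254 | 1 | 1 |
| 5258 | 1 | 1 |
| 5559 | 0 | 1 |
| 5751 | 0 | 1 |
| 7696 | 1 | 1 |
| 1040 | 1 | 0 |
| 1044 | 1 | 0 |
| 5237 | 1 | 0 |
| 6463 | 1 | 0 |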
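
The totals quoted in the text can be cross-checked against Table 3 with a short Python sketch. The counts below are transcribed from the table, reading each row as miRNA family, v2.0 count, v1.1 count; the variable names are illustrative.

```python
# miRNA family: (count in annotation v2.0, count in annotation v1.1)
mirna_counts = {
    156: (14, 15), 7699: (13, 14), 166: (18, 13), 172: (7, 9),
    399: (10, 8), 396: (8, 7), 169: (10, 6), 393: (3, 6),
    160: (4, 5), 164: (3, 5), 171: (8, 5), 167: (3, 3),
    168: (4, 3), 319: (9, 3), 394: (3, 3), 159: (3, 2),
    390: (1, 2), 403: (2, 2), 444: (1, 2), 479: (0, 2),
    1030: (0, 2), 1446: (1, 2), 2630: (3, 2), 157: (1, 1),
    397: (1, 1), 398: (1, 1), 408: (0, 1), 530: (1, 1),
    824: (0, 1), 837: (1, 1), 902: (0, 1), 1155: (1, 1),
    2079: (0, 1), 2651: (1, 1), 2657: (0, 1), 2658: (1, 1),
    2673: (0, 1), 2680: (0, 1), 3633: (0, 1), 4414: (1, 1),
    5254: (1, 1), 5258: (1, 1), 5559: (0, 1), 5751: (0, 1),
    7696: (1, 1), 1040: (1, 0), 1044: (1, 0), 5237: (1, 0),
    6463: (1, 0),
}

total_v2 = sum(v2 for v2, _ in mirna_counts.values())
total_v11 = sum(v11 for _, v11 in mirna_counts.values())
families_v2 = sum(1 for v2, _ in mirna_counts.values() if v2 > 0)
families_v11 = sum(1 for _, v11 in mirna_counts.values() if v11 > 0)
shared = sum(1 for v2, v11 in mirna_counts.values() if v2 > 0 and v11 > 0)

print(total_v2, total_v11)        # 144 143
print(families_v2, families_v11)  # 37 45
print(shared)                     # 33
```

The sums agree with the 144 (v2.0) and 143 (v1.1) non-redundant miRNAs and the 37 v2.0 families reported in the text.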

Mis-assembly level and co-linearity among assemblies

Hi-C increased the size of the anchored genome by about 30%, and accordingly the majority of the newly assembled chromosomes increased in size (Table 4). In particular, chromosomes 3, 5, 8, 11, 14 and 15 expanded by at least 50% compared to v1.0 (Figure 2). The QUAST (Gurevich ) analysis highlighted that 4,727 scaffolds were mis-assembled. The mis-assemblies were grouped into 3,553 relocations on the same pseudomolecule, 1,157 translocations and 17 inversions. A more in-depth analysis showed that the mis-assembled scaffolds corresponded to just 54.6 Mb of genomic sequence, made up of small fragments (average ∼11.6 kb, median ∼6.1 kb). Relocations involved ∼41.9 Mb (average ∼11.8 kb, median ∼6.6 kb), inversions ∼0.2 Mb (average ∼12.1 kb, median ∼11.9 kb) and translocations ∼12.4 Mb (average ∼10.8 kb, median ∼4.3 kb). The Hi-C assembly and v1.0 of the globe artichoke genome assembly (pseudomolecules plus unplaced scaffolds) were highly co-linear (Figure 2). The remarkable improvement in size of the Hi-C assembly is attributable to the ability of the proximity ligation-based approach to deal with heterochromatic (pericentromeric and telomeric) regions. These regions are characterized by a low recombination rate, low gene density and high TE accumulation (Nachman 2002), so their analysis is a tough task (Zhang ) when a classical genetic mapping approach relying on the recombination rate (Scaglione ) is used. This was the case for the v1.0 genome assembly, whereas v2.0 was based on proximity ligation technology, which does not depend on the recombination rate. The case of chromosomes 3, 5, 8 and 14 is emblematic. A clear unaligned region ("extended gap") was present in their metacentric/sub-metacentric region in version 1.0, which in chromosomes 3 and 5 spanned up to 30 Mb.
Similarly, in the terminal region of chromosomes 11 and 15, which in a previous study (Scaglione ) appeared to be telocentric/acrocentric on the basis of their gene frequency, some scaffolds were missing in v1.0 but correctly assigned in v2.0. This is supported by the fact that the gene frequency of the newly placed scaffolds in the v2.0 assembly was just 29 genes/Mb, far lower than the average gene frequency detected in both v1.0 and v2.0 (45 genes/Mb), and that the large newly extended regions in chromosomes 3, 5, 8, 11, 14 and 15 showed an even lower gene frequency (16 genes/Mb, see Figure 3).
Figure 3

Gene frequency, expressed as number of genes/Mb, calculated at chromosome level for the v1.0 genome (light blue bars), the v2.0 genome (white bars) and the newly extended regions. Blue arrows show newly extended regions in the v2.0 assembly in pericentromeric positions in metacentric/sub-metacentric-like chromosomes. Red arrows highlight newly extended regions in the v2.0 assembly in pericentromeric positions in acrocentric/telocentric-like chromosomes.

References (10 of 38 shown)

1. Cathy H Wu; Anastasia Nikolskaya; Hongzhan Huang; Lai-Su L Yeh; Darren A Natale; C R Vinayaka; Zhang-Zhi Hu; Raja Mazumder; Sandeep Kumar; Panagiotis Kourtesis; Robert S Ledley; Baris E Suzek; Leslie Arminski; Yongxing Chen; Jian Zhang; Jorge Louie Cardenas; Sehee Chung; Jorge Castro-Alvear; Georgi Dinkov; Winona C Barker. PIRSF: family classification system at the Protein Information Resource. Nucleic Acids Res (2004-01-01).

2. Anthony Mathelier; Alessandra Carbone. MIReNA: finding microRNAs with high accuracy and no learning at genome scale and from deep sequencing data. Bioinformatics (2010-06-30).

3. Josh T Cuperus; Noah Fahlgren; James C Carrington. Evolution and functional diversification of MIRNA genes. Plant Cell (2011-02-11).

4. Daniel C Koboldt; Karyn Meltz Steinberg; David E Larson; Richard K Wilson; Elaine R Mardis. The next-generation sequencing revolution and its impact on genomics. Cell (2013-09-26).

5. Michael S Campbell; MeiYee Law; Carson Holt; Joshua C Stein; Gaurav D Moghe; David E Hufnagel; Jikai Lei; Rujira Achawanantakun; Dian Jiao; Carolyn J Lawrence; Doreen Ware; Shin-Han Shiu; Kevin L Childs; Yanni Sun; Ning Jiang; Mark Yandell. MAKER-P: a tool kit for the rapid creation, management, and quality control of plant genome annotations. Plant Physiol (2013-12-04).

6. The UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res (2014-10-27).

7. Philip Jones; David Binns; Hsin-Yu Chang; Matthew Fraser; Weizhong Li; Craig McAnulla; Hamish McWilliam; John Maslen; Alex Mitchell; Gift Nuka; Sebastien Pesseat; Antony F Quinn; Amaia Sangrador-Vegas; Maxim Scheremetjew; Siew-Yit Yong; Rodrigo Lopez; Sarah Hunter. InterProScan 5: genome-scale protein function classification. Bioinformatics (2014-01-21).

8. Jay Ghurye; Mihai Pop; Sergey Koren; Derek Bickhart; Chen-Shan Chin. Scaffolding of long read assemblies using long range contact information. BMC Genomics (2017-07-12).

9. Alberto Acquadro; Lorenzo Barchi; Ezio Portis; Giulio Mangino; Danila Valentino; Giovanni Mauromicale; Sergio Lanteri. Genome reconstruction in Cynara cardunculus taxa gains access to chromosome-scale DNA variation. Sci Rep (2017-07-17).

10. Alla Mikheenko; Andrey Prjibelski; Vladislav Saveliev; Dmitry Antipov; Alexey Gurevich. Versatile genome assembly evaluation with QUAST-LG. Bioinformatics (2018-07-01).
