
RPAN: rice pan-genome browser for ∼3000 rice genomes.

Chen Sun¹,², Zhiqiang Hu¹,², Tianqing Zheng³, Kuangchen Lu¹, Yue Zhao¹, Wensheng Wang³, Jianxin Shi⁴, Chunchao Wang³, Jinyuan Lu¹, Dabing Zhang⁴,⁵, Zhikang Li⁶, Chaochun Wei⁷,².

Abstract

A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k.

Year: 2016 PMID: 27940610 PMCID: PMC5314802 DOI: 10.1093/nar/gkw958

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Next-generation sequencing technologies have opened the possibility of sequencing a large number of individuals from one species, as in the 1001 Genomes Project for Arabidopsis thaliana (1), the 1000 Genomes Project for human (2) and the 3000 Rice Genomes Project (3K RGP) for Oryza sativa L. (rice) (3). The pan-genome, a concept first introduced in 2005 in a study of many genomes of a bacterial species (4), is becoming prevalent in studies of bacteria and archaea, which have small genomes (5). The pan-genome of a species consists of a ‘core genome’ that contains genes present in all individuals and a ‘distributed genome’ that comprises genes not shared by all individuals. In recent years, pan-genome analysis has also been applied successfully to eukaryotes with large genomes, such as human (6), rice (7,8), soybean (9) and maize (10).

The 3K RGP generated sequencing data for >3000 rice accessions with a mean coverage of 14.3×. Organizing, visualizing and analyzing >3000 genomes are major challenges, but they are of great significance, especially for rice breeding. A user-friendly visualization tool for the rice pan-genome is therefore in great demand.

One of the most widely used genome visualization tools, the University of California Santa Cruz Genome Browser (UCSC Genome Browser) (11,12), takes a single genome as the reference and bases its visualization on that individual genome. Several visualization tools for pan-genomes or comparative genomics have been developed recently, including stand-alone programs, such as Pan-Tetris (13), GenPlay Multi-Genome (14) and JContextExplorer (15), and web-based browsers, such as PopGeV (16), the browser for the UK10K-cohorts project (17) and GPAC (18). These tools were developed to meet specific needs. Pan-Tetris visualizes gene occurrences for bacteria and represents each gene as a unicoloured glyph. GenPlay Multi-Genome visualizes data at single-base resolution and compares allele-specific expression and functional genomic data for multiple closely related genomes. However, with thousands of genomes, creating the meta-reference genome via multiple sequence alignment is prohibitively expensive, so GenPlay Multi-Genome is not suited to the rice pan-genome. JContextExplorer is a tree-based approach for comparing cross-species bacterial genomes. PopGeV is a web-based large-scale population genome browser that mainly displays SNP and InDel data. The browser for the UK10K-cohorts project retrieves genotype-phenotype associations from the data, and GPAC visualizes multiple genome-level changes. Nevertheless, none of these genome browsers visualizes genomes with a pan-genome approach (i.e. organizing a large number of genomes from a species and visualizing them together as the genome of that species) or displays the presence/absence variation (PAV) of genes, and none is specific to the rice pan-genome.

Here we present the Rice Pan-genome Browser (RPAN), an interactive web-based pan-genome browser (Supplementary Figure S1) for the rice pan-genome created from the 3K RGP. The browser contains information on the 3010 sequenced rice accessions, genomic sequences, gene annotations and gene expression data. It also provides several search functions, such as searching the PAV information with a list of rice accessions or gene IDs, and searching specific DNA sequences against the rice pan-genome. The data resources and tools provided by RPAN will accelerate both basic research in rice biology and applied efforts in rice breeding.

MATERIALS AND METHODS

The sequencing data of 3010 rice accessions were acquired from the 3K RGP (3), and the pan-genome sequences were constructed on the basis of the IRGSP-1.0 genome, a widely used rice reference genome with high-quality annotations. The total size of the compressed raw sequencing data was about 15 TB.

Pan-genome construction

The raw sequencing data of each accession were first assembled with SOAPdenovo version r240 (19), and the assembled contigs longer than 500 bp were then aligned to the IRGSP genome with the nucmer tool in the MUMmer package version 3.23 (20). Among the unaligned contigs, redundant sequences (identity threshold: 90%) were removed with CD-HIT version 4.6.1 (21). Next, contaminants were removed by searching against the NT database with NCBI-blast (version 2.2.28+) (22). All-vs-all alignments with NCBI-blast were then carried out to ensure that no redundancy remained. The remaining contigs formed the non-redundant novel sequence dataset (identity <0.9 in comparison to IRGSP version 1.0, and identity <0.9 among all novel sequences). The resulting non-redundant unaligned sequences were then categorized into 12 groups according to the classification of their source rice accessions, which was predefined by population structure from SNP analysis (23,24). These groups comprise five subgroups (IG1, IG2, IG3, IG4 and IG5) of subspecies Indica, AUSG6, four subgroups (JG7, JG8, JG9 and JG10) of subspecies Japonica, AROG11 and admixtures (Adm). All contigs from the same varietal subgroup were then concatenated, with 100 consecutive Ns as delimiters. Finally, the IRGSP genome (373 Mbp) and these novel sequences (268 Mbp) were merged as the reference pan-genome of rice (Supplementary Figure S2).
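
The final concatenation step can be illustrated with a minimal Python sketch. It is not the pipeline used to build RPAN: the function name, the (accession, sequence) input layout and the toy accession IDs are assumptions made for illustration, and only the 100-N delimiter and the per-subgroup grouping come from the description above.

```python
from collections import defaultdict

# 100 consecutive Ns used as the delimiter between contigs (from the text).
N_SPACER = "N" * 100


def build_pseudo_chromosomes(contigs, subgroup_of):
    """Concatenate novel contigs into one sequence per varietal subgroup.

    contigs: iterable of (accession_id, sequence) pairs (hypothetical layout).
    subgroup_of: dict mapping accession_id -> subgroup label such as "IG1".
    Returns a dict mapping subgroup label -> concatenated sequence.
    """
    grouped = defaultdict(list)
    for accession_id, sequence in contigs:
        grouped[subgroup_of[accession_id]].append(sequence)
    # Join the contigs of each subgroup with the 100-N delimiter.
    return {group: N_SPACER.join(seqs) for group, seqs in grouped.items()}


# Toy input; the accession IDs and sequences are made up.
contigs = [("acc1", "ACGT"), ("acc2", "GGCC"), ("acc3", "TTAA")]
subgroup_of = {"acc1": "IG1", "acc2": "IG1", "acc3": "JG7"}
pseudo = build_pseudo_chromosomes(contigs, subgroup_of)
```

In this toy input the two IG1 contigs end up in one sequence separated by 100 Ns, while the single JG7 contig is left unchanged.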

Pan-genome annotation

The protein coding genes on unaligned sequences were predicted by MAKER-P and the gene/transcript annotation of the IRGSP-1.0 genome was downloaded from the Rice Annotation Project (RAP) (25).

PAV determination

The coverage of a gene was used to determine its presence/absence. First, all raw reads of a rice accession were mapped to the pan-genome sequences by ‘bwa mem’ of BWA version 0.7.10 (26). Coverage was then calculated with samtools version 0.1.19 (27), and a gene with coding region coverage >0.95 and gene region coverage >0.85 was considered present in the accession.
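
The presence rule above reduces to two threshold comparisons. The sketch below assumes that the coding-region and gene-region coverage fractions have already been computed per accession; the function names and the input dictionary are hypothetical, and only the two thresholds are taken from the text.

```python
CDS_MIN = 0.95   # coding-region coverage threshold (from the text)
GENE_MIN = 0.85  # gene-region coverage threshold (from the text)


def is_present(cds_coverage, gene_coverage):
    """Return True if a gene is called present under the two-threshold rule."""
    return cds_coverage > CDS_MIN and gene_coverage > GENE_MIN


def pav_calls(coverages):
    """Map each accession to a presence (True) or absence (False) call.

    coverages: dict accession_id -> (cds_coverage, gene_coverage), both given
    as fractions between 0 and 1 (hypothetical input layout).
    """
    return {acc: is_present(cds, gene) for acc, (cds, gene) in coverages.items()}


# Toy coverage values for one gene in three made-up accessions.
calls = pav_calls({
    "acc1": (0.99, 0.97),  # both thresholds passed
    "acc2": (0.99, 0.80),  # gene-region coverage too low
    "acc3": (0.90, 0.95),  # coding-region coverage too low
})
```

Both conditions must hold, so a gene whose coding region is fully covered can still be called absent if too little of the whole gene region is covered.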

Gene categorization

Genes were categorized according to their presence/absence across accessions. To reduce the influence of low sequencing coverage, the presence/absence of a gene was checked only in the 453 high-quality accessions whose sequencing depths were >20× and mapping depths were >15×. Core genes are those present in all 453 rice accessions, and all other genes are initially treated as distributed. To reduce the false discovery rate of distributed genes, only genes absent in >1% of accessions (binomial tests, P-value < 0.05, null hypothesis: ‘A non-distributed gene is absent in <1% of accessions’) were kept as distributed, while the others (which we call candidate core genes) were removed from the distributed gene set. A distributed gene can be further categorized as a subspecies-unbalanced gene (whose frequency in one or more subspecies is significantly higher than that in the other subspecies; Fisher tests, FDR < 0.05) or a subspecies-specific gene (present in one subspecies but absent in all other subspecies). Among subspecies-unbalanced genes, a gene that is significantly (5%) more frequent in Indica (Japonica) accessions than in Japonica (Indica) accessions is called an Indica (Japonica) dominant gene. Similarly, within each subspecies, a gene can be further categorized as a sub-group unbalanced gene, whose frequency in one or more sub-groups of the subspecies is significantly higher than its frequencies in the other sub-groups of that subspecies. Finally, a distributed gene is called a random gene if it is neither subspecies unbalanced nor sub-group unbalanced; that is, it shows no significant difference in frequency among subspecies or among the sub-groups within a subspecies.
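
The core / candidate core / distributed split can be sketched as follows. The text gives the null proportion (1%), the significance level (0.05) and the number of high-quality accessions (453), but not the exact form of the test, so this sketch assumes a one-sided exact binomial test on the number of accessions in which a gene is absent. The function names are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    lower = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return max(0.0, 1.0 - lower)


def categorize(absent_count, n_accessions=453, null_rate=0.01, alpha=0.05):
    """Classify a gene from the number of accessions in which it is absent.

    Assumption: a one-sided test of 'absence rate is at most null_rate'.
    """
    if absent_count == 0:
        return "core"
    p_value = binomial_upper_tail(absent_count, n_accessions, null_rate)
    return "distributed" if p_value < alpha else "candidate core"


labels = [categorize(k) for k in (0, 2, 50)]
```

Under this assumption a gene missing from only a couple of the 453 accessions is not separable from a core gene affected by coverage noise, which is the situation the candidate core category is meant to capture.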

Expression data

In total, 226 runs of RNA-seq data from diverse rice tissues, including seedling shoots, booting panicles, callus and four-leaf stage shoots, were collected from all available public databases (see detailed information in Supplementary Table S1). The RNA-seq data were first trimmed with Trimmomatic version 0.32 (28) with parameters ‘ILLUMINACLIP:2:30:10 LEADING:20 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:36’, which yielded 3.4 TB of clean RNA-seq data. The clean reads were then aligned to the rice pan-genome with HISAT2 version 2.0.1-beta (29) with default parameters. The alignment results were converted, sorted and stored in .bam format with samtools version 1.2 (27). The coverage of each gene was calculated with ‘bedtools coverage’ in the bedtools suite version 2.17.0 (30).

Visualization

The visualization page consists of two parts: a tree browser in the left panel and a genome browser in the right panel. The genome browser was built on the JBrowse framework (31), and the tree browser was implemented in-house with HTML5, SVG and JavaScript. The tree browser not only generates an interactive tree view, but also supports intuitive track selection by scrolling, searching and node selection, in addition to the traditional track selection for the 3010 accessions. The accessions selected in the tree browser can also be submitted directly to the genome browser. RPAN uses the newly developed rice pan-genome as the reference genome, and provides a gene annotation track as the default track for the whole rice pan-genome, with an additional 226 tracks of RNA-seq data.

RESULTS

A rice pan-genome browser, RPAN, was developed with the rice pan-genome derived from 3010 sequenced rice accessions (Table 1). The system diagram of RPAN is shown in Figure 1. RPAN contains a database for the rice pan-genome together with search and visualization tools. All the parts in RPAN are dynamic and interactive.

Table 1.

Statistics about genomes in RPAN

Accession group	Count
Indica	1764
Japonica	801
AUS	221
ARO	101
ADM	123
Total	3010

Figure 1.

The architecture of RPAN. This rice pan-genome browser contains table browser, genome browser and multiple search functions. Table browser allows users to summarize, display and download the contents of tracks. Tree browser can be used to select target genomes which can be displayed as tracks in genome browser. Genome browser contains a reference individual genome as well as those novel sequences not included in the reference genome. Users can also search the pan-genome with a list of genes or accessions, for the presence/absence of the genes in the list of selected rice accessions. Searched results can be displayed in the genome browser as well as in tables and figures.

The rice pan-genome

A rice pan-genome was constructed from the IRGSP-1.0 genome (25) and all novel sequences not included in the reference genome. The novel sequences were grouped according to their source rice accessions based on the phylogenetic tree derived from SNPs (see Methods for more details). All novel sequences in a subgroup of rice accessions were concatenated as one pseudo-chromosome. The sequencing data of all 3010 rice accessions were mapped to the pan-genome and visualized through the JBrowse framework. The RPAN database also includes basic information on the rice accessions, gene annotations, presence/absence variations and genome-wide expression profiles. The basic information covers the 3010 rice accessions derived from the 3K RGP and includes accession names, sequencing depths, mapping depths on the IRGSP-1.0 genome and meta-information such as geographical locations and subspecies (or subgroup) categorization. The gene annotations comprise coding sequences, protein sequences and annotations for 50 995 full-length coding genes (Table 2) in the rice pan-genome.

Table 2.

Statistics about rice gene categorization

Gene category	Count
Total genes	50 995
Core genes	23 914
Candidate core genes	4986
Distributed genes	22 095
Subspecies-unbalanced genes	13 617
Indica-dominant genes	5579
Japonica-dominant genes	6038
Subspecies-specific genes	853
Indica-specific genes	587
Japonica-specific genes	147
AUS-specific genes	67
ARO-specific genes	52
Subgroup-unbalanced genes	11 581
Indica-subgroup-unbalanced genes	9816
Japonica-subgroup-unbalanced genes	3418
Random genes	5316

The database also holds gene presence/absence variations (PAVs). The presence/absence of genes in the rice pan-genome was determined in the 453 high-quality accessions. All genes were then categorized as core, candidate core or different types of distributed genes. In total, there are 23 914 core genes, 4986 candidate core genes and 22 095 distributed genes. Of the distributed genes, 853 are subspecies or varietal group specific, including 587, 147, 67 and 52 genes for the Indica and Japonica subspecies and the Aus and Aro groups, respectively (Table 2). Finally, the database provides genome-wide expression profiles for the rice pan-genome, derived from 226 publicly available RNA-seq runs (Supplementary Table S1).

Basic search functions

Users can search with a gene ID, a rice accession code or genomic sequences. When a gene ID is searched, RPAN reports eight aspects of the gene: basic gene information, gene categorization information, gene family, gene presence frequency and distribution, gene ontology, protein coding sequence and protein sequence. The basic gene information (such as its location on the pan-genome), the gene categorization information (including whether the gene is a core or distributed gene, whether it is unbalanced among subspecies or varietal subgroups, its gene age, etc.) and the gene family information are displayed in the first three tables. The presence frequencies of the gene in five subspecies and 12 subgroups are shown in two heat maps. In addition, the relevant GO terms, protein coding sequence and protein sequence are provided. When a rice accession code is searched, the basic information about the accession and the statistics of the genes in it are displayed in two tables, and three pie charts show the categorization of these genes. Users can also visualize the alignment of the sequencing data of the queried accession against the rice pan-genome in the visualization page by clicking the ‘Genome Browser’ button. Genomic sequences can be searched directly against the rice pan-genome with BLAT (32); one or more sequences in FASTA format can be submitted. Each alignment can be inspected further in a detailed page and visualized in the pan-genome browser by clicking the ‘Genome Browser’ button in its record line.

Advanced search functions

RPAN also supports searching with multiple rice accession codes or a list of gene IDs. Users can enter multiple accession codes in the search box or upload a file containing accession codes. The minimum number of input accessions that must share a gene is an optional parameter. If this number is set to 1, the search returns every gene present in at least one of the input accessions; if it is set to the number of input accessions, the search returns the core genes of the input accessions. The basic information of these accessions and the resulting genes can then be downloaded, and statistics tables and charts for these genes are also provided. Likewise, users can search with a gene ID list to obtain the rice accessions in which all of the listed genes are present, together with the relevant information on accessions and genes.

The visualization page contains two parts, a dynamic tree browser in the left panel and a genome browser in the right panel (Figure 2). The tree was constructed from the SNP data. Users can select multiple nodes (including leaf nodes and internal nodes) and click the ‘Submit’ button to visualize these rice accessions in the genome browser. The tree browser also supports a search function to speed up target genome selection. The pan-genome reference sequence, the gene annotation and the overall presence frequency in high-quality accessions are the three basic tracks. There are 3010 rice genome tracks and 226 RNA-seq tracks. Users can select any number of accessions or expression data sets through the hidden ‘Select tracks’ panel or through the tree browser. For performance reasons, we recommend selecting fewer than 300 tracks at a time. In addition, useful tools such as screenshot and share link are listed in the toolbar.
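
The effect of the minimum-sharing parameter in the multi-accession search can be sketched with plain set arithmetic. This is an illustration only, not RPAN's server-side code: the function name, the presence dictionary and the toy gene and accession IDs are all assumptions.

```python
from collections import Counter


def genes_shared_by_at_least(presence, accessions, min_count=1):
    """Return genes present in at least min_count of the given accessions.

    presence: dict accession_id -> set of gene IDs called present
    (hypothetical in-memory layout).
    """
    counts = Counter(gene for acc in accessions for gene in presence[acc])
    return {gene for gene, n in counts.items() if n >= min_count}


# Toy presence calls; the accession and gene IDs are made up.
presence = {
    "acc1": {"geneA", "geneB", "geneC"},
    "acc2": {"geneA", "geneB"},
    "acc3": {"geneA", "geneD"},
}
selected = ["acc1", "acc2", "acc3"]

any_of = genes_shared_by_at_least(presence, selected, min_count=1)
core_of = genes_shared_by_at_least(presence, selected, min_count=len(selected))
```

With a minimum of 1 the result is the union of the gene sets, and with a minimum equal to the number of selected accessions it is their intersection, i.e. the core genes of the selection.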

Figure 2.

The tree browser and genome browser of RPAN. The left panel is the tree browser representing the clustering of ∼3000 individual rice genomes. The tree browser can be used to select genomic tracks for visualization in the genome browser. The right panel is the genome browser. The tracks in it from top to bottom are reference, gene annotation, presence frequency, accessions (red) and RNA-seq (blue). Its genomic sequences contain a reference individual rice genome as well as those novel sequences not included in the reference genome.

Table browser

All information in the pan-genome browser is stored in tables that can be downloaded: the rice accession information table, the gene information table and the gene expression profile table. In the rice accession information table, users can filter the results with browse options such as categories, geographical regions and sequencing depth status (high/low), and a summary table can be generated for the filtered results. The gene information table contains 50 995 full-length genes, with basic gene information including chromosome positions on the reference pan-genome, strand, CDS length and exon number. Visualization and detailed gene information, such as gene categorization (core/distributed), gene presence frequency, gene ontology, coding sequence and protein sequence, can be obtained by clicking the related links. A genomic region can also be searched in the format ‘chromosome ID: start coordinate-end coordinate’. The gene expression profile table collects the 226 runs of RNA-seq data from diverse rice tissues; the detailed gene expression profiles can be retrieved and visualized in the genome browser.
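
As an illustration of the ‘chromosome ID: start coordinate-end coordinate’ format, the sketch below parses such a string into its three parts. The exact chromosome naming used by RPAN and its tolerance for spaces or thousands separators are not stated in the text, so the identifier ‘chr01’ and the lenient parsing rules here are assumptions.

```python
import re

# chromosome ID, then ':', then start and end coordinates separated by '-'.
_REGION = re.compile(r"^\s*([^\s:]+)\s*:\s*([\d,]+)\s*-\s*([\d,]+)\s*$")


def parse_region(text):
    """Parse 'chromosome ID: start-end' into (chromosome, start, end)."""
    match = _REGION.match(text)
    if match is None:
        raise ValueError(f"not a 'chromosome:start-end' region: {text!r}")
    chrom = match.group(1)
    # Thousands separators are tolerated and stripped before conversion.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start coordinate exceeds end coordinate")
    return chrom, start, end


region = parse_region("chr01: 1,000-2000")
```

Rejecting reversed coordinates early gives the user a clear error instead of an empty result.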

Use cases

Here we give two examples of using RPAN in rice breeding studies. In the first case, researchers search for candidate genes for abiotic stress tolerance. Os12g0569700 is a gene with potentially important roles in rice acclimation to salt and drought stresses (33). When users enter the gene ID Os12g0569700, RPAN shows that the gene is present in 1107 accessions, 795 of which are Japonica. The phylogenetic tree of frequency distribution and the heat map of this gene (Figure 3) show that it is Japonica-dominant, with very low frequencies in the other varietal groups. Further screening of donors for salt/drought tolerance should therefore focus on the accessions that carry the full-length gene (presence).

Figure 3.

Examples of search and visualization functions of RPAN. Os12g0569700, a gene related to rice acclimation to salt and drought stresses, was searched. Search results include (A) distribution of the gene in high-quality accessions and (B) heat maps of the gene presence frequency in different subspecies and subgroups. (C) The visualization of this gene with three RNA-seq tracks.

The second case is selecting parental lines from the 3010 accessions for developing early-maturing Japonica cultivars. For this work, donors that are insensitive to day length are desirable. Instead of phenotyping the whole set of 3010 accessions, breeders can shortlist candidate accessions using genes that control day-length insensitivity. Gene Os08g0174500 is known to suppress flowering under the long-day condition and to regulate plant height and grain yield in rice (34–36). Users can first search the gene ID and obtain the results shown in Figure 4A–C. RPAN shows that Os08g0174500 is a distributed gene and is Japonica-dominant, with the highest frequency of 96.8% in varietal subgroup J (Figure 4B and C). To identify possible day-length-insensitive donors carrying non-functional alleles of Os08g0174500, users can pick multiple tracks with marked differences in the Os08g0174500 region (Figure 4D). After the ‘submit’ button is clicked, the genome browser panel shows that some accessions (CX106 and B026 as examples here) carry the complete sequence of Os08g0174500, while the others have deletions of different sizes. Users can then shortlist a smaller number of candidate accessions that lack this gene (B024, IRIS_313-11275, B060, B067, B112 and IRIS_313-11859 in this example) for further phenotyping. Finally, based on their flowering times under the long-day (LD) condition in Northern China, users may find that B024 could be a desirable donor for their breeding purposes (Supplementary Table S2).

Figure 4.

An example of shortlisting candidate donors for early-japonica breeding with RPAN. With the visualization function for the distributed key gene Os08g0174500 (A–C), which controls the day-length sensitivity of rice, the donors were shortlisted by visualizing the PAVs of Os08g0174500 (D).

Here, we also show an example of using RPAN for rice biology. The distributed genes of the rice pan-genome, particularly the group-specific ones, are expected to have contributed significantly to the adaptation of specific rice varietal populations to their environments. Gene information in our rice pan-genome browser can therefore be very useful for rice scientists studying the functions of the ‘domestication’ genes of rice (O. sativa L.) (37). We extracted the 132 domestication protein-coding genes reported previously (Supplementary Table S3). By searching them in our pan-genome browser (Figure 5), we observed that 112 of them (84.8%) are core or candidate core genes. Of the remaining 20 distributed genes, six are dominant in one of the subspecies, suggesting that they contributed more to population differentiation in rice.

Figure 5.

Examples of search and visualization functions of RPAN. A list of 132 genes associated with domestication were searched. Search results include (A) statistics about categorization of the genes; (B) the distribution of rice accessions containing all the genes and (C), (D) and (E), visualization of gene categorization in pie charts.

DISCUSSION

Pan-genome analyses have expanded genome analysis from a focus on the functional genes in one or a few reference genomes of a species to the comprehensive analysis of all genes in large numbers of individuals from different populations of a species. This expansion is essential for understanding the whole genomic diversity of any species. Since the sequencing data and seeds of the 3010 rice accessions from the 3K RGP became publicly available, tremendous efforts have been made to analyze this huge amount of rice genome sequence data in order to understand the population genomic organization of rice, and to phenotype the 3K rice accessions in order to identify loci associated with important traits by GWAS. The rice pan-genome browser and the analytic tools developed in this study are expected to contribute significantly to these efforts. As the functions and associated traits of more genes in the rice pan-genome are determined by global rice functional genomics efforts, our browser will be updated regularly to meet the needs of the scientific community. With the rapid progress of sequencing technology, it is becoming increasingly tractable to generate whole-genome sequencing data for large numbers of individuals of important plant and animal species, and visualization of the resulting data is of particular importance for subsequent analysis. RPAN, the rice pan-genome browser we present here, will help rice researchers to search and visualize their results in a pan-genome context, and it also provides a useful template and tool for searching and visualizing results from future pan-genome analyses. In addition, the strategy and visualization method in RPAN can serve as an example for many other species with a large number of individual genomes available.

36 in total

1. BLAT--the BLAST-like alignment tool.

Authors: W James Kent
Journal: Genome Res Date: 2002-04 Impact factor: 9.043

2. PopGeV: a web-based large-scale population genome browser.

Authors: Xinyi Shi; Jing Peng; Xiaohan Yu; Xiaohong Zhang; Dongye Li; Baohui Liu; Fanjiang Kong; Xiaohui Yuan
Journal: Bioinformatics Date: 2015-05-21 Impact factor: 6.937

3. The 1001 genomes project for Arabidopsis thaliana.

Authors: Detlef Weigel; Richard Mott
Journal: Genome Biol Date: 2009-05-27 Impact factor: 13.583

4. SNP-Seek database of SNPs derived from 3000 rice genomes.

Authors: Nickolai Alexandrov; Shuaishuai Tai; Wensheng Wang; Locedie Mansueto; Kevin Palis; Roven Rommel Fuentes; Victor Jun Ulat; Dmytro Chebotarov; Gengyun Zhang; Zhikang Li; Ramil Mauleon; Ruaraidh Sackville Hamilton; Kenneth L McNally
Journal: Nucleic Acids Res Date: 2014-11-27 Impact factor: 16.971

5. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler.

Authors: Ruibang Luo; Binghang Liu; Yinlong Xie; Zhenyu Li; Weihua Huang; Jianying Yuan; Guangzhu He; Yanxiang Chen; Qi Pan; Yunjie Liu; Jingbo Tang; Gengxiong Wu; Hao Zhang; Yujian Shi; Yong Liu; Chang Yu; Bo Wang; Yao Lu; Changlei Han; David W Cheung; Siu-Ming Yiu; Shaoliang Peng; Zhu Xiaoqian; Guangming Liu; Xiangke Liao; Yingrui Li; Huanming Yang; Jian Wang; Tak-Wah Lam; Jun Wang
Journal: Gigascience Date: 2012-12-27 Impact factor: 6.524

6. JContextExplorer: a tree-based approach to facilitate cross-species genomic context comparison.

Authors: Phillip Seitzer; Tu Anh Huynh; Marc T Facciotti
Journal: BMC Bioinformatics Date: 2013-01-16 Impact factor: 3.169

7. CD-HIT: accelerated for clustering the next-generation sequencing data.

Authors: Limin Fu; Beifang Niu; Zhengwei Zhu; Sitao Wu; Weizhong Li
Journal: Bioinformatics Date: 2012-10-11 Impact factor: 6.937

8. Exploring the rice dispensable genome using a metagenome-like assembly strategy.

Authors: Wen Yao; Guangwei Li; Hu Zhao; Gongwei Wang; Xingming Lian; Weibo Xie
Journal: Genome Biol Date: 2015-09-07 Impact factor: 13.583

9. Pan-Tetris: an interactive visualisation for Pan-genomes.

Authors: André Hennig; Jörg Bernhardt; Kay Nieselt
Journal: BMC Bioinformatics Date: 2015-08-13 Impact factor: 3.169

10. A map of rice genome variation reveals the origin of cultivated rice.

Authors: Xuehui Huang; Nori Kurata; Xinghua Wei; Zi-Xuan Wang; Ahong Wang; Qiang Zhao; Yan Zhao; Kunyan Liu; Hengyun Lu; Wenjun Li; Yunli Guo; Yiqi Lu; Congcong Zhou; Danlin Fan; Qijun Weng; Chuanrang Zhu; Tao Huang; Lei Zhang; Yongchun Wang; Lei Feng; Hiroyasu Furuumi; Takahiko Kubo; Toshie Miyabayashi; Xiaoping Yuan; Qun Xu; Guojun Dong; Qilin Zhan; Canyang Li; Asao Fujiyama; Atsushi Toyoda; Tingting Lu; Qi Feng; Qian Qian; Jiayang Li; Bin Han
Journal: Nature Date: 2012-10-03 Impact factor: 49.962
