Sumit Kumar Aggarwal, Alla Singh, Mukesh Choudhary, Aundy Kumar, Sujay Rakshit, Pardeep Kumar, Abhishek Bohra, Rajeev K Varshney.
Abstract
Advances in sequencing technologies and bioinformatics tools have fueled a renewed interest in whole genome sequencing efforts in many organisms. The growing availability of multiple genome sequences has advanced our understanding of within-species diversity, in the form of a pangenome. Pangenomics has opened new avenues for future research, such as allowing dissection of complex molecular mechanisms and increased confidence in genome mapping. To comprehensively capture the genetic diversity for improving plant performance, the pangenome concept is further extended from the species to the genus level by the inclusion of wild species, constituting a super-pangenome. Characterization of the pangenome has implications for both basic and applied research. The pangenome concept has transformed the way biological questions are addressed. From understanding evolution and adaptation to elucidating host-pathogen interactions, and from finding novel genes or breeding targets that aid crop improvement to designing effective vaccines for human prophylaxis, the increasing availability of pangenomes has revolutionized several aspects of biological research. The future availability of high-resolution pangenomes based on reference-level near-complete genome assemblies would greatly improve our ability to address complex biological problems.
Keywords: NGS; biological research; evolution; genome sequence; germplasm; novel genes; pangenome
Year: 2022 PMID: 35456404 PMCID: PMC9031676 DOI: 10.3390/genes13040598
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Figure 1. Organization of a pangenome composed of core and dispensable components of the genome.
Figure 2. Different forms of a reference genome. The horizontal bars represent the DNA sequence of a genome. One symbol in the figure represents a disabling mutation that disrupts gene function; three others depict various sequence polymorphisms.
Figure 3. Genomic variation in terms of proportion (a) and distribution (b) of PAVs and CNVs in the genome of major crops for agriculturally important traits (interpreted from Tao et al. [21]).
Figure 4. A basic approach for pangenome construction. Genome sequences of different strains, represented schematically as A (blue), B (red), C (green), D (light blue), and E (yellow), are aligned to identify the core and accessory components of the pangenome.
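The core/accessory split described in the Figure 4 caption can be illustrated with a minimal Python sketch. It assumes the alignment and clustering step has already assigned each strain a set of gene-family identifiers; the strain contents, the identifiers `g1` to `g6`, and the function name `partition_pangenome` are hypothetical and do not come from the article or from any tool in the table that follows.

```python
from collections import Counter


def partition_pangenome(gene_sets):
    """Split gene families into core and accessory components.

    gene_sets maps a strain name to the set of gene-family identifiers
    found in that strain (hypothetical input; a real pipeline would
    derive it from alignment or clustering).
    """
    n_strains = len(gene_sets)
    # Count how many strains carry each gene family.
    counts = Counter(g for genes in gene_sets.values() for g in set(genes))
    pangenome = set(counts)
    # Core: present in every strain. Accessory: present in only some.
    core = {g for g, c in counts.items() if c == n_strains}
    accessory = pangenome - core
    return pangenome, core, accessory


# Illustrative toy data for the five strains A to E.
strains = {
    "A": {"g1", "g2", "g3"},
    "B": {"g1", "g2", "g4"},
    "C": {"g1", "g2", "g3", "g5"},
    "D": {"g1", "g2"},
    "E": {"g1", "g2", "g4", "g6"},
}

pangenome, core, accessory = partition_pangenome(strains)
print(sorted(core))       # ['g1', 'g2']
print(sorted(accessory))  # ['g3', 'g4', 'g5', 'g6']
```

Here "core" is taken strictly as presence in all strains; published analyses sometimes relax this to a high-frequency threshold, which would only change the comparison against `n_strains`.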
Software/tools for pangenome analysis.
| Software/Tool | Description/Role | URL | References |
|---|---|---|---|
| PanSeq | Extracts regions unique to a genome, identifies SNPs, and builds input files for phylogeny programs. | | |
| PanFunPro | Homology detection and pairwise genome analysis of pan/core genomes. | | |
| GET_ | Clusters protein and nucleotide sequences into homologous groups and analyzes overlapping sets of proteins. | | |
| ITEP | Used for sequence alignment, metabolic analysis, clustering, and protein prediction. | | |
| PanGP | Large-scale bacterial pangenome profile analysis using sampling algorithms. | | |
| PGAP | Detects homologous genes, orthologous genes, and SNPs; supports phylogenetic studies, pangenome plotting, and functional annotation. | | |
| PGAT | Compares gene content and sequence across multiple microbial genomes to identify SNPs. | | |
| EDGAR | Performs homology analyses with a specific cutoff and produces Venn diagrams and interactive synteny plots. | | |
| Micropan | Integrates pangenome and additional analyses within a single programming-language environment. | Package “micropan” in R (accessed on 17 September 2021) | |
| SplitMem | Graph-based pangenome analysis software built on de Bruijn graphs. | | |
| ClustAGE | Focuses on the accessory genomic dimension of the pangenome. | | |
| DeNoGAP | Helps with gene prediction, protein classification, and orthology search. | | |
| EUPAN | The first tool for analyzing eukaryotic pangenomes to identify core and accessory gene datasets. | | |
| Harvest | Provides three modules: Parsnp (core-genome analysis), Gingr (output visualization), and Harvest Tools (meta-analysis). | | |
| LS-BSR | Uses BLAST to calculate a score ratio for each coding sequence in a pangenome dataset. | | |
| NGSPanPipe | Identifies the pangenome from short reads; its output is compatible with other pangenome analysis tools. | | |
| PanACEA | Identifies genomic regions that are phylogenetically dissimilar. | | |
| PanCake | Clusters homologous genes and analyzes the core/accessory genome. | | |
| PanGeT | Pangenome analysis based on comparison at the genome and proteome levels. | | |
| PanGFR-HM | Genome-based analysis of genomic/functional diversity and phylogeny among human-associated microbial genomes. | | |
| PANINI | Rapid online visualization and analysis of the core and accessory genome evolutionary signal. | | |
| PANNOTATOR | Ensures quality and consistent standards of functional genome annotation across different strains. | | |
| PanOCT | Graph-based ortholog clustering tool for closely related prokaryotic genomes. | | |
| Pan-Tetris | Interactive, dynamic visual inspection of gene occurrences in a pangenome table. | | |
| PanTools | Annotates pangenomes, adds sequences, groups genes, retrieves genomic regions, and queries the pangenome. | | |
| PanViz | Visualizes pangenomic data from a range of data formats and maps genes from an existing pangenome. | | |
| PanWeb | Graphical interface for pangenome analyses generated by the PGAP software. | | |
| PanX | Identifies orthologous gene clusters in pangenomes, visualizes presence/absence patterns, and identifies SNPs. | | |
| PGAdb-Builder | Constructs a pangenome allele database (PGAdb). | | |
| PGAP-X | Analyzes genome diversity and visualizes genome structure and gene content to help understand evolution. | | |
| Piggy | Detects highly divergent (“switched”) intergenic regions (IGRs) upstream of genes in a pangenome. | | |
| Pyseer | Supports genome-wide association studies in microbes to identify potential genetic variation. | | |
| Seq-seq-pan | Sequentially aligns sequences to build a pangenome data structure and a whole-genome alignment. | | |
| Spine and AGEnt | Spine finds the core genome in a group of genomic sequences; AGEnt finds the accessory genome in draft genomic sequences. | | |
| BPGA | Pangenome profile analysis, pangenome sequence extraction, exclusive gene family analysis, atypical GC content analysis, and species phylogenetic analysis. | | |
| BGDMdocker | Pangenome analysis, visualization, clustering, and genome annotation. | | |
| PAN2HGENE | Identifies new products, which alters the behavior of the α value in the pangenome without altering the original genomic sequence. | | |
| PATO | Identifies the core and accessory genomes and helps characterize population structure, annotate pathogenic features, and create gene sharedness networks. | | |
| Panakeia | Analyzes synteny and multiple structural patterns of the pangenome to support studies of biological diversity and evolution. | | |
| HUPAN | Developed for pangenome analysis of humans/mammals. | | |
Figure 5. A model of heterosis proposed by Swanson-Wagner et al. [105]. Bars represent genes. Three genes are considered in each hypothetical gene family, situated on different chromosomes. A symbol in the figure represents a “functional block” leading to null or altered protein function. In a real scenario, accumulation of similar effects across many gene families leads to reduced vigor in inbreds and heterosis in hybrids. Pangenomics can help to unravel heterosis in a phenotypic trait by discovering new gene variants.