Xiaofeng Dai, Li Shen.
Abstract
Recent history has witnessed the rapid development of technologies such as high-throughput sequencing and mass spectrometry, which gave rise to the concept of "omics" and to methodological advances in systematically interrogating a cellular system. Yet the ever-growing number of molecule types and regulatory mechanisms being discovered keeps transforming our understanding of the cellular machinery. As a result, cell omics seems, like the universe, to expand without limit, and the goal of completely harnessing the cellular system appears nearly impossible. It is therefore imperative to review what has been done and what is being done in order to predict what can be done toward translating omics information into disease control with minimal cell perturbation. With a focus on the "four big omics," i.e., genomics, transcriptomics, proteomics, and metabolomics, we delineate the hierarchies of these omics together with their epiomics and interactomics, and review the technologies developed for their interrogation. We predict, among others, redoxomics as an emerging omics layer that views the cell's decision toward a physiological or pathological state as a finely tuned redox balance.
Keywords: mass spectrometry; next generation sequencing; omics; redoxomics; third generation sequencing
Year: 2022 PMID: 35860739 PMCID: PMC9289742 DOI: 10.3389/fmed.2022.911861
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
Figure 1. Conceptual illustration of the hierarchy of the different omics covered in this paper. We classify omics technologies into two categories, i.e., technology-based and knowledge-based. Technology-based omics rest on technologies developed for understanding the “central dogma,” and can be further divided into three groups, i.e., the “four big omics” (genomics, transcriptomics, proteomics, and metabolomics), epiomics (epigenomics, epitranscriptomics, and epiproteomics), and their interactomics (DNA-RNA interactomics, RNA-RNA interactomics, DNA-protein interactomics, RNA-protein interactomics, protein-protein interactomics, and protein-metabolite interactomics). The omics indicated by the horizontal (above) and vertical (right-hand side) pink boxes of each interactomic term constitute its two interacting omics. Knowledge-based omics are developed to understand a particular knowledge domain in a systematic way by integrating multiple types of omics information. Examples of this category include immunomics, microbiomics, and beyond.
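The classification in the Figure 1 caption can be written down as a small nested data structure. The sketch below is only one possible encoding: the dictionary layout, the `MOLECULE_TO_OMICS` mapping, and the helper name `interacting_layers` are illustrative assumptions rather than anything defined in the paper. In particular, pairing each molecule type with one of the "four big omics" is our reading of the caption's statement that each interactomic term joins two interacting omics.

```python
# Illustrative encoding of the Figure 1 hierarchy. The layout and all names
# here are assumptions made for this sketch, not definitions from the paper.
OMICS_HIERARCHY = {
    "technology-based": {
        "four big omics": [
            "genomics", "transcriptomics", "proteomics", "metabolomics",
        ],
        "epiomics": ["epigenomics", "epitranscriptomics", "epiproteomics"],
        # Each interactomics term is stored as the pair of molecules it links.
        "interactomics": [
            ("DNA", "RNA"),
            ("RNA", "RNA"),
            ("DNA", "protein"),
            ("RNA", "protein"),
            ("protein", "protein"),
            ("protein", "metabolite"),
        ],
    },
    "knowledge-based": ["immunomics", "microbiomics"],
}

# Assumed mapping from molecule type to the "big omics" layer that studies it.
MOLECULE_TO_OMICS = {
    "DNA": "genomics",
    "RNA": "transcriptomics",
    "protein": "proteomics",
    "metabolite": "metabolomics",
}


def interacting_layers(pair):
    """Return the two omics layers joined by an interactomics pair."""
    first, second = pair
    return MOLECULE_TO_OMICS[first], MOLECULE_TO_OMICS[second]


if __name__ == "__main__":
    for pair in OMICS_HIERARCHY["technology-based"]["interactomics"]:
        print("-".join(pair), "interactomics:", interacting_layers(pair))
```

Storing interactomics as molecule pairs keeps the two interacting layers explicit, mirroring the horizontal and vertical boxes described in the caption.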
Comparisons of high-throughput approaches for omics studies.
| Category | Technique | Advantages | Disadvantages |
|---|---|---|---|
| **Genomics** | | | |
| | DNA microarray | • Inexpensive; | • Inability to detect |
| First-generation sequencing | Sanger sequencing | • Long read lengths and high per-base accuracies. | • High cost and low throughput. |
| Next-generation sequencing | Cyclic-array sequencing | • Lower cost; | • The average single-read accuracy is low. |
| | Microelectrophoretic | • Low cost. | • Low throughput. |
| | Sequencing by hybridization | • Improved throughput by avoiding the electrophoresis step, which allows more samples to be sequenced in parallel. | • A single sample must first be cloned, amplified and purified. |
| | Real-time observation of single molecules | • Higher speed and throughput; | • Short read lengths and less accuracy; |
| Third-generation sequencing | PacBio | • Real long reads; | • Expensive sequencer and relatively high cost per Gb; |
| | ONT | • Real (ultra-)long reads with no upper limit; | • High overall error rate and systematic errors with homopolymers; |
| **Transcriptomics** | | | |
| | RNA microarray | • Less expensive; | • Inability to detect |
| Tag-based methods | DGE seq | • More economical than traditional RNA sequencing for a given sequencing depth; | • Biases from fragmentation, adapter ligation and PCR can make tag-based data more prone to batch effects. |
| | 3' end seq | • Direct sequencing of the 3' end. | • Generates a high frequency of truncated cDNA; |
| Probe alternative splicing and gene fusion | SMRT | • Offers long reads. | • Costly, with a high error rate and low multiplexing capacity. |
| | SLR-RNA-Seq | • Delivers longer transcripts and more detected isoforms. | • Genome-wide analysis is not possible; |
| Targeted RNA sequencing | Target capture | • Greater complexity and uniformity; | • More costly. |
| | Amplicon sequencing | • Low cost; | • Cannot involve complex analysis; |
| Single cell RNA sequencing | CEL-seq2 | • High sensitivities; | • Lower throughput. |
| | Drop-seq | • High throughput; | • Lower sensitivities. |
| **Proteomics** | | | |
| High resolution MS methods | Orbitrap | • High resolving power; | • The only fragmentation method available is ion trap-based CID, a method that has limitations on modified peptides with important PTMs; • The practical accurate mass MS/MS scan rate is slow; |
| | MALDI-TOF-TOF | • Fast scanning speed; | • Low resolving power. |
| | FT-ICR | • Very high mass accuracy and resolving power. | • Equipment takes up more space; |
| Low resolution MS approaches | Quadrupole | • Low cost; | • Less suitable for pulsed ion sources; |
| | Ion-trap | • Improved sensitivity; | • Low resolving power. |
| Tandem mass spectrometric techniques | CID | • Mature technology with wide applications. | • Cannot capture unstable PTM information. |
| | ECD | • The retention of labile groups is far superior to CID; | • Negative ions formed by ESI are usually not amenable; |
| | ETD | • The retention of labile groups is far superior to CID; | • Negative ions formed by ESI are usually not amenable; |
| | EID | • Can be used to induce fragmentation in singly protonated or deprotonated ions. | • Negative ions formed by ESI are usually not amenable. |
| **Metabolomics** | | | |
| Spectroscopy | FT-IR spectroscopy | • Low cost; | • A long preparation process may lead to errors. |
| | Raman spectroscopy | • Non-destructive, non-invasive; | • Low sensitivity; |
| | NMR spectroscopy | • Simple sample preparation and highly reproducible molecule quantification; | • Less sensitive than LC/MS and GC/MS. |
| MS | MS | • Mature technology with wide applications; | • MS data are less reproducible than NMR spectroscopy; |
| | MS/MS | • It compensates for the poor chromatographic ability of LC/GC. | • Not all molecules can be efficiently fragmented or detected. |
| | GC-MS | • Mature technology and low cost. | • Analytes have to be volatile or volatilizable by derivatization; |
| | LC-MS | • No limitations by molecular mass or polarity of target analytes; | • Higher cost; |
| **Epigenomics** | | | |
| | Hi-C | • High resolution; | • Cannot capture the fine detail of sub-nuclear compartments; |
| | MiGS | • Can analyze whole genome methylation; | • Methylation is not described at single-base-pair resolution. |
| **Epitranscriptomics** | | | |
| Enzyme-based | PARS | • Increased sensitivity by sequencing both single- and double-stranded regions. | • RNA was folded |
| | FragSeq | • Simple and fast protocol; | • Does not consider single-hit kinetics that may lead to RNA restructuring after cleavage. |
| | PARTE | • Measures melting temperature; | • Introns and lowly expressed antisense or cryptic unstable transcripts are not well-interrogated; |
| Chemical-based | Mod-seq | • Can probe structures of long RNAs | • Limited to the analysis of two bases (As and Cs). |
| | Structure-seq | • Single-nucleotide resolution; | • Limited to the analysis of two bases (As and Cs); |
| | DMS-seq | • Identifies RNA structure in native conditions; | • Limited to the analysis of two bases (As and Cs); |
| Chemical-based | CIRS-seq | • Single-nucleotide resolution; | • Uses DMS to methylate the N1 of adenosine and N3 of cytosine residues, and uses CMC to modify pseudouridines, where DMS and CMC may react with non-secondary RNA structures. |
| | SHAPE-MaP | • Can be customized for different applications; | • Length of the RNA must be at least ~150 nt for the randomer and native workflow, and at least ~40 nt for the small-RNA workflow. |
| | icSHAPE | • Measures base flexibility; | • Limited to the analysis of relatively short (300nt) |
| | MARIO | • Many-to-many mapping; | • Loses RNA duplexes that are not associated with any proteins. |
| | RIP-seq | • Mature technology; | • The washing conditions are quite strict; |
| | LAIC-seq | • Could differentiate m6A methylation levels between mRNA isoforms without prior fragmentation. | • Loses the positional information. |
| | miCLIP-seq | • m6A is detected with high specificity and sensitivity; | • The method depends on m6A-specific antibodies and suffers from poor reproducibility and a complicated process; |
| | m1A-MAP | • Reveals distinct classes of base-resolution m1A methylome in the nuclear- and mitochondrial-encoded transcripts. | • Large sample size is required. |
| | m7G-MeRIP-seq | • Precisely maps the m7G methylomes in RNA. | • The mild chemical reactions for selective m7G reduction and depurination could not achieve quantitative yields. |
| | RNA BisSeq | • Can accurately identify m5C sites; | • It may be disturbed by some cytosine modifications other than m5C; |
| | MAZTER-seq | • Allows detecting and quantifying m6A levels at endogenous sites; | • Allows quantification of only a subset of m6A sites that both occur at ACA sites and are within suitable distances of adjacent ACA sites; |
| | m6A-REF-seq | • High throughput; | • It can only identify ~16 to 25% of m6A sites because of the restrictions of MazF, which specifically recognizes the ACA motif. |
| | LC-MS/MS | • The presence and quantification of all RNA modifications can be determined. | • Requires a large amount of input sample; |
| **Epiproteomics** | | | |
| | Microsequencing | • Mature technology. | • Time-consuming and requires a large amount of highly purified sample. |
| | Western blotting | • Simple operation; | • It is an error-prone method due to its time-consuming multistep protocol; |
| | Immunofluorescence analysis | • Permits visualization of many components in any given tissue or cell type; | • Usually restricted to permeabilized cells or extracellular or endocytosed proteins. |
| | ChIP | • High resolution; | • Sequencing errors may occur at the end of each read; |
| | MS | • Enables the characterization of protein PTMs in a high-throughput manner; | • False positive identifications will be introduced during data verification; |
| **DNA-RNA interactomics** | | | |
| Mapping genome-wide locations of a specific RNA | ChIRP | • Tiling the entire transcript with antisense DNA. | • Limited to analyzing one RNA at a time. |
| | CHART | • Tiling the RNase H accessible region with antisense DNA. | • Limited to analyzing one RNA at a time. |
| | RAP | • Tiling the entire transcript with complementary RNA. | • Limited to analyzing one RNA at a time; |
| Mapping all chromatin-interacting RNAs together with their genomic interacting regions | MARGI | • Many-to-many mapping; | • Requires a large number (10⁷) of cells. |
| | ChAR-seq | • Many-to-many mapping; | • Only sequencing reads that cover the entire bridge sequence are informative, reducing the number of informative reads. |
| | GRID-seq | • Many-to-many mapping; | • The informative sequence lengths on the RNA side and the DNA side are both limited to 20 bases, resulting in challenges in unambiguous sequence mapping. |
| **RNA-RNA interactomics** | | | |
| | hiCLIP | • Incorporation of an adaptor between two RNA molecules increases ligation efficiency and improves accuracy in sequence mapping. | • Requires prior knowledge of an RNA-binding protein; |
| | PARIS | • Many-to-many mapping. | • 4'-Aminomethyl trioxsalen (AMT) preferentially crosslinks pyrimidine bases and may introduce bias. |
| | SPLASH | • Improves signal-to-noise ratio by leveraging biotinylated psoralen; | • Psoralen preferentially crosslinks pyrimidine bases and may introduce bias. |
| | LIGR-seq | • Many-to-many mapping. | • AMT preferentially crosslinks pyrimidine bases and may introduce bias. |
| **DNA-protein interactomics** | | | |
| | ACE | • It can be a technique of choice to validate high throughput screening results; | • The higher the analyte concentration, the bigger the systematic error will be. |
| | ChIP-Chip | • Technology with wide applications. | • Lots of cells are generally needed to obtain a robust result. |
| | SELEX | • Mature technology with wide applications; | • Exhaustive knowledge of the protein target and high-purity recombinant protein are necessary prior to selection of aptamers. |
| **RNA-protein interactomics** | | | |
| | CLIP-seq | • Cross linking occurs between RNA and protein before cell death. | • Large sample size is required. |
| | CLASH | • Stringent purification conditions remove nonphysiological interactions. | • Requires prior knowledge of an RNA-binding protein; |
| **Protein-protein interactomics** | | | |
| | Y2H | • Mature technology. | • Cannot identify multi-protein complexes in one run. |
| | LC-MS/MS | • Can tag several members of a complex; | • May miss some complexes that are not present under the given conditions; |
| | coIP-MS | • Can rapidly identify multiple interacting proteins; | • The outcome is dependent on the efficiency of the antibody immunoprecipitating the bait protein. |
| | AlphaLISA | • Can study a wide range of analytes; | • Excess target protein may oversaturate the donor or acceptor beads, resulting in a progressive signal decrease. |
| **Protein-metabolite interactomics** | | | |
| | Protein tagging | • Can identify interacting metabolites for a specific protein. | • Low throughput. |
| | Metabolite modification | • Available for a wide range of compound classes. | • Limited to compounds chemically stable over the course of the experiment. |
| | PROMIS | • Low false positives related to a high concentration of the bait molecule; | • Poorly predictive. |
| | NMR-based approach | • Widely applicable; | • Does not directly translate into changes in protein activity due to restrictions to protein-metabolite binding; |
| | NMR relaxometry | • No separation step during sample preparation; | • Less sensitive than a state-of-the-art NMR system. |
Figure 2. Conceptual illustration of future trends in omics technology development. Three trends characterize the developmental paths of omics technologies. The first is to resolve the technical problems of current omics techniques. The second is the identification of novel types of omics, especially novel epiomics derived from modifications by various intermediate metabolites. The third, driven by an increased understanding of the importance of cell homeostasis at various levels for human disease management, is to approach a particular knowledge domain from a systematic angle via omics data integration, as demonstrated by immunomics and microbiomics. Within this trend, we propose “redoxomics” as an emerging type of knowledge-based omics, given the critical roles of redox homeostasis in maintaining cells in a healthy state and in the pathogenesis of various diseases, including cancers.