Discovery of a MUC3B gene reconstructs the membrane mucin gene cluster on human chromosome 7.

Tiange Lang, Thaher Pelaseyed.

Abstract

Human tissue surfaces are coated with mucins, a family of macromolecular sugar-laden proteins serving diverse functions from lubrication to the formation of selective biochemical barriers against harmful microorganisms and molecules. Membrane mucins are a distinct group of mucins that are attached to epithelial cell surfaces where they create a dense glycocalyx facing the extracellular environment. All mucin proteins carry long stretches of tandemly repeated sequences that undergo extensive O-linked glycosylation to form linear mucin domains. However, the repetitive nature of mucin domains makes them prone to recombination and renders their genetic sequences particularly difficult to read with standard sequencing technologies. As a result, human mucin genes suffer from significant sequence gaps that have hampered the investigation of gene function in health and disease. Here we leveraged a recent human genome assembly to characterize a previously unmapped MUC3B gene located at the q22 locus on chromosome 7, within a cluster of four structurally related membrane mucin genes that we name the MUC3 cluster. We found that MUC3B shares high sequence identity with the known MUC3A gene and that the two genes are governed by evolutionarily conserved regulatory elements. Furthermore, we show that MUC3A, MUC3B, MUC12, and MUC17 in the human MUC3 cluster are expressed in intestinal epithelial cells (IECs). Our results complete existing genetic gaps in the MUC3 cluster which is a conserved genetic unit in vertebrates. We anticipate our results to be the starting point for the detection of disease-associated polymorphisms in the human MUC3 cluster. Moreover, our study provides the basis for the exploration of intestinal mucin gene function in widely used experimental models such as human intestinal organoids and genetic mouse models.

Year:  2022        PMID: 36256656      PMCID: PMC9578598          DOI: 10.1371/journal.pone.0275671

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.752


Introduction

The first draft of the human genome, published twenty years ago, offered a unique opportunity to decipher the causal relationship between genetic sequence, gene function, and disease biology [1, 2]. However, reading and measuring repetitive genomic elements remains a major technological challenge that has left the human genome riddled with significant sequence gaps. Mucin (MUC) genes are characterized by subexonic repeats, consisting of multiple repeated short DNA sequences within coding exons. The resulting tandemly repeated sequences encode extended domains that are rich in proline, threonine, and serine (PTS) residues [3]. Mucin-type tandem repeats undergo O-linked glycosylation on serines and threonines to form densely O-glycosylated linear mucin domains [4]. The number and sequence identity of tandem repeats vary between MUC genes and are further confounded by considerable length polymorphism between individuals, resulting in variable number of tandem repeats (VNTRs). VNTRs present a major challenge in analyzing mucin sequences because their repetitive nature and size of several kilobases make them intrinsically unstable and difficult to maintain in bacterial artificial chromosomes. Consequently, mucin gene VNTRs are underrepresented in the human genome assembly [5] and continue to hamper efforts to investigate MUC gene function.
Mucins are an ancient family of proteins in the animal kingdom. The earliest mucin genes appeared 700–800 million years ago in primitive marine metazoans such as sea anemones, sponges, and comb jellies and have since expanded to all branches of the tree of life [6, 7]. Currently, the human mucin family consists of secreted gel-forming mucins (MUC2, MUC5B, MUC5AC, MUC6, MUC7, and MUC20) and a distinct subfamily of membrane mucins (MUC1, MUC3, MUC4, MUC12, MUC13, MUC15, MUC16, MUC17, MUC21, and MUC22) that are inserted into cell membranes via a transmembrane domain [8].
Membrane mucins are single-pass type I transmembrane proteins that are guided to the secretory pathway via an N-terminal signal sequence. In the endoplasmic reticulum, membrane mucins undergo N-linked glycosylation and folding, which in most cases requires a strain-dependent autocatalytic cleavage at a Sea urchin sperm protein, Enterokinase and Agrin (SEA) domain [9]. The cleaved protein fragments remain non-covalently attached at the SEA domain as the mucin protein transits to the Golgi apparatus for O-linked glycosylation. Consequently, the mature SEA-type membrane mucin reaches the plasma membrane as a heterodimer with a glycosylated extracellular N-terminal subunit that remains non-covalently linked to a membrane-attached C-terminal subunit. The evolutionary origins of the SEA domain date back to single-celled eukaryotes while SEA-type membrane mucins emerged in vertebrates [3, 10]. The SEA domain is a mechanosensor that undergoes conformational unfolding in response to mechanical tension, but its ultimate biological function remains elusive [11]. In humans, the SEA-type membrane mucin genes MUC3, MUC12, and MUC17 map to the chromosomal locus 7q22. The three genes are arranged in a MUC3-MUC12-MUC17 cluster (hereafter called MUC3 cluster), and flanked by ACHE upstream of MUC3, and TRIM56 and SERPINE1 downstream of MUC17 (Fig 1A). The stereotypic ACHE-MUC3-MUC12-MUC17-TRIM56-SERPINE1 unit is highly conserved in vertebrates. Mus musculus carries a MUC3 cluster on chromosome 5, where three membrane mucin genes are flanked by Ache and Trim56-Serpine1. Notably, the Muc3 gene in M. musculus maps directly upstream of Trim56 and shares 43% sequence identity with human MUC17, but only 28% identity with human MUC3, indicating that murine Muc3 is a homolog of human MUC17 while the murine homologs for MUC3 and MUC12 are poorly defined [3].
The nonmammalian vertebrate Xenopus tropicalis carries seven genes encoding SEA-type membrane mucins, of which three are arranged in tandem on chromosome 3 followed by a homolog of human SERPINE1, suggesting that the MUC3 cluster first emerged in amphibians [3].
Fig 1

Conservation of MUC3 cluster in Cercopithecoid and Hominoid superfamilies.

(A) The MUC3 cluster at locus q22 in human chromosome 7 in the GRCh38.p13 assembly is flanked by genes ACHE, TRIM56, and SERPINE1. (B) Members of the Cercopithecoid and Hominoid superfamilies, except for H. sapiens and G. gorilla, carry a MUC3 cluster consisting of MUC3A, MUC3B, MUC12, and MUC17 genes. (C) Presence of the MUC3B gene in the MUC3 cluster in the Catarrhini parvorder (filled black circles). Open circles indicate the presence of MUC3, MUC12, and MUC17 genes in the Platyrrhini parvorder. NA indicates a lack of sufficient sequence information for the detection of MUC3 cluster genes.

The current human genome assembly GRCh38.p13 is estimated to contain unresolved gaps corresponding to nearly 150 million base pairs (Mbp) [12], which we postulate underlie the lack of complete sequences for human MUC genes in general and the MUC3 cluster in particular. In this work, we take advantage of the most recent T2T-CHM13 assembly of the human genome [12] to provide evidence for the existence of a human MUC3B gene. We also demonstrate that MUC3A and MUC3B genes are conserved in late hominoids such as the chimpanzee as well as Old World monkeys. Finally, by exploring published RNA-sequencing data sets, and applying quantitative gene expression analysis in human tissues, we show that MUC3A and MUC3B expression is limited to IECs.

Material and methods

Recruitment of patients and sample collection

Patients aged ≥18 years who were referred to Sahlgrenska University Hospital (Gothenburg, Sweden) for colonoscopy were eligible for inclusion, subject to the provision of written informed consent. Patients with macroscopic or microscopic evidence of ileocolonic pathology other than inflammatory bowel disease were excluded. Eight biopsies were obtained from the terminal ileum of each patient. The study protocol was approved by the regional ethics committee (Ethical permit #2020–03196) and complied with the Declaration of Helsinki.

Phylogenetic data

Phylogenetic trees and molecular time estimates were extracted from TimeTree [6, 13].

Sequence alignments

Local sequence similarity search and identity measurements of MUC genes were performed using NCBI BLAST [14]. Primer specificity was analyzed using Primer-BLAST [15]. Multiple sequence alignment of MUC gene and protein homologs was conducted using CLUSTALW [16]. Perl scripts were used for all data extraction (see supplementary methods in S1 File). Promoter regions up to 1 kb upstream of the transcription start sites of MUC3A and MUC3B in Cercopithecoid and Hominoid superfamilies were aligned using the Multiple Alignment using Fast Fourier Transform (MAFFT) high-speed multiple sequence alignment tool [17].

Generation of dot plots for pairwise sequence alignment and sequence logo representations

Dot plots representing pairwise sequence alignments were generated using Genome Pair Rapid Dotter (GEPARD) version 1.40 [18]. Sequence logos of perfect tandem repeats were generated using WebLogo3 [19].
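To illustrate why a dot plot is useful for repetitive sequences, the following minimal Python sketch records every position where a short word in one sequence exactly matches a word in the other. It does not reproduce the GEPARD algorithm; the function name `dot_plot_hits`, the word length, and the toy sequences are assumptions chosen for illustration only.

```python
from collections import defaultdict


def dot_plot_hits(seq_x, seq_y, word=3):
    """Return (i, j) pairs where the word starting at i in seq_x
    exactly equals the word starting at j in seq_y."""
    # Index every word of seq_y by its start position.
    index = defaultdict(list)
    for j in range(len(seq_y) - word + 1):
        index[seq_y[j:j + word]].append(j)

    # Look up every word of seq_x in that index.
    hits = []
    for i in range(len(seq_x) - word + 1):
        for j in index.get(seq_x[i:i + word], ()):
            hits.append((i, j))
    return hits


# Toy example (hypothetical sequences): the repeated unit "ABC" in seq_x
# matches the single unit in seq_y at two different offsets.
toy_hits = dot_plot_hits("ABCABC", "ABC", word=3)
```

In such a plot, a single diagonal indicates one collinear match, whereas stacks of parallel diagonals are the typical signature of tandem repeats.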

Mapping of DNase-seq and ChIP-seq data to the human genome

DNase hypersensitive sequences upstream of MUC3A in GRCh38.p13 and chromatin immunoprecipitation sequencing (ChIP-seq) data sets from human small intestine, colon, and stomach samples are summarized in S1 Table [20]. Graphical representation of epigenetic signatures was prepared by aggregating multiple segment-sorted tracks using the Matplot function in Washington University Epigenome Browser v53.5.0 [21].

Single-cell expression of transcription factors

Expression profiles for transcription factors were extracted from the following data sets available at Single Cell Portal (Broad Institute): single-cell transcriptome analysis of human small intestine (GSE148829) [22], human colon (GSE178341) [23], and mouse small intestine (GSE92332) [24].

Mapping of RNA-sequencing data to T2T-CHM13 human genome assembly

The T2T-CHM13 human genome assembly was downloaded from NCBI BioProject PRJNA559484. Fastq-dump was used to obtain RNA-sequencing reads. Burrows-Wheeler Aligner (BWA) software package [25] was used to align RNA-sequencing reads to exonic sequences of genes belonging to the MUC3 cluster. Perl scripts were used to perform quality control and measure read number (see supplementary methods in S1 File). The following publicly available data sets were used to determine MUC gene expression in human tissues: single-cell transcriptome analysis of human ileum, colon, rectum (GSE125970) [26], human liver (GSE124395) [27], and human kidney (GSE131685) [38]. Gene expression of individual MUC genes in the MUC3 cluster was calculated as transcripts per million (TPM) as previously described [28].
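As a rough illustration of TPM normalization, the sketch below converts read counts to reads per kilobase (RPK) and rescales them so that they sum to one million. The gene labels, counts, and lengths are hypothetical placeholders, and the sum runs only over the genes supplied, which is a simplifying assumption; the procedure described in [28] may differ in detail.

```python
def reads_per_kilobase(read_count, length_bp):
    """Reads divided by the feature length expressed in kilobases."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return read_count / (length_bp / 1000.0)


def tpm(read_counts, lengths_bp):
    """Transcripts per million for each gene in read_counts.

    Assumption: normalization is over the supplied genes only.
    """
    rpk = {
        gene: reads_per_kilobase(read_counts[gene], lengths_bp[gene])
        for gene in read_counts
    }
    total = sum(rpk.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {gene: value / total * 1e6 for gene, value in rpk.items()}


# Hypothetical numbers, not taken from the study.
example = tpm({"geneA": 10, "geneB": 30}, {"geneA": 1000, "geneB": 3000})
```

Because both toy genes have the same read density per kilobase, they receive the same TPM value even though their raw counts differ threefold, which is the point of length normalization.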

RNA extraction from human ileum, cDNA synthesis, and RT-qPCR

RNA from human ileal biopsies was extracted using RNeasy Mini Kit (Qiagen). 500 ng of RNA was reverse transcribed to cDNA with TaqMan Reverse Transcription kit (#N8080234, Applied Biosystems), using 2.5 μM random primers and the cycling parameters 25.0°C for 10 min, 37.0°C for 30 min, and 95.0°C for 5 min. 750 ng of cDNA was used for downstream reverse transcription quantitative PCR (RT-qPCR) with 0.3 μM MUC3A-specific primers (forward 5’-TGGGGGTCAGTGGGATGGCCTCAAA-3’; reverse 5’-CACGTGGGACCGCTCGTCTCC-3’) or MUC3B-specific primers (forward 5’-CGGGGGCCAGTGGGATGGCCTCAAG-3’; reverse 5’-CACGCGGGACCGCTCGTCTCT-3’) using SsoFast EvaGreen Supermix (#1725200, Bio-Rad) on a CFX96 Real-Time PCR Detection System (Bio-Rad) with the cycling parameters 95.0°C for 3 min, 39 cycles of 95.0°C for 10 s, 63.5°C for 10 s, 72.0°C for 20 s. Melting curve analysis was performed at 95.0°C for 10 s, and 65.0°C to 95.0°C at an increment of 5°C for 5 s.

Restriction site analysis and agarose gel electrophoresis

5 μL of the RT-qPCR reaction was digested with 1 μL FastDigest PstI restriction enzyme (#FD0614, ThermoFisher Scientific) for 1 h at 37°C. Full-length amplicons and digestion products were separated on 1.5% agarose gel with ethidium bromide.
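The expected outcome of such a digest can be sketched in silico. The Python example below finds PstI recognition sites (CTGCAG, cut after the A on the top strand) and returns top-strand fragment lengths. The amplicon used here is a synthetic placeholder, not the real MUC3A sequence; the site was placed at an assumed position purely so that the fragment sizes match the 380 bp and 266 bp reported for the 646 bp amplicon.

```python
PSTI_SITE = "CTGCAG"
PSTI_CUT_OFFSET = 5  # PstI cuts the top strand after CTGCA


def digest_lengths(seq, site=PSTI_SITE, cut_offset=PSTI_CUT_OFFSET):
    """Return top-strand fragment lengths after cutting at every site."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Synthetic 646 bp placeholder with one PstI site (assumed position).
with_site = "A" * 375 + PSTI_SITE + "T" * 265
# Synthetic 646 bp placeholder without a PstI site.
without_site = "A" * 646

sensitive = digest_lengths(with_site)     # two fragments
resistant = digest_lengths(without_site)  # amplicon stays intact
```

An amplicon lacking the site remains full length, which is what allows PstI-sensitive and PstI-resistant products to be told apart on a gel.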

Statistics

Statistical analysis and graphical illustrations were performed using GraphPad PRISM 8.3.1 (GraphPad Software). Statistical differences were tested using two-way ANOVA and corrected for multiple comparisons using Tukey’s test. Data are presented as mean ± standard deviation (SD). For all statistical analyses: * p<0.05, ns = not significant.

Results

The evolution of a MUC3 cluster in Cercopithecoids and Hominoids

The human chromosome locus 7q22 contains three MUC genes, MUC3, MUC12, and MUC17, arranged in a MUC3 cluster flanked by ACHE at its 5’ end, and TRIM56 and SERPINE1 at its 3’ end (Fig 1A). Using ACHE, TRIM56, and SERPINE1 as genomic markers, we identified the MUC3 cluster in species belonging to the Catarrhini parvorder, namely Cercopithecoid (Old World monkeys) and Hominoid superfamilies, the latter including the genera Pongo (orangutan), Gorilla, Pan (chimpanzee and bonobo) and Homo (Fig 1B). In Cercopithecoids we identified a MUC3 cluster with a length of 153 kilobase pairs (kbp) in Macaca mulatta (rhesus), while the corresponding gene cluster in Papio anubis (baboon) consisted of two mapped sequences with a total length of 138 kbp (Fig 1B). In Hominoids, MUC3 cluster length ranged from 106 kbp in Nomascus leucogenys (gibbon) to 203 kbp in Pongo abelii (orangutan). In the Homininae subfamily, we observed striking differences between MUC3 cluster length in Pan troglodytes (chimpanzee) and its two closest relatives; the gene cluster in the H. sapiens GRCh38.p13 assembly was 73 kbp shorter and in G. gorilla (gorilla) 66 kbp shorter than in the chimpanzee (Fig 1B). Thus, we hypothesized that the MUC3 cluster within the human GRCh38.p13 assembly contains significant sequence gaps that may obscure unknown MUC genes. To test our hypothesis, we used a set of defined criteria when exploring available primate genome assemblies for unidentified MUC3 cluster genes. We scanned the clusters for 1) start codons, 2) long mucin-type PTS-encoding exons, 3) SEA domains conserved in membrane mucins, and 4) unique intronic and exonic sequences that separate individual MUC genes. Our analysis revealed that all Cercopithecoids carried a MUC3 cluster consisting of MUC3, MUC12, and MUC17 genes (Fig 1B). Strikingly, the primate MUC3 gene existed as two distinct MUC3A and MUC3B genes, although only partial sequences of the MUC3B gene were identified in P. anubis.
The Hominoid superfamily, except for H. sapiens and G. gorilla, carried a MUC3A gene and full or partial sequences of MUC3B. Thus, we identified a MUC3B gene exclusively in species belonging to the Catarrhini parvorder, which diverged from Platyrrhini (New World monkeys) around 43 million years ago (Mya) (Fig 1C). However, because of inadequate sequence coverage of the MUC3 cluster in Platyrrhini and in the Scandentia (treeshrew) and Dermoptera (colugo) orders that constitute the closest relatives of primates, we were not able to determine when MUC3B first emerged during mammalian evolution.
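One of the scanning criteria above, the search for long PTS-rich stretches, can be sketched as a sliding-window composition filter on a translated sequence. This is only an illustrative approximation and not the procedure used in the study; the function names, the window size, and the threshold are hypothetical choices.

```python
PTS_RESIDUES = frozenset("PTS")


def pts_fraction(protein):
    """Fraction of residues that are proline, threonine, or serine."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    return sum(aa in PTS_RESIDUES for aa in protein) / len(protein)


def pts_rich_windows(protein, window=50, threshold=0.5):
    """Start indices of windows whose PTS fraction is >= threshold."""
    protein = protein.upper()
    if len(protein) < window:
        return []
    flags = [1 if aa in PTS_RESIDUES else 0 for aa in protein]
    count = sum(flags[:window])
    starts = []
    for start in range(len(protein) - window + 1):
        if start > 0:
            # Slide the window by one residue: add the new, drop the old.
            count += flags[start + window - 1] - flags[start - 1]
        if count / window >= threshold:
            starts.append(start)
    return starts


# The 17-residue MUC3 repeat consensus reported in this paper
# contains 13 P/T/S residues.
consensus_fraction = pts_fraction("ITTTETTSHSTPSFTSS")
```

Runs of consecutive window starts returned by `pts_rich_windows` would mark candidate mucin-type domains for closer inspection.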

The human 7q22 locus contains a MUC3B gene

Since humans and chimpanzees share 98.8% of their genomic DNA and the chimpanzee genome carries a MUC3B gene in the MUC3 cluster, we hypothesized that the absence of a MUC3B gene in humans is a result of sequence gaps in the GRCh38.p13 assembly. In the quest for a human MUC3B gene, we explored PacBio Single Molecule Real-Time (SMRT) reads from the human HX1 genome [29] and identified 3 individual reads that covered the 3’ end region of MUC3A (encoding the C-terminal region of MUC3A protein, designated MUC3A C-term), an intergenic region, and the 5’ end region of a putative MUC3B gene (designated MUC3B N-term) (Fig 2A). Strikingly, the length of the intergenic region was on average 10,939 bp, which corresponded to the length of the MUC3A-MUC3B intergenic region in Catarrhines (average of 11,810 bp). Moreover, we identified 5 SMRT reads covering MUC3B C-term, an intergenic region, and MUC12 N-term (Fig 2A). The average length of the MUC3B-MUC12 intergenic region was 2,469 bp and conserved in Catarrhines (average of 2,491 bp). This initial exploration provided evidence for the existence of a distinct human MUC3B gene. However, because MUC3A and MUC3B share high sequence identity (87% and 94% for N-term and C-term across catarrhines) and the error rate of the SMRT reads was 70–85%, the HX1 assembly could not with high confidence distinguish between the two MUC3 genes. Moreover, the reads failed to capture the length and sequence of a predicted single PTS-encoding exon in MUC3B.
Fig 2

Evidence of a putative MUC3B gene in recent human genome assemblies.

(A) Exploration of PacBio sequencing of HX1 genome identified SMRT reads covering the intergenic region between MUC3A and putative MUC3B, an incomplete PTS sequence, and intergenic sequences between putative MUC3B and MUC12. (B) The T2T-CHM13 assembly fills a 60 kbp gap between MUC3A and MUC12. (C) Sequence alignments of SEA, transmembrane (TM), and cytoplasmic tails (CT) of MUC3A and putative MUC3B show high sequence identity, nucleotide mismatches, and a conserved PDZ binding motif.

The current GRCh38.p13 draft covers lightly packed euchromatic regions corresponding to 92% of the human genome, while more complex regions including long tandem repeats in MUC genes are underrepresented. The recently published T2T-CHM13 v1.1 assembly, based on long-read genome sequencing of homozygous complete hydatidiform mole (CHM) cells followed by gapless telomere-to-telomere assembly, adds approximately 200 Mbp to the GRCh38.p13 assembly [12]. Importantly, the T2T-CHM13 assembly filled a 60 kbp gap between MUC3A and MUC12 at locus 7q22 (Fig 2B). Within this gap, we identified a 39,267 bp long PTS-encoding exon flanked upstream by a 2,187 bp long sequence with 87% identity to MUC3A N-term. Downstream of the PTS-encoding exon, we identified a 6,303 bp long sequence that was 92% identical to MUC3A C-term and contained a SEA domain, a transmembrane domain, and a cytoplasmic tail with a conserved PDZ motif [30] (Fig 2C). Thus, our findings suggest that the T2T-CHM13 assembly contains a putative MUC3B gene at locus 7q22 with high sequence identity to MUC3A. Although previous studies have proposed the existence of a human MUC3B gene [31, 32], most recently in the African pan-genome [33] that contains a 22,827 bp contig aligning with MUC3B exon 2 (S1A Fig in S1 File), the complete length, sequence and exon-intron architecture of MUC3B, including its repetitive PTS-coding exon 2, remain unresolved.

Distinct human MUC3A and MUC3B genes share high sequence homology

To better characterize the putative human MUC3B gene, we compared the exon-intron architecture of MUC3B to MUC3A. MUC3A has been reported to contain 11 exons, including a PTS-encoding exon with a length of at least 6 kbp [34]. Our analysis showed that MUC3A and MUC3B both have 12 exons, revealing a previously overlooked exon 4 (S1B, S1C Fig in S1 File). Exons of the two MUC3 genes have nearly identical nucleotide lengths, except for the single PTS-encoding exon 2, which measures 15,873 bp (5,291 amino acids) in MUC3A and 39,267 bp (13,089 amino acids) in MUC3B (Fig 3A). Nucleotide sequence identity between MUC3A and MUC3B was on average 93% for exons, and 92% for introns. The superfamilies of Hominoids (apes and humans) and Cercopithecoids (Old World monkeys) diverged around 29 Mya [6]. Sequence alignments between N- and C-termini of MUC genes in the MUC3 cluster showed a high degree of conservation between H. sapiens and members of the Cercopithecoid and Hominoid branches. Human MUC3A N-term was 99% identical to chimpanzee MUC3A N-term and 90–91% identical to MUC3A N-term in the Cercopithecoid members rhesus and baboon. MUC3A C-term showed a slightly higher degree of divergence compared to MUC3A N-term (Fig 3B and S2 Table). The same trend was observed for MUC3B, in which the MUC3B C-term was less conserved than MUC3B N-term. Tandem repeat regions are prone to duplications and deletions caused by recombination [35]. Accordingly, we observed higher evolutionary sequence divergence in the PTS-encoding exon 2 of MUC genes in the MUC3 cluster (Fig 3B and S2 Table). Moreover, pairwise alignment of Catarrhini MUC3 cluster genes revealed a general trend toward the expansion of tandem repeats during primate evolution (S1D, S1E Fig in S1 File). Specifically, within exon 2 of human MUC3A and MUC3B, we identified imperfect repeats with 87% amino acid sequence identity between MUC3A and MUC3B. In addition, MUC3B contained an additional 1,368 amino acids of imperfect repeats (Fig 3C).
MUC3A and MUC3B also harbored 166 and 549 perfect tandem repeats, respectively, consisting of a 17 amino acids long consensus sequence (ITTTETTSHSTPSFTSS) (Fig 3D). We conclude that the genetic structure of MUC3B is highly similar to MUC3A and that the two MUC3 genes are likely paralogous genes characterized by variable number of tandem repeats.
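Counting copies of a repeat unit in a translated exon can be sketched as follows. The example assumes that a "perfect" repeat means an exact copy of the 17-residue consensus, which may not match the definition used for Fig 3D; the function names and the toy sequence are hypothetical.

```python
import re

CONSENSUS = "ITTTETTSHSTPSFTSS"  # 17-residue consensus reported above


def count_exact_repeats(protein, unit=CONSENSUS):
    """Number of exact, non-overlapping copies of unit in protein."""
    return len(re.findall(re.escape(unit), protein.upper()))


def longest_tandem_run(protein, unit=CONSENSUS):
    """Largest number of copies of unit that occur back to back."""
    runs = re.findall(f"(?:{re.escape(unit)})+", protein.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Toy sequence (not real MUC3 data): three tandem copies, a short
# spacer, then two more tandem copies.
toy = "MK" + CONSENSUS * 3 + "GG" + CONSENSUS * 2
total_copies = count_exact_repeats(toy)
longest_run = longest_tandem_run(toy)
```

Separating the total copy number from the longest uninterrupted run is a design choice made here because interspersed imperfect repeats can break an array into several blocks.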
Fig 3

Comparison of genetic and structural features of MUC3A and MUC3B genes.

(A) Exon structure and length of exon 2 of MUC3A and MUC3B. (B) The evolutionary rate of N-terminal-, PTS- and C-terminal-encoding exons in MUC3A, MUC3B, MUC12, and MUC17 measured as gene content conservation (%) versus evolutionary distance (Mya). (C) Dot plot of pairwise sequence alignment of MUC3A and MUC3B identified imperfect (blue) and perfect (red) tandem repeat sequences in exon 2. (D) Sequence logo representation of 17 amino acids long consensus sequence in 166 and 549 perfect tandem repeats (TRs) in exon 2 of MUC3A and MUC3B, respectively.

MUC3A and MUC3B are regulated by conserved regulatory elements

Sequences upstream of transcription start sites (TSS) contain regulatory elements that dictate gene expression. Promoter activity within positions -1 to -242 upstream of the MUC3A TSS has been reported previously [32], yet evolutionary conservation of the promoter regions and the transcription factors that potentially regulate MUC3 genes are largely unknown. Sequence analysis of presumed regulatory sequences within 1 kbp upstream of the human MUC3A TSS identified a candidate cis-Regulatory Element (cCRE) at position -1 to -403 bp (Fig 4A), which shared 83% identity with the corresponding region in MUC3B. Published DNase I hypersensitive site sequencing (DNase-seq) data sets from the human small intestine and colon revealed high signals within the cCRE (Fig 4A). Moreover, we identified high signals for active chromatin markers H3K9ac and H3K4me3 within the MUC3A cCRE in the human small intestine and colon, while active chromatin signals in the stomach were either low or not detected. By predicting transcription factor binding sites (TFBSs) using the JASPAR CORE vertebrate collection [36, 37], we identified putative TFBSs in the cCRE of MUC3A (S2 Fig in S1 File). Seven of these transcription factors (ELF3, HNF4A, HNF4G, KLF4, PPARA, STAT3, and XBP1) were enriched in transporting IECs in human and mouse intestines (Fig 4B and S3 Fig in S1 File). Alignment of putative promoter regions upstream of MUC3A and MUC3B genes in Hominoid and Cercopithecoid superfamilies identified conserved TFBSs for HNF4A, HNF4G, and STAT3, strongly suggesting that the two MUC3 genes share an evolutionarily conserved regulatory expression program in the small intestine and colon (Fig 4C, S4 Fig in S1 File and S3 Table).
Fig 4

Conserved regulatory elements upstream of MUC3A and MUC3B genes.

(A) Epigenetic analysis of the human small intestine and colon reveals a DNase I-sensitive cCRE and specific histone modifications surrounding the MUC3A transcription start site. (B) Single-cell analysis of human and mouse intestines shows gene expression of transcription factors in transporting IECs, with conserved binding sites upstream of MUC3A and MUC3B. (C) Binding sites for transcription factors STAT3 and HNF4A/G are completely conserved upstream of MUC3A and MUC3B in Cercopithecoid and Hominoid superfamilies.

Expression of human MUC3A and MUC3B genes in the human intestine

To determine whether MUC3B is transcribed into messenger RNA, we mapped published RNA-sequencing data sets from the human intestine [26], liver [27], and kidneys [38] to the T2T-CHM13 assembly. A considerable number of sequenced reads from MUC3A and MUC3B transcripts were detected in the human ileum, colon, and rectum (Fig 5A), while the liver and kidneys were devoid of transcripts from the MUC3 cluster genes (S4 Table). Our findings are supported by the human cell atlas [39], which shows that MUC3A is mainly expressed in epithelial cells of the small intestine and colon (1,446 of 2,316 MUC3A cells) (S5 Fig in S1 File). Our data contradict a previous observation of MUC3A transcripts in the heart, liver, prostate, and thymus, and MUC3B transcripts in the small intestine and colon [40]. Notably, Primer-BLAST analysis showed that the MUC3B probe used in that study was specific for exon 3 of MUC17, whereas the MUC3A probe was specific for MUC3A exon 2 (S4 Table). Since these exons encode the repetitive mucin-type PTS domain, we cannot exclude nonspecific detection of mucin transcripts in other tissues.
Fig 5

Expression of MUC3B gene in the human intestine.

(A) Unique reads for MUC3A and MUC3B in RNA-sequencing data from human ileum, colon, and rectum mapped to T2T-CHM13 human genome assembly. (B) Gene expression of MUC3 cluster genes in human ileum, colon, and rectum. 2 samples per tissue segment. * p<0.05 as determined by two-way ANOVA, corrected for multiple comparisons using Tukey’s test. Data are presented as mean ± standard deviation (SD). (C) Specific primers amplify a 646 bp cDNA spanning exons 3–8 in MUC3A and MUC3B transcripts from the ileum of five individuals. MUC3A cDNA carries a PstI restriction site in exon 6 that distinguishes MUC3A from MUC3B transcripts. Agarose gel electrophoresis of PstI restriction digests of amplified cDNA from MUC3A and MUC3B transcripts results in 380 bp and 266 bp fragments from MUC3A cDNA. (D) Quantification of bands from agarose gel in C. n = 5 individuals. Data are presented as mean ± standard deviation (SD).

Because exons encoding the N-terminal, PTS, and C-terminal regions of MUC3A and MUC3B share 87%, 83%, and 92% identity, respectively, and PTS-encoding exons are highly repetitive, it is challenging to detect unique reads that accurately distinguish between the two MUC3 genes. Therefore, we turned our attention to reads that map to exons 3–12, where we identified an average of 3.5±1.6 unique reads per kilobase transcript (RPK) of MUC3A and 13.0±4.4 unique RPK of MUC3B (Fig 5A and S4 Table). We next used unique and shared reads in the C-terminal region to calculate normalized gene expression of MUC3A, MUC3B, MUC12, and MUC17 in the human intestine. In the ileum, MUC17 showed significantly higher expression than MUC3A, MUC3B, and MUC12, while MUC12 showed a trend towards higher expression in the rectum compared to the ileum. We detected comparable numbers of MUC3A and MUC3B transcripts in all three intestinal segments (Fig 5B).
Next, we applied targeted reverse transcription quantitative polymerase chain reaction (RT-qPCR) to validate the presence of unique MUC3A and MUC3B transcripts in ileal biopsies collected from five human patients (S5 Table). For this purpose, we designed gene-specific primer pairs that target exons 3 and 8 with a 9.5–12.0% mismatch between MUC3A and MUC3B. The resulting 646 bp cDNA amplicons from each gene transcript were further distinguishable by a unique PstI restriction site in the MUC3A cDNA amplicon (Fig 5C). RT-qPCR from all five patients resulted in the expected 646 bp cDNA amplicon, and subsequent PstI digestion produced 380 bp and 266 bp restriction fragments (Fig 5C). Notably, we observed significant differences in the intensities of PstI-sensitive and PstI-resistant fragments produced by the two gene-specific primer pairs. 75% of amplicons generated by MUC3A-specific primers were PstI-sensitive and therefore originated from MUC3A transcripts (Fig 5D). Similarly, 76% of amplicons generated by MUC3B-specific primers were PstI-resistant MUC3B transcripts. Thus, despite significant sequence similarity between the two MUC3 genes, we successfully identified and distinguished between MUC3A and MUC3B transcripts in the human intestine.
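The primer mismatch range quoted above can be reproduced from the primer sequences listed in the Methods. The following Python sketch compares each MUC3A primer with its MUC3B counterpart position by position; the function name `mismatch_percent` is an illustrative choice, and the comparison assumes the paired primers align without gaps, which holds here because they have equal lengths.

```python
def mismatch_percent(primer_a, primer_b):
    """Percentage of positions that differ between two equal-length primers."""
    if len(primer_a) != len(primer_b):
        raise ValueError("primers must have equal length")
    mismatches = sum(a != b for a, b in zip(primer_a.upper(), primer_b.upper()))
    return 100.0 * mismatches / len(primer_a)


# Primer sequences as given in the Methods (MUC3A first, MUC3B second).
FORWARD = ("TGGGGGTCAGTGGGATGGCCTCAAA", "CGGGGGCCAGTGGGATGGCCTCAAG")
REVERSE = ("CACGTGGGACCGCTCGTCTCC", "CACGCGGGACCGCTCGTCTCT")

print(f"forward: {mismatch_percent(*FORWARD):.1f}%")  # forward: 12.0%
print(f"reverse: {mismatch_percent(*REVERSE):.1f}%")  # reverse: 9.5%
```

The forward primers differ at 3 of 25 positions and the reverse primers at 2 of 21 positions, including the 3’-terminal base in both pairs.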

Completion of a gapless human MUC3 cluster at locus 7q22

Finally, based on the T2T-CHM13 assembly, we revised the gapless length and sequence of all four membrane mucin genes in the MUC3 cluster at locus 7q22. In this new assembly, we identified a longer PTS-encoding exon 2 of 15,870 bp in MUC3A compared to 8,805 bp in GRCh38.p13. The PTS-encoding exon of human MUC12 was 32,428 bp long compared to 14,935 bp in GRCh38.p13 (S6 Table). The complete gapless sequences of MUC3A, MUC3B, MUC12, and MUC17 genes at the 7q22 locus are publicly available via Mucin Biology Groups’ Mucin database (http://www.medkem.gu.se/mucinbiology/databases/index.html).
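To make the size of the revision concrete, the exon lengths quoted above can be compared directly. The lengths are taken from the text; the dictionary layout and the helper name are illustrative choices only.

```python
# PTS-encoding exon lengths in bp, as reported in the text above.
pts_exon_bp = {
    "MUC3A": {"T2T-CHM13": 15870, "GRCh38.p13": 8805},
    "MUC12": {"T2T-CHM13": 32428, "GRCh38.p13": 14935},
}


def added_sequence(lengths):
    """Return (bp gained, length ratio) of T2T-CHM13 relative to GRCh38.p13."""
    gained = lengths["T2T-CHM13"] - lengths["GRCh38.p13"]
    ratio = lengths["T2T-CHM13"] / lengths["GRCh38.p13"]
    return gained, ratio


for gene, lengths in pts_exon_bp.items():
    gained, ratio = added_sequence(lengths)
    print(f"{gene}: +{gained} bp ({ratio:.2f}x the GRCh38.p13 length)")
```

By this arithmetic, the PTS-encoding exon is roughly 1.8 times longer for MUC3A and roughly 2.2 times longer for MUC12 in T2T-CHM13 than in GRCh38.p13.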

Discussion

Mucin genes contain long protein-coding sequences consisting of tandem repeats that are difficult to read and measure. As a result, many human mucin gene sequences remain incomplete. Sequence gaps also appear in the genome of Mus musculus, an important model organism for understanding human gene function. In an attempt to fill critical knowledge gaps in mucin genetics, we focused on a cluster of membrane mucin genes, the MUC3 cluster, at locus q22 on human chromosome 7. The MUC3 cluster is conserved in the Cercopithecoid and Hominoid superfamilies, where two distinct MUC3A and MUC3B genes are annotated in all species except H. sapiens and G. gorilla. In this study, we leveraged the recent T2T-CHM13 assembly of the human genome to fill a 60 kb sequence gap sandwiched between MUC3A and MUC12 genes. Sequence alignment revealed a membrane mucin gene that shares high structural and sequence similarity with MUC3A; it consists of 12 exons and carries a PTS-encoding exon 2 encompassing imperfect and perfect tandem repeats that are conserved in MUC3A. Moreover, the MUC3B gene encodes a SEA domain, a transmembrane domain, and a cytoplasmic tail with a Class I PDZ motif that is conserved in the annotated membrane mucins of the MUC3 cluster. Importantly, nucleotide mismatches in introns and exons clearly distinguished MUC3B from MUC3A. Also, the lengths of intergenic regions spanning MUC3A, the putative MUC3B, and MUC12 corresponded to the intergenic lengths observed within the MUC3 cluster of Cercopithecoids and Hominoids. Our study presents the complete, gapless sequence of MUC3B and the entire human MUC3 cluster including all introns and exons. The evolutionary conservation of MUC3A and MUC3B genes suggests that their regulation is conserved in higher mammals.
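The idea that nucleotide mismatches separate two highly similar genes can be expressed as a percent-identity calculation over an existing pairwise alignment. The sketch below is an assumption-laden illustration: the function name and toy sequences are hypothetical, and excluding gap columns from the denominator is only one of several identity conventions, not necessarily the one used in the study.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns, skipping columns that contain a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # gap columns are excluded under this convention
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, not MUC3A/MUC3B sequence).
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # one mismatch in ten columns
print(percent_identity("ACG-TA", "ACGTTA"))          # gap column is skipped
```

The complement of this value (100 minus percent identity) is the mismatch rate, which is the quantity that makes reads or primers gene-specific.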
Comparative sequence alignments and available DNase I- and ChIP-seq data sets uncovered a conserved cis-regulatory element upstream of MUC3B that contains binding sites for transcription factors HNF4A, HNF4G, and STAT3. Notably, HNF4A and HNF4G regulate the expression of genes encoding proteins that regulate the assembly and maintenance of the microvillus-studded apical brush border in transporting IECs [41]. STAT3 acts downstream of the heteromeric epithelial cell receptor for cytokine IL-22 that regulates the expression of MUC17, which builds a protective glycocalyx barrier atop the brush border of transporting IECs [42]. In addition, mapping of published RNA-seq data sets to the T2T-CHM13 assembly identified unique sequencing reads for MUC3A and MUC3B genes in the human intestine, while gene expression was absent in the liver and kidneys. Finally, we validated high-throughput expression data by a targeted quantitative detection of distinct MUC3A and MUC3B transcripts in the human ileum. Collectively, we identified a previously unannotated MUC3B gene at locus 7q22 and provide evidence for its expression in human IECs. All examined species of the Catarrhini parvorder carry MUC3A and MUC3B, while Platyrrhini and the closest evolutionary relatives of primates only carry MUC3A. Although it is tempting to suggest a MUC3 gene duplication event in the Simian infraorder, the lack of long sequencing reads (30–40 kbp) covering the MUC3 cluster in genomes outside the Catarrhini limits our understanding of when MUC3B emerged during evolution. Interestingly, the N- and C-terminal regions of MUC3A and MUC3B are highly conserved within Catarrhini, whereas the PTS-encoding exons exhibit higher evolutionary divergence. The PTS domains of membrane mucin genes are encoded by short nucleotide sequences organized in tandem repeats. PTS domains are generally poorly conserved and polymorphic [3] since individual repeats are added or removed through recombination to generate VNTRs.
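A variable number of tandem repeats simply means that alleles differ in how many consecutive copies of a repeat unit they carry. The sketch below counts perfect, back-to-back copies only. The repeat unit, the flanking bases, and both alleles are hypothetical, and imperfect repeats such as those described for the PTS-encoding exons would need an alignment-based approach rather than exact matching.

```python
def longest_tandem_run(sequence, unit):
    """Return the largest number of perfect, consecutive copies of `unit` in `sequence`."""
    seq = sequence.upper()
    unit = unit.upper()
    if not unit:
        raise ValueError("unit must not be empty")
    best = 0
    for start in range(len(seq)):
        copies = 0
        pos = start
        # Count back-to-back copies of the unit beginning at this position.
        while seq.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
    return best


# Hypothetical repeat unit and two hypothetical alleles with different copy numbers.
repeat_unit = "ACCTCT"
allele_short = "GG" + repeat_unit * 3 + "TT"
allele_long = "GG" + repeat_unit * 5 + "TT"

print(longest_tandem_run(allele_short, repeat_unit))
print(longest_tandem_run(allele_long, repeat_unit))
```

Checking every start position is slower than skipping ahead after a run, but it avoids missing runs that begin inside an earlier, shorter match when the unit overlaps itself.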
In analogy with other genes carrying VNTRs [43], our study shows that the tandem repeat regions of MUC3 cluster genes have undergone expansion during primate evolution. Low conservation and considerable polymorphism between and within species suggest that O-glycosylation of mucin VNTRs is a non-template-driven process under evolutionary and environmental pressure. For example, glycosylation of mucin VNTRs in microbe-rich environments such as the oral cavity and gastrointestinal tract is likely under selective pressure to maintain appropriate interactions with microorganisms that have coevolved with the host through various periods of geographical, dietary, and lifestyle adaptations. This co-speciation is evident in the gastrointestinal tract, where the microbiome of present-day humans is enriched in mucin-degrading genes compared to a higher abundance of starch- and chitin-degrading genes in our ancestral microbiome [44]. Another example is found in the epithelial cell surface glycocalyx, where O-glycosylation underwent major remodeling >2 Mya when a human ancestor acquired an inactivating mutation in CMAH, a gene responsible for converting N-acetylneuraminic acid (Neu5Ac) to N-glycolylneuraminic acid (Neu5Gc) [45]. The resulting accumulation of terminal Neu5Ac in the glycocalyx of human cells has since been exploited by numerous pathogens such as Vibrio cholerae [46] and SARS-CoV-2 [47]. The emergence of mucin genes as a result of environmental adaptation has been attributed to the process of convergent evolution, where genes encoding proline-rich proteins independently gain serine and threonine residues that assemble into tandemly repeated O-glycosylated mucin domains [48]. Due to existing challenges in sequencing very long repetitive regions, the nature of mucin polymorphism and its contribution to human disease phenotypes remain elusive.
A recent study showed that the length of VNTRs in membrane mucin MUC1 is associated with several disease phenotypes related to kidney function [49], supporting the notion that glycosylated PTS domains of membrane mucins play critical roles in organ function and homeostasis. Intestinal membrane mucin MUC17 is genetically and structurally related to MUC3A and MUC3B and functions as a major building block of the dense glycocalyx covering transporting IECs. In mouse small intestine, Muc17 expression is induced during the suckling-weaning transition when the quantity and complexity of the gut microbiota increase and create a demand for IECs to establish a cell-attached glycocalyx that prevents adhesion of luminal bacteria to the epithelium [42]. While the functions of MUC3A, MUC3B, and MUC12 remain elusive, their expression varies along different segments of the human intestine, suggesting that MUC3 cluster genes perform segment- and cell-specific functions in humans and other mammalian vertebrates. Our comprehensive map of the MUC3 cluster in the human genome provides opportunities to identify new VNTR polymorphisms associated with disease phenotypes and allows for future exploration of gene orthologs of the MUC3 cluster in experimental mammalian models such as the mouse.

DNase-sequencing and chromatin immunoprecipitation sequencing data sets used in this study. (XLSX)

Conservation of N-terminal, PTS, and C-terminal regions of MUC3 cluster genes in primates. (XLSX)

Binding sites upstream of the MUC3A gene transcription start site for transcription factors enriched in intestinal epithelial cells. (XLSX)

Identification of MUC3 cluster genes in human tissues. (XLSX)

Patient demographics. (XLSX)

Statistical summary of MUC3 cluster genes belonging to members of Cercopithecoids and Hominoids superfamilies. (XLSX)

Supporting information containing S1-S5 Figs and supplementary methods. (DOCX)

Raw image of agarose gel shown in Fig 5C. (PDF)

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

29 Jun 2022
PONE-D-22-00623
Discovery of a MUC3B gene reconstructs the membrane mucin gene cluster on human chromosome 7
PLOS ONE Dear Dr. Pelaseyed, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
 
Please pay particular attention to the comments from the reviewers concerning additional discussion points, especially those associated with currently non-discussed studies. Please submit your revised manuscript by Aug 31 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols. We look forward to receiving your revised manuscript. Kind regards, Michael Scott Brewer, Ph.D. Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. 
PLOS ONE now requires that authors provide the original uncropped and unadjusted images underlying all blot or gel results reported in a submission’s figures or Supporting Information files. This policy and the journal’s other requirements for blot/gel reporting and figure preparation are described in detail at https://journals.plos.org/plosone/s/figures#loc-blot-and-gel-reporting-requirements and https://journals.plos.org/plosone/s/figures#loc-preparing-figures-from-image-files. When you submit your revised manuscript, please ensure that your figures adhere fully to these guidelines and provide the original underlying images for all blot or gel data reported in your submission. See the following link for instructions on providing the original image data: https://journals.plos.org/plosone/s/figures#loc-original-images-for-blots-and-gels. In your cover letter, please note whether your blot/gel image data are in Supporting Information or posted at a public data repository, provide the repository URL if relevant, and provide specific details as to which raw blot/gel images, if any, are not available. Email us at plosone@plos.org if you have any questions 3. Thank you for stating the following in the Acknowledgments Section of your manuscript: "We thank Professor Gunnar C. Hansson for valuable discussions. This work was supported by the Swedish Society for Medical Research (Svenska Sällskapet för Medicinsk Forskning, grant S17-0005), National Institutes of Health (grants 5U01AI095542-08-WU-19-95 and 5U01AI095542-09-WU-20-77), Wenner-Gren Foundations (grants FT2017-0002, UPD2018-0065, and WUP2017-0005), Jeansson Foundations (grant JS2017-0003), and the Åke Wiberg Foundation (grant M17-0062)." We note that you have provided funding information that is not currently declared in your Funding Statement. However, funding information should not appear in the Acknowledgments section or other areas of your manuscript. 
We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript and let us know how you would like to update your Funding Statement. Currently, your Funding Statement reads as follows: "TP was supported by - Grant S17-0005, Swedish Society for Medical Research, https://www.ssmf.se - Grants 5U01AI095542-08-WU-19-95 and 5U01AI095542-09-WU-20-77, National Institutes of Health, https://www.niaid.nih.gov - Grants FT2017-0002, UPD2018-0065, and WUP2017-0005, Wenner-Gren Foundations, https://www.swgc.org/ - Grant JS2017-0003, Jeansson Foundations, http://jeanssonsstiftelser.se/en/ - Grant M17-0062, Åke Wiberg Foundation, https://ake-wiberg.se/ The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." Please include your amended statements within your cover letter; we will change the online submission form on your behalf. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes Reviewer #2: Yes ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes Reviewer #2: Yes ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? 
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes Reviewer #2: Yes ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: This study looks to resolve a complex locus of the human genome that contains a cluster of transmembrane mucin genes. Since mucin genes are rich in repeats, short read sequencing technologies have trouble genotyping mucins, and can even lead to misassemblies or gaps in the genome. The authors focus on resolving this specific gene locus by analyzing recently available human genome assembly, CH1M3. By integrating cross-primate syntenic analysis, available tissue expression data, and validation approaches, the authors confirm the presence of MUC3B in the transmembrane mucin cluster. 
Overall the authors findings help resolve a complicated locus in the human genome. I think the paper adds valuable data to the mucin genetics community. I have some comments that should be addressed. Major Comments: Although the authors are first to better resolve this gene locus using the new T2T human reference genome in an independent study, the authors only briefly stated that MUC3B has been previously found and distinguished from MUC3A : https://www.sciencedirect.com/science/article/abs/pii/S0006291X00934065?via%3Dihub, https://www.jbc.org/article/S0021-9258(20)75740-6/fulltext. Recent papers have highlighted this locus as having MUC3-like absence in the hg38 reference genome: https://www.nature.com/articles/s41588-018-0273-y.pdf. These papers should be discussed further as they described MUC3B at the sequence and transcriptomic levels previously. Overall more emphasis should be placed on prior discovery and distinguish the findings of this particular study from previous work In Line 188 the authors state that the MUC3B “exclusively” evolved in Catarrhini? Was there any sort of BLAST analysis, or syntenic analysis in other species where MUC3B may be ancestral or have evolved recurrently in other species. If not, the language should be toned-down and stated that other species outside of primates were not examined. Minor Comments: The author should assess the language more carefully in the manuscript. There are several typos and grammatical issues, particularly in the Discussion section. The examples include but are not limited to the following: Line 38 : “characterized” is used twice Line 49: “investigation of” should read “to investigate” Line 63: “reach” should read “reaches” Line 229: “in” should be “on” Line 271: should the second “MUC3A” in the sentence read “MUC3B”? Line 308: The opening sentence of the Discussion is not a complete sentence. 
Line 309: “As results” should read “as a result” Line 310: “uncomplete” should read “incomplete” In the introduction lines 69-81 the authors describe the MUC3 sequence of the mouse as having homology to MUC17. There is no paper cited for this claim. Reviewer #2: The manuscript entitled “Discovery of a MUC3B gene reconstructs the membrane mucin gene cluster on human chromosome 7” used patients’ tissue from colonoscopy to construct sequence alignment and sequence based on VNTRs in human genome assemblies. The study design was well with valid verification. However, is the novel gene reconstructs specific for intestinal tissues? Or general expression? Please discuss this part in the discussion part. ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. 
If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

23 Aug 2022

Response Reviewers (see “Revised Manuscript with Track Changes”)

Reviewer #1 Major Comments: 1. Although the authors are first to better resolve this gene locus using the new T2T human reference genome in an independent study, the authors only briefly stated that MUC3B has been previously found and distinguished from MUC3A: https://www.sciencedirect.com/science/article/abs/pii/S0006291X00934065?via%3Dihub https://www.jbc.org/article/S0021-9258(20)75740-6/fulltext. Recent papers have highlighted this locus as having MUC3-like absence in the hg38 reference genome: https://www.nature.com/articles/s41588-018-0273-y.pdf. These papers should be discussed further as they described MUC3B at the sequence and transcriptomic levels previously. Overall more emphasis should be placed on prior discovery and distinguish the findings of this particular study from previous work.

We thank reviewer #1 for these important comments. We made substantial additions to the original manuscript to address and discuss earlier works by:

Gum JR et al (1997) MUC3 human intestinal mucin. Analysis of gene structure, the carboxyl terminus, and a novel upstream repetitive region. J Biol Chem 272: 26678–86
Pratt WS et al (2000) Multiple Transcripts of MUC3: Evidence for Two Genes, MUC3A and MUC3B. Biochem Biophys Res Commun 275: 916–923
Gum JR et al (2003) Initiation of transcription of the MUC3A human intestinal mucin from a TATA-less promoter and comparison with the MUC3B amino terminus. J Biol Chem 278: 49600–49609
Sherman RM et al (2018) Assembly of a pan-genome from deep sequencing of 910 humans of African descent. Nat Genet 51: 30–35

On lines 355-359, we discuss work by Sherman et al (2018). Our sequence alignment shows that contig 316 aligns with exon 2 of MUC3B in T2T-CHM13.
Our alignment also reveals that contig 316 contains synonymous, missense, and nonsense SNPs as well as in-frame insertions and deletions. The alignment is presented in S1A Fig.

On lines 363-366, we discuss work by Gum et al (1997). The study identified the MUC3 gene (later named MUC3A), encoded by 11 exons. Pratt et al (2000) also identified 11 exons for MUC3A. Our study updates the exon-intron architecture of MUC3A and MUC3B, both of which consist of 12 exons. Importantly, we identified a new exon 4 that was overlooked in the earlier publications. We have added this information in S1B-C Fig.

On lines 427-429, we acknowledge work by Gum et al (2003), in which the MUC3A promoter region was mapped to positions -1 - -242 upstream of the transcription start site. Our study complements these findings by adding epigenetic and evolutionary analysis of the promoter regions of MUC3A and MUC3B in primates.

2. In Line 188 the authors state that the MUC3B “exclusively” evolved in Catarrhini? Was there any sort of BLAST analysis, or syntenic analysis in other species where MUC3B may be ancestral or have evolved recurrently in other species. If not, the language should be toned-down and stated that other species outside of primates were not examined.

We thank reviewer #1 for the valuable comment. We have used BLAST analysis to search for MUC3B in species belonging to New World monkeys (Platyrrhini), as well as Scandentia (treeshrew) and Dermoptera (colugos) orders, which share a common ancestor with primates. While we find partial or complete sequences for MUC3A, MUC12, and MUC17 in these branches of life, we have not been able to identify any unique sequences belonging to MUC3B outside the Catarrhini parvorder. The available genomic sequences for the species outside Catarrhini lack long reads that cover the MUC3 cluster, hampering our efforts to distinguish between MUC3A and MUC3B.
We conclude that available genomes outside Catarrhini suffer from gaps, as has been observed for human genome assemblies prior to T2T-CHM13. Consequently, we have not been able to determine when during vertebrate mammalian evolution a gene duplication event gave rise to two distinct MUC3 genes. We have toned down the language on lines 30-32. We have added the above conclusions on lines 288-294 and lines 590-594.

Minor Comments: 1. The author should assess the language more carefully in the manuscript. There are several typos and grammatical issues, particularly in the Discussion section. The examples include but are not limited to the following: Line 38 : “characterized” is used twice Line 49: “investigation of” should read “to investigate” Line 63: “reach” should read “reaches” Line 229: “in” should be “on” Line 271: should the second “MUC3A” in the sentence read “MUC3B”? Line 308: The opening sentence of the Discussion is not a complete sentence. Line 309: “As results” should read “as a result” Line 310: “uncomplete” should read “incomplete”

We have corrected all the above mistakes kindly pointed out by the reviewer. In addition, we have carefully assessed the language and corrected the grammar and typos.

2. In the introduction lines 69-81 the authors describe the MUC3 sequence of the mouse as having homology to MUC17. There is no paper cited for this claim.

We apologize for overlooking the reference for this statement. We have added the correct reference in line 100.

Reviewer #2 The manuscript entitled “Discovery of a MUC3B gene reconstructs the membrane mucin gene cluster on human chromosome 7” used patients’ tissue from colonoscopy to construct sequence alignment and sequence based on VNTRs in human genome assemblies. The study design was well with valid verification. However, is the novel gene reconstructs specific for intestinal tissues? Or general expression? Please discuss this part in the discussion part.

We thank reviewer #2 for the valuable comment.
In the original version of our manuscript, we analyzed single-cell RNA-seq from the human ileum, colon, and rectum as well as kidneys and liver. We found a high number of unique reads for MUC3A, MUC3B, MUC12, and MUC17 in the ileum, colon, and rectum. None of the genes were expressed in the kidneys and liver (see S3 Table, Tab A). We further experimentally validated mucin gene expression in the human ileum. Our findings are supported by the human cell atlas Tabula sapiens. Single-cell analysis of 500,000 cells from 24 organs of 15 normal human subjects showed that >60% of MUC3A+ cells constitute various populations of intestinal epithelial cells, dominated by transporting epithelial cells (enterocytes). We have added this analysis as a new S5 Fig, and we discuss the findings on lines 478-480. Notably, our findings contradict previous observations by Kyo K, et al ((2001) J Hum Genet 46: 5–20), who detected MUC3A transcripts in a broad set of human tissues (heart, liver, prostate, and thymus) using Northern blot. Although we cannot fully explain this discrepancy, our primer-BLAST analysis shows that probes against MUC3B were in fact specific for the repetitive exon of MUC17. Probes against MUC3A were specific but hybridized with exon 2 of MUC3A, which is a highly repetitive region with tandem repeats occurring in all members of the mucin gene family. Thus, we can exclude unspecific hybridization to mucin tandem repeat transcripts in the heart, liver, prostate, and thymus. We have inserted our conclusions on lines 480-485 and in S3 Table (tab E).

Submitted filename: Response to reviewers.docx

20 Sep 2022
PONE-D-22-00623R1
Discovery of a MUC3B gene reconstructs the membrane mucin gene cluster on human chromosome 7
PLOS ONE Dear Dr. Pelaseyed, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. The reviewers and I are generally pleased with the revisions. Thank you for taking the time and care to address their questions. Please see the comments from the reviewer concerning the newly published material. Beyond this, I am happy to move forward. I just want to give you a chance to include the newer material. Please submit your revised manuscript by Nov 04 2022 11:59PM. We look forward to receiving your revised manuscript. Kind regards, Michael Scott Brewer, Ph.D. Academic Editor PLOS ONE Journal Requirements: Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.
If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice. [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation. Reviewer #1: All comments have been addressed ********** 2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: Yes ********** 3. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: Yes ********** 4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: Yes ********** 5. 
Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: Yes ********** 6. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The authors have answered my comments to my satisfaction. They have also thoroughly revised the language and grammar in the manuscript, and it reads much better. I have one additional suggestion that I recommend. During the time of this review, two highly relevant publications on mucins were published in high profile journals. It would be wonderful if the authors could cite these two papers in the discussion or introduction. The first proposes a new way mucins evolve (also shows MUC3 locus across species): https://www.science.org/doi/10.1126/sciadv.abm8757 . The other redefines the number of mucins in humans: https://www.nature.com/articles/s41467-022-31062-4. ********** 7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No ********** [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. 
Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.
20 Sep 2022

Response to Reviewers (see “Revised Manuscript with Track Changes”)

Reviewer #1

The authors have answered my comments to my satisfaction. They have also thoroughly revised the language and grammar in the manuscript, and it reads much better. I have one additional suggestion that I recommend. During the time of this review, two highly relevant publications on mucins were published in high profile journals. It would be wonderful if the authors could cite these two papers in the discussion or introduction. The first proposes a new way mucins evolve (also shows MUC3 locus across species): https://www.science.org/doi/10.1126/sciadv.abm8757 . The other redefines the number of mucins in humans: https://www.nature.com/articles/s41467-022-31062-4.

We thank reviewer #1 for these important comments. We have carefully read the recommended papers.

Pajic et al. (Science Advances 2022) propose convergent evolution as a mechanism by which precursor genes coding for proline-rich proteins gain serine and threonine residues that form mucin-type O-glycosylated exonic repeats. The authors suggest that the retention of novel mucin genes is beneficial for the host as mucins are mechanistic lubricants, serve as matrices enclosing immunological factors, and mediate host-microbe interactions. These arguments are also presented by us in the “Discussion” section. Thus, we have added a reference to Pajic et al. at lines 433-437.

Malaker et al. (Nature Communications 2022) used in silico and mucin-selective affinity purification strategies to identify proteins with a putative mucin domain. As expected, the reported “mucinome” includes known members of the canonical mucin family (MUC1, MUC4, MUC5AC, MUC5B, MUC6, MUC13, MUC16, and MUC20), whereas the study did not experimentally detect other canonical mucins (MUC2, MUC3A, MUC3B, MUC12, and MUC17) since the analyzed biological samples did not have an intestinal origin. The canonical mucin family is defined by the presence of evolutionarily conserved protein domains that, together with an extended and repetitive proline-threonine-serine (PTS)-rich domain, construct gel-forming and membrane mucins. The conserved domains include vWD, CysD, and CK domains in gel-forming mucins [1], and SEA or NIDO-AMOP-VWD domains in membrane mucins [2-5]. Notably, many of the proteins detected by Malaker et al. do not carry these conserved domains and exhibit lower PTS content compared to canonical mucins. See table below for the PTS content of a selection of identified proteins.

UniProt   Protein   PTS (% of total residues)   Canonical mucin (Yes/No)
P15941    MUC1      44.8                        Yes
P98088    MUC5AC    45.6                        Yes
Q9H3R2    MUC13     30.6                        Yes
Q685J3    MUC17     56.5                        Yes
O00468    Agrin     21.9                        No
Q14118    DAG1      20.0                        No
Q6WRI0    IGSF10    26.4                        No
Q14114    LRP8      19.2                        No
Q6ZSS7    MFSD6     21.8                        No
P14543    NID1      22.3                        No
Q8N131    Porimin   35.6                        No

While the idea of a “mucinome” is conceptually intriguing, we argue for a more stringent definition of what constitutes a mucin [6]. Thus, we have referenced Malaker et al. since the study does not identify novel genes that fulfil the criteria for canonical mucins.

References

1. Trillo-Muyo S, Nilsson HE, Recktenwald CV, Ermund A, Ridley C, Meiss LN, et al. Granule-stored MUC5B mucins are packed by the noncovalent formation of N-terminal head-to-head tetramers. J Biol Chem. 2018;293: 5746. doi:10.1074/jbc.RA117.001014

2. Moniaux N, Nollet S, Porchet N, Degand P, Laine A, Aubert JP. Complete sequence of the human mucin MUC4: a putative cell membrane-associated mucin. Biochem J. 1999;338: 325. doi:10.1042/0264-6021:3380325

3. Moniaux N, Escande F, Porchet N, Aubert JP, Batra SK. Structural organization and classification of the human mucin genes. Front Biosci. 2001;6: D1192-206. Available: http://www.ncbi.nlm.nih.gov/pubmed/11578969

4. Ligtenberg MJL, Kruijshaar L, Buijs F, Van Meijer M, Litvinov SV, Hilkens J. Cell-associated episialin is a complex containing two proteins derived from a common precursor. J Biol Chem. 1992;267: 6171–6177. doi:10.1016/S0021-9258(18)42677-4

5. Macao B, Johansson DG, Hansson GC, Hard T. Autoproteolysis coupled to protein folding in the SEA domain of the membrane-bound MUC1 mucin. Nat Struct Mol Biol. 2006;13: 71–76. Available: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16369486

6. Arike L, Hansson GC. The Densely O-glycosylated MUC2 Mucin Protects the Intestine and Provides Food for the Commensal Bacteria. J Mol Biol. 2016;428: 3221. doi:10.1016/J.JMB.2016.02.010

Submitted filename: Response to reviewers.docx

21 Sep 2022

Discovery of a MUC3B gene reconstructs the membrane mucin gene cluster on human chromosome 7

PONE-D-22-00623R2

Dear Dr. Pelaseyed,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible, no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Michael Scott Brewer, Ph.D.
Academic Editor
PLOS ONE

26 Sep 2022

PONE-D-22-00623R2

Discovery of a MUC3B gene reconstructs the membrane mucin gene cluster on human chromosome 7

Dear Dr. Pelaseyed:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Michael Scott Brewer
Academic Editor
PLOS ONE
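Note on the PTS content reported in the response to reviewers: the authors give PTS as a percentage of total residues but do not describe how it was computed. The sketch below shows one plausible reading, assuming the value is simply the count of proline, threonine, and serine residues divided by the full sequence length. The function name pts_content and the toy peptide are hypothetical and are not taken from the manuscript.

```python
def pts_content(sequence: str) -> float:
    """Percentage of residues that are proline (P), threonine (T), or serine (S).

    Assumption: PTS content is counted over the whole sequence, not only
    over the mucin domain. The source does not state which was used.
    """
    # Drop whitespace/newlines (e.g. from a pasted FASTA body) and normalise case.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("empty sequence")
    pts = sum(seq.count(residue) for residue in "PTS")
    return 100.0 * pts / len(seq)


# Hypothetical toy peptide, not a real mucin sequence: 6 of 10 residues are P, T, or S.
toy_peptide = "PTSPTSAAGG"
print(f"{pts_content(toy_peptide):.1f}")  # 60.0
```

Applied to a full-length sequence retrieved for one of the UniProt accessions in the table, this would give a figure comparable to the tabulated values only if the authors used the same whole-sequence definition.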
  49 in total

1.  Initial sequencing and analysis of the human genome.

Authors:  E S Lander; L M Linton; B Birren; C Nusbaum; M C Zody; J Baldwin; K Devon; K Dewar; M Doyle; W FitzHugh; R Funke; D Gage; K Harris; A Heaford; J Howland; L Kann; J Lehoczky; R LeVine; P McEwan; K McKernan; J Meldrim; J P Mesirov; C Miranda; W Morris; J Naylor; C Raymond; M Rosetti; R Santos; A Sheridan; C Sougnez; Y Stange-Thomann; N Stojanovic; A Subramanian; D Wyman; J Rogers; J Sulston; R Ainscough; S Beck; D Bentley; J Burton; C Clee; N Carter; A Coulson; R Deadman; P Deloukas; A Dunham; I Dunham; R Durbin; L French; D Grafham; S Gregory; T Hubbard; S Humphray; A Hunt; M Jones; C Lloyd; A McMurray; L Matthews; S Mercer; S Milne; J C Mullikin; A Mungall; R Plumb; M Ross; R Shownkeen; S Sims; R H Waterston; R K Wilson; L W Hillier; J D McPherson; M A Marra; E R Mardis; L A Fulton; A T Chinwalla; K H Pepin; W R Gish; S L Chissoe; M C Wendl; K D Delehaunty; T L Miner; A Delehaunty; J B Kramer; L L Cook; R S Fulton; D L Johnson; P J Minx; S W Clifton; T Hawkins; E Branscomb; P Predki; P Richardson; S Wenning; T Slezak; N Doggett; J F Cheng; A Olsen; S Lucas; C Elkin; E Uberbacher; M Frazier; R A Gibbs; D M Muzny; S E Scherer; J B Bouck; E J Sodergren; K C Worley; C M Rives; J H Gorrell; M L Metzker; S L Naylor; R S Kucherlapati; D L Nelson; G M Weinstock; Y Sakaki; A Fujiyama; M Hattori; T Yada; A Toyoda; T Itoh; C Kawagoe; H Watanabe; Y Totoki; T Taylor; J Weissenbach; R Heilig; W Saurin; F Artiguenave; P Brottier; T Bruls; E Pelletier; C Robert; P Wincker; D R Smith; L Doucette-Stamm; M Rubenfield; K Weinstock; H M Lee; J Dubois; A Rosenthal; M Platzer; G Nyakatura; S Taudien; A Rump; H Yang; J Yu; J Wang; G Huang; J Gu; L Hood; L Rowen; A Madan; S Qin; R W Davis; N A Federspiel; A P Abola; M J Proctor; R M Myers; J Schmutz; M Dickson; J Grimwood; D R Cox; M V Olson; R Kaul; C Raymond; N Shimizu; K Kawasaki; S Minoshima; G A Evans; M Athanasiou; R Schultz; B A Roe; F Chen; H Pan; J Ramser; H Lehrach; R Reinhardt; W R McCombie; M de la Bastide; N Dedhia; 
H Blöcker; K Hornischer; G Nordsiek; R Agarwala; L Aravind; J A Bailey; A Bateman; S Batzoglou; E Birney; P Bork; D G Brown; C B Burge; L Cerutti; H C Chen; D Church; M Clamp; R R Copley; T Doerks; S R Eddy; E E Eichler; T S Furey; J Galagan; J G Gilbert; C Harmon; Y Hayashizaki; D Haussler; H Hermjakob; K Hokamp; W Jang; L S Johnson; T A Jones; S Kasif; A Kaspryzk; S Kennedy; W J Kent; P Kitts; E V Koonin; I Korf; D Kulp; D Lancet; T M Lowe; A McLysaght; T Mikkelsen; J V Moran; N Mulder; V J Pollara; C P Ponting; G Schuler; J Schultz; G Slater; A F Smit; E Stupka; J Szustakowki; D Thierry-Mieg; J Thierry-Mieg; L Wagner; J Wallis; R Wheeler; A Williams; Y I Wolf; K H Wolfe; S P Yang; R F Yeh; F Collins; M S Guyer; J Peterson; A Felsenfeld; K A Wetterstrand; A Patrinos; M J Morgan; P de Jong; J J Catanese; K Osoegawa; H Shizuya; S Choi; Y J Chen; J Szustakowki
Journal:  Nature       Date:  2001-02-15       Impact factor: 49.962

2.  JASPAR: an open-access database for eukaryotic transcription factor binding profiles.

Authors:  Albin Sandelin; Wynand Alkema; Pär Engström; Wyeth W Wasserman; Boris Lenhard
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

3.  MUC3 human intestinal mucin. Analysis of gene structure, the carboxyl terminus, and a novel upstream repetitive region.

Authors:  J R Gum; J J Ho; W S Pratt; J W Hicks; A S Hill; L E Vinall; A M Roberton; D M Swallow; Y S Kim
Journal:  J Biol Chem       Date:  1997-10-17       Impact factor: 5.157

4.  Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction.

Authors:  Jian Ye; George Coulouris; Irena Zaretskaya; Ioana Cutcutache; Steve Rozen; Thomas L Madden
Journal:  BMC Bioinformatics       Date:  2012-06-18       Impact factor: 3.169

5.  The complete sequence of a human genome.

Authors:  Sergey Nurk; Sergey Koren; Arang Rhie; Mikko Rautiainen; Andrey V Bzikadze; Alla Mikheenko; Mitchell R Vollger; Nicolas Altemose; Lev Uralsky; Ariel Gershman; Sergey Aganezov; Savannah J Hoyt; Mark Diekhans; Glennis A Logsdon; Michael Alonge; Stylianos E Antonarakis; Matthew Borchers; Gerard G Bouffard; Shelise Y Brooks; Gina V Caldas; Nae-Chyun Chen; Haoyu Cheng; Chen-Shan Chin; William Chow; Leonardo G de Lima; Philip C Dishuck; Richard Durbin; Tatiana Dvorkina; Ian T Fiddes; Giulio Formenti; Robert S Fulton; Arkarachai Fungtammasan; Erik Garrison; Patrick G S Grady; Tina A Graves-Lindsay; Ira M Hall; Nancy F Hansen; Gabrielle A Hartley; Marina Haukness; Kerstin Howe; Michael W Hunkapiller; Chirag Jain; Miten Jain; Erich D Jarvis; Peter Kerpedjiev; Melanie Kirsche; Mikhail Kolmogorov; Jonas Korlach; Milinn Kremitzki; Heng Li; Valerie V Maduro; Tobias Marschall; Ann M McCartney; Jennifer McDaniel; Danny E Miller; James C Mullikin; Eugene W Myers; Nathan D Olson; Benedict Paten; Paul Peluso; Pavel A Pevzner; David Porubsky; Tamara Potapova; Evgeny I Rogaev; Jeffrey A Rosenfeld; Steven L Salzberg; Valerie A Schneider; Fritz J Sedlazeck; Kishwar Shafin; Colin J Shew; Alaina Shumate; Ying Sims; Arian F A Smit; Daniela C Soto; Ivan Sović; Jessica M Storer; Aaron Streets; Beth A Sullivan; Françoise Thibaud-Nissen; James Torrance; Justin Wagner; Brian P Walenz; Aaron Wenger; Jonathan M D Wood; Chunlin Xiao; Stephanie M Yan; Alice C Young; Samantha Zarate; Urvashi Surti; Rajiv C McCoy; Megan Y Dennis; Ivan A Alexandrov; Jennifer L Gerton; Rachel J O'Neill; Winston Timp; Justin M Zook; Michael C Schatz; Evan E Eichler; Karen H Miga; Adam M Phillippy
Journal:  Science       Date:  2022-03-31       Impact factor: 63.714

6.  Long-read sequencing and de novo assembly of a Chinese genome.

Authors:  Lingling Shi; Yunfei Guo; Chengliang Dong; John Huddleston; Hui Yang; Xiaolu Han; Aisi Fu; Quan Li; Na Li; Siyi Gong; Katherine E Lintner; Qiong Ding; Zou Wang; Jiang Hu; Depeng Wang; Feng Wang; Lin Wang; Gholson J Lyon; Yongtao Guan; Yufeng Shen; Oleg V Evgrafov; James A Knowles; Francoise Thibaud-Nissen; Valerie Schneider; Chack-Yung Yu; Libing Zhou; Evan E Eichler; Kwok-Fai So; Kai Wang
Journal:  Nat Commun       Date:  2016-06-30       Impact factor: 14.919

7.  An integrative ENCODE resource for cancer genomics.

Authors:  Jing Zhang; Donghoon Lee; Vineet Dhiman; Peng Jiang; Jie Xu; Patrick McGillivray; Hongbo Yang; Jason Liu; William Meyerson; Declan Clarke; Mengting Gu; Shantao Li; Shaoke Lou; Jinrui Xu; Lucas Lochovsky; Matthew Ung; Lijia Ma; Shan Yu; Qin Cao; Arif Harmanci; Koon-Kiu Yan; Anurag Sethi; Gamze Gürsoy; Michael Rutenberg Schoenberg; Joel Rozowsky; Jonathan Warrell; Prashant Emani; Yucheng T Yang; Timur Galeev; Xiangmeng Kong; Shuang Liu; Xiaotong Li; Jayanth Krishnan; Yanlin Feng; Juan Carlos Rivera-Mulia; Jessica Adrian; James R Broach; Michael Bolt; Jennifer Moran; Dominic Fitzgerald; Vishnu Dileep; Tingting Liu; Shenglin Mei; Takayo Sasaki; Claudia Trevilla-Garcia; Su Wang; Yanli Wang; Chongzhi Zang; Daifeng Wang; Robert J Klein; Michael Snyder; David M Gilbert; Kevin Yip; Chao Cheng; Feng Yue; X Shirley Liu; Kevin P White; Mark Gerstein
Journal:  Nat Commun       Date:  2020-07-29       Impact factor: 14.919

8.  A mechanism of gene evolution generating mucin function.

Authors:  Petar Pajic; Shichen Shen; Jun Qu; Alison J May; Sarah Knox; Stefan Ruhl; Omer Gokcumen
Journal:  Sci Adv       Date:  2022-08-26       Impact factor: 14.957

Review 9.  Genetic variation and the de novo assembly of human genomes.

Authors:  Mark J P Chaisson; Richard K Wilson; Evan E Eichler
Journal:  Nat Rev Genet       Date:  2015-10-07       Impact factor: 53.242

10.  Single-cell transcriptome analysis reveals differential nutrient absorption functions in human intestine.

Authors:  Yalong Wang; Wanlu Song; Jilian Wang; Ting Wang; Xiaochen Xiong; Zhen Qi; Wei Fu; Xuerui Yang; Ye-Guang Chen
Journal:  J Exp Med       Date:  2020-02-03       Impact factor: 14.307
