Comment on 'Single nucleus sequencing reveals evidence of inter-nucleus recombination in arbuscular mycorrhizal fungi'.

Benjamin Auxier, Anna Bazzicalupo.

Abstract

Chen et al. recently reported evidence for inter-nucleus recombination in arbuscular mycorrhizal fungi (Chen et al., 2018a). Here, we report a reanalysis of their data. After filtering the data by excluding heterozygous sites in haploid nuclei, duplicated regions of the genome, and low-coverage base calls, we find the evidence for recombination to be very sparse.
© 2019, Auxier and Bazzicalupo.

Keywords:  AMF; evolutionary biology; fungi; genetics; genomics; recombination

Year:  2019        PMID: 31650958      PMCID: PMC6814362          DOI: 10.7554/eLife.47301

Source DB:  PubMed          Journal:  Elife        ISSN: 2050-084X            Impact factor:   8.140


Introduction

For many years arbuscular mycorrhizal fungi (AMF) were presumed to be asexual, as no one had witnessed sexual structures in these fungi. This was puzzling because AMF retain core meiosis genes (Halary et al., 2011), indicating that a meiosis-like process most likely occurs in this lineage. Previous evidence for recombination later turned out to be based on duplicated gene copies (Croll and Sanders, 2009), or ribosomal RNA sequences that were paralogs (Pawlowska and Taylor, 2004; Maeda et al., 2018). Recently, based on work comparing single-nucleus whole-genome sequences to bulk sequencing data, new evidence for recombination in these fungi was reported (Chen et al., 2018a). The isolates were dikaryotic, containing nuclei of two classes defined by their mating type (MAT) locus (Ropars et al., 2016). For each sequenced nucleus, PCR amplification was attempted to assign a mating type class (MAT-1 up to MAT-5). Recombination was then inferred based on: (i) base-pair calls classed as one mating type found in the alternate mating type; (ii) nuclei of the same mating type showing variation in consecutive blocks of single nucleotide polymorphisms (SNPs); and (iii) SNPs from nucleus 7 (SL1 strain) being more similar to SNPs of the alternate mating type, consistent with a recombination event spanning the MAT locus. Here, we ask how strong the signal of within-strain recombination in the data from Chen et al. (2018a) is once heterozygous sites in haploid nuclei, duplicated regions of the genome, and low-coverage base calls are excluded. By removing data that cannot confidently be distinguished from sequencing errors and repeated regions, we find that the evidence for recombination is very sparse. We also report specific examples of these possible errors to justify our more stringent filtering of the data.

Results

The effect of filtering positions based on reads

Our first analysis was of the dataset reported in Supplementary file 6 of Chen et al. (2018a), used for both Figures 2 and 3 of that manuscript. We filtered out: (1) positions where any single nucleus was heterozygous (defined as sites with read depth >10 and an alternate allele frequency >10%), (2) any individual site with fewer than five reads of coverage, and (3) positions with more than one high-confidence BLAST hit using the settings specified in Chen et al. (2018a). We applied the filters individually and in combination to see how many of the variable sites inferred as signals of recombination would be removed. Applying each filter individually removed 19–77% of recombined positions. Filtering of low-coverage sites had the strongest effect, a ~75% reduction. Applying all three filters together removed 91% of recombined sites (Figure 1). Notably, these filters had much less effect on the total number of analyzed sites: the combined application of all three filters reduced the total number of sites by only ~22%.
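
The three filters can be sketched in Python as follows. This is a minimal illustration rather than the script used for the analysis: the `Call` and `Position` structures and the function names are hypothetical, and only the thresholds (read depth >10 with alternate allele >10%, fewer than five reads, more than one BLAST hit) are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Call:
    depth: int       # reads covering this position in one nucleus
    alt_reads: int   # reads supporting the alternate allele


@dataclass
class Position:
    blast_hits: int          # high-confidence BLAST hits for this position
    calls: Dict[str, Call]   # nucleus id -> read summary (hypothetical layout)


def is_heterozygous(call: Call) -> bool:
    # Filter 1 definition: read depth > 10 and alternate allele > 10%.
    return call.depth > 10 and call.alt_reads / call.depth > 0.10


def filter_position(pos: Position, min_reads: int = 5) -> Optional[Dict[str, Call]]:
    """Return the calls that survive all three filters, or None if none do."""
    if pos.blast_hits > 1:                                    # filter 3: duplicated region
        return None
    if any(is_heterozygous(c) for c in pos.calls.values()):  # filter 1: whole position dropped
        return None
    # Filter 2 acts on individual calls, not on the whole position.
    kept = {n: c for n, c in pos.calls.items() if c.depth >= min_reads}
    return kept or None
```

Note the asymmetry in this sketch: the heterozygosity and BLAST filters discard a position for all nuclei, whereas the coverage filter only discards the low-depth call in the affected nucleus.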
Figure 1.

Filtering SNP data shows a decrease of 91% in number of recombined sites, but only ~20% decrease in total sites.

Left panel shows the effect of filtering on all sites included in Supplementary file 6 of Chen et al. (2018a), while the right panel shows the effect on the number of recombined positions. Recombined positions were identified based on the second criterion in Figure 1—figure supplement 1. Different symbols show the effect on the three different strains (A4, A5, and SL1) used in our re-analysis.

Figure 1—figure supplement 1.

Example of the two criteria used to identify recombination.

To identify individual recombined sites, a parsimony criterion (used for Figure 2B) was applied, as shown in (A): the recombined nucleus was determined based on the majority genotype for a given mating type. Note that positions 5 and 6 for nucleus E are not identified, because there is one nucleus of mating type II with each genotype; this criterion cannot identify sites when equal numbers of nuclei of a mating type carry each genotype. To identify recombined positions more fully, we also used a second criterion: a difference in genotype between nuclei of the same mating type. This criterion is capable of identifying examples such as positions 5 and 6, but does not identify which sites are recombined.
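
The two criteria can be expressed compactly in code. The sketch below is only an illustration of the logic described in this legend, applied to one position and one mating type at a time; the function names and the dictionary layout are hypothetical, and nuclei without a base call are assumed to have been left out of the input.

```python
from collections import Counter
from typing import Dict, Set


def minority_nuclei(genotypes: Dict[str, str]) -> Set[str]:
    """First criterion: nuclei carrying the minority genotype within one mating type."""
    counts = Counter(genotypes.values())
    if len(counts) < 2:
        return set()          # all nuclei agree, nothing to report
    (major, n_major), (_, n_second) = counts.most_common(2)
    if n_major == n_second:
        return set()          # tie: no majority genotype can be defined
    return {nucleus for nucleus, g in genotypes.items() if g != major}


def differs_within_type(genotypes: Dict[str, str]) -> bool:
    """Second criterion: any genotype difference among nuclei of one mating type."""
    return len(set(genotypes.values())) > 1
```

With two nuclei of one mating type carrying different genotypes, `minority_nuclei` returns an empty set while `differs_within_type` returns `True`, which is the situation described for positions 5 and 6.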

Figure 2.

Heterozygous positions in single nuclei from a duplicated region were treated as homozygous in Chen et al. (2018a) and reported as evidence for a long stretch of recombination.

Panel (A) shows the base calls for positions on scaffold 70 for six nuclei (four nuclei of mating type M-1 and two of M-2, as indicated in the top row). Each base call was assigned to a mating type class (green or yellow) in Chen et al. (2018a) based on an unspecified criterion. Variation between nuclei of the same mating type (e.g. variation among nuclei 2, 21, 22, and 24) is interpreted as recombination. We used their Illumina reads to show the ratio of reads supporting alternative nucleotides for each position. For example, in strain 4, mating-type 1, nucleus 2, position 100454, in Chen et al. the base was called as a G with a mating type ‘green’, and 51 of 52 reads matched G. However, for nucleus 21, only 85 of 151 reads supported an A at that position, while the other 66 supported a G. Panel (B) shows the alignment of the region shown in (A) with its best BLAST hit region on scaffold 3570. Heterozygous sites in the mapped reads of the dikaryon are indicated in bold, with the two alternate and reference bases shown slightly above/below. Gray boxes surround those sites included in (A). Note that both regions are heterozygous at the same aligned sites, and with the same alternate base for each heterozygous site. Graphic of (A) modified from Figure 3 of Chen et al. (2018a), with the addition of nuclei 21 and 24.

Recombination, when it involves crossing over, exchanges physical blocks between homologous chromosomes, resulting in consecutive allelic differences. We calculated the number of recombined sites, as well as the number of recombined blocks (runs of consecutive SNPs of the alternate haplotype); both are shown in Table 1 (details of the identified blocks can be found in Table 1—source data 1). Based on the analysis presented in Chen et al. (2018a), all isolates show recombined blocks, in some cases spanning over a thousand base pairs. However, applying our three filters reduced these blocks. After filtering, no consecutive SNPs remained for strain A4. For strains A5 and SL1, filtering reduced the number of recombined blocks to 2 and 4, respectively, and also reduced the length of the remaining blocks.
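
Counting blocks amounts to finding runs of more than one consecutive recombined SNP. A minimal sketch is given below; the function name and input layout are hypothetical, and the block length is assumed to be measured from the first to the last SNP of the run, inclusive.

```python
from typing import List, Sequence, Tuple


def recombined_blocks(positions: Sequence[int],
                      recombined: Sequence[bool]) -> List[Tuple[int, int]]:
    """Return (number of SNPs, length in bp) for each run of >1 consecutive recombined SNPs.

    positions: sorted SNP coordinates on one scaffold for one nucleus.
    recombined: parallel flags, True where the SNP carries the alternate haplotype.
    """
    blocks, run = [], []
    for pos, flag in zip(positions, recombined):
        if flag:
            run.append(pos)
            continue
        if len(run) > 1:      # a single recombined SNP is not counted as a block
            blocks.append(run)
        run = []
    if len(run) > 1:          # close a run that reaches the end of the scaffold
        blocks.append(run)
    # Assumed convention: length spans first to last SNP of the block, inclusive.
    return [(len(b), b[-1] - b[0] + 1) for b in blocks]
```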
Table 1.

Lengths of recombination events before and after additional filtering.

Table 1—source data 1. Recombined block locations: list of all recombined blocks identified, which was used for Table 1.

Original data | Filtered data
Isolate (Mating types) | Number of recombined positions* | Number of recombined blocks (>1 consecutive SNP recombined) | Number of SNPs of longest recombined block (length in bp) | Number of recombined positions* | Number of recombined blocks (>1 consecutive SNP recombined) | Number of SNPs of longest recombined block (length in bp)
A4 Mat-1/Mat-254/314335 (1131)0/1401 (1)
A5 Mat-3/Mat-641/1831822 (2145)2/1826 (670)
SL1 Mat-5/Mat-1111/1122016 (1872)22/946 (429)

*Numbers before and after the slash correspond to the two mating types listed in the leftmost column. Numbers were calculated based on the criterion shown in Figure 1—figure supplement 1A. Table 1—source data 1 contains a list of all the recombined blocks identified.

A specific example of repeated regions associated with biallelic sites in haploid nuclei

Chen et al. (2018a) interpreted consecutive SNPs differing between nuclei of the same mating type as a sign of recombination, as highlighted in Figure 3 of their article. The filtering used in Chen et al. (2018a) removed multi-copy sites ‘[i]f BLAST results returned more than two good hits’, but retained regions with two BLAST hits. This could lead to the inclusion of SNPs that are heterozygous due to duplicated regions of the genome. To show how repeated regions may lead to a false signal of recombination, we focused on an example highlighted in Figure 3 of Chen et al. (2018a), and discussed in the main text of that article. We found that several positions on scaffold 70 from isolate A4 were heterozygous in several nuclei (Figure 2), although they are treated as homozygous in Chen et al. (2018a). High sequencing depth (>30) eliminates rare sequencing errors as the cause. To test if duplicated regions could be the cause of the heterozygosity, we performed a BLAST search against the A4 reference genome with sequences from scaffold 70:100354–100657. This search resulted in two BLAST matches: the self match on scaffold 70, as well as an additional match on scaffold 3570 (Figure 2B). When the short reads of the dikaryon (bulk sequencing of all nuclei) were aligned to the reference genome, both these SNPs on scaffold 70 and their match on scaffold 3570 were heterozygous, and the BLAST hit result for both showed 100% identity match. Thus, this repeated sequence seems to have been assembled as a chimera of the two variants in both scaffolds, and the short reads from either copy are mapped equally to both. We feel this example illustrates the need to exclude repetitive regions from analyses of recombination.
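
The read counts quoted in the Figure 2 legend illustrate how such sites can be flagged. The sketch below is illustrative only: the function name is hypothetical, the label `other` stands for the one read in nucleus 2 whose base is not reported, and the thresholds (depth >10, minor allele >10%) are those of our heterozygosity filter.

```python
from typing import Dict


def minor_allele_fraction(base_counts: Dict[str, int]) -> float:
    """Fraction of reads that do not support the most common base."""
    depth = sum(base_counts.values())
    if depth == 0:
        raise ValueError("no reads cover this position")
    return 1 - max(base_counts.values()) / depth


# Read counts at scaffold 70, position 100454, as given in the Figure 2 legend.
nucleus_2 = {"G": 51, "other": 1}    # 51 of 52 reads support G
nucleus_21 = {"A": 85, "G": 66}      # 85 of 151 reads support A, 66 support G

for name, counts in [("nucleus 2", nucleus_2), ("nucleus 21", nucleus_21)]:
    depth = sum(counts.values())
    fraction = minor_allele_fraction(counts)
    flagged = depth > 10 and fraction > 0.10
    print(f"{name}: depth={depth}, minor allele fraction={fraction:.3f}, heterozygous={flagged}")
```

Under these thresholds nucleus 2 is treated as homozygous, while nucleus 21, with 66 of 151 reads supporting the second base, is flagged as heterozygous.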

Confidence in low coverage sites to infer recombination

The data presented in Chen et al. (2018a) were filtered with a minimum of two reads. This is a very low threshold, insufficient even to form a consensus when reads disagree. To look at the effect of low coverage on the signal of recombination, we compared the distribution of read depths between random and recombined sites. We first needed to identify recombined sites; as the method was lacking from the original manuscript, we applied a parsimony criterion as detailed in Figure 1—figure supplement 1. This method is imperfect and certainly underestimates recombination, as it cannot identify recombined sites when equal numbers of nuclei within a mating type have alternate genotypes. It nonetheless identified 733 positions, sufficient for analysis. Looking at the distribution of read depths, SL1 nuclei overall had ~95% fewer high-coverage sites than A4 and A5 (an average of 97 sites with read depth >10 for nuclei from SL1, versus 2290 for A4 and 2441 for A5; Figure 3). We note here, as described in Table 1 of Chen et al. (2018a), that SL1 nuclei cover much less of the genome (14%) than A4 and A5 (53% and 42%, respectively). Figure 3 also shows that, for A4 and A5, low-depth sites are overrepresented among recombined sites compared to sub-sampled non-recombined sites (Wilcoxon rank-sum test: A4, p=3.2×10^-9; A5, p=2.9×10^-6). For nuclei from isolate SL1, fewer recombined sites can be identified overall, since the decreased breadth of coverage reduces overlap between nuclei; this makes it difficult to say whether the same excess of low-coverage sites is present (p=0.11).
Figure 3.

Recombined sites are overrepresented for low coverage sites.

Top row: Distribution of sub-sampled read depths for non-recombined sites of individual nuclei for the three isolates, showing decreased coverage overall for SL1. Bottom row: Distribution of read depths for recombined sites does not mirror the distribution of random sites. Sites identified as recombined based on the parsimony criterion diagrammed in Figure 1—figure supplement 1. Note that read depth is plotted on a log scale.

Genome-wide pairwise SNP differences are reduced after filtering

We then assessed the evidence for genome-wide recombination based on pairwise SNP differences. This is the analysis presented in Figure 3 of Chen et al. (2018a), showing overall more recombination in SL1 than in A4 and A5, indicated by a ‘mosaic pattern’. We again note that, because sequence from SL1 nuclei covers very little of the assembly (an average of 14%, from Table 1 of Chen et al., 2018a), very few positions will be covered in both members of any pair of SL1 nuclei (14% in one nucleus × 14% in the other nucleus ≈ 2% in both). After applying our filters, we find that in A4 and A5 almost all differences between nuclei of one mating type disappear (Figure 4). For nuclei from SL1, the filtering reduced the differences within a mating type, but since these nuclei cover so little of the genome, the overall dataset is reduced such that, on average, nuclei share only 9 SNPs that can be compared. Many pairs of nuclei have no overlapping SNPs, so no comparison can be made (black squares in Figure 4). A few nuclei of opposite mating types, such as nuclei 17 and 25, show high similarity, but for these pairs the similarity is based on only one or two shared SNP positions.
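
The coverage argument and the pairwise comparison can be sketched as follows. This is an illustration under stated assumptions, not the code behind Figure 4: coverage in the two nuclei is assumed to be independent, and the function names and the dictionary layout (SNP position to base call) are hypothetical.

```python
from typing import Dict, Optional, Tuple


def expected_shared_fraction(breadth_a: float, breadth_b: float) -> float:
    """Expected fraction of the assembly covered in both nuclei, assuming independence."""
    return breadth_a * breadth_b


def pairwise_difference(calls_a: Dict[int, str],
                        calls_b: Dict[int, str]) -> Tuple[Optional[float], int]:
    """Return (fraction of differing SNPs, number of shared SNPs) for two nuclei."""
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return None, 0        # no comparison possible (a black square in Figure 4)
    mismatches = sum(calls_a[p] != calls_b[p] for p in shared)
    return mismatches / len(shared), len(shared)


# Two SL1 nuclei each covering 14% of the assembly share about 2% of it.
print(round(expected_shared_fraction(0.14, 0.14), 3))
```

Returning the number of shared SNPs alongside the difference makes it explicit when an apparently high similarity rests on only one or two positions.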
Figure 4.

Genetic similarity of SL1 is strongly affected by SNP filtering methods.

(A) Left column panels show heatmaps generated from the original data presented in Supplementary file 6 of Chen et al. (2018a). (B) Right panels show data after filtering. Black squares represent pairs that do not share any SNPs. (C) Average number of SNPs in the dataset from Chen et al. (2018a) and after filtering. Note that SL1 has the lowest number of SNPs due to low sequencing breadth. (D) Average number of pairwise overlapping SNPs per nucleus. Note that after filtering, nuclei from SL1 have fewer than 10 overlapping SNPs on average, again due to low sequencing breadth.

Figure 4—figure supplement 1.

The critical mating type of nucleus 07 (black star) from SL1 is unsupported by short-read data.

The two mating type regions for A4, A5, and SL1 were identified in silico with BLAST searches using the primer sequences and methodology of Ropars et al. (2016). Two mating type loci were identified for each of the three isolates. Reads mapping to either of the two alternate mating type loci are shown below on a log scale. Colors are according to Figure 3 of Chen et al. (2018a), and colored boxes near labels indicate mating type based on PCR amplification as reported in Chen et al. (2018a). Figure 4—figure supplement 1—source data 1 contains a list of all the mating loci identified in the A4, A5, and SL1 genome assemblies. This figure shows a general congruence between PCR-validated mating types and the number of reads that mapped to either of the two alternate mating loci. For isolate A4, for all nuclei with aligned reads, the PCR results matched the majority of the reads (note the log scale), while nuclei 02 and 03 were successfully genotyped by PCR but had no short reads mapped. For isolate A5, all nuclei with a PCR-amplified mating type also had a majority of reads mapping to one mating type. Nuclei from SL1 had less success with both short reads and PCR, likely due to decreased amplification breadth. Critically, nucleus 07 from SL1 (starred) had no short reads mapped to either mating locus.

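Figure 4—figure supplement 1—source data 1. Mating loci identified in A4, A5, and SL1 genome assemblies. This file includes the locations of identified loci and the primer sequences used for BLAST searches. These locations were used for aligning reads in Figure 4—figure supplement 1.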

Confirming the placement of nucleus 07 (SL1) could be strong evidence for recombination

In Figure 2 of Chen et al. (2018a), nucleus 07 of SL1 shows a strong similarity with nuclei of the alternate mating type, seen in the clustering of nucleus 07 with MAT-5. However, this nucleus was PCR-genotyped as mating type MAT-1. The incongruence between the PCR-genotyped mating type and the SNP clustering would be evidence of recombination spanning the MAT locus. To confirm this, we searched the mapped reads of each nucleus for reads mapping to either alternate mating type locus (Figure 4—figure supplement 1). We found no Illumina sequencing reads from nucleus 07 mapped to either mating type locus, indicating that the whole-genome amplification step may not have amplified the mating locus. As such, we have no sequencing evidence for the mating type of this nucleus, and the PCR product, unconfirmed by Illumina data, is the only remaining evidence for recombination after read filtering. Given the many billions of bases sequenced in an Illumina run, we consider cross-contamination of the PCR to be a more likely scenario.

Conclusion

Finding a balance between filtering poor data and losing informative data is a critical component of any analysis. For this dataset, we provide evidence for the necessity of stringent filtering to avoid inferences based on erroneous or misleading data. We do not consider our filters to be particularly strict, as removing low-coverage sites, repeated regions, and heterozygous sites from haploid data is commonplace in genomic analyses. In SL1, three blocks of consecutive SNPs remained after our filtering, and two regions in A5. Some of these regions likely remain because our heterozygosity filter requires a minimum of ten reads, so low-coverage heterozygous sites are not excluded. This analysis used the first 100 contigs, covering approximately 10 Mb. As such, a handful of putative recombined SNPs cannot be confidently separated from amplification or sequencing noise. While small blocks of genetic exchange may be compatible with gene conversion, the limited number of markers involved greatly increases the difficulty of identifying high-confidence gene conversion events (Wijnker et al., 2013; Qi et al., 2014). Mapping recombination inside repetitive regions would require longer reads than are available from standard short-read technologies. This is a formidable task due to the input requirements of PacBio technologies, but it has been accomplished using linked short reads from individual pollen cells (Sun et al., 2019). Notably, a more contiguous reference genome will actually include additional repetitive regions, so the exclusion of repetitive regions will become even more important. As the apparent recombined blocks are much smaller than the contigs, a more contiguous genome assembly would not change our analysis.

Single-nucleus genome amplification with multiple displacement amplification produces extremely variable genome coverage. Normalization of the data was shown to improve the quality of AMF genomic data when coverage is variable (Montoliu-Nerin et al., 2019). In addition to normalization, low-coverage sites could still be used with a model-based approach that incorporates the associated uncertainty (Hinch et al., 2019; Bloom et al., 2013). Finally, removing heterozygous sites from haploid single nuclei seems like a self-evident requirement.

The conserved meiosis genes found in the genomes of Glomeromycotan species strongly suggest a meiosis-like process allowing recombination and re-shuffling of genetic material among genomes. Uncovering the details of this process would be a major scientific breakthrough. Given this importance, claims made regarding recombination in Glomeromycota require rigorous examination. While we acknowledge that the models presented in Chen et al. (2018a) are a valuable framework to test hypotheses for meiosis-like mechanisms in these fungi, the data presented are not robust enough to support or reject them. As such, we believe that one of the greatest remaining mysteries in mycology remains unsolved.

Materials and methods

Raw data

We obtained the SPAdes assemblies for the three dikaryotic R. irregularis isolates from NCBI, as well as the paired-end read libraries from the dikaryons and the paired-end reads of the single nuclei. Details of the accession numbers used are found in Chen et al. (2018a).

Read processing

Short reads were cleaned with FASTP (Chen et al., 2018b), then aligned using BWA mem as in Chen et al. (2018a). Reads per nucleus were analyzed using the Python module pysam (Heger and Jacobs, 2019), and BLAST searches were scripted using Biopython (Cock et al., 2009). The scripts used are available in a GitHub repository (https://github.com/BenAuxier/Chen.2018.Response; Auxier, 2019; copy archived at https://github.com/elifesciences-publications/Chen.2018.Response). The parsimony criterion for identification of recombined sites, shown in Figure 1—figure supplement 1A, was applied manually. Filtering of Excel files and calculation of recombined sites were performed with the R statistical language, which was also used to prepare plots with ggplot2 and distance matrices with ape (Wickham, 2016; Paradis and Schliep, 2019). We compared the coverage representation between non-recombined and recombined sites. We subsampled from non-recombined sites to match the number of reads in recombined sites and performed a Wilcoxon rank-sum test in R between the subsampled set of non-recombined reads and the recombined reads.
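
The subsample-and-compare step can be sketched in Python using only the standard library. This is not the R code used for the analysis: the function names and the random seed are hypothetical, the subsample is assumed to match the number of recombined sites, and the p-value uses a normal approximation without a tie correction, so it will not reproduce R's `wilcox.test` output exactly.

```python
import math
import random
from typing import List, Sequence, Tuple


def subsample(depths: Sequence[int], n: int, seed: int = 0) -> List[int]:
    """Draw n read depths without replacement from the non-recombined sites."""
    return random.Random(seed).sample(list(depths), n)


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def rank_sum_test(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Wilcoxon rank-sum test; returns (U statistic for x, approximate p-value)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # no tie correction (simplification)
    z = (u - mean_u) / sd_u
    return u, math.erfc(abs(z) / math.sqrt(2))
```

A typical call would be `rank_sum_test(recombined_depths, subsample(non_recombined_depths, len(recombined_depths)))`, where both inputs are lists of per-site read depths.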

Mating-type loci identification and mapping

As the location of the mating type loci was not specified in the results of Chen et al. (2018a), we identified them based on data from Ropars et al. (2016). Specifically, we used the primer sequences kary001, kary002, and kary003 as query sequences for BLAST searches against the A4, A5, and SL1 genomes. Each primer sequence had strong matches against two different scaffolds, consistent with divergent idiomorphs as found in Ropars et al. (2016). As these primers only target a small region, we extended the locus using the annotations found on NCBI to identify the boundaries of the pair of genes. These locations are reported in Figure 4—figure supplement 1—source data 1. As no annotations are available for SL1, we used the entire sequence of the idiomorph on scaffold511 of A5 as a BLAST query to identify the homologous regions. With the mating locus identified, we then counted the number of reads that mapped to each sequence using samtools (Li et al., 2009).

Calculation of overlapping SNPs

To calculate the expected number of overlapping SNPs found in Figure 4D, we used the following formula: In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included. Thank you for submitting your article "Comment on 'Single nucleus sequencing reveals evidence of inter-nucleus recombination in arbuscular mycorrhizal fungi'" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by Raphael Mercier as the Reviewing Editor, Patricia Wittkopp as the Senior Editor, and Peter Rodgers, the eLife Features Editor. The reviewers have opted to remain anonymous. We invite you to submit a revised version of your manuscript that addresses the comments of reviewer #2 and reviewer #3 (please see below; reviewer #1 did not raise any points for you to address). Please also submit a point-by-point response to these comments. Reviewer #2: The authors present a rather convincing case for a re-appraisal of the data and conclusions of Chen et al., 2018. However, there are two major issues that should be addressed before publication. 1) More information is needed in order for readers to be able to correctly interpret the data presented in Table 1 and to relate it to the parsimony methodology presented in Figure 1—figure supplement 1. Is the number of recombined positions as defined in Figure 1—figure supplement 1B? If so, what do the numbers before and after the slash in the table entry represent? The number of recombined positions out of the total number of positions? If so, then is the total number of positions counted as only those positions where all nuclei were able to be genotyped at that position? 
If that is the case, then why are these numbers different from the number of shared SNPs presented in Figure 4C? And why would the number before the slash be greater than the number after the slash (as it is for SL1 after filtering)? For "recombined blocks", does this refer to what is shown and described as "sites" in Figure 1—figure supplement 1A?

More information is needed to describe the criteria used for this analysis. I can imagine three possible haplotype situations that could be considered as recombinant sites according to what is presented in Figure 1—figure supplement 1A: (a) all contiguous SNPs match the haplotype assigned to the opposite mating type, (b) only one of the contiguous SNPs matches the other haplotype, and (c) multiple SNPs match the other haplotype. These scenarios all have different likelihoods of representing true recombinant events. For example, scenario (a) only represents a true recombination event if the mating type is correctly assigned (as later discussed by the authors). For scenario (b), it is rather difficult to assign this to a recombination event, especially if it is only one SNP in the middle of a series of contiguous SNPs that would otherwise match the haplotype associated with its mating type or just a single SNP on that scaffold. Here, the most probable explanation might be a mutation. Scenario (c) has the strongest likelihood of representing a true recombination event, but only if there is a switch where there are contiguous SNPs from one haplotype to contiguous SNPs from another haplotype. Multiple haplotype switching would require multiple recombination events, which seems unlikely given that double crossovers are unlikely to occur within the short sequence lengths of the reported blocks in Supplementary file 2. It is not clear to me which of these scenarios is included in the parsimony criterion for sites.
For positions, the attribution of SNP differences between mating types to recombination over other processes is unclear. In Figure 1—figure supplement 1B, for positions 5 and 6, only one nucleus (D) is different from the others. This could arise through recombination or through mutation (especially if it is only one SNP among many on a scaffold – which is not diagrammed in the example shown in Figure 1—figure supplement 1B, but could have been scored in the data). If it is several contiguous positions, then recombination is more likely than several independent mutations that would give rise to the other haplotype. The authors should clarify this in the text and/or by revising Figure 1—figure supplement 1.

2) My other major concern is the analysis presented in Figure 3 and associated commentary. The authors state "Another fact visible from Figure 3 is that for A4 and A5, recombined sites are overrepresented by sites with a read depth of 2 compared to random sites". I agree that it looks that way in the figure, but I would like to see some statistics to support this claim. The number of random sites is much, much greater than the number of recombined sites, further complicating a simple visual comparison. I suggest that the authors subsample the number of random sites to match the number of recombined sites and display this in the plot. Also, would "non-recombined" sites be a more appropriate term than random sites? The legend of this figure says that recombined sites are identified as recombined based on the parsimony criterion diagrammed in Figure 1—figure supplement 1. Is the read depth plotted for each individual SNP in a recombined site? Or is the average depth across all SNPs in a recombined site? The discrete values suggest that it is the former or that it is actually referring to recombinant positions rather than sites.

Reviewer #3: Auxier and Bazzicalupo express their concerns regarding an earlier study on inter-nucleus recombination in arbuscular mycorrhizal fungi, and tested their concerns regarding the data analysis by filtering the presented evidence for recombination using information on repetitive regions/heterozygous sites as well as short read alignment coverage information. In general, the search for rare events in the genome cannot be performed with relaxed filters, as a false-positive signal (even if it is low) will affect rare events much more than frequent events. Figure 1 is absolutely convincing and clearly shows that the new filtering more strongly affects the sites that are annotated as recombined sites, as compared to the sites that are not recombined. This technical difference between recombined sites and non-recombined sites can only be true if these two sets of sites are technically different. There is no reason why recombination should lead to such a difference. This is true for the repeat analysis/heterozygous site analysis as well as for the low coverage analysis. In consequence, I do share the concerns regarding the finding of the original study. In consequence, I fully agree with the last statement of the manuscript: even though the new filtering does not remove all signals that could be recombination-induced, the (very little) remaining signal for recombination "are not robust enough to support or reject" meiosis-like mechanisms. Moreover, I would not even agree that the remaining evidence for recombination, the PCR based assessment of the mating type of nucleus 07 (SL1), is strong evidence for recombination. It is a non-replicated experiment of a single event (as far as I understand) and thus does not meet the general criteria to support a conclusion of such large impact.
My only concern regarding the comment: Even if stated in the original manuscript, I would not agree that longer conversion tracts support recombination events more than single marker conversions. Recombination does not necessarily exchange long tracts, for example gene conversion-like events could be very short (as for example shown in Arabidopsis where the majority of gene conversions is only supported by single markers). If gene conversion-like mechanisms would act here, short tracts might be the expected pattern. Therefore, I do not see why the absence of long tracts should be prominently illustrated.

Author response

Reviewer #2: The authors present a rather convincing case for a re-appraisal of the data and conclusions of Chen et al., 2018. However, there are two major issues that should be addressed before publication.

1) More information is needed in order for readers to be able to correctly interpret the data presented in Table 1 and to relate it to the parsimony methodology presented in Figure 1—figure supplement 1. Is the number of recombined positions as defined in Figure 1—figure supplement 1B? If so, what do the numbers before and after the slash in the table entry represent? The number of recombined positions out of the total number of positions? If so, then is the total number of positions counted as only those positions where all nuclei were able to be genotyped at that position? If that is the case, then why are these numbers different from the number of shared SNPs presented in Figure 4C? And why would the number before the slash be greater than the number after the slash (as it is for SL1 after filtering)? For "recombined blocks", does this refer to what is shown and described as "sites" in Figure 1—figure supplement 1A?

Apologies for the confusion, we forgot to add the associated text to the table legend.
The number before the slash is for nuclei from one mating type, and after the slash is for nuclei of the opposite mating type, with the mating types corresponding to the mating types listed in the leftmost column. These numbers differ from Figure 4C as Table 1 only includes sites identified by the criteria outlined in Figure 1—figure supplement 1, while Figure 4C is for all SNPs, between all nuclei.

More information is needed to describe the criteria used for this analysis. I can imagine three possible haplotype situations that could be considered as recombinant sites according to what is presented in Figure 1—figure supplement 1A: (a) all contiguous SNPs match the haplotype assigned to the opposite mating type, (b) only one of the contiguous SNPs matches the other haplotype, and (c) multiple SNPs match the other haplotype.

It is unclear what contiguous SNPs refer to in this context. The SNPs identified by Chen et al. are rarely contiguous, as there are many invariant bases between SNPs. To clarify, no cases were identified by Chen et al. where all the SNPs on a single contig were of the opposite mating type. To hopefully clarify further, our criteria were used to identify, per SNP, either recombined sites in an individual nucleus, or positions where recombination is supposed to have occurred, without determining which nuclei were recombinant. We understand that these criteria are not intuitive, but the authors of Chen et al. were either unwilling or unable to provide the criteria that they used for Figure 3 of their manuscript, and the original manuscript does not explain how colors in Figure 3 were assigned. Any analysis of the underlying data requires an objective set of criteria, which Chen and co-authors failed to provide.

These scenarios all have different likelihoods of representing true recombinant events. For example, scenario (a) only represents a true recombination event if the mating type is correctly assigned (as later discussed by the authors).
For scenario (b), it is rather difficult to assign this to a recombination event, especially if it is only one SNP in the middle of a series of contiguous SNPs that would otherwise match the haplotype associated with its mating type or just a single SNP on that scaffold. Here, the most probable explanation might be a mutation. Scenario (c) has the strongest likelihood of representing a true recombination event, but only if there is a switch where there are contiguous SNPs from one haplotype to contiguous SNPs from another haplotype. Multiple haplotype switching would require multiple recombination events, which seems unlikely given that double crossovers are unlikely to occur within the short sequence lengths of the reported blocks in Supplementary file 2.

We agree with this logic, and we note that all of the claimed recombination events would involve double crossovers, as the haplotypes revert back to the original mating type in every case. Short stretches of this could indeed represent mutation, or alternatively gene conversion as emphasized by reviewer #3.

It is not clear to me which of these scenarios is included in the parsimony criterion for sites. For positions, the attribution of SNP differences between mating types to recombination over other processes is unclear. In Figure 1—figure supplement 1B, for positions 5 and 6, only one nucleus (D) is different from the others. This could arise through recombination or through mutation (especially if it is only one SNP among many on a scaffold – which is not diagrammed in the example shown in Figure 1—figure supplement 1B, but could have been scored in the data). If it is several contiguous positions, then recombination is more likely than several independent mutations that would give rise to the other haplotype. The authors should clarify this in the text and/or by revising Figure 1—figure supplement 1.

We agree with the reviewer and have tried to clarify the wording of our criteria.
Additionally, we have modified Figure 1—figure supplement 1A to represent how we interpret singletons. We point to Figure 3 of Chen et al.’s original manuscript, where they use examples of singletons as well as contiguous regions.

2) My other major concern is the analysis presented in Figure 3 and associated commentary. The authors state "Another fact visible from Figure 3 is that for A4 and A5, recombined sites are overrepresented by sites with a read depth of 2 compared to random sites". I agree that it looks that way in the figure, but I would like to see some statistics to support this claim. The number of random sites is much, much greater than the number of recombined sites, further complicating a simple visual comparison. I suggest that the authors subsample the number of random sites to match the number of recombined sites and display this in the plot.

We tested the conclusion of Figure 2 using the Wilcoxon rank-sum test. The statistical results are consistent with our interpretation.

Results: “Another fact visible from Figure 3 is that, for A4 and A5, recombined sites are overrepresented by sites with low depth compared to non-recombined sites (Wilcoxon rank-sum test: A4, p = 3.2 × 10⁻⁹; A5, p = 2.9 × 10⁻⁶). We note that for nuclei from isolate SL1, fewer overall recombined sites can be identified since the decreased breadth of coverage reduces overlap between nuclei, making it difficult to say whether this pattern of excess low-coverage sites is also present (p = 0.11).”

Materials and methods: “We compared the coverage representation between non-recombined and recombined sites. We subsampled from non-recombined sites to match the number of reads in recombined sites and performed a Wilcoxon rank-sum test in R between the subsampled set of non-recombined reads and the recombined reads.”

Also, would "non-recombined" sites be a more appropriate term than random sites?
The original non-subsampled analysis did not exclude identified recombined sites, but we have excluded recombined sites from the subsampled analysis. We have changed the wording of Figure 3 accordingly.

The legend of this figure says that recombined sites are identified as recombined based on the parsimony criterion diagrammed in Figure 1—figure supplement 1. Is the read depth plotted for each individual SNP in a recombined site? Or is the average depth across all SNPs in a recombined site? The discrete values suggest that it is the former or that it is actually referring to recombinant positions rather than sites.

Yes, it is the read depth per nucleus. We have clarified the legend of the figure.

Reviewer #3: Auxier and Bazzicalupo express their concerns regarding an earlier study on inter-nucleus recombination in arbuscular mycorrhizal fungi, and tested their concerns regarding the data analysis by filtering the presented evidence for recombination using information on repetitive regions/heterozygous sites as well as short read alignment coverage information. In general, the search for rare events in the genome cannot be performed with relaxed filters, as a false-positive signal (even if it is low) will affect rare events much more than frequent events. Figure 1 is absolutely convincing and clearly shows that the new filtering more strongly affects the sites that are annotated as recombined sites, as compared to the sites that are not recombined. This technical difference between recombined sites and non-recombined sites can only be true if these two sets of sites are technically different. There is no reason why recombination should lead to such a difference. This is true for the repeat analysis/heterozygous site analysis as well as for the low coverage analysis. In consequence, I do share the concerns regarding the finding of the original study.
In consequence, I fully agree with the last statement of the manuscript: even though the new filtering does not remove all signals that could be recombination-induced, the (very little) remaining signal for recombination "are not robust enough to support or reject" meiosis-like mechanisms. Moreover, I would not even agree that the remaining evidence for recombination, the PCR based assessment of the mating type of nucleus 07 (SL1), is strong evidence for recombination. It is a non-replicated experiment of a single event (as far as I understand) and thus does not meet the general criteria to support a conclusion of such large impact.

We agree, but without any direct evidence against it, we feel the need to give the authors the benefit of the doubt.

My only concern regarding the comment: Even if stated in the original manuscript, I would not agree that longer conversion tracts support recombination events more than single marker conversions. Recombination does not necessarily exchange long tracts, for example gene conversion-like events could be very short (as for example shown in Arabidopsis where the majority of gene conversions is only supported by single markers). If gene conversion-like mechanisms would act here, short tracts might be the expected pattern. Therefore, I do not see why the absence of long tracts should be prominently illustrated.

We discuss the length of tracts because they are highlighted by Chen et al. in their Results section: “In many cases, recombining genotypes encompass hundreds to thousands of base pairs, (Figure 3, Supplementary file 6). […] In this example, a single recombination event between genotypes harbored by the nuclei 22 (MAT-1) and 19 (MAT-2) resulted in a genetic exchange involving at least one thousand base pairs, and similar events are found elsewhere in the genome of A4.” While we agree that gene conversion could result in the transfer of single markers, Chen et al.
refer to “meiotic-like processes” in the Abstract of their publication. If indeed a meiotic-like process is occurring, then gene conversion should be paired with at least one crossover event per chromosome for proper segregation. It is possible that crossover events are only occurring on the distal ends of chromosomes, as in Agaricus bisporus, and these distal ends are not included in the largest 100 scaffolds. But there is no evidence presented for this scenario and we prefer not to speculate on potential mechanisms to explain Chen et al.’s low quality data. In consideration of the reviewer’s and editors’ comments, we have added a sentence to the conclusion acknowledging that gene conversion events would be of a size consistent with the size found by Chen et al.; however, gene conversion events are extremely difficult to confidently identify, as shown by Qi et al., whom we have added as a reference, as well as Wijnker et al.

“While small blocks of genetic exchange may be compatible with gene conversion, the limited number of markers involved greatly increases the difficulty in identifying high-confidence gene conversion events (Wijnker et al., 2013; Qi et al., 2014).”
  15 in total

1.  Organization of genetic variation in individuals of arbuscular mycorrhizal fungi.

Authors:  Teresa E Pawlowska; John W Taylor
Journal:  Nature       Date:  2004-02-19       Impact factor: 49.962

2.  ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R.

Authors:  Emmanuel Paradis; Klaus Schliep
Journal:  Bioinformatics       Date:  2019-02-01       Impact factor: 6.937

3.  Factors influencing meiotic recombination revealed by whole-genome sequencing of single sperm.

Authors:  Anjali Gupta Hinch; Gang Zhang; Philipp W Becker; Daniela Moralli; Robert Hinch; Benjamin Davies; Rory Bowden; Peter Donnelly
Journal:  Science       Date:  2019-03-22       Impact factor: 47.728

4.  Evidence for the sexual origin of heterokaryosis in arbuscular mycorrhizal fungi.

Authors:  Jeanne Ropars; Kinga Sędzielewska Toro; Jessica Noel; Adrian Pelin; Philippe Charron; Laurent Farinelli; Timea Marton; Manuela Krüger; Jörg Fuchs; Andreas Brachmann; Nicolas Corradi
Journal:  Nat Microbiol       Date:  2016-03-21       Impact factor: 17.745

5.  Biopython: freely available Python tools for computational molecular biology and bioinformatics.

Authors:  Peter J A Cock; Tiago Antao; Jeffrey T Chang; Brad A Chapman; Cymon J Cox; Andrew Dalke; Iddo Friedberg; Thomas Hamelryck; Frank Kauff; Bartek Wilczynski; Michiel J L de Hoon
Journal:  Bioinformatics       Date:  2009-03-20       Impact factor: 6.937

6.  Conserved meiotic machinery in Glomus spp., a putatively ancient asexual fungal lineage.

Authors:  Sébastien Halary; Shehre-Banoo Malik; Levannia Lildhar; Claudio H Slamovits; Mohamed Hijri; Nicolas Corradi
Journal:  Genome Biol Evol       Date:  2011-08-29       Impact factor: 3.416

7.  fastp: an ultra-fast all-in-one FASTQ preprocessor.

Authors:  Shifu Chen; Yanqing Zhou; Yaru Chen; Jia Gu
Journal:  Bioinformatics       Date:  2018-09-01       Impact factor: 6.937

8.  Single nucleus sequencing reveals evidence of inter-nucleus recombination in arbuscular mycorrhizal fungi.

Authors:  Eric Ch Chen; Stephanie Mathieu; Anne Hoffrichter; Kinga Sedzielewska-Toro; Max Peart; Adrian Pelin; Steve Ndikumana; Jeanne Ropars; Steven Dreissig; Jorg Fuchs; Andreas Brachmann; Nicolas Corradi
Journal:  Elife       Date:  2018-12-05       Impact factor: 8.140

9.  Recombination in Glomus intraradices, a supposed ancient asexual arbuscular mycorrhizal fungus.

Authors:  Daniel Croll; Ian R Sanders
Journal:  BMC Evol Biol       Date:  2009-01-15       Impact factor: 3.260

10.  Finding the sources of missing heritability in a yeast cross.

Authors:  Joshua S Bloom; Ian M Ehrenreich; Wesley T Loo; Thúy-Lan Võ Lite; Leonid Kruglyak
Journal:  Nature       Date:  2013-02-03       Impact factor: 49.962

  2 in total

1.  Reciprocal recombination genomic signatures in the symbiotic arbuscular mycorrhizal fungi Rhizophagus irregularis.

Authors:  Ivan D Mateus; Ben Auxier; Mam M S Ndiaye; Joaquim Cruz; Soon-Jae Lee; Ian R Sanders
Journal:  PLoS One       Date:  2022-07-01       Impact factor: 3.752

2.  More Filtering on SNP Calling Does Not Remove Evidence of Inter-Nucleus Recombination in Dikaryotic Arbuscular Mycorrhizal Fungi.

Authors:  Eric C H Chen; Stephanie Mathieu; Anne Hoffrichter; Jeanne Ropars; Steven Dreissig; Jörg Fuchs; Andreas Brachmann; Nicolas Corradi
Journal:  Front Plant Sci       Date:  2020-07-07       Impact factor: 5.753
