
Representing sex chromosomes in genome assemblies.

Sarah B Carey1,2, John T Lovell2, Jerry Jenkins2, Jim Leebens-Mack3, Jeremy Schmutz2,4, Melissa A Wilson5, Alex Harkess1,2.   

Abstract

Sex chromosomes have evolved hundreds of independent times across eukaryotes. As genome sequencing, assembly, and scaffolding techniques rapidly improve, it is now feasible to build fully phased sex chromosome assemblies. Despite technological advances enabling phased assembly of whole chromosomes, there are currently no standards for representing sex chromosomes when publicly releasing a genome. Furthermore, most computational analysis tools are unable to efficiently investigate their unique biology relative to autosomes. We discuss a diversity of sex chromosome systems and consider the challenges of representing sex chromosome pairs in genome assemblies. By addressing these issues now as technologies for full phasing of chromosomal assemblies are maturing, we can collectively ensure that future genome analysis toolkits can be broadly applied to all eukaryotes with diverse types of sex chromosome systems. Here we provide best practice guidelines for presenting a genome assembly that contains sex chromosomes. These guidelines can also be applied to other non-recombining genomic regions, such as S-loci in plants and mating-type loci in fungi and algae.


Year:  2022        PMID: 35720975      PMCID: PMC9205529          DOI: 10.1016/j.xgen.2022.100132

Source DB:  PubMed          Journal:  Cell Genom        ISSN: 2666-979X


THE HISTORY OF SEX CHROMOSOME ASSEMBLY

Dr. Nettie Stevens made the groundbreaking cytogenetic discovery that male mealworms (Tenebrio sp.) possessed a small chromosome that determined sex.[1] Deemed the “heterochromosome,” which we now recognize as the male-specific Y chromosome, this small chromosome was never found in eggs. Since then, sex chromosomes have been identified widely across plants, animals, and fungi.[2,3] Sex chromosomes were first discovered using microscopy, and today genomic analyses enable their identification, assembly, and subsequent comparative analysis. The monumental, global effort that produced the first human genome draft, published in 2000,[4] involved tiled sequencing of P1-derived artificial chromosomes (PACs), cosmids, and bacterial artificial chromosomes. The initial X chromosome was highly contiguous with only 14 intractable gaps.[5] It took nearly 20 more years for the human X chromosome[6] and autosomes[7] to be fully assembled from telomere to telomere without any sequence gaps. Although substantial progress has been made in assembling the human Y chromosome,[8,9] its telomere-to-telomere assembly remains unfinished because of the large heterochromatic segment that takes up about two-thirds of the human Y. However, long-read sequencing is poised to resolve the complete sequence of the Y chromosome soon as well[7,9] (Figure 1). To date, hundreds of plant and animal genomes with sex chromosomes have been sequenced, assembled, and published, with varying degrees of contiguity and completeness.[10,11] As genome sequencing technologies continue to improve with higher-fidelity long-read sequencing, combined with improvement of phased assembly and scaffolding algorithms, we expect that highly contiguous assemblies of sex chromosome pairs will soon become commonplace.
Figure 1.

Ideogram of human chromosomes

The human genome reference contains a single haplotype for autosomes (here only chromosomes 1 and 2 are shown, but the logic applies to all 22 autosomes). In contrast, both of the sex chromosomes are represented in a heterogametic assembly, which is important because, although they were once entirely homologous, they are highly diverged across most of their lengths. The male-specific region of the Y (MSY), also called the sex determination region (SDR), in humans has lost most genes and has accumulated many repeats, like in the ampliconic regions where the repeats have high sequence similarity (>99%) and can be found in palindromes or tandem arrays, and it has more heterochromatic regions when compared with the X. In contrast, the pseudoautosomal regions (PAR), which pair and freely recombine during meiosis, share 100% homology and are represented twice.

Approximately 95% of animals (a condition called gonochory[12]) and 8% of land plants (called dioecy[10,13]) have separate sexes. With several large genome projects in progress, such as the 10,000 Plants Genome Sequencing Project,[14] Earth BioGenome Project,[15] Global Invertebrates Genomics Alliance,[16] Vertebrate Genome Project,[11] and user-driven projects through the Department of Energy Joint Genome Institute (e.g., https://phytozome-next.jgi.doe.gov/ogg/), thousands of genome assemblies containing sex chromosomes will be published in the next decade. It is critical, therefore, that we develop a standard for consistent reporting of sex chromosomes in genome assemblies, if not across all gonochoric and dioecious eukaryotes, then at least for all species within taxa included in comparative analyses (e.g., mammals, birds, flies, flowering plants). The lack of standard representation of the sex chromosome pair in a genome assembly can be attributed to the immense variation in systems across eukaryotes (Table 1). Consequently, downstream analysis tools lack rigorous support for the unique nature of sex chromosomes across all eukaryote lineages; indeed, many simply ignore the sex chromosomes altogether. Here we outline the key issues with sex chromosome structure that impede genome assembly and describe how current technologies are poised to change these norms. Importantly, we describe key considerations for reporting sex chromosomes in genome assembly releases, encompassing X/Y, Z/W, and U/V sex chromosomes. These considerations will be crucial for ensuring that computational genomic analysis toolkits can be broadly applied to the oncoming deluge of genome assemblies with sex chromosomes, and that there is a consistent and practical format for releasing these genomes in public repositories.
Table 1.

Examples of sex chromosome variation across animals and plants

Species | Sex chromosome cytology | Source

Pufferfish | proto-XY | Kamiya et al[17]
Garden asparagus, papaya, green anole | Homomorphic XY | Harkess et al[18]; Liu et al[19]; Alfoldi et al[20]
Mealworm, human, common hop, white campion | Heteromorphic XY | Stevens[1]; Rozen et al[21]; Winge[22]; Westergaard[23]
Japanese hop | XY1Y2 | Kihara[24]
Platypus | X1X2X3X4X5Y1Y2Y3Y4Y5 | Veyrunes et al[25]
Smoky jungle frog | X1X2X3X4X5X6Y1Y2Y3Y4Y5Y6 | Gazoni et al[26]
Spiny rat, nematodes | XO | Kobayashi et al[27]; Hodgkin[28]
Most spiders | X1X2O | Kral[29]
Heartwing sorrel | XY and XY1Y2 | Smith[30]
Black muntjac deer, Drosophila miranda | neo-XY | Zhou et al[31]; Bachtrog and Charlesworth[32]
Strawberries | proto-ZW | Spigler et al[33]
Emu, boa constrictor, red bayberry | Homomorphic ZW | Ellegren[34]; Ohno[35]; Jia et al[36]
Chicken | Heteromorphic ZW | Hirst et al[37]
Marsh marigold moth | ZO | Traut and Marec[38]
Hochstetter’s frog | WO | Green et al[39]
Darter characin fish | ZW1W2 | Filho et al[40]
Ancistrus catfishes | Z1Z2W1W2 | de Oliveira et al[41]
Northeast-Asian wood white butterfly | Z1Z2Z3Z4Z5Z6W1W2W3 | Sichova et al[42]
Western clawed frog, Burtoni cichlid fish | YWZ | Roco et al[43]; Roberts et al[44]
Fire moss | Homomorphic UV | Carey et al[45]
Common liverwort, Sphaerocarpos liverwort | Heteromorphic UV | Yamato et al[46]; Allen[47]
Dilated scalewort | U1U2V | Sousa et al[48]

Note that many multiple sex chromosome systems may arise through the formation of neo-sex chromosomes but are not indicated here.

THE STRUCTURE OF SEX CHROMOSOMES

Over the last century of research into sex chromosome evolution, several key similarities have emerged among many, but not all, sex chromosomes. Sex chromosomes can evolve from an ancestral pair of autosomes, typically forming a region of suppressed recombination between the sex chromosome pair, called a “non-recombining region” or “sex determination region” (SDR) (Figure 1). Whereas the genes that initiate female or male sex determination typically reside in the SDR,[49] there are clear cases where these sex determination genes have translocated to other chromosomes.[50] Instead, for some systems, like in humans, a better way to refer to the non-recombining region is the male-specific region of the Y (MSY; Figure 1). To encompass a wide range of sex chromosome types across kingdoms, which we describe below, for simplicity we will use SDR to refer to the non-recombining region of a sex chromosome. In systems studied to date, the SDR varies in size, ranging from <100 kilobases (Kb) to >100 megabases (Mb), accounting for <1% to nearly 100% of a sex chromosome’s length (Figure 2). Flanking these non-recombining regions is the pseudoautosomal region (PAR), which is the homologous sequence of both sex chromosomes that pairs normally at meiosis and can recombine (Figure 1).
Figure 2.

Remarkable variation found across sex chromosomes

(A) Different routes to suppressed recombination have been identified involving inversions or hemizygosity through deletions or translocations. Some SDRs have instead evolved in regions of existing low recombination, such as centromeres.

(B) The size of the SDR varies across species, with some <1 Mb, representing <1% of the sex chromosome, while others are >110 Mb and across the entirety of the sex chromosome.

(C) There are differences in which sex contains the sex-specific chromosome. In XX/XY systems, males are XY, while females are XX. In ZZ/ZW systems, the opposite is true, where females are the heterogametic sex inheriting ZW and males are ZZ. In species that have haploid sex determination, the inheritance of a single U chromosome correlates with females and a single V with males.

(D) There is also cytological variation between the homologous pairs of sex chromosomes. Some are homomorphic, where the X and Y are the same in size, while others are heteromorphic, where either the X or Y is larger. In others, the sex-specific chromosome like the Y has been lost, and dosage of genes on the X determines sex. In other systems, several chromosomes are inherited in a sex-specific fashion, called “multiple” sex chromosomes. Neo-sex chromosomes have also been identified, where a fusion between an autosomal pair and the sex chromosomes has occurred. Examples for each of these sex chromosome types can be found in Table 1.

The SDR has been shown to evolve in existing regions of low recombination, including centromeres;[51] to arise from large-scale mutations that inhibit recombination, including inversions[52-54] and deletions or translocations that result in hemizygosity[18,55,56] (Figure 2); or to form through the gradual build-up of transposable elements.[57] While some sex chromosome pairs are stable across taxa, having a single origin tens of millions of years ago,[25,45,58,59] others are more labile and frequently transition to a new, non-homologous chromosome pair[55,60] or have a recent, independent origin from a hermaphroditic ancestor.[49] After their initial evolution, SDRs evolve on different molecular evolutionary trajectories than autosomes and PARs. The lack of recombination reduces the efficacy of natural selection, allowing for substantial changes in the sex chromosome haplotype, such as further structural variation, gene loss, and repeat accumulation.[61] An extreme example is the human XY, where 90% of the ancestral genes have been lost on the Y chromosome relative to the X over its 160 million years of evolution[62] (Figure 1). In other cases, like the flowering plant Silene latifolia, the Y chromosome has expanded with repetitive DNA to nearly twice the size of the X chromosome over the past 11 million years, but retains many homologous genes.[63,64] These “degenerative” processes occur at different structural and temporal scales across taxa, creating a kaleidoscope of sex chromosome haplotype variation.[49,65] Sex chromosomes also have incredibly diverse pairing systems, chromosomal structures, and genes that determine sex. For the purposes of this review, we define three major sex chromosome systems that most plant and animal species fall into: X/Y, Z/W, and U/V (Figure 2). The difference between X/Y and Z/W systems lies in which sex, male or female, is heterogametic for the sex chromosome pair (i.e., can make gametes containing different sex chromosomes). 
In X/Y systems, males are typically heterogametic, carrying both an X and Y chromosome as a pair. Females are typically homogametic, carrying two copies of an X chromosome. In Z/W systems, females are the heterogametic sex, carrying a Z and W, while males are ZZ. A third system, U/V, is found in haploid-dominant systems, where females inherit a single U chromosome and males a single V.[2] There is also remarkable diversity in sex chromosome cytotypes, including variation in the size of the Y/W compared with the X/Z (i.e., heteromorphic versus homomorphic), dosage systems where one sex chromosome in the pair was lost (e.g., XX/XO or ZZ/ZO sex determination systems known in some species; Table 1), and multiple sex chromosome pairs (e.g., X1X2Y1Y2), as well as diversity within a species or genus, including aneuploidies and those with neo-sex chromosomes (Figure 2; Table 1). Because the non-recombining SDRs of sex chromosomes evolve on separate evolutionary trajectories from each other and from the autosomes, the SDR haplotypes can diverge rapidly, producing tremendous sequence, structural, and functional variation among populations and species.[66]

CHALLENGES OF SEX CHROMOSOME ASSEMBLY

Because of the complex nature of SDRs, and because each sex chromosome receives only half of the sequencing coverage of the autosomes in XY or ZW genotypes, it is far more challenging to generate assemblies of sex chromosomes than of autosomes. Consequently, sex chromosomes have been the most poorly assembled and annotated regions of plant and animal genomes. For example, sex chromosomes in the Vertebrate Genome Project assemblies were typically more fragmented than autosomes.[11] Advances in genome sequencing, assembly, and long-range scaffolding techniques are poised to change this trend. Pacific Biosciences (PacBio) high-fidelity (HiFi) reads are of medium length (15–25 kb) and highly accurate (99%+), enabling the highly contiguous and allele-phased assembly of complex genomes.[67] Oxford Nanopore Technologies reads can reach multi-Mb sizes, though with a higher error rate, and were a key tool in scaffolding the first telomere-to-telomere X chromosome in humans.[6] While genome sequencing techniques have rapidly advanced, a key complication is that genome assembly algorithms are not designed with sex chromosomes in mind. The current generation of PacBio HiFi assembly algorithms, such as hifiasm,[68] IPA (https://github.com/PacificBiosciences/pbbioconda/wiki/Improved-Phased-Assembler), HiCanu,[69] and Flye,[70] are designed to phase structurally similar autosomes into separate allelic haplotypes. Sex chromosomes often do not conform to this expectation, given their potentially large heteromorphy that can involve size, gene content, repeat content, and structural variation between the two members of a sex chromosome pair (Figures 1 and 2). In our experience, accurate HiFi assembly of sex chromosomes requires at least two additional analysis processes: Hi-C scaffolding and genetic inference of the identity of contigs that belong to the non-recombining region of sex chromosomes. 
Inference of sex linkage can be aided by identification of sex-specific sequences and sex-biased sequencing coverage in analyses of relatively inexpensive short-read sequence data.[71,72] Integrated analyses of phased PacBio HiFi genome assemblies, Hi-C, and standard short-read data are now enabling the full-length, accurately phased assembly of sex chromosomes,[73] although there are certainly cases where sex chromosome assembly will remain challenging (e.g., large genomes, polyploidy, high repeat content).
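To make the idea of sex-biased sequencing coverage concrete, the following is a minimal sketch for an XY system, not a method taken from the cited studies. It assumes that mean short-read depth per contig has already been computed for one male and one female sample. The function name, the labels, and the thresholds (`absent`, `tol`) are illustrative assumptions. After normalising by each sample's autosomal depth, a contig with a male/female log2 ratio near 0 behaves like an autosome or PAR, a ratio near -1 suggests X linkage, and a contig with essentially no female coverage is a Y candidate.

```python
from math import log2


def classify_contig(male_depth, female_depth, male_autosomal, female_autosomal,
                    absent=0.1, tol=0.4):
    """Label one contig from mean read depth in a male and a female sample.

    All thresholds are illustrative assumptions for an XY system.
    """
    # Normalise by each sample's autosomal depth so that libraries of
    # different size are comparable (1.0 = diploid autosomal coverage).
    m = male_depth / male_autosomal
    f = female_depth / female_autosomal

    if m < absent and f < absent:
        return "unresolved"            # no usable coverage in either sex
    if f < absent:
        return "Y-candidate"           # covered in males only
    if m < absent:
        return "unexpected"            # female-only coverage in an XY system

    ratio = log2(m / f)
    if abs(ratio) <= tol:
        return "autosomal-or-PAR"      # equal coverage in both sexes
    if abs(ratio + 1.0) <= tol:
        return "X-candidate"           # males carry half the female dose
    return "unresolved"


if __name__ == "__main__":
    print(classify_contig(30, 30, 30, 30))
    print(classify_contig(15, 30, 30, 30))
    print(classify_contig(14, 0.5, 30, 30))
```

For a ZW system the same logic applies with the sexes swapped. In practice such calls would be combined with sex-specific sequence content and Hi-C contacts rather than used alone.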

ISSUES WITH SEX CHROMOSOME INFORMATICS

Most analytical and assembly challenges stem from major sequence differences between the sex chromosomes and unique structural variation absent in autosomes. For example, the human reference genome contains 22 haploid representations of autosomal chromosomes, but a diploid representation of two structurally divergent X and Y chromosomes (Figure 1). While this is appropriate for the non-recombining and diverged regions, the homologous PARs on the ends of the X and Y are represented twice with nearly 100% sequence identity in the state-of-the-art human genome assembly. If not adequately controlled for, this duplicated region will cause erroneous interpretation of output from short-read-based analyses, with reads mapping identically to multiple places, resulting in a map quality score of 0 when both PARs are present in the genome.[74] In the human genome, these duplicated PARs represent a small amount of the total nuclear genome sequence (0.1%), likely limiting the global effects of potential biases.[8] However, the PARs are far larger in other systems (e.g., 0.7% of total nuclear sequence in Canis lupus familiaris and 11% in Asparagus officinalis).[18,75] Duplicated, meiotically homologous assemblies of these PARs could introduce major downstream analytical problems, including variant calling, gene and repeat annotation, and gene expression quantification. These issues would be compounded when using the same reference genome assembly representation (i.e., Chr01–22, X, Y, and mitochondria) for all individuals, whether they have a Y chromosome or not. For the homogametic sex (i.e., XX individuals), and samples that have lost the Y chromosome (as sometimes occurs with aging[76]), a simple solution is to soft or hard mask the Y chromosome completely, thus prohibiting mapping to this reference, but keeping it within the index for downstream analyses. 
The development of this approach has shown vast improvements in analyses in humans.[74,77] In contrast, for samples with evidence of a Y chromosome, one approach is to soft or hard mask one copy of the PARs (typically on the Y chromosome) prior to downstream analyses.[74] However, ad hoc modification of traditional genome analysis pipelines is limited by the lack of a standard for reporting sex chromosome complement-specific reference sequences, and by lack of reporting of important boundary regions of the sex chromosomes for each genome build. Other informatic issues exist with sex chromosomes where reference genomes contain a mixture of haploid and diploid representations of chromosomes. Any analysis step that uses coverage as a filter, as many variant callers do, will often apply the same read depth filter to the autosomes and sex chromosomes. However, genome coverage on the sex chromosomes in the heterogametic sex, for highly diverged regions, is expected to be approximately half that of autosomes, resulting in systematic biases in variant calling, though this effect has not been directly tested. While some tools focus specifically on analysis of the X chromosome in genome-wide association studies,[78] the sex chromosome pair is often removed from population genetic analyses,[79,80] which is problematic given the important role these genes have been shown to play in development and disease, among other traits.[49,76,81]
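The read-depth argument above can be illustrated with a short sketch of a copy-number-aware depth filter. This is an assumption-laden illustration, not a feature of any particular variant caller. The region classes (`autosome`, `PAR`, `X_SDR`, `Y_SDR`), the `COPIES` table, and the default bounds of 0.5x to 2x the expected depth are all hypothetical choices. The sketch also assumes an XY system in which only one copy of the PAR is left unmasked in the reference, so that PAR reads from both chromosomes map to it.

```python
# Expected number of copies of each region class, keyed by
# (region class, whether the sample is heterogametic, i.e. XY).
# "X_SDR" is the X-specific counterpart of the non-recombining region.
COPIES = {
    ("autosome", True): 2, ("autosome", False): 2,
    ("PAR", True): 2,      ("PAR", False): 2,
    ("X_SDR", True): 1,    ("X_SDR", False): 2,
    ("Y_SDR", True): 1,    ("Y_SDR", False): 0,
}


def depth_bounds(diploid_depth, region, heterogametic, low=0.5, high=2.0):
    """Return (min, max) acceptable depth, scaled by expected copy number."""
    copies = COPIES[(region, heterogametic)]
    expected = diploid_depth * copies / 2
    return (low * expected, high * expected)


def passes(depth, diploid_depth, region, heterogametic):
    """True if a site's depth falls inside the copy-number-aware bounds."""
    lo, hi = depth_bounds(diploid_depth, region, heterogametic)
    # Regions with zero expected copies (e.g. Y in an XX sample) never pass.
    return hi > 0 and lo <= depth <= hi


if __name__ == "__main__":
    # A site at 14x in a 30x XY sample: fails an autosomal filter,
    # but is close to the 15x expected in the X-specific region.
    print(passes(14, 30, "autosome", True))
    print(passes(14, 30, "X_SDR", True))
```

The example shows how a single genome-wide threshold would discard a site in the heterogametic sex that is perfectly well covered for a hemizygous region.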

THE NEAR FUTURE OF SEX CHROMOSOME REPRESENTATION

In order for downstream (post-assembly) informatics tools to accurately incorporate the sex chromosomes, there needs to be a set of standards for reporting sex chromosomes in a genome assembly that the tools can use as input. As diverse genome sequencing technologies converge on both long and accurate reads, highly contiguous sex chromosome pair assemblies will very soon become the norm. Before this deluge of oncoming genomes, we have several recommendations for how to approach genome assembly projects. Here we discuss different scenarios for presenting and releasing sex chromosome assemblies in the context of the latest genome sequencing and assembly techniques that accommodate the diversity of sex chromosomes in eukaryotes. The goal of many large-scale genome projects is to provide a single, complete reference haplotype for a species. Ideally, the isolate used for genome sequencing should be of a known sex and this reported in the metadata and repositories in which the assembly is submitted (Box 1). For gonochoristic/dioecious species, publishing the genome sequence of an individual containing the homogametic sex chromosomes (i.e., ZZ or XX) can follow existing practices with reporting chromosomes, by numbering the autosomes and designating the X/Z chromosome. Targeting the homogametic sex also obviates many of the complications that we have discussed, such as the computational challenge of assembling highly diverged sex chromosome haplotypes. However, critically, the reference will not be adequate for ~50% of the individuals in the species (i.e., individuals carrying the Y or W) given the aforementioned immense variation in haplotype that can exist on an SDR. Therefore, it is our strong suggestion that the reference be an individual containing the heterogametic sex chromosome pair (i.e., ZW or XY). 
There are several possibilities for representing sex chromosomes in genome assemblies within a heterogametic individual, each with a different set of pros and cons that must be considered (Figure 3; Table 2). One option, as in the human genome, is to represent a single haplotype for the autosomes and the full length of both the Y/W and the X/Z chromosomes (Figure 3). A challenge with this approach is that the PAR needs to be demarcated, otherwise there will be two chromosomes with a complement of meiotically homologous sequence that would severely complicate read mapping, protein mapping, and ab initio gene prediction and annotation. Although we recognize the PAR can sometimes be polymorphic within a species,[82,83] obscuring the demarcation of a single boundary, a highly informed boundary within the genome of the sequenced individual is vital. Similarly, representing the Y/W in full, but masking the PAR (i.e., hard mask by replacing sequence with “N” characters or soft mask by converting the sequence to lowercase) in the reference release, or accompanying it, would eliminate these double-mapping issues at the outset, but maintain the context of the SDR within the chromosome (Figure 3).
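The two masking styles described above can be sketched in a few lines. This is a minimal illustration on in-memory sequences, assuming 0-based, half-open coordinates. The function names and the toy sequences are hypothetical, and a real workflow would read and write FASTA files with an established toolkit.

```python
def mask_region(seq, start, end, mode="hard"):
    """Mask seq[start:end]; 'hard' writes N characters, 'soft' lowercases."""
    if not 0 <= start <= end <= len(seq):
        raise ValueError("region lies outside the sequence")
    segment = seq[start:end]
    if mode == "hard":
        masked = "N" * len(segment)
    elif mode == "soft":
        masked = segment.lower()
    else:
        raise ValueError("mode must be 'hard' or 'soft'")
    return seq[:start] + masked + seq[end:]


def mask_reference(records, regions, mode="hard"):
    """Apply masks to a {name: sequence} mapping.

    regions is a list of (name, start, end) tuples; the input mapping
    is left unchanged and a masked copy is returned.
    """
    out = dict(records)
    for name, start, end in regions:
        out[name] = mask_region(out[name], start, end, mode)
    return out


if __name__ == "__main__":
    toy = {"ChrX": "ACGTACGTAA", "ChrY": "ACGTTTGG"}   # toy sequences
    # Heterogametic sample: mask only the PAR copy on the Y.
    print(mask_reference(toy, [("ChrY", 0, 4)]))
    # Homogametic sample: mask the whole Y but keep it in the index.
    print(mask_reference(toy, [("ChrY", 0, len(toy["ChrY"]))]))
```

Keeping the masked chromosome in the reference, rather than deleting it, preserves contig names and coordinates across the sex-specific versions of the same genome build.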
Figure 3.

Solutions for representing sex chromosomes in genome assemblies

(A) In the genome release, one option is to provide the primary haplotype for the autosomes and both pairs of the sex chromosomes, like the human reference (see Figure 1).

(B) Because the PARs will be represented twice, causing issues with downstream analyses, a solution is to mask the PARs on the Y chromosome (in blue).

(C) Assembling both haplotypes is the best solution, because the entire genome would be represented twice.

(D) These first three approaches are ideal because the location of the SDR and structural variants are maintained. The hypothetical dot plot between two haplotypes highlights a large inversion on Chr01 and several structural variants in the SDR.

(E and F) If assembling the whole chromosome is not possible, (E) the Y SDR could instead be represented as an alternative haplotype of the X or (F) as a separate contig. There are pros and cons for each of these representations of sex chromosomes in the genome (Table 2), but regardless of the approach it is imperative that the SDR and PAR boundaries be reported in the genome release, so that comparative analyses can be undertaken.

Table 2.

Pros and cons in approaches for representing sex chromosomes in genome assemblies

Approaches for representing the genome | Pro | Con | Solution for cons

Provide both sex chromosomes in fasta reference, but only one copy of each autosome | Both sex chromosome haplotypes are available for mapping; Context for each SDR represented | PARs are identical and represented twice; Homologous regions in the SDR with low divergence will have mapping issues | Mask PARs; Mask SDR for homogametic sex
Provide both sex chromosomes in fasta reference, but mask the PARs | Both sex chromosome haplotypes are available for mapping; Context for each SDR represented, but only one PAR is available to map | Some SDRs are very small (<1% of the chromosome) and a chromosome composed nearly entirely of N’s would increase computational burden (e.g., storage requirements), while providing no additional genomic information within these masked regions; SDR boundaries can be variable within a species | Maintain a version of genome assembly with and without masking in an accessible database
Provide contig of only SDR | SDR available for mapping | Context for location and structural variation for SDR is lost | Provide coordinates for the homologous region of the SDR
Provide sex-specific chromosome as an alternate haplotype | Genome is represented as haploid (except for any alternate haplotype contigs) | Context for location and structural variation for SDR is lost | Provide coordinates for the homologous region of the SDR
Provide diploid genome assembly | Autosomes and sex chromosomes both represented as diploid | Generating fully phased diploid references currently a challenge; Many current analysis tools are not designed for diploid assemblies | Use trio-binning or Hi-C to aid in phasing

While haploid representations have been an integral first step in generating a reference genome, it is clear diploid representations, which contain homologous chromosome pairs for the entire genome, are better reflections of the genetic diversity that exists within a heterozygous individual.[84-87] Producing fully phased diploid representations of genomes, where every chromosome, both autosomes and sex chromosomes, would be represented as a homologous pair, would alleviate many of the bioinformatic complications of combining haploid and diploid chromosomal representations in a single assembly (Figure 3; Table 2). The recent advances in genome sequencing technology and analysis have unlocked the ability to produce phased diploid assemblies,[68,69] including the sex chromosome pair.[73] Further, publication of accurately phased, diploid assemblies would also aid comparative analyses of other non-recombining regions, such as large inversions on autosomal chromosomes and the S-locus in self-incompatible plants (Figure 3). However, the generation of phased diploid assemblies creates an additional problem: how should a reference genome that contains a sex chromosome pair be represented in a single fasta file? Phased genome assembly is still in its infancy, and since tools will continue to be built around the notion that phased assemblies will soon be commonplace, we propose that the most versatile path forward for representing sex chromosomes in genome assemblies is to preserve as much information as possible by publishing assemblies for each haplotype in full (Figure 3). In addition, we recommend providing genomic coordinates for the SDR/PAR in the release of these haplotype assemblies to aid in comparative analyses. This gives both the genome producer and users the ability to modify the reference genome to fit any number of bioinformatic scenarios of presenting the sex chromosome pair for a given analysis, such as hard masking PARs (Figure 3). 
Despite these advancements in phased diploid assembly, we realize there are biological, technical, and financial realities that limit the ability to produce such references. For example, in species with long stretches of low heterozygosity, phasing maternal and paternal haplotype blocks without high-quality trio bins is still currently difficult, meaning only a single collapsed haplotype can be assembled.[68,88,89] To accommodate situations in which a fully phased diploid assembly is intractable, a different approach for haploid representations of the sex chromosomes is to represent the Y or W as an alternative haplotype of the X or Z in assemblies[90] (Figure 3; Table 2). This may be an especially well-suited option when the SDR is a relatively small fraction of the sex chromosome like in A. officinalis, Morella rubra, or C. lupus familiaris.[18,36,75] In cases where an alternative haplotype cannot be assembled, but the Y or W can still be assembled separately, a similar approach would be to append the contig(s) containing the Y/W SDR to the primary assembly containing the autosomes and X/Z. A notable issue with these alternatives is that all necessary genomic context between the X/Z and Y/W is lost, including the true size of the Y or W chromosome, major structural variations between sex chromosomes of a heterogametic genotype, and the absolute base pair location of the SDR on the hemizygous chromosome. If using these approaches, it is also necessary to provide metadata with the location of the SDR relative to the X or Z to recover these important contexts. While diploid assemblies may be the best path forward for genome references, representing the sex chromosomes as either an alternative haplotype or as a pair in an otherwise haploid assembly, may be the most broadly applicable approach for most systems in which a fully phased diploid assembly is not feasible. UV sex chromosomes present a unique set of obstacles. 
Because UV systems are haploid, where females have a U chromosome and males have a V[2] (Figure 2), both sex chromosomes are sex-specific and there is no heterogametic sex to target for a genome reference. To capture the diversity between the U and V chromosomes, a genome reference will need to be generated for both sexes. This is functionally analogous to generating a phased diploid assembly, though perhaps easier to accomplish given that a haploid individual contains only a single haplotype. This makes representing the individual references straightforward: the autosomes and sex chromosomes are simply labeled within each assembly. As in diploid systems, however, U and V chromosomes are expected to have PARs, which should be demarcated on both for downstream analyses. An analogous approach can be extended to mating-type loci found in many algae and fungi. Because of the diversity of sex chromosomes that we have described, and others yet to be discovered, it is likely that no single one of these options will fit all scenarios. Regardless, moving toward a form of consistency is imperative, such that comparisons can easily be made across different species. This starts with unfailingly noting the sex of the genome reference, whether sex chromosomes are known in the species, and clearly noting contigs and coordinates for PARs and SDRs as part of the genome release and associated metadata (e.g., within a README file) (Box 1).
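As one possible shape for such metadata, the sketch below parses and sanity-checks a small tab-separated table of PAR and SDR coordinates that could accompany a README. Everything here is an assumption for illustration: the four-column layout (contig, start, end, label), the header comment, the function names, and the toy coordinates are not a proposed or existing standard. Coordinates follow the BED convention of 0-based, half-open intervals.

```python
# Toy coordinates for illustration only; not taken from any real assembly.
EXAMPLE = """\
# sex_of_isolate=male\tsystem=XY
ChrX\t0\t1000\tPAR
ChrX\t1000\t9000\tX-specific
ChrY\t0\t1000\tPAR
ChrY\t1000\t4000\tSDR
"""


def parse_regions(text):
    """Parse 'contig<TAB>start<TAB>end<TAB>label' lines into tuples."""
    regions = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue                      # skip blank and comment lines
        contig, start, end, label = line.split("\t")
        start, end = int(start), int(end)
        if start < 0 or end <= start:
            raise ValueError(f"bad interval on {contig}: {start}-{end}")
        regions.append((contig, start, end, label))
    return regions


def find_overlaps(regions):
    """Return neighbouring annotated regions on one contig that overlap."""
    problems = []
    ordered = sorted(regions)             # sort by contig, then start
    for a, b in zip(ordered, ordered[1:]):
        if a[0] == b[0] and b[1] < a[2]:
            problems.append((a, b))
    return problems


if __name__ == "__main__":
    regions = parse_regions(EXAMPLE)
    for contig, start, end, label in regions:
        print(contig, start, end, label)
    print("overlaps:", find_overlaps(regions))
```

A machine-readable table of this kind lets users derive masked or sex-specific versions of the reference without re-inferring the boundaries themselves.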

FUTURE PROSPECTS OF STUDYING SEX CHROMOSOMES

There are practical outcomes of assembling and properly representing diverse eukaryotic sex chromosomes. This includes the identification of genes and variants that are linked to sex-specific development, disease, breeding, and evolution. A consistent set of genome assembly representation standards that takes into account the unique biology of the species, as well as the quality and type of data available, will enable a powerful comparative framework to explore the veritable smorgasbord of sex chromosome evolution, function, and diversification.
  82 in total

1.  Occurrence of multiple sexual chromosomes (XX/XY1Y2 and Z1Z1Z2Z2/Z1Z2W1W2) in catfishes of the genus Ancistrus (Siluriformes: Loricariidae) from the Amazon basin.

Authors:  Renildo Ribeiro de Oliveira; Eliana Feldberg; Maeda Batista dos Anjos; Jansen Zuanon
Journal:  Genetica       Date:  2007-11-25       Impact factor: 1.082

2.  HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads.

Authors:  Sergey Nurk; Brian P Walenz; Arang Rhie; Mitchell R Vollger; Glennis A Logsdon; Robert Grothe; Karen H Miga; Evan E Eichler; Adam M Phillippy; Sergey Koren
Journal:  Genome Res       Date:  2020-08-14       Impact factor: 9.043

Review 3.  The relative and absolute frequencies of angiosperm sexual systems: dioecy, monoecy, gynodioecy, and an updated online database.

Authors:  Susanne S Renner
Journal:  Am J Bot       Date:  2014-09-24       Impact factor: 3.844

Review 4.  Plant sex chromosomes defy evolutionary models of expanding recombination suppression and genetic degeneration.

Authors:  Susanne S Renner; Niels A Müller
Journal:  Nat Plants       Date:  2021-03-29       Impact factor: 15.793

5.  Tree of Sex: a database of sexual systems.

Authors: 
Journal:  Sci Data       Date:  2014-06-24       Impact factor: 6.444

6.  Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome.

Authors:  Aaron M Wenger; Paul Peluso; William J Rowell; Pi-Chuan Chang; Richard J Hall; Gregory T Concepcion; Jana Ebler; Arkarachai Fungtammasan; Alexey Kolesnikov; Nathan D Olson; Armin Töpfer; Michael Alonge; Medhat Mahmoud; Yufeng Qian; Chen-Shan Chin; Adam M Phillippy; Michael C Schatz; Gene Myers; Mark A DePristo; Jue Ruan; Tobias Marschall; Fritz J Sedlazeck; Justin M Zook; Heng Li; Sergey Koren; Andrew Carroll; David R Rank; Michael W Hunkapiller
Journal:  Nat Biotechnol       Date:  2019-08-12       Impact factor: 54.908

7.  A General Model to Explain Repeated Turnovers of Sex Determination in the Salicaceae.

Authors:  Wenlu Yang; Deyan Wang; Yiling Li; Zhiyang Zhang; Shaofei Tong; Mengmeng Li; Xu Zhang; Lei Zhang; Liwen Ren; Xinzhi Ma; Ran Zhou; Brian J Sanderson; Ken Keefover-Ring; Tongming Yin; Lawrence B Smart; Jianquan Liu; Stephen P DiFazio; Matthew Olson; Tao Ma
Journal:  Mol Biol Evol       Date:  2021-03-09       Impact factor: 16.240

8.  Gene survival and death on the human Y chromosome.

Authors:  Melissa A Wilson Sayres; Kateryna D Makova
Journal:  Mol Biol Evol       Date:  2012-12-04       Impact factor: 16.240

9.  Primary sex determination in the nematode C. elegans.

Authors:  J Hodgkin
Journal:  Development       Date:  1987       Impact factor: 6.868

10.  The red bayberry genome and genetic basis of sex determination.

Authors:  Hui-Min Jia; Hui-Juan Jia; Qing-Le Cai; Yan Wang; Hai-Bo Zhao; Wei-Fei Yang; Guo-Yun Wang; Ying-Hui Li; Dong-Liang Zhan; Yu-Tong Shen; Qing-Feng Niu; Le Chang; Jie Qiu; Lan Zhao; Han-Bing Xie; Wan-Yi Fu; Jing Jin; Xiong-Wei Li; Yun Jiao; Chao-Chao Zhou; Ting Tu; Chun-Yan Chai; Jin-Long Gao; Long-Jiang Fan; Eric van de Weg; Jun-Yi Wang; Zhong-Shan Gao
Journal:  Plant Biotechnol J       Date:  2018-08-10       Impact factor: 9.803

