MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization.

Kazutaka Katoh, John Rozewicki, Kazunori D Yamada.

Abstract

This article describes several features in the MAFFT online service for multiple sequence alignment (MSA). As a result of recent advances in sequencing technologies, huge numbers of biological sequences are available and the need for MSAs with large numbers of sequences is increasing. To extract biologically relevant information from such data, sophistication of algorithms is necessary but not sufficient. Intuitive and interactive tools for experimental biologists to semiautomatically handle large data are becoming important. We are working on development of MAFFT toward these two directions. Here, we explain (i) the Web interface for recently developed options for large data and (ii) interactive usage to refine sequence data sets and MSAs.

Keywords: multiple sequence alignment; phylogenetic tree; sequence analysis

Year: 2019 PMID: 28968734 PMCID: PMC6781576 DOI: 10.1093/bib/bbx108

Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622

Multiple sequence alignment (MSA) is an important step in comparative analyses of biological sequences. We provide an online service for computing MSAs on the Web using MAFFT [1, 2]. MAFFT has several different options for computing large MSAs consisting of thousands of sequences. Our service also has additional functions (interactive sequence selection and phylogenetic inference) for preprocessing and postprocessing MSAs. Moreover, these steps can be repeated in a cycle as necessary. Here, we describe the usage of these functions, including recently added ones, and give several tips for using our online service.

MSA of large data

The demand for MSAs with a large number of sequences is increasing along with the advance of sequencing technologies. The default option of MAFFT, FFT-NS-2, is applicable to most cases, but MAFFT has more options for constructing large MSAs. They can be selected in a designated page for large alignment on the MAFFT server: http://mafft.cbrc.jp/alignment/server/large.html. Below, we briefly explain the options available on this page. Headings (A)–(G) correspond to those in Figure 1. Benchmark results of these options are shown in Table 1. Commands for locally running those options are available in the last section.

Figure 1

Screenshot of input page for large MSAs in MAFFT online service. (A–G) are explained in the main text.

Table 1

Results of two different benchmarks, ContTest (136 entries, 1467–43 912 sequences) [3] and HomFam (89 entries; 93–93 681 sequences) [4], for some MAFFT options available on our online server

Heading	Method	ContTest accuracy score	ContTest CPU time (minutes)	HomFam accuracy score (SP/TC)	HomFam CPU time (minutes)
A	PartTree (partsize = 50)	0.4103	61	0.7862/0.5658	47
	PartTree (partsize = 1000)	0.4364	140	0.8258/0.6377	94
	DPPartTree (partsize = 50)	0.4424	210	0.8413/0.6597	160
	DPPartTree (partsize = 1000)	0.4632	1000	0.8541/0.6934	820
B	FFT-NS-1	0.4856	170	0.8491/0.6669	160
B+C	FFT-NS-1 (memsavetree)	0.4835	280	0.8416/0.6667	260
D	FFT-NS-2	0.4998	500	0.8759/0.7162	460
D+C	FFT-NS-2 (memsavetree)	0.5099	1100	0.8611/0.7023	990
E	mafft-sparsecore (p = 100)	0.5153	730	0.8821/0.7274	650
	mafft-sparsecore (p = 500)	0.5361	1200	0.8970/0.7586	1300
	mafft-sparsecore (p = 1000)	0.5440	3400	0.9075/0.7810	4400
E+C	mafft-sparsecore (p = 100, memsavetree)	0.5298	1500	0.8845/0.7416	1300
	mafft-sparsecore (p = 500, memsavetree)	0.5438	2000	0.8995/0.7638	2000
	mafft-sparsecore (p = 1000, memsavetree)	0.5428	4200	0.9052/0.7826	5000
F	G-INS-1	0.5696	55 000	0.9306/0.8288	49 000
G	Randomchain	0.5425	100	0.8349/0.6681	88

Note: The sum-of-pairs (SP) and total-column (TC) scores for HomFam were calculated by the FastSP program [5]. (A–G) correspond to the techniques explained in the main text. Command-line arguments are displayed after performing the calculation on the online service and also listed in the main text. Random numbers are used in (A), (E) and (G). In this test, only one set of random numbers was used for each method. For (E) and (G), seed of random numbers can be specified in the download version (see the last section in the main text) but cannot be specified in the online version. See https://mafft.sb.ecei.tohoku.ac.jp/ for detailed results.

A. PartTree and DPPartTree (Figure 1A) [6] are highly approximate options. These methods recursively cluster sequences and simultaneously compute a distance between the clusters, each of which is represented by a single sequence. The order of the computational time is O(N log N), where N is the number of sequences. They are fast and applicable to large MSAs, but accuracy is sacrificed because of the approximation in the guide tree calculation (Table 1). The PartTree and DPPartTree options share a basic design, but the former uses a k-mer-based distance to estimate the similarity between sequences [7], while the latter uses dynamic programming (DP) [8]. Accordingly, the latter is slower but more accurate. In the command-line version, the balance between accuracy and speed can also be adjusted by a parameter, partsize, but this parameter is fixed to 1000 in the online service.

B. FFT-NS-1 (Figure 1B): This is another approximate method. Its accuracy is higher than that of PartTree and DPPartTree in benchmark tests (Table 1).
The input sequences are progressively aligned using a guide tree [6, 9, 10]. For constructing the guide tree, pairwise distances are computed based on the number of shared k-mers. The length of the k-mer is 6 for both protein and nucleotide data, but the 20 amino acids are grouped into six physicochemical groups [11], so that an amino acid sequence is converted to a sequence composed of six letters. The current version of MAFFT computes the distance D between sequences i and j from S, the alignment score between sequences i and j, and a term f(x, y), where x and y are the lengths of the longer and the shorter of sequences i and j, respectively. f(x, y) adjusts the distance to avoid a case where the distance between unrelated sequences happens to become zero when a long sequence and a short sequence are compared. Its parameters a, b and c are empirically determined: a = 0.1, b = 10 000 (nucleotide) or 2500 (amino acid) and c = 0.01. As D is computed for all sequence pairs, the computational time is proportional to N^2, where N is the number of sequences. The space complexity is also O(N^2) by default.

To build a guide tree from the distances, MAFFT uses a UPGMA-like method with a small modification [12]. When clusters L and R are merged into a new cluster P, the distance D from P to a third cluster C is calculated with a formula that depends on a parameter s (0 ≤ s ≤ 1). The resulting tree becomes more imbalanced [13] with smaller values of s. The default s value has been unchanged from 0.1 since the initial release in 2002, but can be specified with the --mixedlinkage flag in the download version.

C. To compute a guide tree with less RAM, a low-memory mode is available but not enabled by default (Figure 1C). If a calculation in the online service requires more RAM than a threshold, the calculation is terminated and an error message is returned instructing the user to select the low-memory mode. In this mode, instead of storing a full distance matrix in RAM, distances are calculated twice during the tree-building step.
Accordingly, the calculation time is longer than in the normal mode.

D. FFT-NS-2 (Figure 1D): This is the default option of MAFFT. In this method, after performing FFT-NS-1, a new distance matrix and guide tree are recalculated based on the MSA, and then the final MSA is built using the new guide tree. In benchmark tests, the accuracy is generally improved by the recalculation of the guide tree, as shown in Table 1. This method is at least two times slower than FFT-NS-1. The low-memory mode (Figure 1C) is also available for FFT-NS-2.

E. mafft-sparsecore (Figure 1E) [12] is a combination of the iterative refinement method [14–16] and the progressive method. It aims to improve the alignment accuracy by partly applying the iterative refinement method, which is known to be more accurate than the progressive method. The procedure was described in Yamada et al. (2016) [12]: (i) The input sequences are sorted by length. From the upper n% of the sorted sequences, p sequences are randomly selected as 'core' sequences. The default values of n and p are 50 and 500, respectively. (ii) An MSA of the p core sequences is constructed by an iterative refinement option, G-INS-i. (iii) The remaining sequences are added to the core MSA using the --add option [17], which uses the progressive alignment method. The accuracy and speed are controlled by the parameter p. With larger p, the accuracy is improved, but the computational cost becomes higher (Table 1), as more sequences are subjected to the iterative refinement calculation. The memory usage is mainly determined by the progressive alignment stage (iii). The low-memory mode (Figure 1C) is also available for mafft-sparsecore.

F. G-INS-1 (Figure 1F): This gives more accurate MSAs [12, 18] but takes a longer computational time and requires more RAM than the other methods. This method uses an accurate guide tree based on all-to-all DP calculation and a scoring function similar to COFFEE [19] in progressive alignment.
We are developing a memory-efficient version of G-INS-1, which runs in parallel on distributed memory systems or shared memory systems (manuscript in preparation). This option is experimentally supported at http://mafft.cbrc.jp/alignment/server/large-lsf.html.

G. Pileup (Figure 1G): This is the simplest strategy. The first and second sequences are aligned first. Then, the other sequences are added to the alignment in the order in which they appear in the input file. Random chain: This is similar to Pileup, but the order of sequences is randomized. The usefulness of this strategy is controversial [3, 12, 13, 20, 21]. However, because these methods have an advantage in computational simplicity, we have made them available in our service.
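Step (i) of the mafft-sparsecore procedure described in (E) can be sketched in a few lines of Python. This is only an illustration, not the code of mafft-sparsecore.rb: the function name select_core and the dictionary input are hypothetical, and reading "upper n%" as "the longest n%" is an assumption. The defaults n = 50 and p = 500 are those stated in the text.

```python
import random


def select_core(seqs, p=500, n=50, seed=0):
    """Pick 'core' sequences as in step (i): sort by length, sample from the top n%.

    seqs maps a sequence name to its unaligned sequence (hypothetical input format).
    Returns (core_names, remaining_names).
    """
    rng = random.Random(seed)
    # Assumption: "upper" means longest, so sort from longest to shortest.
    ordered = sorted(seqs, key=lambda name: len(seqs[name]), reverse=True)
    # Candidate pool: the upper n% of the length-sorted list (at least one entry).
    pool_size = max(1, len(ordered) * n // 100)
    pool = ordered[:pool_size]
    # Randomly select p core sequences, or the whole pool if it is smaller than p.
    core = rng.sample(pool, min(p, len(pool)))
    core_set = set(core)
    rest = [name for name in ordered if name not in core_set]
    return core, rest
```

In the real pipeline, the core sequences would then be aligned with G-INS-i (step ii) and the remaining ones added with --add (step iii); the sketch stops before those steps.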

Selecting suitable strategies

To select suitable MAFFT options for specific problems, consider the following factors. For aligning a small number of sequences, the iterative refinement method is known to effectively improve the accuracy, as noted above. However, for large-scale MSAs (the subject of this article), the effect of iterative refinement was recently assessed to be small. More specifically, in Figure 1 of Le et al. [18], the advantage of MAFFT-L-INS-i (an iterative refinement method) over MAFFT-L-INS-1 (a progressive method) was clearly observed for a small number of sequences but not for thousands of sequences. Moreover, a direct application of the iterative refinement method to large sequence data sets is difficult in terms of computational resources.

In benchmarks with ∼1000–100 000 sequences, G-INS-1 outperforms the other methods in accuracy, as shown in Table 1. The difference is statistically significant in several cases. Thus, this method is the first recommendation if computational resources allow. We are making an effort to decrease the computational resources required by this method. If it is difficult to apply G-INS-1, then the next candidate would be mafft-sparsecore, which uses the advantage of iterative refinement for small MSAs. These two methods can be applied to typical protein sequences with <10 000 sites, but cannot be applied to long DNA sequences. In such a case, FFT-NS-2 or FFT-NS-1 can be useful, as the computational time is proportional to L log L, where L is sequence length, because of the FFT approximation [1]. However, this applies only when the input sequences share global homology (from the 5′ end to the 3′ end) and the similarity level is high. MAFFT cannot handle data with genomic rearrangements, such as inversions and translocations. Also note that an MSA can be built only when the sequences are all homologous; it does not make sense to construct an MSA of nonhomologous sequences.
For a data set with many more than 100 000 sequences, PartTree and DPPartTree can be applied if the sequences are homologous, as their time complexity is O(N log N), where N is the number of sequences. However, there are also other popular programs, such as Clustal Omega [4] and UPP [22], for this purpose. The PartTree algorithm contributed to these programs theoretically and/or practically. Clustal Omega uses the mBed algorithm [23] to build a guide tree with a time complexity of O(N log N). UPP uses PASTA [24] to build a backbone MSA of a small number of sequences and then adds the remaining sequences using hmmalign [25] with a time complexity of O(N). PASTA uses MAFFT-PartTree to generate the initial MSA and MAFFT-L-INS-i (an iterative refinement option for small data) to generate sub-MSAs of closely related sequences. A performance comparison including these methods can be seen at https://mafft.sb.ecei.tohoku.ac.jp/, which also includes detailed benchmark results for subsets with different data sizes.

Similarity level and difference in sequence lengths should also be considered. If the sequences are highly similar to each other and their lengths are also similar, then fast methods, such as FFT-NS-1 or even Pileup, should result in a useful MSA. If the input data contain both fragmentary sequences and full-length sequences, then a two-step strategy sometimes works well: (i) align the full-length sequences first and then (ii) add the fragmentary sequences to the full-length MSA using the --addfragments option (see next section).
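The guidance in this section can be condensed into a small decision helper. The Python sketch below is a simplification under stated assumptions: the function name suggest_option and its arguments are hypothetical, the cutoffs (more than 100 000 sequences, 10 000 sites) are taken from the text but applied as hard thresholds, and the sequences are assumed to be homologous. It is not part of MAFFT.

```python
def suggest_option(n_sequences, max_length, ample_resources):
    """Return a MAFFT option name following the rough guidance in the text.

    n_sequences:     number of input sequences
    max_length:      length of the longest sequence (sites)
    ample_resources: True if a long, memory-hungry run is acceptable
    """
    # Very large data sets: only the O(N log N) guide-tree options scale.
    if n_sequences > 100_000:
        return "PartTree or DPPartTree"
    # Long sequences (assumption: 10 000 sites or more): rely on the FFT approximation.
    if max_length >= 10_000:
        return "FFT-NS-2 or FFT-NS-1"
    # Otherwise the most accurate option first, if resources allow.
    if ample_resources:
        return "G-INS-1"
    return "mafft-sparsecore"
```

For example, 5000 protein sequences of a few hundred residues on a well-provisioned machine would map to G-INS-1, and to mafft-sparsecore otherwise. Similarity level and the presence of fragments, which the text also asks the reader to consider, are deliberately left out of this sketch.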

Use of existing MSA

Each step of the calculation of mafft-sparsecore (Figure 1E) can be performed separately or manually. If a reliable MSA and a set of unaligned sequences are given to http://mafft.cbrc.jp/alignment/server/add_sequences.html, then an MSA of all the sequences is returned, in which the existing MSA is preserved. Several variants, --add, --addfull, --addfragments and --addlong, are available. They can be selected according to the length of the new sequences relative to the existing MSA, as illustrated in Figure 2. The four options work similarly to each other. However, with the --add option, the sequences are subjected to distance calculation with a time complexity of O(N^2), where N is the total number of sequences. In the other three options, distances between the sequences in the existing alignment are computed with a time complexity of O(M^2), where M is the number of sequences in the existing MSA, to build a tree of the M sequences using the UPGMA-like method (see above). For each of the (N−M) sequences to be added, distances to the M sequences are computed to locate the position of the sequence in the tree, followed by the building of an alignment of (M + 1) sequences. Then, a full MSA is built from the (N−M) MSAs. The latter strategy is useful when the new sequences do not overlap with each other (as in the case of fragmentary sequences) and when the phylogenetic relationship between the new sequences does not need to be considered. There are several other tools, such as hmmalign [25], PaPaRa [26] and PAGAN [27], for adding sequences to an existing MSA.
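To make the difference between the two strategies concrete, the following Python sketch counts pairwise distance computations. The text gives only the asymptotic orders; the exact counts here assume that each unordered pair is computed once, and the function names are hypothetical.

```python
def pairs_add(n_total):
    """All-against-all distances among N sequences, as with --add: O(N^2)."""
    return n_total * (n_total - 1) // 2


def pairs_other_variants(n_total, m_existing):
    """Distances for --addfull, --addfragments and --addlong as described in the text.

    M*(M-1)/2 pairs within the existing MSA, plus M distances for each of the
    (N - M) new sequences. Distances among the new sequences are never computed.
    """
    within_existing = m_existing * (m_existing - 1) // 2
    new_to_existing = (n_total - m_existing) * m_existing
    return within_existing + new_to_existing
```

With N = 10 000 and M = 1000, the first function gives 49 995 000 pairs and the second 9 499 500; the difference is exactly the 40 495 500 pairs among the 9000 new sequences, which is why the second strategy suits cases where relationships between new sequences can be ignored.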

Figure 2

Variants of --add option.

Note that the length of the resulting MSA can differ from that of the original MSA. This is because additional gaps are necessary when new sequences have insertions. All-gap sites, if any, in the original MSA are deleted. As such changes in length are not useful in some cases, we have implemented a new option, --keeplength, in which (1) insertions in the new sequences are deleted and (2) all-gap sites in the original MSA are reinserted, as shown at the right end of Figure 2. This option is selectable in the online version and is sometimes useful for mapping new sequences to a reference MSA.
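The effect of --keeplength can be illustrated on a toy alignment. The Python sketch below models only the two behaviours named above, (1) dropping insertion columns and (2) reinserting all-gap columns of the original MSA; it is not MAFFT's implementation. The function name keep_length, the list-of-strings representation and the gap character "-" are assumptions made for illustration.

```python
def keep_length(original, existing_rows, new_rows, gap="-"):
    """Map new sequences onto the column layout of the original MSA.

    original:      rows of the original MSA (may contain all-gap columns)
    existing_rows: the same sequences after the addition step
                   (all-gap columns removed, insertion columns added)
    new_rows:      rows of the newly added sequences in that same alignment
    """
    width = len(existing_rows[0])
    # (1) Columns where every existing row is a gap are insertions in new sequences.
    keep = [c for c in range(width)
            if any(row[c] != gap for row in existing_rows)]
    trimmed_new = ["".join(row[c] for c in keep) for row in new_rows]

    # (2) Walk the original columns and reinsert its all-gap columns.
    out = [[] for _ in new_rows]
    k = 0  # index into the trimmed columns
    for c in range(len(original[0])):
        all_gap = all(row[c] == gap for row in original)
        for i, row in enumerate(trimmed_new):
            out[i].append(gap if all_gap else row[k])
        if not all_gap:
            k += 1
    return list(original) + ["".join(chars) for chars in out]
```

For instance, with the original rows "A-CG" and "A-TG", the post-addition rows "AC-G" and "AT-G", and a new row "ACTG" whose T is an insertion, the function returns the two original rows followed by "A-CG", so the reference length of four columns is kept.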

Interactive sequence choice and visualization

We now have access to huge amounts of sequence data from widely divergent organisms, but the quality of the data is not always high because of the limitations of sequencing technologies. In the case of amino acid sequence data, the difficulty of eukaryotic gene prediction [28–30] also results in errors in the data. It might be possible to automatically exclude such problematic data in certain cases, but sometimes biologically important information is contained in low-quality sequences, especially when the interest is in nonmodel organisms. For such cases, it is necessary to choose sequences manually, but this is becoming difficult because of increasing data size. Therefore, an interactive tool to help this process is necessary. Our service has some functions for this purpose, as explained in Kuraku et al. [31]. Sequences can be selected/unselected one by one in the sequence selection window (Figure 3B). Moreover, a group of sequences in a single phylogenetic cluster can be selected or unselected in a tree viewer. If you click on a node in a tree (Figure 3A), the descendant sequences under the node are selected or unselected together in the list of sequences (Figure 3B). Automated tools for sequence selection, such as CD-HIT [32] and MaxAlign [33], can also be run on our service. The selected sequences are subjected to phylogenetic tree inference using the neighbor-joining method [34] or UPGMA [35] with several options, such as the distance measure and the number of bootstrap cycles (Figure 3C). Then, the data set can be further refined using the new tree. The maximum-likelihood method is not supported because of its high computational cost. It must be performed locally or using other online services.
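The clade-based selection described above amounts to collecting all leaves below a clicked node. A minimal Python sketch follows; the nested-tuple tree, the function names and the toggle rule (unselect only if every descendant is already selected) are assumptions for illustration and do not describe how the online service or Phylo.io is implemented.

```python
def leaves(node):
    """Return the leaf names under a node; a leaf is a string, an inner node a tuple."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node:
        result.extend(leaves(child))
    return result


def toggle_clade(selected, node):
    """Select or unselect all descendants of a clicked node together."""
    names = set(leaves(node))
    # Assumed rule: if the whole clade is already selected, unselect it.
    if names <= selected:
        return selected - names
    # Otherwise select every descendant.
    return selected | names
```

With the tree (("a", "b"), ("c", ("d", "e"))), clicking the second child while only "a" is selected adds "c", "d" and "e"; clicking it again removes them.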

Figure 3

Interactive sequence selection. A group of sequences in the guide tree (A) is selected at a time in the sequence selection window (B). Several options for tree estimation can be selected (C). The MSA can be visually checked using MSAViewer (D).

Two tree viewers, Phylo.io [36] and Archaeopteryx [37], are used for sequence selection and visualization of phylogenetic trees. Originally, we used the Archaeopteryx Java plugin, but modern browsers no longer support Java plugins for security reasons. Thus, we recently adopted Phylo.io, which is written in JavaScript and works with most modern browsers. With the addition of Phylo.io to our service, we have added some new features: (i) coloring of sequence titles corresponding to the databases in aLeaves [29]; (ii) interactive sequence selection (see above); and (iii) automatic rooting similar to mid-point rooting. The automatic rooting is just for visualization, without any biological basis. To estimate the position of the root, an outgroup or other additional information is necessary. A JavaScript version of Archaeopteryx is being developed (C. Zmasek, personal communication), and we are planning to use this viewer, too.

To visualize MSAs, two tools, Jalview [38] (as a Java plugin) and MSAViewer [39] (written in JavaScript; Figure 3D), are available on our service.

Necessity of large MSAs

The relationship between alignment accuracy and data size is still unclear. It is naively expected that the accuracy of an MSA improves with the number of input sequences. However, highly accurate methods cannot be applied to large data because of computational costs. Useful information related to this issue has recently been reported by Le et al. [18]. In their tests, the accuracy of downstream analysis (protein secondary structure prediction in this case) improves as the number of sequences increases for medium-scale data (<1000 sequences), but with more sequences, the accuracy reaches a plateau. Thus, there may be an optimal data size. Their test also suggested that the accuracy of the MSA itself peaks at a smaller number of sequences (around 200) and then decreases as the number of sequences increases. This observation is consistent with Sievers et al. [40]. Such optimal data sizes can differ for different problems. For example, in the case of prediction of contact residues based on co-evolution, larger MSAs are generally thought to be necessary [41, 42].

Command-line options

Each method also runs locally. In the current version (7.310; August 2017), the corresponding commands are as follows, where input, output, newSequences and existingAlignment stand for file names.

PartTree (Figure 1A)
    mafft --parttree --partsize 1000 input > output

DPPartTree (Figure 1A)
    mafft --dpparttree --partsize 1000 input > output

FFT-NS-1 (Figure 1B)
    mafft --retree 1 input > output
    mafft --retree 1 --memsavetree input > output (low-memory mode)
    mafft --retree 1 --thread -1 input > output (multithread mode)

With --thread -1, the number of physical cores is automatically counted and all cores are used. See http://mafft.cbrc.jp/alignment/software/multithreading.html for detailed information on multithreading.

FFT-NS-2 (Figure 1D)
    mafft input > output
    mafft --memsavetree input > output (low-memory mode)
    mafft --thread -1 input > output (multithread mode)

mafft-sparsecore (Figure 1E)
    mafft-sparsecore.rb -p p -n n -s s -i input > output
    mafft-sparsecore.rb -p p -n n -s s -A "--memsavetree" -i input > output (low-memory mode)
    mafft-sparsecore.rb -p p -n n -s s -A "--thread -1" -C "--thread -1" -i input > output (multithread mode)

p and n are as explained above, and s is the seed for random numbers. Flags for the iterative refinement stage and those for the progressive stage can be specified after -C and -A, respectively. See http://mafft.cbrc.jp/alignment/software/sparsecore.html for detailed information.

G-INS-1 (Figure 1F)
    mafft --globalpair input > output
    mafft --globalpair --thread -1 input > output (multithread mode)

Pileup (Figure 1G)
    mafft --pileup input > output

Random chain (Figure 1G)
    mafft --randomchain --randomseed s input > output

s is the seed for random numbers.

Adding new sequences to an MSA
    mafft --add newSequences existingAlignment > output
    mafft --addfull newSequences existingAlignment > output
    mafft --addlong newSequences existingAlignment > output
    mafft --addfragments newSequences existingAlignment > output

The --keeplength flag can be added to each command (see above). Add --thread -1 to enable multithreading.

MSA is an important step in phylogeny inference, functional prediction and many other analyses. The demand for MSAs with a large number of sequences is increasing. MAFFT has different options for computing large MSAs in both the local and online versions. The online version has additional features for preprocessing and postprocessing MSAs.
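When the local commands in this section are driven from a script, it can help to assemble the argument list programmatically. The Python sketch below only builds the list and does not run MAFFT. The function name mafft_command and the method keys are hypothetical; the flags are the ones listed in this section. Restricting --memsavetree to FFT-NS-1 and FFT-NS-2 mirrors where the section shows that flag for the mafft command and is a conservative assumption, not a documented limitation.

```python
def mafft_command(method, input_path, low_memory=False, threads=None):
    """Build an argument list for one of the mafft invocations listed in the text."""
    flags = {
        "parttree": ["--parttree", "--partsize", "1000"],
        "dpparttree": ["--dpparttree", "--partsize", "1000"],
        "fft-ns-1": ["--retree", "1"],
        "fft-ns-2": [],  # default option, no extra flag
        "g-ins-1": ["--globalpair"],
        "pileup": ["--pileup"],
    }
    if method not in flags:
        raise ValueError(f"unknown method: {method}")
    argv = ["mafft"] + flags[method]
    if low_memory:
        # The section shows --memsavetree with FFT-NS-1 and FFT-NS-2 only.
        if method not in ("fft-ns-1", "fft-ns-2"):
            raise ValueError("low-memory mode is shown only for FFT-NS-1/2")
        argv.append("--memsavetree")
    if threads is not None:
        # threads=-1 lets MAFFT count the physical cores itself.
        argv += ["--thread", str(threads)]
    argv.append(input_path)
    return argv
```

The returned list can be passed to subprocess.run with stdout redirected to the output file, which corresponds to the "> output" part of the commands in this section.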

40 in total

1. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform.

Authors: Kazutaka Katoh; Kazuharu Misawa; Kei-ichi Kuma; Takashi Miyata
Journal: Nucleic Acids Res Date: 2002-07-15 Impact factor: 16.971

2. Assessing the utility of coevolution-based residue-residue contact predictions in a sequence- and structure-rich era.

Authors: Hetunandan Kamisetty; Sergey Ovchinnikov; David Baker
Journal: Proc Natl Acad Sci U S A Date: 2013-09-05 Impact factor: 11.205

3. COFFEE: an objective function for multiple sequence alignments.

Authors: C Notredame; L Holm; D G Higgins
Journal: Bioinformatics Date: 1998-06 Impact factor: 6.937

4. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

Authors: Siavash Mirarab; Nam Nguyen; Sheng Guo; Li-San Wang; Junhyong Kim; Tandy Warnow
Journal: J Comput Biol Date: 2014-12-30 Impact factor: 1.479

5. CLUSTAL: a package for performing multiple sequence alignment on a microcomputer.

Authors: D G Higgins; P M Sharp
Journal: Gene Date: 1988-12-15 Impact factor: 3.688

6. MAFFT multiple sequence alignment software version 7: improvements in performance and usability.

Authors: Kazutaka Katoh; Daron M Standley
Journal: Mol Biol Evol Date: 2013-01-16 Impact factor: 16.240

7. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega.

Authors: Fabian Sievers; Andreas Wilm; David Dineen; Toby J Gibson; Kevin Karplus; Weizhong Li; Rodrigo Lopez; Hamish McWilliam; Michael Remmert; Johannes Söding; Julie D Thompson; Desmond G Higgins
Journal: Mol Syst Biol Date: 2011-10-11 Impact factor: 11.429

8. Ultra-large alignments using phylogeny-aware profiles.

Authors: Nam-Phuong D Nguyen; Siavash Mirarab; Keerthana Kumar; Tandy Warnow
Journal: Genome Biol Date: 2015-06-16 Impact factor: 13.583

9. Assessment and refinement of eukaryotic gene structure prediction with gene-structure-aware multiple protein sequence alignment.

Authors: Osamu Gotoh; Mariko Morita; David R Nelson
Journal: BMC Bioinformatics Date: 2014-06-14 Impact factor: 3.169

10. Systematic exploration of guide-tree topology effects for small protein alignments.

Authors: Fabian Sievers; Graham M Hughes; Desmond G Higgins
Journal: BMC Bioinformatics Date: 2014-10-04 Impact factor: 3.169
