Language trees and zipping.

Dario Benedetto, Emanuele Caglioti, Vittorio Loreto.

Abstract

In this Letter we present a very general method for extracting information from a generic string of characters, e.g., a text, a DNA sequence, or a time series. Based on data-compression techniques, its key point is the computation of a suitable measure of the remoteness of two bodies of knowledge. We present the application of the method to linguistically motivated problems, featuring highly accurate results for language recognition, authorship attribution, and language classification.
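
The abstract does not state the formula behind its remoteness measure, so the following is only a minimal sketch of the general idea of comparing two strings through a compressor. It uses the normalized compression distance, a different but related compression-based measure, and not necessarily the one defined in the Letter. The function names and the two sample texts are illustrative assumptions, and zlib stands in for whatever compressor the authors used.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of the zlib-compressed form of data."""
    return len(zlib.compress(data, 9))


def compression_distance(a: bytes, b: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Assumption: this is a generic stand-in, not the measure defined in
    the Letter. Values near 0 mean the compressor finds the two inputs
    very similar; values near 1 mean they share little structure.
    """
    size_a = compressed_size(a)
    size_b = compressed_size(b)
    size_ab = compressed_size(a + b)
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)


# Hypothetical sample texts, chosen only for illustration.
english = (
    "Data compression removes redundancy from a sequence of symbols. "
    "When two texts share vocabulary and structure, compressing them "
    "together costs little more than compressing one of them alone."
).encode("utf-8")

italian = (
    "Nel mezzo del cammin di nostra vita mi ritrovai per una selva "
    "oscura, che la diritta via era smarrita. Ahi quanto a dir qual "
    "era e cosa dura esta selva selvaggia e aspra e forte."
).encode("utf-8")

if __name__ == "__main__":
    print("english vs english:", compression_distance(english, english))
    print("english vs italian:", compression_distance(english, italian))
```

With inputs this short the values are only indicative: a text compared with itself should score well below a comparison across languages, but reliable recognition or attribution would need much longer samples.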

Year:  2002        PMID: 11801178     DOI: 10.1103/PhysRevLett.88.048702

Source DB:  PubMed          Journal:  Phys Rev Lett        ISSN: 0031-9007            Impact factor:   9.161


  11 in total

Review 1.  Diversity, competition, extinction: the ecophysics of language change.

Authors:  Ricard V Solé; Bernat Corominas-Murtra; Jordi Fortuny
Journal:  J R Soc Interface       Date:  2010-06-30       Impact factor: 4.118

2.  Duplicate publication and 'paper inflation' in the Fractals literature.

Authors:  R N Kostoff; D Johnson; J A Del Rio; L A Bloomfield; M F Shlesinger; G Malpohl; H D Cortes
Journal:  Sci Eng Ethics       Date:  2006-07       Impact factor: 3.525

3.  Maximum entropy models capture melodic styles.

Authors:  Jason Sakellariou; Francesca Tria; Vittorio Loreto; Francois Pachet
Journal:  Sci Rep       Date:  2017-08-23       Impact factor: 4.379

Review 4.  Corpus-based typology: applications, challenges and some solutions.

Authors:  Natalia Levshina
Journal:  Linguist Typol       Date:  2021-03-30

5.  A grammar-based distance metric enables fast and accurate clustering of large sets of 16S sequences.

Authors:  David J Russell; Samuel F Way; Andrew K Benson; Khalid Sayood
Journal:  BMC Bioinformatics       Date:  2010-12-17       Impact factor: 3.169

6.  Connections of Grasping and Horizontal Hand Movements with Articulation in Czech Speakers.

Authors:  Mikko Tiainen; Jiří Lukavský; Kaisa Tiippana; Martti Vainio; Juraj Šimko; Fatima Felisberti; Lari Vainio
Journal:  Front Psychol       Date:  2017-04-05

7.  Generalising Ward's Method for Use with Manhattan Distances.

Authors:  Trudie Strauss; Michael Johan von Maltitz
Journal:  PLoS One       Date:  2017-01-13       Impact factor: 3.240

8.  Medical Image Retrieval Using Vector Quantization and Fuzzy S-tree.

Authors:  Jana Nowaková; Michal Prílepok; Václav Snášel
Journal:  J Med Syst       Date:  2016-12-15       Impact factor: 4.460

9.  An improved string composition method for sequence comparison.

Authors:  Guoqing Lu; Shunpu Zhang; Xiang Fang
Journal:  BMC Bioinformatics       Date:  2008-05-28       Impact factor: 3.169

10.  Comparison of next-generation sequencing samples using compression-based distances and its application to phylogenetic reconstruction.

Authors:  Ngoc Hieu Tran; Xin Chen
Journal:  BMC Res Notes       Date:  2014-05-29