Scalable clustering algorithms for continuous environmental flow cytometry.

Jeremy Hyrkas, Sophie Clayton, Francois Ribalet, Daniel Halperin, E Virginia Armbrust, Bill Howe.

Abstract

MOTIVATION: Recent technological innovations in flow cytometry now allow oceanographers to collect high-frequency flow cytometry data from particles in aquatic environments on a scale far surpassing conventional flow cytometers. The SeaFlow cytometer continuously profiles microbial phytoplankton populations across thousands of kilometers of the surface ocean. The data streams produced by instruments such as SeaFlow challenge the traditional sample-by-sample approach in cytometric analysis and highlight the need for scalable clustering algorithms to extract population information from these large-scale, high-frequency flow cytometers.
RESULTS: We explore how available algorithms commonly used for medical applications perform at classifying such large-scale environmental flow cytometry data. We apply large-scale Gaussian mixture models to massive datasets using Hadoop. This approach outperforms current state-of-the-art cytometry classification algorithms in accuracy and can be coupled with manual or automatic partitioning of data into homogeneous sections for further classification gains. We propose the Gaussian mixture model with partitioning approach for classification of large-scale, high-frequency flow cytometry data.
AVAILABILITY AND IMPLEMENTATION: Source code available for download at https://github.com/jhyrkas/seaflow_cluster, implemented in Java for use with Hadoop.
CONTACT: hyrkas@cs.washington.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
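
The idea summarized under RESULTS, fitting a separate Gaussian mixture model to each homogeneous section of the data, can be illustrated with a minimal sketch. This is not the authors' Java/Hadoop implementation. It assumes one-dimensional measurements, a plain EM fit, and fixed-length time windows standing in for the manual or automatic partitioning. All function names (`fit_gmm_1d`, `partition_by_window`, `fit_partitioned`, `classify`) and the synthetic data are hypothetical.

```python
import math
import random


def fit_gmm_1d(xs, k=2, iters=50):
    """Fit a k-component 1-D Gaussian mixture to xs with plain EM."""
    xs = sorted(xs)
    n = len(xs)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [xs[int((i + 0.5) * n / k)] for i in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    variances = [max(var_all, 1e-6)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)
    return weights, means, variances


def classify(x, model):
    """Return the index of the most probable component for x."""
    weights, means, variances = model
    scores = [
        math.log(max(w, 1e-300)) - 0.5 * math.log(v) - (x - m) ** 2 / (2 * v)
        for w, m, v in zip(weights, means, variances)
    ]
    return scores.index(max(scores))


def partition_by_window(events, window):
    """Group (time, value) events into consecutive fixed-length time windows."""
    parts = {}
    for t, x in events:
        parts.setdefault(int(t // window), []).append(x)
    return parts


def fit_partitioned(events, window, k=2):
    """Fit one mixture model per partition (assumed stand-in for partitioning)."""
    parts = partition_by_window(events, window)
    return {key: fit_gmm_1d(xs, k) for key, xs in parts.items()}


# Synthetic stream: two populations whose positions shift in the second section.
random.seed(0)
events = []
for t in range(400):
    shift = 0.0 if t < 200 else 5.0
    centre = shift + (0.0 if t % 2 == 0 else 10.0)
    events.append((t, random.gauss(centre, 1.0)))

models = fit_partitioned(events, window=200)
for key in sorted(models):
    print(key, [round(m, 2) for m in models[key][1]])
```

In this toy stream a value near 5 falls midway between the two populations of the first section but sits at the centre of the lower population in the second, which is the kind of shift a per-section model can absorb and a single global model cannot.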

Year:  2015        PMID: 26476780     DOI: 10.1093/bioinformatics/btv594

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


