Nastassja A Lewinski, Bridget T McInnes.
Abstract
The nanotechnology literature is growing exponentially as more engineered nanomaterials are created, characterized, and tested for performance and safety. Given this deluge of published data, natural language processing approaches are needed to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics.
Keywords: data mining; informatics; named entity recognition; nano-informatics; nanoparticles; nanotechnology; nanotoxicity; natural language processing; text mining
Year: 2015 PMID: 26199848 PMCID: PMC4505089 DOI: 10.3762/bjnano.6.149
Source DB: PubMed Journal: Beilstein J Nanotechnol ISSN: 2190-4286 Impact factor: 3.649
Nanoinformatic system components from an NLP perspective.
| Nano Porter | Nano Mapper | Tech Perceptor | Text Mining Framework | Nano Device F & C | Nano Toxicity Searcher | Nano Sifter | Clinical Trial Doc. Class. | NEI Miner | ||
| machine learning algorithm | CRF | × | × | |||||||
| decision trees | × | × | ||||||||
| logistic regression | × | |||||||||
| naive Bayes | × | |||||||||
| nearest neighbor | × | |||||||||
| SVM | × | × | × | |||||||
| algorithm class | machine learning | × | × | × | ||||||
| pattern matching | × | |||||||||
| clustering | × | × | × | × | ||||||
| visualization | visualization modules | × | × | × | × | × | ||||
| taxonomy | FMA (in UMLS) | × | ||||||||
| MeSH (in UMLS) | × | |||||||||
| WordNet | × | |||||||||
| NanoParticle Ontology | × | |||||||||
| NLP tools | GATE (NLP Toolkit) | × | ||||||||
| Xconc Suite (annotator) | × | |||||||||
| ABMiner (NLP Toolkit) | × | |||||||||
| Abner (NER) | × | |||||||||
| YamCha (Parser) | × | |||||||||
| GPoSSTTL (POS Tagger) | × | |||||||||
| ANNIE (GATE module) | × | |||||||||
| Mallet (NLP Toolkit) | × | |||||||||
| NLTK (NLP Toolkit) | × | |||||||||
| NLP sub task | POS tagging | × | × | × | × | |||||
| parsing | × | × | × | |||||||
| concept mapping | × | |||||||||
| stemming | × | |||||||||
| sentence similarity | × | |||||||||
| NLP task | document classification | × | ||||||||
| document clustering | × | |||||||||
| entity extraction | × | × | × | × | ||||||
| information retrieval | × | × | ||||||||
| patent analyzer | × | × | × | × | ||||||
| summarization | × | |||||||||
| topic identification | × | × | ||||||||
Nanoinformatic system components from a data perspective.
| Nano Porter | Nano Mapper | Tech Perceptor | Text Mining Framework | Nano Device F & C | Nano Toxicity Searcher | Nano Sifter | Clinical Trial Doc. Class. | NEI Miner | |||
| publication information | citation (e.g., author, journal, date) | × | × | × | × | × | × | ||||
| laboratory/organization | × | × | |||||||||
| location | × | × | × | ||||||||
| content description | × | × | × | ||||||||
| patent classification (e.g., US, EU) | × | × | × | ||||||||
| physico-chemical characterization | particle diameter | × | × | × | |||||||
| particle size distribution | × | ||||||||||
| hydrodynamic diameter | × | ||||||||||
| agglomeration and/or aggregation | × | ||||||||||
| shape | × | × | |||||||||
| core composition | × | × | × | × | |||||||
| crystallinity/crystalline state | × | × | |||||||||
| surface area | × | ||||||||||
| surface charge/zeta potential | × | × | × | ||||||||
| surface chemistry | × | × | |||||||||
| purity | × | × | |||||||||
| stability | |||||||||||
| solubility | |||||||||||
| concentration (mass, number, SA) | × | ||||||||||
| method of synthesis/preparation | × | × | × | ||||||||
| molecular weight | × | ||||||||||
| exposure | exposure media | × | |||||||||
| exposure pathway/route | × | × | |||||||||
| exposure duration | × | ||||||||||
| exposure dose | × | ||||||||||
| biological response | bioavailability/uptake | × | |||||||||
| biomagnification | × | ||||||||||
| cell viability | × | × | |||||||||
| cytotoxicity | × | × | × | ||||||||
| inflammatory response | × | ||||||||||
| genotoxicity | × | × | |||||||||
| EC50 (ppm) | × | ||||||||||
| IC50 | × | × | |||||||||
| LC50 (ppm) | × | ||||||||||
| organ response | × | ||||||||||
| whole organism response | × | × | |||||||||
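
To make the tabulated fields concrete, the following is a minimal sketch of how pattern matching could pull two of the listed physico-chemical properties (particle diameter and zeta potential) out of free text. It is not the method of any of the nine reviewed tools; the pattern set, the function name `extract_properties`, and the example sentence are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns for illustration only; not taken from any reviewed tool.
NUMBER = r"(-?\d+(?:\.\d+)?)"
PATTERNS = {
    "particle diameter": re.compile(
        r"diameter\s+of\s+" + NUMBER + r"\s*(nm|um)", re.IGNORECASE),
    "zeta potential": re.compile(
        r"zeta\s+potential\s+of\s+" + NUMBER + r"\s*(mV)", re.IGNORECASE),
}


def extract_properties(text):
    """Return a list of (property, value, unit) tuples found in text."""
    found = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((name, float(match.group(1)), match.group(2)))
    return found


# Made-up example sentence, not drawn from the reviewed literature.
sentence = ("The silver nanoparticles had a mean diameter of 20.5 nm "
            "and a zeta potential of -32 mV.")
print(extract_properties(sentence))
```

Hand-written patterns like these only catch phrasings they anticipate, which is presumably one reason several of the tools in the tables rely on machine learning classifiers instead.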