MatScIE: An automated tool for the generation of databases of methods and parameters used in the computational materials science literature

by Souradip Guha, et al.

The number of articles published in materials science grows rapidly every year. This largely unstructured body of text contains a great deal of information, but its reusability is limited because the details needed to carry out further calculations must be extracted manually. Obtaining valid, contextually correct information from online or offline documents is important, since it can be used not only to generate inputs for further calculations but also to populate a querying framework. With this context in mind, we have developed an automated tool, MatScIE (Material Science Information Extractor), that extracts relevant information from the materials science literature and builds a structured database that is much easier to use for materials simulations. Specifically, we extract the material details, methods, code, parameters, and structure from research articles. Finally, we created a web application where users can upload published articles, view and download the information extracted by the tool, and create their own databases for personal use.
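As a rough illustration of what one entry in such a structured database might look like, the sketch below defines a record with one field per category named in the abstract (material, method, code, parameter, structure) and a simple query over a list of records. The class name, field names, helper function, and sample values are all hypothetical and invented for illustration; they are not taken from MatScIE itself.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ExtractedRecord:
    """Hypothetical record: one field per category listed in the abstract."""
    source: str                                   # article the entries came from
    material: List[str] = field(default_factory=list)
    method: List[str] = field(default_factory=list)
    code: List[str] = field(default_factory=list)
    parameter: Dict[str, str] = field(default_factory=dict)
    structure: List[str] = field(default_factory=list)


def find_by_method(records: List[ExtractedRecord], method: str) -> List[ExtractedRecord]:
    """Return records that mention the given method (case-insensitive match)."""
    wanted = method.lower()
    return [r for r in records if any(m.lower() == wanted for m in r.method)]


# Made-up sample entries, used only to exercise the sketch.
records = [
    ExtractedRecord(
        source="paper_A.pdf",
        material=["TiO2"],
        method=["DFT"],
        code=["VASP"],
        parameter={"energy cutoff": "500 eV"},
        structure=["rutile"],
    ),
    ExtractedRecord(
        source="paper_B.pdf",
        material=["Si"],
        method=["molecular dynamics"],
        code=["LAMMPS"],
        parameter={"timestep": "1 fs"},
        structure=["diamond cubic"],
    ),
]

hits = find_by_method(records, "dft")
as_json = json.dumps([asdict(r) for r in records], indent=2)  # downloadable form
```

Keeping parameters as name/value strings avoids committing to units or numeric types, which a real extraction tool would need to handle more carefully.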


A general-purpose material property data extraction pipeline from large polymer corpora using Natural Language Processing

The ever-increasing number of materials science articles makes it hard t...

Material Named Entity Recognition (MNER) for Knowledge-driven Materials Using Deep Learning Approach

The scientific literature contains a wealth of cutting-edge knowledge in...

Referencing Sources of Molecular Spectroscopic Data in the Era of Data Science: Application to the HITRAN and AMBDAS Databases

The application described has been designed to create bibliographic entr...

Analyzing Research Trends in Inorganic Materials Literature Using NLP

In the field of inorganic materials science, there is a growing demand t...

PreprintResolver: Improving Citation Quality by Resolving Published Versions of ArXiv Preprints using Literature Databases

The growing impact of preprint servers enables the rapid sharing of time...

PubSqueezer: A Text-Mining Web Tool to Transform Unstructured Documents into Structured Data

The amount of scientific papers published every day is daunting and cons...

A Web Ecosystem of Databases, Software and Tools (07/20/2022)

To enable materials databases supporting computational and experimental ...
