NOMAD AI Toolkit — Discover patterns in materials science data

Features

What is the NOMAD Artificial-Intelligence Toolkit?

The NOMAD Artificial-Intelligence Toolkit is a web-based environment for analyzing materials-science data. It applies artificial-intelligence techniques such as machine learning, compressed sensing, and data mining to sort large volumes of materials data and to uncover correlations, structures, trends, and anomalies. Scientists and engineers can use it to identify materials suited to specific applications and to select promising candidates for future studies.

Analyze

The AI Toolkit lets you run custom analytics at large scale.

Learn

The AI Toolkit lets you learn state-of-the-art methods interactively.

Share

Share your results with the scientific community and connect with other researchers.

Getting started

Learning Center

The AI Toolkit helps scientists and engineers sort and analyze large materials data sets and uncover correlations, structures, trends, and anomalies. With it, researchers can make informed decisions about which materials suit specific applications and which new materials to explore first. As part of NOMAD, the AI Toolkit is free to use and builds on an established platform for scientific collaboration.

First Steps

Welcome to our Big-Data Analytics Tool! We're excited to show you all the possibilities available to you. Let's explore together!

Workflow

Dive into a web-based Artificial-Intelligence Toolkit. Get to know the methodology and techniques used in the AI Toolkit:

Jupyter Notebooks
Machine Learning Packages
Hosted by the Max Planck Computing and Data Facility

Jupyter Notebook

Learn how to access and modify files in NOMAD using JupyterLab. The AI Toolkit lets you apply machine learning by running predefined notebooks.

News

Insights: Articles and Exciting Findings

Welcome to our Insights Hub! Here, we gather the latest and most exciting information from conferences, articles, and discoveries. Our goal is to keep you informed and up-to-date as we explore the realms of scientific breakthroughs and technological innovations.

Chemical Society Reviews (2025)

From text to insight: large language models for chemical data extraction

Mara Schilling-Wilhelmi, Martiño Ríos-García, Sherjeel Shabih, María Victoria Gil, Santiago Miret, Christoph T. Koch, José A. Márquez and Kevin Maik Jablonka

The vast majority of chemical knowledge exists in unstructured natural language, yet structured data is crucial for innovative and systematic materials design. The advent of large language models (LLMs) represents a significant shift, potentially enabling non-experts to extract structured, actionable data from unstructured text efficiently. While applying LLMs to chemical and materials science data extraction presents unique challenges, domain knowledge offers opportunities to guide and validate LLM outputs. This tutorial review provides a comprehensive overview of LLM-based structured data extraction in chemistry, synthesizing current knowledge and outlining future directions. We address the lack of standardized guidelines and present frameworks for leveraging the synergy between LLMs and chemical expertise. Additionally, we created an online Jupyter book, matextract.pub, full of hands-on examples of the different steps in the extraction workflow using LLMs.

Follow this tutorial to run the notebooks in Jupyter4NFDI.

npj Computational Materials volume 8 (2022)

The NOMAD Artificial-Intelligence Toolkit: turning materials-science data into knowledge and understanding

Luigi Sbailò, Ádám Fekete, Luca M. Ghiringhelli & Matthias Scheffler

We present the Novel-Materials-Discovery (NOMAD) Artificial-Intelligence (AI) Toolkit, a web-browser-based infrastructure for the interactive AI-based analysis of materials-science findable, accessible, interoperable, and reusable (FAIR) data. The AI Toolkit readily operates on the FAIR data stored in the central server of the NOMAD Archive, the largest database of materials-science data worldwide, as well as locally stored, users' owned data. The NOMAD Oasis, a local, stand-alone server can be also used to run the AI Toolkit. By using Jupyter notebooks that run in a web-browser, the NOMAD data can be queried and accessed; data mining, machine learning, and other AI techniques can be then applied to analyze them. This infrastructure brings the concept of reproducibility in materials science to the next level, by allowing researchers to share not only the data contributing to their scientific publications, but also all the developed methods and analytics tools. Besides reproducing published results, users of the NOMAD AI toolkit can modify the Jupyter notebooks toward their own research work.
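The query-then-analyze workflow described in this abstract can be sketched in a few lines of Python, as it might appear in a notebook cell. This is a minimal illustration, not the Toolkit's own code: the base URL, the `entries/query` endpoint, the field name `results.material.elements`, and the `data` key in the response are assumptions that should be checked against the current NOMAD API documentation, and the helper names `build_query`, `fetch_entries`, and `count_formulas` are hypothetical.

```python
import json
import urllib.request
from collections import Counter

# Assumption: base URL of the public NOMAD API; verify against the NOMAD docs.
NOMAD_API = "https://nomad-lab.eu/prod/v1/api/v1"


def build_query(elements, page_size=10):
    """Build a request body asking for entries that contain all given elements."""
    return {
        # Assumption: 'results.material.elements' is the searchable field name.
        "query": {"results.material.elements": {"all": sorted(elements)}},
        "pagination": {"page_size": page_size},
        "required": {
            "include": ["entry_id", "results.material.chemical_formula_reduced"]
        },
    }


def fetch_entries(body, base_url=NOMAD_API, timeout=30):
    """POST the query and return the list of entries (needs network access)."""
    request = urllib.request.Request(
        f"{base_url}/entries/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # Assumption: matching entries are returned under the 'data' key.
    return payload.get("data", [])


def count_formulas(entries):
    """Count how often each reduced formula occurs, a first step before any ML."""
    formulas = (
        entry.get("results", {}).get("material", {}).get("chemical_formula_reduced")
        for entry in entries
    )
    return Counter(f for f in formulas if f is not None)
```

In a notebook with network access, `count_formulas(fetch_entries(build_query(["Ti", "O"])))` would tally the reduced formulas among the returned entries, provided the assumptions above hold. Keeping the request body in its own function makes the query easy to inspect and share alongside the analysis.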

Nature Reviews Physics volume 3, 724 (2021)

An AI-toolkit to develop and share research into new materials

Luca M. Ghiringhelli

Probably the biggest challenge in materials science is the discovery or design of new materials that exhibit exceptional performance for a desired function, or uncovering new properties of known materials. AI methods can be used to identify patterns and trends from big data to these ends. In materials science, these big data are a complex, hierarchical structure of experimental measures and theoretical estimates. Since 2014, the Novel Materials Discovery (NOMAD) Laboratory has established a materials data infrastructure, based on a large repository of materials data, and provides AI tools and training for researchers to freely access this resource, in compliance with the FAIR principles — that data should be findable, accessible, interoperable and reusable (or recyclable).