How to navigate the code
NOMAD is a complex project with many parts. This guide gives a rough overview of the codebase and suggests what to look at first.
There is one main NOMAD project
(and its fork on GitHub). This project contains
all the framework and infrastructure code. It triggers all checks, builds, and
deployments for the public NOMAD service, the NOMAD Oasis, and the
package. All contributions to NOMAD have to go through this project eventually.
All (Git) projects that NOMAD depends on are either a Git submodule (you find
them all in the
dependencies directory or its subdirectories) or they are
listed as PyPI packages in the
pyproject.toml of the main project (or one of its
You can also have a look at the list of parsers and built-in plugins that constitute the majority of these projects. The only other projects are MatID, DOS fingerprints, and the NOMAD Remote Tools Hub.
The GitLab organization nomad-lab and the GitHub organizations for FAIRmat and the NOMAD CoE all represent larger infrastructure and research projects, and they include many other Git projects that are not related. When navigating the codebase, only follow the submodules.
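Since the submodules are the authoritative list of related Git projects, it can help to read them straight from the `.gitmodules` file at the root of a checkout. The following is a minimal sketch using only the Python standard library; the function name `list_submodules` and the sample paths and URLs are made up for illustration and do not come from the NOMAD repository.

```python
import configparser


def list_submodules(gitmodules_text: str) -> list[tuple[str, str]]:
    """Return sorted (path, url) pairs for every submodule in a .gitmodules file."""
    # Git indents keys with tabs; strip each line so configparser does not
    # mistake an indented key for a continuation of the previous value.
    cleaned = "\n".join(line.strip() for line in gitmodules_text.splitlines())

    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cleaned)

    submodules = []
    for section in parser.sections():
        # Section names look like: submodule "dependencies/parsers/example"
        if section.startswith("submodule "):
            submodules.append((parser[section]["path"], parser[section]["url"]))
    return sorted(submodules)


# Made-up sample content (assumption); a real checkout has its own .gitmodules.
SAMPLE = """
[submodule "dependencies/parsers/example"]
\tpath = dependencies/parsers/example
\turl = https://example.org/example-parser.git
[submodule "dependencies/other-example"]
\tpath = dependencies/other-example
\turl = https://example.org/other-example.git
"""

for path, url in list_submodules(SAMPLE):
    print(f"{path} -> {url}")
```

To run this against a real checkout, pass the contents of the `.gitmodules` file instead of `SAMPLE`, for example via `Path(".gitmodules").read_text()`.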
There are three main directories with Python code:
nomad: The actual NOMAD code. It is structured into more subdirectories and modules.
tests: Tests (pytest) for the NOMAD code. It follows the same module structure, but Python files are prefixed with
examples: A few small Python scripts that might be linked in the documentation.
The nomad directory contains the following "main" modules. This list is not exhaustive
but should help you to navigate the codebase:
archive: Functionality to store and access archive files. This is the storage format for all processed data in NOMAD. See also the docs on structured data.
cli: The command line interface (based on Click). Subcommands are structured into submodules.
config: NOMAD is configured through the nomad.yaml file. This module contains all the (Pydantic) models and default config parameters.
datamodel: The built-in schemas (e.g. nomad.datamodel.metainfo.simulation, used by all the theory parsers), the base sections, and the section for the shared entry structure. See also the docs on the datamodel and processing.
metainfo: The Metainfo system, i.e. the schema language that NOMAD uses.
normalizing: All the normalizers. See also the docs on processing.
parsing: The base classes for parsers, matching functionality, parser initialization, and some fundamental parsers like the archive parser. See also the docs on processing.
units: The unit and unit conversion system based on Pint.
utils: Utility modules, e.g. the structured logging system (structlog), id generation, and hashes.
files.py: Functionality to maintain the files of staged and published uploads. This is the interface to the file system.
search.py: The interface to Elasticsearch.
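To compare the list above with an actual checkout, a short script can enumerate the direct subpackages and single-file modules of a package directory. This is only a sketch: the helper name `top_level_modules` is hypothetical, it assumes the script is run from the repository root, and it only counts a directory as a subpackage if it contains an `__init__.py`.

```python
from pathlib import Path


def top_level_modules(package_dir: Path) -> list[str]:
    """List the direct subpackages and modules of a Python package directory."""
    names = []
    for entry in package_dir.iterdir():
        if entry.is_dir() and (entry / "__init__.py").exists():
            names.append(entry.name + "/")  # subpackage, e.g. "cli/"
        elif entry.suffix == ".py" and entry.name != "__init__.py":
            names.append(entry.name)  # single-file module, e.g. "search.py"
    return sorted(names)


if __name__ == "__main__":
    # Assumption: the current working directory is the root of a NOMAD checkout.
    package = Path("nomad")
    if package.is_dir():
        for name in top_level_modules(package):
            print(name)
```

Subpackages are printed with a trailing slash so they are easy to tell apart from single-file modules such as files.py and search.py.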
The NOMAD UI is written as a React single-page application (SPA). It
uses (among many other libraries) MUI,
Plotly, and D3. The GUI code is
maintained in the
gui directory. Most relevant code can be found in
gui/src/components. The application entry point is
The documentation is based on MkDocs. The important files and directories are:
docs: Contains all the Markdown files that contribute to the documentation system.
mkdocs.yml: The index and configuration of the documentation. New files have to be added here as well.
nomad/mkdocs.py: Python code that defines macros which can be used in Markdown.
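As an illustration of how a Python-defined macro ends up in Markdown, here is a minimal sketch that assumes the macros are registered through the mkdocs-macros plugin, which calls a `define_env(env)` hook in the configured module. The macro `module_link` and the stand-in class `_FakeEnv` are hypothetical and are not taken from nomad/mkdocs.py.

```python
def define_env(env):
    """Hook that mkdocs-macros calls once so a module can register its macros."""

    @env.macro
    def module_link(dotted_name: str) -> str:
        # Hypothetical macro: render a dotted module name as a Markdown code span.
        return f"`{dotted_name}`"


class _FakeEnv:
    """Tiny stand-in for the plugin environment, used only to try the macro locally."""

    def __init__(self):
        self.macros = {}

    def macro(self, func):
        # Register the function under its own name, as a decorator would.
        self.macros[func.__name__] = func
        return func


env = _FakeEnv()
define_env(env)
print(env.macros["module_link"]("nomad.search"))
```

Under that assumption, a Markdown file would call the macro with Jinja syntax, for example `{{ module_link("nomad.search") }}`.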
Other top-level directories
dependencies: Contains all the submodules, e.g. the parsers.
ops: Contains artifacts to run NOMAD components, e.g. docker-compose.yaml files and our Kubernetes Helm chart.
scripts: Contains scripts used during the build or for certain development tasks.