NREL MatDB Metadata Policies and Format
Metadata policies
- Every VASP run must be in its own directory. If the results of multiple VASP runs are in the same directory, the results may be omitted or garbled.
- Every VASP run directory to be uploaded must contain the files:
  - metadata
  - INCAR
  - KPOINTS
  - POSCAR
  - POTCAR
  - OUTCAR
  - vasprun.xml
- Optional files that will be archived if present are:
  - DOSCAR
Other files may be present in the directory, but they will be ignored.
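As a convenience before uploading, the file policy above can be checked with a few lines of Python. This is only a sketch: the function name `missing_required_files` and the constants are illustrative and are not part of any MatDB tool.

```python
from pathlib import Path

# File names copied from the policy list above.
REQUIRED_FILES = ("metadata", "INCAR", "KPOINTS", "POSCAR",
                  "POTCAR", "OUTCAR", "vasprun.xml")
OPTIONAL_FILES = ("DOSCAR",)  # archived if present, never required


def missing_required_files(run_dir):
    """Return the required file names absent from run_dir, in policy order."""
    run_path = Path(run_dir)
    return [name for name in REQUIRED_FILES
            if not (run_path / name).is_file()]
```

An empty list means the directory satisfies the required-file policy. Any other files in the directory are simply not examined, which matches the statement that they are ignored.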
Every VASP run directory to be uploaded must contain a file named metadata. The metadata file is described below.
Metadata format
The general format is:
:field: content content content ...
where the content may continue across lines until the start of the next field or comment. The :field: must start in the first column. Using the :field: format reduces the chance that a long pasted-in comment will collide with the field syntax.
Comments
Lines with a hash character (#) in the first column are treated as comments and are not captured for processing. A # anywhere other than the first column does not start a comment.
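The format rules above (fields starting in the first column, content continuing across lines, first-column # comments) can be illustrated with a small parser. This is a minimal sketch and not the actual MatDB parser: the name `parse_metadata` is hypothetical, and it assumes that a comment line ends the current field and that a repeated field overwrites the earlier value.

```python
import re

# A field marker is ":name:" beginning in the first column.
FIELD_RE = re.compile(r"^:(\w+):(.*)$")


def parse_metadata(text):
    """Parse metadata text into a dict mapping field name to value."""
    fields = {}
    current = None  # name of the field currently being continued
    for line in text.splitlines():
        if line.startswith("#"):
            # Comment: not captured. Assumption: it also ends the field.
            current = None
            continue
        match = FIELD_RE.match(line)
        if match:
            current = match.group(1)
            fields[current] = [match.group(2)]
        elif current is not None:
            # Continuation line: belongs to the most recent field.
            fields[current].append(line)
    # White space around each value is removed; interior white space is kept.
    return {name: "\n".join(parts).strip() for name, parts in fields.items()}
```

Because the result is a dictionary keyed by field name, the order in which fields appear in the file does not affect lookups.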
Required fields
Field | Content |
---|---|
:firstName: | The first name of the person responsible for the run. |
:lastName: | The last name of the person responsible for the run. |
:publications: | The DOIs (digital object identifiers) of the publications, or placeholders, separated by commas. |
:standards: | Controlled vocabulary keywords, separated by commas. |
:keywords: | Uncontrolled keywords, separated by commas. |
:notes: | Notes and comments |
Each entry in :publications: should be the DOI without the initial http://. If you don't yet know the publication DOI, use:
:publications: DOI_to_be_determined
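If DOIs are pasted in as full URLs, the leading scheme can be stripped before writing the field. The helper below is a sketch only: `normalize_doi` is a hypothetical name, and handling https:// as well as http:// is an assumption beyond what this document states.

```python
def normalize_doi(reference):
    """Strip surrounding white space and a leading http:// or https://."""
    cleaned = reference.strip()
    # Assumption: https:// is treated the same way as the documented http://.
    for prefix in ("http://", "https://"):
        if cleaned.lower().startswith(prefix):
            return cleaned[len(prefix):]
    return cleaned  # placeholders such as DOI_to_be_determined pass through
```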
In the :standards: and :keywords: fields, individual keywords may not contain blanks. So the keyword absorption spectrum is illegal, but absorption_spectrum and absorptionSpectrum are legal.
The standards are a controlled vocabulary, and the only legal values are those shown in metadata standards.
The keywords can be a comma separated list of any terms you choose, to enable searching later on. For example:
Cu3N, ggau, defectsc=Vc_Cu, q=-1
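The no-blanks rule for individual keywords can be checked mechanically. The following is a sketch under the assumption that keywords are split on commas and trimmed; `split_keywords` is an illustrative name, and it does not check :standards: values against the controlled vocabulary.

```python
def split_keywords(value):
    """Split a comma-separated list and reject keywords containing blanks."""
    keywords = [item.strip() for item in value.split(",") if item.strip()]
    bad = [kw for kw in keywords if any(ch.isspace() for ch in kw)]
    if bad:
        raise ValueError(f"keywords may not contain blanks: {bad}")
    return keywords
```

With this check, `absorption spectrum` is rejected while `absorption_spectrum` and `absorptionSpectrum` are accepted.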
Optional fields
Field | Content |
---|---|
:parents: | The sha1sum of a previous OUTCAR file |
The :parents: field is used to link a postprocessing step to the original step.
Order of fields
The order of the fields in the file is not important. So the following two examples are equivalent:
:firstName: Sophie
:lastName: Martinez
and:
:lastName: Martinez
:firstName: Sophie
White space
White space is removed before and after a field, and before and after a value. But the white space within a value is kept. So the following two examples are equivalent:
:lastName:      Martinez
:notes:     This is a long description.
    Description part 1:
        Etc, etc, etc.
and:
:lastName: Martinez
:notes: This is a long description.
    Description part 1:
        Etc, etc, etc.
Complete example
:firstName: Sophie
:lastName: Martinez
:publications: dx.doi.org/10.1021/ja295760, dx.doi.org/11.1022/ja295765
:standards: fere, gwvd
:keywords: enthalpy, bandgap, icsd, absorptionSpectrum, crystal, lattice
:notes: This is a test of the VASP processing framework, with postprocessing.
Complete example with a parent
After running a previous step, one would issue:
cd previousStep
sha1sum OUTCAR
This gives a result like:
20d79abf89cbe4f4f29f2a82f445b21824079f77 OUTCAR
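Where the sha1sum command is not available, the same digest can be computed with Python's standard hashlib module. This is a sketch, and the function name `sha1_of_file` is illustrative rather than part of MatDB.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in blocks so a large OUTCAR is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha1_of_file("OUTCAR")` inside the previous step's directory returns the 40-character hexadecimal string that sha1sum prints before the file name.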
The metadata for the following step would look like:
:firstName: Sophie
:lastName: Martinez
:publications: dx.doi.org/10.1021/ja295760, dx.doi.org/11.1022/ja295765
:standards: fere, gwvd
:parents: 20d79abf89cbe4f4f29f2a82f445b21824079f77e75e...6a78
:keywords: enthalpy, bandgap, icsd, absorptionSpectrum, crystal, lattice
:notes: This is a test of the VASP processing framework, with postprocessing.