id: https://w3id.org/nmdc/nmdc name: NMDC title: NMDC Schema notes: - not importing any MIxS terms where the relationship between the name (SCN) and the id isn't 1:1 description: >- Schema for National Microbiome Data Collaborative (NMDC). This schema is organized into multiple modules, such as: * a set of core types for representing data values * a subset of the mixs schema * an annotation schema * the NMDC schema itself, into which the other modules are imported license: https://creativecommons.org/publicdomain/zero/1.0/ version: 0.0.0 imports: - annotation # also brings core and portal_* - workflow_execution_activity prefixes: CATH: "https://bioregistry.io/cath:" CHEBI: "http://purl.obolibrary.org/obo/CHEBI_" CHEMBL.COMPOUND: "https://bioregistry.io/chembl.compound:" # https://bioregistry.io/chembl.compound:CHEMBL465070 CHMO: "http://purl.obolibrary.org/obo/CHMO_" Contaminant: http://example.org/contaminant/ # present in MongoDB DRUGBANK: "https://bioregistry.io/drugbank:" # https://bioregistry.io/drugbank:DB14938 EC: "https://bioregistry.io/eccode:" # https://bioregistry.io/eccode:1.1.1.1 EFO: http://www.ebi.ac.uk/efo/ EGGNOG: "https://bioregistry.io/eggnog:" # https://bioregistry.io/eggnog:veNOG12876 ENVO: "http://purl.obolibrary.org/obo/ENVO_" FBcv: "http://purl.obolibrary.org/obo/FBcv_" FMA: "http://purl.obolibrary.org/obo/FMA_" GO: "http://purl.obolibrary.org/obo/GO_" HMDB: "https://bioregistry.io/hmdb:" # https://bioregistry.io/hmdb:HMDB00001 ISA: http://example.org/isa/ KEGG.ORTHOLOGY: "https://bioregistry.io/kegg.orthology:" # https://github.com/prefixcommons/biocontext/blob/master/registry/idot_context.jsonld MASSIVE: "https://bioregistry.io/reference/massive:" MESH: "https://bioregistry.io/mesh:" # https://bioregistry.io/mesh:C063233 MS: "http://purl.obolibrary.org/obo/MS_" MetaNetX: http://example.org/metanetx/ NCBITaxon: "http://purl.obolibrary.org/obo/NCBITaxon_" NCBI: "http://example.com/ncbitaxon/" # temporary. 
see https://github.com/microbiomedata/issues/issues/893 NCIT: "http://purl.obolibrary.org/obo/NCIT_" OBI: http://purl.obolibrary.org/obo/OBI_ ORCID: https://orcid.org/ PANTHER.FAMILY: "https://bioregistry.io/panther.family:" # https://bioregistry.io/panther.family:PTHR12345 PATO: "http://purl.obolibrary.org/obo/PATO_" PFAM: "https://bioregistry.io/pfam:" # https://bioregistry.io/pfam:PF11779 PFAM.CLAN: "https://bioregistry.io/pfam.clan:" # https://bioregistry.io/pfam.clan:CL0192 PO: "http://purl.obolibrary.org/obo/PO_" PR: "http://purl.obolibrary.org/obo/PR_" PUBCHEM.COMPOUND: "https://bioregistry.io/pubchem.compound:" RO: "http://purl.obolibrary.org/obo/RO_" RetroRules: http://example.org/retrorules/ SO: "http://purl.obolibrary.org/obo/SO_" SUPFAM: "https://bioregistry.io/supfam:" # https://bioregistry.io/supfam:SSF57615 TIGRFAM: "https://bioregistry.io/tigrfam:" # https://bioregistry.io/tigrfam:TIGR00010 UBERON: "http://purl.obolibrary.org/obo/UBERON_" UO: "http://purl.obolibrary.org/obo/UO_" bioproject: "https://identifiers.org/bioproject:" biosample: "https://bioregistry.io/biosample:" cas: "https://bioregistry.io/cas:" doi: "https://bioregistry.io/doi:" edam.data: "http://edamontology.org/data_" emsl.project: "https://bioregistry.io/emsl.project:" emsl: "http://example.org/emsl_in_mongodb/" emsl_uuid_like: "http://example.org/emsl_uuid_like/" generic: https://example.org/generic/ gnps.task: "https://bioregistry.io/gnps.task:" gtpo: http://example.org/gtpo/ igsn: https://app.geosamples.org/sample/igsn/ img.taxon: "https://bioregistry.io/img.taxon:" jgi.analysis: "https://data.jgi.doe.gov/search?q=" jgi.proposal: "https://bioregistry.io/jgi.proposal:" jgi: http://example.org/jgi/ kegg: "https://bioregistry.io/kegg:" # https://bioregistry.io/kegg:hsa00190 linkml: https://w3id.org/linkml/ mgnify.proj: "https://bioregistry.io/mgnify.proj:" my_emsl: "https://release.my.emsl.pnnl.gov/released_data/" neon.identifier: http://example.org/neon/identifier/ neon.schema: 
http://example.org/neon/schema/ nmdc: https://w3id.org/nmdc/ prov: http://www.w3.org/ns/prov# rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# rdfs: http://www.w3.org/2000/01/rdf-schema# skos: http://www.w3.org/2004/02/skos/core# wikidata: "http://www.wikidata.org/entity/" xsd: http://www.w3.org/2001/XMLSchema# default_prefix: nmdc default_range: string emit_prefixes: - KEGG.ORTHOLOGY - MASSIVE - biosample - cas - doi - gnps.task - gold - img.taxon - jgi.proposal - kegg - rdf - rdfs - skos - xsd settings: id_nmdc_prefix: "^(nmdc)" id_shoulder: "([0-9][a-z]{0,6}[0-9])" id_blade: "([A-Za-z0-9]{1,})" id_version: "(\\.[0-9]{1,})" id_locus: "(_[A-Za-z0-9_\\.-]+)?$" classes: EukEval: description: This class contains information pertaining to evaluating if a Metagenome-Assembled Genome (MAG) is eukaryotic. comments: - A tool like eukCC (https://doi.org/10.1186/s13059-020-02155-4) would generate information for this class. slots: - type - completeness - contamination - ncbi_lineage_tax_ids - ncbi_lineage class_uri: nmdc:EukEval NucleotideSequencing: class_uri: nmdc:NucleotideSequencing is_a: DataGeneration description: A DataGeneration in which the sequence of DNA or RNA molecules is generated. comments: For example data generated from an Illumina or Pacific Biosciences instrument. slots: - gold_sequencing_project_identifiers - insdc_bioproject_identifiers - insdc_experiment_identifiers - ncbi_project_name - target_gene - target_subfragment slot_usage: id: structured_pattern: syntax: "{id_nmdc_prefix}:(dgns|omprc)-{id_shoulder}-{id_blade}$" interpolated: true MassSpectrometry: class_uri: nmdc:MassSpectrometry is_a: DataGeneration description: Spectrometry where the sample is converted into gaseous ions which are characterised by their mass-to-charge ratio and relative abundance. 
exact_mappings: - CHMO:0000470 slots: - eluent_introduction_category - has_calibration - has_chromatography_configuration - has_mass_spectrometry_configuration slot_usage: id: structured_pattern: syntax: "{id_nmdc_prefix}:(dgms|omprc)-{id_shoulder}-{id_blade}$" interpolated: true has_calibration: structured_pattern: syntax: "{id_nmdc_prefix}:calib-{id_shoulder}-{id_blade}$" interpolated: true has_chromatography_configuration: structured_pattern: syntax: "{id_nmdc_prefix}:chrcon-{id_shoulder}-{id_blade}$" interpolated: true has_mass_spectrometry_configuration: structured_pattern: syntax: "{id_nmdc_prefix}:mscon-{id_shoulder}-{id_blade}$" interpolated: true rules: - title: has_calibration_required_if_gc description: >- If eluent_introduction_category is gas_chromatography, then has_calibration is required. preconditions: slot_conditions: eluent_introduction_category: equals_string: gas_chromatography postconditions: slot_conditions: has_calibration: required: true - title: has_chromatography_configuration_required_if_lc_or_gc description: >- If eluent_introduction_category is liquid_chromatography or gas_chromatography, then has_chromatography_configuration is required. preconditions: slot_conditions: eluent_introduction_category: any_of: - equals_string: liquid_chromatography - equals_string: gas_chromatography postconditions: slot_conditions: has_chromatography_configuration: required: true Configuration: abstract: true is_a: InformationObject class_uri: nmdc:Configuration description: A set of parameters that define the actions of a process and is shared among multiple instances of the process. notes: - This class is intended to represent the parameters within a method file (or similar) that control a process. MassSpectrometryConfiguration: is_a: Configuration class_uri: nmdc:MassSpectrometryConfiguration description: A set of parameters that define and control the actions of a mass spectrometry process. 
notes: - This class is intended to represent a mass spectrometry method file that controls a mass spectrometry process. slots: - mass_spectrometry_acquisition_strategy - resolution_categories - mass_analyzers - ionization_source - mass_spectrum_collection_modes - polarity_mode slot_usage: name: required: true description: required: true id: structured_pattern: syntax: "{id_nmdc_prefix}:mscon-{id_shoulder}-{id_blade}$" interpolated: true ChromatographyConfiguration: is_a: Configuration class_uri: nmdc:ChromatographyConfiguration description: A set of parameters that define and control the actions of a chromatography process. notes: - This class is intended to represent a chromatography method file associated with a mass spectrometry process. slots: - chromatographic_category - ordered_mobile_phases - stationary_phase - temperature slot_usage: name: required: true description: required: true id: structured_pattern: syntax: "{id_nmdc_prefix}:chrcon-{id_shoulder}-{id_blade}$" interpolated: true FunctionalAnnotationAggMember: class_uri: nmdc:FunctionalAnnotationAggMember slots: - metagenome_annotation_id - gene_function_id - count - type slot_usage: metagenome_annotation_id: structured_pattern: # doesn't include act syntax: "{id_nmdc_prefix}:(wfmgan|wfmtan)-{id_shoulder}-{id_blade}{id_version}$" interpolated: true Database: class_uri: nmdc:Database tree_root: true aliases: - NMDC metadata object description: An abstract holder for any set of metadata and data. It does not need to correspond to an actual managed database top level holder class. When translated to JSON-Schema this is the 'root' object. It should contain pointers to other objects of interest. For MongoDB, the lists of objects that Database slots point to correspond to **collections**. 
slots: - biosample_set - calibration_set - chemical_entity_set - collecting_biosamples_from_site_set - configuration_set - data_generation_set - data_object_set - field_research_site_set - functional_annotation_agg - functional_annotation_set - genome_feature_set - instrument_set - material_processing_set - processed_sample_set - protocol_execution_set - storage_process_set - study_set - workflow_execution_set Pooling: class_uri: nmdc:Pooling is_a: MaterialProcessing description: physical combination of several instances of like material. slots: exact_mappings: - OBI:0600016 slot_usage: has_input: minimum_cardinality: 2 required: true structured_pattern: syntax: "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_output: required: true minimum_cardinality: 1 maximum_cardinality: 1 structured_pattern: syntax: "{id_nmdc_prefix}:procsm-{id_shoulder}-{id_blade}$" interpolated: true id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:poolp-{id_shoulder}-{id_blade}$" interpolated: true Extraction: class_uri: nmdc:Extraction is_a: MaterialProcessing description: A material separation in which a desired component of an input material is separated from the remainder. exact_mappings: - OBI:0302884 slots: - substances_used - extraction_targets - input_mass - volume slot_usage: has_input: required: true any_of: - range: Biosample - range: ProcessedSample structured_pattern: syntax: "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_output: required: true structured_pattern: syntax: "{id_nmdc_prefix}:(procsm)-{id_shoulder}-{id_blade}$" interpolated: true id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:extrp-{id_shoulder}-{id_blade}$" interpolated: true volume: description: The volume of the solvent/solute being used, not the input. 
LibraryPreparation: class_uri: nmdc:LibraryPreparation aliases: - LibraryConstruction is_a: MaterialProcessing slots: - is_stranded - library_preparation_kit - library_type - nucl_acid_amp - pcr_cond - pcr_cycles - pcr_primers - stranded_orientation close_mappings: - OBI:0000711 comments: - OBI:0000711 specifies a DNA input (but not ONLY a DNA input) slot_usage: has_input: required: true structured_pattern: syntax: "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_output: required: true structured_pattern: syntax: "{id_nmdc_prefix}:(procsm)-{id_shoulder}-{id_blade}$" interpolated: true id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:libprp-{id_shoulder}-{id_blade}$" interpolated: true pcr_cond: description: Description of reaction conditions and components of polymerase chain reaction performed during library preparation CollectingBiosamplesFromSite: class_uri: nmdc:CollectingBiosamplesFromSite is_a: PlannedProcess title: Collecting Biosamples From Site comments: - "this illustrates implementing a Biosample relation with a process class" close_mappings: - OBI:0000744 slot_usage: has_input: range: Site required: true structured_pattern: syntax: "{id_nmdc_prefix}:(frsite|site)-{id_shoulder}-{id_blade}$" interpolated: true has_output: range: Biosample required: true structured_pattern: syntax: "{id_nmdc_prefix}:bsm-{id_shoulder}-{id_blade}$" interpolated: true id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:clsite-{id_shoulder}-{id_blade}$" interpolated: true ProtocolExecution: class_uri: 'nmdc:ProtocolExecution' is_a: PlannedProcess description: A PlannedProcess that has PlannedProcess parts. Can be used to represent the case of someone following a Protocol. 
slots: - has_process_parts - protocol_execution_category slot_usage: id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:pex-{id_shoulder}-{id_blade}$" interpolated: true has_input: structured_pattern: syntax: "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_output: structured_pattern: syntax: "{id_nmdc_prefix}:(procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_process_parts: required: true structured_pattern: syntax: "{id_nmdc_prefix}:(extrp|filtpr|dispro|poolp|libprp|subspr|mixpro|chcpr|cspro)-{id_shoulder}-{id_blade}$" interpolated: true description: The MaterialProcessing steps that are discrete parts of the ProtocolExecution. SubSamplingProcess: class_uri: 'nmdc:SubSamplingProcess' description: > Separating a sample aliquot from the starting material for downstream activity. related_mappings: - OBI:0000744 notes: - A subsample may be (a) a portion of the sample obtained by selection or division; (b) an individual unit of the lot taken as part of the sample; (c) the final unit of multistage sampling. The term 'subsample' is used either in the sense of a 'sample of a sample' or as a synonym for 'unit'. In practice, the meaning is usually apparent from the context or is defined. - TODO - Montana to visit slot descriptions contributors: - ORCID:0009-0001-1555-1601 #Anastasiya Prymolenna - ORCID:0000-0002-8683-0050 #Montana Smith - ORCID:0000-0001-9076-6066 #Mark Miller - ORCID:0009-0008-4013-7737 #James Tessmer is_a: MaterialProcessing slots: - container_size - contained_in - temperature - volume - mass - sampled_portion slot_usage: id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:subspr-{id_shoulder}-{id_blade}$" interpolated: true volume: description: The output volume of the SubSampling Process. mass: description: The output mass of the SubSampling Process. 
has_input: any_of: - range: Biosample - range: ProcessedSample structured_pattern: # MAM 2024-05-22 isn't that inherited from a parent class? syntax: "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_output: range: ProcessedSample description: The subsample. structured_pattern: # MAM 2024-05-22 isn't that inherited from a parent class? syntax: "{id_nmdc_prefix}:(procsm)-{id_shoulder}-{id_blade}$" interpolated: true MixingProcess: class_uri: 'nmdc:MixingProcess' description: > The combining of components, particles or layers into a more homogeneous state. contributors: - ORCID:0009-0001-1555-1601 #Anastasiya Prymolenna - ORCID:0000-0002-8683-0050 #Montana Smith is_a: MaterialProcessing comments: - The mixing may be achieved manually or mechanically by shifting the material with stirrers or pumps or by revolving or shaking the container. - The process must not permit segregation of particles of different size or properties. - Homogeneity may be considered to have been achieved in a practical sense when the sampling error of the processed portion is negligible compared to the total error of the measurement system. slots: - duration slot_usage: id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:mixpro-{id_shoulder}-{id_blade}$" interpolated: true has_input: any_of: - range: Biosample - range: ProcessedSample structured_pattern: # MAM 2024-05-22 isn't that inherited from a parent class? syntax: "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_output: range: ProcessedSample description: The mixed sample. structured_pattern: syntax: "{id_nmdc_prefix}:procsm-{id_shoulder}-{id_blade}$" interpolated: true FiltrationProcess: class_uri: 'nmdc:FiltrationProcess' description: >- The process of segregation of phases; e.g. the separation of suspended solids from a liquid or gas, usually by forcing a carrier gas or liquid through a porous medium. 
related_mappings: - CHMO:0001640 contributors: - ORCID:0009-0001-1555-1601 #Anastasiya Prymolenna - ORCID:0000-0002-8683-0050 #Montana Smith - ORCID:0000-0001-9076-6066 #Mark Miller - ORCID:0009-0008-4013-7737 #James Tessmer is_a: MaterialProcessing slots: - conditionings - container_size - filter_material - filter_pore_size - filtration_category - is_pressurized - separation_method - volume slot_usage: id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:filtpr-{id_shoulder}-{id_blade}$" interpolated: true volume: description: The volume of sample filtered. has_input: any_of: - range: Biosample - range: ProcessedSample structured_pattern: syntax: "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_output: range: ProcessedSample structured_pattern: syntax: "{id_nmdc_prefix}:procsm-{id_shoulder}-{id_blade}$" interpolated: true StorageProcess: class_uri: 'nmdc:StorageProcess' description: >- A planned process with the objective to preserve and protect material entities by placing them in an identified location which may have a controlled environment. is_a: PlannedProcess related_mappings: - OBI:0302893 slots: - substances_used - contained_in - temperature slot_usage: substances_used: description: The substance(s) that a processed sample is stored in. id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:storpr-{id_shoulder}-{id_blade}$" interpolated: true has_input: structured_pattern: syntax: "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_output: structured_pattern: syntax: "{id_nmdc_prefix}:procsm-{id_shoulder}-{id_blade}$" interpolated: true ChromatographicSeparationProcess: class_uri: 'nmdc:ChromatographicSeparationProcess' description: The process of using a selective partitioning of the analyte or interferent between two immiscible phases. 
contributors: - ORCID:0009-0001-1555-1601 #Anastasiya Prymolenna - ORCID:0000-0002-1368-8217 #Yuri Corilo is_a: MaterialProcessing slots: - chromatographic_category - ordered_mobile_phases - stationary_phase - temperature slot_usage: id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:cspro-{id_shoulder}-{id_blade}$" interpolated: true has_input: any_of: - range: Biosample - range: ProcessedSample structured_pattern: syntax: "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$" interpolated: true has_output: range: ProcessedSample structured_pattern: syntax: "{id_nmdc_prefix}:procsm-{id_shoulder}-{id_blade}$" interpolated: true DissolvingProcess: class_uri: 'nmdc:DissolvingProcess' aliases: - Solubilization description: > A mixing step where a soluble component is mixed with a liquid component. exact_mappings: - CHMO:0002773 contributors: - ORCID:0009-0001-1555-1601 #Anastasiya Prymolenna - ORCID:0000-0002-1368-8217 #Yuri Corilo is_a: MaterialProcessing slots: - duration - temperature - substances_used slot_usage: id: required: true structured_pattern: syntax: "{id_nmdc_prefix}:dispro-{id_shoulder}-{id_blade}$" interpolated: true enums: StrandedOrientationEnum: description: This enumeration specifies information about stranded RNA library preparations. permissible_values: antisense orientation: description: Orientation that is complementary (non-coding) to a sequence of messenger RNA. comments: - See https://www.genome.gov/genetics-glossary/antisense exact_mappings: - SO:0000077 sense orientation: description: Orientation that corresponds to the coding sequence of messenger RNA. MassSpectrometryAcquisitionStrategyEnum: permissible_values: data_independent_acquisition: description: Data independent mass spectrometer acquisition method wherein the full mass range is fragmented. Examples of such an approach include MS^E, AIF, and bbCID. 
aliases: - DIA - data independent acquisition from dissociation of full mass range exact_mappings: - MS:1003227 data_dependent_acquisition: description: Mass spectrometer data acquisition method wherein MSn spectra are triggered based on the m/z of precursor ions detected in the same run. aliases: - DDA exact_mappings: - MS:1003221 full_scan_only: aliases: - MS description: Mass spectrometer data acquisition method wherein only MS1 data are acquired. ResolutionCategoryEnum: permissible_values: high: description: higher than unit resolution low: description: at unit resolution MassAnalyzerEnum: permissible_values: time_of_flight: aliases: - TOF description: Instrument that separates ions by m/z in a field-free region after acceleration to a fixed acceleration energy. exact_mappings: - MS:1000084 quadrupole: aliases: - Quad - Q description: A mass spectrometer that consists of four parallel rods whose centers form the corners of a square and whose opposing poles are connected. The voltage applied to the rods is a superposition of a static potential and a sinusoidal radio frequency potential. The motion of an ion in the x and y dimensions is described by the Mathieu equation whose solutions show that ions in a particular m/z range can be transmitted along the z axis. exact_mappings: - MS:1000081 Orbitrap: aliases: - Orbi description: An ion trapping device that consists of an outer barrel-like electrode and a coaxial inner spindle-like electrode that form an electrostatic field with quadro-logarithmic potential distribution. The frequency of harmonic oscillations of the orbitally trapped ions along the axis of the electrostatic field is independent of the ion velocity and is inversely proportional to the square root of m/z so that the trap can be used as a mass analyzer. 
exact_mappings: - MS:1000484 ion_cyclotron_resonance: aliases: - ICR description: A mass spectrometer based on the principle of ion cyclotron resonance in which an ion in a magnetic field moves in a circular orbit at a frequency characteristic of its m/z value. Ions are coherently excited to a larger radius orbit using a pulse of radio frequency energy and their image charge is detected on receiver plates as a time domain signal. Fourier transformation of the time domain signal results in a frequency domain signal which is converted to a mass spectrum based on the inverse relationship between frequency and m/z. exact_mappings: - MS:1000079 ion_trap: aliases: - LTQ - Ion Trap - Paul Trap description: A device for spatially confining ions using electric and magnetic fields alone or in combination. exact_mappings: - MS:1000264 IonizationSourceEnum: permissible_values: electrospray_ionization: aliases: - ESI matrix_assisted_laser_desorption_ionization: aliases: - MALDI atmospheric_pressure_photo_ionization: aliases: - APPI atmospheric_pressure_chemical_ionization: aliases: - APCI electron_ionization: aliases: - EI MassSpectrumCollectionModeEnum: permissible_values: full_profile: { } reduced_profile: { } centroid: { } PolarityModeEnum: permissible_values: positive: { } negative: { } EluentIntroductionCategoryEnum: permissible_values: liquid_chromatography: aliases: - LC description: The processed sample is introduced into the mass spectrometer through a liquid chromatography process. gas_chromatography: aliases: - GC description: The processed sample is introduced into the mass spectrometer through a gas chromatography process. direct_infusion_syringe: description: The processed sample is introduced into the mass spectrometer through a direct infusion process using a syringe. direct_infusion_autosampler: description: The processed sample is introduced into the mass spectrometer through a direct infusion process using an autosampler. 
LibraryTypeEnum: permissible_values: DNA: { } RNA: { } ContainerCategoryEnum: description: The permitted types of containers used in processing metabolomic samples. contributors: - ORCID:0009-0001-1555-1601 #Anastasiya Prymolenna - ORCID:0000-0002-8683-0050 #Montana Smith permissible_values: v-bottom_conical_tube: falcon_tube: SeparationMethodEnum: description: The tool/substance used to separate or filter a solution or mixture. contributors: - ORCID:0009-0001-1555-1601 #Anastasiya Prymolenna - ORCID:0000-0002-8683-0050 #Montana Smith permissible_values: ptfe_96_well_filter_plate: syringe: StationaryPhaseEnum: description: The type of stationary phase used in a chromatography process. contributors: - ORCID:0009-0001-1555-1601 #Anastasiya Prymolenna - ORCID:0000-0002-4504-1039 #Katherine Heal permissible_values: BEH-HILIC: C18: C8: C4: C2: C1: C30: C60: CNT: CN: Diol: HILIC: NH2: Phenyl: Polysiloxane: PS-DVB: SAX: SCX: Silica: WCX: WAX: ZIC-HILIC: ZIC-pHILIC: ZIC-cHILIC: ProtocolCategoryEnum: description: The possible protocols that may be followed for an assay. permissible_values: mplex: derivatization: filter_clean_up: organic_matter_extraction: solid_phase_extraction: phosphorus_extraction: ph_measurement: respiration_measurement: texture_measurement: dna_extraction: phenol_chloroform_extraction: { } ChromatographicCategoryEnum: permissible_values: liquid_chromatography: aliases: - LC gas_chromatography: aliases: - GC solid_phase_extraction: aliases: - SPE SamplePortionEnum: permissible_values: supernatant: aliases: - top_layer pellet: aliases: - bottom_layer organic_layer: aqueous_layer: non_polar_layer: slots: polarity_mode: range: PolarityModeEnum description: the polarity of which ions are generated and detected mass_spectrum_collection_modes: range: MassSpectrumCollectionModeEnum description: Indicates whether mass spectra were collected in full profile, reduced profile, or centroid mode during acquisition. 
multivalued: true eukaryotic_evaluation: range: EukEval description: Contains results from evaluating if a Metagenome-Assembled Genome is of eukaryotic lineage. ncbi_lineage_tax_ids: range: string pattern: '^\d+(-\d+)*$' description: Dash-delimited ordered list of NCBI taxonomy identifiers (TaxId) comments: - Example 1-131567-2759-2611352-33682-191814-2603949 ncbi_lineage: range: string description: Comma delimited ordered list of NCBI taxonomy names. comments: - Example root,cellular organisms,Eukaryota,Discoba,Euglenozoa,Diplonemea,Diplonemidae has_failure_categorization: range: FailureCategorization multivalued: true inlined_as_list: true ionization_source: range: IonizationSourceEnum description: The ionization source used to introduce processed samples into a mass spectrometer exact_mappings: - MS:1000008 mass_analyzers: range: MassAnalyzerEnum description: The kind of mass analyzer(s) used during the spectra collection. multivalued: true exact_mappings: - MS:1000443 resolution_categories: range: ResolutionCategoryEnum description: The relative resolution at which spectra were collected. examples: - value: - high - low multivalued: true mass_spectrometry_acquisition_strategy: range: MassSpectrometryAcquisitionStrategyEnum description: Mode of running a mass spectrometer method by which m/z ranges are selected and ions possibly fragment. exact_mappings: - MS:1003213 eluent_introduction_category: range: EluentIntroductionCategoryEnum description: A high-level categorization for how the processed sample is introduced into a mass spectrometer. examples: - value: liquid_chromatography - value: direct_infusion_syringe has_mass_spectrometry_configuration: range: MassSpectrometryConfiguration description: The identifier of the associated MassSpectrometryConfiguration. 
has_chromatography_configuration: range: ChromatographyConfiguration description: The identifier of the associated ChromatographyConfiguration, providing information about how a sample was introduced into the mass spectrometer. metagenome_annotation_id: range: WorkflowExecution description: The identifier for the analysis activity that generated the functional annotation results, where the analysis activity is an instance of an appropriate subclass of WorkflowExecution required: true any_of: - range: MetagenomeAnnotation - range: MetatranscriptomeAnnotation gene_function_id: range: uriorcurie description: The identifier for the gene function. examples: - value: KEGG.ORTHOLOGY:K00627 required: true count: range: integer required: true functional_annotation_agg: range: FunctionalAnnotationAggMember multivalued: true inlined: true inlined_as_list: true ecosystem_path_id: range: string description: A unique id representing the GOLD classifiers associated with a sample. sample_collection_year: range: integer sample_collection_month: library_preparation_kit: range: string ## WHY ARE THESE COMMENTED OUT? # exact_mappings: # - GENEPIO:0001450 # processed_date: # can we move this or infer this from the date of some other process? # # range: date # range: string # extraction_date: # replacing with start_ and end_date # # range: date # range: string pcr_cycles: range: integer exact_mappings: - OBI:0002475 is_stranded: description: Is the (RNA) library stranded or non-stranded (unstranded). range: boolean comments: - A value of true means the library is stranded, false means non-stranded. stranded_orientation: description: Lists the strand orientation for a stranded RNA library preparation. range: StrandedOrientationEnum input_mass: title: sample mass used description: Total mass of sample used in activity. 
aliases: - sample mass - sample weight exact_mappings: - MS:1000004 narrow_mappings: - MIXS:0000111 range: QuantityValue library_type: title: library type examples: - value: DNA range: LibraryTypeEnum date_created: description: from database class etl_software_version: description: from database class object_set: inlined_as_list: true mixin: true multivalued: true description: Applies to a property that links a database object to a set of objects. This is necessary in a json document to provide context for a list, and to allow for a single json object that combines multiple object types chemical_entity_set: mixins: - object_set range: ChemicalEntity description: This property links a database object to the set of chemical entities within it. biosample_set: mixins: object_set range: Biosample description: This property links a database object to the set of samples within it. study_set: mixins: object_set range: Study description: This property links a database object to the set of studies within it. field_research_site_set: mixins: object_set range: FieldResearchSite collecting_biosamples_from_site_set: mixins: object_set range: CollectingBiosamplesFromSite data_object_set: mixins: object_set range: DataObject description: This property links a database object to the set of data objects within it. genome_feature_set: mixins: object_set range: GenomeFeature description: This property links a database object to the set of all features functional_annotation_set: mixins: object_set range: FunctionalAnnotation description: This property links a database object to the set of all functional annotations workflow_execution_set: mixins: object_set range: WorkflowExecution description: This property links a database object to the set of workflow executions. data_generation_set: mixins: object_set range: DataGeneration description: This property links a database object to the set of data generations within it. 
processed_sample_set: mixins: object_set range: ProcessedSample description: This property links a database object to the set of processed samples within it. instrument_set: mixins: object_set range: Instrument description: This property links a database object to the set of instruments within it. calibration_set: mixins: object_set range: CalibrationInformation description: This property links a database object to the set of calibrations within it. configuration_set: mixins: object_set range: Configuration description: This property links a database object to the set of configurations within it. protocol_execution_set: mixins: object_set range: ProtocolExecution description: This property links a database object to the set of protocol executions within it. storage_process_set: mixins: object_set range: StorageProcess description: This property links a database object to the set of storage processes within it. material_processing_set: mixins: object_set range: MaterialProcessing description: This property links a database object to the set of material processing within it. sample_collection_day: range: integer sample_collection_hour: range: integer sample_collection_minute: range: integer biogas_temperature: range: string soil_annual_season_temp: range: string biogas_retention_time: range: string completion_date: range: string container_size: range: QuantityValue description: The volume of the container an analyte is stored in or an activity takes place in contributors: - ORCID:0009-0001-1555-1601 #Anastasiya Prymolenna - ORCID:0000-0002-8683-0050 #Montana Smith protocol_execution_category: range: ProtocolCategoryEnum required: true has_process_parts: range: PlannedProcess description: A list of process parts that make up a protocol. required: true multivalued: true list_elements_ordered: true filter_material: description: "A porous material on which solid particles present in air or other fluid which flows through it are largely caught and retained." 
    comments:
      - "Filters are made with a variety of materials: cellulose and derivatives, glass fibre, ceramic, synthetic plastics and fibres. Filters may be naturally porous or be made so by mechanical or other means. Membrane/ceramic filters are prepared with highly controlled pore size in a sheet of suitable material such as polyfluoroethylene, polycarbonate or cellulose esters. Nylon mesh is sometimes used for reinforcement. The pores constitute 80–85% of the filter volume commonly and several pore sizes are available for air sampling (0.45−0.8 μm are commonly employed)."
    range: string
  filter_pore_size:
    range: QuantityValue
    description: "A quantitative or qualitative measurement of the physical dimensions of the pores in a material."
  conditionings:
    range: string
    description: "Preliminary treatment of either phase with a suitable solution of the other phase (in the absence of main extractable solute(s)) so that when the subsequent equilibration is carried out changes in the (volume) phase ratio or in the concentrations of other components are minimized."
    multivalued: true
    list_elements_ordered: true
  separation_method:
    range: SeparationMethodEnum
    description: The method that was used to separate a substance from a solution or mixture.
  filtration_category:
    range: string
    description: The type of conditioning applied to a filter, device, etc.
  material_component_separation:
    range: string
    description: "A material processing in which components of an input material become segregated in space"
  value:
    range: QuantityValue
  modifier_substance:
    range: string
    description: The type of modification being done
  is_pressurized:
    range: boolean
    description: Whether or not pressure was applied to a thing or process.
  contained_in:
    range: ContainerCategoryEnum
    description: A type of container.
    examples:
      - value: test tube
      - value: falcon tube
      - value: whirlpak
  input_volume: # see also `volume`
    range: QuantityValue
    description: The volume of the input sample.
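  # Illustrative sketch only, not part of the schema. It shows how the three
  # slots above might be populated on a hypothetical process record. The
  # QuantityValue field names (has_numeric_value, has_unit) are assumptions,
  # since QuantityValue is not defined in this section, and the numbers are
  # made-up placeholders. The contained_in value is taken from the slot's own
  # examples.
  #
  #   is_pressurized: false
  #   contained_in: test tube
  #   input_volume:
  #     has_numeric_value: 5   # assumed field name, placeholder value
  #     has_unit: mL           # assumed field name, placeholder unit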
  ordered_mobile_phases:
    range: MobilePhaseSegment
    description: The solution(s) that move through a chromatography column.
    multivalued: true
    list_elements_ordered: true
    inlined_as_list: true
  stationary_phase:
    range: StationaryPhaseEnum
    description: The material that the stationary phase used in chromatography is composed of.
  chromatographic_category:
    range: ChromatographicCategoryEnum
    description: The type of chromatography used in a process.
  sampled_portion:
    range: SamplePortionEnum
    multivalued: true
    description: The portion of the sample that is taken for downstream activity.