Andrea Poltronieri
Jacopo de Berardinis
Nicolas Lazzari
An extension of Music Meta to describe the metadata of music collections, corpora, containers, or simply music datasets.
CoMeta Ontology
1.0
"2023-05-18"^^xsd:date
"2023-04-12"^^xsd:date
com:
https://w3id.org/polifonia/ontology/cometa/
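As an illustration of how the `com:` prefix and namespace above are typically used, the following minimal sketch expands a prefixed name into a full IRI. The local names used here (`Dataset`, `hasSplit`) are assumptions derived from the labels in this document; the identifiers actually defined by the ontology may differ.

```python
# Prefix and namespace as declared above.
PREFIXES = {"com": "https://w3id.org/polifonia/ontology/cometa/"}


def expand(curie: str, prefixes: dict) -> str:
    """Expand a prefixed name such as 'com:Dataset' into a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


# Hypothetical local names, for illustration only.
dataset_iri = expand("com:Dataset", PREFIXES)
has_split_iri = expand("com:hasSplit", PREFIXES)
```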
Associates an annotation to the raw data, or data descriptor, it annotates
annotates data
Associates a data descriptor (e.g. features, encodings) to the raw data it was derived from
describes data
extends
Associates an API to the availability information of a dataset
has API
Associates the availability of a dataset to the way it can be accessed
has accessibility
Associates a dataset split to a dataset record
has assignment
Not needed, since a dataset split is still a dataset, hence this can use contains record
Associates a dataset to information related to its availability and access
has availability
Associates a dataset to its content
has content
Associates raw data content or annotations to their type (e.g. tag, pattern, emotion).
has content type
Associates a data descriptor to the type of feature it provides
has feature type
Associates raw data content to the modality the data provides
has modality
Associates a dataset to a proper partition (a particular subset) of it
has split
A dataset can be a subset of another dataset
has subset
Associates a dataset record to an atomic music element
includes content
Associates a dataset to a task it enables
is aimed for
Associates a dataset to one of its maintainers (e.g. a person, an institution)
is maintained by
Associates a dataset record to the dataset it belongs to.
contains record
A textual description
description
download link
The number of records that are contained in a data container
record count
release date
Dataset content providing annotations that were produced or obtained from raw data content, or alternatively, from a data descriptor.
Content Annotation
Dataset content that describes the raw data content via features or encodings extracted from it. This should not be confused with an annotation; it is rather a supplementary view of the raw data content of a dataset.
Content Descriptor
Describes the type of content that the raw data, or its annotations, provide. In the music domain, this may correspond to chord, pattern, emotion, etc.
Content Type
Dataset content providing raw data, either structured (e.g. tabular data) or unstructured (e.g. audio files). For example, a dataset folder containing images can be described as raw data content.
Raw Data Content
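The relationship between the three kinds of dataset content described above (raw data, descriptors derived from it, and annotations of either) can be sketched as plain Python data classes. This is only an illustrative model under stated assumptions: the class and field names mirror the labels in this document, and the helper `raw_source` is hypothetical rather than part of the ontology.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class RawDataContent:
    modality: str  # "has modality", e.g. "audio"


@dataclass(frozen=True)
class ContentDescriptor:
    feature_type: str          # "has feature type", e.g. "chroma"
    describes: RawDataContent  # "describes data"


@dataclass(frozen=True)
class ContentAnnotation:
    content_type: str  # "has content type", e.g. "chord"
    # "annotates data": either raw data or a data descriptor
    annotates: Union[RawDataContent, ContentDescriptor]


def raw_source(content) -> RawDataContent:
    """Follow annotates/describes links back to the raw data content."""
    if isinstance(content, ContentAnnotation):
        return raw_source(content.annotates)
    if isinstance(content, ContentDescriptor):
        return content.describes
    return content


# Toy example: chord annotations obtained from chroma features of audio.
audio = RawDataContent(modality="audio")
chroma = ContentDescriptor(feature_type="chroma", describes=audio)
chords = ContentAnnotation(content_type="chord", annotates=chroma)
```

In this toy example, `raw_source(chords)` walks from the annotation through the descriptor and returns the `audio` object.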
Describes an Application Programming Interface (API) for accessing, recreating, or extracting a dataset.
API
Describes the availability of data content according to the release strategy and policies of the dataset. For example, a music dataset may provide complete data records (full tracks) or contain audio clips or snippets (excerpts) only.
Content Availability
Describes the accessibility of a dataset, instructing users on the modalities put in place by the maintainers to access its content.
Data Accessibility
Describes the availability of a dataset as a whole, or of a part of its content.
Data Availability
The format of the data in which content is provided.
Data Format
Describes the modality of dataset content such as audio, video, image, etc.
Data Modality
A container of data records with summative properties that allow the contextualisation of its content, availability and licensing.
Dataset
Describes the content of a dataset from a summative perspective (e.g. the audio content of a music collection, the audio features it provides, etc.) and its production process (provenance).
Dataset Content
A record of a dataset, providing references to its properties and annotations.
Dataset Record
Describes a partition of a dataset via its association with individual data records, which can be used for training, validating, or testing a computational method.
Dataset Split
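A small sketch of how datasets, records, and splits relate, assuming records are identified by plain strings. Modelling a split as a subclass of dataset follows the note above that a dataset split is still a dataset. The requirement that splits do not overlap is an assumption made for this example and is not stated by the ontology; `valid_splits` is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    records: set = field(default_factory=set)  # record identifiers

    @property
    def record_count(self) -> int:
        """Number of records contained in this data container."""
        return len(self.records)


@dataclass
class DatasetSplit(Dataset):
    split_type: str = "training"  # e.g. "training", "validation", "test"


def valid_splits(dataset: Dataset, splits: list) -> bool:
    """Check that every split is a subset of the dataset and that
    no record is assigned to two splits (assumption of this sketch)."""
    seen = set()
    for split in splits:
        if not split.records <= dataset.records:
            return False
        if seen & split.records:
            return False
        seen |= split.records
    return True


# Toy data, for illustration only.
toy = Dataset("toy", {"r1", "r2", "r3", "r4"})
train = DatasetSplit("toy-train", {"r1", "r2"}, "training")
test = DatasetSplit("toy-test", {"r3"}, "test")
overlapping = DatasetSplit("toy-val", {"r2"}, "validation")
```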
Describes the type of feature provided by a content descriptor.
Feature Type
A production method is an activity that generates one or more artefacts that jointly characterise data content.
Production Method
The type of a split that associates a function to the corresponding data partitions (e.g. a training set).
Split Type
Audio as a modality
Audio
Chords refer to harmonic structures found in music data.
Chord
Chroma-based features are descriptors of pitched audio signals (e.g. music).
Chroma Features
A production method based on a computational procedure
Algorithmic
Computational
A production method collecting data via crowdsourcing
Crowdsourced
A dataset split including the validation and test sets
Development Set
Emotion can be either perceived or induced from the data
Emotion
A production method relying on human analysis
Expert Human
Data is made available in its entirety (e.g. full audio tracks)
Full Content
Image as a modality
Image
A Mel Spectrogram is a descriptor, or feature, of an audio signal.
Mel Spectrogram
Mel-frequency cepstral coefficients (MFCCs) are audio features.
MFCC Features
Access to the dataset undergoes a request procedure
On Request Access
Access to the dataset is open
Open Access
Patterns express and formalise regularities found in the data.
Pattern
Data is made available in partial form (e.g. audio snippets).
Preview Content
Structural content refers to segments or sub-sequences found in sequential data. In the context of music, this may correspond to segments related to musical form (e.g. motifs, phrases, sections).
Structure
A split including test data
Test Set
Text as a modality
Text
A split including training data
Training Set
A split including validation data
Validation Set
Video as a modality
Video