Audio Feature Ontology
This ontology represents the workflow of automatic feature extraction from audio signals at different levels of abstraction. It updates the original Audio Feature Ontology (http://purl.org/ontology/af/).
Version 1.1
2017-09-14T18:01:24.565457
authors: Gyorgy Fazekas, Alo Allik

The Audio Feature Ontology is a Semantic Web ontology that is designed to serve a dual purpose:

  • to represent computational workflows of audio features
  • to provide a common structure for feature data formats using Open Linked Data principles and technologies.
The Audio Feature Ontology is based on the analysis of existing feature extraction tools and MIR literature. The ontology provides a descriptive framework for expressing different conceptualisations of the MIR domain and enables designing linked data formats for content-based audio features. Since there are potentially conflicting views on organising the features into a taxonomy in the research community, the ontology does not attempt to impose a hierarchical structure, leaving the structural organisation open to task and tool specific ontologies.

In order to access the RDF representation of the ontology from a Web browser, the files are available in Notation 3 and RDF/XML. Other types of applications are served following standard content negotiation guidelines. Here is an example of how to load the ontology in N3 syntax using the Python rdflib module:

from rdflib import Graph

Graph().parse("https://w3id.org/afo/onto/1.1#", format="n3")

The core model of the ontology retains the original attributes that distinguish audio features by temporal characteristics and data density. It relies on the Event and Timeline ontologies to provide the primary structuring concepts for feature data representation. Temporal characteristics classify feature data into either instantaneous points in time - e.g. event onsets or tonal change moments - or events with a known duration. Data density attributes describe how a feature relates to the extent of an audio file: whether it is scattered, occurring irregularly over the course of the audio signal (for example, segmentation or onset features), or calculated at regular intervals with fixed duration (e.g. signal-like features with a regular sampling rate). The above image illustrates how audio features are linked with terms in the Music Ontology and thereby with other music-related metadata on the Web. Specific named audio feature entities, such as afv:Onset, afv:Segment, and afv:MFCC, are subclasses of afo:AudioFeature, which, in turn, is a subclass of event:Event from the Event Ontology.
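The regularly sampled case can be made concrete: for a dense feature, individual timestamps need not be stored, because a frame's index alone determines its position on the signal timeline. A minimal Python sketch, with an illustrative hop size and sample rate (these values are hypothetical, not mandated by the ontology):

```python
# Hypothetical analysis settings for a dense, regularly sampled feature
# (e.g. MFCC frames); the ontology itself does not fix these values.
sample_rate = 44100   # Hz
step_size = 512       # samples between successive analysis frames

def frame_time(i: int) -> float:
    """Position of frame i on the signal timeline, in seconds."""
    return i * step_size / sample_rate

# Times of the first four frames: this mapping is what a tl:TimelineMap
# captures declaratively, instead of storing one timestamp per frame.
print([round(frame_time(i), 4) for i in range(4)])
# [0.0, 0.0116, 0.0232, 0.0348]
```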

Instantaneous events on a signal timeline, such as onsets, can be represented by linking the audio feature via the event:time property to a tl:Instant, which is placed on the audio signal timeline using the tl:at property. Audio features that extend over a segment of the signal can be represented as tl:Interval instances. Dense signal-like features, such as chromagrams or Mel Frequency Cepstral Coefficients (MFCC), can be mapped to the signal timeline by tl:TimelineMap objects. Here is an example in Turtle syntax showing Onset and MFCC features:

    <file:///home/snd/moodplay/62400-14.01.mp3> a mo:AudioFile ;
        mo:encodes :signal_f6261475 .

    :signal_f6261475 a mo:Signal ;
        mo:time [
            a tl:Interval ;
            tl:onTimeLine :timeline_aec1cb82
        ] .

    :timeline_aec1cb82 a tl:Timeline .

    :transform_onsets a vamp:Transform ;
        vamp:plugin plugbase:qm-onsetdetector ;
        vamp:output plugbase:qm-onsetdetector_output_onsets .

    :transform_mfcc a vamp:Transform ;
        vamp:plugin plugbase:qm-mfcc ;
        vamp:output plugbase:qm-mfcc_output_coefficients .

    :event_1 a afv:Onset ;
        event:time [
            a tl:Instant ;
            tl:onTimeLine :timeline_aec1cb82 ;
            tl:at "PT1.98S"^^xsd:duration ;
        ] ;
        vamp:computed_by :transform_onsets .

    :feature_1 a afv:MFCC ;
        mo:time [
            a tl:Interval ;
            tl:onTimeLine :timeline_aec1cb82 ;
        ] ;
        vamp:computed_by :transform_mfcc ;
        af:value ( -26.9344 0.188319 0.106938 ..) .
    

Beyond representing audio feature data in research workflows, there are many other practical applications for the ontology framework. One of the test cases is providing data services for an adaptive music player that uses audio features to enrich the user experience and enable novel ways to search and browse large music collections. Feature data for the music tracks available in the player is stored in a CouchDB instance in JSON-LD format. The data is used by Semantic Web entities called Dynamic Music Objects (dymos), which control the audio mixing functionality of the player. Dymos make song selections and determine tempo alignment for cross-fading based on the features. The following examples show JSON-LD representations of a track used in the system, linked to feature annotations. The first example shows the document that stores metadata about the track.

    {
       "@context": {
           "foaf": "http://xmlns.com/foaf/0.1/",
           "afo": "https://w3id.org/afo/onto/1.1#",
           "mo": "http://purl.org/ontology/mo/",
           "dc": "http://purl.org/dc/elements/1.1/",
           "tl": "http://purl.org/NET/c4dm/timeline.owl#",
           "vamp": "http://purl.org/ontology/vamp/",
           "afv": "https://w3id.org/afo/vocab/1.1#"
       },
       "@type": "mo:Track",
       "@id": "baf169e8af365c243f08794c7e48b591",
       "mo:available_as": "254087-16.01.wav",
       "mo:artist": {
           "foaf:name": "Dazz Band",
           "@type": "mo:MusicArtist"
       },
       "mo:musicbrainz": "http://musicbrainz.org/recording/ee498e4d-1940-4268-8a23-c3992dfdedef",
       "mo:musicbrainz_guid": "ee498e4d-1940-4268-8a23-c3992dfdedef",
       "dc:title": "Let It Whip",
       "release": { "@type": "mo:Release", "dc:title": "Soul Train Volume 2" },
       "mo:encodes": {
           "@type": "afo:Signal",
           "@id": "ed1c11b3-5830-4b9e-a84f-2e60ddcb3ff4",
           "mo:encoding": "WAV",
           "mo:sampleRate": 44100,
           "mo:time": {
               "tl:timeline": {
                   "@id": "14e80f2a-4b6f-4c1f-8ff4-360b2990dd53",
                   "@type": "tl:Timeline"
               },
               "@type": "tl:Interval",
               "tl:duration": "PT246.533333333S"
           }
       }
    }
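The signal timeline GUID nested inside mo:encodes is the hook that feature documents refer back to. A minimal sketch of retrieving it, using an abridged copy of the track document above (only the fields needed here are reproduced):

```python
import json

# Abridged copy of the track document above; in the player these documents
# are fetched from the CouchDB instance.
track_doc = json.loads("""
{
  "@type": "mo:Track",
  "dc:title": "Let It Whip",
  "mo:encodes": {
    "@type": "afo:Signal",
    "mo:time": {
      "@type": "tl:Interval",
      "tl:timeline": {
        "@id": "14e80f2a-4b6f-4c1f-8ff4-360b2990dd53",
        "@type": "tl:Timeline"
      }
    }
  }
}
""")

# Walk down to the timeline identifier that feature documents point at.
timeline_id = track_doc["mo:encodes"]["mo:time"]["tl:timeline"]["@id"]
print(timeline_id)  # 14e80f2a-4b6f-4c1f-8ff4-360b2990dd53
```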
    
The track is linked to the feature data by assigning a GUID to the signal timeline, on which features can then be placed.
      {
         "@context": {
             "afo": "https://w3id.org/afo/onto/1.1#",
             "mo": "http://purl.org/ontology/mo/",
             "dc": "http://purl.org/dc/elements/1.1/",
             "tl": "http://purl.org/NET/c4dm/timeline.owl#",
             "vamp": "http://purl.org/ontology/vamp/",
             "afv": "https://w3id.org/afo/vocab/1.1#"
         },
         "@type": "afv:Key",
         "afo:input": { "@type": "mo:Signal", "@id": "ed1c11b3-5830-4b9e-a84f-2e60ddcb3ff4" },
         "afo:computed_by": {
             "vamp:block_size": 32768,
             "vamp:sample_rate": 44100,
             "afo:implemented_in": {
                 "@type": "afo:FeatureExtractor",
                 "afo:version": "1.3",
                 "dc:name": "Sonic Annotator"
             },
             "vamp:parameter_binding": [
                 {
                     "vamp:parameter": {
                         "vamp:identifier": "length"
                     },
                     "vamp:value": 10
                 },
                 {
                     "vamp:parameter": {
                         "vamp:identifier": "tuning"
                     },
                     "vamp:value": 440
                 }
             ],
             "vamp:output": "http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-keydetector_output_key",
             "vamp:step_size": 32768,
             "vamp:plugin": "http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-keydetector",
             "@type": "vamp:Transform",
             "vamp:plugin_version": "4"
         },
         "afo:values": [
             {
                 "afo:value": 19,
                 "tl:timeline": "14e80f2a-4b6f-4c1f-8ff4-360b2990dd53",
                 "tl:at": 0,
                 "@type": "tl:Instant",
                 "rdfs:label": "F# minor"
             },
             {
                 "afo:value": 2,
                 "tl:timeline": "14e80f2a-4b6f-4c1f-8ff4-360b2990dd53",
                 "tl:at": 2.229115646,
                 "@type": "tl:Instant",
                 "rdfs:label": "Db major"
             },
             {
                 "afo:value": 1,
                 "tl:timeline": "14e80f2a-4b6f-4c1f-8ff4-360b2990dd53",
                 "tl:at": 28.978503401,
                 "@type": "tl:Instant",
                 "rdfs:label": "C major"
             },
             {
                 "afo:value": 2,
                 "tl:timeline": "14e80f2a-4b6f-4c1f-8ff4-360b2990dd53",
                 "tl:at": 37.151927437,
                 "@type": "tl:Instant",
                 "rdfs:label": "Db major"
             }
         ]
      }
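An application such as the music player can read the afo:values array directly: each entry is a tl:Instant, so the detected key holds from its tl:at time until the next change point. A minimal sketch over an abridged copy of the document above (only the fields used here are reproduced):

```python
import json

# Abridged copy of the key-detection document above; in the player these
# documents come from the CouchDB instance.
feature_doc = json.loads("""
{
  "@type": "afv:Key",
  "afo:values": [
    {"afo:value": 19, "tl:at": 0, "rdfs:label": "F# minor"},
    {"afo:value": 2, "tl:at": 2.229115646, "rdfs:label": "Db major"},
    {"afo:value": 1, "tl:at": 28.978503401, "rdfs:label": "C major"},
    {"afo:value": 2, "tl:at": 37.151927437, "rdfs:label": "Db major"}
  ]
}
""")

# Each entry marks the instant a new key begins on the signal timeline.
changes = [(p["tl:at"], p["rdfs:label"]) for p in feature_doc["afo:values"]]
for at, label in changes:
    print(f"{at:>12.3f}s  {label}")
```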
    

The ontology engineering process has produced example ontologies for existing tools, including MIR Toolbox, Essentia, and Marsyas.

The Audio Feature Ontology is being developed at the Centre for Digital Music, Queen Mary University of London as part of the Fusing Audio and Semantic Technologies (FAST) project.
Contact: Alo Allik
The source code is available here: https://code.soundsoftware.ac.uk/projects/af-ontology.
WebVOWL visualisation of the ontology: Audio Feature Ontology.

This work is licensed under a Creative Commons Attribution 4.0 International License.

DOI: 10.5281/zenodo.55564

Prefixes
  • @prefix xml: <http://www.w3.org/XML/1998/namespace> .
  • @prefix owl: <http://www.w3.org/2002/07/owl#> .
  • @prefix afo: <https://w3id.org/afo/onto/1.1#> .
  • @prefix co: <http://purl.org/co/> .
  • @prefix bibo: <http://purl.org/ontology/bibo/> .
  • @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  • @prefix prov: <http://www.w3.org/ns/prov#> .
  • @prefix mo: <http://purl.org/ontology/mo/> .
  • @prefix dc: <http://purl.org/dc/elements/1.1/> .
  • @prefix qudt: <http://qudt.org/1.1/schema/qudt#> .
  • @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
  • @prefix tl: <http://purl.org/NET/c4dm/timeline.owl#> .
  • @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
  • @prefix time: <http://www.w3.org/2006/time#> .
  • @prefix dcterms: <http://purl.org/dc/terms/> .
  • @prefix event: <http://purl.org/NET/c4dm/event.owl#> .
  • @prefix unit: <http://qudt.org/1.1/vocab/unit#> .
Terms
Aggregation (Class) AudioFeature (Class) Context (Class) FeatureExtractor (Class) Filter (Class) FirstOperation (Class) Identifier (Class) Instance (Class) LastOperation (Class) Model (Class) Operation (Class) OperationSequence (Class) OptionalOperation (Class) Parameter (Class) Point (Class) Segment (Class) Signal (Class) Transformation (Class) agent (Property) collection (Property) computed_by (Property) context (Property) described_in (Property) dimensions (Property) feature (Property) first (Property) first_operation (Property) implementation (Property) implementation_of (Property) implemented_in (Property) implements (Property) model (Property) model_of (Property) next (Property) next_operation (Property) operation (Property) origin (Property) output (Property) parameter (Property) sequence (Property) signal_feature (Property) type (Property) unit (Property) value (Property) values (Property) confidence (Individual) data_type (Individual) default_value (Individual) description (Individual) maximum_value (Individual) minimum_value (Individual) name (Individual) probability (Individual) value (Individual)