An ontology for describing software and its links to inputs, outputs and variables. The ontology extends the schema.org and codemeta vocabularies
Hernan Vargas
Maximiliano Osorio
September 29th, 2020
Daniel Garijo
Varun Ratnakar
Yolanda Gil
Deborah Khider
The Software Description Ontology
sd
https://w3id.org/okn/o/sd/1.8.0
1.9.0
Parameter that can be adjusted in a configuration setup
adjustable parameter
Property that links a parameter with the variable it adjusts. This property can be used when parameters quantify variables without directly representing them. For example, a "fertilizer percentage adjustment" parameter can quantify a "fertilizer price" variable
adjusts variable
The creator of a software component
author
Property that links a software component to other useful software that can be used to visualize its outputs
compatible visualization software
Contributor to a software component
contributor
Copyright holder for a software component
copyright holder
Link to the organization funding a software component
funding source
Property to identify the original source of the information of the annotated resource. It could be a web page, an organization, a person, some experiment notes, etc.
had primary source
Property that links a model to one of its configurations. A model may have multiple configurations, each of which is unique in terms of the inputs and outputs it uses.
has configuration
Constraint or rule associated to a variable or software configuration. For example: "This model accepts only monthly data", or "all inputs of this model configuration must share the same location". More structured restrictions, such as Jena rules or SWRL rules may also be captured with this property
has constraint
Contact person responsible for a software component
has contact person
Property that associates an input/output with its corresponding data transformation.
has data transformation
Property to link an input/output dataset to the specific data transformation (with URLs
has data transformation setup
Relates a dataset specification to the data structure definition
has file structure
Property that links a parameter or an input to a fixed value. For example, in a given configuration a parameter with the planting date for a model could be fixed to avoid the user changing it for that region.
has fixed resource
Property that links a software project to its funding information
has funding information
Property that links a model configuration to the input types expected by it.
has input
Property that expresses the outputs of a model
has output
Property that indicates the parameters of a model configuration
has parameter
Property designed to reference the elements included in a sample collection.
has part
Property that links an instance of a dataset (or a dataset specification) to the presentation of a variable contained (or expected to be contained) on it.
has presentation
Property pointing to a sample execution of a software configuration
has sample execution
Property designed to link a software configuration to a sample resource resulting from its execution
has sample result
A typical sample visualization of the software outputs
has sample visualization
Property used to define configurations with some fixed resources and values. The rationale of this property is to allow predefined configurations
has setup
Property that links a function with its corresponding container
has software image
Property designed to link a software with its software source code (which may reside in a code repository such as GitHub)
has source code
The standard name of a variable
has standard variable
Property that links a rule and the variable that will test it
has variable
Property designed to link a software component with its corresponding versions
has software version
Property that links a dataset specification from a model configuration or setup to the output from a target data transformation. This occurs when a data transformation produces several outputs, but only one of them is the one needed for a model
is transformed from
Property that links to the image used as logo for a software component
logo
Associates a presentation with a dataset where the presentation occurs
part of dataset
Publisher organization or person responsible for a software component
publisher
Image illustrating a snapshot of the target software
screenshot
Property that indicates that a software component (or any of its outputs) can be used to calculate a particular index. The rationale for this property is that indices are usually calculated by applying post-processing steps to the outputs of a software component.
useful for calculating index
Property used to link a variable presentation or time interval to the unit they are represented in
uses unit
Property that links a setup to a previous version of that setup. This property is needed (for example) when creating snapshots of setups.
was derived from setup
Property that identifies the software used to create a visualization
was derived from software
Property that indicates in which registry the software image being described can be found. For example, https://hub.docker.com
available in registry
How to cite this software
citation
URL to the code repository of a software component
code repository
Year in which the software component was copyrighted
copyright year
An identifier for resources with metadata entries in a data catalog
data catalog identifier
Date when a software component was created
date created
Date when a software component was published
date published
A description of a resource
description
Digital Object Identifier associated with a software component
doi
Email of a person
email
Grant number used for funding
funding grant
Property that constrains which values are accepted for a parameter. For example, the name of a crop can only be "Maize" or "Sorghum"
has accepted values
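A minimal sketch of how a client might enforce accepted values. The dictionary layout and the function name `is_accepted` are hypothetical and only mirror the labels used in this document; they are not part of the ontology.

```python
# Hypothetical in-memory view of a parameter description. The keys reuse the
# property labels from this document; they are not an official API.
crop_parameter = {
    "label": "crop name",
    "has accepted values": ["Maize", "Sorghum"],
}


def is_accepted(parameter, value):
    """Return True if the value is allowed, or if no accepted values are declared."""
    accepted = parameter.get("has accepted values")
    return accepted is None or value in accepted


print(is_accepted(crop_parameter, "Maize"))  # True
print(is_accepted(crop_parameter, "Wheat"))  # False
```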
String with the people, organizations and other contributors acknowledged by the authors.
has acknowledgements
Assumptions of a software, e.g. the solver being used for a particular model, the source of the data (e.g., all data must have a given resolution), etc.
has assumption
A file (e.g., Dockerfile) with executable instructions indicating how a Software Image or a Software component is built
has build file
Property linking the software component to the code of conduct to be followed by potential contributors. The range of this property may be a string or a URI to the target file.
has code of conduct
Location of the aggregation of all the files needed to execute the component. Usually a zip file including the run script and support scripts, including specification files
has component location
Property that indicates the data type of a parameter
has data type
Default accepted value of a variable presentation (or a parameter)
has default value
Property to indicate dimensionality of the input or output of a dataset specification
has dimensionality
Pointer to the documentation of the model
has documentation
Instructions needed to download a software component. The difference with `hasDownloadURL` is that this property captures the human readable instructions required to download software. For example, sometimes an authentication is needed, users need to fill in a form, etc.
has download instructions
Download URL where to obtain the source/executable of the software
has download URL
An example explaining a scenario where the software component was used in plain language.
has example
Instructions that indicate how a software component should be executed. The difference with `hasExecutionCommand` is that the execution instructions aim to be human-readable, and have explanations between the different commands and instructions
has executable instructions
Property that links a software component with an executable notebook (e.g., Jupyter notebook) that illustrates how to use it in an executable manner.
has executable notebook
Execution instructions on how to run the image
has execution command
Frequently asked questions about a software
has FAQ
Value of a parameter in a software setup.
has fixed value
Format followed by a file. For example, txt, nc, etc.
has format
Property that points to the main runnable script for the current function
has implementation script location
Instructions required to install this particular piece of software. Installation instructions usually are available in a human-readable manner.
has installation instructions
Property that relates the variable representation to its long name. The long name is useful for context (e.g., precipitation is less ambiguous than P) but not as precise as the standard name.
has long name
Maximum accepted value of a variable presentation (or a parameter)
has maximum accepted value
Minimum accepted value of a variable presentation (or a parameter)
has minimum accepted value
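The default, minimum and maximum accepted values can be combined into a simple range check. The sketch below assumes that both bounds are inclusive, which this document does not state; the parameter and its numbers are made up for illustration.

```python
def within_range(value, minimum=None, maximum=None):
    """Check a value against optional bounds (assumed inclusive here)."""
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


# Hypothetical parameter description using the labels from this document.
parameter = {
    "label": "fertilizer percentage adjustment",
    "has default value": 10,
    "has minimum accepted value": 0,
    "has maximum accepted value": 100,
}

low = parameter["has minimum accepted value"]
high = parameter["has maximum accepted value"]
print(within_range(parameter["has default value"], low, high))  # True
print(within_range(150, low, high))                             # False
```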
Objective or main functionality that can be achieved by running this software
has purpose
Rule that defines this constraint
has rule
A short name (e.g., temperature) capturing the high-level concept of the variable
has short name
Property that determines the increments (step size) that are commonly used to vary a parameter. This is commonly used for automatically setting up software tests. For example, if I want to set up a model and try 30 reasonable values on a parameter, I may use the default value and the step size to create the appropriate increments. If the step size is 0.1 and the default value is 0, then I will be able to create the setups 0, 0.1, 0.2, ..., 2.9
has step size
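The increments described for the step size can be sketched as follows. The function name `candidate_values` is hypothetical; `Decimal` is used only to avoid floating point drift when adding 0.1 repeatedly. Thirty values starting at a default of 0 with a step of 0.1 run from 0 to 2.9.

```python
from decimal import Decimal


def candidate_values(default, step, count):
    """Build `count` values starting at the default value, spaced by the step size."""
    start = Decimal(str(default))
    increment = Decimal(str(step))
    return [float(start + i * increment) for i in range(count)]


values = candidate_values(default=0, step=0.1, count=30)
print(values[:3], values[-1], len(values))  # [0.0, 0.1, 0.2] 2.9 30
```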
Property that links to the location of scripts that may be used from the main runnable script.
has support script location
Typical data sources that are used by a software component
has typical data source
Property that describes the usage considerations of a particular software. These notes capture the rationale for that software configuration, along with an explanation for sample inputs, things to consider when running the model with data, etc.
has usage notes
Identifier of the version of this software
has version id
Identifier of the resource being described
identifier
Pointer to the issue tracker of a software component
issue tracker
Keywords associated with a software component
keywords
License of a software component or its source code
license
Memory requirements of a software
memory requirements
Name of the resource
name
Operating systems under which a software component can operate
operating systems
Property that indicates the relative path of an input or output with respect to the folder structure of the executable.
For example, let's assume we have an input that has to exist in the folder `/datasets` or the executable will not work. This property ensures that this knowledge is captured for a given software component execution.
In this case the property would capture this as follows:
```
:input_precip a sd:DatasetSpecification .
:input_precip rdfs:label "precipitation file" .
:input_precip sd:pathLocation "/datasets/" .
```
path location
Position of the parameter or input/output in the model configuration. This property is needed to know how to organize the I/O of the component on execution
position
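One way an execution system could use the position is to order the arguments of an invocation. This is a sketch only: it assumes positions are integers, and the items and file names are invented for illustration.

```python
def ordered_arguments(items):
    """Return the argument values sorted by their declared position."""
    ranked = sorted(items, key=lambda item: item["position"])
    return [item["value"] for item in ranked]


# Hypothetical inputs, outputs and parameters of one configuration.
items = [
    {"label": "output file", "position": 3, "value": "out.csv"},
    {"label": "precipitation file", "position": 1, "value": "precip.txt"},
    {"label": "planting date", "position": 2, "value": "2020-05-01"},
]

print(ordered_arguments(items))  # ['precip.txt', '2020-05-01', 'out.csv']
```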
Processor requirements of a software component
processor requirements
Language used to code a software component
programming language
URL to the readme file of a software component
readme
Value that represents how a parameter should be incremented on each iteration of a software component execution. This value is important when preparing execution ensembles automatically, e.g., simulating crop production varying the parameter "fertilizer amount" in increments of 10%.
recommended increment
Main publication to cite for this software component
reference publication
A summarized description of the resource
short description
Software requirements needed to install a software component
software requirements
Data property to indicate the status of a configuration setup. For example, to indicate that a setup has been executed in a platform, that the setup should not be shown to users (it's an auxiliary setup), etc.
status
Property to link details, such as mailing lists in case a contact person is not provided
support details
Tag used to annotate a version or a software configuration. This annotation is useful to show which version is the latest, or which version is deprecated. Supported tags are: "latest", "deprecated"
tag
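A small sketch of how the two supported tags could be used to select versions. The version identifiers and the function name `versions_with_tag` are hypothetical.

```python
SUPPORTED_TAGS = {"latest", "deprecated"}


def versions_with_tag(versions, tag):
    """Return the version ids that carry the given supported tag."""
    if tag not in SUPPORTED_TAGS:
        raise ValueError(f"unsupported tag: {tag!r}")
    return [v["has version id"] for v in versions if v.get("tag") == tag]


# Hypothetical versions of one software component.
versions = [
    {"has version id": "1.0", "tag": "deprecated"},
    {"has version id": "2.0", "tag": "latest"},
    {"has version id": "1.5"},
]

print(versions_with_tag(versions, "latest"))      # ['2.0']
print(versions_with_tag(versions, "deprecated"))  # ['1.0']
```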
Value associated to the described entity
value
Website of the software
website
Class to identify that a parameter is a catalog identifier. The rationale for this type of parameter is that in some cases datasets may be downloaded in the software component itself, rather than exposed as an input
Catalog identifier
Special type of configuration in which some of the inputs or parameters are associated to files or values. A configuration may be associated to multiple setups to facilitate its execution.
Configuration Setup
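The relationship between a setup and user-provided values can be sketched as a merge in which fixed values cannot be overridden. The function name `resolve_parameters` and the sample values are hypothetical; rejecting an override is one possible policy, not something the ontology prescribes.

```python
def resolve_parameters(setup_fixed, user_values):
    """Merge fixed setup values with user values; fixed values cannot be overridden."""
    clashes = sorted(set(setup_fixed) & set(user_values))
    if clashes:
        raise ValueError(f"fixed in this setup: {', '.join(clashes)}")
    return {**user_values, **setup_fixed}


# Hypothetical setup that fixes the planting date for a region.
fixed = {"planting date": "2020-05-01"}
resolved = resolve_parameters(fixed, {"fertilizer amount": 10})
print(resolved)  # {'fertilizer amount': 10, 'planting date': '2020-05-01'}
```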
Data constraints of a configuration
Constraint
Class that represents a software for performing data transformation.
Data Transformation
Special type of data transformation where the inputs and parameters have some pre-selected values. For example, they may point to a particular dataset URL to be used in the transformation
Data Transformation Setup
Class designed to describe a type of input or output used or produced by a model. For example, Topoflow has several inputs. One of them is a text file with precipitation values. The representation of this input is an instance of a dataset specification.
Dataset Specification
A class to represent the funding information of a software project
Funding Information
An image (e.g. tiff file) is a type of dataset specification used to define certain inputs of models like soil, crops, etc.
Image
A number (such as a ratio) derived from a series of observations and used as an indicator or measure (https://www.merriam-webster.com/dictionary/index)
Numerical Index
An organized body of people with a particular purpose
Organization
A parameter of the model.
Parameter
A human being (individual)
Person
A collection of resources that are used as samples for running a software component multiple times
Sample Collection
A sample execution of a given software
Sample Execution
A sample resource associated with a software
Sample Resource
The set of instructions that tell a machine how to work. In this ontology software is a general concept which acts as a superclass for software versions, docker images, data transformations, etc.
Software
A software configuration represents a particular way of invoking a function of a software component. A software configuration exposes the precise inputs and outputs that are used for that function. Multiple software configurations may be associated to a software component. A software configuration facilitates the encapsulation of software, and it can be used to represent components of scientific workflows.
Software Configuration
An image that virtualizes the functionality of a given software. For example, a Docker container.
Software Image
A software version is a specific type of software that represents a particular set of functionalities. New functionalities and error fixes may occur between software versions
Software Version
Class representing the characteristics of the code associated with a software component
Source Code
A standard variable, necessary to refer to all variables using the same nomenclature in a domain ontology. For example, a standard variable may be a SVO variable (http://www.geoscienceontology.org/geo-upper#Variable)
Standard Variable
Class designed to distinguish the different types of units that are available in variables from datasets or parameters
Unit
A symbol that represents a quantity in a dataset or dataset specification
Variable
Concept used to represent an instantiation of a variable in an input/output dataset. For example, a model A may use an input file with temperature expressed in Fahrenheit (variablePresentation1), while a model B may produce an output with temperature in Celsius (variablePresentation2). Both variable presentations refer to the concept of temperature.
Variable presentation
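The temperature example can be sketched in code: two presentations share one standard variable but use different units, so a consumer has to convert before comparing values. The dictionary layout and the helper `to_celsius` are hypothetical.

```python
def to_celsius(value, unit):
    """Convert a temperature to Celsius from one of the two units in the example."""
    if unit == "Celsius":
        return float(value)
    if unit == "Fahrenheit":
        return (value - 32) * 5 / 9
    raise ValueError(f"unknown unit: {unit!r}")


# Hypothetical presentations; the keys reuse labels from this document.
variable_presentation_1 = {"has standard variable": "temperature", "uses unit": "Fahrenheit"}
variable_presentation_2 = {"has standard variable": "temperature", "uses unit": "Celsius"}

same_concept = (
    variable_presentation_1["has standard variable"]
    == variable_presentation_2["has standard variable"]
)
print(same_concept)                                          # True
print(to_celsius(212, variable_presentation_1["uses unit"]))  # 100.0
```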
Class to represent any type of visualization related to a software. For example, a dynamic HTML page, a video, etc.
Visualization