This document describes how Ocean assets follow the DID/DDO specification, such that Ocean assets can inherit DID/DDO benefits and enhance interoperability. DIDs and DDOs follow the [specification defined by the World Wide Web Consortium (W3C)](https://w3c-ccg.github.io/did-spec/).
Decentralized identifiers (DIDs) are a type of identifier that enable verifiable, decentralized digital identity. Each DID is associated with a unique entity, and DIDs may represent humans, objects, and more.
An _asset_ in Ocean represents a downloadable file, compute service, or similar. Each asset is a _resource_ under the control of a _publisher_. The Ocean network itself does _not_ store the actual resource (e.g. files).
An _asset_ has a DID and DDO. The DDO should include [metadata](did-ddo.md#metadata) about the asset, and define access in at least one [service](did-ddo.md#services). Only _owners_ or _delegated users_ can modify the DDO.
All DDOs are stored on-chain in encrypted form to be fully GDPR-compatible. A metadata cache like _Aquarius_ can help in reading, decrypting, and searching through encrypted DDO data from the chain. Because the file URLs are encrypted on top of the full DDO encryption, returning unencrypted DDOs e.g. via an API is safe to do as the file URLs will still stay encrypted.
The DDO is stored on-chain as part of the NFT contract and stored in encrypted form using the private key of the _Provider_. To resolve it, a metadata cache like _Aquarius_ must query the provider to decrypt the DDO.
| **`@context`** | Array of `string` | Contexts used for validation. |
| **`id`** | `string` | Computed as `sha256(address of ERC721 contract + chainId)`. |
| **`version`** | `string` | Version information in [SemVer](https://semver.org) notation referring to this DDO spec version, like `4.1.0`. |
| **`chainId`** | `number` | Stores chainId of the network the DDO was published to. |
| **`nftAddress`** | `string` | NFT contract linked to this asset |
| **`metadata`** | [Metadata](did-ddo.md#metadata) | Stores an object describing the asset. |
| **`services`** | [Services](did-ddo.md#services) | Stores an array of services defining access to the asset. |
| **`credentials`** | [Credentials](did-ddo.md#credentials) | Describes the credentials needed to access a dataset in addition to the `services` definition. |
| **`created`** | `ISO date/time string` | | Contains the date of the creation of the dataset content in ISO 8601 format preferably with timezone designators, e.g. `2000-10-31T01:30:00Z`. |
| **`updated`** | `ISO date/time string` | | Contains the date of last update of the dataset content in ISO 8601 format preferably with timezone designators, e.g. `2000-10-31T01:30:00Z`. |
| **`description`** | `string` | **✓** | Details of what the resource is. For a dataset, this attribute explains what the data represents and what it can be used for. |
| **`copyrightHolder`** | `string` | | The party holding the legal copyright. Empty by default. |
| **`name`** | `string` | **✓** | Descriptive name or title of the asset. |
| **`type`** | `string` | **✓** | Asset type. Includes `"dataset"` (e.g. csv file), `"algorithm"` (e.g. Python script). Each type needs a different subset of metadata attributes. |
| **`author`** | `string` | **✓** | Name of the entity generating this data (e.g. Tfl, Disney Corp, etc.). |
| **`license`** | `string` | **✓** | Short name referencing the license of the asset (e.g. Public Domain, CC-0, CC-BY, No License Specified, etc. ). If it's not specified, the following value will be added: "No License Specified". |
| **`links`** | Array of `string` | | Mapping of URL strings for data samples, or links to find out more information. Links may be to either a URL or another asset. |
| **`contentLanguage`** | `string` | | The language of the content. Use one of the language codes from the [IETF BCP 47 standard](https://tools.ietf.org/html/bcp47) |
| **`tags`** | Array of `string` | | Array of keywords or tags used to describe this content. Empty by default. |
| **`categories`** | Array of `string` | | Array of categories associated to the asset. Note: recommended to use `tags` instead of this. |
| **`additionalInformation`** | Object | | Stores additional information, this is customizable by publisher |
| **`algorithm`** | [Algorithm Metadata](did-ddo.md#algorithm-metadata) | **✓** (for algorithm assets only) | Information about asset of `type``algorithm` |
An asset of type `algorithm` has additional attributes under `metadata.algorithm`, describing the algorithm and the Docker environment it is supposed to be run under.
| **`language`** | `string` | | Language used to implement the software. |
| **`version`** | `string` | | Version of the software preferably in [SemVer](https://semver.org) notation. E.g. `1.0.0`. |
| **`consumerParameters`** | [Consumer Parameters](did-ddo.md#consumer-parameters) | | An object that defines required consumer input before running the algorithm |
| **`container`** | `container` | **✓** | Object describing the Docker container image. See below |
| **`timeout`** | `number` | **✓** | Describing how long the service can be used after consumption is initiated. A timeout of `0` represents no time limit. Expressed in seconds. |
| **`compute`** | [Compute](did-ddo.md#compute-options) | **✓** (for compute assets only) | If service is of `type``compute`, holds information about the compute-related privacy settings & resources. |
| **`consumerParameters`** | [Consumer Parameters](did-ddo.md#consumer-parameters) | | An object the defines required consumer input before consuming the asset |
| **`additionalInformation`** | Object | | Stores additional information, this is customizable by publisher |
During the publish process, file URLs must be encrypted with a respective _Provider_ API call before storing the DDO on-chain. For this, you need to send the following object to Provider:
To get information about the files after encryption, the `/fileinfo` endpoint of _Provider_ returns based on a passed DID an array of file metadata (based on the file type):
This only concerns metadata about a file, but never the file URLs. The only way to decrypt them is to exchange at least 1 datatoken based on the respective service pricing scheme.
An asset with a service of `type``compute` has the following additional attributes under the `compute` object. This object is required if the asset is of `type``compute`, but can be omitted for `type` of `access`.
| `allowRawAlgorithm` | `boolean` | **✓** | If `true`, any passed raw text will be allowed to run. Useful for an algorithm drag & drop use case, but increases risk of data escape through malicious user input. Should be `false` by default in all implementations. |
| `allowNetworkAccess` | `boolean` | **✓** | If `true`, the algorithm job will have network access. |
| `publisherTrustedAlgorithmPublishers` | Array of `string` | **✓** | If not defined, then any published algorithm is allowed. If empty array, then no algorithm is allowed. If not empty any algo published by the defined publishers is allowed. |
| `publisherTrustedAlgorithms` | Array of `publisherTrustedAlgorithms` | **✓** | If not defined, then any published algorithm is allowed. If empty array, then no algorithm is allowed. Otherwise only the algorithms defined in the array are allowed. (see below). |
To produce `filesChecksum`, call the Provider FileInfoEndpoint with parameter withChecksum = True. If algorithm has multiple files, `filesChecksum` is a concatenated string of all files checksums (ie: checksumFile1+checksumFile2 , etc)
* The publisher needs to know the sampling interval before the buyer downloads it. Suppose the dataset URL is `https://example.com/mydata`. The publisher defines a field called `sampling` and asks the buyer to enter a value. This parameter is then added to the URL of the published dataset as query parameters: `https://example.com/mydata?sampling=10`.
* An algorithm that needs to know the number of iterations it should perform. In this case, the algorithm publisher defines a field called `iterations`. The buyer needs to enter a value for the `iterations` parameter. Later, this value is stored in a specific location in the Compute-to-Data pod for the algorithm to read and use it.
Algorithms will have access to a JSON file located at /data/inputs/algoCustomData.json, which contains the keys/values for input data required. Example:
By default, a consumer can access a resource if they have 1 datatoken. _Credentials_ allow the publisher to optionally specify more fine-grained permissions.
Consider a medical data use case, where only a credentialed EU researcher can legally access a given dataset. Ocean supports this as follows: a consumer can only access the resource if they have 1 datatoken _and_ one of the specified `"allow"` credentials.
This is like going to an R-rated movie, where you can only get in if you show both your movie ticket (datatoken) _and_ some identification showing you're old enough (credential).
Only credentials that can be proven are supported. This includes Ethereum public addresses, and in the future [W3C Verifiable Credentials](https://www.w3.org/TR/vc-data-model/) and more.
The checksum hash is used when publishing/updating metadata using the `setMetaData` function in the ERC721 contract, and is stored in the event generated by the ERC721 contract:
The following fields are added by _Aquarius_ in its DDO response for convenience reasons, where an asset returned by _Aquarius_ inherits the DDO fields stored on-chain.
Contains information about an asset's purgatory status defined in [`list-purgatory`](https://github.com/oceanprotocol/list-purgatory). Marketplace interfaces are encouraged to prevent certain user actions like adding liquidity on assets in purgatory.