docs/compute-to-data.md at 5a1c268448f7a919e1399c0ec721cf1e13a635e6

mirror of https://github.com/oceanprotocol/docs.git synced 2024-11-02 16:25:37 +01:00

Akshay 5a1c268448 Issue-#808: Improve C2D docs

2021-11-07 19:01:38 +01:00

2.6 KiB


title	description	slug	section
Compute-to-Data	Providing access to data in a privacy-preserving fashion	/concepts/compute-to-data/	concepts

Quick Start

Compute-to-Data example

Motivation

The most basic scenario for a Publisher is to provide access to the datasets they own or manage. Alternatively, a Publisher may offer a service that executes computations on their data without handing the data over. This has several benefits:

The data never leaves the Publisher's enclave.
It's not necessary to move the data; the algorithm is sent to the data.
Keeping a single copy of the data and not moving it makes it easier to comply with data protection regulations.

This page elaborates on the benefits.

Datasets & Algorithms

With Compute-to-Data, datasets are not allowed to leave the premises of the data holder. Only algorithms can be permitted to run on them, under certain conditions and within an isolated and secure environment. Algorithms are an asset type just like datasets. Like datasets, they can be priced through a pool or at a fixed price, and that price applies each time they are used.

Algorithms can be made public or private by setting the "attributes.main.type" value in the DDO as follows:

"access" - public. The algorithm can be downloaded, given the appropriate datatoken.
"compute" - private. The algorithm can only be used as part of a compute job and cannot be downloaded. The algorithm must be published on the same Ocean Provider as the dataset it is meant to run on.
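The mapping between the two type values and algorithm visibility can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DDO is modelled as a plain nested dictionary that follows the dotted path "attributes.main.type" literally, and the helper names `read_path` and `algorithm_visibility` are hypothetical, not part of any Ocean library.

```python
from typing import Any, Mapping

# The two values described above, mapped to the visibility they imply.
VISIBILITY_BY_TYPE = {"access": "public", "compute": "private"}


def read_path(document: Mapping[str, Any], dotted_path: str) -> Any:
    """Walk a nested mapping along a dotted path such as 'attributes.main.type'."""
    node: Any = document
    for key in dotted_path.split("."):
        if not isinstance(node, Mapping) or key not in node:
            raise KeyError(f"missing '{key}' while reading '{dotted_path}'")
        node = node[key]
    return node


def algorithm_visibility(ddo_fragment: Mapping[str, Any]) -> str:
    """Return 'public' or 'private' for a simplified, assumed DDO layout."""
    type_value = read_path(ddo_fragment, "attributes.main.type")
    if not isinstance(type_value, str) or type_value not in VISIBILITY_BY_TYPE:
        raise ValueError(f"unexpected type value: {type_value!r}")
    return VISIBILITY_BY_TYPE[type_value]


# Hypothetical, heavily simplified DDO fragment for a private algorithm.
example_fragment = {"attributes": {"main": {"type": "compute"}}}
visibility = algorithm_visibility(example_fragment)
```

Rejecting unknown values instead of defaulting to "public" keeps a typo in the DDO from accidentally exposing an algorithm for download.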

For each dataset, publishers can choose among several permission levels that determine which algorithms may run on it:

allow selected algorithms, referenced by their DID
allow all algorithms published within a network or marketplace
allow raw algorithms, for advanced use cases that bypass the algorithm asset type; this level is the most prone to data escape

All implementations should set permissions to private by default: upon publishing a compute dataset, no algorithms should be allowed to run on it. This prevents data escape through a rogue algorithm written to extract all data from a dataset.
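The three permission levels and the private-by-default rule can be expressed as a small check. The sketch below is a hypothetical model, not the Ocean implementation: the class and field names (`ComputePermissions`, `allowed_algorithm_dids`, `allow_all_published`, `allow_raw`) are assumptions made for illustration, and it assumes the caller has already verified that any DID it passes refers to a published algorithm.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ComputePermissions:
    """Hypothetical per-dataset permissions; every default denies access."""
    allowed_algorithm_dids: FrozenSet[str] = frozenset()  # selected algorithms
    allow_all_published: bool = False  # any published algorithm
    allow_raw: bool = False            # raw code that is not an asset


def is_algorithm_allowed(
    permissions: ComputePermissions, algorithm_did: Optional[str]
) -> bool:
    """Decide whether an algorithm may run on the dataset.

    algorithm_did is None for a raw algorithm, otherwise the DID of a
    published algorithm (assumed to be verified by the caller).
    """
    if algorithm_did is None:
        return permissions.allow_raw
    if algorithm_did in permissions.allowed_algorithm_dids:
        return True
    return permissions.allow_all_published


# A freshly published compute dataset: nothing is allowed to run on it.
fresh = ComputePermissions()
# The publisher later trusts one specific algorithm (placeholder DID).
trusted = ComputePermissions(
    allowed_algorithm_dids=frozenset({"did:op:example-algorithm"})
)
```

Because every field defaults to the most restrictive value, a publisher has to opt in explicitly to each level, which mirrors the recommendation above.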
