first cut

alexcos20 2020-04-29 13:27:29 +03:00
parent 9f30350fe3
commit f2b7529b60
12 changed files with 338 additions and 3 deletions

@ -0,0 +1,94 @@
---
title: Compute-to-Data
description: How Ocean Protocol enables Publishers to provide computing services and related services.
slug: /concepts/compute-to-data/
section: concepts
---
## Motivation
The most basic scenario for a Publisher is to provide access to the datasets they own or manage.
In addition to that, a Publisher could offer other data-related services.
Some possibilities are:
1. A service to execute some computation on top of their data. This has some benefits:
- The data **never** leaves the Publisher enclave.
- It's not necessary to move the data; the algorithm is sent to the data.
- Having only one copy of the data and not moving it makes it easier to be compliant with data protection regulations.
2. A service to store newly-derived datasets. As a result of the computation on existing datasets, a new dataset could be created. Publishers could offer a storage service to make use of their existing storage capabilities. This is optional; users could also download the newly-derived datasets.
## Architecture
### Enabling Publisher Services (Brizo)
The direct interaction with the infrastructure where the data resides requires the execution of a component handled by Publishers.
This component will be in charge of interacting with users and managing the basics of a Publisher's infrastructure to provide these additional services.
The business logic supporting these additional Publisher capabilities is the responsibility of this new technical component.
The key new component introduced to support these additional Publisher services is named **Brizo**.
> Brizo is an ancient Greek goddess who was known as the protector of mariners, sailors, and fishermen. She was worshipped primarily by the women of Delos, who set out food offerings in small boats. Brizo was also known as a prophet specializing in the interpretation of dreams.
In the Ocean ecosystem, Brizo is the technical component executed by the **Publishers**, which provides extended data services. Brizo, as part of the Publisher ecosystem, includes the credentials to interact with the infrastructure (initially in cloud providers, but it could be on-premise).
Because of these credentials, the execution of Brizo **SHOULD NOT** be delegated to a third party.
<repo name="brizo"></repo>
![Brizo High-Level Architecture](images/brizo-hl-arch.png)
### Compute-to-Data Environment (Operator-Service)
The Operator Service is a micro-service implementing part of the Ocean Protocol
[Compute to the Data OEP-12](https://github.com/oceanprotocol/OEPs/tree/master/12),
in charge of managing workflow execution requests.
Typically the Operator Service is integrated from the [Brizo proxy](https://github.com/oceanprotocol/brizo),
but it can also be called independently of it.
The Operator Service is in charge of establishing the communication with the K8s cluster, allowing it to:
* Register workflows as K8s objects
* List the workflows registered in K8s
* Stop a running workflow execution
* Get information about the state of execution of a workflow
The Operator Service doesn't provide any storage capability; all the state is stored directly in the K8s cluster.
<repo name="operator-service"></repo>
### Responsibilities
The main responsibilities are:
* Expose an HTTP API allowing for the execution of data access and compute endpoints.
* Authorize the user on-chain using the proper Service Agreement. That is, validate that the user requesting the service is allowed to use that service.
* Interact with the infrastructure (cloud/on-premise) using the Publisher's credentials.
* Start/stop/execute computing instances with the algorithms provided by users.
* Retrieve the logs generated during executions.
* Register newly-derived assets arising from the executions as new Ocean assets, if required by the consumer.
### Flow
![Sequence Diagram for computing services](images/4_Starting_New_Compute_Job.png)
In the above diagram you can see the initial integration supported. It involves the following components/actors:
* Data Scientists/Consumers - The end users who need to use computing services offered by the Publisher of the data.
* Ocean Keeper - In charge of enforcing the Service Agreement by tracing conditions.
* Operator-Service - Micro-service that handles the compute requests.
* Operator-Engine - The computing systems where the compute will be executed.
Before the flow can begin, the following pre-conditions must be met:
* The Asset DDO has a compute service.
* The Asset DDO must specify the Brizo endpoint exposed by the Publisher.
* The Service Agreement template must already be predefined and whitelisted on-chain.
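The first two pre-conditions can be checked on the client before any request is sent. The following is a minimal sketch, not part of any Ocean library: the helper name `checkComputePreconditions` is hypothetical, and it assumes the DDO exposes a `service` array whose entries carry `type` and `serviceEndpoint` fields. The DID and URL in the example are made up.

```javascript
// Hypothetical helper: checks that a DDO has a compute service and that the
// service names an endpoint. Assumes `ddo.service` is an array of objects
// with `type` and `serviceEndpoint` fields; adjust to your DDO version.
function checkComputePreconditions(ddo) {
  const services = Array.isArray(ddo && ddo.service) ? ddo.service : []
  const compute = services.find((s) => s.type === 'compute')

  if (!compute) {
    return { ok: false, reason: 'DDO has no compute service' }
  }
  if (typeof compute.serviceEndpoint !== 'string' || compute.serviceEndpoint === '') {
    return { ok: false, reason: 'compute service has no Brizo endpoint' }
  }
  return { ok: true, endpoint: compute.serviceEndpoint }
}

// Example DDO fragment; the DID and URL are invented for illustration.
const exampleDdo = {
  id: 'did:op:0000',
  service: [{ type: 'compute', serviceEndpoint: 'https://brizo.example.com' }]
}

console.log(checkComputePreconditions(exampleDdo))
```

The third pre-condition (the whitelisted Service Agreement template) lives on-chain and is not covered by this sketch.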

Binary file not shown. Size: 40 KiB

Binary file not shown. Size: 107 KiB

@ -0,0 +1,49 @@
---
title: Compute using a published algorithm on a Data Set
description: Compute using a published algorithm on a Data Set
---
## Requirements
This is a continuation of the [React App Setup](/tutorials/react-setup/) tutorial, so make sure you have completed all the steps described there.
1. [React App Setup](/tutorials/react-setup/)
Open `src/index.js` from your `marketplace/` folder.
## Define Compute Output
First, let's define some options for our upcoming job:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 163-182 GITHUB-EMBED
and use them:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 61-70 GITHUB-EMBED
## Order the dataset
Next, we have to order the dataset that we are going to run the computation on. We are going to use the `ddoAssetId`, which was set when the asset was published.
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 73 GITHUB-EMBED
## Start the compute job
And finally, start the job:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 76-82 GITHUB-EMBED
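For orientation, here is a minimal sketch of the order-then-start sequence outside of React. It is not the tutorial's code: the function name `startComputeJob` is hypothetical, and the shapes of `ocean.compute.order` and `ocean.compute.start` (argument order and return values) are assumptions, so check them against the squid-js version you use. A stand-in `fakeOcean` object is used so that the sketch runs on its own.

```javascript
// Sketch of the order-then-start sequence.
// ASSUMPTION: order(account, did) resolves to an agreement id, and
// start(account, agreementId, algorithmDid, algorithmMeta, output)
// resolves to an object carrying a jobId.
async function startComputeJob(ocean, account, datasetDid, algorithmDid, output) {
  // 1. Order the dataset to obtain a Service Agreement id.
  const agreementId = await ocean.compute.order(account, datasetDid)
  // 2. Start the job with a published algorithm (no raw metadata passed).
  const job = await ocean.compute.start(account, agreementId, algorithmDid, undefined, output)
  return { agreementId, jobId: job.jobId }
}

// Stand-in for a real Ocean instance; it only mimics the assumed shapes.
const fakeOcean = {
  compute: {
    order: async (account, did) => 'agreement-for-' + did,
    start: async (account, agreementId) => ({ jobId: 'job-1', agreementId })
  }
}

startComputeJob(fakeOcean, { id: '0xabc' }, 'did:op:dataset', 'did:op:algo', {})
  .then((result) => console.log(result))
```

Keeping both ids from this step matters because the status tutorial needs the `agreementId` and the `jobId`.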
## Final Result
Now that we have all the pieces, we need a function that ties them together:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 58-89,92-94 GITHUB-EMBED
The last thing we need is a button to start our compute job:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 202-207 GITHUB-EMBED
**Note:** The button is disabled if no Datasets and Algorithms have been published previously.
**Move on to [Get Status of a Compute Job](/tutorials/react-compute-status/).**

@ -0,0 +1,65 @@
---
title: Compute using a raw algorithm on a Data Set
description: Compute using a raw algorithm on a Data Set
---
## Requirements
This is a continuation of the [React App Setup](/tutorials/react-setup/) tutorial, so make sure you have completed all the steps described there.
1. [React App Setup](/tutorials/react-setup/)
Open `src/index.js` from your `marketplace/` folder.
## Define Raw Code
Sometimes you just need to quickly run a test algorithm. Instead of publishing it as an asset, you can use the code directly.
To do that, we are going to use a textbox for pasting the code and a button to show or hide it.
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 184-195 GITHUB-EMBED
## Define Algorithm MetaData
We need to define all the requirements for the algorithm:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/asset-compute.js jsx 35-44 GITHUB-EMBED
and then import it into our `Compute.js`:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 4 GITHUB-EMBED
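As a rough illustration of what such metadata can look like, the sketch below defines an example object and a helper that swaps in the pasted code. Every field name and value in `exampleRawAlgoMeta` is an assumption made for this sketch, and `withPastedCode` is a hypothetical helper; the embedded `asset-compute.js` remains the reference for the real definition.

```javascript
// Illustrative raw-algorithm metadata. All field names and values here are
// ASSUMPTIONS for the sake of the sketch, not the tutorial's definition.
const exampleRawAlgoMeta = {
  rawcode: "console.log('Hello from the compute job')",
  format: 'docker-image',
  version: '0.1',
  container: { entrypoint: 'node $ALGO', image: 'node', tag: '10' }
}

// Hypothetical helper: returns a copy of the metadata with the code the
// user pasted into the textbox, leaving the original object untouched.
function withPastedCode(meta, pastedCode) {
  if (typeof pastedCode !== 'string' || pastedCode.trim() === '') {
    throw new Error('No algorithm code provided')
  }
  return { ...meta, rawcode: pastedCode }
}

const meta = withPastedCode(exampleRawAlgoMeta, "console.log('pasted')")
console.log(meta.rawcode)
```

Returning a copy rather than mutating the imported object keeps repeated runs with different pasted code from interfering with each other.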
## Define Compute Output
Let's define some options for our upcoming job:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 163-182 GITHUB-EMBED
and use them:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 61-70 GITHUB-EMBED
## Order the dataset
Next, we have to order the dataset that we are going to run the computation on. We are going to use the `ddoAssetId`, which was set when the asset was published.
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 73 GITHUB-EMBED
## Start the compute job
We need a function to start the job:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 58-89 GITHUB-EMBED
Get the pasted code:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 96-100 GITHUB-EMBED
The last thing we need is a button inside the `render()` function:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 208-211 GITHUB-EMBED
**Note:** The button is disabled if no Datasets have been published previously.
**Move on to [Get Status of a Compute Job](/tutorials/react-compute-status/).**

@ -0,0 +1,32 @@
---
title: Get Status of a Compute Job
description: Get Status of a Compute Job
---
## Requirements
For this setup, we need a compute job that has been started from [Compute using a published algorithm on a Data Set](/tutorials/react-compute-published-algorithm/) or [Compute using a raw algorithm on a Data Set](/tutorials/react-compute-raw/).
## Create an Area to display the status
First, let's define an area to display the status:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 213-226 GITHUB-EMBED
## Get Job Status
Since we have the `agreementId` and `jobId`, we can get the status of the compute job:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 106 GITHUB-EMBED
## Final Result
Let's wrap that into a function:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 102-112 GITHUB-EMBED
and have a button for it:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 223 GITHUB-EMBED
**Note:** The button is disabled if `jobId` is missing.
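If you would rather poll than click the button repeatedly, a small loop can do it. The following is a minimal sketch, not part of the tutorial: `pollJobStatus` is a hypothetical helper that takes a status-fetching function and a caller-supplied "finished" check, because the exact shape of the status response is not assumed here. In a real app, `getStatus` could wrap the status call shown above.

```javascript
// Hypothetical polling helper. `getStatus` is any async function returning
// the current status; `isFinished` decides when to stop. Both are supplied
// by the caller, so no particular status shape is assumed.
async function pollJobStatus(getStatus, isFinished, { intervalMs = 1000, maxAttempts = 10 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await getStatus()
    if (isFinished(status)) return status
    // Wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs))
  }
  throw new Error('Job did not finish within ' + maxAttempts + ' attempts')
}

// Stand-in status source that reports completion on the third call.
let demoCalls = 0
const fakeGetStatus = async () => ({ done: ++demoCalls >= 3 })

pollJobStatus(fakeGetStatus, (s) => s.done, { intervalMs: 10 })
  .then((status) => console.log('finished:', status))
```

Capping the number of attempts keeps a stuck job from polling forever; pick the interval and limit to suit your jobs.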

@ -1,5 +1,5 @@
---
title: Get & Use a Data Set
title: Search & Consume a Data Set
description: Tutorial to get and use a data set in a basic React app.
---
@ -34,7 +34,7 @@ Consuming means downloading one or multiple files attached to an asset. During t
With the following code we start the consume process with the first search result, then go on to download its first attached file. Put it after the `searchAssets()` function:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/e639e9ed4432e8b72ca453d50ed7bdaa36f1efb4/src/index.js jsx 72-98 GITHUB-EMBED
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/index.js jsx 73-95 GITHUB-EMBED
We still need a button to start consumption. In the render function, just after the _Search assets_ button, add:

@ -0,0 +1,41 @@
---
title: Publish an Algorithm
description: Tutorial to add algorithm publishing capabilities to a basic React app.
---
## Requirements
This is a continuation of the [React App Setup](/tutorials/react-setup/) tutorial, so make sure you have completed all the steps described there.
1. [React App Setup](/tutorials/react-setup/)
Open `src/index.js` from your `marketplace/` folder.
## Define Asset
First, let's add the [asset](/concepts/terminology/#asset-or-data-asset) that we want to publish.
To do that, we need to define the Algorithm asset based on the [OEP-08](https://github.com/oceanprotocol/OEPs/tree/master/8) metadata structure. An algorithm asset can have multiple `files` attached to it and each file's `url` value will be encrypted during the publish process.
Let's create a new file `src/asset-compute.js` and fill it with:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/asset-compute.js jsx 1-33 GITHUB-EMBED
**Note:** The “ALGO” macro in the `entrypoint` attribute is replaced with the downloaded executable algorithm inside the pod.
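To make the macro substitution concrete, here is a minimal sketch of the idea. It is an illustration only, not the code that runs in the pod: the helper name `expandEntrypoint` is hypothetical, the macro is assumed to be spelled `$ALGO`, and the file path is made up.

```javascript
// Hypothetical illustration of macro expansion. ASSUMPTION: the macro is
// written as `$ALGO` and is replaced by the path of the downloaded file.
function expandEntrypoint(entrypoint, algorithmPath) {
  // split/join replaces every occurrence without regex escaping concerns.
  return entrypoint.split('$ALGO').join(algorithmPath)
}

// The path below is invented for the example.
console.log(expandEntrypoint('node $ALGO', '/tmp/algorithm.js'))
```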
Then import this asset definition at the top of `src/Compute.js`:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 4 GITHUB-EMBED
## Handle Asset Publishing
Now that we have an asset to submit, we need a function to handle it. Just before `render() {` let's add this `publishalgo` function:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 42-56 GITHUB-EMBED
The last thing we need is a button to start our registration inside the `render()` function:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 153 GITHUB-EMBED
**Move on to [Compute using a published algorithm on a Data Set](/tutorials/react-compute-published-algorithm/).**

@ -0,0 +1,34 @@
---
title: Publish a Data Set with Compute features
description: Tutorial to add a dataset with compute capabilities to a basic React app.
---
## Requirements
This is a continuation of the [React App Setup](/tutorials/react-setup/) tutorial, so make sure you have completed all the steps described there.
1. [React App Setup](/tutorials/react-setup/)
Open `src/index.js` from your `marketplace/` folder.
## Define Asset
We will use the same asset as in [Publish a Data Set](/tutorials/react-publish-data-set), but we are going to allow only compute features, without the ability to download the asset.
This is achievable by adding a 'compute' service to the DDO:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 23-27 GITHUB-EMBED
## Handle Asset Publishing
Note that `ocean.assets.create` will define an 'access' service if the services list is missing. Since we are providing this attribute, our asset will have only a 'compute' service and no 'access' service.
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 18-40 GITHUB-EMBED
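The defaulting rule described above can be pictured with a small stand-alone function. This is only an illustration of the rule, not the library's implementation: `resolveServiceTypes` is a hypothetical name, and the service objects are reduced to a `type` field.

```javascript
// Illustration of the defaulting rule only; NOT the library's code.
// If no services list is given, the asset ends up with an 'access' service;
// otherwise it has exactly the services that were passed in.
function resolveServiceTypes(services) {
  if (services === undefined || services === null) {
    return ['access']
  }
  return services.map((service) => service.type)
}

console.log(resolveServiceTypes(undefined))
console.log(resolveServiceTypes([{ type: 'compute' }]))
```

This is why passing a list containing only a compute service yields an asset that cannot be downloaded through an 'access' service.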
The last thing we need is a button to start our registration inside the `render()` function:
GITHUB-EMBED https://github.com/oceanprotocol/react-tutorial/blob/107d1fa7d0c583cc8042339f1f5090ff9ee0920b/src/Compute.js jsx 143 GITHUB-EMBED
**Move on to [Publish an Algorithm](/tutorials/react-publish-algorithm/).**

@ -82,3 +82,11 @@
- name: Research Board
url: https://github.com/oceanprotocol/ocean/projects/3
- name: dev-ocean
- group: Compute-to-Data Environment
items:
- name: operator-service
- name: operator-engine
- name: pod-configuration
- name: pod-publishing

@ -23,6 +23,8 @@
link: /concepts/architecture/
- title: Secret Store
link: /concepts/secret-store/
- title: Compute-to-Data
link: /concepts/compute-to-data/
- group: Contribute
items:

@ -31,8 +31,18 @@
link: /tutorials/react-setup/
- title: Publish a Data Set
link: /tutorials/react-publish-data-set/
- title: Get & Use a Data Set
- title: Search & Consume a Data Set
link: /tutorials/react-get-use-data-set/
- title: Publish a Data Set with Compute features
link: /tutorials/react-publish-data-set-compute/
- title: Publish an Algorithm
link: /tutorials/react-publish-algorithm/
- title: Compute using a published algorithm on a Data Set
link: /tutorials/react-compute-published-algorithm/
- title: Compute using a raw algorithm on a Data Set
link: /tutorials/react-compute-raw/
- title: Get Status of a Compute Job
link: /tutorials/react-compute-status/
- group: squid-py Tutorials
items: