Mirror of https://github.com/oceanprotocol/docs.git, synced 2024-11-26 19:49:26 +01:00
Merge branch 'main' of github.com:oceanprotocol/docs into feature/read-the-docs
This commit is contained in commit 3870043b83.
@ -31,7 +31,11 @@ To initiate the **consume** step, the data consumer sends 1.0 datatokens to the

Instead of running a Provider themselves, the publisher can have a 3rd party like Ocean Market run it. While more convenient, it means that the 3rd party has custody of the private encryption/decryption key (more centralized). Ocean will support more service types and URL custody options in the future.

**Ocean JavaScript and Python libraries** act as drivers for the lower-level contracts. Each library integrates with Ocean Provider to provision & consume data services, and Ocean Aquarius for metadata. **Ocean React hooks** use the JavaScript library, to help build web apps & React Native apps with Ocean.

<repo name="provider"></repo>

<repo name="ocean.js"></repo>

<repo name="ocean.py"></repo>

## Market Tools

@ -48,6 +52,8 @@ Complementary to Ocean Market, Ocean has reference code to ease building **third

[This post](https://blog.oceanprotocol.com/ocean-market-an-open-source-community-marketplace-for-data-4b99bedacdc3) elaborates on Ocean marketplace tools.

<repo name="market"></repo>

## Metadata Tools

Metadata (name of dataset, date created etc.) is used by marketplaces for data asset discovery. Each data asset can have a [decentralized identifier](https://w3c-ccg.github.io/did-spec/) (DID) that resolves to a DID document (DDO) for associated metadata. The DDO is essentially [JSON](https://www.json.org/) filling in metadata fields. [OEP7](https://github.com/oceanprotocol/OEPs/tree/master/7) formalizes Ocean DID usage.
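
To make this concrete, a DDO-style record could look roughly like the sketch below. This is illustrative only: the required fields and exact layout are defined by OEP7 and the Aquarius metadata schema, and the `id` value here is a placeholder.

```json
{
  "@context": "https://w3id.org/did/v1",
  "id": "did:op:0000000000000000000000000000000000000000000000000000000000000000",
  "service": [
    {
      "type": "metadata",
      "attributes": {
        "main": {
          "type": "dataset",
          "name": "Example data set",
          "dateCreated": "2021-04-28T10:00:00Z",
          "author": "Example author",
          "license": "CC-BY-4.0"
        }
      }
    }
  ]
}
```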
@ -58,6 +64,8 @@ Ocean uses the Ethereum mainnet as an **on-chain metadata store**, i.e. to store

Due to the permissionless, decentralized nature of data on Ethereum mainnet, any last-mile tool can access metadata. **Ocean Aquarius** supports different metadata fields for each different Ocean-based marketplace. Developers could also use [TheGraph](https://www.thegraph.com) to see metadata fields that are common across all marketplaces.

<repo name="aquarius"></repo>

## Third-Party ERC20 Apps & Tools

The ERC20 nature of datatokens eases composability with other Ethereum tools and apps, including **MetaMask** and **Trezor** as data wallets, DEXes as data exchanges, and more. [This post](https://blog.oceanprotocol.com/ocean-datatokens-from-money-legos-to-data-legos-4f867cec1837) has details.

@ -15,62 +15,97 @@ The most basic scenario for a Publisher is to provide access to the datasets the

[This page](https://oceanprotocol.com/technology/compute-to-data) elaborates on the benefits.

## Data Sets & Algorithms

With Compute-to-Data, data sets are not allowed to leave the premises of the data holder; only algorithms can be permitted to run on them, under certain conditions, within an isolated and secure environment. Algorithms are an asset type just like data sets, and they too can have a pool or a fixed price that determines their price whenever they are used.

Algorithms can be either public or private by setting either an `access` or a `compute` service in their DDO. An algorithm set to public can be downloaded for its set price, while an algorithm set to private is only available as part of a compute job, without any way to download it. If an algorithm is set to private, then it must be published on the same Ocean Provider as the data set it should run on.

For each data set, publishers can choose to allow various permission levels for algorithms to run:

- allow selected algorithms, referenced by their DID
- allow all algorithms published within a network or marketplace
- allow raw algorithms, for advanced use cases circumventing algorithms as an asset type, but most prone to data escape

All implementations should set permissions to private by default: upon publishing a compute data set, no algorithms should be allowed to run on it. This is to prevent data escape by a rogue algorithm written in a way that extracts all data from a data set.
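
As a rough illustration, such permission settings could live inside the data set's `compute` service along the following lines. This is only a sketch: the exact field names and structure depend on the DDO/metadata version in use, and the DID shown is a placeholder.

```json
{
  "type": "compute",
  "attributes": {
    "main": {
      "privacy": {
        "allowRawAlgorithm": false,
        "allowNetworkAccess": false,
        "trustedAlgorithms": [
          "did:op:1111111111111111111111111111111111111111111111111111111111111111"
        ]
      }
    }
  }
}
```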

## Architecture Overview

The architecture follows [OEP-12: Compute-to-Data](https://github.com/oceanprotocol/OEPs/tree/master/12) as a spec.

![Sequence Diagram for computing services](images/Starting New Compute Job.png)

In the above diagram you can see the initial integration supported. It involves the following components/actors:

- Consumers - The end users who need to use some computing services offered by the same Publisher as the data Publisher.
- Operator-Service - The micro-service that handles compute requests.
- Operator-Engine - The computing system where the compute job will be executed.
- Kubernetes - a K8s cluster

Before the flow can begin, the following pre-conditions must be met:

- The Asset DDO has a `compute` service.
- The Asset DDO compute service must permit algorithms to run on it.
- The Asset DDO must specify an Ocean Provider endpoint exposed by the Publisher.

## Access Control using Ocean Provider

As [with the `access` service](/concepts/architecture/#datatokens--access-control-tools), the `compute` service requires the **Ocean Provider** as a component handled by Publishers. Ocean Provider is in charge of interacting with users and managing the basics of a Publisher's infrastructure to integrate this infrastructure into Ocean Protocol. The direct interaction with the infrastructure where the data resides happens through this component only.

Ocean Provider includes the credentials to interact with the infrastructure (initially in cloud providers, but it could be on-premise).

<repo name="provider"></repo>

## Compute-to-Data Environment

### Operator Service

The **Operator Service** is a micro-service in charge of managing the workflows that execute compute requests.

The main responsibilities are:

- Expose an HTTP API allowing for the execution of data access and compute endpoints.
- Interact with the infrastructure (cloud/on-premise) using the Publisher's credentials.
- Start/stop/execute computing instances with the algorithms provided by users.
- Retrieve the logs generated during executions.

Typically the Operator Service is integrated with Ocean Provider, but it can be called independently of it.

The Operator Service is in charge of establishing the communication with the K8s cluster, allowing it to:

- Register new compute jobs
- List the current compute jobs
- Get a detailed result for a given job
- Stop a running job

The Operator Service doesn't provide any storage capability; all state is stored directly in the K8s cluster.
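
As an illustration only, calling such an HTTP API could look like the sketch below. The endpoint path and query parameter are assumptions made for this example, not necessarily the exact Operator Service API; the operator-service repository documents the real endpoints.

```bash
# hypothetical call: list the current compute jobs for a consumer address
curl -X GET "http://example.com:8050/api/v1/operator/compute?owner=0x0000000000000000000000000000000000000000" \
  -H "accept: application/json"
```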

<repo name="operator-service"></repo>

### Operator Engine

The **Operator Engine** is in charge of orchestrating the compute infrastructure using Kubernetes as its backend, where each compute job runs in an isolated [Kubernetes Pod](https://kubernetes.io/docs/concepts/workloads/pods/). Typically the Operator Engine retrieves the workflows created by the Operator Service in Kubernetes, and manages the infrastructure necessary to complete the execution of the compute workflows.

The Operator Engine is in charge of retrieving all the workflows registered in a K8s cluster, allowing it to:

- Orchestrate the flow of the execution
- Start the configuration pod in charge of downloading the workflow dependencies (data sets and algorithms)
- Start the pod containing the algorithm to execute
- Start the publishing pod that publishes the new assets created in the Ocean Protocol network

The Operator Engine doesn't provide any storage capability; all state is stored directly in the K8s cluster.

<repo name="operator-engine"></repo>

### Pod: Configuration

<repo name="pod-configuration"></repo>

### Pod: Publishing

<repo name="pod-publishing"></repo>

## Further Reading

- [Tutorial: Writing Algorithms](/tutorials/compute-to-data-algorithms/)
- [Tutorial: Set Up a Compute-to-Data Environment](/tutorials/compute-to-data/)
- [Compute-to-Data in Ocean Market](https://blog.oceanprotocol.com)

@ -33,7 +33,7 @@ Before you start coding right away, please follow those basic guidelines:

A typical code contribution in any Ocean Protocol repository would go as follows:

1. As an external developer, fork the respective repo and push to your own fork. Ocean core developers push directly to the repo under the `oceanprotocol` org.
2. Create a new branch for your changes. The naming convention for branches is `issue-001-short-feature-description`: the issue number (`issue-001`) references the GitHub issue you are trying to fix, and the short feature description helps to quickly distinguish your branch among the other branches in play (see the example after this list).
3. To get visibility and Continuous Integration feedback as early as possible, open your Pull Request as a `Draft`.
4. Give it a meaningful title, and at least link to the respective issue in the Pull Request description, like `Fixes #23`. Describe your changes and mention things for reviewers to look out for; for UI changes, screenshots and videos are helpful.
5. Once your Pull Request is ready, mark it as `Ready for Review`; in most repositories code owners are then automatically notified and asked for review.
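
For instance, for a hypothetical issue #23 about fixing a typo, creating such a branch could look like this (the issue number and description are placeholders):

```bash
# make sure the default branch is up to date, then branch off for the issue
git checkout main
git pull origin main
git checkout -b issue-023-fix-readme-typo
```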

Binary image file not shown (before: 40 KiB, after: 117 KiB)
BIN content/concepts/images/Starting New Compute Job.png (new file)
@ -16,6 +16,7 @@ These are live projects that leverage core functionality of Ocean, such as readi

| Name                                          | Description                                                                                          | Link                                                          |
| --------------------------------------------- | ---------------------------------------------------------------------------------------------------- | ------------------------------------------------------------- |
| Ocean Market (from Ocean core team)           | A marketplace to find, publish and trade data sets in the Ocean Network                             | [market.oceanprotocol.com](https://market.oceanprotocol.com)  |
| deltaDAO                                      | Ocean Protocol consulting, engineering and integration company for GDPR-compliant data monetization | [delta-dao.com](https://delta-dao.com)                        |
| [Parsiq](https://parsiq.net/)                 | User notifications for dataset publishes, metadata actions, more                                    | [parsiq.net](https://parsiq.net/)                             |
| [Data Market Cap](https://datamarketcap.xyz/) | A platform that provides an analysis of the datatokens market                                       | [datamarketcap.xyz](https://datamarketcap.xyz/)               |
| [Datapolis](https://datapolis.net/)           | A data marketplace for buying, selling datasets and earning interest through staking                | [datapolis.net](https://datapolis.net/)                       |

225 content/tutorials/compute-to-data-algorithms.md (new file)
@ -0,0 +1,225 @@

---
title: Writing Algorithms for Compute to Data
description: Learn how to write algorithms for use in Ocean Protocol's Compute-to-Data feature.
---

## Overview

An algorithm in the Ocean Protocol stack is another asset type, in addition to data sets. An algorithm for Compute to Data is composed of the following:

- an algorithm code
- a Docker image (base image + tag)
- an entry point

## Environment

When creating an algorithm asset in Ocean Protocol, the additional `algorithm` object needs to be included in its metadata service to define the Docker container environment:

```json
{
  "algorithm": {
    "container": {
      "entrypoint": "node $ALGO",
      "image": "node",
      "tag": "latest"
    }
  }
}
```

| Variable     | Usage                                                                                                                                       |
| ------------ | ------------------------------------------------------------------------------------------------------------------------------------------- |
| `image`      | The Docker image name the algorithm will run with.                                                                                           |
| `tag`        | The Docker image tag that you are going to use.                                                                                              |
| `entrypoint` | The Docker entrypoint. `$ALGO` is a macro that gets replaced inside the compute job, depending on where your algorithm code is downloaded.   |

When publishing an algorithm through the [Ocean Market](https://market.oceanprotocol.com), these properties can be set via the publish UI.

### Environment Examples

Run an algorithm written in JavaScript/Node.js, based on Node.js v14:

```json
{
  "algorithm": {
    "container": {
      "entrypoint": "node $ALGO",
      "image": "node",
      "tag": "14"
    }
  }
}
```

Run an algorithm written in Python, based on Python v3.9:

```json
{
  "algorithm": {
    "container": {
      "entrypoint": "python3.9 $ALGO",
      "image": "python",
      "tag": "3.9.4-alpine3.13"
    }
  }
}
```

Be aware that your algorithm might need a lot of dependencies, so it's a lot faster if you build your own Docker image and publish your algorithm with that custom image. We also collect some [example images](https://github.com/oceanprotocol/algo_dockers).

### Data Storage

As part of a compute job, every algorithm runs in a K8s pod with these volumes mounted:

| Path            | Permissions | Usage                                                                                                                                                   |
| --------------- | ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------|
| `/data/inputs`  | read        | Storage for input data sets, accessible only to the algorithm running in the pod.                                                                        |
| `/data/ddos`    | read        | Storage for all DDOs involved in the compute job (input data sets + algorithm).                                                                          |
| `/data/outputs` | read/write  | Storage for all of the algorithm's output files. They are uploaded to some form of cloud storage, and URLs are sent back to the consumer.                |
| `/data/logs/`   | read/write  | All algorithm output (such as `print`, `console.log`, etc.) is stored in a file located in this folder. It is stored and sent to the consumer as well.   |

### Environment variables available to algorithms

For every algorithm pod, the Compute to Data environment provides the following environment variables:

| Variable             | Usage                                                   |
| -------------------- | ------------------------------------------------------- |
| `DIDS`               | An array of DID strings containing the input data sets. |
| `TRANSFORMATION_DID` | The DID of the algorithm.                                |
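
For example, a minimal Python sketch that reads these variables and writes a result file (the output file name is just an illustration) could look like this:

```python
import json
import os

# DIDS is a JSON-encoded array of the input data set DIDs
dids = json.loads(os.getenv("DIDS", "[]"))
algo_did = os.getenv("TRANSFORMATION_DID")

print("Input data set DIDs:", dids)
print("Algorithm DID:", algo_did)

# anything written to /data/outputs is returned to the consumer as a job result
with open("/data/outputs/dids.json", "w") as f:
    json.dump(dids, f)
```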

## Example: JavaScript/Node.js

The following is a simple JavaScript/Node.js algorithm that does a line count for ALL input data sets. The algorithm is not using any environment variables; instead it scans the `/data/inputs` folder.

```js
const fs = require('fs')

const inputFolder = '/data/inputs'
const outputFolder = '/data/outputs'

// count the lines of one file and append the result to the output log
async function countrows(file) {
  console.log('Start counting for ' + file)
  const fileBuffer = fs.readFileSync(file)
  const toString = fileBuffer.toString()
  const splitLines = toString.split('\n')
  const rows = splitLines.length - 1
  fs.appendFileSync(outputFolder + '/output.log', file + ',' + rows + '\r\n')
  console.log('Finished. We have ' + rows + ' lines')
}

// walk the input folder recursively and count rows for every file found
async function processfolder(folder) {
  const files = fs.readdirSync(folder)

  for (let i = 0; i < files.length; i++) {
    const file = files[i]
    const fullpath = folder + '/' + file
    if (fs.statSync(fullpath).isDirectory()) {
      await processfolder(fullpath)
    } else {
      await countrows(fullpath)
    }
  }
}

processfolder(inputFolder)
```

This snippet will create and expose the following files as compute job results to the consumer:

- `/data/outputs/output.log`
- `/data/logs/algo.log`

To run this, use the following `container` object:

```json
{
  "algorithm": {
    "container": {
      "entrypoint": "node $ALGO",
      "image": "node",
      "tag": "12"
    }
  }
}
```

## Example: Python

A more advanced line counter in Python, which relies on environment variables and constructs a job object containing all the input files & DDOs:

```python
import pandas as pd
import numpy as np
import os
import time
import json


def get_job_details():
    """Reads in metadata information about assets used by the algo"""
    job = dict()
    job['dids'] = json.loads(os.getenv('DIDS', None))
    job['metadata'] = dict()
    job['files'] = dict()
    job['algo'] = dict()
    job['secret'] = os.getenv('secret', None)
    algo_did = os.getenv('TRANSFORMATION_DID', None)
    if job['dids'] is not None:
        for did in job['dids']:
            # get the ddo from disk
            filename = '/data/ddos/' + did
            print(f'Reading json from {filename}')
            with open(filename) as json_file:
                ddo = json.load(json_file)
                # search for metadata service
                for service in ddo['service']:
                    if service['type'] == 'metadata':
                        job['files'][did] = list()
                        index = 0
                        for file in service['attributes']['main']['files']:
                            job['files'][did].append(
                                '/data/inputs/' + did + '/' + str(index))
                            index = index + 1
    if algo_did is not None:
        job['algo']['did'] = algo_did
        job['algo']['ddo_path'] = '/data/ddos/' + algo_did
    return job


def line_counter(job_details):
    """Executes the line counter based on inputs"""
    print('Starting compute job with the following input information:')
    print(json.dumps(job_details, sort_keys=True, indent=4))

    """ Now, count the lines of the first file in first did """
    first_did = job_details['dids'][0]
    filename = job_details['files'][first_did][0]
    non_blank_count = 0
    with open(filename) as infp:
        for line in infp:
            if line.strip():
                non_blank_count += 1
    print('number of non-blank lines found %d' % non_blank_count)

    """ Print that number to output to generate algo output """
    f = open("/data/outputs/result", "w")
    f.write(str(non_blank_count))
    f.close()


if __name__ == '__main__':
    line_counter(get_job_details())
```

To run this algorithm, use the following `container` object:

```json
{
  "algorithm": {
    "container": {
      "entrypoint": "python3.6 $ALGO",
      "image": "oceanprotocol/algo_dockers",
      "tag": "python-sql"
    }
  }
}
```

@ -20,20 +20,16 @@ Then you need the following parts:

- a working Kubernetes (K8s) cluster (Minikube is a good start)
- a working `kubectl` connected to the K8s cluster
- one folder (/ocean/operator-service/), in which we will download the following:
  - [postgres-configmap.yaml](https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/postgres-configmap.yaml)
  - [postgres-storage.yaml](https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/postgres-storage.yaml)
  - [postgres-deployment.yaml](https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/postgres-deployment.yaml)
  - [postgres-service.yaml](https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/postgresql-service.yaml)
  - [deployment.yaml](https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/deployment.yaml)
- one folder (/ocean/operator-engine/), in which we will download the following:
  - [sa.yaml](https://raw.githubusercontent.com/oceanprotocol/operator-engine/main/kubernetes/sa.yml)
  - [binding.yaml](https://raw.githubusercontent.com/oceanprotocol/operator-engine/main/kubernetes/binding.yml)
  - [operator.yaml](https://raw.githubusercontent.com/oceanprotocol/operator-engine/main/kubernetes/operator.yml)
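
For example, the manifests listed above could be fetched into those folders as follows (an illustrative sketch; adjust it if the upstream file layout changes):

```bash
mkdir -p /ocean/operator-service /ocean/operator-engine

# Operator Service manifests
cd /ocean/operator-service
wget https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/postgres-configmap.yaml
wget https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/postgres-storage.yaml
wget https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/postgres-deployment.yaml
wget https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/postgresql-service.yaml
wget https://raw.githubusercontent.com/oceanprotocol/operator-service/main/kubernetes/deployment.yaml

# Operator Engine manifests
cd /ocean/operator-engine
wget https://raw.githubusercontent.com/oceanprotocol/operator-engine/main/kubernetes/sa.yml
wget https://raw.githubusercontent.com/oceanprotocol/operator-engine/main/kubernetes/binding.yml
wget https://raw.githubusercontent.com/oceanprotocol/operator-engine/main/kubernetes/operator.yml
```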
## Customize your Operator Service deployment

The following resources need attention:
@ -45,15 +41,7 @@ The following resources need attention:

## Customize your Operator Engine deployment

Check the [README](https://github.com/oceanprotocol/operator-engine#customize-your-operator-engine-deployment) section of the operator-engine repository to customize your deployment.

## Storage class

@ -112,8 +100,6 @@ kubectl create -f /ocean/operator-service/postgres-storage.yaml
kubectl create -f /ocean/operator-service/postgres-deployment.yaml
kubectl create -f /ocean/operator-service/postgresql-service.yaml
kubectl apply -f /ocean/operator-service/deployment.yaml
```

## Deploy Operator Engine

@ -123,8 +109,6 @@ kubectl config set-context --current --namespace ocean-compute
kubectl apply -f /ocean/operator-engine/sa.yml
kubectl apply -f /ocean/operator-engine/binding.yml
kubectl apply -f /ocean/operator-engine/operator.yml
kubectl create -f /ocean/operator-service/postgres-configmap.yaml
```
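
To confirm the deployments came up, a quick check could look like this (illustrative; pod names and namespaces depend on how you set up the cluster earlier in this tutorial):

```bash
# list the relevant pods across namespaces and confirm they are Running
kubectl get pods --all-namespaces | grep -E 'operator|postgres'
```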
@ -148,12 +132,12 @@ If your cluster is running on example.com:
curl -X POST "http://example.com:8050/api/v1/operator/pgsqlinit" -H "accept: application/json"
```

## Update Barge for local testing

Update Barge's Provider by adding or updating the `OPERATOR_SERVICE_URL` env in `/ocean/barge/compose-files/provider.yaml`:

```yaml
OPERATOR_SERVICE_URL: http://example.com:8050/
```

Restart Barge with the updated Provider configuration.

@ -27,6 +27,12 @@ Almost all ERC-20 wallets require these values for adding a custom token:
- Symbol: `OCEAN`
- Decimals: `18`

**Polygon Mainnet (previously Matic)**

- Contract Address: `0x282d8efCe846A88B159800bd4130ad77443Fa1A1`
- Symbol: `mOCEAN`
- Decimals: `18`

The [OCEAN Token page](https://oceanprotocol.com/token) at oceanprotocol.com has further details.

## MetaMask

@ -24,6 +24,8 @@
- group: Compute-to-Data
  items:
    - title: Writing Algorithms
      link: /tutorials/compute-to-data-algorithms/
    - title: Run a Compute-to-Data Environment
      link: /tutorials/compute-to-data/

1433 package-lock.json (generated): file diff suppressed because it is too large
32 package.json
@ -18,9 +18,9 @@
  "dependencies": {
    "@oceanprotocol/art": "^3.0.0",
    "axios": "^0.21.1",
    "classnames": "^2.3.1",
    "gatsby": "^2.32.12",
    "gatsby-image": "^3.3.0",
    "gatsby-plugin-catch-links": "^2.10.0",
    "gatsby-plugin-manifest": "^2.12.1",
    "gatsby-plugin-offline": "^3.10.2",
@ -29,28 +29,28 @@
    "gatsby-plugin-sharp": "^2.14.3",
    "gatsby-plugin-sitemap": "^2.12.0",
    "gatsby-plugin-svgr": "^2.1.0",
    "gatsby-plugin-webpack-size": "^2.0.1",
    "gatsby-remark-autolink-headers": "^2.11.0",
    "gatsby-remark-code-titles": "^1.1.0",
    "gatsby-remark-component": "^1.1.3",
    "gatsby-remark-copy-linked-files": "^2.10.0",
    "gatsby-remark-embed-video": "^3.1.1",
    "gatsby-remark-github": "^2.0.0",
    "gatsby-remark-images": "^3.11.1",
    "gatsby-remark-responsive-iframe": "^2.11.0",
    "gatsby-remark-smartypants": "^2.10.0",
    "gatsby-remark-vscode": "^3.2.1",
    "gatsby-source-filesystem": "^2.11.1",
    "gatsby-source-git": "^1.1.0",
    "gatsby-source-graphql": "^2.14.0",
    "gatsby-transformer-remark": "^2.16.1",
    "gatsby-transformer-sharp": "^2.12.1",
    "gatsby-transformer-xml": "^2.10.0",
    "gatsby-transformer-yaml": "^2.11.0",
    "giphy-js-sdk-core": "^1.0.6",
    "intersection-observer": "^0.12.0",
    "react": "^17.0.2",
    "react-dom": "^17.0.2",
    "react-helmet": "^6.1.0",
    "react-scrollspy": "^3.4.3",
    "rehype-react": "^6.2.0",
@ -58,20 +58,20 @@
    "remark-github-plugin": "^1.4.0",
    "remark-react": "^8.0.0",
    "shortid": "^2.2.16",
    "slugify": "^1.5.0",
    "smoothscroll-polyfill": "^0.4.4",
    "swagger-client": "^3.13.2"
  },
  "devDependencies": {
    "@svgr/webpack": "^5.5.0",
    "dotenv": "^8.2.0",
    "eslint": "^7.25.0",
    "eslint-config-oceanprotocol": "^1.5.0",
    "eslint-config-prettier": "^8.3.0",
    "eslint-plugin-prettier": "^3.4.0",
    "git-format-staged": "^2.1.1",
    "husky": "^6.0.0",
    "markdownlint-cli": "^0.27.1",
    "node-sass": "^5.0.0",
    "npm-run-all": "^4.1.5",
    "prettier": "^2.2.1"