* [Architecture Overview](developers/architecture.md)
* [Ocean Node](developers/ocean-node/README.md)
* [Node Architecture](developers/ocean-node/node-architecture.md)
* [Compute-to-data (C2D)](developers/ocean-node/compute-to-data-c2d.md)
* [Contracts](developers/contracts/README.md)
* [Data NFTs](developers/contracts/data-nfts.md)
* [Datatokens](developers/contracts/datatokens.md)

For details on all of the HTTP endpoints exposed by the Ocean Node API, refer to the documentation in the repository:

{% embed url="https://github.com/oceanprotocol/ocean-node/blob/develop/API.md" %}
### Compute to Data (C2D)

The Ocean Node provides a convenient and easy way to run a compute-to-data environment. This gives you the opportunity to monetize your node: you can charge fees for using the C2D environment, and there are additional incentives provided by the Ocean Protocol Foundation (OPF). Soon we will also be releasing C2D V2, which will provide different environments and new ways to pay for computation.

For more details on the C2D V2 architecture, refer to the documentation in the repository:

{% embed url="https://github.com/oceanprotocol/ocean-node/blob/develop/docs/C2DV2.md" %}
# Compute-to-data (C2D)

C2Dv2 continues the concept of bringing algorithms to the data, allowing both public and private datasets to be used with algorithms. While previous versions relied on external components (Provider -> Operator Service running in Kubernetes -> multiple Operator-Engines, each running in their own Kubernetes namespace), C2Dv2 is embedded entirely within the ocean-node.

It has a modular approach, allowing multiple compute engines to be connected to the same ocean-node engine. These compute engines can be internal (Docker, or Kubernetes if ocean-node runs in a Kubernetes environment) or external (in the future, integration with projects like Bacalhau, iExec, etc., is possible).

### Additional Features

* Allow multiple C2D engines to connect to the same ocean-node
* Support multiple jobs (stages) in a workflow
* Jobs can be dependent or independent of previous stages, allowing for parallel or serial job execution

### Workflows

A workflow defines one or more jobs to be executed. Each job may have dependencies on a previous job.
```json
[
  {
    "index": number,
    "jobId": "generated by orchestrator",
    "runAfter": "if defined, wait for specific jobId to finish",
    "input": [
      {
        "index": number,
        "did": "optional",
        "serviceId": "optional",
        "files": "filesObject, optional"
      }
    ],
    "algorithm": {
      "did": "optional",
      "serviceId": "optional",
      "files": "filesObject, optional",
      "rawcode": "optional",
      "container": {
        "entrypoint": "string",
        "image": "string",
        "tag": "string"
      }
    }
  }
]
```
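
To make the schema concrete, here is a hypothetical two-job workflow as a TypeScript literal: job 1 declares `runAfter: 'job-0'`, so the two jobs run serially. All concrete values (DIDs, service IDs, images, the files object shape) are illustrative assumptions, not real assets.

```typescript
// Hypothetical two-job workflow following the schema above; every concrete
// value here is made up for illustration.
const workflow = [
  {
    index: 0,
    jobId: 'job-0', // normally generated by the orchestrator
    input: [{ index: 0, did: 'did:op:1234', serviceId: 'compute' }],
    algorithm: {
      did: 'did:op:abcd',
      serviceId: 'compute',
      container: { entrypoint: 'node $ALGO', image: 'node', tag: '20' }
    }
  },
  {
    index: 1,
    jobId: 'job-1',
    runAfter: 'job-0', // serial execution: waits for job 0 to finish
    input: [{ index: 0, files: { type: 'url', url: 'https://example.com/data.csv' } }],
    algorithm: {
      rawcode: 'console.log("training...")',
      container: { entrypoint: 'node $ALGO', image: 'node', tag: '20' }
    }
  }
]
```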

### Orchestration Layer

Formerly known as the "operator-service," this layer handles interactions between the ocean-node core layer and the different execution environments.

In summary, it should (see the interface sketch after this list):

* Expose a list of compute environments for all engines
* Expose a list of running jobs and limits (e.g., max concurrent jobs)
* Take on new jobs (created by the startJob core handler)
* Determine which module to use (Docker, Kubernetes, Bacalhau, etc.)
* Insert the workflow into the database
* Signal the engine handler to take over job execution
* Read the workflow status when the C2D getStatus core command is called
* Serve job results when the C2D getJobResult core command is called
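
The list above, expressed as a rough TypeScript interface; all names and signatures are illustrative assumptions, not ocean-node's actual types.

```typescript
// Rough shape of the orchestration layer's duties. Every name and signature
// below is an assumption for the sketch, not ocean-node's real API.
type Workflow = Record<string, unknown>
type JobStatus = { jobId: string; status: string }
type ComputeEnvironment = { id: string; maxConcurrentJobs: number }

interface C2DOrchestrator {
  // Expose compute environments and running jobs/limits across all engines
  getComputeEnvironments(): Promise<ComputeEnvironment[]>
  getRunningJobs(): Promise<JobStatus[]>
  // Take on a new job: pick a module (Docker, Kubernetes, remote), insert
  // the workflow into the database, then signal the engine to execute it
  startJob(workflow: Workflow): Promise<string> // returns a jobId
  // Back the C2D getStatus and getJobResult core commands
  getStatus(jobId: string): Promise<JobStatus>
  getJobResult(jobId: string, resultIndex: number): Promise<NodeJS.ReadableStream>
}
```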

Due to technical constraints, both internal modules (Docker and Kubernetes) will use Docker images for data provisioning (previously pod-configuration) and results publishing (previously pod-publishing). The orchestration layer will also expose two new core commands:

* `c2dJobStatusUpdate` (called by both pod-configuration and pod-publishing to update job status)
* `c2dJobPublishResult` (called by pod-publishing when results need to be uploaded)

When any pod-\* calls one of these endpoints, we must verify the signature and respond accordingly.
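
A minimal sketch of that verification, assuming ethers v6 for signature recovery; the payload shape and the signed-message format are assumptions for illustration.

```typescript
// Sketch: check that a status update was signed with the job's generated
// private key. Payload shape and message format are assumed.
import { verifyMessage } from 'ethers'

interface JobStatusUpdate {
  jobId: string
  status: string    // e.g. 'provisioned', 'algoFinished', 'error'
  signature: string // pod-* signs `${jobId}:${status}` with its job key
}

// expectedAddress derives from the random per-job private key the
// orchestrator generated and stored in the database.
export function isAuthorizedUpdate(update: JobStatusUpdate, expectedAddress: string): boolean {
  const recovered = verifyMessage(`${update.jobId}:${update.status}`, update.signature)
  return recovered.toLowerCase() === expectedAddress.toLowerCase()
}
```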

### Payment Flow in Orchestration

This will be based on an escrow contract. The orchestrator will (a worked fee example follows the list):

* Compute the sum of maxDuration from all jobs in the workflow
* Calculate the required fee (depending on the previous step, token, environment, etc.)
* Lock the amount in the escrow contract
* Wait until all jobs are finished (successfully or not)
* Calculate the actual duration spent
* Compute proof
* Withdraw payment & provide proof, releasing the difference back to the customer
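
To make the fee math concrete, a small sketch with an assumed per-second rate table; the escrow contract calls themselves are out of scope here, and the settlement split is an illustrative assumption.

```typescript
// Illustrative fee math for the escrow flow above. The rate table and the
// settlement split are assumptions, not the actual contract logic.
interface JobSpec { maxDuration: number } // seconds

// Assumed price per second per environment, in the payment token's smallest unit
const pricePerSecond: Record<string, bigint> = { 'cpu-small': 10n, 'gpu-large': 500n }

// Amount to lock up front: sum of maxDuration across all jobs, times the rate
export function requiredLock(jobs: JobSpec[], environment: string): bigint {
  const totalMaxSeconds = jobs.reduce((sum, job) => sum + job.maxDuration, 0)
  return BigInt(totalMaxSeconds) * pricePerSecond[environment]
}

// After the jobs finish: charge only the actual duration, refund the rest
export function settle(locked: bigint, actualSeconds: number, environment: string) {
  const charge = BigInt(actualSeconds) * pricePerSecond[environment]
  return { charge, refund: locked - charge }
}
```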

### C2D Engines

A C2D Engine is a piece of code that handles C2D jobs running on a specific orchestration implementation. This document focuses on internal compute engines: Docker-based (a host with a Docker environment installed) and Kubernetes-based (if ocean-node runs inside a Kubernetes cluster).

An engine that uses external services (like Bacalhau) follows the same logic but will likely interact with remote APIs.

An engine is responsible for:

* Storing workflows and each job's status (so on restart, we can resume or continue running flows)
* Queueing new jobs

#### Docker Engine

This module requires the Docker service to be installed at the host level. It leverages the Docker API to (see the sketch after this list):

* Create job volumes (with quotas)
* Start the provisioning container (pod-configuration)
* Monitor its status
* Create YAML for algorithms with hardware constraints (CPU, RAM)
* Pass devices for GPU environments
* Start the algorithm container
* Monitor algorithm health & timeout constraints
* Stop the algorithm if the quota is exceeded
* Start the publishing container
* Delete job volumes
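
A minimal sketch of a few of those steps using the dockerode library (an assumption; the actual ocean-node implementation may use a different Docker client), showing volume creation, hardware constraints, and timeout-based stopping. Names and limits are illustrative.

```typescript
// Sketch only: volume + constrained container lifecycle via dockerode.
// Image names, limits, and the timeout policy are illustrative assumptions.
import Docker from 'dockerode'

const docker = new Docker() // connects to /var/run/docker.sock by default

async function runAlgorithm(jobId: string, image: string, maxSeconds: number) {
  // Job volume shared by pod-configuration, the algorithm, and pod-publishing
  await docker.createVolume({ Name: `c2d-${jobId}` })

  const container = await docker.createContainer({
    Image: image,
    HostConfig: {
      Binds: [`c2d-${jobId}:/data`],
      Memory: 1024 * 1024 * 1024, // hardware constraint: 1 GiB RAM
      NanoCpus: 1_000_000_000     // hardware constraint: 1 CPU
    }
  })
  await container.start()

  // Stop the algorithm if it exceeds its time quota
  const timer = setTimeout(() => container.stop().catch(() => {}), maxSeconds * 1000)
  await container.wait() // block until the algorithm exits (or is stopped)
  clearTimeout(timer)
}
```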

```
title C2Dv2 message flow for docker module
User -> Ocean-node: start c2d job
Ocean-node -> Orchestration-class: start c2d job
Orchestration-class -> Orchestration-class: determine module and insert workflow, random private key in db
Orchestration-class -> Docker-engine: queue job
Docker-engine -> Docker_host_api: create job volume
Docker-engine -> Docker-engine: create yaml for pod-configuration, set private key
Docker-engine -> Docker_host_api: start pod-configuration
Pod_configuration -> Pod_configuration: starts ocean-node as pod-config
Pod_configuration -> Ocean-node: call c2dJobProvision
Ocean-node -> Pod_configuration: return workflow
Pod_configuration -> Pod_configuration: download inputs & algo
Pod_configuration -> Ocean-node: call c2dJobStatusUpdate
Ocean-node -> Docker-engine: download success, start algo
Docker-engine -> Docker-engine: create yaml for algo
Docker-engine -> Docker_host_api: start algo container
Docker-engine -> Docker-engine: monitor algo container, stop if timeout
Docker-engine -> Docker-engine: create yaml for pod-publishing, set private key
Docker-engine -> Docker_host_api: start pod-publishing
Docker_host_api -> Pod-Publishing: start as docker container
Pod-Publishing -> Pod-Publishing: prepare output
Pod-Publishing -> Ocean-node: call c2dJobPublishResult
Pod-Publishing -> Ocean-node: call c2dJobStatusUpdate
```
<figure><img src="../../.gitbook/assets/image.png" alt=""><figcaption><p>C2Dv2 flow diagram</p></figcaption></figure>

#### Kubernetes Engine

This module requires access to Kubernetes credentials (or autodetects them if ocean-node already runs in a Kubernetes cluster). It leverages the Kubernetes API to (see the sketch after this list):

* Create job volumes (with quotas)
* Start the provisioning container (pod-configuration)
* Monitor its status
* Create YAML for algorithms with hardware constraints (CPU, RAM)
* Pass devices for GPU environments
* Start the algorithm container
* Monitor algorithm health & timeout constraints
* Stop the algorithm if the quota is exceeded
* Start the publishing container
* Delete job volumes
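
For symmetry with the Docker sketch, the same kind of step on Kubernetes via @kubernetes/client-node (the 0.x positional-argument style is assumed here; the namespace, names, and limits are illustrative, and the PVC for the job volume is assumed to already exist).

```typescript
// Sketch: start an algorithm pod with hardware constraints and the shared
// job volume. Namespace, names, and limits are illustrative assumptions.
import * as k8s from '@kubernetes/client-node'

const kc = new k8s.KubeConfig()
kc.loadFromDefault() // picks up in-cluster credentials or ~/.kube/config

const core = kc.makeApiClient(k8s.CoreV1Api)

async function startAlgoPod(jobId: string, image: string) {
  await core.createNamespacedPod('c2d-jobs', {
    metadata: { name: `algo-${jobId}` },
    spec: {
      restartPolicy: 'Never',
      containers: [
        {
          name: 'algorithm',
          image,
          // hardware constraints (CPU, RAM), as in the list above
          resources: { limits: { cpu: '1', memory: '1Gi' } },
          volumeMounts: [{ name: 'job-volume', mountPath: '/data' }]
        }
      ],
      volumes: [
        { name: 'job-volume', persistentVolumeClaim: { claimName: `c2d-${jobId}` } }
      ]
    }
  })
}
```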

#### POD-\* Common Description

For efficient communication between ocean-node and the two containers, the easiest approach is to use the p2p/HTTP API. Thus, all pod-\* instances will run an ocean-node instance (each with a job-generated random key) and connect to the main ocean-node instance. The main ocean-node instance's peerNodeId or HTTP API endpoint will be inserted into the YAML. Each pod-\* will use a private key, also exposed in the YAML.

Each YAML of pod-\* will contain the following environment variables (see the snippet after this list):

* nodePeerId
* nodeHttpApi
* privateKey
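
For illustration, a pod-\* entrypoint would read these at startup (variable casing taken from the list above; the error policy is an assumption).

```typescript
// Read the job configuration injected via the YAML (names from the list above)
const nodePeerId = process.env.nodePeerId
const nodeHttpApi = process.env.nodeHttpApi
const privateKey = process.env.privateKey

if (!privateKey || (!nodePeerId && !nodeHttpApi)) {
  throw new Error('pod-* started without its job key or a way to reach the main node')
}
```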

#### Pod-configuration

Previously, pod-configuration was a standalone repository built as a Docker image. In this implementation, it will be ocean-node with a different entrypoint (entry\_configuration.js).

Implementation (see the sketch after this list):

* Call ocean-node/c2dJobProvision and get the workflow's input section
* Download all assets
* Call the ocean-node/c2dJobStatusUpdate core command to update the status (provision finished or errors)
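
A sketch of that flow; the `/directCommand` endpoint, the payload shapes, and the `downloadAsset` helper are assumptions for illustration.

```typescript
// Illustrative pod-configuration flow. The /directCommand endpoint, command
// payloads, and the downloadAsset helper are assumptions for the sketch.
async function updateStatus(api: string, jobId: string, status: string) {
  await fetch(`${api}/directCommand`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ command: 'c2dJobStatusUpdate', jobId, status })
  })
}

declare function downloadAsset(input: unknown, dir: string): Promise<void> // hypothetical helper

export async function provision(jobId: string, api: string) {
  // 1. Fetch the workflow's input section from the main ocean-node
  const res = await fetch(`${api}/directCommand`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ command: 'c2dJobProvision', jobId })
  })
  const workflow = await res.json()

  try {
    // 2. Download all assets into the shared job volume
    for (const input of workflow.input) await downloadAsset(input, '/data/inputs')
    await updateStatus(api, jobId, 'provisioned')
  } catch (err) {
    // 3. Report failure so the engine can mark the job as errored
    await updateStatus(api, jobId, `error: ${err}`)
  }
}
```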

#### Pod-publishing

Previously, pod-publishing was a standalone repository built as a Docker image. In this implementation, it will be ocean-node with a different entrypoint (entry\_publishing.js).

Implementation (see the sketch after this list):

* Read the output folder
* If multiple files or folders are detected, create a zip with all those files/folders
* Call the ocean-node/c2dJobPublishResult core command and let ocean-node handle storage
* Call the ocean-node/c2dJobStatusUpdate core command to update the job as done
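
A sketch of the packaging step, using the archiver package as an assumed zip library (any zip library would do); the paths are illustrative.

```typescript
// Illustrative output-packaging step for pod-publishing; paths are assumed.
import { createWriteStream } from 'node:fs'
import { readdir } from 'node:fs/promises'
import archiver from 'archiver'

async function packageOutput(outputDir: string): Promise<string> {
  const entries = await readdir(outputDir)
  // A single file can be published as-is; multiple entries get zipped
  if (entries.length === 1) return `${outputDir}/${entries[0]}`

  const zipPath = '/tmp/outputs.zip'
  const archive = archiver('zip')
  const done = new Promise<void>((resolve, reject) => {
    const out = createWriteStream(zipPath)
    out.on('close', () => resolve())
    archive.on('error', reject)
    archive.pipe(out)
  })
  archive.directory(outputDir, false) // add the whole folder at the zip root
  await archive.finalize()
  await done
  return zipPath
}
```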