mirror of https://github.com/oceanprotocol/docs.git, synced 2024-11-26 19:49:26 +01:00
# Compute-to-data (C2D)

C2Dv2 continues the concept of bringing algorithms to the data, allowing both public and private datasets to be used with algorithms. While previous versions relied on external components (Provider -> Operator Service running in Kubernetes -> multiple Operator Engines, each running in its own Kubernetes namespace), C2Dv2 is embedded entirely within the ocean-node.

It has a modular approach, allowing multiple compute engines to be connected to the same ocean-node. These compute engines can be internal (Docker, or Kubernetes if ocean-node runs in a Kubernetes environment) or external (in the future, integration with projects like Bacalhau, iExec, etc. is possible).
### Additional Features

* Allow multiple C2D engines to connect to the same ocean-node
* Support multiple jobs (stages) in a workflow
* Jobs can be dependent or independent of previous stages, allowing for parallel or serial job execution
### Workflows

A workflow defines one or more jobs to be executed. Each job may have dependencies on a previous job.
```json
[
  {
    "index": number,
    "jobId": "generated by orchestrator",
    "runAfter": "if defined, wait for specific jobId to finish",
    "input": [
      {
        "index": number,
        "did": "optional",
        "serviceId": "optional",
        "files": "filesObject, optional"
      }
    ],
    "algorithm": {
      "did": "optional",
      "serviceId": "optional",
      "files": "filesObject, optional",
      "rawcode": "optional",
      "container": {
        "entrypoint": "string",
        "image": "string",
        "tag": "string"
      }
    }
  }
]
```
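As an illustration only, a hypothetical workflow with two jobs where the second runs after the first finishes; all DIDs, service IDs, entrypoints, and images below are placeholders, not real assets:

```json
[
  {
    "index": 0,
    "jobId": "job-1",
    "input": [
      { "index": 0, "did": "did:op:<dataset>", "serviceId": "<serviceId>" }
    ],
    "algorithm": {
      "did": "did:op:<algorithm>",
      "serviceId": "<serviceId>",
      "container": { "entrypoint": "python $ALGO", "image": "python", "tag": "3.11" }
    }
  },
  {
    "index": 1,
    "jobId": "job-2",
    "runAfter": "job-1",
    "input": [
      { "index": 0, "files": "<filesObject>" }
    ],
    "algorithm": {
      "rawcode": "print('post-processing')",
      "container": { "entrypoint": "python $ALGO", "image": "python", "tag": "3.11" }
    }
  }
]
```

Because `job-2` declares `runAfter: "job-1"`, the two jobs run serially; omitting `runAfter` would let them run in parallel.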
### Orchestration Layer

Formerly known as the "operator-service," this layer handles interactions between the ocean-node core layer and different execution environments.

In summary, it should:
* Expose a list of compute environments for all engines
* Expose a list of running jobs and limits (e.g., max concurrent jobs)
* Take on new jobs (created by the startJob core handler)
* Determine which module to use (Docker, Kubernetes, Bacalhau, etc.)
* Insert workflow into the database
* Signal the engine handler to take over job execution
* Read workflow status when the C2D getStatus core command is called
* Serve job results when the C2D getJobResult is called
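The duties above can be sketched as a toy in-memory orchestrator. This is purely illustrative: the type and method names below are assumptions, not the real ocean-node interfaces, and the `Map` stands in for the database.

```typescript
// Toy sketch of the orchestration layer's surface (names are illustrative).
type JobStatus = "queued" | "running" | "done" | "error";

interface C2DJob {
  jobId: string;
  engine: "docker" | "kubernetes";
  status: JobStatus;
}

class Orchestrator {
  // Stands in for "insert workflow into the database".
  private jobs = new Map<string, C2DJob>();

  // Called by the startJob core handler: pick an engine, persist the job,
  // then signal the engine handler to take over execution.
  startJob(jobId: string, engine: C2DJob["engine"]): C2DJob {
    const job: C2DJob = { jobId, engine, status: "queued" };
    this.jobs.set(jobId, job);
    return job;
  }

  // Backs the C2D getStatus core command.
  getStatus(jobId: string): JobStatus | undefined {
    return this.jobs.get(jobId)?.status;
  }
}
```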
Due to technical constraints, both internal modules (Docker and Kubernetes) will use Docker images for data provisioning (previously pod-configuration) and results publishing (previously pod-publishing). The orchestration layer will also expose two new core commands:

* `c2dJobStatusUpdate` (called by both pod-configuration and pod-publishing to update job status)
* `c2dJobPublishResult` (called by pod-publishing when results need to be uploaded)

When any pod-\* calls one of these endpoints, we must verify the signature and respond accordingly.
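A minimal sketch of that check, assuming an ed25519 per-job key pair; the actual key type and payload layout used by ocean-node may differ:

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Orchestrator side: generate a key pair per job; the private key is written
// into the pod's YAML, the public key is kept with the job record.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Pod side: sign the status-update payload before calling the core command.
const payload = Buffer.from(
  JSON.stringify({ jobId: "job-1", status: "provisioned" })
);
const signature = sign(null, payload, privateKey);

// Node side: accept c2dJobStatusUpdate / c2dJobPublishResult only when the
// signature matches the job's public key.
export function isAuthentic(data: Buffer, sig: Buffer): boolean {
  return verify(null, data, publicKey, sig);
}
```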
### Payment Flow in Orchestration

This will be based on an escrow contract. The orchestrator will:

* Compute the sum of maxDuration from all jobs in the workflow
* Calculate the required fee (depending on the previous step, token, environment, etc.)
* Lock the amount in the escrow contract
* Wait until all jobs are finished (successfully or not)
* Calculate the actual duration spent
* Compute proof
* Withdraw payment & provide proof, releasing the difference back to the customer
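The lock/settle arithmetic can be sketched as follows. The flat per-second rate and the job shape are invented for illustration; real fees depend on the token, environment, and escrow contract:

```typescript
// Toy arithmetic for the escrow steps above (rate model is an assumption).
interface JobSpec {
  maxDuration: number; // seconds
}

// Amount to lock up front: sum of maxDuration across all jobs, times the rate.
export function amountToLock(jobs: JobSpec[], ratePerSecond: number): number {
  const totalMax = jobs.reduce((sum, j) => sum + j.maxDuration, 0);
  return totalMax * ratePerSecond;
}

// After all jobs finish: charge only for time actually used, refund the rest.
export function settlement(
  locked: number,
  actualSeconds: number,
  ratePerSecond: number
): { owed: number; refund: number } {
  const owed = Math.min(locked, actualSeconds * ratePerSecond);
  return { owed, refund: locked - owed };
}
```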
### C2D Engines

A C2D Engine is a piece of code that handles C2D jobs running on a specific orchestration implementation. This document focuses on internal compute engines: Docker-based (a host with the Docker environment installed) and Kubernetes-based (if ocean-node runs inside a Kubernetes cluster).

An engine that uses external services (like Bacalhau) follows the same logic but will likely interact with remote APIs.

An engine is responsible for:

* Storing workflows and each job's status (so on restart, we can resume or continue running flows)
* Queueing new jobs
#### Docker Engine

This module requires the Docker service installed at the host level. It leverages the Docker API to:

* Create job volumes (with quotas)
* Start the provisioning container (pod-configuration)
* Monitor its status
* Create YAML for algorithms with hardware constraints (CPU, RAM)
* Pass devices for GPU environments
* Start the algorithm container
* Monitor algorithm health & timeout constraints
* Stop the algorithm if the quota is exceeded
* Start the publishing container
* Delete job volumes
```
title C2Dv2 message flow for docker module

User -> Ocean-node: start c2d job
Ocean-node -> Orchestration-class: start c2d job
Orchestration-class -> Orchestration-class: determine module, insert workflow and random private key in db
Orchestration-class -> Docker-engine: queue job
Docker-engine -> Docker_host_api: create job volume
Docker-engine -> Docker-engine: create yaml for pod-configuration, set private key
Docker-engine -> Docker_host_api: start pod-configuration
Pod_configuration -> Pod_configuration: starts ocean-node as pod-config
Pod_configuration -> Ocean-node: call c2dJobProvision
Ocean-node -> Pod_configuration: return workflow
Pod_configuration -> Pod_configuration: download inputs & algo
Pod_configuration -> Ocean-node: call c2dJobStatusUpdate
Ocean-node -> Docker-engine: download success, start algo
Docker-engine -> Docker-engine: create yaml for algo
Docker-engine -> Docker_host_api: start algo container
Docker-engine -> Docker-engine: monitor algo container, stop if timeout
Docker-engine -> Docker-engine: create yaml for pod-publishing, set private key
Docker-engine -> Docker_host_api: start pod-publishing
Docker_host_api -> Pod-Publishing: start as docker container
Pod-Publishing -> Pod-Publishing: prepare output
Pod-Publishing -> Ocean-node: call c2dJobPublishResult
Pod-Publishing -> Ocean-node: call c2dJobStatusUpdate
```
<figure><img src="../.gitbook/assets/image.png" alt=""><figcaption><p>C2Dv2 flow diagram</p></figcaption></figure>
#### Kubernetes Engine

This module requires access to Kubernetes credentials (or autodetects them if ocean-node already runs in a Kubernetes cluster). It leverages the Kubernetes API to:

* Create job volumes (with quotas)
* Start the provisioning container (pod-configuration)
* Monitor its status
* Create YAML for algorithms with hardware constraints (CPU, RAM)
* Pass devices for GPU environments
* Start the algorithm container
* Monitor algorithm health & timeout constraints
* Stop the algorithm if the quota is exceeded
* Start the publishing container
* Delete job volumes
#### POD-\* Common Description

For efficient communication between ocean-node and the two containers, the easiest way is to use the p2p/HTTP API. Thus, all pod-\* instances will run an ocean-node instance (each with a job-generated random key) and connect to the main ocean-node instance. The main ocean-node instance's peerNodeId or HTTP API endpoint will be inserted in the YAML. Each pod-\* will use a private key, also exposed in the YAML.

Each pod-\* YAML will contain the following environment variables:

* nodePeerId
* nodeHttpApi
* privateKey
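For instance, the `env` section of a pod-\* container spec might look like the fragment below; the values are placeholders, not real identifiers:

```yaml
env:
  - name: nodePeerId
    value: "<main ocean-node peer id>"
  - name: nodeHttpApi
    value: "<main ocean-node HTTP API endpoint>"
  - name: privateKey
    value: "<job-generated random key>"
```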
#### Pod-configuration

Previously, pod-configuration was a standalone repository built as a Docker image. In this implementation, it will be ocean-node with a different entrypoint (entry\_configuration.js).

Implementation:

* Call ocean-node/c2dJobProvision and get the workflow's input section
* Download all assets
* Call the ocean-node/c2dJobStatusUpdate core command to update status (provision finished or errors)
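The steps above can be sketched as follows. `call` stands in for the p2p/HTTP command transport and `download` for asset fetching; both names, and the payload shapes, are assumptions for illustration, not the real ocean-node API:

```typescript
// Hedged sketch of pod-configuration's provisioning loop.
type CoreCall = (command: string, payload: unknown) => Promise<any>;

export async function provision(
  jobId: string,
  call: CoreCall,
  download: (input: unknown) => Promise<void>
): Promise<void> {
  // Fetch the workflow's input section from the main ocean-node.
  const workflow = await call("c2dJobProvision", { jobId });
  try {
    for (const input of workflow.input) {
      await download(input); // pull each dataset / algorithm asset
    }
    await call("c2dJobStatusUpdate", { jobId, status: "provisioned" });
  } catch (err) {
    await call("c2dJobStatusUpdate", { jobId, status: "error", message: String(err) });
  }
}
```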
#### Pod-publishing

Previously, pod-publishing was a standalone repository built as a Docker image. In this implementation, it will be ocean-node with a different entrypoint (entry\_publishing.js).

Implementation:

* Read the output folder
* If multiple files or folders are detected, create a zip with all those files/folders
* Call the ocean-node/c2dJobPublishResult core command and let ocean-node handle storage
* Call the ocean-node/c2dJobStatusUpdate core command to update the job as done