1
0
mirror of https://github.com/oceanprotocol/docs.git synced 2024-11-01 15:55:34 +01:00
docs/infrastructure/compute-to-data-minikube.md

244 lines
8.1 KiB
Markdown
Raw Normal View History

---
title: Minikube Compute-to-Data Environment
---
# Deploying C2D
2023-07-16 09:19:03 +02:00
This chapter will present how to deploy the C2D component of the Ocean stack. As mentioned in the [C2D Architecture chapter](../developers/compute-to-data/#architecture-and-overview-guides), the Compute-to-Data component uses Kubernetes to orchestrate the creation and deletion of the pods in which the C2D jobs are run.
For the ones that do not have a Kubernetes environment available, we added to this guide instructions on how to install Minikube, which is a lightweight Kubernetes implementation that creates a VM on your local machine and deploys a simple cluster containing only one node. In case you have a Kubernetes environment in place, please skip directly to step 4 of this guide.
### Requirements
* Communications: a functioning internet-accessible provider service
* Hardware: a server capable of running compute jobs (e.g. we used a machine with 8 CPUs, 16 GB Ram, 100GB SSD, and a fast internet connection). See [this guide](setup-server.md) for how to create a server;
* Operating system: Ubuntu 22.04 LTS
### Steps
1. [Install Docker and Git](compute-to-data-minikube.md#install-docker-and-git)
2. [Install Minikube](compute-to-data-minikube.md#install-minikube)
3. [Start Minikube](compute-to-data-minikube.md#start-minikube)
4. [Install the Kubernetes command line tool (kubectl)](compute-to-data-minikube.md#install-the-kubernetes-command-line-tool-kubectl)
5. [Run the IPFS host (optional)](compute-to-data-minikube.md#run-the-ipfs-host-optional)
6. [Update the storage class](compute-to-data-minikube.md#update-the-storage-class)
7. [Download and Configure Operator Service](compute-to-data-minikube.md#download-and-configure-operator-service)
8. [Download and Configure Operator Engine](compute-to-data-minikube.md#download-and-configure-operator-engine)
9. [Create namespaces](compute-to-data-minikube.md#create-namespaces)
10. [Deploy Operator Service](compute-to-data-minikube.md#deploy-operator-service)
11. [Deploy Operator Engine](compute-to-data-minikube.md#deploy-operator-engine)
12. [Expose Operator Service](compute-to-data-minikube.md#expose-operator-service)
13. [Initialize the database](compute-to-data-minikube.md#initialize-database)
14. [Update Provider](compute-to-data-minikube.md#update-provider)
#### Install Docker and Git
```bash
sudo apt update
sudo apt install git docker.io
sudo usermod -aG docker $USER && newgrp docker
```
#### Install Minikube
```bash
wget -q --show-progress https://github.com/kubernetes/minikube/releases/download/v1.22.0/minikube_1.22.0-0_amd64.deb
sudo dpkg -i minikube_1.22.0-0_amd64.deb
```
#### Start Minikube
2021-11-09 12:00:38 +01:00
The first command is important and solves a [PersistentVolumeClaims problem](https://github.com/kubernetes/minikube/issues/7828).
2021-11-09 12:00:38 +01:00
```bash
minikube config set kubernetes-version v1.16.0
minikube start --cni=calico --driver=docker --container-runtime=docker
```
Depending on the number of available CPUs, RAM, and the required resources for running the job, consider adding options `--cpu`, `--memory`, and `--disk-size` to avoid runtime issues.
For other options to run minikube refer to this [link](https://minikube.sigs.k8s.io/docs/commands/start/)
#### Install the Kubernetes command line tool (kubectl)
2021-11-09 12:00:38 +01:00
```bash
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
curl -LO "https://dl.k8s.io/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl.sha256"
echo "$(<kubectl.sha256) kubectl" | sha256sum --check
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
```
Wait until all the defaults are running (1/1).
2021-11-09 12:00:38 +01:00
```bash
watch kubectl get pods --all-namespaces
```
#### Run the IPFS host (optional)
2023-07-16 09:19:03 +02:00
To store the results and the logs of the C2D jobs, you can use either an AWS S3 bucket or IPFS.
In case you want to use IPFS you need to run an IPFS host, as presented below.
2021-11-23 11:58:02 +01:00
2021-11-09 12:00:38 +01:00
```bash
export ipfs_staging=~/ipfs_staging
export ipfs_data=~/ipfs_data
docker run -d --name ipfs_host -v $ipfs_staging:/export -v $ipfs_data:/data/ipfs -p 4001:4001 -p 4001:4001/udp -p 127.0.0.1:8080:8080 -p 127.0.0.1:5001:5001 ipfs/go-ipfs:latest
sudo /bin/sh -c 'echo "127.0.0.1 youripfsserver" >> /etc/hosts'
```
#### Update the storage class
2021-11-22 18:56:28 +01:00
2023-07-16 09:19:03 +02:00
The storage class is used by Kubernetes to create the temporary volumes on which the data used by the algorithm will be stored.
2021-11-22 18:56:28 +01:00
2023-07-16 09:19:03 +02:00
Please ensure that your class allocates volumes in the same region and zone where you are running your pods.
2023-07-16 09:19:03 +02:00
You need to consider the storage class available for your environment.
For Minikube, you can use the default 'standard' class.
2021-11-22 18:56:28 +01:00
In AWS, we created our own 'standard' class:
2021-11-22 18:56:28 +01:00
```bash
kubectl get storageclass standard -o yaml
```
```yaml
allowedTopologies:
- matchLabelExpressions:
- key: failure-domain.beta.kubernetes.io/zone
values:
- us-east-1a
apiVersion: storage.k8s.io/v1
kind: StorageClass
parameters:
fsType: ext4
type: gp2
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
For more information, please visit https://kubernetes.io/docs/concepts/storage/storage-classes/
#### Download and Configure Operator Service
Open a new terminal and run the command below.
2021-11-23 11:58:02 +01:00
```bash
git clone https://github.com/oceanprotocol/operator-service.git
```
Edit `operator-service/kubernetes/postgres-configmap.yaml`. Change `POSTGRES_PASSWORD` to a nice long random password.
Edit `operator-service/kubernetes/deployment.yaml`. Optionally change:
* `ALGO_POD_TIMEOUT`
* add `requests_cpu`
* add `requests_memory`
* add `limits_cpu`
* add `limits_memory`
```yaml
2023-04-10 12:57:00 +02:00
---
spec:
containers:
- env:
- name: requests_cpu
value: "4"
- name: requests_memory
value: "8Gi"
- name: limits_cpu
value: "8"
- name: limits_memory
value: "15Gi"
- name: ALGO_POD_TIMEOUT
value: "3600"
```
#### Download and Configure Operator Engine
```bash
git clone https://github.com/oceanprotocol/operator-engine.git
```
Check the [README](https://github.com/oceanprotocol/operator-engine#customize-your-operator-engine-deployment) section of the operator engine to customize your deployment.
At a minimum, you should add your IPFS URLs or AWS settings, and add (or remove) notification URLs.
#### Create namespaces
```bash
kubectl create ns ocean-operator
kubectl create ns ocean-compute
```
#### Deploy Operator Service
```bash
kubectl config set-context --current --namespace ocean-operator
kubectl create -f operator-service/kubernetes/postgres-configmap.yaml
kubectl create -f operator-service/kubernetes/postgres-storage.yaml
kubectl create -f operator-service/kubernetes/postgres-deployment.yaml
kubectl create -f operator-service/kubernetes/postgresql-service.yaml
kubectl apply -f operator-service/kubernetes/deployment.yaml
```
#### Deploy Operator Engine
```bash
kubectl config set-context --current --namespace ocean-compute
kubectl apply -f operator-engine/kubernetes/sa.yml
kubectl apply -f operator-engine/kubernetes/binding.yml
kubectl apply -f operator-engine/kubernetes/operator.yml
kubectl create -f operator-service/kubernetes/postgres-configmap.yaml
```
**Optional**: For production enviroments, it's safer to block access to metadata. To do so run the below command:
```bash
kubectl -n ocean-compute apply -f /ocean/operator-engine/kubernetes/egress.yaml
```
#### Expose Operator Service
```bash
kubectl expose deployment operator-api --namespace=ocean-operator --port=8050
```
Run a port forward or create your ingress service and setup DNS and certificates (not covered here):
```bash
kubectl -n ocean-operator port-forward svc/operator-api 8050
```
Alternatively you could use another method to communicate between the C2D Environment and the provider, such as an SSH tunnel.
#### Initialize database
If your Minikube is running on compute.example.com:
```bash
curl -X POST "https://compute.example.com/api/v1/operator/pgsqlinit" -H "accept: application/json"
```
#### Update Provider
Update your provider service by updating the `operator_service.url` value in `config.ini`
```ini
operator_service.url = https://compute.example.com/
```
Restart your provider service.