---
description: >-
  How to construct the beginnings of an awesome algorithm for C2D compute jobs
  on datasets 😎
---
# Make a Boss C2D Algorithm
<figure><img src="../../.gitbook/assets/like-a-boss.gif" alt=""><figcaption></figcaption></figure>

Any great Compute-to-Data algorithm starts by referencing the dataset file correctly inside the Docker container. The Python and JavaScript snippets below show how to locate and open that file:
### Open the local dataset file
{% tabs %}
{% tab title="Python" %}
```python
import csv
import json
import os


def get_input(local=False):
    # DIDS holds a JSON-encoded list of the dataset DIDs for this job
    dids = os.getenv("DIDS", None)

    if not dids:
        print("No DIDs found in the environment. Aborting.")
        return

    dids = json.loads(dids)

    for did in dids:
        filename = f"data/inputs/{did}/0"  # 0 for metadata service
        print(f"Reading asset file {filename}.")
        # Return the file of the first DID in the list
        return filename


# Get the input filename using the get_input function
input_filename = get_input()

if not input_filename:
    # No input filename returned
    exit()

# Open the file & run your code
with open(input_filename, 'r') as file:
    # Read the CSV file
    csv_reader = csv.DictReader(file)

    # <YOUR CODE GOES HERE>
```
{% endtab %}
{% tab title="Javascript" %}
```javascript
const fs = require("fs");

var input_folder = "/data/inputs";
var output_folder = "/data/outputs";

// Recursively walk a folder and handle every file found in it
async function processfolder(Path) {
  var files = fs.readdirSync(Path);

  for (var i = 0; i < files.length; i++) {
    var file = files[i];
    var fullpath = Path + "/" + file;

    if (fs.statSync(fullpath).isDirectory()) {
      await processfolder(fullpath);
    } else {
      // <YOUR CODE GOES HERE>
    }
  }
}

// <YOUR CODE & FUNCTION DEFINITIONS GO HERE>

// Open the file & run your code
processfolder(input_folder);
```
{% endtab %}
{% endtabs %}
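
As an illustration of what could replace the placeholder, here is a minimal Python sketch that counts the rows of a CSV input and writes a small JSON summary. The function name `summarize_csv` and the file name `summary.json` are hypothetical, and writing results to `/data/outputs` is an assumption based on the `output_folder` variable in the JavaScript snippet.

```python
import csv
import json
import os


def summarize_csv(input_path, output_dir):
    """Count data rows and collect column names, then write a JSON summary."""
    with open(input_path, "r", newline="") as file:
        reader = csv.DictReader(file)
        # DictReader yields one dict per data row; the header is not counted
        row_count = sum(1 for _ in reader)
        # fieldnames is None for an empty file, so fall back to an empty list
        columns = list(reader.fieldnames or [])

    summary = {"rows": row_count, "columns": columns}

    # Create the output folder if it does not exist yet
    os.makedirs(output_dir, exist_ok=True)
    output_path = os.path.join(output_dir, "summary.json")  # hypothetical file name
    with open(output_path, "w") as out:
        json.dump(summary, out)

    return summary
```

Assuming the `input_filename` returned by `get_input()` in the Python tab, a call might look like `summarize_csv(input_filename, "/data/outputs")`.
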
**Note:** The following Python libraries are available for use in your code:
```python
# Python modules
numpy==1.16.3
pandas==0.24.2
python-dateutil==2.8.0
pytz==2019.1
six==1.12.0
sklearn
xlrd==1.2.0
openpyxl>=3.0.3
wheel
matplotlib
```
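
Which of these libraries are actually present may depend on the Docker image your job runs in. The following is a minimal sketch for checking at start-up, using only the standard library. The helper name `find_missing` is hypothetical, and the import names are listed by hand because they can differ from the package names.

```python
import importlib.util

# Import names for the packages listed above. Note that python-dateutil is
# imported as "dateutil"; the rest share their package name.
MODULES = [
    "numpy", "pandas", "dateutil", "pytz", "six",
    "sklearn", "xlrd", "openpyxl", "wheel", "matplotlib",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

Calling `find_missing(MODULES)` before the main computation lets the algorithm fail early with a clear message instead of raising an `ImportError` partway through a job.
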