From 8bbb0634d7c7781640de1a3ff3ee531cdcb9fa80 Mon Sep 17 00:00:00 2001
From: Christian Casazza
Date: Thu, 8 Jun 2023 21:27:28 +0000
Subject: [PATCH] GITBOOK-469: change request with no subject merged in GitBook

---
 SUMMARY.md                                     |  2 +-
 ...d => benefits-of-ocean-for-data-science.md |  0
 data-science/README.md                         | 23 ++++++++++++-------
 data-science/data-engineers.md                 |  2 +-
 data-science/the-data-value-creation-loop.md   |  2 +-
 5 files changed, 18 insertions(+), 11 deletions(-)
 rename data-science/benefits-of-ocean-for-data-science.md => benefits-of-ocean-for-data-science.md (100%)

diff --git a/SUMMARY.md b/SUMMARY.md
index 9519f5e5..0a102ef0 100644
--- a/SUMMARY.md
+++ b/SUMMARY.md
@@ -95,9 +95,9 @@
   * [Authentication Endpoints](developers/provider/authentication-endpoints.md)
 * [📊 Data Science](data-science/README.md)
   * [The Data Value Creation Loop](data-science/the-data-value-creation-loop.md)
-  * [Benefits of Ocean for Data Science](data-science/benefits-of-ocean-for-data-science.md)
   * [Examples of valuable data](data-science/data-engineers.md)
   * [Data Scientists](data-science/data-scientists.md)
+* [Benefits of Ocean for Data Science](benefits-of-ocean-for-data-science.md)
 * [🔨 Infrastructure](infrastructure/README.md)
   * [Setup a Server](infrastructure/setup-server.md)
   * [Deploying Marketplace](infrastructure/deploying-marketplace.md)
diff --git a/data-science/benefits-of-ocean-for-data-science.md b/benefits-of-ocean-for-data-science.md
similarity index 100%
rename from data-science/benefits-of-ocean-for-data-science.md
rename to benefits-of-ocean-for-data-science.md
diff --git a/data-science/README.md b/data-science/README.md
index 1047908b..451028c5 100644
--- a/data-science/README.md
+++ b/data-science/README.md
@@ -5,7 +5,21 @@ coverY: 0

 # 📊 Data Science

-Ocean Protocol was built to serve the data science space. This guide links you to the most important tutorials for data scientists working with Ocean Protocol.
+Ocean Protocol was built to serve the data science space.
+
+Data Value Creation Loop stage
+
+With Ocean, each [Data Value Creation Loop](the-data-value-creation-loop.md) stage is tokenized with data NFTs and datatokens. Leveraging these token standards unlocks several unique benefits for the ecosystem. Together, stakeholders can build sophisticated products by combining assets published on Ocean.
+
+Data engineers can publish pipelines for curated data, allowing data scientists to conduct feature engineering and build models on top. The models can be deployed with Compute-to-Data and leveraged by app developers building the last-mile distribution of model outputs into business practices.
+
+
+
+Ocean Protocol unlocks _composable data science_. Instead of a single data scientist needing to conduct every stage of the pipeline themselves, practitioners can work together, build on each other's components, and focus on what they do best.
+
+
+
+This guide links you to the most important tutorials for data scientists working with Ocean Protocol.



@@ -15,10 +29,3 @@ Ocean Protocol was built to serve the data science space. This guide links you t
 * Ocean's [Compute-to-Data](../developers/compute-to-data/) engine resolves the trade-off between the benefits of open data and data privacy risks. Using the engine, algorithms can be run on data without exposing the underlying data. Now, data can be widely shared and monetized without compromising privacy.
 * [Ocean.py](../developers/ocean.py/) is a Python library that interacts with all Ocean contracts and tools. To get started with the library, check out our guides.
 They will teach installation and set-up and several popular workflows, such as [publishing an asset](../developers/ocean.py/publish-flow.md) and starting a [compute job](../developers/ocean.py/compute-flow.md).
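+
+As a rough sketch of the publish flow (the function names, network name, and wallet setup shown here are illustrative and may differ across ocean.py versions — the [publishing guide](../developers/ocean.py/publish-flow.md) has the authoritative steps), publishing a URL-hosted dataset looks roughly like this:
+
+```python
+import os
+
+from brownie.network import accounts
+from ocean_lib.example_config import get_config_dict
+from ocean_lib.ocean.ocean import Ocean
+
+# Point ocean.py at the network you use (network name here is illustrative)
+config = get_config_dict("mumbai")
+ocean = Ocean(config)
+
+# Load a publisher wallet from a private key kept in an env var (illustrative name)
+alice = accounts.add(os.getenv("PUBLISHER_PRIVATE_KEY"))
+
+# Publish a URL-hosted dataset: this mints a data NFT and a datatoken, and creates a DDO
+name = "Example dataset"  # placeholder name
+url = "https://example.com/data.csv"  # placeholder URL
+data_nft, datatoken, ddo = ocean.assets.create_url_asset(name, url, {"from": alice})
+print(f"Published asset with DID {ddo.did}")
+```
+
+Compute jobs on published assets follow a similar pattern; see the [compute job guide](../developers/ocean.py/compute-flow.md).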
-
-
-How to take part in the ecosystem
-
-* Publish useful data
-*
-
diff --git a/data-science/data-engineers.md b/data-science/data-engineers.md
index 5390d26f..68508da1 100644
--- a/data-science/data-engineers.md
+++ b/data-science/data-engineers.md
@@ -1,6 +1,6 @@
 # Examples of valuable data

-The data value creation loop begins with a us
+There is an opportunity to create valuable data assets across many domains. Some examples include:

 * **Government Open Data:** Governments serve as a rich and reliable source of data. However, this data often lacks proper documentation or poses challenges for data scientists to work with effectively. Establishing robust Extract, Transform, Load (ETL) pipelines enhances accessibility to government open data. This way, others can tap into this wealth of information without unnecessary hurdles. For example, in one of our [data challenges](https://desights.ai/shared/challenge/8) we leveraged public real estate data from Dubai to build use cases for understanding and predicting valuations and rents. Local, state, and federal governments around the world provide access to valuable data. Build pipelines to make consuming that data easier and help others build useful products to help your local community.
 * **Public APIs:** A wide range of freely available public APIs covers various data verticals. Leveraging these APIs, data engineers can construct pipelines that enable efficient access and utilization of the data. [This](https://github.com/public-apis/public-apis) is a repository of public APIs for a wide range of topics, from weather to gaming to finance.
diff --git a/data-science/the-data-value-creation-loop.md b/data-science/the-data-value-creation-loop.md
index b3c2f9b8..de32b384 100644
--- a/data-science/the-data-value-creation-loop.md
+++ b/data-science/the-data-value-creation-loop.md
@@ -1,6 +1,6 @@
 # The Data Value Creation Loop

-The Data Value Creation Loop lives at the heart of Ocean Protocol. It refers to the process in which data gains value as it progresses from business problem, to raw data, to cleaned data, to trained model, to its use in applications. At each step of the way, additional work is done on the data so that it accrues greater value.
+The Data Value Creation Loop represents the journey of data as it progresses from a business problem to raw data, undergoes cleansing and refinement, is used to train models, and finally finds its application in real-world scenarios. Data assets accrue more value at each stage of the loop as they get closer to real-world deployment. Value is created when a variety of different skill sets work together: business stakeholders, data engineers, data scientists, MLOps engineers, and application developers.

 * **Business Problem:** Identifying the business problem that can be addressed with data science is the critical first step. Example: Reducing customer churn rate, predicting token prices, or predicting drought risk.
 * **Raw Data**: This is the unprocessed, untouched data, fresh from the source. This data can be static or pulled dynamically from an API. Example: User profiles, historical prices, or daily temperature.