University Essentials Updates (rtdip#798)
* Updates

Signed-off-by: GBBBAS <[email protected]>

* Updated exercises

Signed-off-by: GBBBAS <[email protected]>

---------

Signed-off-by: GBBBAS <[email protected]>
GBBBAS authored Aug 8, 2024
1 parent 7f9ba02 commit cbe4648
Showing 25 changed files with 194 additions and 78 deletions.
19 changes: 19 additions & 0 deletions docs/university/essentials/rtdip/architecture/databricks.md
@@ -0,0 +1,19 @@
![RTDIP Databricks](../assets/rtdip_databricks.png)

RTDIP integrates with Databricks to execute time series queries and ingest data. Queries are executed using either Databricks SQL Warehouses or Spark Connect. Data ingestion can be run and orchestrated using Databricks Workflows or Delta Live Tables.

For further information about Databricks, please refer to:

- [Databricks SQL](https://www.databricks.com/product/databricks-sql)
- [Databricks Workflows](https://docs.databricks.com/en/workflows/index.html)
- [Delta Live Tables](https://www.databricks.com/product/delta-live-tables)

## Course Progress

- [X] Overview
- [X] Architecture
- [X] Queries
- [X] Pipelines
- [X] Databricks
- [ ] SDK
- [ ] APIs
13 changes: 13 additions & 0 deletions docs/university/essentials/rtdip/architecture/pipelines.md
@@ -0,0 +1,13 @@
![RTDIP Pipelines](../assets/rtdip_sdk_pipelines.png)

Although pipelines are not in scope for this course, it is worth mentioning that RTDIP also provides the ability to create and manage time series ingestion pipelines. A pipeline is a series of steps executed in sequence to process time series data. Pipeline components consist of data sources, data sinks, and processing steps.
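
The source, processing step, and sink structure described above can be sketched in a few lines of plain Python. This is only an illustration of the idea and does not use the RTDIP pipeline classes; `run_pipeline`, `sample_source`, and `drop_missing_values` are hypothetical names invented for this example.

```python
from typing import Callable, Iterable

# One time series reading, e.g. {"tag": ..., "timestamp": ..., "value": ...}
Record = dict


def run_pipeline(
    source: Callable[[], Iterable[Record]],
    steps: list[Callable[[Iterable[Record]], Iterable[Record]]],
    sink: Callable[[Iterable[Record]], None],
) -> None:
    """Read from the source, apply each step in order, then write to the sink."""
    data = source()
    for step in steps:
        data = step(data)
    sink(data)


# Hypothetical components, for illustration only.
def sample_source():
    return [
        {"tag": "sensor-1", "timestamp": "2024-08-08T00:00:00Z", "value": 20.5},
        {"tag": "sensor-1", "timestamp": "2024-08-08T00:01:00Z", "value": None},
    ]


def drop_missing_values(records):
    return [r for r in records if r["value"] is not None]


collected = []  # stands in for a real data sink such as a table
run_pipeline(sample_source, [drop_missing_values], collected.extend)
print(collected)
```

Keeping each component as a separate callable means that sources, steps, and sinks can be swapped independently of one another.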

## Course Progress

- [X] Overview
- [ ] Architecture
- [X] Queries
- [X] Pipelines
- [ ] Databricks
- [ ] SDK
- [ ] APIs
15 changes: 15 additions & 0 deletions docs/university/essentials/rtdip/architecture/queries.md
@@ -0,0 +1,15 @@
![RTDIP Queries](../assets/rtdip_sdk_queries.png)

RTDIP provides the ability to execute time series queries on the data stored in the RTDIP platform. The queries can be executed using the RTDIP SDK or APIs. Supported query types include raw, resample, interpolation, interpolate at time, time-weighted average, circular average, circular standard deviation, latest, plot, summary, and metadata.
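
To make one of these query types concrete, the sketch below shows roughly what an "interpolate at time" query computes, assuming linear interpolation between the two surrounding readings. It is plain Python for illustration, not the SDK's implementation, and `interpolate_at_time` is a hypothetical helper name.

```python
from datetime import datetime


def interpolate_at_time(points, at):
    """Linearly interpolate a value at `at`.

    `points` is a list of (timestamp, value) pairs sorted by strictly
    increasing timestamp.
    """
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t0 <= at <= t1:
            fraction = (at - t0) / (t1 - t0)
            return v0 + fraction * (v1 - v0)
    raise ValueError("requested time is outside the range of the data")


points = [
    (datetime(2024, 8, 8, 0, 0), 10.0),
    (datetime(2024, 8, 8, 0, 10), 20.0),
]
print(interpolate_at_time(points, datetime(2024, 8, 8, 0, 5)))  # 15.0
```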

The RTDIP Essentials course will focus on RTDIP queries in the sections that follow.

## Course Progress

- [X] Overview
- [ ] Architecture
- [X] Queries
- [ ] Pipelines
- [ ] Databricks
- [ ] SDK
- [ ] APIs
23 changes: 23 additions & 0 deletions docs/university/essentials/rtdip/introduction/overview.md
@@ -0,0 +1,23 @@
# RTDIP Essentials

<p align="center"><img src=https://raw.githubusercontent.com/rtdip/core/develop/docs/getting-started/images/rtdip-horizontal-color.png alt="rtdip" width=50% height=50%/></p>
<p style="text-align: center; font-size: 2em; font-weight: bold;">Essentials</p>

Welcome to the RTDIP Essentials training course. This course introduces you to the Real Time Data Ingestion Platform, a scalable solution for ingesting and processing data from a variety of time series data sources.

You will learn how to execute time series queries using the RTDIP SDK and APIs.
By the end of this course, you will have a good understanding of:

- The RTDIP architecture
- How to use the SDK to interact with the RTDIP platform
- How to use the APIs to execute time series queries
- How to build visualizations and dashboards in Power BI

## Course Progress

- [ ] Overview
- [X] Introduction
- [ ] Prerequisites
- [ ] Architecture
- [ ] SDK
- [ ] APIs
28 changes: 28 additions & 0 deletions docs/university/essentials/rtdip/introduction/prerequisites.md
@@ -0,0 +1,28 @@
# Course Prerequisites

Before you begin the course, ensure you obtain the following prerequisites (from your instructor, or from your own environment if you are doing this on your own):

## Development Environment
- Python >=3.9,<=3.11
- An IDE such as Visual Studio Code or PyCharm

## System Requirements
- A Cluster for executing Spark SQL - If using Databricks, this would typically be a Databricks SQL Warehouse and its associated connection details:
    - Server Hostname
    - HTTP Path
- Access to Power BI

## Data Requirements
- Access to a time series table that has, as a minimum:
    - An identifier column
    - A timestamp column
    - A value column
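
A quick way to confirm that a table meets these minimum requirements is to check its column list against the three required roles. The sketch below is plain Python and makes no assumptions about how you fetch the column list. The column names `TagName`, `EventTime`, `Status`, and `Value` are placeholders for illustration, so substitute the names used by your own table.

```python
REQUIRED_ROLES = ("identifier", "timestamp", "value")


def missing_columns(table_columns, column_mapping):
    """Return the required roles whose mapped column is absent from the table."""
    present = {name.lower() for name in table_columns}
    return [
        role
        for role in REQUIRED_ROLES
        if column_mapping.get(role, "").lower() not in present
    ]


# Hypothetical column names; replace them with the ones in your table.
mapping = {"identifier": "TagName", "timestamp": "EventTime", "value": "Value"}

print(missing_columns(["TagName", "EventTime", "Status", "Value"], mapping))  # []
print(missing_columns(["TagName", "EventTime"], mapping))  # ['value']
```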

## Course Progress

- [X] Overview
- [X] Introduction
- [X] Prerequisites
- [ ] Architecture
- [ ] SDK
- [ ] APIs
9 changes: 0 additions & 9 deletions docs/university/essentials/rtdip/overview.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/university/essentials/sdk/authentication/azure.md
@@ -2,12 +2,12 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [ ] Authentication
+ [X] Azure Active Directory
+ [ ] Databricks
+ [ ] Exercise
* [ ] Connectors
* [ ] Queries
- [ ] APIs
6 changes: 2 additions & 4 deletions docs/university/essentials/sdk/authentication/databricks.md
@@ -2,12 +2,10 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [ ] Authentication
+ [X] Azure Active Directory
+ [X] Databricks
+ [ ] Exercise
* [X] Authentication
* [ ] Connectors
* [ ] Queries
- [ ] APIs
26 changes: 0 additions & 26 deletions docs/university/essentials/sdk/authentication/exercise.md

This file was deleted.

@@ -2,6 +2,7 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [X] Authentication
15 changes: 15 additions & 0 deletions docs/university/essentials/sdk/connectors/exercise.md
@@ -1,5 +1,20 @@
In this exercise, you will obtain an access token for Azure AD using the RTDIP SDK and then use it to authenticate with a Databricks SQL Warehouse.

1. Create a new Python file

2. Import the necessary classes from the RTDIP SDK

3. Authenticate with Azure AD using the `DefaultAuth` method

4. Retrieve the access token

5. Connect to the Databricks SQL Warehouse using the relevant connector

6. Run your code and ensure that you can connect to the Databricks SQL Warehouse successfully
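
One possible shape for a solution is sketched below. Treat it as a starting point and not as the reference answer. The import paths, the `DefaultAuth().authenticate()` call, and the `DatabricksSQLConnection(server_hostname, http_path, token)` signature are assumptions based on the RTDIP SDK documentation, so check them against the version you have installed. The environment variable names are hypothetical, and the token scope is assumed to be the Azure Databricks resource ID.

```python
import os


def load_connection_details(env=os.environ):
    """Read the SQL Warehouse connection details from environment variables.

    The variable names are hypothetical; use whatever your environment provides.
    """
    try:
        return env["DATABRICKS_SERVER_HOSTNAME"], env["DATABRICKS_HTTP_PATH"]
    except KeyError as missing:
        raise RuntimeError(f"Missing connection detail: {missing}") from None


def connect_to_sql_warehouse():
    # Step 2: local imports, so the helper above works without the SDK installed.
    # Module paths are assumed from the RTDIP SDK documentation.
    from rtdip_sdk.authentication.azure import DefaultAuth
    from rtdip_sdk.connectors import DatabricksSQLConnection

    server_hostname, http_path = load_connection_details()

    # Steps 3 and 4: authenticate with Azure AD and retrieve an access token.
    # The scope below is assumed to be the Azure Databricks resource ID.
    credential = DefaultAuth().authenticate()
    token = credential.get_token("2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default").token

    # Step 5: create the connection to the SQL Warehouse.
    return DatabricksSQLConnection(server_hostname, http_path, token)


# Step 6 (requires the SDK, valid credentials, and the variables above):
# connection = connect_to_sql_warehouse()
```

Reading the hostname and HTTP path from the environment keeps connection details out of the source file, which makes the script easier to share with other course participants.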

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [X] Authentication
@@ -2,6 +2,7 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [X] Authentication
@@ -2,6 +2,7 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [X] Authentication
37 changes: 22 additions & 15 deletions docs/university/essentials/sdk/getting-started/exercise.md
@@ -1,21 +1,28 @@
<!--
Copyright 2024 RTDIP
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

It's time to confirm your environment is set up correctly so that you can progress to the next steps of the course.

1. Ensure you have the right version of Python installed on your machine. You can check this by running the following command in your terminal:
```bash
python --version
```

1. Ensure you have the right version of pip installed on your machine. You can check this by running the following command in your terminal:
```bash
pip --version
```

1. Create a Python `rtdip-sdk` environment, activate it, install the latest version of [rtdip-sdk](https://pypi.org/project/rtdip-sdk/), and validate that it is installed correctly by running the following commands in your terminal:
```bash
python -m venv rtdip-sdk
source rtdip-sdk/bin/activate  # on Windows: rtdip-sdk\Scripts\activate
pip install rtdip-sdk
pip show rtdip-sdk  # confirms the package is installed and shows its version
```
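
If you prefer to check the interpreter from inside Python, a small script like the one below can help. It is a sketch that assumes the supported range is Python 3.9 to 3.11 inclusive, compared at the major.minor level, as listed in the course prerequisites. `is_supported` is a hypothetical helper name.

```python
import sys

# Supported range assumed from the course prerequisites (>=3.9,<=3.11).
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 11)


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor version falls within the supported range."""
    return MIN_VERSION <= tuple(version_info[:2]) <= MAX_VERSION


print("Python", sys.version.split()[0], "supported:", is_supported())
```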



## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [ ] Authentication
@@ -2,6 +2,7 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [ ] Getting Started
+ [X] Introduction
@@ -3,6 +3,7 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [ ] Getting Started
+ [X] Introduction
@@ -2,6 +2,7 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [ ] Getting Started
+ [X] Introduction
19 changes: 19 additions & 0 deletions docs/university/essentials/sdk/queries/exercise.md
@@ -1,5 +1,24 @@
It's time to start running some time series queries using the RTDIP SDK.

1. Using the Python file you created in the previous exercise, import the necessary time series query classes from the RTDIP SDK

1. Pass the connector you created in the previous exercise to the time series query class

1. Run a `Raw` query to retrieve some data from your time series data source

1. Now run a query to `Resample` this data to a 15-minute interval average

1. Convert the resample query to an `Interpolation` query that executes the `linear` interpolation method

1. Finally, try running a `Time Weighted Average` query on the data, with `Step` set to False
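
Before comparing query results, it can help to see what a resample average and a time-weighted average compute on a tiny data set. The sketch below is plain Python and is not the SDK's implementation. It assumes that `Step` set to False means values change linearly between readings, and that `Step` set to True means each value holds until the next reading. The function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def resample_average(points, interval):
    """Average (timestamp, value) pairs in fixed-width buckets aligned to the epoch."""
    epoch = datetime(1970, 1, 1)
    buckets = defaultdict(list)
    for timestamp, value in points:
        bucket_start = epoch + ((timestamp - epoch) // interval) * interval
        buckets[bucket_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


def time_weighted_average(points, step=False):
    """Time-weighted average over points sorted by timestamp.

    step=False: values are assumed to change linearly between points.
    step=True: each value is assumed to hold until the next point.
    """
    weighted_sum = 0.0
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        seconds = (t1 - t0).total_seconds()
        weighted_sum += seconds * (v0 if step else (v0 + v1) / 2)
    total_seconds = (points[-1][0] - points[0][0]).total_seconds()
    return weighted_sum / total_seconds


points = [
    (datetime(2024, 8, 8, 0, 0), 10.0),
    (datetime(2024, 8, 8, 0, 5), 20.0),
    (datetime(2024, 8, 8, 0, 20), 40.0),
]

# Buckets start at 00:00 (average 15.0) and 00:15 (average 40.0).
print(resample_average(points, timedelta(minutes=15)))
print(time_weighted_average(points, step=False))  # 26.25
print(time_weighted_average(points, step=True))   # 17.5
```

The two averages differ because the readings are unevenly spaced: the long gap between the second and third readings carries more weight than the short gap before it.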

## Additional Task

1. The data returned from these queries is in the form of a pandas DataFrame. Use the `matplotlib` or `plotly` library to plot the data returned from the `Time Weighted Average` query

## Course Progress
- [X] Overview
- [X] Architecture
- [X] SDK
* [X] Getting Started
* [X] Authentication
1 change: 1 addition & 0 deletions docs/university/essentials/sdk/queries/sql.md
@@ -5,6 +5,7 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [X] Authentication
1 change: 1 addition & 0 deletions docs/university/essentials/sdk/queries/timeseries.md
@@ -91,6 +91,7 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [X] Authentication
1 change: 1 addition & 0 deletions docs/university/essentials/sdk/queries/weather.md
@@ -20,6 +20,7 @@

## Course Progress
- [X] Overview
- [X] Architecture
- [ ] SDK
* [X] Getting Started
* [X] Authentication
51 changes: 28 additions & 23 deletions mkdocs.yml
@@ -328,26 +328,31 @@ nav:
    - core: releases/core.md
  - Blog:
    - blog/index.md
  # - University:
  #   - RTDIP Essentials:
  #     - Overview: university/essentials/rtdip/overview.md
  #     - SDK:
  #       - Getting Started:
  #         - Introduction: university/essentials/sdk/getting-started/introduction.md
  #         - Prerequisites: university/essentials/sdk/getting-started/prerequisites.md
  #         - Installation: university/essentials/sdk/getting-started/installation.md
  #         - Exercise: university/essentials/sdk/getting-started/exercise.md
  #       - Authentication:
  #         - Azure Active Directory: university/essentials/sdk/authentication/azure.md
  #         - Databricks: university/essentials/sdk/authentication/databricks.md
  #         - Exercise: university/essentials/sdk/authentication/exercise.md
  #       - Connectors:
  #         - Databricks SQL: university/essentials/sdk/connectors/databricks-sql-connector.md
  #         - ODBC: university/essentials/sdk/connectors/odbc-connectors.md
  #         - Spark: university/essentials/sdk/connectors/spark-connector.md
  #         - Exercise: university/essentials/sdk/connectors/exercise.md
  #       - Queries:
  #         - Time Series: university/essentials/sdk/queries/timeseries.md
  #         - SQL: university/essentials/sdk/queries/sql.md
  #         - Weather: university/essentials/sdk/queries/weather.md
  #         - Exercise: university/essentials/sdk/queries/exercise.md
  - University:
    - RTDIP Essentials:
      - Introduction:
        - Overview: university/essentials/rtdip/introduction/overview.md
        - Prerequisites: university/essentials/rtdip/introduction/prerequisites.md
      - Architecture:
        - Queries: university/essentials/rtdip/architecture/queries.md
        - Pipelines: university/essentials/rtdip/architecture/pipelines.md
        - Databricks: university/essentials/rtdip/architecture/databricks.md
      - SDK:
        - Getting Started:
          - Introduction: university/essentials/sdk/getting-started/introduction.md
          - Prerequisites: university/essentials/sdk/getting-started/prerequisites.md
          - Installation: university/essentials/sdk/getting-started/installation.md
          - Exercise: university/essentials/sdk/getting-started/exercise.md
        - Authentication:
          - Azure Active Directory: university/essentials/sdk/authentication/azure.md
          - Databricks: university/essentials/sdk/authentication/databricks.md
        - Connectors:
          - Databricks SQL: university/essentials/sdk/connectors/databricks-sql-connector.md
          - ODBC: university/essentials/sdk/connectors/odbc-connectors.md
          - Spark: university/essentials/sdk/connectors/spark-connector.md
          - Exercise: university/essentials/sdk/connectors/exercise.md
        - Queries:
          - Time Series: university/essentials/sdk/queries/timeseries.md
          - SQL: university/essentials/sdk/queries/sql.md
          - Weather: university/essentials/sdk/queries/weather.md
          - Exercise: university/essentials/sdk/queries/exercise.md
