Commit
Merge remote-tracking branch 'origin/develop' into feature/#106-Gaussian-Smoothing
dh1542 committed Jan 28, 2025
2 parents 61ebc77 + 5c9c8a4 commit 6492153
Showing 33 changed files with 645 additions and 69 deletions.
Binary file added Deliverables/sprint-12/feature-board.JPG
79 changes: 79 additions & 0 deletions Deliverables/sprint-12/feature-board.tsv
@@ -0,0 +1,79 @@
Title URL Assignees Status Estimated size Real size Labels Sprint
New Component: Moving Average https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/107 mollle Awaiting Review 5 component Sprint 12
Create a new (final) PR for Shell to review https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/122 dh1542 Awaiting Review 2 release Sprint 12
Link all documentation in Planning Document https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/124 mollle Awaiting Review 3 documentation Sprint 12
New Component: KNN https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/105 kristen149 In Progress 8 component Sprint 12
Write an Article about RTDIP and AMOS https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/123 chris-1187 In Progress 5 documentation Sprint 12
New Component: Gaussian Smoothing https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/106 dh1542 In Progress 5 component Sprint 12
Demo pipeline of multiple components https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/96 Timm638 In Progress 8 enhancement Sprint 12
Deciding Which Use-Case to Present for Demo-Day https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/111 chris-1187 Feature Archive 3 3 release Sprint 11
Refine Product Glossary https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/118 kristen149 Feature Archive 3 2 documentation Sprint 11
Put Value Range Check Component into Action https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/109 mollle Feature Archive 3 3 enhancement Sprint 11
Remove flatlining datapoints https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/113 mollle Feature Archive 3 3 enhancement Sprint 11
Finish implementation of feedback and our first major release (PR #57) https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/90 dh1542 Feature Archive 3 3 release Sprint 11
Finalize Documentation https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/110 FelipeTrost Feature Archive 3 3 documentation Sprint 11
De/normalization: refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/64 FelipeTrost Feature Archive 3 5 refactoring Sprint 10
Finish integrating ARIMA functionality of statsmodels into RTDIP https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/40 Timm638 Feature Archive 5 8 component Sprint 10
Missing data detection: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/69 mollle Feature Archive 5 5 refactoring Sprint 8
Flatline detection: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/68 mollle Feature Archive 3 3 refactoring Sprint 8
ARIMA: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/65 chris-1187 Feature Archive 5 5 refactoring Sprint 9
Duplicate detection: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/61 chris-1187 Feature Archive 3 3 refactoring Sprint 9
Missing Value Imputation: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/92 chris-1187 Feature Archive 3 5 refactoring Sprint 9
Interval filtering: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/62 dh1542 Feature Archive 3 2 refactoring Sprint 9
Linear regression: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/66 kristen149 Feature Archive 3 3 refactoring Sprint 9
Fixing missing imports https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/115 Timm638 Feature Archive 1 1 bug, draft
Restore deliverables folder https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/97 FelipeTrost Feature Archive 1 1 bug Sprint 9
Anomaly detection: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/63 FelipeTrost Feature Archive 3 2 refactoring Sprint 9
Dimensionality Reduction https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/17 FelipeTrost Feature Archive 5 5 component Sprint 9
Fix broken API test https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/84 dh1542 Feature Archive 5 5 bug Sprint 8
Value range validation: Refactor unit tests https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/67 mollle Feature Archive 3 3 refactoring Sprint 8
Apply feedback for anomaly detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/75 kristen149 Feature Archive 1 1 refactoring Sprint 7
Store monitoring outputs in a standardized format https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/26 dh1542 Feature Archive 13 13 enhancement Sprint 7
Apply feedback for duplicate detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/73 dh1542 Feature Archive 2 1 refactoring Sprint 7
Apply feedback for interval filtering https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/74 dh1542 Feature Archive 1 1 refactoring Sprint 7
Apply feedback for missing data identification https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/77 mollle Feature Archive 1 1 refactoring Sprint 7
Apply feedback for value range check https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/76 mollle Feature Archive 1 2 refactoring Sprint 7
Apply feedback on project structure https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/80 dh1542 Feature Archive 2 5 refactoring Sprint 7
Unified input data validation https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/60 chris-1187, mollle Feature Archive 8 8 component Sprint 7
Advanced Duplicate Detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/30 mollle Feature Archive 2 2 component Sprint 7
Data Binning https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/46 FelipeTrost Feature Archive 5 5 component Sprint 6
Prepare RTDIP demo https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/43 Timm638 Feature Archive 8 8 Sprint 6
Homework - user/design/build documentation https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/56 FelipeTrost Feature Archive 5 5 documentation Sprint 6
One-Hot Encoding https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/45 kristen149 Feature Archive 3 3 component Sprint 6
Interval Filtering not working for EventTime column of type 'datetime' https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/53 dh1542 Feature Archive 2 2 bug Sprint 6
Reduce number of parameters needed to use ArimaPrediction effectively https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/41 chris-1187 Feature Archive 8 8 component Sprint 6
Flatline detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/44 mollle Feature Archive 2 2 component Sprint 5
Validation of value ranges https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/31 mollle Feature Archive 3 3 component Sprint 5
Missing value imputation https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/36 chris-1187 Feature Archive 13 13 component Sprint 5
Time series prediction with linear regression https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/28 FelipeTrost, kristen149 Feature Archive 8 8 component Sprint 5
Normalization of Data https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/18 kristen149, Timm638 Feature Archive 8 8 component Sprint 4
Time Series prediction using ARIMA https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/29 Timm638 Feature Archive 13 8 component Sprint 4
Clean data based on Interval/Pattern https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/22 dh1542 Feature Archive 8 8 component Sprint 4
Create a test pipeline to run during release https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/24 FelipeTrost Feature Archive 5 1 Sprint 3
[Component] Identify missing data https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/2 mollle Feature Archive 8 8 enhancement Sprint 3
Explore the test data and brainstorm RTDIP component ideas https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/11 chris-1187 Feature Archive 5 5 Sprint 3
[Component] Anomaly detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/6 FelipeTrost Feature Archive 3 8 enhancement Sprint 3
[sprint-02] Create software architecture diagram https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/10 dh1542, Timm638 Feature Archive 3 5 Sprint 2
[sprint-02] Create software bill of materials https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/9 kristen149 Feature Archive 1 1 Sprint 2
Fix broken virtual environment https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/8 dh1542, Timm638 Feature Archive 3 3 bug Sprint 2
[Component] Duplicate detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/4 chris-1187, dh1542 Feature Archive 8 8 enhancement Sprint 1
[Component] Outlier detection https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/3 FelipeTrost Feature Archive duplicate Sprint 2
Set up a development environment https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/1 Feature Archive good first issue Sprint 1
Missing copyright text https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/78 Feature Archive refactoring Sprint 7
Rename ColsToVector component https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/79 Feature Archive refactoring Sprint 7
Refactor ML components into own package https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/129 Product Backlog refactoring Sprint 13
Add markdown docs to "nav" configuration https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/128 Product Backlog bug, documentation Sprint 13
Build a demo of all components for Shell https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/135 Product Backlog Sprint 13
Create a demo video https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/132 Product Backlog documentation, release Sprint 13
Create one demo day slide https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/133 Product Backlog documentation Sprint 13
Delete unused branches https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/130 Product Backlog Sprint 13
Demo day schedule https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/131 Product Backlog release Sprint 13
Invertible Principal Component Analysis (for Dimension Reduction) https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/108 Product Backlog enhancement
Principal component analysis (PCA) https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/112 Product Backlog component, stale
Handling a greater range of input regarding DataFrame schemas https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/104 Product Backlog wontfix
Alternative Preprocessing Methods https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/19 Feature Archive component
Please adopt the Deliverables folder structure from https://github.com/amosproj/amos202Xss0Y-projname to your repo / branch https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/7 Feature Archive documentation Sprint 1
[Component] Trend Identification https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/20 Feature Archive
Define clear acceptance criteria for components https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/16 Feature Archive
Interval Screening and Missing Entry Insertion https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/47 Feature Archive
[Component] Data Format https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/21 Feature Archive
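The board above records both an estimated and a real size for most archived items. As a minimal sketch of how the two columns could be compared, the snippet below parses a few rows copied from the board and reports the difference per item. It assumes the original file separates columns with tabs and that the "Real size" cell is empty for items still in progress; the function name `size_deltas` is hypothetical and not part of the repository.

```python
import csv
import io

# Three rows copied from the feature board. The tab separators and the empty
# "Real size" cell in the last row are assumptions about the original TSV.
BASE = "https://github.com/amosproj/amos2024ws01-rtdip-data-quality-checker/issues/"
SAMPLE_TSV = (
    "Title\tURL\tAssignees\tStatus\tEstimated size\tReal size\tLabels\tSprint\n"
    f"De/normalization: refactor unit tests\t{BASE}64\tFelipeTrost\tFeature Archive\t3\t5\trefactoring\tSprint 10\n"
    f"Time Series prediction using ARIMA\t{BASE}29\tTimm638\tFeature Archive\t13\t8\tcomponent\tSprint 4\n"
    f"New Component: KNN\t{BASE}105\tkristen149\tIn Progress\t8\t\tcomponent\tSprint 12\n"
)


def size_deltas(tsv_text):
    """Return (title, estimated, real, real - estimated) for rows with both sizes."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    results = []
    for row in reader:
        estimated, real = row["Estimated size"], row["Real size"]
        # Skip items that have no estimate or no recorded real size yet.
        if estimated and real:
            results.append((row["Title"], int(estimated), int(real), int(real) - int(estimated)))
    return results


deltas = size_deltas(SAMPLE_TSV)
for title, estimated, real, delta in deltas:
    print(f"{title}: estimated {estimated}, real {real}, delta {delta:+d}")
```

A positive delta means the item took more effort than estimated; the in-progress KNN row is skipped because it has no real size.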
Binary file added Deliverables/sprint-12/imp-squared-backlog.JPG
20 changes: 20 additions & 0 deletions Deliverables/sprint-12/imp-squared-backlog.tsv
@@ -0,0 +1,20 @@
Title Assignees Status
Long term planning such as: (notes in description) Done
Lower motivation due to not being sure whether or how the solution will be used in the future Done
Better communication regarding PR reviews - contact via Slack Done
Define unit test to a more detailed granularity Done
SDs: agree on data type passing to ensure consistent functioning of components Done
Assign non backlog homework tasks after team meeting Done
Make sure the team meeting ends at 14:00 Done
Make sure everyone can run the product (for example via readme doc) Done
Getting code reviews by shell Done
Slack workspace (Avi) Done
SD Meeting Done
Homework not assigned clearly - now assigned Done
Figure out pipeline bug - for everyone Done
Get to know expectations and requirements from Industry Partner Done
Discuss with industry partner an optimal time for the meeting to take place Done
No experience in ML Done
Consider - more granular testing structure if issues keep coming up in following sprints Done
Coordinated PR Reviews Done
Uphold POs<>SDs communication throughout the week to ensure each SD receives enough tasks that accommodate their capacity Todo
Binary file added Deliverables/sprint-12/planning-document.pdf
Binary file not shown.
Binary file added Deliverables/sprint-12/rtdip_blogpost.pdf
Binary file not shown.
@@ -1 +1,2 @@
::: src.sdk.python.rtdip_sdk.pipelines.data_quality.data_manipulation.spark.normalization.denormalization
::: src.sdk.python.rtdip_sdk.pipelines.data_quality.data_manipulation.spark.missing_value_imputation

@@ -0,0 +1 @@
::: src.sdk.python.rtdip_sdk.pipelines.data_quality.data_manipulation.spark.normalization.denormalization

This file was deleted.

This file was deleted.

@@ -0,0 +1 @@
::: src.sdk.python.rtdip_sdk.pipelines.forecasting.spark.arima
@@ -0,0 +1 @@
::: src.sdk.python.rtdip_sdk.pipelines.forecasting.spark.auto_arima
@@ -0,0 +1 @@
::: src.sdk.python.rtdip_sdk.pipelines.forecasting.spark.data_binning
@@ -0,0 +1 @@
::: src.sdk.python.rtdip_sdk.pipelines.forecasting.spark.linear_regression

This file was deleted.

This file was deleted.

16 changes: 8 additions & 8 deletions mkdocs.yml
@@ -246,8 +246,8 @@ nav:
- Pattern Based: sdk/code-reference/pipelines/data_quality/monitoring/spark/identify_missing_data_pattern.md
- Moving Average: sdk/code-reference/pipelines/data_quality/monitoring/spark/moving_average.md
- Data Manipulation:
- Duplicate Detection: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/duplicate_detection.md
- Filter Out of Range Values: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/out_of_range_value_filter.md
- Duplicate Detetection: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/duplicate_detection.md
- Out of Range Value Filter: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/out_of_range_value_filter.md
- Flatline Filter: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/flatline_filter.md
- Gaussian Smoothing: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/gaussian_smoothing.md
- Dimensionality Reduction: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/dimensionality_reduction.md
@@ -259,12 +259,12 @@ nav:
- Normalization Mean: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/normalization/normalization_mean.md
- Normalization MinMax: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/normalization/normalization_minmax.md
- Normalization ZScore: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/normalization/normalization_zscore.md
- Prediction:
- Arima: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/prediction/arima.md
- Auto Arima: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/prediction/auto_arima.md
- Machine Learning:
- Data Binning: sdk/code-reference/pipelines/machine_learning/spark/data_binning.md
- Linear Regression: sdk/code-reference/pipelines/machine_learning/spark/linear_regression.md
- Denormalization: sdk/code-reference/pipelines/data_quality/data_manipulation/spark/normalization/denormalization.md
- Forecasting:
- Data Binning: sdk/code-reference/pipelines/forecasting/spark/data_binning.md
- Linear Regression: sdk/code-reference/pipelines/forecasting/spark/linear_regression.md
- Arima: sdk/code-reference/pipelines/forecasting/spark/arima.md
- Auto Arima: sdk/code-reference/pipelines/forecasting/spark/auto_arima.md

- Jobs: sdk/pipelines/jobs.md
- Deploy:
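The hunk above moves the ARIMA, data binning, and linear regression pages under `pipelines/forecasting/`. As a rough sketch of how leftover references to the old locations could be spotted, the snippet below walks a nav structure and flags paths that still use an old prefix. The nesting of the `NAV` fragment is an assumption (indentation is not visible in this diff), and `collect_nav_paths` and `stale_paths` are hypothetical helpers, not part of MkDocs or RTDIP.

```python
def collect_nav_paths(nav):
    """Recursively yield every page path from a MkDocs-style nav structure."""
    if isinstance(nav, str):
        yield nav
    elif isinstance(nav, dict):
        for value in nav.values():
            yield from collect_nav_paths(value)
    elif isinstance(nav, list):
        for item in nav:
            yield from collect_nav_paths(item)


def stale_paths(nav, old_fragments):
    """Return nav paths that still contain one of the old path fragments."""
    return [
        path
        for path in collect_nav_paths(nav)
        if any(fragment in path for fragment in old_fragments)
    ]


# Path fragments that appear only in the old entries of the hunk.
OLD_FRAGMENTS = [
    "pipelines/machine_learning/",
    "data_manipulation/spark/prediction/",
]

# Nav fragment transcribed by hand from the hunk; the nesting is assumed.
NAV = [
    {"Forecasting": [
        {"Data Binning": "sdk/code-reference/pipelines/forecasting/spark/data_binning.md"},
        {"Linear Regression": "sdk/code-reference/pipelines/forecasting/spark/linear_regression.md"},
        {"Arima": "sdk/code-reference/pipelines/forecasting/spark/arima.md"},
        {"Auto Arima": "sdk/code-reference/pipelines/forecasting/spark/auto_arima.md"},
    ]},
]

print(stale_paths(NAV, OLD_FRAGMENTS))
```

With the new entries only, the list is empty; feeding it a nav that still contains an old `machine_learning` or `prediction` path would return that path.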
@@ -13,11 +13,11 @@
# limitations under the License.

from .normalization import *
from .prediction import *
from .dimensionality_reduction import DimensionalityReduction
from .duplicate_detection import DuplicateDetection
from .interval_filtering import IntervalFiltering
from .k_sigma_anomaly_detection import KSigmaAnomalyDetection
from .missing_value_imputation import MissingValueImputation
from .out_of_range_value_filter import OutOfRangeValueFilter
from .flatline_filter import FlatlineFilter
from .missing_value_imputation import MissingValueImputation