@@ -59,28 +59,26 @@ image::spark-k8s-anomaly-detection-taxi-data/overview.png[]
59
59
60
60
To list the installed Stackable services, run the following command:
61
61
62
- // TODO(Techassi): Update console output
63
-
64
62
[source,console]
65
63
----
66
64
$ stackablectl stacklet list
67
- PRODUCT NAME NAMESPACE ENDPOINTS EXTRA INFOS
68
-
69
- hive hive spark-k8s-ad-taxi-data hive 172.18.0.2:31912
70
- metrics 172.18.0.2:30812
71
-
72
- hive hive-iceberg spark-k8s-ad-taxi-data hive 172.18.0.4:32133
73
- metrics 172.18.0.4:32125
74
-
75
- opa opa spark-k8s-ad-taxi-data http http://172.18.0.3:31450
76
-
77
- superset superset spark-k8s-ad-taxi-data external-superset http://172.18.0.2:31339 Admin user: admin, password: adminadmin
78
-
79
- trino trino spark-k8s-ad-taxi-data coordinator-metrics 172.18.0.3:32168
80
- coordinator-https https://172.18.0.3:31408
81
65
82
- minio minio-trino spark-k8s-ad-taxi-data http http://172.18.0.3:30589 Third party service
83
- console-http http://172.18.0.3:31452 Admin user: admin, password: adminadmin
66
+ ┌──────────┬───────────────┬───────────┬───────────────────────────────────────────────┬─────────────────────────────────┐
67
+ │ PRODUCT ┆ NAME ┆ NAMESPACE ┆ ENDPOINTS ┆ CONDITIONS │
68
+ ╞══════════╪═══════════════╪═══════════╪═══════════════════════════════════════════════╪═════════════════════════════════╡
69
+ │ hive ┆ hive ┆ default ┆ ┆ Available, Reconciling, Running │
70
+ ├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
71
+ │ hive ┆ hive-iceberg ┆ default ┆ ┆ Available, Reconciling, Running │
72
+ ├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
73
+ │ opa ┆ opa ┆ default ┆ ┆ Available, Reconciling, Running │
74
+ ├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
75
+ │ superset ┆ superset ┆ default ┆ external-http http://172.18.0.2:30562 ┆ Available, Reconciling, Running │
76
+ ├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
77
+ │ trino ┆ trino ┆ default ┆ coordinator-metrics 172.18.0.2:31980 ┆ Available, Reconciling, Running │
78
+ │ ┆ ┆ ┆ coordinator-https https://172.18.0.2:32186 ┆ │
79
+ ├╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
80
+ │ minio ┆ minio-console ┆ default ┆ http http://172.18.0.2:32276 ┆ │
81
+ └──────────┴───────────────┴───────────┴───────────────────────────────────────────────┴─────────────────────────────────┘
84
82
----
85
83
86
84
include::partial$instance-hint.adoc[]
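
The endpoint addresses in the table printed by `stackablectl stacklet list` are needed in the later steps (the MinIO console and the Superset UI). As a minimal sketch, the addresses could be pulled out of the table text with a few lines of Python. The function name `parse_endpoints` and the sample rows are illustrative assumptions based only on the output shown above; the real table layout, IP addresses, and ports may differ between clusters and `stackablectl` versions.

```python
# Sample rows copied from the table above (assumption: same column layout).
# In practice this text would come from running the command yourself.
SAMPLE_OUTPUT = """\
│ PRODUCT ┆ NAME ┆ NAMESPACE ┆ ENDPOINTS ┆ CONDITIONS │
│ hive ┆ hive ┆ default ┆ ┆ Available, Reconciling, Running │
│ superset ┆ superset ┆ default ┆ external-http http://172.18.0.2:30562 ┆ Available, Reconciling, Running │
│ trino ┆ trino ┆ default ┆ coordinator-metrics 172.18.0.2:31980 ┆ Available, Reconciling, Running │
│ ┆ ┆ ┆ coordinator-https https://172.18.0.2:32186 ┆ │
│ minio ┆ minio-console ┆ default ┆ http http://172.18.0.2:32276 ┆ │
"""


def parse_endpoints(table_text):
    """Return {(stacklet_name, endpoint_name): address} for each endpoint row.

    Hypothetical helper: it assumes five columns separated by '┆' and
    data rows that start with '│', as in the sample output.
    """
    endpoints = {}
    current_name = None
    for line in table_text.splitlines():
        if not line.startswith("│"):
            continue  # skip borders and row separators
        cells = [cell.strip() for cell in line.strip("│").split("┆")]
        if len(cells) != 5:
            continue  # not a row in the expected layout
        _product, name, _namespace, endpoint, _conditions = cells
        if name == "NAME":
            continue  # header row
        if name:
            current_name = name  # rows with an empty name continue the previous stacklet
        parts = endpoint.split()
        if len(parts) == 2 and current_name:
            endpoints[(current_name, parts[0])] = parts[1]
    return endpoints


if __name__ == "__main__":
    for (stacklet, endpoint_name), address in parse_endpoints(SAMPLE_OUTPUT).items():
        print(f"{stacklet} {endpoint_name}: {address}")
```

Parsing the human-readable table is fragile. If your version of `stackablectl` offers a structured output format, that is likely the better source for scripts.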
@@ -89,8 +87,8 @@ include::partial$instance-hint.adoc[]
89
87
90
88
=== List Buckets
91
89
92
- The S3 provided by MinIO is used as persistent storage to store all the data used. Open the endpoint `console-http`
93
- retrieved by `stackablectl stacklet list` in your browser (http://172.18.0.3:31452 in this case).
90
+ MinIO provides the S3 storage that persistently holds all the data used. Open the endpoint `http`
91
+ retrieved by `stackablectl stacklet list` in your browser (http://172.18.0.2:32276 in this case).
94
92
95
93
image::spark-k8s-anomaly-detection-taxi-data/minio_0.png[]
96
94
@@ -107,16 +105,16 @@ Here, you can see the two buckets the S3 is split into:
107
105
108
106
=== Inspect raw data
109
107
110
- Click on the blue button `Browse` on the bucket `demo`.
108
+ Click on the bucket `demo`, then on `ny-taxi-data`, and then on `raw`.
111
109
112
110
image::spark-k8s-anomaly-detection-taxi-data/minio_3.png[]
113
111
114
- A folder (called prefixes in S3) contains a dataset of similarly structured data files. The data is partitioned by month
112
+ This folder (called a prefix in S3) contains a dataset of similarly structured data files. The data is partitioned by month
115
113
and contains several hundred MBs, which may seem small for a dataset. Still, the model is a time-series model where the
116
114
data has decreasing relevance the "older" it is, especially when the data is subject to multiple external factors, many
117
115
of which are unknown and fluctuating in scope and effect.
118
116
119
- The second bucket prediction contains the output from the model scoring process:
117
+ The second bucket, `prediction`, contains the output from the model scoring process under `prediction/anomaly-detection/iforest/data`:
120
118
121
119
image::spark-k8s-anomaly-detection-taxi-data/minio_4.png[]
122
120
@@ -147,7 +145,9 @@ image::spark-k8s-anomaly-detection-taxi-data/spark_job.png[]
147
145
148
146
== Dashboard
149
147
150
- The anomaly detection dashboard is pre-defined and accessible under `Dashboards` when logged in to Superset:
148
+ Open the `external-http` Superset endpoint found in the output of the `stackablectl stacklet list` command. The anomaly detection
149
+ dashboard is pre-defined and accessible under the `Dashboards` tab when logged in to Superset using the username `admin`
150
+ and the password `adminadmin`:
151
151
152
152
image::spark-k8s-anomaly-detection-taxi-data/superset_anomaly_scores.png[]
153
153