Add sqlID column to failed_jobs.csv #1567

amahussein · 2025-03-03T13:55:51Z

Signed-off-by: Ahmed Hussein (amahussein) [email protected]

- adds column `sqlID` to failed_jobs.csv
- the column might be empty if the job has no sqlID attached to it

This pull request includes several changes to improve the handling of job profiling and file format extraction in the RAPIDS plugin for Apache Spark. The most important changes include modifying the FailedJobsProfileResults case class to include an optional SQL ID, updating related views and tests, and simplifying the code in the HealthCheckSuite class.

Sample output file

appIndex,jobID,sqlID,jobResult,failureReason
1,79,27,"JobFailed","java.lang.Exception: Job 79 cancelled because SparkContext was shut down"

Improvements to job profiling:

core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileClassWarehouse.scala: Modified the FailedJobsProfileResults case class to include an optional sqlID field and updated the outputHeaders and convertToSeq methods accordingly.
core/src/main/scala/com/nvidia/spark/rapids/tool/views/JobView.scala: Updated the AppFailedJobsViewTrait to handle the new sqlID field in FailedJobsProfileResults and modified the sortView method to include sqlID in the sorting criteria.
core/src/test/resources/ProfilingExpectations/jobs_failure_eventlog_expectation.csv: Updated the test expectations to include the new sqlID field in the CSV header and data.
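
For readers who want a concrete picture of sorting on an optional key, here is a minimal sketch. It is not the code from JobView.scala: the `FailedJob` stand-in class, the `FailedJobsSortSketch` object, and the key order (appIndex, then jobId, then sqlID) are assumptions made for illustration, since the description above only says that sqlID was added to the sorting criteria.

```scala
object FailedJobsSortSketch {
  // Hypothetical stand-in carrying only the fields used as sort keys.
  case class FailedJob(appIndex: Int, jobId: Int, sqlID: Option[Long])

  // Assumed key order: appIndex, then jobId, then sqlID.
  // The standard-library Ordering for Option places None before any Some(_),
  // so jobs without a SQL ID sort ahead of jobs that have one.
  def sortView(rows: Seq[FailedJob]): Seq[FailedJob] =
    rows.sortBy(r => (r.appIndex, r.jobId, r.sqlID))
}
```

Sorting on the `Option[Long]` directly avoids picking a sentinel value such as -1 for jobs that have no SQL ID.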

Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
Fixes NVIDIA#1563
- adds column `sqlID` to failed_jobs.csv
- the column might be empty if the job has no sqlID attached to it

amahussein · 2025-03-03T13:57:28Z

core/src/main/scala/com/nvidia/spark/rapids/tool/planparser/DataWritingCommandExecParser.scala

+    // Extracts the file format from a class object string, such as
+    // "com.nvidia.spark.rapids.GpuParquetFileFormat@9f5022c".
+    //
+    // This function is designed to handle cases where the RAPIDS plugin logs raw object names
+    // instead of a user-friendly file format name. For example, it extracts "Parquet" from
+    // "com.nvidia.spark.rapids.GpuParquetFileFormat@9f5022c".
+    // Refer: https://github.com/NVIDIA/spark-rapids-tools/issues/1561
+    //
+    // If the input string does not match the expected pattern, the function returns the original
+    // string as a fallback.
+    //
+    // @param formatStr The raw format string, typically containing the class name of the file
+    //                  format.
+    // @return A user-friendly file format name (e.g., "Parquet") or the original string if no
+    //         match is found.


Changed the documentation format because the Scaladoc comment style is not allowed inside nested methods. It is only allowed on top-level definitions.
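
The behavior described in the comment block above (extract "Parquet" from the raw object string, otherwise return the input unchanged) could be sketched roughly as follows. This is not the implementation in DataWritingCommandExecParser.scala: the object name, the method name, and the regular expression are assumptions chosen to match the documented example.

```scala
object FileFormatNameSketch {
  // Hypothetical pattern: an optional package prefix, an optional "Gpu" prefix,
  // the format name, the literal "FileFormat", then "@" and a hex hash code.
  private val FormatClassPattern =
    """^(?:.*\.)?(?:Gpu)?([A-Za-z0-9]+)FileFormat@[0-9a-fA-F]+$""".r

  // Returns the short format name when the input looks like a raw object
  // string; otherwise returns the input unchanged as a fallback.
  def extractFormatName(formatStr: String): String = formatStr match {
    case FormatClassPattern(name) => name
    case _ => formatStr
  }
}
```

With this sketch, `extractFormatName("com.nvidia.spark.rapids.GpuParquetFileFormat@9f5022c")` yields `"Parquet"`, and a string that is already user-friendly, such as `"Parquet"`, is returned as-is.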

amahussein · 2025-03-03T13:58:18Z

core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileClassWarehouse.scala

+case class FailedJobsProfileResults(
+    appIndex: Int,
+    jobId: Int,
+    sqlID: Option[Long],  // sqlID is optional because Jobs might not have a SQL (i.e., RDDs)
+    jobResult: String,
+    endReason: String) extends ProfileResult {
+  override val outputHeaders = Seq("appIndex", "jobID", "sqlID", "jobResult", "failureReason")


Added the sqlID column. The rest of the changes are code formatting that puts each field on its own line.

amahussein · 2025-03-03T13:58:45Z

core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileClassWarehouse.scala

+    Seq(appIndex.toString,
+      jobId.toString,
+      sqlID.map(_.toString).getOrElse(null),


Added the sqlID column. If sqlID is not defined, the value is null.
The rest of the changes are code formatting that puts each field on its own line.
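
To illustrate how the optional field ends up as an empty CSV cell, here is a small self-contained sketch. `FailedJobRow`, `toCells`, and `toCsvLine` are hypothetical stand-ins, not names from the PR, and rendering null as an empty string is an assumption about the writer based on the note that the column might be empty.

```scala
object FailedJobRowSketch {
  // Simplified stand-in for FailedJobsProfileResults; not the class from the PR.
  case class FailedJobRow(
      appIndex: Int,
      jobId: Int,
      sqlID: Option[Long],
      jobResult: String,
      endReason: String) {
    // Some(27L) becomes "27"; None becomes null, as in the diff above.
    // `orNull` is the standard-library shorthand for getOrElse(null).
    def toCells: Seq[String] = Seq(
      appIndex.toString,
      jobId.toString,
      sqlID.map(_.toString).orNull,
      jobResult,
      endReason)
  }

  // Hypothetical writer step: render null cells as empty strings.
  // No quoting or escaping is done here; a real CSV writer would need both.
  def toCsvLine(cells: Seq[String]): String =
    cells.map(c => Option(c).getOrElse("")).mkString(",")
}
```

A row built with `sqlID = None` then produces two adjacent commas in the third position, which matches the note that the column can be empty for jobs with no SQL attached.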

amahussein · 2025-03-03T13:59:01Z

core/src/main/scala/com/nvidia/spark/rapids/tool/profiling/ProfileClassWarehouse.scala

+    Seq(appIndex.toString,
+      jobId.toString,
+      sqlID.map(_.toString).getOrElse(null),
+      StringUtils.reformatCSVString(jobResult),


Added the sqlID column. If sqlID is not defined, the value is null.
The rest of the changes are code formatting that puts each field on its own line.

amahussein · 2025-03-03T13:59:45Z

core/src/test/scala/com/nvidia/spark/rapids/tool/profiling/HealthCheckSuite.scala

@@ -56,7 +56,7 @@ class HealthCheckSuite extends FunSuite {
    assert(apps.size == 1)

    val healthCheck = new HealthCheck(apps)
-    for (app <- apps) {
+    for (_ <- apps) {


(unrelated) get rid of unused definition

amahussein · 2025-03-03T13:59:51Z

core/src/test/scala/com/nvidia/spark/rapids/tool/profiling/HealthCheckSuite.scala

@@ -142,7 +142,7 @@ class HealthCheckSuite extends FunSuite {
    assert(apps.size == 1)

    val healthCheck = new HealthCheck(apps)
-    for (app <- apps) {
+    for (_ <- apps) {


(unrelated) get rid of unused definition

amahussein · 2025-03-03T14:01:01Z

CC: @leewyang
In case this reduces the code complexity in QualX.

Add sqlID column to failed_jobs.csv

a3d4853


amahussein added the core_tools (Scope the core module (scala)) and API change (A change affecting the output (add/remove/rename files, add/remove/rename columns)) labels Mar 3, 2025

amahussein requested review from cindyyuanjiang and sayedbilalbari March 3, 2025 13:55

amahussein self-assigned this Mar 3, 2025
