Add sqlID column to failed_jobs.csv #1567

Open
wants to merge 1 commit into
base: dev

amahussein
Collaborator

Signed-off-by: Ahmed Hussein (amahussein) [email protected]

Fixes #1563

  • adds column sqlID to failed_jobs.csv
  • the column might be empty if the job has no sqlID attached to it

This pull request includes several changes to improve the handling of job profiling and file format extraction in the RAPIDS plugin for Apache Spark. The most important changes include modifying the FailedJobsProfileResults case class to include an optional SQL ID, updating related views and tests, and simplifying the code in the HealthCheckSuite class.

Sample output file

appIndex,jobID,sqlID,jobResult,failureReason
1,79,27,"JobFailed","java.lang.Exception: Job 79 cancelled because SparkContext was shut down"

Improvements to job profiling:

@amahussein added the core_tools (Scope the core module (scala)) and API change (A change affecting the output (add/remove/rename files, add/remove/rename columns)) labels on Mar 3, 2025
@amahussein amahussein self-assigned this Mar 3, 2025
Comment on lines +260 to +274
// Extracts the file format from a class object string, such as
// "com.nvidia.spark.rapids.GpuParquetFileFormat@9f5022c".
//
// This function is designed to handle cases where the RAPIDS plugin logs raw object names
// instead of a user-friendly file format name. For example, it extracts "Parquet" from
// "com.nvidia.spark.rapids.GpuParquetFileFormat@9f5022c".
// Refer: https://github.com/NVIDIA/spark-rapids-tools/issues/1561
//
// If the input string does not match the expected pattern, the function returns the original
// string as a fallback.
//
// @param formatStr The raw format string, typically containing the class name of the file
// format.
// @return A user-friendly file format name (e.g., "Parquet") or the original string if no
// match is found.
Collaborator Author

Changed the documentation format because the Scaladoc style is not allowed inside nested methods; it is only allowed on top-level definitions.
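
For readers who want to see the behavior the comment block describes, here is a minimal sketch. It is not the code from this PR: the object name `FileFormatSketch`, the method name `extractFormatName`, and the regular expression are all assumptions chosen for illustration. It only mirrors the documented contract: pull "Parquet" out of a raw class object string, and return the input unchanged when the pattern does not match.

```scala
import scala.util.matching.Regex

object FileFormatSketch {
  // Hypothetical pattern (an assumption, not the PR's regex): matches
  // "<package>.Gpu<Name>FileFormat" with an optional "@<hex hash>" suffix.
  private val gpuFormatPattern: Regex =
    """.*\.Gpu([a-zA-Z]+)FileFormat(?:@[0-9a-fA-F]+)?""".r

  // Returns the captured format name, or the original string as a fallback.
  // Regex extractors in Scala must match the whole input by default.
  def extractFormatName(formatStr: String): String = formatStr match {
    case gpuFormatPattern(name) => name
    case _ => formatStr
  }
}
```

With this sketch, `FileFormatSketch.extractFormatName("com.nvidia.spark.rapids.GpuParquetFileFormat@9f5022c")` yields `"Parquet"`, while a string that is already user-friendly, such as `"JSON"`, is returned as-is.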

Comment on lines +487 to +493
case class FailedJobsProfileResults(
    appIndex: Int,
    jobId: Int,
    sqlID: Option[Long], // sqlID is optional because Jobs might not have a SQL (i.e., RDDs)
    jobResult: String,
    endReason: String) extends ProfileResult {
  override val outputHeaders = Seq("appIndex", "jobID", "sqlID", "jobResult", "failureReason")
Collaborator Author

Added the sqlID column. The rest of the changes are code formatting that puts each field on its own line.
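
To illustrate why the field is an `Option[Long]`, here is a small sketch of how a result might be built for a job with and without an associated SQL ID. The class `FailedJobSketch` is a simplified stand-in (the real class extends `ProfileResult`), and the `jobToSql` map and `build` helper are hypothetical names, not part of the tools' API.

```scala
// Simplified stand-in for illustration; the real class extends ProfileResult.
case class FailedJobSketch(
    appIndex: Int,
    jobId: Int,
    sqlID: Option[Long], // None when the job has no SQL ID attached
    jobResult: String,
    endReason: String)

object FailedJobSketchDemo {
  // Hypothetical lookup from job ID to SQL ID (an assumption for this sketch).
  val jobToSql: Map[Int, Long] = Map(79 -> 27L)

  // Map.get already returns an Option, so a job without a SQL ID gets None.
  def build(appIndex: Int, jobId: Int, result: String, reason: String): FailedJobSketch =
    FailedJobSketch(appIndex, jobId, jobToSql.get(jobId), result, reason)
}
```

Here job 79 gets `Some(27L)`, matching the sample output row, and any job missing from the map gets `None` without a sentinel value such as -1.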

Comment on lines +496 to +498
Seq(appIndex.toString,
jobId.toString,
sqlID.map(_.toString).getOrElse(null),
Collaborator Author

Added the sqlID column. If sqlID is not defined, null is written instead.
The rest of the changes are code formatting that puts each field on its own line.

Comment on lines +503 to +506
Seq(appIndex.toString,
jobId.toString,
sqlID.map(_.toString).getOrElse(null),
StringUtils.reformatCSVString(jobResult),
Collaborator Author

Added the sqlID column. If sqlID is not defined, null is written instead.
The rest of the changes are code formatting that puts each field on its own line.
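
As a rough sketch of how a null cell can end up as an empty CSV column, consider the following. Only the `sqlID.map(_.toString).getOrElse(null)` expression comes from the diff; the object `CsvRowSketch`, the helper names, and the way the writer treats null are assumptions made for illustration, not the tools' actual writer.

```scala
object CsvRowSketch {
  // Option[Long] to String-or-null; orNull behaves like getOrElse(null) here.
  def sqlCell(sqlID: Option[Long]): String = sqlID.map(_.toString).orNull

  // Hypothetical writer step (an assumption): a null cell is emitted as an
  // empty string, so the column is present but blank.
  def toCsvLine(cells: Seq[String]): String =
    cells.map(c => Option(c).getOrElse("")).mkString(",")
}
```

Under these assumptions, a job with SQL ID 27 renders the cell as `27`, and a job without one leaves the third column empty, which is consistent with the note that the column might be empty.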

@@ -56,7 +56,7 @@ class HealthCheckSuite extends FunSuite {
assert(apps.size == 1)

val healthCheck = new HealthCheck(apps)
- for (app <- apps) {
+ for (_ <- apps) {
Collaborator Author

(Unrelated) Gets rid of an unused definition: the loop variable app is never used, so it is replaced with _.

@@ -142,7 +142,7 @@ class HealthCheckSuite extends FunSuite {
assert(apps.size == 1)

val healthCheck = new HealthCheck(apps)
- for (app <- apps) {
+ for (_ <- apps) {
Collaborator Author

(Unrelated) Gets rid of an unused definition: the loop variable app is never used, so it is replaced with _.

@amahussein
Collaborator Author

CC: @leewyang, in case this reduces the code complexity in QualX.

Development

Successfully merging this pull request may close these issues.

[FEA] Add SQL ID column in failed_jobs.csv in Profiling tools output