Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDF-20156: fixup fatjar pom file #858

Merged
merged 1 commit into from
Nov 1, 2023
Merged

CDF-20156: fixup fatjar pom file #858

merged 1 commit into from
Nov 1, 2023

Conversation

dmivankov
Copy link
Contributor

@dmivankov dmivankov commented Nov 1, 2023

Currently
https://repo1.maven.org/maven2/com/cognite/spark/datasource/cdf-spark-datasource-fatjar_2.13/3.2.1044/cdf-spark-datasource-fatjar_2.13-3.2.1044.pom
includes

<dependencies>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.13.8</version>
  </dependency>
  <dependency>
    <groupId>com.cognite.spark.datasource</groupId>
    <artifactId>cdf-spark-datasource_2.13</artifactId>
    <version>3.2.1044</version>
  </dependency>
  <dependency>
    <groupId>io.scalaland</groupId>
    <artifactId>chimney_2.13</artifactId>
    <version>0.5.3</version>
  </dependency>
</dependencies>

Which isn't expected, fatjar should have (almost) no dependencies.
The jar itself though contains all the classes and applies shading
as expected.
It is then possible that pulling fatjar from maven central would pull duplicated classes
and if non-fatjar unshaded versions are picked up or a mix with fatjar ones, the
datasource would runtime-crash when used for example in Spark in DataBricks.

Let's remove chimney from commonSettings and move to per-target dependencies.
And for main library, looks like the simplest way to both
have it as part of fatJar and not have it in pom is to patch generated pom.

As for scala-library keeping it as-is, older fatjar setup also had it,
so not investigating for now.

Verified pom contents by running fatJarShaded/publishLocal and inspecting pom
from filesystem.
And used fatJarShaded/assembly to generate fatjar and inspect with jar tf
to check that is has our classes and applies shading rules.
Did not yet verify in databricks, but even if this PR isn't enough to fix
running there it is still a step in the right directiion.

CDF-20156

Currently
https://repo1.maven.org/maven2/com/cognite/spark/datasource/cdf-spark-datasource-fatjar_2.13/3.2.1044/cdf-spark-datasource-fatjar_2.13-3.2.1044.pom
includes
```xml
<dependencies>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.13.8</version>
  </dependency>
  <dependency>
    <groupId>com.cognite.spark.datasource</groupId>
    <artifactId>cdf-spark-datasource_2.13</artifactId>
    <version>3.2.1044</version>
  </dependency>
  <dependency>
    <groupId>io.scalaland</groupId>
    <artifactId>chimney_2.13</artifactId>
    <version>0.5.3</version>
  </dependency>
</dependencies>
```
Which isn't expected, fatjar should have (almost) no dependencies.
The jar itself though contains all the classes and applies shading
as expected.
It is then possible that pulling fatjar from maven central would pull duplicated classes
and if non-fatjar unshaded versions are picked up or a mix with fatjar ones, the
datasource would runtime-crash when used for example in Spark in DataBricks.

Let's remove chimney from commonSettings and move to per-target dependencies.
And for main library, looks like the simplest way to both
have it as part of fatJar and not have it in pom is to patch generated pom.

As for scala-library keeping it as-is, older fatjar setup also had it,
so not investigating for now.

Verified pom contents by running `fatJarShaded/publishLocal` and inspecting pom
from filesystem.
And used `fatJarShaded/assembly` to generate fatjar and inspect with `jar tf`
to check that is has our classes and applies shading rules.
Did not yet verify in databricks, but even if this PR isn't enough to fix
running there it is still a step in the right directiion.

[CDF-20156]
@dmivankov dmivankov requested a review from a team as a code owner November 1, 2023 11:27
@dmivankov dmivankov self-assigned this Nov 1, 2023
Copy link

codecov bot commented Nov 1, 2023

Codecov Report

Merging #858 (89f485d) into master (d473c47) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #858   +/-   ##
=======================================
  Coverage   78.77%   78.77%           
=======================================
  Files          48       48           
  Lines        3020     3020           
  Branches      135      135           
=======================================
  Hits         2379     2379           
  Misses        641      641           

@dmivankov dmivankov merged commit 7ae467d into master Nov 1, 2023
2 checks passed
@dmivankov dmivankov deleted the CDF-20156 branch November 1, 2023 12:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants