HADOOP-19072. S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir #6543

Merged
merged 11 commits into from
Aug 8, 2024

@virajjasani virajjasani commented Feb 9, 2024

Description of PR

HADOOP-19072 S3A: expand optimisations on stores with "fs.s3a.create.performance"

How was this patch tested?

us-west-2

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 1s Maven dependency ordering for branch
+1 💚 mvninstall 19m 19s trunk passed
+1 💚 compile 8m 18s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 7m 31s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 1m 59s trunk passed
+1 💚 mvnsite 1m 29s trunk passed
+1 💚 javadoc 1m 6s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 1s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
-1 ❌ spotbugs 1m 24s /branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings.
+1 💚 shadedclient 19m 55s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 20s Maven dependency ordering for patch
+1 💚 mvninstall 0m 45s the patch passed
+1 💚 compile 7m 53s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 7m 53s the patch passed
+1 💚 compile 7m 40s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 7m 40s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 56s the patch passed
+1 💚 mvnsite 1m 24s the patch passed
+1 💚 javadoc 1m 1s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 1s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 2m 22s the patch passed
+1 💚 shadedclient 19m 56s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 34s hadoop-common in the patch passed.
+1 💚 unit 2m 25s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
144m 55s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/1/artifact/out/Dockerfile
GITHUB PR #6543
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle
uname Linux d9248b10a158 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 2728303
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/1/testReport/
Max. process+thread count 2153 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 29s Maven dependency ordering for branch
+1 💚 mvninstall 19m 53s trunk passed
+1 💚 compile 8m 20s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 7m 38s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 2m 4s trunk passed
+1 💚 mvnsite 1m 28s trunk passed
+1 💚 javadoc 1m 6s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 1s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
-1 ❌ spotbugs 1m 26s /branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings.
+1 💚 shadedclient 19m 49s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 21s Maven dependency ordering for patch
+1 💚 mvninstall 0m 47s the patch passed
+1 💚 compile 8m 1s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 8m 1s the patch passed
+1 💚 compile 7m 30s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 7m 30s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 2m 0s the patch passed
+1 💚 mvnsite 1m 22s the patch passed
+1 💚 javadoc 0m 57s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 1s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 2m 21s the patch passed
+1 💚 shadedclient 19m 48s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 41s hadoop-common in the patch passed.
+1 💚 unit 2m 26s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
145m 53s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/2/artifact/out/Dockerfile
GITHUB PR #6543
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle
uname Linux b01c9384f3da 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 3874f72
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/2/testReport/
Max. process+thread count 3152 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani
Contributor Author

Tested against us-west-2:

mvn clean verify -Dparallel-tests -DtestsThreadCount=8 -Dscale -Dprefetch

Contributor

@steveloughran steveloughran left a comment

thanks. this looks appealing. we can save overhead. curious what downstream things will fail though...

@@ -144,7 +147,7 @@ public void testMkdirRecursiveWithExistingFile() throws IOException {
try {
fc.mkdir(dirPath, FileContext.DEFAULT_PERM, true);
Assert.fail("Mkdir for " + dirPath
- + " should have failed as a file was present");
+ + MKDIR_FILE_PRESENT_ERROR);
Contributor

let's move this old code to intercept(IOException.class, () -> mkdir(...))
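The intercept-style assertion suggested here can be sketched as follows. This is a minimal, self-contained stand-in, not Hadoop's test utility: the `InterceptSketch` class, its `intercept` helper, and the `mkdir` method are hypothetical names written for illustration, assuming only that the helper should return the expected exception and fail when the call succeeds.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Stand-in for an intercept-style test helper; names are hypothetical. */
final class InterceptSketch {

  /**
   * Run the callable and return the exception it throws, provided the
   * exception is of the expected type. Fail if the call returns normally.
   */
  static <T, E extends Exception> E intercept(Class<E> expected, Callable<T> eval)
      throws Exception {
    T result;
    try {
      result = eval.call();
    } catch (Exception e) {
      if (expected.isInstance(e)) {
        return expected.cast(e);
      }
      // An unexpected exception type is rethrown unchanged.
      throw e;
    }
    throw new AssertionError("Expected " + expected.getSimpleName()
        + " but the call returned: " + result);
  }

  /** Hypothetical mkdir: rejects the path when a file already sits there. */
  static boolean mkdir(String path, boolean fileExists) throws IOException {
    if (fileExists) {
      throw new IOException("Cannot create directory over a file: " + path);
    }
    return true;
  }

  public static void main(String[] args) throws Exception {
    // Replaces the try / Assert.fail / catch pattern with a single call.
    IOException e = intercept(IOException.class, () -> mkdir("/test/file", true));
    System.out.println("caught: " + e.getMessage());
  }
}
```

Compared with a try block followed by `Assert.fail`, this form keeps the expected exception type and the failing call on one line, and the returned exception can be inspected further.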


// Create the marker file, delete the parent entries
// if the filesystem isn't configured to retain them
callbacks.createFakeDirectory(dir, false);
Contributor

pass down performanceCreation here; so always keep parent dirs. I know the marker retention default has changed, but we are in performance mode here...

Contributor Author

This comment is from previous patch version.

@Test
public void testMkdirOverParentFile() throws Throwable {
try {
super.testMkdirOverParentFile();
Contributor

you know what I'm going to say here now, don't you?

Contributor Author

Yes. While I kept this in intercept(), I still had to move the call to a separate method so it could be used as a method reference or lambda.

Contributor

ok, this is getting over complex.

proposed: copy the superclass code but remove the expectation of failures, retaining only setup and validation.

Contributor

@steveloughran steveloughran left a comment

reviewed.

  • we can tighten this even more
  • we should treat magic paths as exactly the same as the performance creation ones.

@@ -124,7 +132,32 @@ public Boolean execute() throws IOException {
return true;
}

// Walk path to root, ensuring closest ancestor is a directory, not file
// if performance creation mode is set, no need to check
Contributor

how about on L116 we only do a HEAD check for the path without /, (maybe need new callback), so no LIST probe for a dir via HEAD/LIST

S3AFileStatus fileStatus = performanceCreation
    ? probePathStatusOrNull(dir, StatusProbeEnum.Head)
    : getPathStatusExpectingDir(dir);

Contributor Author

Sounds reasonable. I was curious about whether we need the full probe for magic; I think yes, we can make it much more performant.

Contributor

i was wrong; now the patch is in I see where I was mistaken. It's the versioned buckets where problems surface. sorry!

@@ -73,11 +79,13 @@ public MkdirOperation(
final StoreContext storeContext,
final Path dir,
final MkdirCallbacks callbacks,
- final boolean isMagicPath) {
+ final boolean isMagicPath,
Contributor

actually, these should be the same flag. so rename it performanceCreation and in s3aFS set to true if the path is magic or performanceCreation is true.

Contributor Author

This comment is from previous patch version.

@steveloughran
Contributor

@shameersss1 what do you think here?
actually, maybe under magic paths we skip trying to create dirs at all, at least on the in-memory mode. no files to look for after all so all that happens is a dir tree is needlessly created, and the HEAD requests I'm proposing wouldn't even find any conflict with files that don't exist

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 4s Maven dependency ordering for branch
+1 💚 mvninstall 21m 16s trunk passed
+1 💚 compile 8m 48s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 7m 59s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 2m 0s trunk passed
+1 💚 mvnsite 1m 24s trunk passed
+1 💚 javadoc 1m 2s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 2s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
-1 ❌ spotbugs 1m 26s /branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings.
+1 💚 shadedclient 19m 55s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 20s Maven dependency ordering for patch
+1 💚 mvninstall 0m 49s the patch passed
+1 💚 compile 8m 18s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 8m 18s the patch passed
+1 💚 compile 7m 45s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 7m 45s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 2m 4s /results-checkstyle-root.txt root: The patch generated 2 new + 15 unchanged - 0 fixed = 17 total (was 15)
+1 💚 mvnsite 1m 22s the patch passed
+1 💚 javadoc 0m 58s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 1m 2s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 2m 25s the patch passed
+1 💚 shadedclient 19m 56s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 17m 12s hadoop-common in the patch passed.
+1 💚 unit 2m 16s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 34s The patch does not generate ASF License warnings.
148m 44s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/3/artifact/out/Dockerfile
GITHUB PR #6543
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle
uname Linux d3062a6e7ed1 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 12e6bff
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/3/testReport/
Max. process+thread count 3151 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/3/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 13m 59s Maven dependency ordering for branch
+1 💚 mvninstall 20m 18s trunk passed
+1 💚 compile 8m 18s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 compile 7m 32s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 checkstyle 2m 1s trunk passed
+1 💚 mvnsite 1m 24s trunk passed
+1 💚 javadoc 1m 3s trunk passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 56s trunk passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
-1 ❌ spotbugs 1m 24s /branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings.
+1 💚 shadedclient 19m 24s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 20s Maven dependency ordering for patch
+1 💚 mvninstall 0m 43s the patch passed
+1 💚 compile 8m 3s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javac 8m 3s the patch passed
+1 💚 compile 7m 35s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 javac 7m 35s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 56s the patch passed
+1 💚 mvnsite 1m 19s the patch passed
+1 💚 javadoc 0m 55s the patch passed with JDK Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04
+1 💚 javadoc 0m 59s the patch passed with JDK Private Build-1.8.0_392-8u392-ga-1~20.04-b08
+1 💚 spotbugs 2m 19s the patch passed
+1 💚 shadedclient 19m 37s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 50s hadoop-common in the patch passed.
+1 💚 unit 2m 20s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
144m 26s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/4/artifact/out/Dockerfile
GITHUB PR #6543
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle
uname Linux f3c9795ec06a 5.15.0-88-generic #98-Ubuntu SMP Mon Oct 2 15:18:56 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / ff8a9d2
Default Java Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.21+9-post-Ubuntu-0ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_392-8u392-ga-1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/4/testReport/
Max. process+thread count 1282 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@shameersss1
Contributor

@shameersss1 what do you think here? actually, maybe under magic paths we skip trying to create dirs at all, at least on the in-memory mode. no files to look for after all so all that happens is a dir tree is needlessly created, and the HEAD requests I'm proposing wouldn't even find any conflict with files that don't exist

@steveloughran In the proposed solution (https://issues.apache.org/jira/browse/HADOOP-19047), even in the in-memory mode, the taskAttempt will write a (.pendingset) file containing the metadata of the multi-part upload (MPU) inside the magic path. That file will be read by the driver process, hence the directory creation is necessary.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 23s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 16s Maven dependency ordering for branch
+1 💚 mvninstall 21m 45s trunk passed
+1 💚 compile 9m 46s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 9m 3s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 2m 10s trunk passed
+1 💚 mvnsite 1m 35s trunk passed
+1 💚 javadoc 1m 9s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 1m 7s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 2m 17s trunk passed
+1 💚 shadedclient 21m 49s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 21s Maven dependency ordering for patch
+1 💚 mvninstall 0m 54s the patch passed
+1 💚 compile 8m 23s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 8m 23s the patch passed
+1 💚 compile 7m 47s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 javac 7m 47s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 58s the patch passed
+1 💚 mvnsite 1m 25s the patch passed
+1 💚 javadoc 0m 53s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 1m 2s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 2m 25s the patch passed
+1 💚 shadedclient 20m 0s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 39s hadoop-common in the patch passed.
+1 💚 unit 2m 14s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 35s The patch does not generate ASF License warnings.
154m 16s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/5/artifact/out/Dockerfile
GITHUB PR #6543
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle
uname Linux 106dcbd22ec9 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / ff8a9d2
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/5/testReport/
Max. process+thread count 1273 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani
Contributor Author

@steveloughran could you please take another look?

@Test
public void testMkdirOverParentFile() throws Throwable {
try {
super.testMkdirOverParentFile();
Contributor

ok, this is getting over complex.

proposed: copy the superclass code but remove the expectation of failures, retaining only setup and validation.

@steveloughran
Contributor

reviewed, i'm just wondering how to make the test the cleanest.

Going to invite reviews from @shameersss1 @ahmarsuhail @HarshitGupta11 @mukund-thakur as they've been looking around here.

Does anyone expect anything to break from this? I don't: we know code doesn't normally try these tricks, otherwise we'd have had complaints about other optimisations.

@virajjasani
Contributor Author

ok, this is getting over complex.

proposed: copy the superclass code but remove the expectation of failures, retaining only setup and validation.

sounds good, addressed in the latest revision.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 9s Maven dependency ordering for branch
+1 💚 mvninstall 23m 56s trunk passed
+1 💚 compile 8m 47s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 8m 2s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 2m 2s trunk passed
+1 💚 mvnsite 1m 20s trunk passed
+1 💚 javadoc 1m 3s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 59s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 2m 14s trunk passed
+1 💚 shadedclient 20m 42s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 20s Maven dependency ordering for patch
+1 💚 mvninstall 0m 50s the patch passed
+1 💚 compile 8m 24s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 8m 24s the patch passed
+1 💚 compile 8m 2s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 javac 8m 2s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 1m 55s the patch passed
+1 💚 mvnsite 1m 23s the patch passed
+1 💚 javadoc 0m 56s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 59s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 2m 24s the patch passed
+1 💚 shadedclient 20m 28s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 36s hadoop-common in the patch passed.
+1 💚 unit 2m 15s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
152m 50s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/6/artifact/out/Dockerfile
GITHUB PR #6543
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle
uname Linux 2e19c4bcca99 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 8d3012c
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/6/testReport/
Max. process+thread count 2153 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@steveloughran steveloughran left a comment

I'm worried that this patch is now a bit over-ambitious, because it will let us create a directory over a file. I'm okay with creating a directory under a file, as that should be rarer. Creating a directory where there already is a file is dangerous.

We do treat magic paths in the "danger" way because we are assuming that they are exclusively for Spark and MapReduce jobs writing directory trees properly and that the whole directory tree is transient.

But if we're doing this for a whole directory, for all applications, I think that is a bit too risky.

I now realise why C/C++ compilers let you list optimisations explicitly, with the -O1, -O2, -O3 level of aggression but which can be expanded to explicit request of all features: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

What do we do here then? I am pushing you round in circles aren't I?
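The idea of listing optimisations explicitly, rather than through a single on/off switch, can be sketched as a comma-separated option value parsed into a set of flags. This is a minimal sketch under stated assumptions, not the patch's code: the `PerformanceFlagsSketch` class, the `Flag` enum and its values, and the `parse` helper are hypothetical names for illustration, and treating `*` as "enable everything" is an assumption modelled on the compiler analogy.

```java
import java.util.EnumSet;
import java.util.Locale;

/** Parses an explicit list of optimisation flags; all names are hypothetical. */
final class PerformanceFlagsSketch {

  /** Illustrative flag names only; the real option may define a different set. */
  enum Flag { CREATE, MKDIR }

  /**
   * Parse a comma-separated flag list. An empty or null value enables
   * nothing; "*" (an assumed wildcard) enables every known flag.
   */
  static EnumSet<Flag> parse(String value) {
    EnumSet<Flag> flags = EnumSet.noneOf(Flag.class);
    if (value == null) {
      return flags;
    }
    for (String token : value.split(",")) {
      String name = token.trim();
      if (name.isEmpty()) {
        continue;
      }
      if ("*".equals(name)) {
        return EnumSet.allOf(Flag.class);
      }
      // valueOf throws IllegalArgumentException for an unknown flag name.
      flags.add(Flag.valueOf(name.toUpperCase(Locale.ROOT)));
    }
    return flags;
  }

  public static void main(String[] args) {
    System.out.println(parse("mkdir"));
    System.out.println(parse("create, mkdir"));
    System.out.println(parse("*"));
  }
}
```

With a set like this, each risky shortcut can be gated on its own flag, so an application opts in to the mkdir optimisation without also accepting the others.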

// For non-performance or regular mode, the probe for both HEAD and LIST would
// be done.
S3AFileStatus fileStatus = performanceCreation
? probePathStatusOrNull(dir, StatusProbeEnum.HEAD_ONLY)
Contributor

this will trigger a needless PUT if there isn't a marker, just children, which hits all the problems in the comments of the existing code.

afraid we need to revert to the old code.
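A minimal sketch of why a HEAD-only probe can lead to a needless PUT. It models the store as a sorted set of object keys, which is a simplification; `ProbeSketch`, `headOnly`, `headThenList`, and `wouldPutMarker` are hypothetical names for illustration, not S3A methods.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Toy model of an object store: just a sorted set of keys. */
final class ProbeSketch {

  enum Status { FILE, DIRECTORY, NOT_FOUND }

  private final TreeSet<String> keys = new TreeSet<>();

  ProbeSketch(String... objectKeys) {
    keys.addAll(Arrays.asList(objectKeys));
  }

  /** HEAD on the exact key only: finds files, never directories. */
  Status headOnly(String path) {
    return keys.contains(path) ? Status.FILE : Status.NOT_FOUND;
  }

  /** HEAD, then LIST under "path/": also finds a marker or any child. */
  Status headThenList(String path) {
    if (keys.contains(path)) {
      return Status.FILE;
    }
    String prefix = path + "/";
    // The smallest key >= prefix is the first candidate under the prefix.
    String next = keys.ceiling(prefix);
    return next != null && next.startsWith(prefix)
        ? Status.DIRECTORY
        : Status.NOT_FOUND;
  }

  /** In this model, mkdir writes a marker whenever the probe finds nothing. */
  static boolean wouldPutMarker(Status probeResult) {
    return probeResult == Status.NOT_FOUND;
  }

  public static void main(String[] args) {
    // "data" has a child object but no "data/" marker object.
    ProbeSketch store = new ProbeSketch("data/part-0000");
    System.out.println("HEAD only:   PUT marker? "
        + wouldPutMarker(store.headOnly("data")));
    System.out.println("HEAD + LIST: PUT marker? "
        + wouldPutMarker(store.headThenList("data")));
  }
}
```

In this model the HEAD-only probe reports nothing at `data`, so a marker would be written even though the directory already exists implicitly through its child; the combined probe sees the child and skips the write.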

@@ -124,7 +132,32 @@ public Boolean execute() throws IOException {
return true;
}

// Walk path to root, ensuring closest ancestor is a directory, not file
// if performance creation mode is set, no need to check
Contributor

i was wrong; now the patch is in I see where I was mistaken. It's the versioned buckets where problems surface. sorry!

@virajjasani
Contributor Author

But if we're doing this for a whole directory, for all applications, I think that is a bit too risky.

I see your point.

Let me run the whole suite with the latest revision.

@virajjasani
Contributor Author

Tested against us-west-2, looks good for this change.

Though I found a separate issue with scale tests using the noaa-cors-pds bucket for my local endpoint/region setup. It's a minor issue, not a big deal; I will create a Jira later.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 18m 37s Maven dependency ordering for branch
+1 💚 mvninstall 20m 51s trunk passed
+1 💚 compile 9m 41s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 compile 9m 4s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 checkstyle 2m 14s trunk passed
+1 💚 mvnsite 1m 19s trunk passed
+1 💚 javadoc 1m 1s trunk passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 55s trunk passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 2m 3s trunk passed
+1 💚 shadedclient 21m 27s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 21s Maven dependency ordering for patch
+1 💚 mvninstall 0m 47s the patch passed
+1 💚 compile 9m 50s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javac 9m 50s the patch passed
+1 💚 compile 8m 54s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 javac 8m 54s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 2m 0s the patch passed
+1 💚 mvnsite 1m 18s the patch passed
+1 💚 javadoc 0m 54s the patch passed with JDK Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1
+1 💚 javadoc 0m 52s the patch passed with JDK Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
+1 💚 spotbugs 2m 22s the patch passed
+1 💚 shadedclient 21m 44s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 49s hadoop-common in the patch passed.
+1 💚 unit 2m 12s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
160m 27s
Subsystem Report/Notes
Docker ClientAPI=1.44 ServerAPI=1.44 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/7/artifact/out/Dockerfile
GITHUB PR #6543
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle
uname Linux 84529fe21bf3 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / fc306d5
Default Java Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/7/testReport/
Max. process+thread count 2769 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 6m 51s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 54s Maven dependency ordering for branch
+1 💚 mvninstall 20m 3s trunk passed
+1 💚 compile 9m 2s trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04
+1 💚 compile 8m 11s trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05
+1 💚 checkstyle 2m 6s trunk passed
+1 💚 mvnsite 1m 37s trunk passed
+1 💚 javadoc 1m 14s trunk passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04
+1 💚 javadoc 1m 10s trunk passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05
+1 💚 spotbugs 2m 11s trunk passed
+1 💚 shadedclient 20m 14s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 23s Maven dependency ordering for patch
+1 💚 mvninstall 0m 50s the patch passed
+1 💚 compile 8m 41s the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04
+1 💚 javac 8m 41s the patch passed
+1 💚 compile 8m 8s the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05
+1 💚 javac 8m 8s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 2m 4s the patch passed
+1 💚 mvnsite 1m 32s the patch passed
+1 💚 javadoc 1m 10s the patch passed with JDK Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04
+1 💚 javadoc 1m 10s the patch passed with JDK Private Build-1.8.0_422-8u422-b05-1~20.04-b05
+1 💚 spotbugs 2m 24s the patch passed
+1 💚 shadedclient 20m 40s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 16m 50s hadoop-common in the patch passed.
+1 💚 unit 2m 12s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 42s The patch does not generate ASF License warnings.
157m 24s
Subsystem Report/Notes
Docker ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/11/artifact/out/Dockerfile
GITHUB PR #6543
Optional Tests dupname asflicense mvnsite codespell detsecrets markdownlint compile javac javadoc mvninstall unit shadedclient spotbugs checkstyle
uname Linux 43900f0f3b1f 5.15.0-116-generic #126-Ubuntu SMP Mon Jul 1 10:14:24 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c5652a0
Default Java Private Build-1.8.0_422-8u422-b05-1~20.04-b05
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.24+8-post-Ubuntu-1ubuntu320.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_422-8u422-b05-1~20.04-b05
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/11/testReport/
Max. process+thread count 3153 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6543/11/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

Contributor

@mukund-thakur mukund-thakur left a comment

Looks good to me overall. we can merge once Steve takes a final look.

@@ -200,8 +200,8 @@ Prioritize file creation performance over safety checks for filesystem consisten
This:
1. Skips the `LIST` call which makes sure a file is being created over a directory.
Risk: a file is created over a directory.
1. Ignores the overwrite flag.
1. Never issues a `DELETE` call to delete parent directory markers.
2. Ignores the overwrite flag.
Contributor

I don't think we need to add numbering in md files.

Contributor

Doesn't make any difference; IDEs often add them automatically. I personally prefer just 1 because it's easier to reorder things, but I don't care what others do.

Contributor Author

Yeah, this was done by the IDE so I thought of keeping it. Earlier I updated this description, but now that we have the new config, I removed the description here and moved it to fs.s3a.performance.flags.

Contributor

@steveloughran steveloughran left a comment

LGTM
+1

some minor comments but not enough to justify another iteration

@@ -54,30 +56,54 @@
* <li>If needed, one PUT</li>
* </ol>
*/
@InterfaceAudience.Private
Contributor

Not needed, given the .impl package is tagged private/unstable.

FS_S3A_CREATE_PERFORMANCE,
FS_S3A_PERFORMANCE_FLAGS);
conf.setStrings(FS_S3A_PERFORMANCE_FLAGS,
"create,mkdir");
Contributor

Use just "set" here, unless you want to provide a list of the enum elements with .toString() after each.

@steveloughran steveloughran merged commit 321a6cc into apache:trunk Aug 8, 2024
4 checks passed
@steveloughran
Contributor

Tried to pull to branch-3.4; tests failed.

Reran on trunk; tests failed.

@virajjasani you need to do an urgent followup; use the same JIRA.

[INFO] Running org.apache.hadoop.fs.contract.s3a.ITestS3AContractMkdir
[ERROR] Tests run: 8, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 12.921 s <<< FAILURE! - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractMkdir
[ERROR] testMkdirOverParentFile(org.apache.hadoop.fs.contract.s3a.ITestS3AContractMkdir)  Time elapsed: 0.682 s  <<< FAILURE!
java.lang.AssertionError: 
mkdirs did not fail over a file but returned true; ls s3a://stevel-london/test/testMkdirOverParentFile [00] S3AFileStatus{path=s3a://stevel-london/test/testMkdirOverParentFile/child-to-mkdir; isDirectory=true; modification_time=0; access_time=0; owner=stevel; group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; isEncrypted=true; isErasureCoded=false} isEmptyDirectory=FALSE eTag=null versionId=null

        at org.junit.Assert.fail(Assert.java:89)
        at org.apache.hadoop.fs.contract.AbstractContractMkdirTest.testMkdirOverParentFile(AbstractContractMkdirTest.java:99)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.lang.Thread.run(Thread.java:750)

[INFO] 
[INFO] Results:
[INFO] 
[ERROR] Failures: 
[ERROR]   ITestS3AContractMkdir>AbstractContractMkdirTest.testMkdirOverParentFile:99->Assert.fail:89 mkdirs did not fail over a file but returned true; ls s3a://stevel-london/test/testMkdirOverParentFile [00] S3AFileStatus{path=s3a://stevel-london/test/testMkdirOverParentFile/child-to-mkdir; isDirectory=true; modification_time=0; access_time=0; owner=stevel; group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; isEncrypted=true; isErasureCoded=false} isEmptyDirectory=FALSE eTag=null versionId=null

@virajjasani
Contributor Author

The test is consistently passing for me using Oregon; let me try London.

But the above failure can happen only if fs.s3a.performance.flags contains mkdir. I checked my local configs too; nothing suspicious.
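Why the mkdir flag produces exactly this failure can be shown with a small, self-contained sketch. This is not the S3A implementation: `MkdirWalkSketch`, its in-memory sets, and the `probes` counter are hypothetical stand-ins for the real store and its HEAD/LIST requests. The assumption is only what the PR describes: without the flag, mkdir walks up the ancestors and fails if it finds a file; with the flag, that walk is skipped.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

/** Toy filesystem showing the effect of skipping the ancestor walk in mkdirs. */
final class MkdirWalkSketch {
  private final Set<String> files = new HashSet<>();
  private final Set<String> dirs = new HashSet<>();
  /** Counts ancestor probes; a stand-in for remote store requests. */
  int probes = 0;

  /** Register a file at the given absolute path. */
  void createFile(String path) {
    files.add(path);
  }

  /**
   * Create a directory.
   * @param path absolute path such as "/test/dir"
   * @param mkdirPerformance when true, skip the ancestor safety check
   */
  boolean mkdirs(String path, boolean mkdirPerformance) throws IOException {
    if (!mkdirPerformance) {
      // Walk towards the root until an existing directory is found,
      // failing if any ancestor turns out to be a file.
      String ancestor = parentOf(path);
      while (ancestor != null) {
        probes++;
        if (files.contains(ancestor)) {
          throw new IOException("Ancestor is a file: " + ancestor);
        }
        if (dirs.contains(ancestor)) {
          break;
        }
        ancestor = parentOf(ancestor);
      }
    }
    dirs.add(path);
    return true;
  }

  /** Parent of an absolute path, or null once the root is reached. */
  static String parentOf(String path) {
    int idx = path.lastIndexOf('/');
    return idx <= 0 ? null : path.substring(0, idx);
  }

  public static void main(String[] args) throws IOException {
    MkdirWalkSketch fs = new MkdirWalkSketch();
    fs.createFile("/test/parentfile");
    // With the flag set, no probe is made and the call succeeds over a file.
    System.out.println(fs.mkdirs("/test/parentfile/child", true));
    try {
      fs.mkdirs("/test/parentfile/child", false);
    } catch (IOException expected) {
      System.out.println("safe mode rejected it: " + expected.getMessage());
    }
  }
}
```

A contract test that expects mkdirs under a file to fail therefore only passes when the flag is off, which matches the report that the failure needs mkdir in the flag list.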

@virajjasani
Contributor Author

The only way I am able to reproduce the exact failure is by applying this patch:

diff --git a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractMkdir.java b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractMkdir.java
index bace0a79f24..e01048b5c3b 100644
--- a/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractMkdir.java
+++ b/hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractMkdir.java
@@ -23,6 +23,7 @@
 import org.apache.hadoop.fs.contract.AbstractFSContract;
 
 import static org.apache.hadoop.fs.s3a.Constants.FS_S3A_CREATE_PERFORMANCE;
+import static org.apache.hadoop.fs.s3a.Constants.FS_S3A_PERFORMANCE_FLAGS;
 import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides;
 
 /**
@@ -33,8 +34,12 @@ public class ITestS3AContractMkdir extends AbstractContractMkdirTest {
   @Override
   protected Configuration createConfiguration() {
     Configuration conf = super.createConfiguration();
-    removeBaseAndBucketOverrides(conf,
-        FS_S3A_CREATE_PERFORMANCE);
+    removeBaseAndBucketOverrides(
+        conf,
+        FS_S3A_CREATE_PERFORMANCE,
+        FS_S3A_PERFORMANCE_FLAGS);
+    conf.setStrings(FS_S3A_PERFORMANCE_FLAGS,
+        "create,mkdir");
     return conf;
   }

@virajjasani
Contributor Author

Or I can reproduce the exact failure with this patch, without any code changes:

diff --git a/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml b/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
index 4104e304314..97a7fcb3290 100644
--- a/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
+++ b/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml
@@ -1428,6 +1428,11 @@
   </description>
 </property>
 
+  <property>
+    <name>fs.s3a.performance.flags</name>
+    <value>mkdir</value>
+  </property>
+
 <property>
   <name>fs.s3a.security.credential.provider.path</name>
   <value />

@virajjasani
Contributor Author

Addendum PR to override fs.s3a.performance.flags: #6985

@mukund-thakur
Contributor

Just pulled the trunk and ran this test and it passed for me as well.

@virajjasani
Contributor Author

Thanks for validating @mukund-thakur!
This failure can only happen if the perf flags are set explicitly, I am certain. This addendum will ensure that even if the perf flags are set, the test passes: #6985

@virajjasani
Contributor Author

ITestS3AFileContextCreateMkdir would also fail if the perf flags are set explicitly, so the addendum overrides perf flags for both tests (ITestS3AContractMkdir and ITestS3AFileContextCreateMkdir) #6985

@steveloughran
Contributor

My setup is the most aggressive you can get: all new options are enabled automatically. The new PR fixes it.

  <property>
    <name>fs.s3a.performance.flags</name>
    <value>*</value>
  </property>
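As a rough illustration of why a `*` value picks up every new option automatically, the sketch below parses a comma-separated flag string into an enum set. It is not Hadoop's parser: `PerfFlagsSketch`, its `Flag` enum (limited to the two flags named in this thread), and the choice to reject unknown names are all assumptions made for the example.

```java
import java.util.EnumSet;
import java.util.Locale;

/** Hypothetical parser for a comma-separated performance flag list. */
final class PerfFlagsSketch {
  /** Only the flags mentioned in this thread; the real list may differ. */
  enum Flag { CREATE, MKDIR }

  /**
   * Parse a value such as "create,mkdir" or "*".
   * @param value raw configuration value; may be null or empty
   * @return the enabled flags; "*" enables every flag defined in the enum
   * @throws IllegalArgumentException on an unknown flag name (sketch choice)
   */
  static EnumSet<Flag> parse(String value) {
    EnumSet<Flag> flags = EnumSet.noneOf(Flag.class);
    if (value == null) {
      return flags;
    }
    for (String raw : value.split(",")) {
      String name = raw.trim();
      if (name.isEmpty()) {
        continue;
      }
      if ("*".equals(name)) {
        // Wildcard: whatever the enum contains today, including later additions.
        return EnumSet.allOf(Flag.class);
      }
      flags.add(Flag.valueOf(name.toUpperCase(Locale.ROOT)));
    }
    return flags;
  }

  public static void main(String[] args) {
    System.out.println(parse("create,mkdir"));
    System.out.println(parse("*"));
    System.out.println(parse(""));
  }
}
```

Because the wildcard expands to the whole enum, a test suite run with `*` exercises each newly added flag without any change to the configuration file, which is how this setup exposed the mkdir failures.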

steveloughran pushed a commit that referenced this pull request Aug 12, 2024
…DUM) (#6985)


This is a followup to #6543 which ensures all test pass in configurations where 
fs.s3a.performance.flags is set to "*" or contains "mkdirs"

Contributed by VJ Jasani
steveloughran pushed a commit that referenced this pull request Aug 12, 2024
…mance.flags" for mkdir (#6543)

If the flag list in fs.s3a.performance.flags includes "mkdir"
then the safety check of a walk up the tree to look for a parent directory,
-done to verify a directory isn't being created under a file- are skipped.

This saves the cost of multiple list operations.

Includes:

HADOOP-19072. S3A: Override fs.s3a.performance.flags for tests (ADDENDUM) (#6985)

This is a followup to #6543 which ensures all test pass in configurations where
fs.s3a.performance.flags is set to "*" or contains "mkdirs"

Contributed by VJ Jasani
@steveloughran
Contributor

steveloughran commented Aug 12, 2024

Aah, even after the addendum I'm getting failures.

Viraj, copy my setting then rerun everything with -Dscale. Then also do a run with -Dprefix.


[INFO] 
[ERROR] Failures: 
[ERROR]   ITestS3AFSMainOperations>FSMainOperationsBaseTest.testMkdirsFailsForSubdirectoryOfExistingFile:241 Should throw IOException.
[ERROR]   ITestS3AFileSystemContract>FileSystemContractBaseTest.testMkdirsFailsForSubdirectoryOfExistingFile:224 Should throw IOException.
[ERROR]   ITestS3AFileContextMainOperations>FileContextMainOperationsBaseTest.testMkdirsFailsForSubdirectoryOfExistingFile:276 Should throw IOException.
[ERROR]   ITestS3AFileContextURI>FileContextURIBase.testMkdirsFailsForSubdirectoryOfExistingFile:258 Should throw IOException.
[INFO] 
[ERROR] testMkdirsFailsForSubdirectoryOfExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextURI)  Time elapsed: 1.217 s  <<< FAILURE!
java.lang.AssertionError: Should throw IOException.
        at org.junit.Assert.fail(Assert.java:89)
        at org.apache.hadoop.fs.FileContextURIBase.testMkdirsFailsForSubdirectoryOfExistingFile(FileContextURIBase.java:258)

[ERROR] testMkdirsFailsForSubdirectoryOfExistingFile(org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations)  Time elapsed: 1.264 s  <<< FAILURE!
java.lang.AssertionError: Should throw IOException.
        at org.junit.Assert.fail(Assert.java:89)
        at org.apache.hadoop.fs.FileContextMainOperationsBaseTest.testMkdirsFailsForSubdirectoryOfExistingFile(FileContextMainOperationsBaseTest.java:276)


[ERROR] testMkdirsFailsForSubdirectoryOfExistingFile(org.apache.hadoop.fs.s3a.ITestS3AFileSystemContract)  Time elapsed: 0.856 s  <<< FAILURE!
java.lang.AssertionError: Should throw IOException.
        at org.junit.Assert.fail(Assert.java:89)
        at org.apache.hadoop.fs.FileSystemContractBaseTest.testMkdirsFailsForSubdirectoryOfExistingFile(FileSystemContractBaseTest.java:224)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)

[ERROR] testMkdirsFailsForSubdirectoryOfExistingFile(org.apache.hadoop.fs.s3a.ITestS3AFSMainOperations)  Time elapsed: 0.844 s  <<< FAILURE!
java.lang.AssertionError: Should throw IOException.
        at org.junit.Assert.fail(Assert.java:89)
        at org.apache.hadoop.fs.FSMainOperationsBaseTest.testMkdirsFailsForSubdirectoryOfExistingFile(FSMainOperationsBaseTest.java:241)

I suspect I didn't do a full 3.4 test run before pushing it up on that branch, just the modified files. My mistake. There'll be another patch to backport.

@virajjasani
Contributor Author

virajjasani commented Aug 12, 2024

my setup is the most aggressive you can get: all new options automatically. new pr fixes it

  <property>
    <name>fs.s3a.performance.flags</name>
    <value>*</value>
  </property>

Sure, I will run all tests with this setup. Now I think we could even introduce a new mvn profile to run all tests with all perf flags? Something like -DenableAllPerfFlags (as a follow-up Jira)?

@steveloughran
Contributor

no, maybe too many profiles. best to have that variety of users and their own non-standard configs

@virajjasani
Contributor Author

Done with the full run, these are the failures:

[ERROR] Failures: 
[ERROR]   ITestS3AFSMainOperations>FSMainOperationsBaseTest.testMkdirsFailsForSubdirectoryOfExistingFile:241 Should throw IOException.
[ERROR]   ITestS3AFileSystemContract>FileSystemContractBaseTest.testMkdirsFailsForSubdirectoryOfExistingFile:224 Should throw IOException.
[ERROR]   ITestS3AFileContextMainOperations>FileContextMainOperationsBaseTest.testMkdirsFailsForSubdirectoryOfExistingFile:276 Should throw IOException.
[ERROR]   ITestS3AFileContextURI>FileContextURIBase.testMkdirsFailsForSubdirectoryOfExistingFile:258 Should throw IOException.
[INFO] 

This seems to match Steve's comment above listing the failed tests. I will prepare the addendum soon.

@virajjasani
Contributor Author

Here is another addendum #6993

I am re-running the whole suite to ensure everything passes with the above settings.

steveloughran pushed a commit that referenced this pull request Aug 14, 2024
…UM 2) (#6993)


Second followup to #6543; all hadoop-aws integration tests complete correctly even when 

fs.s3a.performance.flags = *

Contributed by Viraj Jasani
steveloughran pushed a commit to steveloughran/hadoop that referenced this pull request Aug 15, 2024
…mance.flags" for mkdir (apache#6543)

If the flag list in fs.s3a.performance.flags includes "mkdir"
then the safety check of a walk up the tree to look for a parent directory,
-done to verify a directory isn't being created under a file- are skipped.

This saves the cost of multiple list operations.

Includes:

HADOOP-19072. S3A: Override fs.s3a.performance.flags for tests (ADDENDUM) (apache#6985)

This is a followup to apache#6543 which ensures all test pass in configurations where
fs.s3a.performance.flags is set to "*" or contains "mkdirs"

Contributed by VJ Jasani
steveloughran pushed a commit to steveloughran/hadoop that referenced this pull request Aug 15, 2024
…UM 2) (apache#6993)

Second followup to apache#6543; all hadoop-aws integration tests complete correctly even when

fs.s3a.performance.flags = *

Contributed by Viraj Jasani
KeeProMise pushed a commit to KeeProMise/hadoop that referenced this pull request Sep 9, 2024
…mance.flags" for mkdir (apache#6543)


If the flag list in fs.s3a.performance.flags includes "mkdir"
then the safety check of a walk up the tree to look for a parent directory,
-done to verify a directory isn't being created under a file- are skipped.

This saves the cost of multiple list operations.

Contributed by Viraj Jasani
KeeProMise pushed a commit to KeeProMise/hadoop that referenced this pull request Sep 9, 2024
…DUM) (apache#6985)


This is a followup to apache#6543 which ensures all test pass in configurations where 
fs.s3a.performance.flags is set to "*" or contains "mkdirs"

Contributed by VJ Jasani
KeeProMise pushed a commit to KeeProMise/hadoop that referenced this pull request Sep 9, 2024
…UM 2) (apache#6993)


Second followup to apache#6543; all hadoop-aws integration tests complete correctly even when 

fs.s3a.performance.flags = *

Contributed by Viraj Jasani
Hexiaoqiao pushed a commit to Hexiaoqiao/hadoop that referenced this pull request Sep 12, 2024
…mance.flags" for mkdir (apache#6543)


If the flag list in fs.s3a.performance.flags includes "mkdir"
then the safety check of a walk up the tree to look for a parent directory,
-done to verify a directory isn't being created under a file- are skipped.

This saves the cost of multiple list operations.

Contributed by Viraj Jasani
Hexiaoqiao pushed a commit to Hexiaoqiao/hadoop that referenced this pull request Sep 12, 2024
…DUM) (apache#6985)


This is a followup to apache#6543 which ensures all test pass in configurations where 
fs.s3a.performance.flags is set to "*" or contains "mkdirs"

Contributed by VJ Jasani
Hexiaoqiao pushed a commit to Hexiaoqiao/hadoop that referenced this pull request Sep 12, 2024
…UM 2) (apache#6993)


Second followup to apache#6543; all hadoop-aws integration tests complete correctly even when 

fs.s3a.performance.flags = *

Contributed by Viraj Jasani