-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-19072. S3A: expand optimisations on stores with "fs.s3a.performance.flags" for mkdir #6543
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Tested against
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks. this looks appealing. we can save overhead. curious what downstream things will fail though...
@@ -144,7 +147,7 @@ public void testMkdirRecursiveWithExistingFile() throws IOException { | |||
try { | |||
fc.mkdir(dirPath, FileContext.DEFAULT_PERM, true); | |||
Assert.fail("Mkdir for " + dirPath | |||
+ " should have failed as a file was present"); | |||
+ MKDIR_FILE_PRESENT_ERROR); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets move this old code to intercept(IOException.class, ()-> mkdir(...))
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/MkdirOperation.java
Outdated
Show resolved
Hide resolved
|
||
// Create the marker file, delete the parent entries | ||
// if the filesystem isn't configured to retain them | ||
callbacks.createFakeDirectory(dir, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass down performanceCreation here; so always keep parent dirs. I know the marker retention default has changed, but we are in perfomance mode here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is from previous patch version.
@Test | ||
public void testMkdirOverParentFile() throws Throwable { | ||
try { | ||
super.testMkdirOverParentFile(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you know what I'm going to say here now, don't you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, while kept this in intercept(), still had to move the call to separate method for using method ref or lambda.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, this is getting over complex.
proposed: copy the superclass code but remove the expectation of failures, retaining only setup and validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reviewed.
- we can tighten this even more
- we should treat magic paths as exactly the same as the performance creation ones.
@@ -124,7 +132,32 @@ public Boolean execute() throws IOException { | |||
return true; | |||
} | |||
|
|||
// Walk path to root, ensuring closest ancestor is a directory, not file | |||
// if performance creation mode is set, no need to check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about on L116 we only do a HEAD check for the path without /, (maybe need new callback), so no LIST probe for a dir via HEAD/LIST
S3AFileStatus fileStatus = performanceCreation
? probePathStatusOrNull(dir, StatusProbeEnum.Head)
? getPathStatusExpectingDir(dir);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable, i was curious about whether we need full probe for magic, i think yes we can make it much performant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was wrong; now the patch is in I see where I was mistaken. Its the versioned buckets where problems surface. sorry!
@@ -73,11 +79,13 @@ public MkdirOperation( | |||
final StoreContext storeContext, | |||
final Path dir, | |||
final MkdirCallbacks callbacks, | |||
final boolean isMagicPath) { | |||
final boolean isMagicPath, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, these should be the same flag. so rename it performanceCreation and in s3aFS set to true if the path is magic or performanceCreation
is true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is from previous patch version.
@shameersss1 what do you think here? |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
@steveloughran In the proposed solution (https://issues.apache.org/jira/browse/HADOOP-19047), Even in the in-memory mode, The taskAttempt will write a (.pendingset) file containing the metadata of multi-part-upload (MPU) inside the magic path which will be read by the driver process and Hence the directory creation is necessary. |
🎊 +1 overall
This message was automatically generated. |
@steveloughran could you please take another look? |
@Test | ||
public void testMkdirOverParentFile() throws Throwable { | ||
try { | ||
super.testMkdirOverParentFile(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, this is getting over complex.
proposed: copy the superclass code but remove the expectation of failures, retaining only setup and validation.
reviewed, i'm just wondering how to make the test the cleanest. Going to invite reviews from @shameersss1 @ahmarsuhail @HarshitGupta11 @mukund-thakur as they've been looking around here. Does anyone expect anything to break from this? I don't: we know code doesn't normally try these tricks, otherwise we'd have had complaints about other optimisations. |
sounds good, addressed in the latest revision. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm worried that this patch is now a bit over ambitious, because it will let us create a directory over a file. I'm okay with creating a directory under a file, as that should be rarer. Creating a directory where they already is a file is danger.
We do treat magic pots in the "Danger" way because we are assuming that they are exclusively for spark and map produce jobs writing directory trees properly and that the whole directory tree is transient.
But if we're doing this for a whole directory, for all applications, I think that is a bit too risky.
I now realise why C/C++ compilers let you list optimisations explicitly, with the -O1, -O2, -O3 level of aggression but which can be expanded to explicit request of all features: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
What do we do here then? I am pushing you round in circles aren't I?
// For non-performance or regular mode, the probe for both HEAD and LIST would | ||
// be done. | ||
S3AFileStatus fileStatus = performanceCreation | ||
? probePathStatusOrNull(dir, StatusProbeEnum.HEAD_ONLY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will trigger a needless PUT if there isn't a marker, just children, which hits all the problems in the comments of the existing code.
afrid we need to revert to the old code.
@@ -124,7 +132,32 @@ public Boolean execute() throws IOException { | |||
return true; | |||
} | |||
|
|||
// Walk path to root, ensuring closest ancestor is a directory, not file | |||
// if performance creation mode is set, no need to check |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was wrong; now the patch is in I see where I was mistaken. Its the versioned buckets where problems surface. sorry!
This reverts commit 12e6bff.
I see your point. Let me run the whole suite with the latest revision. |
Tested against Though found a separate issue with scale tests using |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me overall. we can merge once Steve takes a final look.
@@ -200,8 +200,8 @@ Prioritize file creation performance over safety checks for filesystem consisten | |||
This: | |||
1. Skips the `LIST` call which makes sure a file is being created over a directory. | |||
Risk: a file is created over a directory. | |||
1. Ignores the overwrite flag. | |||
1. Never issues a `DELETE` call to delete parent directory markers. | |||
2. Ignores the overwrite flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to add numbering in md files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't make any difference; IDEs often add them automatically. I personally prefer just 1
because its easier to reorder things, but don't care what others do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, this was done by IDE so i thought of keeping it. earlier i updated this description but now that we have new config, i removed the description here and moved it to fs.s3a.performance.flags
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
+1
some minor comments but not enough to justify another iteration
@@ -200,8 +200,8 @@ Prioritize file creation performance over safety checks for filesystem consisten | |||
This: | |||
1. Skips the `LIST` call which makes sure a file is being created over a directory. | |||
Risk: a file is created over a directory. | |||
1. Ignores the overwrite flag. | |||
1. Never issues a `DELETE` call to delete parent directory markers. | |||
2. Ignores the overwrite flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't make any difference; IDEs often add them automatically. I personally prefer just 1
because its easier to reorder things, but don't care what others do
@@ -54,30 +56,54 @@ | |||
* <li>If needed, one PUT</li> | |||
* </ol> | |||
*/ | |||
@InterfaceAudience.Private |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not needed given .impl paackage is tagged privat/unstable
FS_S3A_CREATE_PERFORMANCE, | ||
FS_S3A_PERFORMANCE_FLAGS); | ||
conf.setStrings(FS_S3A_PERFORMANCE_FLAGS, | ||
"create,mkdir"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use just "set" unless you want to provide a list of the enum elements with .toString() after each
tried to pull to branch-3.4; tests failed reran on trunk,. tests failed @virajjasani you need to do an urgent followup; use same JIRA
|
The test is consistently passing for me, using oregon, let me try london. But the above failure can happen only if |
The only way i am able to reproduce the exact failure is by applying this patch:
|
or i can reproduce the exact failure with this patch without any code changes:
|
Addendum PR to override |
Just pulled the trunk and ran this test and it passed for me as well. |
Thanks for validating @mukund-thakur! |
|
my setup is the most aggressive you can get: all new options automatically. new pr fixes it <property>
<name>fs.s3a.performance.flags</name>
<value>*</value>
</property> |
…mance.flags" for mkdir (#6543) If the flag list in fs.s3a.performance.flags includes "mkdir" then the safety check of a walk up the tree to look for a parent directory, -done to verify a directory isn't being created under a file- are skipped. This saves the cost of multiple list operations. Includes: HADOOP-19072. S3A: Override fs.s3a.performance.flags for tests (ADDENDUM) (#6985) This is a followup to #6543 which ensures all test pass in configurations where fs.s3a.performance.flags is set to "*" or contains "mkdirs" Contributed by VJ Jasani
aah, even after the addendum i"m getting failures. Viraj. copy my setting then rerun everything with -Dscale. Then also a run with -Dprefix
suspect I didn't do a full 3.4 test run before pushing it up on that branch, instead just the modified files. my mistake. there'll be another patch to backport |
Sure i will run all tests with this setup. Now i think we can even introduce new mvn profile to run all tests will all perf flgas? Something like |
no, maybe too many profiles. best to have that variety of users and their own non-standard configs |
Done with the full run, these are the failures:
Seems to be matching with Steve's above comment listing failed tests. I will prepare the addendum soon. |
Here is another addendum #6993 I am re-running the whole suite again to ensure everything passes with above settings. |
…mance.flags" for mkdir (apache#6543) If the flag list in fs.s3a.performance.flags includes "mkdir" then the safety check of a walk up the tree to look for a parent directory, -done to verify a directory isn't being created under a file- are skipped. This saves the cost of multiple list operations. Includes: HADOOP-19072. S3A: Override fs.s3a.performance.flags for tests (ADDENDUM) (apache#6985) This is a followup to apache#6543 which ensures all test pass in configurations where fs.s3a.performance.flags is set to "*" or contains "mkdirs" Contributed by VJ Jasani
…UM 2) (apache#6993) Second followup to apache#6543; all hadoop-aws integration tests complete correctly even when fs.s3a.performance.flags = * Contributed by Viraj Jasani
…mance.flags" for mkdir (apache#6543) If the flag list in fs.s3a.performance.flags includes "mkdir" then the safety check of a walk up the tree to look for a parent directory, -done to verify a directory isn't being created under a file- are skipped. This saves the cost of multiple list operations. Contributed by Viraj Jasani
…DUM) (apache#6985) This is a followup to apache#6543 which ensures all test pass in configurations where fs.s3a.performance.flags is set to "*" or contains "mkdirs" Contributed by VJ Jasani
…UM 2) (apache#6993) Second followup to apache#6543; all hadoop-aws integration tests complete correctly even when fs.s3a.performance.flags = * Contributed by Viraj Jasani
…mance.flags" for mkdir (apache#6543) If the flag list in fs.s3a.performance.flags includes "mkdir" then the safety check of a walk up the tree to look for a parent directory, -done to verify a directory isn't being created under a file- are skipped. This saves the cost of multiple list operations. Contributed by Viraj Jasani
…DUM) (apache#6985) This is a followup to apache#6543 which ensures all test pass in configurations where fs.s3a.performance.flags is set to "*" or contains "mkdirs" Contributed by VJ Jasani
…UM 2) (apache#6993) Second followup to apache#6543; all hadoop-aws integration tests complete correctly even when fs.s3a.performance.flags = * Contributed by Viraj Jasani
Description of PR
HADOOP-19072 S3A: expand optimisations on stores with "fs.s3a.create.performance"
How was this patch tested?
us-west-2
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?