improve handling of symlinks in poll OR re-introduce get (or equivalent functionality) for polls #1298
Comments
In sr3, the structure returned from sftp is SFTPAttributes... so we can already test if it is a link...
We could add logic, perhaps based on an option follow_symlinks, where, if we find a link, we either don't return it at all, or return the followed type (by doing a stat of the destination and returning that instead).
Might have to do that in the ls routine of the sftp driver though...
This is where I meant: sarracenia/sarracenia/transfer/sftp.py Lines 419 to 427 in 1a0eb16
We post-process the SFTPAttributes records being looped over there: we can test for a symlink, and replace the record with the destination's. We could do that either all the time, or conditionally on whether follow_symlinks is set.
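A minimal sketch of that post-processing, assuming a client that behaves like `paramiko.SFTPClient` (`listdir_attr` and `stat` are real paramiko methods). The function name `list_following_links` and its `follow_symlinks` argument are hypothetical, and this is not the code at the permalink above.

```python
import stat


def list_following_links(sftp, directory, follow_symlinks=True):
    """Return the entries of `directory`, with symlink records replaced
    by the attributes of their destinations.

    `sftp` is assumed to behave like paramiko.SFTPClient.
    """
    entries = []
    for attr in sftp.listdir_attr(directory):
        if attr.st_mode is not None and stat.S_ISLNK(attr.st_mode):
            if not follow_symlinks:
                continue  # option: don't return links at all
            try:
                # stat() follows the link, unlike lstat()
                target = sftp.stat(directory.rstrip('/') + '/' + attr.filename)
            except IOError:
                continue  # dangling link: nothing to report
            target.filename = attr.filename  # keep the name seen in the listing
            attr = target
        entries.append(attr)
    return entries
```

This costs one extra round trip per link, and only for links. Whether a dangling link should be skipped silently or logged is a design choice the sketch does not settle.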
Oh... and to say: v2 works with an ASCII listing, where sr3 works on the SFTPAttributes. In v2 we should see an 'l' at the beginning of the line, from which we can deduce that it is a link, and do a similar thing. I think it is easier and more worthwhile to do this based on SFTPAttributes in sr3.
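For the v2-style text listing, the check could look something like this. It is a sketch that assumes the usual Unix long-listing layout (nine whitespace-separated fields, with `name -> target` last for links); servers vary, and `parse_listing_line` is a hypothetical helper, not an existing v2 function.

```python
def parse_listing_line(line):
    """Split one `ls -l`-style line into (is_link, name, target).

    `target` is None unless the line describes a symlink whose
    destination is shown as `name -> target`.
    """
    fields = line.strip().split(None, 8)
    if len(fields) < 9:
        raise ValueError("unrecognized listing line: %r" % line)
    is_link = fields[0].startswith('l')  # first character of the mode string
    name, target = fields[8], None
    if is_link and ' -> ' in name:
        name, target = name.split(' -> ', 1)
    return is_link, name, target
```

The size field in such a line is still the size of the link itself, so the destination would have to be looked up separately to get the real file size.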
At least on this server, the directory listing does show that the file is a symlink (
Maybe setting the size to 0 when we see a symlink is also a possibility?
In sr3: the ls_attr call already present returns SFTPAttributes for every file in the directory, so we don't need to stat again. The SFTPAttributes record already returned is the same record returned by stat. It already has the mode field, which can be queried to find out whether the entry is a link. We do need to issue a readlink for each link we find if we want to find the destination; the SFTP protocol has readlink.
In v2: the ls_attr result is re-constructed into a string, to return something that looks like what is returned by an FTP server, but the same sort of logic is there. It's just that v2 throws away the stat records and returns the string.
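A sketch of the readlink step, again assuming a paramiko-like client (`listdir_attr` and `readlink` exist on `paramiko.SFTPClient`). The helper name `link_destinations` is hypothetical.

```python
import posixpath
import stat


def link_destinations(sftp, directory):
    """Map each symlink name in `directory` to the path it points to."""
    destinations = {}
    for attr in sftp.listdir_attr(directory):
        # the mode field already tells us whether the entry is a link
        if attr.st_mode is None or not stat.S_ISLNK(attr.st_mode):
            continue
        target = sftp.readlink(posixpath.join(directory, attr.filename))
        if target is None:
            continue
        # a relative target is relative to the directory holding the link
        destinations[attr.filename] = posixpath.normpath(
            posixpath.join(directory, target))
    return destinations
```

Only the entries flagged as links cost an extra request; regular files are handled entirely from the listing already in hand.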
I was imagining that stat'ing the |
options:
weirdness:
Currently the poll is v2, and these changes are too involved to do there, so we likely want to test with sr3...
back to talking about get...
If we just use path... then what if someone does:
(i.e. a path that is not within a single directory.) The current logic uses cd to get to a directory, and then does ls (no args) within it. That logic would be broken by this, so some kind of refactor is needed. Suggestion: use path (to establish the baseline) and have a second argument for the pattern within it. An alternate syntax...
having the pattern explicitly within a path (separate argument).
Another wrinkle: the globbing pattern is a server-side grammar, so you need to know the server-side grammar to specify it. E.g. I believe Windows has a different globbing grammar. It won't work on http either, so there are API differences for different protocols.
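One way to picture the "path plus separate pattern" idea is below. Note the swap: this sketch matches names on the client with Python's `fnmatch`, which sidesteps the server-side grammar problem and is not the server-side globbing that `get` relied on. Both function names are hypothetical, and only a wildcard in the last path component is handled.

```python
import fnmatch
import posixpath


def split_path_pattern(path):
    """Split '/somedir/*' into ('/somedir', '*').

    A path whose last component has no wildcard is returned as
    (path, None). Wildcards in earlier components are not handled.
    """
    base, last = posixpath.split(path)
    if any(c in last for c in '*?['):
        return base or '.', last
    return path, None


def match_names(names, pattern):
    """Filter file names client-side with Unix shell-style matching."""
    if pattern is None:
        return list(names)
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]
```

With this split, the existing cd-then-ls logic can keep working on the base directory, and the pattern is applied to the listing afterwards, the same way for every protocol.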
I'm mostly documenting that there is at least one case where `get` is useful, not necessarily saying we need to work on re-implementing it.
We just had a problem with a v2 SFTP poll, where we are polling a directory that contains many symlinks to files.
Previously, this directory presumably did not contain symlinks, so the poll worked fine. Sarra v2 would list the directory, the directory listing would report the actual file sizes, and everything worked.
The source made a change, so that the directory listing reports the size of the symlinks in the directory.
e.g.
Which is much smaller than the actual file sizes. I worked around this problem by migrating the v2 sarra to sr3 and turning on `acceptSizeWrong`.
If we do `ls /somedir/*`, the SFTP server would return the correct file sizes:
An alternative solution would have been changing the v2 poll to use `get *` for each directory. But I didn't do this because I knew it wouldn't work with sr3.
I also tried `path /somedir/*` on sr3, but that fails because it tries to cd to that directory: