-
-
Notifications
You must be signed in to change notification settings - Fork 389
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds zip downloads for all of a document's attachments #1471
Conversation
6d68c6d
to
001004b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick q about access control, not a full review since this is still draft.
app/server/lib/ActiveDoc.ts
Outdated
@@ -959,6 +960,32 @@ export class ActiveDoc extends EventEmitter { | |||
return data; | |||
} | |||
|
|||
public async getAttachmentsArchive(): Promise<Archive> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no access control here. Enough ActiveDoc methods take responsibility for access control that I think you might want to add a comment that you're not doing access control (or take an OptDocSession and do access control). Specifically, so far the caller just has a canView, which isn't a very high bar. The easy path would be to use ActiveDoc.canDownload. More advanced would be to check view permissions on every attachment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense! Minor comments so far. Ping me when ready for re-review and everything done. Will make time to kick tires and try it out.
app/server/lib/ActiveDoc.ts
Outdated
const filesSeen = new Set<string>(); | ||
for (const attachment of attachments) { | ||
if (filesSeen.has(attachment.fileIdent)) { | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we've agreed to not dedupe attachments that happen to have the same hash (but may have different names)
app/server/lib/ActiveDoc.ts
Outdated
name: attachment.fileName, | ||
data: file.contentStream, | ||
}); | ||
// TODO - Abort this on shutdown by throwing an error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a TODO here.
app/server/lib/Archive.ts
Outdated
|
||
export interface Archive { | ||
dataStream: stream.Readable, | ||
completed: Promise<void> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you format the end of lines of your interfaces consistently?
app/server/lib/Archive.ts
Outdated
|
||
return { | ||
dataStream: archive, | ||
// TODO - Should we add a default 'catch' here that logs errors? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a TODO here, I don't quite understand it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now, this promise swallows errors if there isn't a .catch
or await
used by the calling code. I'm not sure if we should add some default logging here just to be safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment claims that await stream.promises.finished(archive);
will throw if there's an error? With the error caught error? If that's true, that seems enough to me. A docstring saying that the caller owns the Archive promise could be worthwhile just to draw attention? But apart from that, it would be just normal to not leave any promise you are given unhandled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect - thanks for clarifying!
app/server/lib/AttachmentStore.ts
Outdated
import * as fse from 'fs-extra'; | ||
import * as stream from 'node:stream'; | ||
import * as path from 'path'; | ||
import {Readable} from 'node:stream'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh man, I'm really disappointed with this one - I did a review of imports before posting the PR for review.
test/server/lib/Archive.ts
Outdated
} | ||
} | ||
|
||
describe('create_zip_archive', function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you wrap everything in a describe Archive? It is helpful heuristically to have a top level test name that matches the filename. Not 100% done in codebase but done enough that it is annoying when not done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A note from testing.
app/server/lib/DocApi.ts
Outdated
res.status(200) | ||
.type("application/zip") | ||
// Construct a content-disposition header of the form 'attachment; filename="NAME"' | ||
.set('Content-Disposition', contentDisposition(`${activeDoc.docName}.zip`, {type: 'attachment'})) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave this a try. Everything worked well. The name was a bit jarring, compared to other downloads associated with a document. What do you think about a more human friendly name, that rhymes with what other /download
methods do?
grist-core/app/server/lib/DocApi.ts
Lines 1834 to 1844 in 14b9147
private async _getDownloadFilename(req: Request, tableId?: string, optDoc?: Document): Promise<string> { | |
let filename = optStringParam(req.query.title, 'title'); | |
if (!filename) { | |
// Query DB for doc metadata to get the doc data. | |
const doc = optDoc || await this._dbManager.getDoc(req); | |
const docTitle = doc.name; | |
const suffix = tableId ? (tableId === docTitle ? '' : `-${tableId}`) : ''; | |
filename = docTitle + suffix || 'document'; | |
} | |
return filename; | |
} |
Something a bit weird seems to happen if I download with multiple distinct files that have the same name, but we've talked about changing the name structure so no need to poke at that...
app/server/lib/ActiveDoc.ts
Outdated
for (const attachment of attachments) { | ||
if (filesSeen.has(attachment.fileIdent)) { | ||
continue; | ||
if (doc._shuttingDown) { | ||
throw new ApiError("Document is shutting down, archiving aborted", 500); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like this approach, because there's not technically an ActiveDoc dependency here - there's a DocStorage dependency, which happens to be shut down when ActiveDoc is. I'd like us to refactor shutdown handling at some point across ActiveDoc and all dependencies, but this feels okay for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Spoffy !
app/server/lib/ActiveDoc.ts
Outdated
!await this._granularAccess.canReadEverything(docSession) && | ||
!await this.canDownload(docSession) | ||
) { | ||
throw new ApiError("No access to download attachments", 403); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps "Insufficient access"
test/server/lib/DocApi.ts
Outdated
{...chimpy, responseType: 'arraybuffer'}); | ||
assert.equal(resp.status, 200); | ||
assert.deepEqual(resp.headers['content-type'], 'application/zip'); | ||
assert.deepEqual(resp.headers['content-disposition'], `attachment; filename="TestDocExternalAttachments-Attachments.zip"`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some long line linting around here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Spoffy !
Context
When using internal storage for attachments, all attachments are included in the .grist file when a document is downloaded. They can be extracted if needed using standard SQLite tools.
When using external storage, attachments are stored outside of Grist (e.g. MinIO, S3, filesystem) and can't be easily retrieved.
Proposed solution
This PR adds an API endpoint to download a .zip of all attachments in the document.
This is streamed where possible, resulting in no more than a single file being kept in memory at once (due to SQLite restrictions - external storage keeps no copies in memory).
Related issues
#1021
Has this been tested?