Ability to quickly extract individual entry data without iterating through the entry list #87
Comments
Hey, thanks for the feedback! I use UnrarKit and UnzipKit in my own comic reader, for what it's worth. I'm curious if you're noticing specific performance problems, or if it's just a lot of logging. The logging shouldn't materially slow down production apps, but if you don't turn it off in Xcode, debugging can definitely become slow. It also isn't scanning through the whole file every time to find the file headers. It does read from RAR's equivalent of the ZIP format's central directory. I could imagine reading the directory into memory and then using that to seek to files, and it might potentially save some amount of time, but at the expense of a larger memory footprint. If you do have a specific case that's taking a long time, please send it my way. I'm sure there are ways that the library could be made more efficient, but I definitely follow the path of avoiding premature optimization. Can you send me an archive that's taking longer to extract a file using UnrarKit than another library or app?
@abbeycode Thanks for the info, u are right, I shouldn't judge too early. I will try to profile in I will share my profile results later. If it's really due to the log then we can close this issue; if it's not, I will try to generate a small RAR archive with many empty files in it and send it to u.
@abbeycode I ran some rough profiling, and the result is really interesting. I used a test RAR file with about 900 pictures in it, here is the result: Here is my test archive (190MB), it's not applying any compression, all imgs just All code running in It's really a huge difference and it did surprise me... and the time also increases as the archive's entry count grows, since u need to iterate through more entries each time u try to extract one. And somehow I failed to suppress the verbose logs by using
Instead of using the compiled version of I will come back to u if I need some help, thank u!
I made it!!! I record the The total time saving is remarkable for random individual extraction now, it's less than 10s, and I think we can do even better if we keep the RAR archive open until all individual extraction operations are finished, I will try that later. I will submit a PR or something to show u how I did it later, but I only know a little bit about Objective-C & C++, so the code will probably be terribly ugly. :P
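The idea in the comment above (record where each entry lives once, then seek straight to it while keeping the archive open) can be sketched roughly as follows. This is only an illustration in Python, not UnrarKit's or unrar's code: the record layout, `write_toy_archive`, and `IndexedReader` are all hypothetical names invented for this sketch, and the real RAR format is considerably more involved.

```python
import io
import struct

# Hypothetical toy layout (NOT the RAR format): each record is
# [name_len: uint16][data_len: uint32][name bytes][data bytes]
HEADER = struct.Struct("<HI")


def write_toy_archive(entries):
    """Serialize a dict of name -> bytes into the toy layout above."""
    out = io.BytesIO()
    for name, data in entries.items():
        encoded = name.encode("utf-8")
        out.write(HEADER.pack(len(encoded), len(data)))
        out.write(encoded)
        out.write(data)
    return out.getvalue()


class IndexedReader:
    """Scans the headers once, then serves every extraction with a direct seek."""

    def __init__(self, stream):
        self._stream = stream      # stays open for the reader's lifetime
        self._index = {}           # name -> (data offset, data length)
        self.headers_scanned = 0   # counts header reads, for illustration
        self._build_index()

    def _build_index(self):
        self._stream.seek(0)
        while True:
            header = self._stream.read(HEADER.size)
            if len(header) < HEADER.size:
                break  # end of archive
            name_len, data_len = HEADER.unpack(header)
            name = self._stream.read(name_len).decode("utf-8")
            self._index[name] = (self._stream.tell(), data_len)
            self._stream.seek(data_len, io.SEEK_CUR)  # skip the payload
            self.headers_scanned += 1

    def extract(self, name):
        offset, length = self._index[name]  # no iteration over entries
        self._stream.seek(offset)
        return self._stream.read(length)


pages = {f"page{i:03d}.jpg": bytes([i % 256]) * 8 for i in range(900)}
reader = IndexedReader(io.BytesIO(write_toy_archive(pages)))
first = reader.extract("page000.jpg")
last = reader.extract("page899.jpg")
```

The trade-off is the one mentioned earlier in the thread: the index costs memory proportional to the number of entries, in exchange for skipping the per-extraction scan.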
That’s great. I can work with you to refine your PR, but I’d say since the point of this change would be for a performance gain, I’d need to see a unit test up front that I can use to compare the new approach to the old approach. There are plenty of examples already in the codebase. I use the RAR command line tool to generate archives with large files and with large numbers of files. This Apple article (specifically the “Write a Performance Test” section) can show you how to write a performance test. I can work with you to refine the PR, or if you’d prefer, I could ultimately merge it into an intermediate branch and refine it myself. I look forward to seeing what you come up with!
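For readers who want a feel for what such an old-versus-new comparison measures, here is a rough stand-in written in Python with `timeit`. It is not an XCTest performance test and does not touch UnrarKit; `find_by_scan`, `build_index`, and `compare` are hypothetical helpers that only model "look up an entry by iterating" against "look up an entry through a prebuilt index".

```python
import timeit

# Hypothetical entry list standing in for an archive with 10K entries
ENTRY_NAMES = [f"page{i:05d}.jpg" for i in range(10_000)]


def find_by_scan(names, target):
    """Old approach: walk the entry list and compare names one by one."""
    for position, name in enumerate(names):
        if name == target:
            return position
    return -1


def build_index(names):
    """New approach: build a name -> position map once, up front."""
    return {name: position for position, name in enumerate(names)}


def compare(target, repeats=20):
    """Time both lookups for the same target; returns (scan, index) seconds."""
    index = build_index(ENTRY_NAMES)
    # Both approaches must agree before their timings are worth comparing
    assert find_by_scan(ENTRY_NAMES, target) == index[target]
    scan_seconds = timeit.timeit(
        lambda: find_by_scan(ENTRY_NAMES, target), number=repeats
    )
    index_seconds = timeit.timeit(lambda: index[target], number=repeats)
    return scan_seconds, index_seconds


scan_seconds, index_seconds = compare("page09999.jpg")
print(f"scan: {scan_seconds:.6f}s  index: {index_seconds:.6f}s")
```

Checking that both approaches return the same result before timing them mirrors the point of the request above: a performance claim is only useful if the new path is shown to be equivalent to the old one.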
@abbeycode I just pushed my PR, it would be great if u can spend some time to help me refine & review it. It's my first time trying to modify code at such a low level, and if nothing is broken that would be a miracle to me haha :P. Anyway, the PR is here: I tested with a couple of my test archives,
Thanks for providing such a good project! Everything about RAR just works as it's meant to, and my first iOS app will get a lot of enhancements from your work, thank u so much again!
One little thing though, I don't know if it's a RAR limit or it's something we can overcome.
Right now we can select and extract individual entries from the Archive file, these functions can be used:
Both accept the `filename` of the selected entry as a parameter. The problem is that every time I call these functions, they seem to iterate through all entries in the archive and locate the corresponding entry by the `filename` parameter (with an `==` comparison, I guess), and this slows down extraction, especially if my RAR archive has more than 10K entries...
My app is a comic reader that can open ZIP or RAR format comic books. Ideally, the user can view the book pictures as soon as they click on the archive file, and I don't need to extract the entire archive; individual extraction happens frequently during comic page switching.
In a ZIP archive, each entry has a useful attribute called `relative offset to local file header`, which basically tells u where to start looking for the entry's content on disk. Once u have that attribute for every entry (by reading them from the `Central Directory` section of the ZIP archive), u can extract an individual entry very fast.
I'm wondering if `unrar` has a similar attribute? If not, then I have to extract the entire RAR archive ahead of time to achieve the best user experience, and that's a bit of a waste of CPU & storage resources...
And from the screenshot we can see that `UnrarKit` always seems to use the pattern: `open > locate entry > extract > close`. Is there any way to keep the archive open? I need to extract entries over and over again.
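To make the ZIP side of this concrete, here is a small sketch using Python's standard `zipfile` module, which exposes the central directory's offset field as `ZipInfo.header_offset`. It only illustrates the ZIP behaviour described above and says nothing about what RAR or UnrarKit offer; `build_offset_index` and `local_header_signature` are hypothetical helper names.

```python
import io
import zipfile


def build_offset_index(archive):
    """Map each entry name to the offset of its local file header.

    infolist() is filled from the central directory when the archive is
    opened, so no entry data is read here.
    """
    return {info.filename: info.header_offset for info in archive.infolist()}


def local_header_signature(raw, offset):
    """Return the 4 bytes found at `offset`, where a local header should begin."""
    return raw[offset:offset + 4]


# Build a small uncompressed archive in memory, similar to a comic book file
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as zf:
    for page in range(3):
        zf.writestr(f"page{page:03d}.jpg", bytes([page]) * 16)

raw = buffer.getvalue()
with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    index = build_offset_index(zf)
    # zipfile uses the same directory data internally to seek to the entry
    page_one = zf.read("page001.jpg")

for name, offset in index.items():
    print(name, offset, local_header_signature(raw, offset))
```

Every recorded offset points at a local file header, which begins with the signature bytes `PK\x03\x04`; that is what lets a reader jump to one entry without walking past the others.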
Reference:
https://en.wikipedia.org/wiki/Zip_(file_format)#Central_directory_file_header