CloudFlare challenge solver support #173
…into cfsolver-support
Thank you for your effort. This would be great for fixing ulozto-downloader. As for point 3, I think it would be best to create a Docker image and provide the whole package of all apps at once. It's already complicated to get auto captcha working when installing from PIP on some systems. For example, I can't get it to work on Debian 12 Bookworm because I get an "Illegal instruction" error when running ulozto-downloader. EDIT: Got it working on my Windows machine. But still affected by #172 whether HEADLESS is set to false or true. |
* Removed leftover code from cloudscraper * Use more consts where appropriate
This was some leftover code from the previous version; it is fixed in 99178d6. This, however, should not have affected the downloader. I think that was related to Flaresolverr sometimes taking too long (several minutes) to launch the browser and resolve the captcha. |
Could you perhaps check if there is anything relevant in the Flaresolverr logs? You can also try to set |
I can't see any problem except there is only one POST request to flaresolverr.
and ulozto-downloader log
|
That is the first request to test whether the service is reachable. For some reason it does not attempt to connect to Flaresolverr any further. This might be due to an unexpected response from the ulozto website. What is the file you are downloading and what are the exact arguments passed to the downloader? |
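For readers unfamiliar with the Flaresolverr interface, here is a minimal sketch of what a request to it looks like. It assumes Flaresolverr's documented defaults (the `/v1` endpoint on port 8191 and the `request.get` command); the helper name `build_flaresolverr_request` is hypothetical, and the thread does not show what the downloader's reachability check actually sends. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumption: Flaresolverr's default endpoint on the local machine.
FLARESOLVERR_ENDPOINT = "http://localhost:8191/v1"


def build_flaresolverr_request(target_url: str, max_timeout_ms: int = 60000) -> urllib.request.Request:
    """Build (but do not send) a POST asking Flaresolverr to fetch target_url."""
    payload = {
        "cmd": "request.get",          # ask the solver to load a page in its browser
        "url": target_url,             # page that may be behind the challenge
        "maxTimeout": max_timeout_ms,  # how long the solver may work, in milliseconds
    }
    return urllib.request.Request(
        FLARESOLVERR_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_flaresolverr_request("https://example.com/file/abc")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a matter of `urllib.request.urlopen(request)` against a running instance; a single POST in the Flaresolverr log followed by nothing else means no such solve request was ever issued.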
and my Python version is |
Could you try to add also |
Yes. It's working now with this argument. |
OK, I'll try to find out what's wrong and fix it over the weekend. |
Unfortunately still not working for me.
and there are no requests for FlareSolverr - just that first POST. With I think it'll be a problem with my network. I'm trying it at work, where we use WPAD for proxy autoconfiguration, so maybe the FlareSolverr browser is affected by it. I'll try it at home in a Windows VM and let you know. EDIT: Just tried on my home network on Windows 10 and it's still the same. Works only with |
What about using FlareSolverr in Docker? Should it work now? You struck out 1 and 2 in Major issues but I can't get it to work.
|
@pkejval Are you sure you got the latest commit? If so, could you try to add a "print(r.text)" around this line to see what is coming back from Ulozto? 97a8d9b#diff-3f44b382e24d924148e720b5937197e1aaac47aa7a552ae30dc40841f1f395e4L317 For some reason the presence of the Cloudflare challenge is not detected, therefore the downloader never tries to use Flaresolverr. The "detection" is currently done by searching for the pattern "Just a moment..." in the response: 97a8d9b#diff-3f44b382e24d924148e720b5937197e1aaac47aa7a552ae30dc40841f1f395e4R235 Regarding Docker, I'm using it in my Linux environment as well and it works fine now. From your log it seems the container is not able to reach the Tor proxy. The Tor proxy listens on localhost and by default accepts connections from localhost only, thus the container needs to run with the |
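A minimal sketch of the kind of detection described above, assuming it is a plain substring search on the response body. The function and constant names are hypothetical and are not taken from the linked commit.

```python
# Marker text quoted in the comment above; the constant name is illustrative.
CHALLENGE_MARKER = "Just a moment..."


def looks_like_cloudflare_challenge(html: str) -> bool:
    """Return True if the response body contains the challenge marker."""
    return CHALLENGE_MARKER in html


challenge_page = "<html><head><title>Just a moment...</title></head></html>"
normal_page = "<html><head><title>Download page</title></head></html>"

print(looks_like_cloudflare_challenge(challenge_page))
print(looks_like_cloudflare_challenge(normal_page))
```

A check like this is brittle by design: if the site returns anything other than the expected interstitial (a different title, an error page, a redirect), the challenge goes undetected and the solver is never invoked, which matches the symptom reported in this thread.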
Yes, Added print(r.text) at page.py line 322 and without
|
I assembled a Docker container containing all the required software (FlareSolverr, Chromium and ulozto-downloader) using your PR branch. https://github.com/pkejval/uld-docker We can use it for testing, and maybe in the future I'll PR it as an official Docker image. |
@pkejval I managed to reproduce your case and the Also, thanks for the Dockerfile! I don't have a strict opinion on how the application should be distributed, however I believe it should be able to gather and install its dependencies (at least with pip if not itself), so there should be no need for manually installing and starting prerequisites by the user. I'll try to work on that as well eventually. |
@filo891 I can confirm it works without |
I think being a viewer of a GitHub project is advanced as it is. For a home server administrator it shouldn't be that hard, at least for those who have already found this project :) With a detailed guide it could be done even blindfolded :) |
The last two commits implement the reuse of the This way Flaresolverr is invoked only once, the first time a CloudFlare challenge is detected. Please test and report any issues. Inspired by #157 (comment) (thanks @vladodriver). |
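The "invoke the solver only once" idea can be sketched as a small thread-safe cache. This is an illustration under stated assumptions, not the code from the commits: the class name `ChallengeSolutionCache`, the `solver` callable, and the shape of the cached value (a dict) are all hypothetical, and the thread does not preserve exactly what is being reused.

```python
import threading
from typing import Callable, Dict, Optional


class ChallengeSolutionCache:
    """Call an expensive solver at most once and share its result afterwards."""

    def __init__(self, solver: Callable[[str], Dict[str, str]]) -> None:
        self._solver = solver              # hypothetical wrapper around the solver service
        self._solution: Optional[Dict[str, str]] = None
        self._lock = threading.Lock()      # download parts may run in parallel threads
        self.solver_calls = 0

    def get(self, url: str) -> Dict[str, str]:
        """Return the cached solution, solving the challenge on first use only."""
        with self._lock:
            if self._solution is None:
                self.solver_calls += 1
                self._solution = self._solver(url)
            return self._solution


def fake_solver(url: str) -> Dict[str, str]:
    # Stand-in for a real solver call; returns a made-up value.
    return {"solved_for": url}


cache = ChallengeSolutionCache(fake_solver)
first = cache.get("https://example.com/part1")
second = cache.get("https://example.com/part2")
print(cache.solver_calls, first is second)
```

Holding the lock while solving means concurrent download parts wait for the first solution instead of each launching their own browser, at the cost of serialising that first call.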
I am getting this error after the update |
I pushed a second commit a few minutes ago (3699624). Is the error coming from this one? |
No, it is from a4b3008. I didn't notice the second commit. |
I can confirm that the only reasonable solution for getting past CF (WIP) right now is https://github.com/FlareSolverr/FlareSolverr . It works, it is maintained, and it also runs headless (including in terminals without X or Wayland; it uses Xvfb on Linux). |
To move this further, I'm thinking of also implementing a "Manual mode" for solving the CloudFlare challenge. The user would be presented with a browser window where they could solve the CloudFlare captcha, and then the cookie would be extracted and used further in the download process (similar to the manual mode used with the Ulozto captchas). The Flaresolverr interface would then be left as optional for advanced users who require full automation. I have done a quick test using pywebview, and it does provide the functionality, except that I'm not able to bypass the CloudFlare challenge when it's launched with the default arguments (the challenge is displayed in a loop indefinitely). If someone has more experience and can suggest what tweaks are needed for the webview component to be accepted by CloudFlare, I'd be grateful. I can then implement the rest of the functionality. |
This pull request has been WIP for about 2 months. It would be cool to wrap it up somehow into a minimal functional state, without adding things that could be added later. What everyone needs right now is something that works at least somewhat reliably. |
And this already does (very reliably). |
? |
It works reliably and is in functional state. |
Thanks for the feedback. I have polished the README file and removed the "WIP" flag from the pull request. I agree it works reliably now, although there is still some room for improvement. But it's definitely progress compared to the latest release. However, I leave the final approval and merging to @setnicka. |
It used to work flawlessly, but there is some problem with solving CF challenges now, as @VasekPurchart mentioned in an issue in the uld-docker repo. Log when not working:
When using --enforce-tor it ends after first try:
|
Try to run Flaresolverr with the I'm seeing some different behaviour on Ulozto side in handling requests from abroad. While before German IPs were blocked with an appropriate error message ( Additionally, when using |
Good day, could you please tell me which fork currently works with faster downloading? Someone in the old threads recommended using an older version, but there are so many threads by now that I really have no idea what works. Thanks for the help! I am also kind of confused about which one to choose and how to install it once I have chosen :( |
Given that from tomorrow ulož.to will stop operating as a public file storage, there is probably no point anymore in working out which version or fork currently works. We will see in the coming days how the server itself behaves.
(Sent by email in reply to Mikkauser's comment of 29. 11. 2023 15:58:36.)
|
@Mikkauser None. If you're lucky, you might get a few download threads going. |
Technically the service only changes from 01.12., so that's one extra day :D Given the number of discussion threads, one got the impression that some fork works at least in a limited way, that sometimes you have to type in the captcha manually, or that sometimes it takes longer for the download threads to catch on, but that something works for quick use. There is also another fork, written up in English, by the user filo891. It's easy to get lost in all this. Has anyone tried all of them? From a distance it looks as if someone always figured out something new and got it working. |
Have you tried all of them, please? Some of the new forks are already described in considerable technical depth, supercookie and the like. Does none of those forks work at all? Not even with typing in the captcha? Thanks! |
The one from filo891 is the newest. It no longer has anything to do with the captcha; those were probably older threads. The problem now is that uložto blocks connections through Tor. |
I use gDebrid. It's a Discord server, and through some bypass of theirs you can instantly download gigabytes from uložto. Of course, it says there that it can only handle 3 GB/24h, but from my personal experience it is definitely more |
I'd definitely give it a try, do you have a link? The problem is that I would need at least 20 GB or so. |
This is an attempt to bypass the CloudFlare WAF (web application firewall) using Flaresolverr. This is a potential fix for issues #157, #170 and #172.
The code is not yet stable and not ready to be merged to Master, contributions to make it mergeable are more than welcome :).
How to run this version of the Ulozto Downloader
This version of the Downloader requires an instance of Flaresolverr running in headful mode on localhost.
On Linux
Major issues remaining to solve
1. Flaresolverr only seems to be able to resolve the CloudFlare challenge for Ulozto when running in headful mode. The recommended way to run Flaresolverr is via a Docker container, but both this and running the executable in the default headless mode cause the challenge solver to time out (the ulozto/cloudflare website does not display the challenge until it detects user interaction with the browser). Running the executable with the environment variable HEADLESS=false results in Chromium browser windows popping up and the CloudFlare challenges being successfully resolved when the Chromium window is focused. Fixing this will most likely require updates in the Flaresolverr code itself.
2. Once we get Flaresolverr to be usable in headless mode, an automatic download/update feature shall be implemented in the Ulozto Downloader, so the user does not need to worry about setting up third-party dependencies.
3. The Flaresolverr binary currently only works on newer versions of Linux (Python 3.11 dependency on GLIBC_2.35). For easy usability of the Downloader we shall find a way to embed it so that it is compatible with everything the Downloader itself is compatible with.
The first point seems to have been my misunderstanding, and Flaresolverr actually works well in headless mode too, both on Windows and on Linux inside a Docker container. What I have noticed, though, is that sometimes it can take several minutes until the browser is loaded, while normally the browser starts up and resolves the captcha in around 15-30 seconds.
As the Flaresolverr challenge resolution can take around 30-60 seconds per download part, it takes quite long to start downloading at a decent speed. A feature for parallelizing the challenge resolution via multiple parallel Tor circuits could improve this.
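A rough sketch of what such parallelization could look like, assuming one solver call per Tor circuit and a thread pool to run them concurrently. Everything here is hypothetical: `solve_in_parallel`, the `solve` callable, the proxy addresses and the shape of the result are placeholders, and the stub solver only sleeps to stand in for the real 30-60 second wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def solve_in_parallel(
    proxies: List[str],
    solve: Callable[[str], Dict[str, str]],
    max_workers: int = 4,
) -> Dict[str, Dict[str, str]]:
    """Run one challenge resolution per proxy concurrently, keyed by proxy."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with proxies.
        results = list(pool.map(solve, proxies))
    return dict(zip(proxies, results))


def fake_solve(proxy: str) -> Dict[str, str]:
    # Stand-in for a real solver call routed through the given proxy.
    time.sleep(0.1)
    return {"proxy": proxy, "token": "token-for-" + proxy}


# Assumption: each address is a separate Tor SOCKS listener, i.e. its own circuit.
tor_proxies = ["socks5://127.0.0.1:9050", "socks5://127.0.0.1:9051", "socks5://127.0.0.1:9052"]

started = time.monotonic()
solutions = solve_in_parallel(tor_proxies, fake_solve)
elapsed = time.monotonic() - started

print(len(solutions), "solutions in", round(elapsed, 2), "s")
```

With the stub, three resolutions finish in roughly the time of one. Whether a single Flaresolverr instance handles several concurrent browser sessions well is not established in this thread and would need testing.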