This repository has been archived by the owner on Jan 27, 2024. It is now read-only.

CloudFlare challenge solver support #173

Open
wants to merge 16 commits into base: master

Conversation

filo891
Collaborator

@filo891 filo891 commented Aug 13, 2023

This is an attempt to bypass the CloudFlare WAF (web application firewall) using Flaresolverr. This is a potential fix for issues #157, #170 and #172.

image

The code is not yet stable and not ready to be merged to Master, contributions to make it mergeable are more than welcome :).

How to run this version of the Ulozto Downloader

This version of the Downloader requires an instance of Flaresolverr running in headful mode on localhost.

  1. Download the Flaresolverr binary from https://github.com/FlareSolverr/FlareSolverr/releases
  2. Start the service:
    On Linux
    ./flaresolverr
    
    On Windows (cmd.exe)
    flaresolverr.exe
    
    On Linux via Docker (attached to Host network)
    docker run -d \
      --name=flaresolverr \
      -p 8191:8191 \
      -e LOG_LEVEL=info \
      --restart unless-stopped \
      --network host \
      ghcr.io/flaresolverr/flaresolverr:latest
    
  3. Run the Ulozto Downloader from this branch. It will automatically use the Flaresolverr service to proxy requests to Ulozto.
    python3 ulozto-downloader.py -t --parts-progress https://uloz.to/file/qDF5FrZU4GBi/debian-live-11-6-0-amd64-xfce-iso
    

Major issues remaining to solve

  1. Flaresolverr only seems to be able to resolve the CloudFlare challenge for Ulozto when running in headful mode.
    The recommended way to run Flaresolverr is via a Docker container, but both that and running the executable in its default headless mode cause the challenge solver to time out (the ulozto/cloudflare website does not display the challenge until it detects user interaction with the browser). Running the executable with the environment variable HEADLESS=false results in Chromium browser windows popping up, and the CloudFlare challenges are resolved successfully once the Chromium window is focused.
    Fixing this will most likely require changes in the Flaresolverr code itself.

  2. Once we get Flaresolverr to be usable in headless mode, an automatic download/update feature shall be implemented in the Ulozto Downloader, so the user does not need to worry about setting up third-party dependencies.

  3. The Flaresolverr binary currently only works on newer versions of Linux (Python 3.11 dependency on GLIBC_2.35). For easy usability of the Downloader we shall find a way to embed it in a way that it will be compatible with everything the Downloader itself is compatible with.

  4. The first point seems to have been my misunderstanding: Flaresolverr actually works well in headless mode too, both on Windows and on Linux inside a Docker container. What I have noticed, though, is that it can sometimes take several minutes until the browser is loaded, while normally the browser starts up and resolves the captcha in around 15-30 seconds.

  5. As the Flaresolverr challenge resolution can take around 30-60 seconds per download part, it takes quite long to start downloading at a decent speed. A feature for parallelizing the challenge resolution via multiple parallel Tor circuits could improve this.
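
A rough sketch of the parallelization idea from point 5. It assumes a hypothetical `solve_challenge(part_index)` callable that resolves one challenge over its own Tor circuit and returns the resulting cookies; neither that callable nor `solve_all_parts` exists in this branch. The point is only that a thread pool lets the 30-60 second waits overlap instead of adding up.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def solve_all_parts(
    solve_challenge: Callable[[int], Dict[str, str]],
    parts: int,
    max_workers: int = 4,
) -> List[Dict[str, str]]:
    """Resolve one challenge per download part, several at a time.

    `solve_challenge` is a hypothetical callable: it is assumed to block
    until the challenge for the given part index is solved and to return
    the cookies obtained for that part.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in part order even if they finish out of order.
        return list(pool.map(solve_challenge, range(parts)))
```

Whether Flaresolverr itself copes well with several concurrent browser sessions is a separate question that this sketch does not answer.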

@pkejval
Contributor

pkejval commented Aug 16, 2023

Thank you for your effort. This would be great for fixing ulozto-downloader.

As for point 3, I think it would be best to create a Docker image and provide the whole package of all apps at once. On some systems it is already complicated to get auto captcha working when installing from PIP. For example, I can't get it to work on Debian 12 Bookworm because I get an "Illegal instruction" error when running ulozto-downloader.

EDIT: Got it working on my Windows machine. But it is still affected by #172 whether HEADLESS is set to false or true.

@Scavy

Scavy commented Aug 17, 2023

I have been testing this since yesterday, and today I got this error:

image

The download continues, but it stops processing new parts.

I've never had any error resembling this, so I guess it must be related to this PR.

* Removed leftover code from cloudscraper
* Use more consts where appropriate
@filo891
Collaborator Author

filo891 commented Aug 17, 2023

I have been testing this since yesterday, and today I got this error:

The download continues, but it stops processing new parts.

I've never had any error resembling this, so I guess it must be related to this PR.

This was some leftover code from the previous version; it is fixed in 99178d6. It should not have affected the downloader, however. I think that was related to Flaresolverr sometimes taking too long (several minutes) to launch the browser and resolve the captcha.

@filo891
Collaborator Author

filo891 commented Aug 17, 2023

EDIT: Got it working on my Windows machine. But it is still affected by #172 whether HEADLESS is set to false or true.

Could you perhaps check if there is anything relevant in the Flaresolverr logs? You can also try to set LOG_LEVEL=debug.

@pkejval
Contributor

pkejval commented Aug 18, 2023

I can't see any problem except there is only one POST request to flaresolverr.

C:\Users\user\Downloads\flaresolverr_windows_x64\flaresolverr>flaresolverr.exe
2023-08-18 06:58:34 INFO     ReqId 17268 FlareSolverr 3.3.2
2023-08-18 06:58:34 DEBUG    ReqId 17268 Debug log enabled
2023-08-18 06:58:34 INFO     ReqId 17268 Testing web browser installation...
2023-08-18 06:58:34 INFO     ReqId 17268 Platform: Windows-10-10.0.22621-SP0
2023-08-18 06:58:34 INFO     ReqId 17268 Chrome / Chromium path: C:\Users\user\Downloads\flaresolverr_windows_x64\flaresolverr\chrome\chrome.exe
2023-08-18 06:58:34 INFO     ReqId 17268 Chrome / Chromium major version: 115
2023-08-18 06:58:34 INFO     ReqId 17268 Launching web browser...
2023-08-18 06:58:34 DEBUG    ReqId 17268 Launching web browser...
2023-08-18 06:58:34 DEBUG    ReqId 17268 Started executable: `C:\Users\user\appdata\roaming\undetected_chromedriver\chromedriver.exe` in a child process with pid: 12744
2023-08-18 06:58:35 INFO     ReqId 17268 FlareSolverr User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36
2023-08-18 06:58:35 INFO     ReqId 17268 Test successful!
2023-08-18 06:58:35 INFO     ReqId 17268 Serving on http://0.0.0.0:8191
2023-08-18 06:58:44 INFO     ReqId 24808 Incoming request => POST /v1 body: {'cmd': 'sessions.list'}
2023-08-18 06:58:44 DEBUG    ReqId 24808 Response => POST /v1 body: {'status': 'ok', 'message': '', 'sessions': [], 'startTimestamp': 1692334724705, 'endTimestamp': 1692334724705, 'version': '3.3.2'}
2023-08-18 06:58:44 INFO     ReqId 24808 Response in 0.0 s
2023-08-18 06:58:44 INFO     ReqId 24808 127.0.0.1 POST http://127.0.0.1:8191/v1 200 OK

and ulozto-downloader log

CAPTCHA protected download - CAPTCHA challenges will be displayed
[TOR]   TOR started
[Link solve]    TOR get new CAPTCHA (timeout 30)
[Link solve]    ERROR: Cannot parse CAPTCHA image URL from the page. Changing Tor circuit.
[Link solve]    TOR get new CAPTCHA (timeout 30)
[Link solve]    ERROR: Cannot parse CAPTCHA image URL from the page. Changing Tor circuit.
[Link solve]    TOR get new CAPTCHA (timeout 30)
[Link solve]    ERROR: Cannot parse CAPTCHA image URL from the page. Changing Tor circuit.
[Link solve]    TOR get new CAPTCHA (timeout 30)
[Link solve]    ERROR: Cannot parse CAPTCHA image URL from the page. Changing Tor circuit.
[Link solve]    TOR get new CAPTCHA (timeout 30)
[Link solve]    ERROR: Cannot parse CAPTCHA image URL from the page. Changing Tor circuit.
[Link solve]    TOR get new CAPTCHA (timeout 30)

@filo891
Collaborator Author

filo891 commented Aug 18, 2023

I can't see any problem except there is only one POST request to flaresolverr.

That is the first request to test whether the service is reachable. For some reason it does not attempt to connect to Flaresolverr any further. This might be due to an unexpected response from the ulozto website.
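
For illustration, a reachability probe of this kind could look roughly like the sketch below. The request body (`{'cmd': 'sessions.list'}`), the `/v1` path, port 8191 and the `'status': 'ok'` field are taken from the Flaresolverr log above; the function names and the use of `urllib` are assumptions and not necessarily what the downloader does.

```python
import json
import urllib.request
from typing import Any

# Endpoint as it appears in the log above; adjust if the service runs elsewhere.
DEFAULT_ENDPOINT = "http://127.0.0.1:8191/v1"


def build_probe_request(endpoint: str = DEFAULT_ENDPOINT) -> urllib.request.Request:
    """Build the POST request that asks Flaresolverr for its session list."""
    body = json.dumps({"cmd": "sessions.list"}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def is_reachable(endpoint: str = DEFAULT_ENDPOINT, timeout: float = 5.0) -> bool:
    """Return True if the service answers the probe with status 'ok'."""
    try:
        with urllib.request.urlopen(build_probe_request(endpoint), timeout=timeout) as resp:
            reply: Any = json.load(resp)
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return False
    return isinstance(reply, dict) and reply.get("status") == "ok"
```

A single successful probe only shows that the service is up; it says nothing about whether a challenge was ever handed to it, which matches the single POST seen in the log.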

What is the file you are downloading and the exact arguments passed to the downloader?

@pkejval
Contributor

pkejval commented Aug 18, 2023

python ulozto-downloader.py --auto-captcha 'https://uloz.to/file/G9lP6ENk8kXZ/rick-a-morty-s05e08-rickternal-friendshine-of-the-spotless-mort-1080p-web-dl-cz-dabing-mkv#!ZGN2ZQR2ZmOwBGRlZwqvZJWvATL2MScLnmt3GIOcoKVjJQZ5At=='

and my Python version is Python 3.11.4 (tags/v3.11.4:d2340ef, Jun 7 2023, 05:45:37) [MSC v.1934 64 bit (AMD64)] on win32

@filo891
Collaborator Author

filo891 commented Aug 18, 2023

Could you also try adding --enforce-tor?

@pkejval
Contributor

pkejval commented Aug 18, 2023

Yes. It's working now with this argument.

@filo891
Collaborator Author

filo891 commented Aug 18, 2023

OK, I'll try to find out what's wrong and fix it over the weekend.

@filo891
Collaborator Author

filo891 commented Aug 19, 2023

@pkejval everything should be working now without the need to set --enforce-tor in the latest version on 97a8d9b.

@pkejval
Contributor

pkejval commented Aug 21, 2023

Unfortunately still not working for me.

python ulozto-downloader.py --auto-captcha 'https://uloz.to/file/G9lP6ENk8kXZ/rick-a-morty-s05e08-rickternal-friendshine-of-the-spotless-mort-1080p-web-dl-cz-dabing-mkv#!ZGN2ZQR2ZmOwBGRlZwqvZJWvATL2MScLnmt3GIOcoKVjJQZ5At=='
Starting downloading for url 'https://uloz.to/file/G9lP6ENk8kXZ/rick-a-morty-s05e08-rickternal-friendshine-of-the-spotless-mort-1080p-web-dl-cz-dabing-mkv#!ZGN2ZQR2ZmOwBGRlZwqvZJWvATL2MScLnmt3GIOcoKVjJQZ5At=='
Getting info (filename, filesize, …)
Downloading into: './Rick.a.Morty.S05E08.Rickternal.Friendshine.of.the.Spotless.Mort.1080p.WEB-DL.CZ-dabing.mkv'
CAPTCHA protected download - CAPTCHA challenges will be displayed
[TOR]   TOR started
[Link solve]    TOR get new CAPTCHA (timeout 30)
[Link solve]    ERROR: Cannot parse CAPTCHA image URL from the page. Changing Tor circuit.
[Link solve]    TOR get new CAPTCHA (timeout 30)
[Link solve]    ERROR: Cannot parse CAPTCHA image URL from the page. Changing Tor circuit.

and there are no requests for FlareSolverr - just that first POST. With --enforce-tor it is working great.

I think it'll be a problem with my network. I'm trying it at work, where we use WPAD for proxy autoconfiguration; maybe the FlareSolverr browser is affected by it. I'll try it at home in a Windows VM and let you know.

EDIT: Just tried on my home network on Windows 10 and it is still the same. It works only with --enforce-tor.
But I think it's not much of a problem... You could just force --enforce-tor when using cf-solver :)

@pkejval
Contributor

pkejval commented Aug 21, 2023

What about using FlareSolverr in Docker? Should it work now? You struck out 1 and 2 in Major issues, but I can't get it to work.

python ulozto-downloader.py --enforce-tor --cf-endpoint 'http://192.168.0.4:8191/v1' --auto-captcha 'https://uloz.to/file/G9lP6ENk8kXZ/rick-a-morty-s05e08-rickternal-friendshine-of-the-spotless-mort-1080p-web-dl-cz-dabing-mkv#!ZGN2ZQR2ZmOwBGRlZwqvZJWvATL2MScLnmt3GIOcoKVjJQZ5At=='
Starting downloading for url 'https://uloz.to/file/G9lP6ENk8kXZ/rick-a-morty-s05e08-rickternal-friendshine-of-the-spotless-mort-1080p-web-dl-cz-dabing-mkv#!ZGN2ZQR2ZmOwBGRlZwqvZJWvATL2MScLnmt3GIOcoKVjJQZ5At=='
Getting info (filename, filesize, …)
[TOR]   TOR started 
Cloudflare WAF detected, initializing automated Cloudflare Solver (timeout 90s).
Cannot download file: Cloudflare solver error: Error: Error solving the challenge. Message: unknown error: net::ERR_PROXY_CONNECTION_FAILED
 (Session info: chrome=115.0.5790.98)
Stacktrace:
#0 0x55ea96b7e053 <unknown>
#1 0x55ea9689b4d8 <unknown>
#2 0x55ea96898ec8 <unknown>
#3 0x55ea968848ba <unknown>
#4 0x55ea96886120 <unknown>
#5 0x55ea96884c5c <unknown>
#6 0x55ea96883c59 <unknown>
#7 0x55ea96883af6 <unknown>
#8 0x55ea968825bb <unknown>
#9 0x55ea96882a14 <unknown>
#10 0x55ea9689e101 <unknown>
#11 0x55ea9691b49c <unknown>
#12 0x55ea969032c2 <unknown>
#13 0x55ea9691af2a <unknown>
#14 0x55ea96903083 <unknown>
#15 0x55ea968cdfce <unknown>
#16 0x55ea968cf362 <unknown>
#17 0x55ea96b4c786 <unknown>
#18 0x55ea96b5029d <unknown>
#19 0x55ea96b4fd36 <unknown>
#20 0x55ea96b50745 <unknown>
#21 0x55ea96b56ecb <unknown>
#22 0x55ea96b50ac6 <unknown>
#23 0x55ea96b2796d <unknown>
#24 0x55ea96b6a2c5 <unknown>
#25 0x55ea96b6a46e <unknown>
#26 0x55ea96b7802f <unknown>
#27 0x7f3a423d7ea7 start_thread
Terminating download. Please wait for stopping all threads.
Download terminated.

@filo891
Collaborator Author

filo891 commented Aug 21, 2023

Unfortunately still not working for me.

@pkejval Are you sure you got the latest commit? If so, could you try to add a "print(r.text)" around this line to see what is coming back from Ulozto? 97a8d9b#diff-3f44b382e24d924148e720b5937197e1aaac47aa7a552ae30dc40841f1f395e4L317

For some reason the presence of the Cloudflare challenge is not detected, therefore the downloader never tries to use Flaresolverr. The "detection" is currently done by searching for the pattern "Just a moment..." in the response: 97a8d9b#diff-3f44b382e24d924148e720b5937197e1aaac47aa7a552ae30dc40841f1f395e4R235
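
As an illustration of the detection described here, a minimal sketch; the function and constant names are made up for this example, and the code in the linked commit may differ in detail.

```python
# Marker string mentioned above; Cloudflare puts it in the page title
# of its interstitial ("Just a moment...").
CHALLENGE_MARKER = "Just a moment..."


def looks_like_cf_challenge(html: str) -> bool:
    """Return True if the response body contains the challenge marker."""
    return CHALLENGE_MARKER in html
```

A plain substring test like this is brittle: it misses the challenge if Cloudflare changes or localizes the title, and it gives a false positive if a regular page happens to contain the same phrase.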

Regarding Docker, I'm using it in my Linux environment as well and it works fine now. From your log it seems the container is not able to reach the Tor proxy. The Tor proxy listens on localhost and by default accepts connections from localhost only, so the container needs to run with the --network host argument (this is not the same startup command that Flaresolverr recommends). Could you confirm you started it like that?
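
One way to check this from the environment where Flaresolverr runs is a plain TCP connection test against the Tor SOCKS port. The sketch below is only a diagnostic aid; the helper name is made up, and port 9050 in the usage note is Tor's usual SOCKS port, assumed here rather than confirmed for this setup.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False
```

Running `can_connect("127.0.0.1", 9050)` inside the container should return True when the container shares the host network and Tor is up. A False result would be consistent with the ERR_PROXY_CONNECTION_FAILED error.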

@pkejval
Contributor

pkejval commented Aug 21, 2023

Yes, git rev-parse --short HEAD shows that I'm at 97a8d9b

Added print(r.text) at page.py line 322 and without --enforce-tor I got this:

<!DOCTYPE html>
<html lang="en-US">

<head>
    <title>Just a moment...</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=Edge">
    <meta name="robots" content="noindex,nofollow">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <link href="/cdn-cgi/styles/challenges.css" rel="stylesheet">
</head>

<body class="no-js">
    <div class="main-wrapper" role="main">
        <div class="main-content"><noscript>
                <div id="challenge-error-title">
                    <div class="h2"><span class="icon-wrapper">
                            <div class="heading-icon warning-icon"></div>
                        </span><span id="challenge-error-text">Enable JavaScript and cookies to continue</span></div>
                </div>
            </noscript></div>
    </div>
    <script>(function () { window._cf_chl_opt = { cvId: '2', cZone: 'uloz.to', cType: 'managed', cNounce: '30707', cRay: '7fa0fb7fccb3d8fe', cHash: '0023b20f028c7c1', cUPMDTk: "\/download-dialog\/free\/download?fileSlug=G9lP6ENk8kXZ&__cf_chl_tk=0ffHfkeAWyGHVvroBl_hcJ0okFrZ1_kCXGWoNpCHrEY-1692601052-0-gaNycGzNCuU", cFPWv: 'g', cTTimeMs: '1000', cMTimeMs: '0', cTplV: 5, cTplB: 'cf', cK: "", fa: "/download-dialog/free/download?fileSlug=G9lP6ENk8kXZ&amp;__cf_chl_f_tk=0ffHfkeAWyGHVvroBl_hcJ0okFrZ1_kCXGWoNpCHrEY-1692601052-0-gaNycGzNCuU", md: "WPwYy30i63eZzuvpP1fTMdtWbaFDbqUne.ogq.Z6czE-1692601052-0-AfPCo318CbxEF_R65qNQeWBZeDbH5vBT8txADAxm233Jk6P5H00CTuB9T_k5Oe2BHTIZC-fhTlFCzepI_cd5wnemxzhM1uCdCzvjXhjXVniG9Q5bek3a0nlpG7O5TuPK8ah2k-Tkzyl1GEoTLW_Co_y-ZyloCvbsg1JU7HJFZdBhXMbBMss7xbujAFLQ1WGhdHSrt1i3eJjuz_6b2PkPLuR1_Qx-mYRmFtqJcFmFQs8jBIhdjRAeyCAqOUkckqjUBhzPzLW3x7xTxXtXVDp2StLl_5SJjww1h79COA2LDaGdIaMhP3izqFRKuJ1mOHNHmHqNElx2KVvVUo3EIHSmetNZSQC8CaJuPIyDcKWJMznIZ0xK0vBJf3xNFm1sfcsoztUXGnG2Ry6f5nsVRTfJKSvOBlsuE3L6GqpMeJr7xHscAiOUc_X06Pv_hPHz1GJtDtCZUC2wQ8liVcYht4n6mf8ygKKvfmT6sk7gmncegPeqPk8JNlWcP8wFrLco9E9_UyVrKrwJNBjdsCGJpxHMtmK0k6RbDvsotDDmMiGdmON5DknA1UbBR6ZWIDDDA3eAQF9kiPo_9soalvaI-zjq0A7lO8LA9ZH9gEZyex0RFXpe8Nb7d25g3l8uN2X3UTuvy7x6sHnvfyZD7DhHt0wX4DZctO9K526Kk33mo4gCxesiPyMDNllIkCFzVw4GZkWIaVJwGWZ4ekkGJjC7OHNQJphDjhVtYZHV68LUxo_pNxAI6EG9PUoCyY5Nur0HrEu-bcia-o9bXn_hqBSd26wbvFKCCvTRahMeGsVoJUiu4smiOrN5smDqqSCMB71uN990wN_RErFtBZSoWNjCAhVlwc05Dm9TYV2W5evvVpkUHpmutSut8v5DayQ5il_Heiv8lxSpiNr6HELudoWQ7pkGZ6r70I1M2ECD1oM2uYLIRtc0C5613g9BZztT7dy5HoFT-Q_jqMpAa0xoZW6KR-VEbIJ7BvNtldoS-PHuxz7YLkH1IoPqLAD_Lo8Y5F__tl_aLkrbyMVlZIxjHoS5TXClPeUMpvz4kCT2ocUBDjO3llTdwnYVqWACdwDmi5p8ZcjsMEN9KyyfLEDtCqawl2UYE_ydFl_TbBCmTgZuXusojs36EVZJlhuTM0EYl98n8wi-rxsaacXlOSgNpWPFidQHVBPDt2T9TpfYlKrnUviW4aPMkkolEVt1OXzkvRE28_yuQ3AEgGTIHca8vxbyaXi6-29uCTSAj5b40w8zh91udrF3hn7lJ1pQzFzg0HmB9CinxFN8T4ElESKa7_F-N8GLMWoxlnQs1E6z6kqYq2FPmNEpbr90o2Opt18nbP6WI-qY5iGY8PRwk82abWNYuc4VxhN3FzekgOvkChSG0-m912JFHqVwutNyzkX
5Bu3mrgEh97jf12sXqWQZMa5eKgmpsNMOgwqWF2fly3U4ErYGfTA-vLSRENY0C4WsL4q94eEvS4b3_-Cfia4b_cuRystc45PTr7H40ztKPm6Q4RYIC8ACjmEiiIeEZcEtCOsLm5mJErd-e3FPR6QwZ3-28NQiba4_2nHRE-W6sk174LDgjkBBE1fxM-qHWY8bJaJ0K3ryQgVfHgpq_9fgGSnO_5opyoF2hpUrwwp8572eekLzn_3BR0JQKYmRu5b5aQkcPMf_QMYa4sekGkbghOGNYsnlC7H103qu0jUXOdTP0RhuvwQJBFVPB-oFokHOKsWD08hk5sfEMVA1dxJWRSlcioRLoKCpt_AAuLCDGSoMbL-dtMcLECUvTHwh0ENfza9VO5k896vr6Do2THQbfkOrz9UWPdPhq3SbvJL2_dPLuMiD5NKuJ_9C-5TiKBIT3SL0KhbTlRji_NQ0_ha_HB56aXm5p8Kz1xZIe9c0skQbhsPpW2uE_pdYkehUrbHsP5Tv-OFQAisMKFcukdp4qboV0hXOrmF3Hx5bjWO9tyUvxBhr-MBzLj_vPp5q9TDfs_MlXn-uEjwSQTsfHt0B_Zb6rfc8lTTyZAvFCAgj0qg2ubSDY7VU0OrLRf7OLOsA3s96Z3FAmn5BE1DVArWdbvI4GpbRh_Gj6H660P2fWHz6t8KW5q1ZbxFulNF5mk5SznlhzkGKkRZWCV-lTwbo4Ho5n3Lnw3SCxOGkjhquFJvtQFGknIDJ8FRDt3cfhcWrHAua3bwFYvCdZINYc0MwSUBALdEj9GEHhQit0i0JAb2wC6g81UHeQqBBtlb26m3nhT8CQPbUEzfJ1ReK0k9S6f7qHjBUZWM8otpBXiqr74sNtIEu1UfRcd66LDS-WMsFbjjlXNz-GFy83egKR6zo8zDfUnS2GIFulS5ez1J5c1oKUKg-madN5HTiODkxbD0DUhgVhwWm1ifuHOLKFFfu2zJEpfUVrGHLNMiKOfSDPPMSjXh6dqqSpl1b0djNvXp3kvsnJiZUgy5Z6Z_ocwxCgC0174nGquNziNU7eZmAaksWdMW4T6GboYTiGP46bk9ClsJjMCMsXtrA2y90L69BNOkzamBfRkIOSho354bSHsUmJRkC3E4hDltLuD1O9hr9Elx85tEn7PI6IgcK1c7aBDmtcahmecUU4O9AWtVYm1vNAPy2mQzjilRy7lKRyUJ3Yi-OLdorBFoBUhRD6UHZHX0LKjvL6zUAEjCkd9RgcXT4v-LxPENirjcyxaT-pvakSOnRQWJ5w37qgTYTjuMIe6sBhUmdewqhOSbqdy9PIgH_ecZahLo_", cRq: { ru: 'aHR0cHM6Ly91bG96LnRvL2Rvd25sb2FkLWRpYWxvZy9mcmVlL2Rvd25sb2FkP2ZpbGVTbHVnPUc5bFA2RU5rOGtYWg==', ra: 'cHl0aG9uLXJlcXVlc3RzLzIuMzEuMA==', rm: 'R0VU', d: 'WvTiyYegpusFBCHHzUwh1kQdeX4uMkrl80uSDBb6Rlgsks5jFhGtNOpQDFdezuwmSdLpHhlHuxaYy9tbiAiT+QDuN8fL6X90aKEFbqM/N3K3BThdLNTVGcOAUTgpLIx3aM5+e4rKa8URlADo2yZRXjo5WAMFdiB9hZgP4dMP9f2T2jLYLEvyz04Gi8A9w2idCOAf+le2K57kE4MvbuxiWnRdv7HpYpA53GKcmdxBL/1/Mdldd5spv0T+yVYMTeTwS54fl9qMvMhV89vDoWrXNsEw3bgYJgxmic8pM3q9Dsd6s5YdO2OyIjy6048zD/Wn5O4d9fXFWFwFGX3i082bhjk5rMGU16HLA4q+M2VGgy34yXsgXjz5J36EfY0l4bf4Q1yfpBMVuT8whvcVsbP/BwlmueXPoKujPjk+z5Qhbvd1mCN8tHWEmiC2RqyThQ5ACVFy9rFGQPWe0VfSPB7cdtGX0PLV7YNA5c1L+eVjdNE=', t: 
'MTY5MjYwMTA1Mi4xMjQwMDA=', cT: Math.floor(Date.now() / 1000), m: 'kAbKx2s2XkXXpbzNnn6BuEfbjWtdYgQqGcKJ/BYd2dQ=', i1: 'g6cK+wLYOSUeZ/i4yWkh8w==', i2: 'RQopdvr8hJoJxpqUK6cAkg==', zh: 'H6H5rT46MdJEduO2EFVWUYu6Mz0W/6o6lKBs5jFOnDc=', uh: 'YE9XOpG5TeHmhA1zfs5mxC8CrRZzq2a/+r+OU7dliYQ=', hh: 'AZxN1L+Nck6+Yo5cCT418B4s2dJxrgUeCciQcMYDIbA=', } }; var cpo = document.createElement('script'); cpo.src = '/cdn-cgi/challenge-platform/h/g/orchestrate/chl_page/v1?ray=7fa0fb7fccb3d8fe'; window._cf_chl_opt.cOgUHash = location.hash === '' && location.href.indexOf('#') !== -1 ? '#' : location.hash; window._cf_chl_opt.cOgUQuery = location.search === '' && location.href.slice(0, location.href.length - window._cf_chl_opt.cOgUHash.length).indexOf('?') !== -1 ? '?' : location.search; if (window.history && window.history.replaceState) { var ogU = location.pathname + window._cf_chl_opt.cOgUQuery + window._cf_chl_opt.cOgUHash; history.replaceState(null, null, "\/download-dialog\/free\/download?fileSlug=G9lP6ENk8kXZ&__cf_chl_rt_tk=0ffHfkeAWyGHVvroBl_hcJ0okFrZ1_kCXGWoNpCHrEY-1692601052-0-gaNycGzNCuU" + window._cf_chl_opt.cOgUHash); cpo.onload = function () { history.replaceState(null, null, ogU); }; } document.getElementsByTagName('head')[0].appendChild(cpo); }());</script>
</body>

</html>

@pkejval
Contributor

pkejval commented Aug 21, 2023

I assembled a Docker container with all the required software (FlareSolverr, Chromium and ulozto-downloader) using your PR branch: https://github.com/pkejval/uld-docker. We can use it for testing, and maybe in the future I'll PR it as an official Docker image.

@filo891
Collaborator Author

filo891 commented Aug 21, 2023

@pkejval I managed to reproduce your case, and --enforce-tor should finally no longer be needed as of ef8e7a5.

Also, thanks for the Dockerfile! I don't have a strict opinion on how the application should be distributed, however I believe it should be able to gather and install its dependencies (at least with pip if not itself), so there should be no need for manually installing and starting prerequisites by the user. I'll try to work on that as well eventually.

@pkejval
Contributor

pkejval commented Aug 22, 2023

@filo891 I can confirm it works without --enforce-tor now. Good job! :)

@Pheggas

Pheggas commented Sep 15, 2023

@Vojtak42 And your point is? The user has to do the same steps on Windows as on Linux. As @setnicka pointed out, the user needs to:

  • Install Python
  • Download ulozto-downloader from pip with tensorflow
  • Install Tor
  • Install and run Flaresolverr

In an ideal world it should all be included in one executable file, ready to run without installing anything. I think the easiest way to do it now is a Docker image, but that is suitable only for advanced users.

I think being a viewer of a GitHub project is advanced as it is. Speaking as a home server administrator, it shouldn't be that hard for those who have already found this project :) With a detailed guide it could be done even blindfolded :)

@Pheggas

Pheggas commented Sep 15, 2023

Dockerizing everything is also an option, but I'd rather do that in a "sibling" project and keep this one for the application only.

This has been done by @pkejval in his repo. However, I'm not sure whether he'll be able to keep up with newer versions of the downloader.

@filo891
Collaborator Author

filo891 commented Sep 22, 2023

The last two commits implement the reuse of the cf_clearance cookie which speeds up the startup of the download threads significantly (back to the original performance before Flaresolverr was introduced).

This way Flaresolverr is invoked only once, the first time a CloudFlare challenge is detected.
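
The idea can be sketched as a small shared cache, assuming a hypothetical `solve` callable that runs Flaresolverr and returns the clearance data. The class below is illustrative only and is not the code from these commits.

```python
import threading
from typing import Callable, Dict, Optional


class ClearanceCache:
    """Share one solved clearance (cookie + User-Agent) between download threads."""

    def __init__(self, solve: Callable[[], Dict[str, str]]) -> None:
        # `solve` is a hypothetical callable that asks Flaresolverr to solve
        # the challenge and returns e.g. {"cf_clearance": ..., "user_agent": ...}.
        self._solve = solve
        self._lock = threading.Lock()
        self._value: Optional[Dict[str, str]] = None

    def get(self) -> Dict[str, str]:
        """Return the cached clearance, solving the challenge only on first use."""
        with self._lock:
            if self._value is None:
                self._value = self._solve()
            return self._value
```

Holding the lock while solving means that threads arriving during the first solve wait for it and then reuse its result instead of each starting a browser of their own.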

Please test and report any issues.

Inspired by #157 (comment) (thanks @vladodriver).

@Vojtak42
Contributor

Vojtak42 commented Sep 22, 2023

I am getting this error after the update:
C:\Users\Vojta>ulozto-downloader --parts-progress https://uloz.to/file/mRR2Fg2aDeBM/aquaman-2018-1080p-25fps-h264-128kbit-aac-mkv
[Autodetect] tensorflow.lite available, using --auto-captcha
Starting downloading for url 'https://uloz.to/file/mRR2Fg2aDeBM/aquaman-2018-1080p-25fps-h264-128kbit-aac-mkv'
Getting info (filename, filesize, …)
Downloading into: './Aquaman (2018) (1080p_25fps_H264-128kbit_AAC).mkv'
You are lucky, this is slow direct download without CAPTCHA :)
[TOR] TOR started
[Link solve] TOR get downlink (timeout 30)
[Link solve] Direct download does no seem to work, trying with captcha resolution instead...
[Link solve] TOR get new CAPTCHA (timeout 30)
[Link solve] ERROR: Cannot parse CAPTCHA image URL from the page. Changing Tor circuit.
[Link solve] TOR get new CAPTCHA (timeout 30)
[Link solve] ERROR: Cannot parse CAPTCHA image URL from the page. Changing Tor circuit.
[Link solve] TOR get new CAPTCHA (timeout 30)

@filo891
Collaborator Author

filo891 commented Sep 22, 2023

I pushed a second commit a few minutes ago (3699624). Is the error coming from this one?

@Vojtak42
Contributor

Vojtak42 commented Sep 22, 2023

No, it is from a4b3008. I didn't notice the second commit.

@vladodriver
Contributor

I can confirm that the only reasonable solution for getting through cf WIP right now is https://github.com/FlareSolverr/FlareSolverr. It works, it is maintained, and it also runs headless (including in a terminal without X or Wayland; it uses Xvfb on Linux).
More good news: a single cf_clearance cookie for the domain plus the matching User-Agent is enough. From what I have observed so far it lasts 24 hours, and that one cookie is sufficient for every request. So the cf_clearance cookie could be stored, and Flaresolverr would be started again to obtain a new one only when the cookie expires and another challenge appears.
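
A minimal sketch of that store-and-refresh idea, assuming a hypothetical `fetch` callable that runs Flaresolverr and returns a fresh cf_clearance value. The 24-hour lifetime is the observation from this comment, not a documented guarantee, and the class itself is illustrative only.

```python
import time
from typing import Callable, Optional, Tuple

DAY_SECONDS = 24 * 60 * 60  # observed lifetime, an assumption rather than a guarantee


class ExpiringCookie:
    """Keep one cookie and fetch a new one only after it has expired."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl: float = DAY_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # hypothetical: runs the solver, returns the cookie value
        self._ttl = ttl
        self._clock = clock
        self._entry: Optional[Tuple[str, float]] = None  # (cookie, time obtained)

    def get(self) -> str:
        """Return the stored cookie, fetching a new one if none or expired."""
        now = self._clock()
        if self._entry is None or now - self._entry[1] >= self._ttl:
            self._entry = (self._fetch(), now)
        return self._entry[0]

    def invalidate(self) -> None:
        """Forget the cookie, e.g. when a new challenge shows up before the TTL."""
        self._entry = None
```

Since the cookie is only valid together with the matching User-Agent, a real implementation would need to store both; that is left out here for brevity.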

@filo891
Collaborator Author

filo891 commented Sep 24, 2023

To move this further, I'm thinking of also implementing a "Manual mode" for solving the CloudFlare challenge. The user would be presented with a browser window where they could solve the CloudFlare captcha; the cookie would then be extracted and used further in the download process (similar to the manual mode used with the Ulozto captchas). The Flaresolverr interface would then be left as optional for advanced users who require full automation.

I have done a quick test using pywebview, and it does provide the functionality, except I'm not able to bypass the CloudFlare challenge when it's launched with the default arguments (The challenge is displayed in a loop indefinitely). If someone has more experience and can suggest what tweaks are needed for the webview component to be accepted by CloudFlare, I'd be grateful. I can then implement the rest of the functionality.

@SpiReCZ
Contributor

SpiReCZ commented Oct 15, 2023

This pull request has been WIP for about 2 months. It would be cool to wrap it up somehow in a minimal functional state, without adding things that could be added later. What everyone needs right now is something that works at least somewhat reliably.

@Vojtak42
Contributor

Vojtak42 commented Oct 15, 2023

And this already does (very reliably).

@SpiReCZ
Contributor

SpiReCZ commented Oct 27, 2023

?

@Vojtak42
Contributor

It works reliably and is in a functional state.

@filo891 filo891 marked this pull request as ready for review October 29, 2023 18:56
@filo891 filo891 changed the title WIP: CloudFlare challenge solver support CloudFlare challenge solver support Oct 29, 2023
@filo891
Collaborator Author

filo891 commented Oct 29, 2023

Thanks for the feedback. I have polished the README file and removed the "WIP" flag from the pull request.

I agree it works reliably now, although there is still some room for improvement. But it's definitely progress compared to the latest release.

However, I leave the final approval and merging to @setnicka.

@pkejval
Contributor

pkejval commented Nov 18, 2023

It used to work flawlessly, but there is now some problem with solving CF challenges, as @VasekPurchart mentioned in an issue in the uld-docker repo.
For some files the CF challenge can be obtained, for others it cannot. I can't tell whether it's a FlareSolverr issue or cfsolver itself.
Maybe they are blocking TOR connections?

Log when not working:

2023-11-18 09:28:22 INFO     Incoming request => POST /v1 body: {'maxTimeout': 90000, 'proxy': {'url': 'socks5://127.0.0.1:9050'}, 'session': 'd0f0eafc-85f4-11ee-a47d-0242ac11000b', 'url': 'https://uloz.to/download-dialog/free/download?fileSlug=TRvOKPgL3CpG', 'cmd': 'request.get'}

2023-11-18 09:28:36 INFO     Challenge not detected!

When using --enforce-tor it ends after first try:

Starting ulozto-downloader for https://uloz.to/file/TRvOKPgL3CpG/jara-cimrman-lezici-spici-komedie-cssr-1983-1080p-jackripper-mp4#!ZGN4BQR2ZGxkMzRkBJSzAmyvMwSvAHqjDmAlGSSHn0qYpTR3Zt==
2023-11-18 09:33:38.134303: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2023-11-18 09:33:38.174195: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2023-11-18 09:33:38.174246: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2023-11-18 09:33:38.175715: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2023-11-18 09:33:38.183105: I external/local_tsl/tsl/cuda/cudart_stub.cc:31] Could not find cuda drivers on your machine, GPU will not be used.
2023-11-18 09:33:38.183392: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-11-18 09:33:39.076042: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2023-11-18 09:33:39 INFO     Incoming request => POST /v1 body: {'cmd': 'sessions.list'}
2023-11-18 09:33:39 INFO     Response in 0.001 s
2023-11-18 09:33:39 INFO     127.0.0.1 POST http://127.0.0.1:8191/v1 200 OK
2023-11-18 09:33:51 INFO     Incoming request => POST /v1 body: {'cmd': 'sessions.create', 'proxy': {'url': 'socks5://127.0.0.1:9050'}}
2023-11-18 09:33:51 INFO     Response in 0.565 s
2023-11-18 09:33:51 INFO     127.0.0.1 POST http://127.0.0.1:8191/v1 200 OK
2023-11-18 09:33:51 INFO     Incoming request => POST /v1 body: {'maxTimeout': 90000, 'proxy': {'url': 'socks5://127.0.0.1:9050'}, 'session': '953d1dd6-85f5-11ee-92c9-0242ac11000b', 'url': 'https://uloz.to/file/TRvOKPgL3CpG/jara-cimrman-lezici-spici-komedie-cssr-1983-1080p-jackripper-mp4', 'cmd': 'request.get'}
2023-11-18 09:34:03 INFO     Challenge not detected!
2023-11-18 09:34:03 INFO     Response in 11.674 s
2023-11-18 09:34:03 INFO     127.0.0.1 POST http://127.0.0.1:8191/v1 200 OK
status=initializing
2023-11-18 09:34:03 INFO     Incoming request => POST /v1 body: {'cmd': 'sessions.destroy', 'session': '953d1dd6-85f5-11ee-92c9-0242ac11000b'}
ulozto-downloader is initializing...
tor=TOR started
Tor: TOR started
2023-11-18 09:34:03 INFO     Response in 0.111 s
2023-11-18 09:34:03 INFO     127.0.0.1 POST http://127.0.0.1:8191/v1 200 OK
11/18/23 09:34:04 - Done downloading https://uloz.to/file/TRvOKPgL3CpG/jara-cimrman-lezici-spici-komedie-cssr-1983-1080p-jackripper-mp4#!ZGN4BQR2ZGxkMzRkBJSzAmyvMwSvAHqjDmAlGSSHn0qYpTR3Zt==
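
For readers following the log, the three Flaresolverr calls it shows (sessions.create, request.get, sessions.destroy) can be summarised as request bodies. The field names and values below are copied from the log lines above; the helper itself is hypothetical, and the session id is simply passed in here, whereas in the log it presumably originates from the create step.

```python
from typing import Any, Dict, List

TOR_PROXY = "socks5://127.0.0.1:9050"  # proxy URL as it appears in the log


def lifecycle_payloads(
    url: str,
    session: str,
    proxy: str = TOR_PROXY,
    max_timeout_ms: int = 90000,
) -> List[Dict[str, Any]]:
    """Return the three POST /v1 bodies in the order they appear in the log."""
    return [
        # 1. Open a browser session that goes through the Tor proxy.
        {"cmd": "sessions.create", "proxy": {"url": proxy}},
        # 2. Load the page in that session and wait for a challenge, if any.
        {
            "cmd": "request.get",
            "url": url,
            "session": session,
            "proxy": {"url": proxy},
            "maxTimeout": max_timeout_ms,
        },
        # 3. Close the session again.
        {"cmd": "sessions.destroy", "session": session},
    ]
```

In the failing run the second call comes back with "Challenge not detected!", so the interesting part is the HTML that request.get returned, not the session handling around it.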

@filo891
Collaborator Author

filo891 commented Nov 19, 2023

Try to run Flaresolverr with the LOG_LEVEL=debug environment variable set. It will then print the actual HTML response from Ulozto which should give us some indication about the actual error. I assume you are using the latest version of Flaresolverr (3.3.10).

I'm seeing some different behaviour on the Ulozto side in handling requests from abroad. Whereas German IPs were previously blocked with an appropriate error message (The page is blocked due to the decision of the authorities in your area), the page has now changed and says The file has been marked as private instead. Perhaps this now also applies to non-German IPs, and that is the reason for the new behaviour.

Additionally, when using --enforce-tor I'm still experiencing the problem described in issue #183, where the error from Flaresolverr is explicitly saying CloudFlare is blocking the exit IP.

@Mikkauser

Good day, could someone please tell me which fork currently works with faster downloading? Someone in the old threads recommended using a lower version, but there are so many threads by now that I really have no idea what works. There are a great many threads here :( Thanks for the help! Hello, which fork or version is currently working, please? I am kind of confused about which one to choose and how to install it after choosing :(

@pkejval
Contributor

pkejval commented Nov 29, 2023 via email

@Vojtak42
Contributor

Vojtak42 commented Nov 29, 2023

@Mikkauser None. If you are lucky, you might get a few download threads going.
@pkejval Actually it would make sense, if you want to download something before it stops working.

@Mikkauser

Given that ulož.to will stop operating as public storage as of tomorrow, there is probably no point anymore in working out which version or fork currently works. We'll see in the coming days how the server itself behaves. 29. 11. 2023 15:58:36 Mikkauser @.***>:

Good day, could someone please tell me which fork currently works with faster downloading? Someone in the old threads recommended using a lower version, but there are so many threads by now that I really have no idea what works. There are a great many threads here :( Thanks for the help! Hello, which fork or version is currently working, please? I am kind of confused about which one to choose and how to install it after choosing :(

Technically the service only changes as of 01.12., so that is one extra day :D Given the number of threads, one gets the impression that some fork works at least to a limited extent, perhaps with the captcha having to be typed in manually now and then, or with the download threads sometimes taking longer to get going, but that something works for quick use. There is also another fork, written in English, from the user filo891. One gets lost in all of it. Has anyone tried all of them? From a distance it looks as if someone has always figured out something new and got it working.

@Mikkauser

@Mikkauser None. If you are lucky, you might get a few download threads going. @pkejval Actually it would make sense, if you want to download something before it stops working.

Have you tried all of them, please? Some of the newer forks are described in quite some technical depth, supercookie and the like. Does none of those forks work at all? Not even with typing in the captcha by hand? Thanks!

@Vojtak42
Contributor

The one from filo891 is the newest. It no longer has anything to do with the captcha; those were probably older threads. The problem now is that uložto blocks connections made through Tor.

@Pheggas

Pheggas commented Nov 30, 2023

I use gDebrid. It's a Discord server, and through some bypass of theirs you can instantly download gigabytes from uložto. Of course it says there that it can only handle 3 GB/24h, but in my personal experience it is definitely more.

@Vojtak42
Contributor

Vojtak42 commented Nov 30, 2023

I would definitely try it, do you have a link? The problem is that I would need at least about 20 GB.
