🔥 RedNote Link Extraction/Content Collection Tool: extract links to works an account has published, favorited, or liked; extract works links and user links from search results; collect RedNote works information; extract RedNote works download addresses; download watermark-free RedNote works files!
🔥 "RedNote", "XiaoHongShu" and "小红书" have the same meaning, and this project is collectively referred to as "RedNote".
⭐ This project is completely free and open-source, with no paid features. Please do not be deceived!
⭐ Due to the author's limited time, the English documentation may lag behind and contain outdated content. Parts of it are machine-translated and may be inaccurate, so please refer to the Chinese documentation when in doubt. Contributions to the translation are warmly welcome.
- Program Features
- ✅ Collect RedNote works information
- ✅ Extract RedNote works download addresses
- ✅ Download RedNote watermark-free works files
- ✅ Download RedNote livePhoto files (non-watermark-free)
- ✅ Automatically skip already downloaded works files
- ✅ Works file integrity handling mechanism
- ✅ Customizable image works file download format
- ✅ Persistently store works information to files
- ✅ Store works files to a separate folder
- ✅ Background clipboard monitoring for works download
- ✅ Record downloaded works IDs
- ✅ Support command line for downloading works files
- ✅ Read cookies from browser
- ✅ Customizable file name format
- ✅ Support API call functionality
- ✅ Support file breakpoint resume download
- ✅ Intelligent recognition of works file types
- Script Features
- ✅ Download RedNote watermark-free works files
- ✅ Extract discovery page works links
- ✅ Extract account-published works links
- ✅ Extract account-favorited works links
- ✅ Extract account-liked works links
- ✅ Extract account-board works links
- ✅ Extract search result works links
- ✅ Extract search result user links
⭐ The development plan and progress of XHS-Downloader can be found at Projects
🎥 Click the images to watch the demo video
https://www.xiaohongshu.com/explore/WorksID?xsec_token=XXX
https://www.xiaohongshu.com/discovery/item/WorksID?xsec_token=XXX
https://xhslink.com/ShareCode
Supports entering multiple works links at once, separated by spaces; the program will automatically extract valid links without additional processing!
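The link extraction described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the regular expressions and the `extract_links` helper are assumptions based only on the three link formats listed above.

```python
import re

# Hypothetical patterns derived from the three link formats listed above;
# the project's real matching rules may differ.
LINK_PATTERNS = (
    re.compile(r"https?://www\.xiaohongshu\.com/explore/[^\s/?#]+(?:\?\S*)?"),
    re.compile(r"https?://www\.xiaohongshu\.com/discovery/item/[^\s/?#]+(?:\?\S*)?"),
    re.compile(r"https?://xhslink\.com/\S+"),
)


def extract_links(text: str) -> list[str]:
    """Return every whitespace-separated token that matches a known link format."""
    links = []
    for token in text.split():
        if any(pattern.fullmatch(token) for pattern in LINK_PATTERNS):
            links.append(token)
    return links
```

Tokens that match none of the patterns (plain words, unrelated URLs) are simply dropped, which mirrors the "no additional processing" behavior described above.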
⭐ It is recommended to use the Windows Terminal (default terminal for Windows 11) to run the program for the best display effect!
If you only need to download watermark-free works files, it is recommended to choose Program Run; if you have other needs, it is recommended to choose Source Code Run!
Starting from version 2.2, if the project functions normally, there is no need to handle cookies separately!
⭐ Users of macOS and Windows 10 or later can download the program package from Releases, unzip it, open the program folder, and double-click main to run it.
⭐ This project includes GitHub Actions for manually building executable files. Users can use GitHub Actions to build the latest source code into executable files at any time!
Note: The executable file main for macOS may need to be launched from the terminal command line. Due to device limitations, the macOS executable file has not been tested and its availability cannot be guaranteed!
If you use the program in this way, the default download path for files is: .\_internal\Download; the configuration file path is: .\_internal\settings.json
- Get Image
  - Method 1: Build the image using the Dockerfile
  - Method 2: Pull the image using the command docker pull joeanamier/xhs-downloader
- Create Container
  - TUI Mode: docker run -it joeanamier/xhs-downloader
  - API Mode: docker run -it joeanamier/xhs-downloader python main.py server
- Run Container
  - Start Container: docker start -i ContainerName/ContainerID
  - Restart Container: docker restart ContainerName/ContainerID
When running the project via Docker, the command line call mode is not supported. The clipboard reading and clipboard monitoring functions are unavailable, but pasting content works fine. Please provide feedback if other features are not functioning properly!
- Install the Python interpreter, version 3.12
- Download the latest source code of this project, or the source code published in Releases, to your local machine
- Open the terminal and switch to the root path of the project
- Run the command pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt to install the required modules
- Run main.py to use the program
The project supports command line mode. If you want to download specific images from a text and image work, you can use this mode to set the image sequence number you want to download!
You can use the command line to read cookies from the browser and write to the configuration file!
Command example: python .\main.py --browser_cookie Chrome --update_settings
Parameters of type bool can be set with true, false, 1, 0, yes, no, on, or off (case-insensitive).
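The accepted bool spellings can be illustrated with a small parser. This is a sketch of the rule stated above, not the project's own code; the `parse_bool` name and the choice to raise `ValueError` on other input are assumptions.

```python
TRUE_VALUES = {"true", "1", "yes", "on"}
FALSE_VALUES = {"false", "0", "no", "off"}


def parse_bool(value: str) -> bool:
    """Map the documented spellings to a bool, ignoring case."""
    normalized = value.strip().lower()
    if normalized in TRUE_VALUES:
        return True
    if normalized in FALSE_VALUES:
        return False
    # Assumption: anything else is treated as an error in this sketch.
    raise ValueError(f"not a recognized bool value: {value!r}")
```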
Start: run the command python .\main.py server
Stop: press Ctrl + C to stop the server
Request endpoint: /xhs/
Request method: POST
Request format: JSON
Request parameters:
Parameter | Type | Description | Default |
---|---|---|---|
url | str | RedNote works link; the link is extracted automatically; multiple links are not supported; required | None |
download | bool | Whether to download the works file; setting it to true takes more time; optional | false |
index | list[int] | Download specific image files by index; only effective for text and image works; has no effect when download is false; optional | null |
cookie | str | Cookie used when requesting data; optional | Cookie value from the settings |
skip | bool | Whether to skip works that have download records; when set to true, data for works with download records is not returned; optional | false |
Code example:
    import requests


    def api_demo():
        # Address of the locally running API server
        server = "http://127.0.0.1:8000/xhs/"
        data = {
            "url": "https://www.xiaohongshu.com/explore/123456789",
            "download": True,
            "index": [
                3,
                6,
                9,
            ],
        }
        response = requests.post(server, json=data)
        print(response.json())
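The request parameters in the table above can be assembled with a small helper before posting. This is only a sketch: `build_payload` is a hypothetical function, and the decision to omit index when download is false simply follows the table's note that index has no effect in that case.

```python
def build_payload(url, download=False, index=None, cookie=None, skip=False):
    """Build the JSON body for a POST to /xhs/ from the documented parameters."""
    if not url:
        raise ValueError("url is required")
    payload = {"url": url, "download": download, "skip": skip}
    # index only matters when files are actually downloaded
    if download and index is not None:
        payload["index"] = list(index)
    # leave cookie out so the server falls back to its configured value
    if cookie:
        payload["cookie"] = cookie
    return payload
```

The resulting dictionary can be passed as the json argument of requests.post, as in the code example above.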
- Due to the date information carried in the links of RedNote works, using links obtained from previous dates may be subject to risk control. It is recommended to use the latest RedNote works links when downloading RedNote work files
- On Windows, the program must be run as an administrator to read cookies from the Chromium, Chrome, and Edge browsers
- If the function to save works data to a file is enabled, the works data will be stored by default in the ./Download/ExploreData.db file
- The program's download records will be stored in the ./ExploreID.db file
If your browser has the Tampermonkey browser extension installed, you can add the user script to experience the project features without needing to download or install anything!
After successfully installing the script, open the RedNote page, check the script instructions, and follow the prompts to operate.
Note: Using the XHS-Downloader user script to batch extract works links, in combination with the XHS-Downloader program, can achieve batch downloading of watermark-free works files!
- When downloading watermark-free works from Xiaohongshu, the script requires time to process the files. Please wait for a moment and do not click the download button multiple times.
- Watermark-free image files are in PNG format; watermark-free video files are larger and may take longer to process. Page redirects may cause download failures.
- When extracting links for posts, collects, likes, and board from an account, the script can automatically scroll the page until all works are loaded. The default scroll detection interval is 2.5 seconds.
- When extracting links to explore works, searching for works, and user links, the script can automatically scroll the page to load more content. The default number of page scrolls is 10.
- The automatic page scroll feature is turned off by default. Users can enable it and modify the scroll detection interval and the number of scrolls, with changes taking effect immediately.
- If the automatic page scroll feature is not enabled, users need to manually scroll the page to load more content before performing other actions.
- Supports packaging works files for download; this feature is enabled by default, and works consisting of multiple files will be downloaded as a compressed archive
- Using global proxy tools may cause script download failures. If there are issues, please try disabling the proxy tool. If necessary, contact the author for feedback.
- XHS-Downloader userscript only implements the data collection functionality for visible content and does not include any paid or cracked features.
The automatic page scroll feature has been refactored and is turned off by default! Enabling this feature may be detected as automated behavior by Xiaohongshu, potentially resulting in account risk control or banning.
If you have other needs, you can call or modify the code based on the comments in main.py!
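    async def example():
        """Set parameters in code; suitable for secondary development."""
        # XHS is the project's main class; import it the same way main.py does.
        # Example links
        error_link = "https://github.com/JoeanAmier/XHS_Downloader"
        demo_link = "https://www.xiaohongshu.com/explore/xxxxxxxxxx"
        multiple_links = f"{demo_link} {demo_link} {demo_link}"
        # Instance parameters
        work_path = "D:\\"  # Root path for saving works data/files; default: project root path
        folder_name = "Download"  # Name of the folder for works files (created automatically); default: Download
        name_format = "作品标题 作品描述"
        user_agent = ""  # User-Agent
        cookie = ""  # RedNote web cookie; no login required; optional; login status affects data collection
        proxy = None  # Network proxy
        timeout = 5  # Request timeout limit, in seconds; default: 10
        chunk = 1024 * 1024 * 10  # Size of each data chunk fetched from the server when downloading, in bytes
        max_retry = 2  # Maximum number of retries when a request fails; default: 5
        record_data = False  # Whether to save works data to a file
        image_format = "WEBP"  # Download format for text and image works; supported: PNG, WEBP
        folder_mode = False  # Whether to store each work's files in a separate folder
        # async with XHS() as xhs:
        #     pass  # Use default parameters
        async with XHS(
            work_path=work_path,
            folder_name=folder_name,
            name_format=name_format,
            user_agent=user_agent,
            cookie=cookie,
            proxy=proxy,
            timeout=timeout,
            chunk=chunk,
            max_retry=max_retry,
            record_data=record_data,
            image_format=image_format,
            folder_mode=folder_mode,
        ) as xhs:  # Use custom parameters
            download = True  # Whether to download works files; default: False
            # Returns detailed works information, including download addresses
            # Returns an empty dictionary when data retrieval fails
            print(await xhs.extract(error_link, download))
            print(await xhs.extract(demo_link, download, index=[1, 2]))
            # Multiple works links can be passed at once
            print(await xhs.extract(multiple_links, download))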
The project uses pyperclip to implement clipboard reading functionality, which varies across different systems.
On Windows, no additional modules are needed.
On Mac, this module makes use of the pbcopy and pbpaste commands, which should come with the OS.
On Linux, this module makes use of the xclip or xsel commands, which should come with the OS. Otherwise run "sudo apt-get install xclip" or "sudo apt-get install xsel" (Note: xsel does not always seem to work.)
Otherwise on Linux, you will need the qtpy or PyQt5 modules installed.
The settings.json file in the root directory of the project is automatically generated on the first run and allows customization of some runtime parameters.
If invalid parameter values are set, the program will use the default values!
Parameter | Type | Description | Default Value |
---|---|---|---|
work_path | str | Root path for saving works data/files | Project root path |
folder_name | str | Name of the folder for storing works files | Download |
name_format | str | Format of the works file name; fields are separated by spaces; supported fields: 收藏数量, 评论数量, 分享数量, 点赞数量, 作品标签, 作品ID, 作品标题, 作品描述, 作品类型, 发布时间, 最后更新时间, 作者昵称, 作者ID | 发布时间 作者昵称 作品标题 |
user_agent | str | Browser User Agent | Built-in Chrome User Agent |
cookie | str | RedNote web version cookie; no login required; not an essential parameter | None |
proxy | str | Set program proxy | null |
timeout | int | Request data timeout limit, in seconds | 10 |
chunk | int | Size of data chunk to fetch from the server each time when downloading files, in bytes | 2097152 (2 MB) |
max_retry | int | Maximum number of retries when requesting data fails | 5 |
record_data | bool | Whether to save works data to a file, saved in SQLite format | false |
image_format | str | Download format for text and image works files; supported formats: PNG, WEBP. This parameter affects the API used when downloading images, not the fixed image format! | PNG |
image_download | bool | Switch for downloading text and image works files | true |
video_download | bool | Switch for downloading video works files | true |
live_download | bool | Switch for downloading animated image files | false |
folder_mode | bool | Whether to store each work's files in a separate folder; the folder name matches the file name | false |
download_record | bool | Whether to record the IDs of successfully downloaded works; if enabled, the program will automatically skip downloading works with records | true |
language | str | Set program language. Currently supported: zh_CN, en_US | zh_CN |
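The fallback to default values for invalid settings can be sketched like this. It is an illustration under assumptions: `clean_settings` is a hypothetical helper, it covers only a few parameters from the table, and the validity checks (exact type match, positive integers, allowed string values) are guesses at what "invalid" means rather than the program's real rules.

```python
import json

# Defaults copied from the table above (subset only).
DEFAULTS = {
    "timeout": 10,
    "chunk": 2097152,
    "max_retry": 5,
    "image_format": "PNG",
    "folder_mode": False,
    "language": "zh_CN",
}
# Assumed sets of allowed values for the string options.
ALLOWED = {"image_format": {"PNG", "WEBP"}, "language": {"zh_CN", "en_US"}}


def clean_settings(raw: dict) -> dict:
    """Return settings where every invalid or missing value is replaced by its default."""
    cleaned = {}
    for key, default in DEFAULTS.items():
        value = raw.get(key, default)
        if type(value) is not type(default):
            value = default  # wrong type, e.g. a string where an int is expected
        elif key in ALLOWED and value not in ALLOWED[key]:
            value = default  # unsupported option
        elif isinstance(value, int) and not isinstance(value, bool) and value <= 0:
            value = default  # assumption: numeric settings must be positive
        cleaned[key] = value
    return cleaned


if __name__ == "__main__":
    raw = json.loads('{"timeout": "fast", "image_format": "WEBP", "max_retry": -1}')
    print(clean_settings(raw))
```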
name_format instructions (currently only Chinese field names are supported):
- 收藏数量: Number of Collections
- 评论数量: Number of Comments
- 分享数量: Number of Shares
- 点赞数量: Number of Likes
- 作品标签: Works Tags
- 作品ID: Works ID
- 作品标题: Works Title
- 作品描述: Works Description
- 作品类型: Works Type
- 发布时间: Publish Time
- 最后更新时间: Last Updated Time
- 作者昵称: Author Nickname
- 作者ID: Author ID
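How a space-separated name_format could turn into a file name is sketched below. The `build_file_name` helper, the shape of the `work` dictionary, and the "_" separator are all assumptions made for illustration; the program's real naming logic may join or sanitize fields differently.

```python
def build_file_name(name_format: str, work: dict, separator: str = "_") -> str:
    """Join the values of the requested fields, in the order given by name_format."""
    parts = []
    for field in name_format.split():
        value = work.get(field)
        if value:  # skip fields that are missing or empty
            parts.append(str(value))
    return separator.join(parts)


# Example with the default format from the table above.
work = {"发布时间": "2024-01-01", "作者昵称": "Alice", "作品标题": "Demo"}
print(build_file_name("发布时间 作者昵称 作品标题", work))
```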
Additional Notes: The user_agent parameter examples are provided for reference only; it is strongly recommended to set it according to your actual browser information!
Starting from version 2.2, if the project functions normally, there is no need to handle cookies separately!
1. Open the browser (optional: start in incognito mode) and visit https://www.xiaohongshu.com/explore
2. Log in to your RedNote account (can be skipped)
3. Press F12 to open the developer tools
4. Select the Network tab
5. Check Preserve log
6. In the Filter input box, enter cookie-name:web_session
7. Select the Fetch/XHR filter
8. Click on any piece of works on the RedNote page
9. In the Network tab, select any data packet (if no packets appear, repeat step 7)
10. Copy and paste the entire Cookie into the program or configuration file
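Before pasting the copied Cookie into the configuration file, it can be sanity-checked for the web_session entry that the filter in the steps above looks for. This is a hypothetical helper, not part of the project; it only splits the header string and does not contact RedNote.

```python
def parse_cookie(cookie: str) -> dict:
    """Split a Cookie header string ("a=1; b=2") into a name -> value mapping."""
    pairs = {}
    for item in cookie.split(";"):
        name, sep, value = item.strip().partition("=")
        if sep and name:
            pairs[name] = value
    return pairs


def has_web_session(cookie: str) -> bool:
    """Return True when the cookie string contains a non-empty web_session value."""
    return bool(parse_cookie(cookie).get("web_session"))
```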
XHS-Downloader will store the IDs of downloaded works in a database. When downloading the same works again, XHS-Downloader will automatically skip the file download (even if the works file does not exist). If you want to re-download the works file, please delete the corresponding works ID from the database and then use XHS-Downloader to download the works file again!
This feature is enabled by default. If it is turned off, XHS-Downloader will check if the file exists. If the file exists, it will skip the download!
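The skip decision described above can be summarized in a few lines. This is a sketch of the documented behavior only: `should_skip` and its parameters are hypothetical, and the in-memory set stands in for the IDs that XHS-Downloader keeps in its database.

```python
from pathlib import Path


def should_skip(work_id: str, file_path: Path, recorded_ids: set, download_record: bool = True) -> bool:
    """Decide whether a works file download should be skipped."""
    if download_record:
        # Record mode: a recorded ID is skipped even if the file no longer exists.
        return work_id in recorded_ids
    # Records disabled: fall back to checking whether the file is already on disk.
    return file_path.exists()
```

With records enabled, removing an ID from the record store is what makes the work downloadable again, which matches the re-download instructions above.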
This guide will walk you through forking this repository and executing GitHub Actions to automatically build and package the program based on the latest source code!
- Click the Fork button at the top right of the project repository to fork it to your personal GitHub account
- Your forked repository address will look like this:
https://github.com/your-username/this-repo
- Go to the page of your forked repository
- Click the Settings tab at the top
- Click the Actions tab on the right
- Click the General option
- Under Actions permissions, select Allow all actions and reusable workflows and click the Save button
- In your forked repository, click the Actions tab at the top
- Find the workflow named Manual Build of Executable File
- Click the Run workflow button on the right:
- Select the master or develop branch
- Click Run workflow
- On the Actions page, you can see the execution records of the triggered workflow
- Click on the run record to view detailed logs to check the build progress and status
- Once the build is complete, go to the corresponding run record page
- In the Artifacts section at the bottom of the page, you will see the built result file
- Click to download and save it to your local machine to get the built program
- Resource Usage:
  - GitHub provides free build environments for Actions, with a monthly usage limit (2000 minutes) for free-tier users
- Code Modifications:
  - You are free to modify the code in your forked repository to customize the build process
  - After making changes, you can trigger the build process again to get your customized version
- Stay in Sync with the Main Repository:
  - If the main repository is updated with new code or workflows, it is recommended that you periodically sync your forked repository to get the latest features and fixes
A: Please ensure that you have followed the steps to Enable Actions. Otherwise, GitHub will prevent the workflow from running
A:
- Check the run logs to understand the cause of the failure
- Ensure there are no syntax errors or dependency issues in the code
- If the problem persists, please open an issue on the Issues page
A: Due to permission restrictions, you cannot directly trigger Actions from the main repository. Please use the forked repository to execute the build process
If XHS-Downloader has been helpful to you, please consider giving it a Star ⭐. Thank you for your support!
微信(WeChat) | 支付宝(Alipay) |
---|---|
If you are willing, you may consider making a donation to provide additional support for XHS-Downloader!
Contributions to this project are welcome! To keep the codebase clean, efficient, and easy to maintain, please read the following guidelines carefully to ensure that your contributions can be accepted and integrated smoothly.
- Before starting development, please pull the latest code from the develop branch as the basis for your modifications; this helps avoid merge conflicts and ensures your changes are based on the latest state of the project.
- If your changes involve multiple unrelated features or issues, please split them into several independent commits or pull requests.
- Each pull request should focus on a single feature or fix as much as possible, to facilitate code review and testing.
- Follow the existing coding style; make sure your code is consistent with the style already present in the project.
- Write code that is easy to read; add appropriate comments to help others understand your intentions.
- Each commit should include a clear and concise commit message describing the changes made. The commit message should follow this format: <type>: <short description>
- When you are ready to submit a pull request, please prioritize submitting it to the develop branch; this provides maintainers with a buffer zone for additional testing and review before final merging into the master branch.
Reference materials:
- Author's Email:[email protected]
- Author's WeChat: Downloader_Tools
- Discord Community: Click to Join the Community
✨ Other Open Source Projects by the Author:
- TikTokDownloader(抖音 / TikTok):https://github.com/JoeanAmier/TikTokDownloader
- KS-Downloader(快手):https://github.com/JoeanAmier/KS-Downloader
JetBrains support active projects recognized within the global open-source community with complimentary licenses for non-commercial development.
- Users decide on their own how to use this project and bear the risks themselves. The author is not responsible for any losses, liabilities, or risks incurred by users in the use of this project
- The code and functionalities provided by the author of this project are developed based on existing knowledge and technology. The author strives to ensure the correctness and security of the code but does not guarantee that the code is completely error-free or defect-free.
- Users must strictly adhere to the provisions of the GNU General Public License v3.0 and, where appropriate, state that the code they use is licensed under the GNU General Public License v3.0.
- Under no circumstances shall users associate the author of this project, contributors, or other related parties with the user's usage behavior, or demand that they be held responsible for any losses or damages incurred by the user's use of this project.
- Users must independently study relevant laws and regulations when using the code and functionalities of this project and ensure that their usage is legal and compliant. Users are solely responsible for any legal liability and risks resulting from violations of laws and regulations.
- The author of this project will not provide a paid version of the XHS-Downloader project, nor will they offer any commercial services related to the XHS-Downloader project.
- Any secondary development, modification, or compilation of the program based on this project is unrelated to the original author. The original author is not responsible for any consequences related to secondary development or its results. Users should take full responsibility for any situations that may arise from secondary development on their own.
- https://github.com/encode/httpx/
- https://github.com/tiangolo/fastapi
- https://github.com/textualize/textual/
- https://github.com/omnilib/aiosqlite
- https://github.com/thewh1teagle/rookie
- https://github.com/carpedm20/emoji/
- https://github.com/asweigart/pyperclip
- https://github.com/lxml/lxml
- https://github.com/yaml/pyyaml
- https://github.com/pallets/click/
- https://github.com/encode/uvicorn
- https://github.com/Tinche/aiofiles