A simple and powerful proxy scraping and validation tool that helps you extract, validate, and manage proxies from various sources. Perfect for developers and enthusiasts who need to work with proxies efficiently!
- Scrape Proxies: Fetch proxies from a list of URLs.
- Validate Proxies: Ensure the proxies meet specific patterns and are valid.
- Live Tracking: Monitor success and failure counts in real-time.
- Downloadable Results: Easily download the validated proxy list.
- Process Control: Start, stop, and manage the scraping process with ease.
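As a rough illustration of the scrape-and-validate flow in the feature list above, the sketch below extracts unique proxies from text fetched from several sources. It is not the project's actual code: `collectUniqueProxies` and its input shape are hypothetical, and `PROXY_PATTERN` is a JavaScript port of the default `$proxyPattern` documented for save_link.php.

```javascript
// Hypothetical helper for illustration; not part of the repository.
// JavaScript port of the default host:port pattern from save_link.php.
const PROXY_PATTERN = /^([a-zA-Z0-9.-]+):([0-9]{1,5})$/;

/**
 * Takes the raw text bodies fetched from each source URL and returns
 * the unique proxies that match the pattern, in first-seen order.
 */
function collectUniqueProxies(bodies) {
  const seen = new Set();
  for (const body of bodies) {
    // Accept both Unix and Windows line endings.
    for (const rawLine of body.split(/\r?\n/)) {
      const line = rawLine.trim();
      if (PROXY_PATTERN.test(line)) {
        seen.add(line); // a Set silently drops duplicates
      }
    }
  }
  return [...seen];
}

const sample = [
  "1.2.3.4:8080\nnot a proxy\n5.6.7.8:3128\n",
  "5.6.7.8:3128\r\nproxy.example.com:80\r\n",
];
console.log(collectUniqueProxies(sample));
// → ["1.2.3.4:8080", "5.6.7.8:3128", "proxy.example.com:80"]
```

A `Set` keeps the duplicate check cheap even when many sources repeat the same proxies.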
- PHP: Version 7.4 or higher.
- JavaScript: Modern browser support with ES6 compatibility.
- Server: Apache or Nginx with write permissions enabled for the project directory.
- Additional Tools: cURL must be enabled on your server.
- Clone the Repository:
    git clone https://github.com/yourusername/proxy-scraper.git
- Navigate to the Project Directory:
    cd proxy-scraper
- Set Permissions: Ensure the proxies.txt file is writable:
    chmod 666 proxies.txt
- Start Your Server:
  - If using XAMPP:
    - Place the project folder in the htdocs directory.
    - Start Apache and MySQL from the XAMPP control panel.
  - If using cPanel:
    - Upload the project folder to your public directory.
- Open the project in your browser by navigating to:
    http://localhost/proxy-scraper/
- Interface Overview:
  - Click the Start button to begin the scraping process.
  - View live counts for successful and failed links, total links, and unique proxies.
  - Use the Stop button to halt the process at any time.
- Download Results:
  - Once the process is complete (or stopped), a Download Proxies button will appear.
  - Click it to download the proxies.txt file.
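The live counters and the Start/Stop behaviour described above could be modelled roughly as follows. This is a sketch under assumptions: `createTracker`, `processLinks`, and the `checkLink` callback are hypothetical names, and the loop is synchronous to keep the example short. The real script.js may be organised differently, most likely with asynchronous requests to save_link.php.

```javascript
// Hypothetical sketch; names and structure are assumptions, not the project's script.js.
function createTracker(totalLinks) {
  const unique = new Set();
  const state = { total: totalLinks, success: 0, failed: 0, stopped: false };
  return {
    recordSuccess(proxies) {
      state.success += 1;
      proxies.forEach((p) => unique.add(p));
    },
    recordFailure() {
      state.failed += 1;
    },
    stop() {
      state.stopped = true;
    },
    isStopped() {
      return state.stopped;
    },
    // Values a UI could render as the live counts.
    snapshot() {
      return {
        total: state.total,
        success: state.success,
        failed: state.failed,
        uniqueProxies: unique.size,
      };
    },
  };
}

// Processes links one at a time and checks the stop flag before each one.
// checkLink(link) returns an array of proxies, or throws if the link fails.
function processLinks(links, checkLink, tracker) {
  for (const link of links) {
    if (tracker.isStopped()) break;
    try {
      tracker.recordSuccess(checkLink(link));
    } catch (err) {
      tracker.recordFailure();
    }
  }
  return tracker.snapshot();
}

// Usage with a stand-in checker that fails for one link.
const tracker = createTracker(3);
const fakeCheck = (link) => {
  if (link.includes("bad")) throw new Error("unreachable");
  return ["1.2.3.4:8080"];
};
const demoLinks = [
  "http://example.com/a.txt",
  "http://example.com/bad.txt",
  "http://example.com/c.txt",
];
console.log(processLinks(demoLinks, fakeCheck, tracker));
// → { total: 3, success: 2, failed: 1, uniqueProxies: 1 }
```

Checking the stop flag at the top of each iteration means a Stop click takes effect before the next link is requested, and the counts gathered so far stay available for download.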
proxy-scraper/
├── assets/
│   ├── links.json       # Input file containing the list of URLs to scrape
│   ├── script.js        # Frontend JavaScript for managing the process
│   ├── style.css        # Styling for the interface
│   └── proxy_count.php  # Returns the count of saved proxies
├── index.html           # Main interface
├── save_link.php        # Handles link validation and proxy saving
├── proxies.txt          # Output file for validated proxies
└── README.md            # Documentation
- links.json: Add your list of URLs in the following format:
    {
      "links": [
        "http://example.com/proxies1.txt",
        "http://example.com/proxies2.txt"
      ]
    }
- save_link.php:
  - Customize the regex pattern for proxy validation if needed:
      $proxyPattern = '/^([a-zA-Z0-9.-]+):([0-9]{1,5})$/';
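To make the two settings above more concrete, here is a small JavaScript sketch of reading a links.json payload and trying a customized pattern before committing it to save_link.php. Everything named here is hypothetical (`parseLinks`, `isValidProxy`, `IPV4_PATTERN`). The IPv4-only pattern is just one possible customization, and the port range check is an addition that goes beyond what the documented regex does.

```javascript
// Hypothetical helpers for illustration; not part of the repository.

// Parses the contents of links.json and returns only well-formed http(s) URLs.
function parseLinks(jsonText) {
  const data = JSON.parse(jsonText);
  if (!data || !Array.isArray(data.links)) {
    throw new Error('links.json must contain a "links" array');
  }
  return data.links.filter((link) => {
    try {
      const { protocol } = new URL(link);
      return protocol === "http:" || protocol === "https:";
    } catch (err) {
      return false; // not a parseable URL
    }
  });
}

// Default pattern from save_link.php, ported to JavaScript.
const DEFAULT_PATTERN = /^([a-zA-Z0-9.-]+):([0-9]{1,5})$/;
// One possible stricter variant (an assumption): dotted-quad addresses only.
// It still accepts octets above 255, so it is a shape check, not full validation.
const IPV4_PATTERN = /^(\d{1,3}(?:\.\d{1,3}){3}):([0-9]{1,5})$/;

// Applies a pattern, then rejects ports outside 1-65535,
// which the {1,5} digit count alone would let through (e.g. 99999).
function isValidProxy(candidate, pattern = DEFAULT_PATTERN) {
  const match = pattern.exec(candidate);
  if (!match) return false;
  const port = Number(match[2]);
  return port >= 1 && port <= 65535;
}

const config = '{ "links": [ "http://example.com/proxies1.txt", "http://example.com/proxies2.txt" ] }';
console.log(parseLinks(config).length);                              // → 2
console.log(isValidProxy("proxy.example.com:8080"));                 // → true
console.log(isValidProxy("proxy.example.com:8080", IPV4_PATTERN));   // → false
console.log(isValidProxy("1.2.3.4:99999"));                          // → false
```

Trying a candidate pattern against a handful of known-good and known-bad entries like this is a cheap way to catch a regex that is too strict or too loose before it filters real results.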
- Fork the repository.
- Create a new branch:
    git checkout -b feature-name
- Commit your changes:
    git commit -m "Add new feature"
- Push to your branch:
    git push origin feature-name
- Open a pull request.
This project is licensed under the MIT License.
For any issues, feel free to open an issue on GitHub or contact me at [email protected].
Happy Scraping!