A simple, powerful proxy scraping and validation tool that helps you extract, validate, and manage proxies from various sources. Perfect for developers and enthusiasts who need to work with proxies efficiently! 🌟
- Scrape Proxies: Fetch proxies from a list of URLs.
- Validate Proxies: Check each scraped entry against a configurable `host:port` pattern before it is saved.
- Live Tracking: Monitor success and failure counts in real-time.
- Downloadable Results: Easily download the validated proxy list.
- Process Control: Start, stop, and manage the scraping process with ease.
- PHP: Version 7.4 or higher.
- JavaScript: Modern browser support with ES6 compatibility.
- Server: Apache or Nginx with write permissions enabled for the project directory.
- Additional Tools: cURL must be enabled on your server.
- Clone the Repository: `git clone https://github.com/yourusername/proxy-scraper.git`
- Navigate to the Project Directory: `cd proxy-scraper`
- Set Permissions: Ensure the `proxies.txt` file is writable: `chmod 666 proxies.txt`
- Start Your Server:
  - If using XAMPP:
    - Place the project folder in the `htdocs` directory.
    - Start Apache and MySQL from the XAMPP control panel.
  - If using cPanel:
    - Upload the project folder to your public directory.
- Open the project in your browser by navigating to: `http://localhost/proxy-scraper/`
- Interface Overview:
  - Click the Start button to begin the scraping process.
  - View live counts for successful and failed links, total links, and unique proxies.
  - Use the Stop button to halt the process at any time.
- Download Results:
  - Once the process is complete (or stopped), a Download Proxies button will appear.
  - Click it to download the `proxies.txt` file.
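The Start/Stop flow and live counters described above could be driven by a loop along these lines. This is only a minimal sketch, not the code in `script.js`: the names `createScrapeRun`, `processLink`, and `onUpdate` are hypothetical, and the way a link is actually submitted to `save_link.php` is an assumption left to the caller.

```javascript
// Hypothetical sketch of a stoppable scraping run with live counters.
// None of these names are taken from the project's script.js.
function createScrapeRun(links, processLink) {
  const stats = { total: links.length, success: 0, failed: 0 };
  let stopped = false;

  // Processes links one at a time and reports the counters after each link.
  async function start(onUpdate = () => {}) {
    for (const link of links) {
      if (stopped) break; // Stop was requested: skip the remaining links
      try {
        const ok = await processLink(link);
        if (ok) stats.success += 1;
        else stats.failed += 1;
      } catch (err) {
        stats.failed += 1; // thrown errors (e.g. network failures) count as failed links
      }
      onUpdate({ ...stats }); // hand a snapshot to the UI
    }
    return { ...stats, stopped };
  }

  // Intended to be called from the Stop button's click handler.
  function stop() {
    stopped = true;
  }

  return { start, stop };
}

// Example wiring in the browser (assumed request shape; the real
// save_link.php contract may differ):
// const run = createScrapeRun(links, async (link) => {
//   const res = await fetch('save_link.php', {
//     method: 'POST',
//     body: new URLSearchParams({ link }),
//   });
//   return res.ok;
// });
// run.start((s) => console.log(s.success, s.failed, s.total));
```

Processing links sequentially keeps the counters easy to reason about and lets Stop take effect between two links; a run already in flight for the current link is allowed to finish.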
proxy-scraper/
├── assets/
│ ├── links.json # Input file containing the list of URLs to scrape
│ ├── script.js # Frontend JavaScript for managing the process
│ ├── style.css # Styling for the interface
│ └── proxy_count.php # Returns the count of saved proxies
├── index.html # Main interface
├── save_link.php # Handles link validation and proxy saving
├── proxies.txt # Output file for validated proxies
└── README.md # Documentation
- links.json: Add your list of URLs in the following format:
  `{ "links": [ "http://example.com/proxies1.txt", "http://example.com/proxies2.txt" ] }`
- save_link.php:
  - Customize the regex pattern for proxy validation if needed:
    `$proxyPattern = '/^([a-zA-Z0-9.-]+):([0-9]{1,5})$/';`
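To see what the two configuration points above accept, here is a small JavaScript sketch that can be run outside the project, for example in Node. It reuses the regex shown for `save_link.php`; the helper names `parseLinksConfig` and `extractProxies`, the `http(s)` URL filter, and the port-range check are assumptions added for illustration and are not part of the original PHP code.

```javascript
// JavaScript port of the pattern shown above for save_link.php.
const PROXY_PATTERN = /^([a-zA-Z0-9.-]+):([0-9]{1,5})$/;

// Hypothetical helper: parses the text of links.json and returns its URL list.
function parseLinksConfig(jsonText) {
  const config = JSON.parse(jsonText);
  if (!config || !Array.isArray(config.links)) {
    throw new Error('links.json must contain a "links" array');
  }
  // Assumption: only http(s) URLs are useful as scrape sources.
  return config.links.filter(
    (link) => typeof link === 'string' && /^https?:\/\//.test(link)
  );
}

// Hypothetical helper: returns the unique, well-formed proxies found in `text`,
// one candidate per line.
function extractProxies(text) {
  const unique = new Set();
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    const match = PROXY_PATTERN.exec(line);
    if (!match) continue;
    const port = Number(match[2]);
    // Extra check, not in the original pattern: five digits also admit
    // values such as 0 or 99999, which are not valid port numbers.
    if (port < 1 || port > 65535) continue;
    unique.add(line);
  }
  return [...unique];
}

const sample = '1.2.3.4:8080\nexample.com:3128\n1.2.3.4:8080\nnot a proxy\n5.6.7.8:99999';
console.log(extractProxies(sample)); // [ '1.2.3.4:8080', 'example.com:3128' ]
```

Note that the pattern only checks the shape of each entry (`host:port`); it does not test whether the proxy is reachable.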
- Fork the repository.
- Create a new branch: `git checkout -b feature-name`
- Commit your changes: `git commit -m "Add new feature"`
- Push to your branch: `git push origin feature-name`
- Open a pull request.
This project is licensed under the MIT License.
If you run into problems, feel free to open an issue on GitHub or contact me at [email protected].
Happy Scraping! 🎉