Selenium is awesome, but I am trying to use this with requests and lxml. It seems like it is solving things properly, but I am having trouble submitting the solution. Could you add some example usage to the readme?
This is what I am doing right now using requests/lxml:
```python
import random
import requests
from lxml import html
from fake_useragent import UserAgent
import csv
import time
import os
from amazoncaptcha import AmazonCaptcha

amazon_captcha_xpath = '//h4[contains(text(), "Enter the characters you see below")]'
captcha_image_xpath = '//div[@class="a-row a-text-center"]/img/@src'


def get_link(url, session=None, user_agent=None, proxy=None):
    """
    Fetches the HTML content from the provided URL.
    Returns a parsed lxml HTML tree that can be used with XPath,
    together with the session that was used for the request.
    """
    ua = UserAgent()
    headers = {'User-Agent': ua.google if not user_agent else user_agent}
    proxies = {'http': proxy, 'https': proxy} if proxy else {}
    if session is None:
        session = requests.Session()
    response = session.get(url, headers=headers, proxies=proxies)
    tree = html.fromstring(response.content)
    return tree, session


# code that does stuff assuming there is no captcha. Leaving it out because it's long and probably not helpful.
# (tree and session below come from a get_link() call in that omitted code)

if tree.xpath(amazon_captcha_xpath):
    bot_check = True
    print(html.tostring(tree).decode())
    print('[ Captcha Detected! ]')
    captcha_image_link = tree.xpath(captcha_image_xpath)[0]
    print(captcha_image_link)
    solution = AmazonCaptcha.fromlink(captcha_image_link).solve()
    print(f'Solution is: {solution}')
    print('Pausing to seem human...')
    time.sleep(random.randrange(3, 15))
    print('Submitting solution')

    # THIS IS THE PART TO SUBMIT IT THAT DOES NOT SEEM TO WORK
    # Read the hidden form fields that accompany the captcha input.
    amzn = tree.xpath('//input[@name="amzn"]/@value')[0]
    amzn_r = tree.xpath('//input[@name="amzn-r"]/@value')[0]
    data = {
        'amzn': amzn,
        'amzn-r': amzn_r,
        'field-keywords': solution
    }
    response = session.post('https://www.amazon.com/errors/validateCaptcha', data=data)

    # check response
    print(response.status_code)  # always comes back as 503
    #print(response.text)
    #input('PAUSED')
```
gotScraping is a requests-like library. The thing is, the captcha submission is a GET request, not a POST. It also requires a Referer header and a follow-up URL to redirect to after the captcha resolves.
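To make that suggestion concrete, here is a minimal sketch of how the submission might look with `requests`, assuming the advice above is right: the form fields go in the query string of a GET, and the captcha page is sent as the Referer. The helper names `build_captcha_request` and `submit_captcha` are hypothetical, and using the captcha page URL as the Referer is an assumption. The field names and the endpoint path are taken from the snippet in the question. I have not confirmed that this gets past the 503.

```python
from urllib.parse import urljoin

# Endpoint path taken from the snippet in the question.
CAPTCHA_PATH = "/errors/validateCaptcha"


def build_captcha_request(page_url, amzn, amzn_r, solution):
    """Return (url, params, headers) for sending the solution as a GET request.

    Hypothetical helper: page_url is the URL that served the captcha page.
    """
    url = urljoin(page_url, CAPTCHA_PATH)
    # Same three form fields as in the question, sent as query parameters.
    params = {"amzn": amzn, "amzn-r": amzn_r, "field-keywords": solution}
    # Assumption: the Referer should be the page that showed the captcha.
    headers = {"Referer": page_url}
    return url, params, headers


def submit_captcha(session, page_url, amzn, amzn_r, solution, user_agent=None):
    """Submit the solution using session.get (for example a requests.Session)."""
    url, params, headers = build_captcha_request(page_url, amzn, amzn_r, solution)
    if user_agent:
        # Reuse the User-Agent that fetched the captcha page.
        headers["User-Agent"] = user_agent
    # Follow redirects so the response is the page reached after validation.
    return session.get(url, params=params, headers=headers, allow_redirects=True)
```

With the code from the question, this would be called as `submit_captcha(session, url, amzn, amzn_r, solution)`, where `url` is whatever was passed to `get_link`. Passing the same User-Agent and proxy settings as the first request is probably worth trying too, since `get_link` sets them per request and not on the session.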
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.