It seems that it is no longer possible to scrape housing rental data #52

Open
TARTAR4600 opened this issue Dec 8, 2024 · 0 comments
@TARTAR4600

It is possible to retrieve housing sales data from the website, but the following error occurs when attempting to access rental data:
`---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[4], line 4
1 from rightmove_webscraper import RightmoveData
3 url = "https://www.rightmove.co.uk/property-to-rent/find.html?searchLocation=W10&useLocationIdentifier=true"
----> 4 rm = RightmoveData(url)

File ~\AppData\Local\Programs\Python\Python313\Lib\site-packages\rightmove_webscraper\scraper.py:32, in RightmoveData.__init__(self, url, get_floorplans)
30 self._url = url
31 self._validate_url()
---> 32 self._results = self._get_results(get_floorplans=get_floorplans)

File ~\AppData\Local\Programs\Python\Python313\Lib\site-packages\rightmove_webscraper\scraper.py:220, in RightmoveData._get_results(self, get_floorplans)
217 results = self._get_page(self._first_page, get_floorplans=get_floorplans)
219 # Iterate through all pages scraping results:
--> 220 for p in range(1, self.page_count + 1, 1):
221
222 # Create the URL of the specific results page:
223 p_url = f"{str(self.url)}&index={p * 24}"
225 # Make the request:

File ~\AppData\Local\Programs\Python\Python313\Lib\site-packages\rightmove_webscraper\scraper.py:143, in RightmoveData.page_count(self)
138 @property
139 def page_count(self):
140 """Returns the number of result pages returned by the search URL. There
141 are 24 results per page. Note that the website limits results to a
142 maximum of 42 accessible pages."""
--> 143 page_count = self.results_count_display // 24
144 if self.results_count_display % 24 > 0:
145 page_count += 1

File ~\AppData\Local\Programs\Python\Python313\Lib\site-packages\rightmove_webscraper\scraper.py:136, in RightmoveData.results_count_display(self)
134 tree = html.fromstring(self._first_page)
135 xpath = """//span[@class="searchHeader-resultCount"]/text()"""
--> 136 return int(tree.xpath(xpath)[0].replace(",", ""))

IndexError: list index out of range`
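
Reading the traceback, the failure is at `tree.xpath(xpath)[0]`: the XPath returned an empty list, so indexing it raises `IndexError`. My guess (an assumption, not verified against the live page) is that the rental results page no longer contains a `searchHeader-resultCount` span. Below is a minimal sketch of how the count lookup could fail gracefully instead of crashing. The helper names `parse_result_count` and `page_count_from` are hypothetical and not part of `rightmove_webscraper`; the 24 results per page figure comes from the docstring in the traceback.

```python
from typing import Optional, Sequence


def parse_result_count(matches: Sequence[str]) -> Optional[int]:
    """Hypothetical helper: convert the XPath text matches into a count.

    Returns None when the XPath matched nothing (or matched non-numeric
    text) instead of raising IndexError / ValueError.
    """
    if not matches:
        return None
    text = matches[0].strip().replace(",", "")
    return int(text) if text.isdigit() else None


def page_count_from(result_count: Optional[int], per_page: int = 24) -> int:
    """Ceiling division; a missing or zero count is treated as zero pages."""
    if not result_count:
        return 0
    return -(-result_count // per_page)


# An empty match list (what the rental URL appears to produce) no longer crashes:
print(parse_result_count([]))
print(page_count_from(parse_result_count(["1,234"])))
```

This would only stop the crash. Actually scraping rental listings would still need an XPath (or other lookup) that matches the current page markup.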
