This library is no longer actively maintained and is currently broken due to changes in Dubizzle's search URL structure. I may do a complete rewrite in either Scala or Python 3.5. Watch this space.
Dubizzle is an online classifieds website. This project aims to become a simple and complete scraping-based API for Dubizzle.
This is still a work in progress, and there is much left to do before it becomes what it should be. I will, however, make sure that the master
branch functions as expected. Any help would be greatly appreciated.
Note that, for the time being, the main focus is on Dubizzle UAE, and specifically on Motors search within it.
- Requests
- BeautifulSoup 4
- Python 2.6+
I've added the package to PyPI, so it can now be easily installed using pip install dubizzle.
>>> import dubizzle
>>> results = dubizzle.search(country='uae', city='dubai', section='motors', num_results=100)
>>> print results
[
    {
        'url': 'test',
        'image': 'http://...',
        'price': 10000,
        'date': datetime.datetime(2013, 7, 20, 0, 0, 0),
        'features': {
            'Color': 'black',
            'Doors': 4,
            'Kilometers': 35000
        },
        ...
    },
    ...
]
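Because each result is a plain dictionary, ordinary Python tools are enough to slice a result list. The sketch below works on hand-written sample data in the shape shown above (the values are made up for illustration, not real listings). The helper `cheapest_since` is hypothetical and not part of the library.

```python
import datetime

# Made-up sample data mirroring the documented result shape; not real listings.
sample_results = [
    {'url': 'test-1', 'price': 10000, 'date': datetime.datetime(2013, 7, 20),
     'features': {'Color': 'black', 'Doors': 4, 'Kilometers': 35000}},
    {'url': 'test-2', 'price': 8000, 'date': datetime.datetime(2013, 6, 15),
     'features': {'Color': 'white', 'Doors': 2, 'Kilometers': 90000}},
    {'url': 'test-3', 'price': 12000, 'date': datetime.datetime(2013, 8, 1),
     'features': {'Color': 'red', 'Doors': 4, 'Kilometers': 20000}},
    {'url': 'test-4', 'price': 9000, 'date': datetime.datetime(2013, 7, 5),
     'features': {'Color': 'white', 'Doors': 4, 'Kilometers': 60000}},
]


def cheapest_since(results, since, limit=2):
    """Hypothetical helper: the `limit` cheapest results dated on or after `since`."""
    recent = [r for r in results if r['date'] >= since]
    return sorted(recent, key=lambda r: r['price'])[:limit]


picked = cheapest_since(sample_results, datetime.datetime(2013, 7, 1))
print([r['url'] for r in picked])
```

The same pattern applies to any other key in the result dictionaries, for example filtering on `result['features']['Doors']`.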
# Find average price of year 2007 and above Nissan Altimas in Dubai
import dubizzle
results = dubizzle.search(keyword='altima', country='uae', city='dubai', section='motors',
                          category='cars', make='nissan', min_year=2007, num_results='all')
total_price, result_count = 0, len(results)
for result in results:
    total_price += result['price']
print float(total_price) / result_count  # Prints 39239.94
# Use the above results to find distribution of post-2007 Altima colors
from collections import Counter
colors = [result['features']['color'] for result in results]
distribution = Counter(colors)
print distribution['white'] # Prints 52
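The two calculations above assume that at least one result came back and that every result carries a price and a color feature. A slightly more defensive sketch, run here on made-up sample data rather than a live search, might look like the following. The helper names `average_price` and `color_distribution` are hypothetical. Since the sample output earlier shows a capitalized 'Color' key while the snippet above reads 'color', the sketch lowercases feature keys before looking them up. `Counter` needs Python 2.7 or later.

```python
from collections import Counter

# Made-up sample data; one entry deliberately lacks a price and a color.
sample_results = [
    {'price': 40000, 'features': {'Color': 'White'}},
    {'price': 38000, 'features': {'color': 'black'}},
    {'price': None, 'features': {}},
    {'price': 42000, 'features': {'Color': 'white'}},
]


def average_price(results):
    """Mean of the prices that are present, or None if there are none."""
    prices = [r['price'] for r in results if r.get('price') is not None]
    if not prices:
        return None
    return float(sum(prices)) / len(prices)


def color_distribution(results):
    """Count colors, ignoring case and skipping results without a color."""
    colors = []
    for r in results:
        features = dict((k.lower(), v) for k, v in r.get('features', {}).items())
        if 'color' in features:
            colors.append(str(features['color']).lower())
    return Counter(colors)


print(average_price(sample_results))
print(color_distribution(sample_results)['white'])
```

Returning `None` for an empty result list avoids the division by zero that the shorter version would hit when a search matches nothing.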
# Retrieve a single listing from Dubizzle UAE
import dubizzle
listing = dubizzle.listing('http://dubai.dubizzle.com/motors/used-cars/nissan/tiida/2013/9/25/easy-installment-new-and-used-cars-0563276-2/', country='uae')
print listing
Parameters accepted by dubizzle.search:
- country - string; defaults to 'uae'
- keyword - string
- city - string
- section - string
- min_price and max_price - integers
- category - string
- added_days - choices are 0, 3, 7, 14, 30, 90, or 180
- num_results - integer; 'all' fetches all results available
- detailed (not implemented) - if set to True, fetches full listing data for each result; slower, obviously
Additional parameters for Motors searches:
- make - a long list can be found in regions.py
- min_year and max_year - integers
- min_kms and max_kms - integers
- seller - 'dealer' or 'owner'
- fuel - 'gasoline', 'hybrid', 'diesel', or 'electric'
- cylinders - 3, 4, 5, 6, 8, 10, or 12
- transmission - 'automatic' or 'manual'
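Several of the parameters above accept only a fixed set of values, so it can help to check them before issuing a search. The following is a minimal sketch of such a check. `ALLOWED_VALUES` and `check_choices` are hypothetical names, not part of the library, and the allowed values are copied from the lists above.

```python
# Allowed values as documented above; parameters not listed here are not checked.
ALLOWED_VALUES = {
    'added_days': (0, 3, 7, 14, 30, 90, 180),
    'seller': ('dealer', 'owner'),
    'fuel': ('gasoline', 'hybrid', 'diesel', 'electric'),
    'cylinders': (3, 4, 5, 6, 8, 10, 12),
    'transmission': ('automatic', 'manual'),
}


def check_choices(**params):
    """Return error messages for parameters outside their documented choices."""
    errors = []
    for name, value in sorted(params.items()):
        allowed = ALLOWED_VALUES.get(name)
        if allowed is not None and value not in allowed:
            errors.append('{0}={1!r} is not one of {2}'.format(name, value, allowed))
    return errors


problems = check_choices(fuel='petrol', cylinders=6, seller='owner', make='nissan')
print(problems)
```

An empty list means every checked parameter is within its documented choices; free-form parameters such as `make` or `keyword` pass through unchecked.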
Parameters accepted by dubizzle.listing:
- url - string, required
- country - string; defaults to 'uae'
Please use the Issues page to report problems or ask questions.