forked from OrthoLoess/fivestar
Commit
Ed Landamore committed Dec 10, 2020
1 parent d835af3, commit f84a674
Showing 4 changed files with 416 additions and 2,516 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,29 +1 @@ | ||
name: Python package | ||
|
||
on: | ||
push: | ||
branches: [ master ] | ||
pull_request: | ||
branches: [ master ] | ||
|
||
jobs: | ||
build: | ||
|
||
runs-on: ubuntu-latest | ||
strategy: | ||
matrix: | ||
python-version: [3.6, 3.7] | ||
|
||
steps: | ||
- uses: actions/checkout@v2 | ||
- name: Set up Python ${{ matrix.python-version }} | ||
uses: actions/setup-python@v1 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install -r requirements.txt | ||
- name: Install package and test | ||
run: | | ||
make install test clean |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,70 +1,64 @@ | ||
# Data analysis | ||
- Document here the project: 5-star | ||
- Description: Project Description | ||
- Data Source: | ||
- Type of analysis: | ||
# Project Overview | ||
|
||
Please document the project the better you can. | ||
The goal of the project was to explore Airbnb listings in London, from a host’s perspective, and predict guest review scores based on a certain property's attributes. | ||
|
||
# Stratup the project | ||
Ultimately this may just have the potential to become the one-stop shop tool for an Airbnb host when managing and optimising their listing offering. | ||
|
||
The initial setup. | ||
Data source: Inside Airbnb | ||
|
||
Create virtualenv and install the project: | ||
```bash | ||
$ sudo apt-get install virtualenv python-pip python-dev | ||
$ deactivate; virtualenv ~/venv ; source ~/venv/bin/activate ;\ | ||
pip install pip -U; pip install -r requirements.txt | ||
``` | ||
Status - completed (version 1) | ||
|
||
Unittest test: | ||
```bash | ||
$ make clean install test | ||
``` | ||
## Team | ||
Miles Tomlinson - [GH profile](https://github.com/milestommo)<br> | ||
Ed Landamore - [GH profile](https://github.com/OrthoLoess)<br> | ||
Elsa Lebrun-Grandie - [GH profile](https://github.com/ElsaLGF)<br> | ||
Leone Cavicchia - [GH profile](https://github.com/leoncav) | ||
|
||
Check for 5-star in gitlab.com/{group}. | ||
If your project is not set please add it: | ||
## Methods | ||
Data exploration<br> | ||
Inferential statistics<br> | ||
Data visualisation<br> | ||
Machine learning/predictive modelling<br> | ||
Natural Language Processing<br> | ||
App user interface design | ||
|
||
- Create a new project on `gitlab.com/{group}/5-star` | ||
- Then populate it: | ||
## Tech | ||
SQL<br> | ||
Python (Jupyter)<br> | ||
Pandas<br> | ||
Numpy<br> | ||
Matplotlib<br> | ||
Seaborn<br> | ||
Scikit Learn<br> | ||
NLTK<br> | ||
Miro Scratchpad<br> | ||
Streamlit / HTML | ||
|
||
```bash | ||
$ ## e.g. if group is "{group}" and project_name is "5-star" | ||
$ git remote add origin git@gitlab.com:{group}/5-star.git | ||
$ git push -u origin master | ||
$ git push -u origin --tags | ||
``` | ||
# Project Description | ||
- Inspired by the wealth of data provided by Inside Airbnb, we chose to explore a listing’s review score and its relationship to the features that a property offers its guests | ||
- The early stage of the project focused on what the end product would look like and how it could offer real value to hosts who wanted more insight into which features to address or install in order to improve a guest’s experience | ||
- Miro was used to design the app wireframes and visualise the user flow | ||
- The next stage centred around data understanding and exploration. With so much data collated for each listing (c. 90 dataframe columns), the trick was to shortlist the most potentially influential features for the predictive model through multiple phases of feature prioritisation | ||
- Pandas and Matplotlib were used to understand and visualise the make-up of each feature, while a dummy model regressor was used to highlight the more influential features in relation to the review score | ||
- In the modelling phase, we used K Means clustering along with manual grouping in Pandas to identify groups of listings that share a common set of fixed attributes that a host wouldn’t necessarily be able to change (e.g. borough location, number of bedrooms, property type, etc.). This allowed the app to let a host compare their listing against other hosts with similar properties | ||
- The offering to the host was enhanced by applying NLP methods to the verbatim review feedback left by guests, with a focus on the top rated listings in each group allowing the host to leverage the qualitative insights available to them | ||
- The final linear regression model used a set of features chosen to minimise multicollinearity. It used L2 regularisation to help control overfitting on the training set. | ||
- Finally, all of this led to the creation of an interactive front end that would provide information about a host’s listing, the group a host belonged to and how the most influential features could be dialled up or down to positively, or negatively, affect the review score. | ||
|
||
Functional test with a script: | ||
```bash | ||
$ cd /tmp | ||
$ 5-star-run | ||
``` | ||
# Install | ||
Go to `gitlab.com/{group}/5-star` to see the project, manage issues, | ||
set up your SSH public key, ... | ||
|
||
Create a python3 virtualenv and activate it: | ||
|
||
# Startup the project | ||
|
||
The initial setup. | ||
|
||
Create virtualenv and install the project: | ||
```bash | ||
$ sudo apt-get install virtualenv python-pip python-dev | ||
$ deactivate; virtualenv -ppython3 ~/venv ; source ~/venv/bin/activate | ||
$ deactivate; virtualenv ~/venv ; source ~/venv/bin/activate ;\ | ||
pip install pip -U; pip install -r requirements.txt | ||
``` | ||
|
||
Clone the project and install it: | ||
Run the streamlit server with | ||
```bash | ||
$ git clone gitlab.com/{group}/5-star | ||
$ cd 5-star | ||
$ pip install -r requirements.txt | ||
$ make clean install test # install and test | ||
$ streamlit run fivestar/five-star.py | ||
``` | ||
Functional test with a script: | ||
```bash | ||
$ cd /tmp | ||
$ 5-star-run | ||
``` | ||
|
||
# Continuous integration | ||
## Github | ||
Every push of `master` branch will execute `.github/workflows/pythonpackages.yml` docker jobs. | ||
## Gitlab | ||
Every push of `master` branch will execute `.gitlab-ci.yml` docker jobs. |
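
The README text added in this commit says the final linear regression model used L2 regularisation to help control overfitting. As a rough illustration of what that penalty does, the sketch below fits a single-feature ridge regression in closed form using only the standard library. It is not the project's code: the function name `ridge_slope`, the `alpha` values, and the toy data are all assumptions made for this example, and the project itself lists Scikit Learn in its tech stack.

```python
def ridge_slope(xs, ys, alpha):
    """Slope of a one-feature ridge regression.

    Both variables are centred first, so the intercept is left
    unpenalised. The slope is sum(x*y) / (sum(x*x) + alpha) on the
    centred data; alpha = 0 gives ordinary least squares.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / (sxx + alpha)


# Hypothetical toy data: an amenity count and a review score per listing.
amenities = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [80.0, 84.0, 88.0, 92.0, 96.0]

ols = ridge_slope(amenities, scores, alpha=0.0)     # no penalty
ridge = ridge_slope(amenities, scores, alpha=10.0)  # L2 penalty applied

print(ols, ridge)
```

With the penalty switched on, the fitted slope is pulled towards zero, which is the mechanism by which L2 regularisation trades a little bias for lower variance on the training set.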