This repository has been archived by the owner on Oct 25, 2019. It is now read-only.

start server via docker #373

Merged
merged 9 commits into from
Jun 20, 2019
Changes from 6 commits
2 changes: 2 additions & 0 deletions .gitignore
@@ -14,3 +14,5 @@
.cache
__pycache__
*.pyc

/app-dev.cfg
6 changes: 3 additions & 3 deletions Dockerfile
@@ -26,7 +26,7 @@ ARG install_dev
COPY requirements.dev.txt ./
RUN if [ "${install_dev}" = "y" ]; then pip install -r requirements.dev.txt; fi

COPY --from=client --chown=elife:elife /home/node/client/ ${PROJECT_FOLDER}/client/
COPY --from=client --chown=elife:elife /home/node/client/dist/ ${PROJECT_FOLDER}/client/dist/
Collaborator Author:

We only need dist when running the server (no need to copy node_modules & co)

COPY --chown=elife:elife peerscout/ ${PROJECT_FOLDER}/peerscout/
COPY --chown=elife:elife server.sh ${PROJECT_FOLDER}/
COPY --chown=elife:elife update-data-and-reload.sh ${PROJECT_FOLDER}/
@@ -35,7 +35,7 @@ COPY --chown=elife:elife app-defaults.cfg ${PROJECT_FOLDER}/

USER root
RUN mkdir .data && chown www-data:www-data .data
RUN mkdir logs && chown www-data:www-data logs
RUN mkdir logs && chown www-data:www-data logs && chmod -R a+w logs
Collaborator Author:

It will be www-data and elife writing to the logs

Contributor:

Shouldn't it be just www-data if we run everything in the container? But fine to keep this backward compatibility if needed.

Collaborator Author:

The image is (or would be) used to run the server as well as the "ETL" scripts (migrate schema, load data, etc.). The latter are currently meant to be run as the elife user. But we could use www-data for both if you think that would be better.
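
To make the effect of the added `chmod -R a+w logs` concrete, here is a minimal sketch that runs against a throwaway directory rather than the image itself. The `mode_of` helper and the temporary directory are illustrative assumptions and not part of the Dockerfile.

```bash
set -e

# Prints the 10-character mode string of a path, e.g. "drwxr-xr-x".
mode_of() {
    ls -ld "$1" | cut -c1-10
}

# Work in a temporary directory so nothing in the project is touched.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/logs"
chmod 755 "$demo_dir/logs"
echo "before: $(mode_of "$demo_dir/logs")"   # before: drwxr-xr-x

# Same command as in the Dockerfile: add write permission for all users.
chmod -R a+w "$demo_dir/logs"
echo "after:  $(mode_of "$demo_dir/logs")"   # after:  drwxrwxrwx

rm -rf "$demo_dir"
```

This makes the directory writable by any user in the container, which covers both `www-data` and `elife`. A shared group would be a narrower alternative, at the cost of extra setup.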


USER www-data
CMD ["venv/bin/python"]
CMD ["/srv/peerscout/server.sh"]
2 changes: 2 additions & 0 deletions Dockerfile.client
@@ -10,3 +10,5 @@ COPY --chown=node:node client/package.json client/package-lock.json ./
RUN npm ci

COPY --chown=node:node client ./

RUN npm run bundle
66 changes: 61 additions & 5 deletions Makefile
@@ -3,7 +3,7 @@ DOCKER_COMPOSE_CI = docker-compose -f docker-compose.yml
DOCKER_COMPOSE = $(DOCKER_COMPOSE_DEV)


.PHONY: all test clean build
.PHONY: all test clean build logs


dev-venv:
@@ -31,7 +31,11 @@ client-shell: client-build


server-build:
	$(DOCKER_COMPOSE) build peerscout
	# only dev compose file has "init" service defined
	@if [ "$(DOCKER_COMPOSE)" = "$(DOCKER_COMPOSE_DEV)" ]; then \
		$(DOCKER_COMPOSE) build init; \
	fi
	$(DOCKER_COMPOSE) build client peerscout


server-build-dev:
@@ -42,12 +46,64 @@ server-test: server-build-dev
	$(DOCKER_COMPOSE) run --rm peerscout-dev ./project_tests.sh


server-shell: server-build
server-dev-shell: server-build-dev
	$(DOCKER_COMPOSE) run --rm peerscout-dev bash


db-start:
	$(DOCKER_COMPOSE) up -d db


migrate-schema: server-build
	$(DOCKER_COMPOSE) run --rm --user elife peerscout ./migrate-schema.sh


update-data-and-reload: server-build
	$(DOCKER_COMPOSE) run --rm --user elife peerscout ./update-data-and-reload.sh


fix-data-permissions:
	mkdir -p .data
	chmod -R a+w .data


start: server-build
	$(DOCKER_COMPOSE) up -d peerscout


stop:
	$(DOCKER_COMPOSE) stop peerscout


www-shell:
	$(DOCKER_COMPOSE) exec peerscout bash


www-shell-run:
	$(DOCKER_COMPOSE) run --rm peerscout bash


server-dev-shell: server-build-dev
	$(DOCKER_COMPOSE) run --rm peerscout-dev bash
elife-shell:
	$(DOCKER_COMPOSE) exec --user elife peerscout bash


elife-shell-run:
	$(DOCKER_COMPOSE) run --rm --user elife peerscout bash


restart: stop start


down:
	$(DOCKER_COMPOSE) down


clean:
	$(DOCKER_COMPOSE) down -v


logs:
	$(DOCKER_COMPOSE) logs -f


build: .PHONY
114 changes: 78 additions & 36 deletions README.md
@@ -1,16 +1,79 @@
# Pre-requisites
# PeerScout

## Configuration

### AWS Credentials and Configuration

Create `~/.aws/credentials` and populate the credentials, e.g.:

```ini
[default]
aws_access_key_id=<key id>
aws_secret_access_key=<access key>
```

You may also create `~/.aws/config`:

```ini
[default]
region=<region>
```

### App Configuration

Copy `app-example.cfg` to `app.cfg` and make the necessary configuration changes.

When using Docker for development, the configuration file is `app-dev.cfg` instead.
It should contain essentially the same configuration, except that the database should point to the `db` service (defined in [docker-compose.override.yml](docker-compose.override.yml)).

## Development with Docker

### Pre-requisites (with Docker)

* [Docker](https://www.docker.com/) and [Docker Compose](https://docs.docker.com/compose/)

### Create / Migrate Schema (with Docker)

```bash
make migrate-schema
```

### Populate Database (with Docker)

This downloads updates from AWS and populates the database with them, which may take a while. It also downloads data from Crossref; that step might fail, which is okay.

```bash
make update-data-and-reload
```

### Data Directory

Downloaded files and the cache are stored in the `.data` directory, which in development is mounted from the host. To make the directory writable by the Docker user (`elife`), run the following command, which makes it writable by all users:

```bash
make fix-data-permissions
```

### Start

```bash
make start
```

## Development without Docker

### Pre-requisites (without Docker)

* Python 3 (including dev dependencies, e.g. `python3-dev` on Ubuntu / Debian)
* system dev dependencies (e.g. `build-essential`)
* [PostgreSQL](https://www.postgresql.org/) (although [SQLite](https://sqlite.org/) may work too)
* [Node.js](https://nodejs.org/) and [npm](https://www.npmjs.com/)

### Setup (without Docker)

# Setup
#### Python Dependencies

## Python Dependencies

### A) Using Local Virtual Environment
##### A) Using Local Virtual Environment

If you want to use a separate virtual environment (`venv`) for this project.

@@ -19,7 +82,7 @@ If you want to use a separate virtual environment (`venv`) for this project.
source venv/bin/activate
```

### B) Using Your Own Virtual Environment
##### B) Using Your Own Virtual Environment

Assuming you already created and switched to your own virtual environment.

@@ -33,7 +96,7 @@ Download SpaCy models (~1 GB).
python -m spacy.en.download all
```

## Database
#### Database

The recommended database is [PostgreSQL](https://www.postgresql.org/).

@@ -46,50 +109,29 @@ sudo -u postgres psql -c "alter user reviewer_suggestions_user with encrypted pa
sudo -u postgres psql -c "grant all privileges on database reviewer_suggestions_db to reviewer_suggestions_user;"
```

## AWS Credentials and Configuration

Create `~/.aws/credentials` and populate the credentials, e.g.:

```ini
[default]
aws_access_key_id=<key id>
aws_secret_access_key=<access key>
```

You may also create `~/.aws/config`:

```ini
[default]
region=<region>
```

## App Configuration

Copy `app-example.cfg` to `app.cfg` and make the necessary configuration changes.

## Create / Update Database Schema
#### Create / Update Database Schema

```bash
python -m peerscout.preprocessing.migrateSchema
```

## Populate Database
#### Populate Database

This downloads updates from AWS and populates the database with them, which may take a while. It also downloads data from Crossref; that step might fail, which is okay.

```bash
python -m peerscout.preprocessing.updateDataAndReload
```

## Compile Client
#### Compile Client

```bash
cd client
npm install
npm run bundle
```

# Start Server
### Start Server

```bash
python -m peerscout.server
@@ -99,7 +141,7 @@ Then go to [http://localhost:8080/](http://localhost:8080/). (If you are getting

The server will provide the REST API [http://localhost:8080/api/](http://localhost:8080/api/) and serve the static client bundle.

# Start Client Dev Server
### Start Client Dev Server

Use this option to develop the client, in addition to the Python server (which will still provide the API).

@@ -110,15 +152,15 @@ npm start

The server will then be available at [http://localhost:8081/](http://localhost:8081/).

# Tests
### Tests

## Run All Tests
#### Run All Tests

```bash
./project_tests.sh
```

## Python Tests
#### Python Tests

```bash
pytest
41 changes: 41 additions & 0 deletions docker-compose.override.yml
@@ -0,0 +1,41 @@
version: '3'

services:
  init:
    build:
      context: ./docker/init
      dockerfile: Dockerfile
    image: elifesciences/peerscout_init:${IMAGE_TAG}
    volumes:
      - config-aws:/home/elife/volume-config-aws
Contributor:

Some weird copying from one folder to the other, to avoid ownership clashes?

Contributor:

No problem with that, but why is a volume needed if the copying is executed every time the container starts anyway?

Collaborator Author:

I find that when the container is not run as root or as the developer's user, just mounting the credentials won't work, because the permissions are meant to be set up so that only the owning user can read them. This makes the developer's credentials available via the volume. (The init container is run as root, and therefore has permission to read them.)

One alternative might be to run the container as the developer's user (which also seems to be more complicated than it should be). I experimented with a few approaches across the projects. Not sure which one is best. Any suggestions?
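
As a rough illustration of the alternative raised in this thread (running the container as the developer's own user), the sketch below only builds and prints a `docker-compose run` command that passes the host UID and GID via `--user`. The helper name `host_user_run_command` is hypothetical. It is also an assumption, not something this PR confirms, that the image works correctly under an arbitrary UID and that the `uid:gid` form is passed through to the container unchanged.

```bash
# Hypothetical helper: prints (does not execute) a docker-compose command
# that would start a one-off container under the invoking user's UID and GID.
# Usage: host_user_run_command SERVICE [COMMAND...]
host_user_run_command() {
    service="$1"
    shift
    echo "docker-compose run --rm --user $(id -u):$(id -g) $service $*"
}

# Example: show the command for an interactive shell in the peerscout service.
host_user_run_command peerscout bash
```

With this approach the mounted `~/.aws` files would already be readable by the container user, so the copy step in the init container would not be needed. The trade-off is that paths owned by `elife` or `www-data` inside the image may then not be writable.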

      - ~/.aws:/home/elife/user-config-aws

  db:
    image: postgres:9.6
    restart: always
    volumes:
      - postgres-data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: peerscout_db
      POSTGRES_USER: peerscout_user
      POSTGRES_PASSWORD: peerscout_password
    healthcheck:
      test: ["CMD", "bash", "-c", "echo > /dev/tcp/localhost/5432"]
      interval: 10s
      timeout: 10s
      retries: 5
    ports:
      - "9432:5432"

  peerscout:
    depends_on:
      - client
      - db
      - init
    volumes:
      - ./app-dev.cfg:/srv/peerscout/app.cfg
      - ./.data:/srv/peerscout/.data
      - config-aws:/home/elife/.aws

volumes:
  config-aws:
6 changes: 5 additions & 1 deletion docker-compose.yml
@@ -15,9 +15,10 @@ services:
      args:
        commit: ${IMAGE_TAG}
    image: elifesciences/peerscout:${IMAGE_TAG}
    command: /bin/sh -c exit 0
    depends_on:
      - client
    ports:
      - "8080:8080"

  peerscout-base-dev:
    build:
@@ -41,3 +42,6 @@ services:
    command: /bin/sh -c exit 0
    depends_on:
      - peerscout-base-dev

volumes:
  postgres-data:
5 changes: 5 additions & 0 deletions docker/init/Dockerfile
@@ -0,0 +1,5 @@
FROM busybox:1.30.1

COPY init.sh /bin/

CMD /bin/init.sh
14 changes: 14 additions & 0 deletions docker/init/init.sh
@@ -0,0 +1,14 @@
#!/bin/sh

set -e

# elife user id
USER_ID=1000
echo "changing ownership to $USER_ID, and..."

echo 'copying aws credentials...'

cp -r /home/elife/user-config-aws/* /home/elife/volume-config-aws
chown -R $USER_ID:$USER_ID /home/elife/volume-config-aws

echo 'done'