Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dockerfile staging and speed improvements #141

Closed
wants to merge 31 commits into from
Closed
Show file tree
Hide file tree
Changes from 11 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
ccf5f46
Vulnerability scanner configs
Nava-JoshLong Feb 15, 2023
0513bfa
Tweaked Makefile
Nava-JoshLong Feb 15, 2023
e2d8f74
Multi-stage Dockerfile
Nava-JoshLong Feb 15, 2023
b6abc35
Poetry groups and speed up
Nava-JoshLong Feb 15, 2023
3ece7bc
Added more documentation
Nava-JoshLong Feb 15, 2023
e5eda21
Removed unnecessary commands
Nava-JoshLong Feb 15, 2023
0fed0dc
Fixing docker complaint
Nava-JoshLong Feb 15, 2023
f402840
Removed redundant Makefile
Nava-JoshLong Feb 15, 2023
d45548b
Revert poetry script tweak
Nava-JoshLong Feb 15, 2023
091072f
New comments from PR discussion
Nava-JoshLong Feb 15, 2023
1344dde
Updates from comments
Nava-JoshLong Feb 16, 2023
6101c4f
Added newline
Nava-JoshLong Feb 16, 2023
164d396
Cleanup from comments
Nava-JoshLong Feb 16, 2023
3ee6d9d
Removed unneeded file from binary def
Nava-JoshLong Feb 16, 2023
886fea2
Tweaked comments
Nava-JoshLong Feb 16, 2023
cff0654
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Feb 21, 2023
7650c91
Typos and removed unneeded things
Nava-JoshLong Feb 22, 2023
21227bb
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Feb 22, 2023
01b6e49
Update Dockerfile
Nava-JoshLong Feb 22, 2023
834c169
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Feb 27, 2023
f023479
Swap from short flags
Nava-JoshLong Mar 7, 2023
2a9f908
Added strict to zip calls
Nava-JoshLong Mar 7, 2023
11ab3c1
Trying to fix linting and testing errors
Nava-JoshLong Mar 7, 2023
c2b7e96
Roll back changes from last commit
Nava-JoshLong Mar 7, 2023
63f93ac
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Mar 14, 2023
e8981b3
Testing fixes
Nava-JoshLong Mar 14, 2023
a6fa8d9
More testing fixes
Nava-JoshLong Mar 14, 2023
90d3f47
ci-app fixes
Nava-JoshLong Mar 14, 2023
a09fa6b
apispec tweaks
Nava-JoshLong Mar 14, 2023
f76f903
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Mar 21, 2023
0d09b4f
Apispec tweaks for test
Nava-JoshLong Mar 21, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions .dockleconfig
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# This file is allows you to specify a list of files that is acceptable to Dockle
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
# To allow multiple files, use a list of names. example below
# DOCKLE_ACCEPT_FILES=file1,path/to/file2,file3/path,etc
# https://github.com/goodwithtech/dockle#accept-suspicious-environment-variables--files--file-extensions
# The apiflask/settings file is a stub file that apiflask creates, and has no sensitive data in. We are ignoring it since it is unused
DOCKLE_ACCEPT_FILES=app/.venv/lib/python3.11/site-packages/apiflask/settings.py
26 changes: 26 additions & 0 deletions .grype.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# List of vulnerabilities to ignore for the anchore scan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file looks to be the same as the one on template-infra so i think we can remove it, since it'll be copied over from template-infra when the project installs template-infra

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually need to tweak the infra version since it has a python specific recast in it, the certifi line. Should I remove this file and leave it in infra, or keep it and tweak the infra version?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i see. so you're suggesting that we move the python specific thing from the infra repo to this repo? that makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as we continue to build in this repo, there will probably be other things we need to recast or safelist that are just for this repo, so I think we should keep the config files here since they'd fix those issues

# https://github.com/anchore/grype#specifying-matches-to-ignore
# More info can be found in the docs/infra/vulnerability-management.md file

# Please add safelists in the following format to make it easier when checking
# Package/module name: URL to vulnerability for checking updates
# Versions: URL to the version history
# Dependencies: Name of any other packages or modules that are dependent on this version
# Link to the dependencies for ease of checking for updates
# Issue: Why there is a finding and why this is here or not been removed
# Last checked: Date last checked in scans
# - vulnerability: The-CVE-or-vuln-id # Remove comment at start of line

ignore:
# These settings ignore any findings that fall into these categories
- fix-state: not-fixed
- fix-state: wont-fix
- fix-state: unknown

# Certifi: https://cve.report/CVE-2022-23491
# Versions: https://pypi.org/project/certifi/#history
# Dependencies: None
# Issue: No update in the last month to resolve the finding, however the finding is showing the latest
# version being the same one as the finding, with a 0 - 2022.12.7 (latest) vs 2022.12.07 (non-existent version)
# Last checked: 02-02-2023
- vulnerability: GHSA-43fp-rhv2-5gv8
10 changes: 10 additions & 0 deletions .hadolint.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# List of settings and ignore or safelist findings for the hadolint scanner

# For more information on any settings you can specify, see the actions' documentation here
# https://github.com/hadolint/hadolint#configure
override:
info:
# Casts the apt-get install <package>=<version> finding as info
# We have this set since there is no way to specify version for
# build-essentials in the Dockerfile
- DL3008
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
9 changes: 9 additions & 0 deletions .trivyignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
# List of vulnerabilities to ignore for the trivy scan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm wondering if we can get rid of this file too since it looks to be the same as on template-infra

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we continue to build in this repo, there will probably be other things we need to recast or safelist that are just for this repo, so I think we should keep the config files here since they'd fix those issues

# Please add safelists in the following format to make it easier when checking
# Package/module name: URL to vulnerability for checking updates
# Versions: URL to the version history
# Dependencies: Name of any other packages or modules that are dependent on this version
# Link to the dependencies for ease of checking for updates
# Issue: Why there is a finding and why this is here or not been removed
# Last checked: Date last checked in scans
#The-CVE-or-vuln-id # Remove comment at start of line
10 changes: 10 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,3 +36,13 @@ rm -fr template-application-flask
```

Now you're ready to [get started](/docs/app/getting-started.md).

## Vulnerability scanner config files
In this repo are four configuration files for vulnerability scanner from the [template-infra repo](https://github.com/navapbc/template-infra). These files include

- `.dockleconfig` -> Configuration for the `Dockle` scanner
- `.grype` -> Configuration for the `Anchore` scanner
- `.hadolint` -> Configuration for the `Hadolint` scanner
- `.trivyignore` -> Configuration for the `Trivy` scanner

For more information about the configuration files and available options, see the [vulnerability README](https://github.com/navapbc/template-infra/blob/main/docs/infra/vulnerability-management.md) in the infra repo
122 changes: 109 additions & 13 deletions app/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -2,29 +2,57 @@
# https://hub.docker.com/_/python

# The build stage that will be used to deploy to the various environments
# needs to be called `release` in order to integrate with the repo's
# top-level Makefile
FROM python:3-slim AS release

# Install poetry, the package manager.
# https://python-poetry.org
RUN pip install poetry
# This is what the other stages will call as their base
FROM python:3.11-slim-bullseye AS build
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

# Keep container packages up-to-date.
RUN apt-get update \
&& apt-get install --no-install-recommends --yes \
build-essential \
libpq-dev \
postgresql

# Install poetry, the package manager - https://python-poetry.org
# We use --no-cache-dir here because the pip caches normally, and if
# we don't cache, it will reduce the overall image size
# The version is pinned so that there is a known good configuration
# that we have better control over vs a range that can be used
RUN pip install --no-cache-dir poetry==1.3.1
lorenyu marked this conversation as resolved.
Show resolved Hide resolved

# setup user stuff
ARG RUN_UID
ARG RUN_USER

# The following logic creates the RUN_USER home directory and the directory where
# we will be storing the application in the image. This runs when the user is not root
RUN : "${RUN_USER:?RUN_USER and RUN_UID need to be set and non-empty.}" && \
[ "${RUN_USER}" = "root" ] || \
(useradd -mU --home "/home/${RUN_USER}" --uid ${RUN_UID} "${RUN_USER}" \
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
&& mkdir /app \
&& chown -R ${RUN_UID} "/home/${RUN_USER}" /app)

USER ${RUN_USER}

#-------------------------------------------------------------------------------
# Development build environment
#-------------------------------------------------------------------------------
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
FROM build as dev
lorenyu marked this conversation as resolved.
Show resolved Hide resolved

# Runtime arguement for what user will the bulk of this stage run as
ARG RUN_USER

# In between ARG RUN_USER and USER ${RUN_USER}, the user is still root
# If there is anything that needs to be ran as root, this is the spot

# Change the user from root to ${RUN_USER}
USER ${RUN_USER}

# Set the application working directory.
WORKDIR /srv
WORKDIR /app

COPY pyproject.toml poetry.lock ./
COPY . /app
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there was a performance gain for local docker builds to split up the "COPY dependencies" layer from the "COPY source files layer", should we keep that for the dev stage (and add a comment about it)? I think even if it saves a few seconds it could add up over the course of many local builds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be a slight performance gain since it would be cached, but we wouldn't be updating the pyproject or poetry files super often, since we should be using dependabot to update these files. I can split it back up, but IDK if it will save much time overall

RUN poetry config virtualenvs.in-project false && poetry env use python
RUN poetry install --no-root

# Copy application files.
COPY . /srv
RUN poetry install --no-root --with dev

# Set the host to 0.0.0.0 to make the server available external
# to the Docker container that it's running in.
Expand All @@ -33,5 +61,73 @@ ENV HOST=0.0.0.0
# Install application dependencies.
# https://python-poetry.org/docs/basic-usage/#installing-dependencies
RUN poetry install

# Run the application.
CMD ["poetry", "run", "python", "-m", "api"]

#-------------------------------------------------------------------------------
# Non-development build environment
#-------------------------------------------------------------------------------
FROM build as app-build
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

# Runtime arguement for what user will the bulk of this stage run as
ARG RUN_USER

# In between ARG RUN_USER and USER ${RUN_USER}, the user is still root
# If there is anything that needs to be ran as root, this is the spot

# Change the user from root to ${RUN_USER}
USER ${RUN_USER}

# Set the application working directory.
WORKDIR /app

# Install dependencies
# Use a project-local virtualenv to be easy to find
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
ENV POETRY_VIRTUALENVS_IN_PROJECT true
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
COPY . /app
RUN poetry install --no-root --without dev

# Install application dependencies
# The name of the wheel file comes from pyproject.toml
RUN poetry build --format wheel && poetry run pip install 'dist/template_application_flask-0.1.0-py3-none-any.whl'
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

#-------------------------------------------------------------------------------
# Production Application Image only
#-------------------------------------------------------------------------------
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
# Now copy over just the installed components into the final image
FROM scratch as release

# Runtime arguement for what user will the bulk of this stage run as
ARG RUN_USER

# In between ARG RUN_USER and USER ${RUN_USER}, the user is still root
# If there is anything that needs to be ran as root, this is the spot

# Change the user from root to ${RUN_USER}
USER ${RUN_USER}

# Any required files or directories will need to be explicitly copied to
# the release image
COPY --from=app-build /etc/passwd /etc/group /etc/
COPY --from=app-build /app/.venv /app/.venv

COPY --from=app-build /usr/share/ca-certificates/* /usr/share/ca-certificates/
COPY --from=app-build /usr/local/bin/python* /usr/local/bin/

# Include etc files that link to /usr/local/lib/
COPY --from=app-build /etc/ld.so.conf /etc/ld.so.cache /etc/
COPY --from=app-build /etc/ld.so.conf.d/ /etc/ld.so.conf.d/

# Include required libs
COPY --from=app-build /usr/lib/x86_64-linux-gnu/ /usr/lib/x86_64-linux-gnu/
COPY --from=app-build /lib/* /lib/
COPY --from=app-build /usr/local/lib/ /usr/local/lib/
COPY --from=app-build /usr/local/include/python* /usr/local/include/
COPY --from=app-build /lib64/* /lib64/
COPY --from=app-build --chown=${RUN_USER} /tmp/ /tmp/

ENV PATH="/app/.venv/bin:$PATH"

# run the application
CMD ["run-api"]
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
41 changes: 39 additions & 2 deletions app/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -33,15 +33,37 @@ else
PY_RUN_CMD := docker-compose run $(DOCKER_EXEC_ARGS) --rm $(APP_NAME) poetry run
endif

# Docker user configuration
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
#
# Can be set by adding user=<username> and/ or uid=<id> to make command.
#
# If variables are not set explicitly: try looking up values from current
# environment, otherwise fixed defaults.
#
# uid= defaults to 0 if user= set (which makes sense if user=root, otherwise you
# probably want to set uid as well).
#
# Tested to work consistently on popular Linux flavors and Mac.
ifeq ($(user),)
RUN_USER ?= $(or $(strip $(USER)),nodummy)
RUN_UID ?= $(or $(strip $(shell id -u)),4000)
else
RUN_USER = $(user)
RUN_UID = $(or $(strip $(uid)),0)
endif

export RUN_USER
export RUN_UID

##################################################
# API Build & Run
##################################################

build:
docker-compose build
docker-compose build $(APP_NAME)
lorenyu marked this conversation as resolved.
Show resolved Hide resolved

start:
docker-compose up --detach
docker-compose up --detach $(APP_NAME)
lorenyu marked this conversation as resolved.
Show resolved Hide resolved

run-logs: start
docker-compose logs --follow --no-color $(APP_NAME)
Expand Down Expand Up @@ -177,6 +199,21 @@ lint-security: # https://bandit.readthedocs.io/en/latest/index.html
create-user-csv:
$(PY_RUN_CMD) create-user-csv


##################################################
# Release Management
##################################################

release-build:
docker buildx build \
--target release \
--platform=linux/amd64 \
$(OPTS) \
--build-arg RUN_USER=container \
--build-arg RUN_UID=4000 \
.


##################################################
# Miscellaneous Utilities
##################################################
Expand Down
2 changes: 1 addition & 1 deletion app/local.env
Original file line number Diff line number Diff line change
Expand Up @@ -78,4 +78,4 @@ AWS_DEFAULT_REGION=us-east-1
############################
# Example CSV Generation
############################
USER_ROLE_CSV_OUTPUT_PATH = ./
USER_ROLE_CSV_OUTPUT_PATH=./
Loading