Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dockerfile staging and speed improvements #141

Closed
wants to merge 31 commits into from
Closed
Show file tree
Hide file tree
Changes from 7 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
ccf5f46
Vulnerability scanner configs
Nava-JoshLong Feb 15, 2023
0513bfa
Tweaked Makefile
Nava-JoshLong Feb 15, 2023
e2d8f74
Multi-stage Dockerfile
Nava-JoshLong Feb 15, 2023
b6abc35
Poetry groups and speed up
Nava-JoshLong Feb 15, 2023
3ece7bc
Added more documentation
Nava-JoshLong Feb 15, 2023
e5eda21
Removed unnecessary commands
Nava-JoshLong Feb 15, 2023
0fed0dc
Fixing docker complaint
Nava-JoshLong Feb 15, 2023
f402840
Removed redundant Makefile
Nava-JoshLong Feb 15, 2023
d45548b
Revert poetry script tweak
Nava-JoshLong Feb 15, 2023
091072f
New comments from PR discussion
Nava-JoshLong Feb 15, 2023
1344dde
Updates from comments
Nava-JoshLong Feb 16, 2023
6101c4f
Added newline
Nava-JoshLong Feb 16, 2023
164d396
Cleanup from comments
Nava-JoshLong Feb 16, 2023
3ee6d9d
Removed unneeded file from binary def
Nava-JoshLong Feb 16, 2023
886fea2
Tweaked comments
Nava-JoshLong Feb 16, 2023
cff0654
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Feb 21, 2023
7650c91
Typos and removed unneeded things
Nava-JoshLong Feb 22, 2023
21227bb
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Feb 22, 2023
01b6e49
Update Dockerfile
Nava-JoshLong Feb 22, 2023
834c169
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Feb 27, 2023
f023479
Swap from short flags
Nava-JoshLong Mar 7, 2023
2a9f908
Added strict to zip calls
Nava-JoshLong Mar 7, 2023
11ab3c1
Trying to fix linting and testing errors
Nava-JoshLong Mar 7, 2023
c2b7e96
Roll back changes from last commit
Nava-JoshLong Mar 7, 2023
63f93ac
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Mar 14, 2023
e8981b3
Testing fixes
Nava-JoshLong Mar 14, 2023
a6fa8d9
More testing fixes
Nava-JoshLong Mar 14, 2023
90d3f47
ci-app fixes
Nava-JoshLong Mar 14, 2023
a09fa6b
apispec tweaks
Nava-JoshLong Mar 14, 2023
f76f903
Merge branch 'main' into dockerfile-staging-and-speed-improvements
Nava-JoshLong Mar 21, 2023
0d09b4f
Apispec tweaks for test
Nava-JoshLong Mar 21, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions .dockleconfig
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# This file is allows you to specify a list of files that is acceptable to Dockle
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
# To allow multiple files, use a list of names. example below
# DOCKLE_ACCEPT_FILES=file1,path/to/file2,file3/path,etc
# https://github.com/goodwithtech/dockle#accept-suspicious-environment-variables--files--file-extensions
# The apiflask/settings file is a stub file that apiflask creates, and has no sensitive data in. We are ignoring it since it is unused
DOCKLE_ACCEPT_FILES=app/.venv/lib/python3.11/site-packages/apiflask/settings.py
26 changes: 26 additions & 0 deletions .grype.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# List of vulnerabilities to ignore for the anchore scan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file looks to be the same as the one on template-infra so i think we can remove it, since it'll be copied over from template-infra when the project installs template-infra

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually need to tweak the infra version since it has a python specific recast in it, the certifi line. Should I remove this file and leave it in infra, or keep it and tweak the infra version?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i see. so you're suggesting that we move the python specific thing from the infra repo to this repo? that makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, as we continue to build in this repo, there will probably be other things we need to recast or safelist that are just for this repo, so I think we should keep the config files here since they'd fix those issues

# https://github.com/anchore/grype#specifying-matches-to-ignore
# More info can be found in the docs/infra/vulnerability-management.md file

# Please add safelists in the following format to make it easier when checking
# Package/module name: URL to vulnerability for checking updates
# Versions: URL to the version history
# Dependencies: Name of any other packages or modules that are dependent on this version
# Link to the dependencies for ease of checking for updates
# Issue: Why there is a finding and why this is here or not been removed
# Last checked: Date last checked in scans
# - vulnerability: The-CVE-or-vuln-id # Remove comment at start of line

ignore:
# These settings ignore any findings that fall into these categories
- fix-state: not-fixed
- fix-state: wont-fix
- fix-state: unknown

# Certifi: https://cve.report/CVE-2022-23491
# Versions: https://pypi.org/project/certifi/#history
# Dependencies: None
# Issue: No update in the last month to resolve the finding, however the finding is showing the latest
# version being the same one as the finding, with a 0 - 2022.12.7 (latest) vs 2022.12.07 (non-existent version)
# Last checked: 02-02-2023
- vulnerability: GHSA-43fp-rhv2-5gv8
10 changes: 10 additions & 0 deletions .hadolint.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# List of settings and ignore or safelist findings for the hadolint scanner

# For more information on any settings you can specify, see the actions' documentation here
# https://github.com/hadolint/hadolint#configure
override:
info:
# Casts the apt-get install <package>=<version> finding as info
# We have this set since there is no way to specify version for
# build-essentials in the Dockerfile
- DL3008
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
9 changes: 9 additions & 0 deletions .trivyignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
# List of vulnerabilities to ignore for the trivy scan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm wondering if we can get rid of this file too since it looks to be the same as on template-infra

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we continue to build in this repo, there will probably be other things we need to recast or safelist that are just for this repo, so I think we should keep the config files here since they'd fix those issues

# Please add safelists in the following format to make it easier when checking
# Package/module name: URL to vulnerability for checking updates
# Versions: URL to the version history
# Dependencies: Name of any other packages or modules that are dependent on this version
# Link to the dependencies for ease of checking for updates
# Issue: Why there is a finding and why this is here or not been removed
# Last checked: Date last checked in scans
#The-CVE-or-vuln-id # Remove comment at start of line
49 changes: 49 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
# PROJECT_NAME defaults to name of the current directory.
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
PROJECT_NAME ?= $(notdir $(PWD))

# For now only support a single app in the folder `app/` within the repo
# In the future, support multiple apps, and which app is being operated
# on will be determined by the APP_NAME Makefile argument
APP_NAME ?= app


.PHONY : \
release-build \
release-image-name \
release-image-tag \
help

########################
## Release Management ##
########################

# Include project name in image name so that image name
# does not conflict with other images during local development
IMAGE_NAME := $(PROJECT_NAME)-$(APP_NAME)

GIT_REPO_AVAILABLE := $(shell git rev-parse --is-inside-work-tree 2>/dev/null)

# Generate a unique tag based solely on the git hash.
# This will be the identifier used for deployment via terraform.
ifdef GIT_REPO_AVAILABLE
IMAGE_TAG := $(shell git rev-parse HEAD)
else
IMAGE_TAG := "unknown-dev.$(DATE)"
endif

# Generate an informational tag so we can see where every image comes from.
DATE := $(shell date -u '+%Y%m%d.%H%M%S')
INFO_TAG := $(DATE).$(USER)

release-build:
cd $(APP_NAME) && $(MAKE) release-build \
OPTS="--tag $(IMAGE_NAME):latest --tag $(IMAGE_NAME):$(IMAGE_TAG)"

release-image-name: ## Prints the image name of the release image
@echo $(IMAGE_NAME)

release-image-tag: ## Prints the image tag of the release image
@echo $(IMAGE_TAG)

help: ## Prints the help documentation and info about each command
@grep -E '^[/a-zA-Z0-9_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}'
96 changes: 88 additions & 8 deletions app/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -2,26 +2,47 @@
# https://hub.docker.com/_/python

# The build stage that will be used to deploy to the various environments
# needs to be called `release` in order to integrate with the repo's
# top-level Makefile
FROM python:3-slim AS release

# Install poetry, the package manager.
# https://python-poetry.org
RUN pip install poetry
# This is what the other stages will call as their base
FROM python:3.11-slim-bullseye AS build
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

# Keep container packages up-to-date.
RUN apt-get update \
&& apt-get install --no-install-recommends --yes \
build-essential \
libpq-dev \
postgresql

# Install poetry, the package manager.
# https://python-poetry.org
RUN pip install --no-cache-dir poetry==1.3.1
lorenyu marked this conversation as resolved.
Show resolved Hide resolved

# setup user stuff
ARG RUN_UID
ARG RUN_USER

RUN : "${RUN_USER:?RUN_USER and RUN_UID need to be set and non-empty.}" && \
[ "${RUN_USER}" = "root" ] || \
(useradd -mU --home "/home/${RUN_USER}" --uid ${RUN_UID} "${RUN_USER}" \
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
&& mkdir /app \
&& chown -R ${RUN_UID} "/home/${RUN_USER}" /app)
lorenyu marked this conversation as resolved.
Show resolved Hide resolved

USER ${RUN_USER}

#-------------------------------------------------------------------------------
# Development build environment
#-------------------------------------------------------------------------------
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
FROM build as dev
lorenyu marked this conversation as resolved.
Show resolved Hide resolved

ARG RUN_USER

USER ${RUN_USER}
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

# Set the application working directory.
WORKDIR /srv

COPY pyproject.toml poetry.lock ./
RUN poetry config virtualenvs.in-project false && poetry env use python
RUN poetry install --no-root
RUN poetry install --no-root --with app,dev
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

# Copy application files.
COPY . /srv
Expand All @@ -33,5 +54,64 @@ ENV HOST=0.0.0.0
# Install application dependencies.
# https://python-poetry.org/docs/basic-usage/#installing-dependencies
RUN poetry install

# Run the application.
CMD ["poetry", "run", "python", "-m", "api"]

#-------------------------------------------------------------------------------
# Non-development build environment
#-------------------------------------------------------------------------------
FROM build as app-build
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

ARG RUN_USER
USER ${RUN_USER}

# Set the application working directory.
WORKDIR /app

# install dependencies
# use a project-local virtualenv to be easy to find
ENV POETRY_VIRTUALENVS_IN_PROJECT true
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
COPY pyproject.toml poetry.lock ./
RUN poetry install --no-root --with app --without dev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment for why we're specifically copying pyproject.toml and poetry.lock even though right after this we copy the entire . directory with COPY . .. My understanding is that functionally it's redundant, but that this way we can separate out the dependencies as an initial container layer to improve the cache performance of builds since dependencies don't change as often as the source code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Swapped to just using the COPY . /app since it covered everything

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there a performance gain to having it separate though? My understanding is that with the docker build cache, if the dependencies in pyproject.toml and poetry.lock don't change, Docker could reuse that layer, and only need to rebuild the layer with the source files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a slight gain in having them separate, but I think it would be on the local version only since I'm not sure how much GitHub caches between builds. I know there is a Docker cache action for workflows, but I think it is still in beta right now


COPY . ./
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

# Install application dependencies
# The name of the wheel file comes from pyproject.toml
RUN poetry build --format wheel && poetry run pip install 'dist/template_application_flask-0.1.0-py3-none-any.whl'
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

#-------------------------------------------------------------------------------
# Application only
lorenyu marked this conversation as resolved.
Show resolved Hide resolved
#-------------------------------------------------------------------------------
# now copy over just the installed components into the final image
FROM scratch as release

ARG RUN_USER
USER ${RUN_USER}

COPY --from=app-build /etc/passwd /etc/group /etc/
COPY --from=app-build /app/.venv /app/.venv

# Include GPG for the DOR Import task.
# Ideally this isn't included for every other ECS task/service, but we'll do it for now.
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved

COPY --from=app-build /usr/share/ca-certificates/* /usr/share/ca-certificates/
COPY --from=app-build /usr/local/bin/python* /usr/local/bin/

# Include etc files that link to /usr/local/lib/
COPY --from=app-build /etc/ld.so.conf /etc/ld.so.cache /etc/
COPY --from=app-build /etc/ld.so.conf.d/ /etc/ld.so.conf.d/

# Include required libs
COPY --from=app-build /usr/lib/x86_64-linux-gnu/ /usr/lib/x86_64-linux-gnu/
COPY --from=app-build /lib/* /lib/
COPY --from=app-build /usr/local/lib/ /usr/local/lib/
COPY --from=app-build /usr/local/include/python* /usr/local/include/
COPY --from=app-build /lib64/* /lib64/
COPY --from=app-build --chown=${RUN_USER} /tmp/ /tmp/

ENV PATH="/app/.venv/bin:$PATH"

# run the application
CMD ["run-api"]
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
41 changes: 39 additions & 2 deletions app/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -33,15 +33,37 @@ else
PY_RUN_CMD := docker-compose run $(DOCKER_EXEC_ARGS) --rm $(APP_NAME) poetry run
endif

# Docker user configuration
Nava-JoshLong marked this conversation as resolved.
Show resolved Hide resolved
#
# Can be set by adding user=<username> and/ or uid=<id> to make command.
#
# If variables are not set explicitly: try looking up values from current
# environment, otherwise fixed defaults.
#
# uid= defaults to 0 if user= set (which makes sense if user=root, otherwise you
# probably want to set uid as well).
#
# Tested to work consistently on popular Linux flavors and Mac.
ifeq ($(user),)
RUN_USER ?= $(or $(strip $(USER)),nodummy)
RUN_UID ?= $(or $(strip $(shell id -u)),4000)
else
RUN_USER = $(user)
RUN_UID = $(or $(strip $(uid)),0)
endif

export RUN_USER
export RUN_UID

##################################################
# API Build & Run
##################################################

build:
docker-compose build
docker-compose build $(APP_NAME)
lorenyu marked this conversation as resolved.
Show resolved Hide resolved

start:
docker-compose up --detach
docker-compose up --detach $(APP_NAME)
lorenyu marked this conversation as resolved.
Show resolved Hide resolved

run-logs: start
docker-compose logs --follow --no-color $(APP_NAME)
Expand Down Expand Up @@ -177,6 +199,21 @@ lint-security: # https://bandit.readthedocs.io/en/latest/index.html
create-user-csv:
$(PY_RUN_CMD) create-user-csv


##################################################
# Release Management
##################################################

release-build:
docker buildx build \
--target release \
--platform=linux/amd64 \
$(OPTS) \
--build-arg RUN_USER=container \
--build-arg RUN_UID=4000 \
.


##################################################
# Miscellaneous Utilities
##################################################
Expand Down
2 changes: 1 addition & 1 deletion app/local.env
Original file line number Diff line number Diff line change
Expand Up @@ -78,4 +78,4 @@ AWS_DEFAULT_REGION=us-east-1
############################
# Example CSV Generation
############################
USER_ROLE_CSV_OUTPUT_PATH = ./
USER_ROLE_CSV_OUTPUT_PATH=./
Loading