INTPYTHON-611 Make it easier to run patch builds #67

Merged
merged 12 commits into from
May 23, 2025
9 changes: 8 additions & 1 deletion .evergreen/config.yml
@@ -44,7 +44,7 @@ functions:
     - command: subprocess.exec
       type: test
       params:
-        include_expansions_in_env: [DIR]
+        include_expansions_in_env: [DIR, REPO_ORG, REPO_BRANCH]
         working_dir: "src"
         binary: bash
         args: [.evergreen/fetch-repo.sh]
@@ -96,6 +96,13 @@ post:
       working_dir: "src"
       binary: bash
       args: [drivers-evergreen-tools/.evergreen/teardown.sh]
+  - command: subprocess.exec
+    type: setup
+    params:
+      include_expansions_in_env: [DIR, REPO_ORG, REPO_BRANCH]
+      working_dir: "src"
+      binary: bash
+      args: [.evergreen/teardown.sh]

 tasks:
   - name: test-semantic-kernel-python-local
21 changes: 20 additions & 1 deletion .evergreen/fetch-repo.sh
@@ -9,13 +9,32 @@ fi

 cd ${DIR}

+# Allow overrides from the patch build.
+REPO_ORG_OVERRIDE=${REPO_ORG:-}
+REPO_BRANCH_OVERRIDE=${REPO_BRANCH:-}
+
 # Source the configuration.
 set -a
 source config.env
 set +a

+if [ -n "${REPO_ORG_OVERRIDE}" ]; then
+    REPO_ORG="${REPO_ORG_OVERRIDE}"
+fi
+if [ -n "${REPO_BRANCH_OVERRIDE}" ]; then
+    REPO_BRANCH="${REPO_BRANCH_OVERRIDE}"
+fi
+
 rm -rf ${REPO_NAME}
-git clone ${CLONE_URL}
+
+ARGS="https://github.com/${REPO_ORG}/${REPO_NAME}"
+if [ -n "${REPO_BRANCH:-}" ]; then
+    ARGS="-b ${REPO_BRANCH} ${ARGS}"
+fi
+
+echo "Cloning repo $ARGS..."
+git clone --depth=1 ${ARGS}
+echo "Cloning repo $ARGS... done."

 # Apply patches to upstream repo if desired.
 if [ -d "patches" ]; then
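The override handling in this hunk can be exercised in isolation. The sketch below is not part of the PR: `resolve_clone_args`, the temporary config file, and every org, repo, and branch name are hypothetical placeholders. It assumes the same precedence as the script, where values already present in the environment win over those loaded from `config.env`.

```shell
#!/bin/bash
set -eu

# Hypothetical helper (not part of the PR): print the arguments that would be
# passed to `git clone` for the config file given as $1.
resolve_clone_args() {
    local config_file="$1"
    (
        # Capture values supplied by the caller, e.g. from a patch build.
        org_override="${REPO_ORG:-}"
        branch_override="${REPO_BRANCH:-}"

        # Loading the file may overwrite REPO_ORG and REPO_BRANCH.
        . "$config_file"

        # Caller-supplied values take precedence over the file's defaults.
        if [ -n "$org_override" ]; then REPO_ORG="$org_override"; fi
        if [ -n "$branch_override" ]; then REPO_BRANCH="$branch_override"; fi

        args="https://github.com/${REPO_ORG}/${REPO_NAME}"
        if [ -n "${REPO_BRANCH:-}" ]; then
            args="-b ${REPO_BRANCH} ${args}"
        fi
        echo "$args"
    )
}

# Throwaway config file with placeholder values, removed when the script exits.
demo_config="$(mktemp)"
trap 'rm -f "$demo_config"' EXIT
printf 'REPO_NAME=example-repo\nREPO_ORG=upstream-org\n' > "$demo_config"

# No overrides: the defaults from the file are used.
resolve_clone_args "$demo_config"
# prints: https://github.com/upstream-org/example-repo

# Overrides set, as a patch build would do through its parameters.
(REPO_ORG=my-fork; REPO_BRANCH=my-feature; resolve_clone_args "$demo_config")
# prints: -b my-feature https://github.com/my-fork/example-repo
```

Running the logic in a subshell keeps the overrides from leaking into the caller, which is a choice made for this sketch only; the script in the PR assigns the variables directly.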
17 changes: 17 additions & 0 deletions .evergreen/teardown.sh
@@ -0,0 +1,17 @@
+#!/bin/bash
+
+set -eu
+
+OVERRIDES=
+if [ -n "${REPO_ORG:-}" ]; then
+    echo "REPO_ORG=$REPO_ORG"
+    OVERRIDES=1
+fi
+if [ -n "${REPO_BRANCH:-}" ]; then
+    echo "REPO_BRANCH=$REPO_BRANCH"
+    OVERRIDES=1
+fi
+
+if [ -z "${OVERRIDES}" ]; then
+    echo "No overrides"
+fi
16 changes: 12 additions & 4 deletions README.md
@@ -24,7 +24,8 @@ Within each subdirectory you should expect to have:
 - `run.sh` -- A script that should handle any additional library installations and steps for executing the test suite. This script should not populate the Atlas database with any required test data.
 - `config.env` -- A file that defines the following environment variables:
   - `REPO_NAME` -- The name of the AI/ML framework repository that will get cloned
-  - `CLONE_URL` -- The GitHub URL to clone into the specified `DIR`
+  - `REPO_ORG` -- The GitHub org of the repository
+  - `REPO_BRANCH` -- The optional branch to clone
   - `DATABASE` -- The optional database where the Atlas CLI will load your index configs
 - `database/` -- An optional directory used by `.evergreen/scaffold_atlas.py` to populate a MongoDB database with test data. Only provide this if your tests require pre-populated data.
   - `database/{collection}.json` -- An optional JSON file containing one or more MongoDB documents that will be uploaded to `$DATABASE.{collection}` in the local Atlas instance. Only provide this if your tests require pre-populated data.
@@ -117,7 +118,7 @@ Test execution flow is defined in `.evergreen/config.yml`. The test pipeline's c

 **[Functions](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-Configuration-Files#functions)** -- We've defined some common functions that will be used. See the `.evergreen/config.yml` for example cases. The standard procedure is to fetch the repository, provision Atlas as needed, and then execute the tests specified in the `run.sh` script you create. Ensure that the expansions are provided for these functions, otherwise the tests will run improperly and most likely fail.

-- [`fetch repo`](https://github.com/mongodb-labs/ai-ml-pipeline-testing/blob/main/.evergreen/config.yml#L30) -- Clones the library's git repository; make sure to provide the expansion CLONE_URL
+- [`fetch repo`](https://github.com/mongodb-labs/ai-ml-pipeline-testing/blob/main/.evergreen/config.yml#L30) -- Clones the library's git repository; make sure to provide the expansions `REPO_ORG` and `REPO_NAME`, plus the optional `REPO_BRANCH`
 - [`execute tests`](https://github.com/mongodb-labs/ai-ml-pipeline-testing/blob/main/.evergreen/config.yml#L51) -- Uses [subprocess.exec](https://docs.devprod.prod.corp.mongodb.com/evergreen/Project-Configuration/Project-Commands#subprocessexec) to run the provided `run.sh` file. `run.sh` must be within the specified `DIR` path.
 - `fetch source` -- Retrieves the current (`ai-ml-pipeline-testing`) repo
 - `setup atlas cli` -- Sets up the local Atlas deployment
@@ -137,8 +138,7 @@ At the start, we will hopefully add the integration tests themselves.
 The bad news is that the maintainers of the AI/ML packages may take considerable
 time to review and merge our changes. The good news is that we can begin testing
 without pointing to the main branch of the upstream repo.
-The parameter value of the `CLONE_URL` is very flexible.
-We literally just call `git clone $CLONE_URL`.
+We can use `REPO_ORG`, `REPO_NAME`, and an optional `REPO_BRANCH` to define which repo to clone.
 As such, we can point to an arbitrary branch on an arbitrary repo.
 While developing, we encourage developers to point to a feature branch
 on their own fork, and add a TODO with the JIRA ticket to update the URL
@@ -169,3 +169,11 @@ We realized that we could easily get this working without changing the upstream
 simply by applying a git patch file.
 This is a standard practice used by `conda package` maintainers,
 as they often have to build for a broader set of scenarios than the original authors intended.
+
+### Running a patch build of a given PR
+
+Rather than making a new branch and modifying a `config.env` file, you can run a patch build as follows:
+
+```bash
+evergreen patch -p ai-ml-pipeline-testing --param REPO_ORG="<my-org>" --param REPO_BRANCH="<my-branch>" -y "<my-message>"
+```
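For comparison, the `config.env`-based approach that a patch build avoids might look like the sketch below. It is an illustration only: the framework name, org, branch, and database values are placeholders that do not correspond to any directory in this repository, and the ticket reference is deliberately left unfilled.

```shell
# Hypothetical config.env for testing a feature branch on a personal fork.
# TODO(<JIRA-ticket>): point REPO_ORG back at the upstream org and drop
# REPO_BRANCH once the upstream change has been merged.
REPO_NAME=example-framework
REPO_ORG=my-github-user
REPO_BRANCH=my-feature-branch
DATABASE=example_framework_test_db
```

With these values, the clone step shown in `.evergreen/fetch-repo.sh` would resolve to `-b my-feature-branch https://github.com/my-github-user/example-framework`.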
2 changes: 1 addition & 1 deletion chatgpt-retrieval-plugin/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=chatgpt-retrieval-plugin
-CLONE_URL="https://github.com/openai/chatgpt-retrieval-plugin.git"
+REPO_ORG=openai
 DATABASE=chatgpt_retrieval_plugin_test_db
2 changes: 1 addition & 1 deletion docarray/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=docarray
-CLONE_URL="https://github.com/docarray/docarray.git"
+REPO_ORG=docarray
 DATABASE=docarray_test_db
2 changes: 1 addition & 1 deletion haystack-embeddings/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=haystack-core-integrations
-CLONE_URL="https://github.com/deepset-ai/haystack-core-integrations.git"
+REPO_ORG=deepset-ai
 DATABASE=haystack_integration_test
2 changes: 1 addition & 1 deletion haystack-fulltext/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=haystack-core-integrations
-CLONE_URL="https://github.com/deepset-ai/haystack-core-integrations.git"
+REPO_ORG=deepset-ai
 DATABASE=haystack_test
2 changes: 1 addition & 1 deletion langchain-python/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=langchain-mongodb
-CLONE_URL="https://github.com/langchain-ai/langchain-mongodb.git"
+REPO_ORG=langchain-ai
 DATABASE=langchain_test_db
2 changes: 1 addition & 1 deletion langchaingo-golang/config.env
@@ -1,2 +1,2 @@
 REPO_NAME=langchaingo
-CLONE_URL="https://github.com/tmc/langchaingo.git"
+REPO_ORG=tmc
2 changes: 1 addition & 1 deletion langgraph-python/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=langchain-mongodb
-CLONE_URL="https://github.com/langchain-ai/langchain-mongodb.git"
+REPO_ORG=langchain-ai
 DATABASE=langgraph-test
2 changes: 1 addition & 1 deletion llama-index-python-vectorstore/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=llama_index
-CLONE_URL="https://github.com/run-llama/llama_index.git"
+REPO_ORG=run-llama
 DATABASE=llama_index_test_db
2 changes: 1 addition & 1 deletion pymongo-voyageai/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=pymongo-voyageai
-CLONE_URL="https://github.com/mongodb-labs/pymongo-voyageai.git"
+REPO_ORG=mongodb-labs
 DATABASE="pymongo_voyageai_test_db"
2 changes: 1 addition & 1 deletion semantic-kernel-csharp/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=semantic-kernel
-CLONE_URL="https://github.com/microsoft/semantic-kernel.git"
+REPO_ORG=microsoft
 DATABASE=dotnetMSKNearestTest
2 changes: 1 addition & 1 deletion semantic-kernel-python/config.env
@@ -1,3 +1,3 @@
 REPO_NAME=semantic-kernel
-CLONE_URL="https://github.com/microsoft/semantic-kernel.git"
+REPO_ORG=microsoft
 DATABASE=pyMSKTest