gizmodata/benchmark-bigquery

Latest commit

22dfd37 · Sep 11, 2024

BigQuery benchmark repo

This repo runs benchmark queries against Google BigQuery and records the details of each query run.

Setup (to run locally)

1. Clone the repo

git clone https://github.com/voltrondata/benchmark-bigquery

2. Set up Python

Create a new Python 3.8+ virtual environment and install the requirements with:

cd benchmark-bigquery

# Create the virtual environment
python3 -m venv ./venv

# Activate the virtual environment
. ./venv/bin/activate

# Upgrade pip, setuptools, and wheel
pip install --upgrade pip setuptools wheel

# Install the benchmark-bigquery package (in editable mode)
pip install --editable .
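
To confirm that the environment matches the "Python 3.8+" requirement before installing, a small check like the one below can help. This is a minimal sketch and not part of the repo; the helper names `meets_minimum` and `in_virtualenv` are hypothetical.

```python
import sys

# Minimum version taken from the "Python 3.8+" requirement in this README.
MIN_VERSION = (3, 8)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def in_virtualenv():
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix points at the base installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("Python version OK:", meets_minimum())
    print("Inside a virtual environment:", in_virtualenv())
```

Run it with the interpreter from the activated virtual environment; both lines should report True if the steps above were followed.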

3. Create a .env file in the root of the repo folder

Create a .env file in the root folder of the repo. The file is git-ignored for security reasons, so it will not be committed.

Sample contents:

export GOOGLE_PROJECT_ID="voltron-data-developers"
export DATASET_ID="voltron-data-developers.tpch_10"
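
This README does not say how the tool reads the .env file. Because the sample uses `export` lines, the file can be sourced from a shell. If you want the same values in your own Python script, the sketch below shows one way to parse that format using only the standard library. The function names `parse_env_file` and `load_env_file` are hypothetical and are not part of the benchmark-bigquery package.

```python
import os
import shlex


def parse_env_file(text):
    """Parse lines such as `export KEY="value"` or `KEY=value` into a dict."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Drop the optional leading `export` keyword.
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment
        # shlex.split removes shell-style quoting around the value.
        parts = shlex.split(value)
        values[key.strip()] = parts[0] if parts else ""
    return values


def load_env_file(path=".env"):
    """Read `path` and copy its variables into os.environ."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env_file(handle.read())
    os.environ.update(values)
    return values
```

After calling `load_env_file()` from the repo root, `os.environ["GOOGLE_PROJECT_ID"]` and `os.environ["DATASET_ID"]` would hold the values from the sample above.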

4. Authenticate with Google Cloud

gcloud auth application-default login

Running the benchmarks (with default settings)

benchmark-bigquery

Note: this creates a file called "benchmark_results.json" in the data directory, containing the query run details.
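
The structure of the JSON output is not documented in this README. The sketch below loads the file and reports only its top-level shape, without assuming any field names. The path `data/benchmark_results.json` assumes the data directory sits under the current working directory, and the helpers `load_results` and `summarize` are hypothetical.

```python
import json
from pathlib import Path


def load_results(path="data/benchmark_results.json"):
    """Load the benchmark output file and return the parsed JSON."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def summarize(results):
    """Describe the top-level shape of the parsed JSON."""
    if isinstance(results, list):
        return f"list with {len(results)} entries"
    if isinstance(results, dict):
        return "object with keys: " + ", ".join(sorted(results))
    return f"scalar of type {type(results).__name__}"


if __name__ == "__main__":
    default_path = Path("data/benchmark_results.json")
    if default_path.exists():
        print(summarize(load_results(default_path)))
    else:
        print(f"{default_path} not found; run the benchmark first.")
```

This is a quick way to see what the file contains before deciding how to process it further.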

To see more options:

benchmark-bigquery --help

Converting the benchmark JSON output data to Excel format

benchmark-bigquery-convert-output-to-excel

Note: this creates an Excel file called "benchmark_results.xlsx" in the data directory, containing the query run details.
