Python CLI utility and library for manipulating DuckDB
databases.
In principle a "port" of the excellent sqlite-utils
by Simon Willison. Not sure if the use case is there in DuckDB
-land. Still thinking this through (or if there are similar offerings from others.)
This is in pre-beta and not yet working or released. Please do not use.
- Pipe JSON (or CSV or TSV) directly into a new DuckDB database file, automatically creating a table with the appropriate schema
- Run in-memory SQL queries, including joins, directly against data in CSV, TSV or JSON files and view the results
- Configure DuckDB full-text search against your database tables and run search queries against them, ordered by relevance
- Run transformations against your tables to make schema changes that DuckDB
ALTER TABLE
does not directly support, such as changing the type of a column - Extract columns into separate tables to better normalize your existing data
- Install plugins to add custom SQL functions and additional features
Read more on my blog, in this series of posts on New features in duckdb-utils and other entries tagged duckdb-utils.
pip install duckdb-utils
Or if you use Homebrew for macOS:
brew install duckdb-utils
Now you can do things with the CLI utility like this:
$ duckdb-utils memory cats.csv "select * from t"
[{"id": 1, "age": 4, "name": "Emme"},
{"id": 2, "age": 2, "name": "Pancakes"}]
$ duckdb-utils insert cats.duckdb cats cats.csv --csv
[####################################] 100%
$ duckdb-utils tables cats.duckdb --counts
[{"table": "cats", "count": 2}]
$ duckdb-utils cats.duckdb "select id, name from cats"
[{"id": 1, "name": "Emme"},
{"id": 2, "name": "Pancakes"}]
$ duckdb-utils cats.duckdb "select * from cats" --csv
id,age,name
1,4,Emme
2,2,Pancakes
$ duckdb-utils cats.duckdb "select * from cats" --table
id age name
---- ----- --------
1 4 Emme
2 2 Pancakes
You can import JSON data into a new database table like this:
$ curl https://api.github.com/repos/databooth/duckdb-utils/releases \
| duckdb-utils insert releases.duckdb releases - --pk id
Or for data in a CSV file:
$ duckdb-utils insert cats.duckdb cats cats.csv --csv
duckdb-utils memory
lets you import CSV or JSON data into an in-memory database and run SQL queries against it in a single command:
$ cat cats.csv | duckdb-utils memory - "select name, age from stdin"
See the full CLI documentation for comprehensive coverage of many more commands.
You can also import duckdb_utils
and use it as a Python library like this:
import duckdb_utils
db = duckdb_utils.Database("demo_database.duckdb")
# This line creates a "cats" table if one does not already exist:
db["cats"].insert_all([
{"id": 1, "age": 4, "name": "Emme"},
{"id": 2, "age": 2, "name": "Pancakes"}
], pk="id")
Check out the full library documentation for everything else you can do with the Python library.
- sqlite-utils: A tool for SQLite database TODO