-
Notifications
You must be signed in to change notification settings - Fork 372
Building indexes
Designing indexes for full-text search queries in PostgreSQL is a delicate art. Here are some tips to help guide developers to find an index definition that works for their application.
Please update this page with any new tips or tricks you might discover.
You must use GIN or GIST indexes for full text search.
Essentially you look in a SQL statement for the operator (% or @@) and grab the operand that isn't the query.
Then you do something like this:
CREATE INDEX my_cool_index ON my_cool_table USING gist (([expression]));
or (for trigram only)
CREATE INDEX my_cool_index ON my_cool_table USING gist (([expression]) pg_trgm_ops);
Trigram seems to underperform the other types.
Alternatively, you can use a trigger to automatically populate a column of type tsvector with the [expression] from above, and then use the :tsvector_column option in pg_search to tell it to use that column instead. Then you'd make a very similar index as the above (option 1) but put the column in instead of [expression]
You'll need enough rows in your table for EXPLAIN ANALYZE to show up using the index, and the index is brittle on the expression, so if you change the pg_search configuration, you'll have to change the index.
Begin by making sure you have a tsvector column in your table:
add_column :posts, :tsv_body, :tsvector
Add an index on the tsv_body column. As of ActiveRecord 4.0, you can add indexes and specify the "type" of index easily:
add_index(:posts, :tsv_body, using: 'gin')
For the triggers, you can use hair_trigger to add database triggers via your migrations:
create_trigger(compatibility: 1).on(:posts).before(:insert, :update) do
"new.tsv_body := to_tsvector('pg_catalog.english', coalesce(new.body,''));"
end
You can also execute plain SQL for adding trigger function in your migration code:
say_with_time("Adding trigger function on posts for updating tsv_body column") do
sql = <<-MIGRATION
CREATE TRIGGER tsvectorupdate BEFORE INSERT OR UPDATE
ON posts FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(tsv_body, 'pg_catalog.simple', body);
MIGRATION
execute(sql)
end
To update tsv_body
column for existing records, you can simply run
Post.find_each(&:touch)
or
execute("UPDATE posts SET id = id")
The latter will be faster since it won't load each post into memory as a Ruby object.
Don't forget to add the column to your pg_search specification, or the DB won't use your index.
using: {
tsearch: {
tsvector_column: "tsv_body"
}
}
To ensure the DB is using your index, copy the search's SELECT
query from your development log and run an EXPLAIN
(or use Rails' explain
function if you prefer). Look for a bitmap scan on your tsv index. If your index doesn't appear in the EXPLAIN
output, it's not being used.