PostgreSQL Full-Text Search: Examples and Best Practices
PostgreSQL Full-Text Search: A Comprehensive Guide
Learn how to implement full-text search in PostgreSQL using tsvector and tsquery. Optimize search queries with advanced indexing and configuration tips.
What is PostgreSQL Full-Text Search?
Full-text search in PostgreSQL enables querying text data with advanced linguistic matching, handling typos, stemming, and synonyms. It's a powerful tool for search-heavy applications like blogs, e-commerce, or forums..
1. Setup for Full-Text Search
To use full-text search, you need:
- Text Processing: Convert text to a tsvector (text search vector).
- Query Input: Use a tsquery for searches.
- Indexing: Create GIN or GiST indexes to optimize search performance.
2. Example: Simple Full-Text Search
Consider a documents table:
id | content |
---|---|
1 | PostgreSQL is amazing! |
2 | Learn PostgreSQL search. |
Insert some data:
Code:
INSERT INTO documents (content)
VALUES ('PostgreSQL is amazing!'),
('Learn PostgreSQL search.');
Query using to_tsvector and to_tsquery:
Code:
SELECT *
FROM documents
WHERE to_tsvector('english', content) @@ to_tsquery('PostgreSQL');
This finds all rows containing the word "PostgreSQL".
3. Using Indexes for Performance
To improve query performance:
Code:
CREATE INDEX idx_content_tsvector
ON documents USING GIN (to_tsvector('english', content));
This index ensures faster searches on the content column.
4. Handling Stemming and Stop Words
PostgreSQL's full-text search automatically reduces words to their stems (e.g., "running" → "run") and ignores common words (e.g., "is", "the").
To customize this:
- Use dictionaries and configurations.
- Override default stop words or stemming.
5. Search with Rankings
Order results based on relevance using ts_rank:
Code:
SELECT id, ts_rank(to_tsvector('english', content), to_tsquery('PostgreSQL')) AS rank
FROM documents
WHERE to_tsvector('english', content) @@ to_tsquery('PostgreSQL')
ORDER BY rank DESC;
6. Advanced Search: Phrases and Boolean Logic
You can use:
- Phrase Search: SELECT * WHERE content @@ to_tsquery('PostgreSQL <-> search');
- AND/OR Logic: Combine terms using | (OR) and & (AND).
Best Practices
- Normalize Data: Store precomputed tsvector columns for faster indexing.
- Monitor Performance: Use EXPLAIN to analyze query plans.
- Test Dictionaries: Experiment with dictionaries for better language-specific results.
All PostgreSQL Questions, Answers, and Code Snippets Collection.
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics