TL;DR
Standard B-tree indexes cannot optimize partial substring searches (e.g., WHERE name ILIKE '%john%'), forcing PostgreSQL to perform a full sequential table scan. For applications requiring fast fuzzy search or autocomplete across millions of rows without deploying Elasticsearch, the pg_trgm extension is the solution. Impact: Keyword search speed improved by 340x (850ms to 2.5ms) while adding the ability to tolerate spelling mistakes without requiring a heavy Elasticsearch cluster.. This guide walks through the problem, the solution, and how to verify the improvement with real metrics.
The Problem
Your Laravel application has a user directory with 2 million users. Users frequently search for partial names (LIKE '%smith%'). The query takes 800ms because it scans all 2 million rows. You also want to support typos (e.g., smithe finding smith), but ILIKE requires exact substring matches, frustrating your users and slowing down the server.
This is a common scenario in production applications that have been running for months or years. The performance degradation is often gradual, making it hard to notice until it becomes a serious issue affecting users or operational costs.
How It Works
Standard B-tree indexes cannot optimize partial substring searches (e.g., WHERE name ILIKE '%john%'), forcing PostgreSQL to perform a full sequential table scan. For applications requiring fast fuzzy search or autocomplete across millions of rows without deploying Elasticsearch, the pg_trgm extension is the solution. It breaks text into trigrams (3-letter chunks) and uses a GIN or GiST index to quickly find rows matching substrings or calculating spelling similarity.
Understanding the underlying mechanism is key to applying this optimization correctly and knowing when it applies to your specific situation versus when a different approach is needed.
Solution
The following before/after comparison demonstrates the complete solution. Study the comments carefully — they explain the reasoning behind each configuration choice.
Before
-- Slow: Sequential scan required for leading wildcard
EXPLAIN ANALYZE
SELECT * FROM users WHERE name ILIKE '%smit%';
-- Seq Scan on users (actual time=0.03..850.00 rows=520)
-- Filter: ((name)::text ~~* '%smit%'::text)
-- In Laravel:
$users = User::where('name', 'ILIKE', "%{$query}%")->get();
After
-- Fast: pg_trgm extension with GIN index
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- Create a GIN index using the trigram operator class
CREATE INDEX CONCURRENTLY idx_users_name_trgm ON users USING GIN (name gin_trgm_ops);
-- Now ILIKE uses the index instantly!
EXPLAIN ANALYZE
SELECT * FROM users WHERE name ILIKE '%smit%';
-- Bitmap Heap Scan on users (actual time=0.1..2.5 rows=520)
-- -> Bitmap Index Scan on idx_users_name_trgm
-- Bonus: Fuzzy searching for typos using similarity (> 0.3 threshold)
SELECT name, similarity(name, 'smithe') as score
FROM users
WHERE name % 'smithe'
ORDER BY score DESC LIMIT 10;
-- In Laravel (using raw queries for similarity):
$users = User::whereRaw('name % ?', [$query])
->orderByRaw('similarity(name, ?) DESC', [$query])
->limit(10)->get();
Performance Impact
Keyword search speed improved by 340x (850ms to 2.5ms) while adding the ability to tolerate spelling mistakes without requiring a heavy Elasticsearch cluster.
Here are the measured results from applying this optimization in a production environment:
| Metric | Before | After |
|---|---|---|
| Query execution time | 850ms | 2.5ms |
| PostgreSQL Scan Type | Sequential Scan | Bitmap Index Scan |
| Typo Tolerance | None | Full (Trigram Similarity) |
| Infrastructure complexity | Need external search engine | Native PostgreSQL database |
These numbers will vary based on your specific data volume, hardware, and query patterns, but the relative improvement should be consistent. Always measure before and after in your own environment to confirm the impact.
When to Use This
This optimization is most effective when:
- Your application matches the problem scenario described above
- You’ve confirmed the bottleneck with monitoring or profiling tools
- The data volume is large enough that the optimization makes a meaningful difference
It may not be the right fit if your tables are small (under 100K rows), your queries are already fast (under 10ms), or the bottleneck is elsewhere in your stack (application code, network, or client-side rendering).
Key Takeaways
- Measure first: Always profile before optimizing — the bottleneck may not be where you think it is
- Test in staging: Apply the optimization in a staging environment with production-like data before deploying
- Monitor after: Set up dashboards tracking the metrics above so you can verify the improvement and catch regressions
Frequently Asked Questions
Should I use GIN or GiST indexes for pg_trgm?
Use GIN. GIN index lookups are about 3 times faster than GiST for trigram searches. The only downside is that GIN indexes take longer to build and update slightly more slowly on INSERTs. For a typical Laravel app where reads dramatically outnumber writes, GIN is the preferred indexing method for text search.
Can I use pg_trgm instead of Elasticsearch/Meilisearch in Laravel Scout?
Yes! For datasets up to roughly 10-20 million rows, pg_trgm provides excellent performance and completely eliminates the operational overhead of running a separate search cluster. You can even write a custom Laravel Scout engine driver that leverages underlying pg_trgm similarity functions.