PostgreSQL Expression Indexes
This article describes how to create expression indexes in PostgreSQL.
In addition to creating indexes on columns, PostgreSQL allows you to create indexes on expressions based on one or more columns in a table, which are called expression indexes.
Note that indexes on expressions are expensive to maintain because PostgreSQL must evaluate the expression on each row as it is inserted or updated, and use the result for the index. Therefore, you should only use indexes on expressions when retrieval speed is more important than insertion and update speed.
PostgreSQL Expression Index Syntax
To create an expression index, follow this syntax:
CREATE INDEX index_name
ON table_name ( (expression) );
Explanation:
- The syntax for creating an expression index is basically the same as creating a normal index, except that the column names are replaced by expressions.
- Expressions are generally enclosed in parentheses. If the expression is just a function call, parentheses can be omitted.
- The
index_name
is the index name. You can skip the index name and PostgreSQL will automatically generate one.
Once an index expression is defined, PostgreSQL will consider the index when the expression defining the index appears in a WHERE
clause or ORDER BY
clause of an SQL statement.
PostgreSQL Expression Index Examples
The following demonstrates PostgreSQL expression indexes with the customer
table from PostgreSQL Sakila sample database.
The following is a partial definition of the customer
table:
Table "public.customer"
Column | Type | Collation | Nullable | Default
-------------+-----------------------------+-----------+----------+-----------------------------------------------
customer_id | integer | | not null | nextval('customer_customer_id_seq'::regclass)
store_id | smallint | | not null |
first_name | character varying(45) | | not null |
last_name | character varying(45) | | not null |
email | character varying(50) | | |
address_id | smallint | | not null |
activebool | boolean | | not null | true
create_date | date | | not null | 'now'::text::date
last_update | timestamp without time zone | | | now()
active | integer | | |
Indexes:
"customer_pkey" PRIMARY KEY, btree (customer_id)
"idx_fk_address_id" btree (address_id)
"idx_fk_store_id" btree (store_id)
"idx_last_name" btree (last_name)
...
There is a B-tree index idx_last_name
defined on the last_name
column. The following EXPLAIN
statement shows that PostgreSQL uses the idx_last_name
index when quering rows with last name Adam
from the customer
table:
EXPLAIN
SELECT * FROM customer
WHERE last_name = 'Adam';
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using idx_last_name on customer (cost=0.28..8.29 rows=1 width=70)
Index Cond: ((last_name)::text = 'Adam'::text)
(2 rows)
However, when looking for customers with adam
lowercase, PostgreSQL cannot make use of the index for the lookup:
EXPLAIN
SELECT * FROM customer
WHERE LOWER(last_name) = 'adam';
QUERY PLAN
----------------------------------------------------------
Seq Scan on customer (cost=0.00..16.98 rows=3 width=70)
Filter: (lower((last_name)::text) = 'adam'::text)
(2 rows)
In order to improve the query efficiency of the LOWER()
function, you can define an index expression like this:
CREATE INDEX ON customer(LOWER(last_name));
Now, look at the execution plan of the above statement again,
EXPLAIN
SELECT * FROM customer
WHERE LOWER(last_name) = 'adam';
QUERY PLAN
---------------------------------------------------------------------------------
Bitmap Heap Scan on customer (cost=4.30..10.83 rows=3 width=70)
Recheck Cond: (lower((last_name)::text) = 'adam'::text)
-> Bitmap Index Scan on customer_lower_idx (cost=0.00..4.30 rows=3 width=0)
Index Cond: (lower((last_name)::text) = 'adam'::text)
(4 rows)
You will notice that the PostgreSQL optimizer uses indexes.
Conclusion
PostgreSQL allows you to create indexes on expressions based on one or more columns in a table, which are called expression indexes.