Pagination in SQLLast updated: 2020-08-13

🌳 Pagination will reduce the weight of the page at the client and the query cost on the database.

🐃 In real world, we only need to load 20 or 30 items instead of millions

OFFSET, LIMIT & ORDER BY

One way to achive Pagination in SQL is using OFFSET, LIMIT & ORDER BY


OFFSET => PAGE
LIMIT => SIZE
Eg:
SELECT * FROM public.employee OFFSET 0 LIMIT 50
SELECT * FROM public.employee ORDER BY employee_id DESC OFFSET 0 LIMIT 50

ORDER BY will give an expected result

COUNT(*)

As a part of pagination, often we need to display how many records are available in the table

Eg: 🍇

SELECT COUNT(*) FROM public.employee

Warning

COUNT(*) is so slow on PostgreSQL. It has to calculate the result before it knows how many rows it will contain. COUNT(*) will normally perform a sequential scan of the table, which can be quite expensive.

Mis-Conception

The “*” in SELECT * FROM is expanded to all columns. Consequently, many people think that using count(*) is inefficient and should be written count(id) or count(1) instead.

But the “*” in count(*) is quite different, it just means “row” and is not expanded at all (actually, that is a “zero-argument aggregate”).

Writing count(1) or count(id) are actually slower than count(*), because they have to test if the argument IS NULL or not (count, like most aggregates, ignores NULL arguments).

So there is nothing to be gained by avoiding the “*".

Alternative to Count(*)

Often an approximation is good enough and you don’t need the exact count. In that case you can use the estimate that PostgreSQL uses for query planning:


SELECT reltuples::bigint FROM pg_catalog.pg_class WHERE relname = 'mytable';
SELECT reltuples::bigint AS count FROM pg_class WHERE oid = 'public.mytable'::regclass;

This value is updated by both autovacuum and autoanalyze, so it should never be much more than 10% off.
You can reduce autovacuum_analyze_scale_factor for that table so that autoanalyze runs more often there.

Pagination in a Single Query

Pagination in a Single Query using CTE With clauses

Row without any condition


WITH n_cnt AS (
	SELECT reltuples::bigint AS count FROM pg_class WHERE oid = 'public.employee'::regclass;
)
SELECT *, (SELECT count FROM n_cnt) FROM public.employee OFFSET 0 LIMIT 10

Row with complicate conditions


WITH e_cnt as (
    SELECT count(*) cnt FROM public.employee AS e WHERE e.employer_id = 7 
)
SELECT *, (SELECT cnt FROM e_cnt) FROM public.employee e WHERE e.employer_id = 7 OFFSET 0 LIMIT 10

WITH (clause) CTE query gives better performance than window functionCOUNT(*) OVER()


SELECT *, COUNT(*) OVER() from EMPLOYEE e where e.employer_id = ? OFFSET 0 LIMIT 25

Pagination in ApplicationLast updated: 2020-08-13

🌳 Pagination will reduce the weight of the page at the client and the query cost on the database.

🐃 In real world, we only need to load 20 or 30 items instead of millions

CTE (WITH clauses)

A common table expression (CTE) can be seen as a view that is only valid for a single query

This could also be written as a subquery in FROM, but there are some advantages to using CTEs:

The query becomes more readable.
You can reference the CTE several times in the query, and it will be calculated only once.
You can use data modifying statements in the CTE (typically with a RETURNING clause).

Note

Note that before v12, PostgreSQL always materialized CTEs. That means, the CTE was calculated independently from the containing query. From v12 on, CTEs can be “inlined” into the query, which provides further optimization potential.


WITH RECURSIVE fib AS (
	SELECT 1 AS n,
			1::bigint AS "fib",
			1::bigint AS "fib+1"
	UNION ALL
	SELECT n+1,
			"fib+1",
			"fib" + "fib+1"
	FROM fib
)
SELECT n, "fib" FROM fib
LIMIT 20;

RECURSIVE in CTE

Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY

Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem.

Larger settings might improve performance for vacuuming and for restoring database dumps.

The default value is 64 MB.

autovacuum_work_mem

The vacuum uses the maintenance_work_mem by default, but we can separate it using this parameter.

We can specify the maximum amount of memory to be used by each autovacuum worker here.

wal_sync_method & wal_buffers

The amount of shared memory used for WAL data that has not yet been written to disk.

The default setting is 3% or 1/32 of shared_buffers, but not less than 64kB nor more than the size of one WAL segment, typically 16MB.

Larger settings might improve performance for vacuuming and for restoring database dumps.

The default value is 64 MB.