I have a simple count query that can use Index Only Scan, but it still take so long in PostgresQL!
I have a cars table with 2 columns type bigint and active boolean, I also have a multi-column index on those columns
CREATE TABLE cars
(
id BIGSERIAL NOT NULL
CONSTRAINT cars_pkey PRIMARY KEY ,
type BIGINT NOT NULL ,
name VARCHAR(500) NOT NULL ,
active BOOLEAN DEFAULT TRUE NOT NULL,
created_at TIMESTAMP(0) WITH TIME ZONE default NOW(),
updated_at TIMESTAMP(0) WITH TIME ZONE default NOW(),
deleted_at TIMESTAMP(0) WITH TIME ZONE
);
CREATE INDEX cars_type_active_index ON cars(type, active);
I inserted some test data with 950k records, type=1 have 600k records
INSERT INTO cars (type, name) (SELECT 1, 'car-name' FROM generate_series(1,600000));
INSERT INTO cars (type, name) (SELECT 2, 'car-name' FROM generate_series(1,200000));
INSERT INTO cars (type, name) (SELECT 3, 'car-name' FROM generate_series(1,100000));
INSERT INTO cars (type, name) (SELECT 4, 'car-name' FROM generate_series(1,50000));
Let 's run VACUUM ANALYZE and force PostgresQL to use Index Only Scan
VACUUM ANALYSE;
SET enable_seqscan = OFF;
SET enable_bitmapscan = OFF;
OK, I have a simple query on type and active
EXPLAIN (VERBOSE, BUFFERS, ANALYSE)
SELECT count(*)
FROM cars
WHERE type = 1 AND active = true;
Result:
Aggregate (cost=24805.70..24805.71 rows=1 width=0) (actual time=4460.915..4460.918 rows=1 loops=1)
Output: count(*)
Buffers: shared hit=2806
-> Index Only Scan using cars_type_active_index on public.cars (cost=0.42..23304.23 rows=600590 width=0) (actual time=0.051..2257.832 rows=600000 loops=1)
Output: type, active
Index Cond: ((cars.type = 1) AND (cars.active = true))
Filter: cars.active
Heap Fetches: 0
Buffers: shared hit=2806
Planning time: 0.213 ms
Execution time: 4461.002 ms
(11 rows)
Look at the query explain result,
It used
Index Only Scan, with index only scan, depending onvisibilities map, PostgresQL sometime need to fetch Table Heap to check for visibility of the tuple, But I already runVACUUM ANALYZEso you can seeHeap fetch = 0, so reading the index is enough for answer this query.The size of the index is quite small, it can all fit on the Buffer cache (
Buffers: shared hit=2806), PostgresQL does not need to fetch pages from disk.
From there, I can't understand why PostgresQL take that long (4.5s) to answer the query, 1M records is not a big number of records, everything is already cached on memory, and the data on index is visible, it does not need to fetch Heap.
PostgreSQL 9.5.10 on x86_64-pc-linux-gnu, compiled by gcc (Debian 4enter code here.9.2-10) 4.9.2, 64-bit
I tested it on docker 17.09.1-ce, Macbook pro 2015.
I am still new to PostgresQL and trying to map my knowledge with the real cases. Thanks so much,