Slow left join lateral in subqueryPostgreSQL 9.2 (PostGIS) performance problemHow can I speed up a Postgres...
Why don't you get burned by the wood benches in a sauna?
I am a loser when it comes to jobs, what possibilities do I have?
Is there a configuration of the 8-puzzle where locking a tile makes it harder?
Trying to make a 3dplot
What is an efficient way to digitize a family photo collection?
Can I legally make a website about boycotting a certain company?
What does an unprocessed RAW file look like?
70s or 80s B-movie about aliens in a family's television, fry the house cat and trap the son inside the TV
What happens if both players misunderstand the game state until it's too late?
Why is Shelob considered evil?
Are there any rules for handling distractions whilst performing skill checks?
Determinant of 3x3 matrix by cofactor expansion
Are all power cords made equal?
bash aliases do not expand even with shopt expand_aliases
Minimum Viable Product for RTS game?
Does しかたない imply disappointment?
How do I add a strong "onion flavor" to the biryani (in restaurant style)?
Why write a book when there's a movie in my head?
How can guns be countered by melee combat without raw-ability or exceptional explanations?
Is the tritone (A4 / d5) still banned in Roman Catholic music?
Is it possible to detect 100% of SQLi with a simple regex?
How can I handle players killing my NPC outside of combat?
When distributing a Linux kernel driver as source code, what's the difference between Proprietary and GPL license?
How to Build a List from Separate Lists
Slow left join lateral in subquery
PostgreSQL 9.2 (PostGIS) performance problemHow can I speed up a Postgres query containing lots of Joins with an ILIKE conditionpostgres explain plan with giant gaps between operationsSlow fulltext search due to wildly inaccurate row estimatespostgresql 9.2 hash join issueSorting killing my postgresql queryHow to make DISTINCT ON faster in PostgreSQL?PostgreSQL query is slow when return LineString dataWhy is this query with WHERE, ORDER BY and LIMIT so slow?How to index two tables for JOINed query optimisation
I have three tables: units (apartment units) leases, and line items (a better name would have been recurring rents). A unit has many leases and lease has many line items (associated with each rent change). My goal is given a set of a months, list the unit and the most recent rent for that unit.
The following query does exactly what I would hope but is extremely slow (8 seconds) on a relatively small data set (~1500 units, ~5000 leases, ~15000 line items).
with "last_rent" as (
SELECT "line_items".* FROM (SELECT "line_items".*, "leases".unit_id, row_number() over (
partition by unit_id, date_trunc('month', "line_items".start)
order by "line_items".start desc
)
FROM "line_items" INNER JOIN "leases" ON "leases"."id" = "line_items"."lease_id" WHERE "line_items"."name" = 'RNT' AND "leases"."community_id" IN (X)) as "line_items" WHERE "line_items"."row_number" = 1
), "month_series" as (
SELECT id as unit_id, community_id, bedrooms, bathrooms, generate_series( '2019-03-01', '2019-03-01'::date, interval '1 month' ) - interval '1 day' dt
FROM "units" WHERE "units"."community_id" IN (X) ORDER BY "units"."id" ASC
), "unit_rent_month" as (
select unit_id, community_id, bedrooms, bathrooms, amount_cents, date_trunc('month', dt)::date as period
from "month_series"
left join lateral (
select amount_cents
from "last_rent"
where
"last_rent"."unit_id" = "month_series"."unit_id"
and "last_rent"."start" <= "month_series"."dt"
order by unit_id, "last_rent".start desc
limit 1
) last_rent on true
order by dt, unit_id
)
select *
from "unit_rent_month";
My strategy was to isolate the last rent of each month (in case of early move outs or cancellations) by using a window function. I created another table (month_series) which lists all the unit_ids and the months. I tried doing a lateral join between last rent and month series.
The analyze looks like this:
CTE Scan on unit_rent_month (cost=2563655.29..2593175.29 rows=1476000 width=24) (actual time=8019.924..8020.607 rows=1476 loops=1)
Output: unit_rent_month.unit_id, unit_rent_month.community_id, unit_rent_month.bedrooms, unit_rent_month.bathrooms, unit_rent_month.amount_cents, unit_rent_month.period
Buffers: shared hit=990
CTE last_rent
-> Subquery Scan on line_items (cost=1682.58..2163.78 rows=60 width=56) (actual time=44.584..63.921 rows=12010 loops=1)
Output: line_items.id, line_items.name, line_items.lease_id, line_items.amount_cents, line_items.amount_currency, line_items.created_at, line_items.updated_at, line_items.start, line_items.expiration, line_items.unit_id, line_items.row_number
Filter: (line_items.row_number = 1)
Rows Removed by Filter: 63
Buffers: shared hit=269
-> WindowAgg (cost=1682.58..2013.40 rows=12030 width=68) (actual time=44.582..60.093 rows=12073 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration, leases.unit_id, row_number() OVER (?), leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone))
Buffers: shared hit=269
-> Sort (cost=1682.58..1712.65 rows=12030 width=56) (actual time=44.524..45.478 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Sort Key: leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.start DESC
Sort Method: quicksort Memory: 2082kB
Buffers: shared hit=269
-> Hash Join (cost=378.43..867.28 rows=12030 width=56) (actual time=7.626..33.901 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, date_trunc('month'::text, (line_items_1.start)::timestamp with time zone), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Inner Unique: true
Hash Cond: (line_items_1.lease_id = leases.id)
Buffers: shared hit=269
-> Seq Scan on public.line_items line_items_1 (cost=0.00..395.25 rows=12736 width=44) (actual time=0.012..5.813 rows=12736 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration
Filter: ((line_items_1.name)::text = 'RNT'::text)
Rows Removed by Filter: 4884
Buffers: shared hit=175
-> Hash (cost=292.99..292.99 rows=6835 width=8) (actual time=7.537..7.537 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Buckets: 8192 Batches: 1 Memory Usage: 332kB
Buffers: shared hit=94
-> Seq Scan on public.leases (cost=0.00..292.99 rows=6835 width=8) (actual time=0.010..4.954 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Filter: (leases.community_id = ANY ('{X}'::bigint[]))
Rows Removed by Filter: 401
Buffers: shared hit=94
CTE month_series
-> Result (cost=0.28..33371.24 rows=1476000 width=24) (actual time=0.065..11.553 rows=1476 loops=1)
Output: units.id, units.community_id, units.bedrooms, units.bathrooms, ((generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval)) - '1 day'::interval)
Buffers: shared hit=712
-> ProjectSet (cost=0.28..7541.24 rows=1476000 width=24) (actual time=0.062..9.869 rows=1476 loops=1)
Output: generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval), units.id, units.community_id, units.bedrooms, units.bathrooms
Buffers: shared hit=712
-> Index Scan using units_pkey on public.units (cost=0.28..146.48 rows=1476 width=16) (actual time=0.038..4.365 rows=1476 loops=1)
Output: units.id, units.number, units.bedrooms, units.bathrooms, units.square_footage, units.community_id, units.building_id, units.created_at, units.updated_at, units.slug, units.status, units.note, units.half_bathrooms, units.display_number
Filter: (units.community_id = ANY ('{X}'::integer[]))
Rows Removed by Filter: 201
Buffers: shared hit=712
CTE unit_rent_month
-> Sort (cost=2524430.27..2528120.27 rows=1476000 width=32) (actual time=8019.918..8019.998 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, ((date_trunc('month'::text, month_series.dt))::date), month_series.dt
Sort Key: month_series.dt, month_series.unit_id
Sort Method: quicksort Memory: 164kB
Buffers: shared hit=990
-> Nested Loop Left Join (cost=1.51..2302560.00 rows=1476000 width=32) (actual time=78.693..8018.611 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, (date_trunc('month'::text, month_series.dt))::date, month_series.dt
Buffers: shared hit=984
-> CTE Scan on month_series (cost=0.00..29520.00 rows=1476000 width=24) (actual time=0.067..13.332 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, month_series.dt
Buffers: shared hit=712
-> Limit (cost=1.51..1.51 rows=1 width=12) (actual time=5.419..5.420 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Buffers: shared hit=272
-> Sort (cost=1.51..1.51 rows=1 width=12) (actual time=5.418..5.418 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Sort Key: last_rent.start DESC
Sort Method: top-N heapsort Memory: 25kB
Buffers: shared hit=272
-> CTE Scan on last_rent (cost=0.00..1.50 rows=1 width=12) (actual time=3.395..5.411 rows=8 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Filter: ((last_rent.start <= month_series.dt) AND (last_rent.unit_id = month_series.unit_id))
Rows Removed by Filter: 12002
Buffers: shared hit=269
Planning time: 3.219 ms
Execution time: 8021.388 ms
Based on what the analyze is saying, I believe the issue to be the subquery and the lateral join. Would anyone help me figure how to optimize this? Happy to share more information if needed.
postgresql postgresql-performance
New contributor
add a comment |
I have three tables: units (apartment units) leases, and line items (a better name would have been recurring rents). A unit has many leases and lease has many line items (associated with each rent change). My goal is given a set of a months, list the unit and the most recent rent for that unit.
The following query does exactly what I would hope but is extremely slow (8 seconds) on a relatively small data set (~1500 units, ~5000 leases, ~15000 line items).
with "last_rent" as (
SELECT "line_items".* FROM (SELECT "line_items".*, "leases".unit_id, row_number() over (
partition by unit_id, date_trunc('month', "line_items".start)
order by "line_items".start desc
)
FROM "line_items" INNER JOIN "leases" ON "leases"."id" = "line_items"."lease_id" WHERE "line_items"."name" = 'RNT' AND "leases"."community_id" IN (X)) as "line_items" WHERE "line_items"."row_number" = 1
), "month_series" as (
SELECT id as unit_id, community_id, bedrooms, bathrooms, generate_series( '2019-03-01', '2019-03-01'::date, interval '1 month' ) - interval '1 day' dt
FROM "units" WHERE "units"."community_id" IN (X) ORDER BY "units"."id" ASC
), "unit_rent_month" as (
select unit_id, community_id, bedrooms, bathrooms, amount_cents, date_trunc('month', dt)::date as period
from "month_series"
left join lateral (
select amount_cents
from "last_rent"
where
"last_rent"."unit_id" = "month_series"."unit_id"
and "last_rent"."start" <= "month_series"."dt"
order by unit_id, "last_rent".start desc
limit 1
) last_rent on true
order by dt, unit_id
)
select *
from "unit_rent_month";
My strategy was to isolate the last rent of each month (in case of early move outs or cancellations) by using a window function. I created another table (month_series) which lists all the unit_ids and the months. I tried doing a lateral join between last rent and month series.
The analyze looks like this:
CTE Scan on unit_rent_month (cost=2563655.29..2593175.29 rows=1476000 width=24) (actual time=8019.924..8020.607 rows=1476 loops=1)
Output: unit_rent_month.unit_id, unit_rent_month.community_id, unit_rent_month.bedrooms, unit_rent_month.bathrooms, unit_rent_month.amount_cents, unit_rent_month.period
Buffers: shared hit=990
CTE last_rent
-> Subquery Scan on line_items (cost=1682.58..2163.78 rows=60 width=56) (actual time=44.584..63.921 rows=12010 loops=1)
Output: line_items.id, line_items.name, line_items.lease_id, line_items.amount_cents, line_items.amount_currency, line_items.created_at, line_items.updated_at, line_items.start, line_items.expiration, line_items.unit_id, line_items.row_number
Filter: (line_items.row_number = 1)
Rows Removed by Filter: 63
Buffers: shared hit=269
-> WindowAgg (cost=1682.58..2013.40 rows=12030 width=68) (actual time=44.582..60.093 rows=12073 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration, leases.unit_id, row_number() OVER (?), leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone))
Buffers: shared hit=269
-> Sort (cost=1682.58..1712.65 rows=12030 width=56) (actual time=44.524..45.478 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Sort Key: leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.start DESC
Sort Method: quicksort Memory: 2082kB
Buffers: shared hit=269
-> Hash Join (cost=378.43..867.28 rows=12030 width=56) (actual time=7.626..33.901 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, date_trunc('month'::text, (line_items_1.start)::timestamp with time zone), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Inner Unique: true
Hash Cond: (line_items_1.lease_id = leases.id)
Buffers: shared hit=269
-> Seq Scan on public.line_items line_items_1 (cost=0.00..395.25 rows=12736 width=44) (actual time=0.012..5.813 rows=12736 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration
Filter: ((line_items_1.name)::text = 'RNT'::text)
Rows Removed by Filter: 4884
Buffers: shared hit=175
-> Hash (cost=292.99..292.99 rows=6835 width=8) (actual time=7.537..7.537 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Buckets: 8192 Batches: 1 Memory Usage: 332kB
Buffers: shared hit=94
-> Seq Scan on public.leases (cost=0.00..292.99 rows=6835 width=8) (actual time=0.010..4.954 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Filter: (leases.community_id = ANY ('{X}'::bigint[]))
Rows Removed by Filter: 401
Buffers: shared hit=94
CTE month_series
-> Result (cost=0.28..33371.24 rows=1476000 width=24) (actual time=0.065..11.553 rows=1476 loops=1)
Output: units.id, units.community_id, units.bedrooms, units.bathrooms, ((generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval)) - '1 day'::interval)
Buffers: shared hit=712
-> ProjectSet (cost=0.28..7541.24 rows=1476000 width=24) (actual time=0.062..9.869 rows=1476 loops=1)
Output: generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval), units.id, units.community_id, units.bedrooms, units.bathrooms
Buffers: shared hit=712
-> Index Scan using units_pkey on public.units (cost=0.28..146.48 rows=1476 width=16) (actual time=0.038..4.365 rows=1476 loops=1)
Output: units.id, units.number, units.bedrooms, units.bathrooms, units.square_footage, units.community_id, units.building_id, units.created_at, units.updated_at, units.slug, units.status, units.note, units.half_bathrooms, units.display_number
Filter: (units.community_id = ANY ('{X}'::integer[]))
Rows Removed by Filter: 201
Buffers: shared hit=712
CTE unit_rent_month
-> Sort (cost=2524430.27..2528120.27 rows=1476000 width=32) (actual time=8019.918..8019.998 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, ((date_trunc('month'::text, month_series.dt))::date), month_series.dt
Sort Key: month_series.dt, month_series.unit_id
Sort Method: quicksort Memory: 164kB
Buffers: shared hit=990
-> Nested Loop Left Join (cost=1.51..2302560.00 rows=1476000 width=32) (actual time=78.693..8018.611 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, (date_trunc('month'::text, month_series.dt))::date, month_series.dt
Buffers: shared hit=984
-> CTE Scan on month_series (cost=0.00..29520.00 rows=1476000 width=24) (actual time=0.067..13.332 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, month_series.dt
Buffers: shared hit=712
-> Limit (cost=1.51..1.51 rows=1 width=12) (actual time=5.419..5.420 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Buffers: shared hit=272
-> Sort (cost=1.51..1.51 rows=1 width=12) (actual time=5.418..5.418 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Sort Key: last_rent.start DESC
Sort Method: top-N heapsort Memory: 25kB
Buffers: shared hit=272
-> CTE Scan on last_rent (cost=0.00..1.50 rows=1 width=12) (actual time=3.395..5.411 rows=8 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Filter: ((last_rent.start <= month_series.dt) AND (last_rent.unit_id = month_series.unit_id))
Rows Removed by Filter: 12002
Buffers: shared hit=269
Planning time: 3.219 ms
Execution time: 8021.388 ms
Based on what the analyze is saying, I believe the issue to be the subquery and the lateral join. Would anyone help me figure how to optimize this? Happy to share more information if needed.
postgresql postgresql-performance
New contributor
add a comment |
I have three tables: units (apartment units) leases, and line items (a better name would have been recurring rents). A unit has many leases and lease has many line items (associated with each rent change). My goal is given a set of a months, list the unit and the most recent rent for that unit.
The following query does exactly what I would hope but is extremely slow (8 seconds) on a relatively small data set (~1500 units, ~5000 leases, ~15000 line items).
with "last_rent" as (
SELECT "line_items".* FROM (SELECT "line_items".*, "leases".unit_id, row_number() over (
partition by unit_id, date_trunc('month', "line_items".start)
order by "line_items".start desc
)
FROM "line_items" INNER JOIN "leases" ON "leases"."id" = "line_items"."lease_id" WHERE "line_items"."name" = 'RNT' AND "leases"."community_id" IN (X)) as "line_items" WHERE "line_items"."row_number" = 1
), "month_series" as (
SELECT id as unit_id, community_id, bedrooms, bathrooms, generate_series( '2019-03-01', '2019-03-01'::date, interval '1 month' ) - interval '1 day' dt
FROM "units" WHERE "units"."community_id" IN (X) ORDER BY "units"."id" ASC
), "unit_rent_month" as (
select unit_id, community_id, bedrooms, bathrooms, amount_cents, date_trunc('month', dt)::date as period
from "month_series"
left join lateral (
select amount_cents
from "last_rent"
where
"last_rent"."unit_id" = "month_series"."unit_id"
and "last_rent"."start" <= "month_series"."dt"
order by unit_id, "last_rent".start desc
limit 1
) last_rent on true
order by dt, unit_id
)
select *
from "unit_rent_month";
My strategy was to isolate the last rent of each month (in case of early move outs or cancellations) by using a window function. I created another table (month_series) which lists all the unit_ids and the months. I tried doing a lateral join between last rent and month series.
The analyze looks like this:
CTE Scan on unit_rent_month (cost=2563655.29..2593175.29 rows=1476000 width=24) (actual time=8019.924..8020.607 rows=1476 loops=1)
Output: unit_rent_month.unit_id, unit_rent_month.community_id, unit_rent_month.bedrooms, unit_rent_month.bathrooms, unit_rent_month.amount_cents, unit_rent_month.period
Buffers: shared hit=990
CTE last_rent
-> Subquery Scan on line_items (cost=1682.58..2163.78 rows=60 width=56) (actual time=44.584..63.921 rows=12010 loops=1)
Output: line_items.id, line_items.name, line_items.lease_id, line_items.amount_cents, line_items.amount_currency, line_items.created_at, line_items.updated_at, line_items.start, line_items.expiration, line_items.unit_id, line_items.row_number
Filter: (line_items.row_number = 1)
Rows Removed by Filter: 63
Buffers: shared hit=269
-> WindowAgg (cost=1682.58..2013.40 rows=12030 width=68) (actual time=44.582..60.093 rows=12073 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration, leases.unit_id, row_number() OVER (?), leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone))
Buffers: shared hit=269
-> Sort (cost=1682.58..1712.65 rows=12030 width=56) (actual time=44.524..45.478 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Sort Key: leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.start DESC
Sort Method: quicksort Memory: 2082kB
Buffers: shared hit=269
-> Hash Join (cost=378.43..867.28 rows=12030 width=56) (actual time=7.626..33.901 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, date_trunc('month'::text, (line_items_1.start)::timestamp with time zone), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Inner Unique: true
Hash Cond: (line_items_1.lease_id = leases.id)
Buffers: shared hit=269
-> Seq Scan on public.line_items line_items_1 (cost=0.00..395.25 rows=12736 width=44) (actual time=0.012..5.813 rows=12736 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration
Filter: ((line_items_1.name)::text = 'RNT'::text)
Rows Removed by Filter: 4884
Buffers: shared hit=175
-> Hash (cost=292.99..292.99 rows=6835 width=8) (actual time=7.537..7.537 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Buckets: 8192 Batches: 1 Memory Usage: 332kB
Buffers: shared hit=94
-> Seq Scan on public.leases (cost=0.00..292.99 rows=6835 width=8) (actual time=0.010..4.954 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Filter: (leases.community_id = ANY ('{X}'::bigint[]))
Rows Removed by Filter: 401
Buffers: shared hit=94
CTE month_series
-> Result (cost=0.28..33371.24 rows=1476000 width=24) (actual time=0.065..11.553 rows=1476 loops=1)
Output: units.id, units.community_id, units.bedrooms, units.bathrooms, ((generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval)) - '1 day'::interval)
Buffers: shared hit=712
-> ProjectSet (cost=0.28..7541.24 rows=1476000 width=24) (actual time=0.062..9.869 rows=1476 loops=1)
Output: generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval), units.id, units.community_id, units.bedrooms, units.bathrooms
Buffers: shared hit=712
-> Index Scan using units_pkey on public.units (cost=0.28..146.48 rows=1476 width=16) (actual time=0.038..4.365 rows=1476 loops=1)
Output: units.id, units.number, units.bedrooms, units.bathrooms, units.square_footage, units.community_id, units.building_id, units.created_at, units.updated_at, units.slug, units.status, units.note, units.half_bathrooms, units.display_number
Filter: (units.community_id = ANY ('{X}'::integer[]))
Rows Removed by Filter: 201
Buffers: shared hit=712
CTE unit_rent_month
-> Sort (cost=2524430.27..2528120.27 rows=1476000 width=32) (actual time=8019.918..8019.998 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, ((date_trunc('month'::text, month_series.dt))::date), month_series.dt
Sort Key: month_series.dt, month_series.unit_id
Sort Method: quicksort Memory: 164kB
Buffers: shared hit=990
-> Nested Loop Left Join (cost=1.51..2302560.00 rows=1476000 width=32) (actual time=78.693..8018.611 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, (date_trunc('month'::text, month_series.dt))::date, month_series.dt
Buffers: shared hit=984
-> CTE Scan on month_series (cost=0.00..29520.00 rows=1476000 width=24) (actual time=0.067..13.332 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, month_series.dt
Buffers: shared hit=712
-> Limit (cost=1.51..1.51 rows=1 width=12) (actual time=5.419..5.420 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Buffers: shared hit=272
-> Sort (cost=1.51..1.51 rows=1 width=12) (actual time=5.418..5.418 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Sort Key: last_rent.start DESC
Sort Method: top-N heapsort Memory: 25kB
Buffers: shared hit=272
-> CTE Scan on last_rent (cost=0.00..1.50 rows=1 width=12) (actual time=3.395..5.411 rows=8 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Filter: ((last_rent.start <= month_series.dt) AND (last_rent.unit_id = month_series.unit_id))
Rows Removed by Filter: 12002
Buffers: shared hit=269
Planning time: 3.219 ms
Execution time: 8021.388 ms
Based on what the analyze is saying, I believe the issue to be the subquery and the lateral join. Would anyone help me figure how to optimize this? Happy to share more information if needed.
postgresql postgresql-performance
New contributor
I have three tables: units (apartment units) leases, and line items (a better name would have been recurring rents). A unit has many leases and lease has many line items (associated with each rent change). My goal is given a set of a months, list the unit and the most recent rent for that unit.
The following query does exactly what I would hope but is extremely slow (8 seconds) on a relatively small data set (~1500 units, ~5000 leases, ~15000 line items).
with "last_rent" as (
SELECT "line_items".* FROM (SELECT "line_items".*, "leases".unit_id, row_number() over (
partition by unit_id, date_trunc('month', "line_items".start)
order by "line_items".start desc
)
FROM "line_items" INNER JOIN "leases" ON "leases"."id" = "line_items"."lease_id" WHERE "line_items"."name" = 'RNT' AND "leases"."community_id" IN (X)) as "line_items" WHERE "line_items"."row_number" = 1
), "month_series" as (
SELECT id as unit_id, community_id, bedrooms, bathrooms, generate_series( '2019-03-01', '2019-03-01'::date, interval '1 month' ) - interval '1 day' dt
FROM "units" WHERE "units"."community_id" IN (X) ORDER BY "units"."id" ASC
), "unit_rent_month" as (
select unit_id, community_id, bedrooms, bathrooms, amount_cents, date_trunc('month', dt)::date as period
from "month_series"
left join lateral (
select amount_cents
from "last_rent"
where
"last_rent"."unit_id" = "month_series"."unit_id"
and "last_rent"."start" <= "month_series"."dt"
order by unit_id, "last_rent".start desc
limit 1
) last_rent on true
order by dt, unit_id
)
select *
from "unit_rent_month";
My strategy was to isolate the last rent of each month (in case of early move outs or cancellations) by using a window function. I created another table (month_series) which lists all the unit_ids and the months. I tried doing a lateral join between last rent and month series.
The analyze looks like this:
CTE Scan on unit_rent_month (cost=2563655.29..2593175.29 rows=1476000 width=24) (actual time=8019.924..8020.607 rows=1476 loops=1)
Output: unit_rent_month.unit_id, unit_rent_month.community_id, unit_rent_month.bedrooms, unit_rent_month.bathrooms, unit_rent_month.amount_cents, unit_rent_month.period
Buffers: shared hit=990
CTE last_rent
-> Subquery Scan on line_items (cost=1682.58..2163.78 rows=60 width=56) (actual time=44.584..63.921 rows=12010 loops=1)
Output: line_items.id, line_items.name, line_items.lease_id, line_items.amount_cents, line_items.amount_currency, line_items.created_at, line_items.updated_at, line_items.start, line_items.expiration, line_items.unit_id, line_items.row_number
Filter: (line_items.row_number = 1)
Rows Removed by Filter: 63
Buffers: shared hit=269
-> WindowAgg (cost=1682.58..2013.40 rows=12030 width=68) (actual time=44.582..60.093 rows=12073 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration, leases.unit_id, row_number() OVER (?), leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone))
Buffers: shared hit=269
-> Sort (cost=1682.58..1712.65 rows=12030 width=56) (actual time=44.524..45.478 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Sort Key: leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.start DESC
Sort Method: quicksort Memory: 2082kB
Buffers: shared hit=269
-> Hash Join (cost=378.43..867.28 rows=12030 width=56) (actual time=7.626..33.901 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, date_trunc('month'::text, (line_items_1.start)::timestamp with time zone), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Inner Unique: true
Hash Cond: (line_items_1.lease_id = leases.id)
Buffers: shared hit=269
-> Seq Scan on public.line_items line_items_1 (cost=0.00..395.25 rows=12736 width=44) (actual time=0.012..5.813 rows=12736 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration
Filter: ((line_items_1.name)::text = 'RNT'::text)
Rows Removed by Filter: 4884
Buffers: shared hit=175
-> Hash (cost=292.99..292.99 rows=6835 width=8) (actual time=7.537..7.537 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Buckets: 8192 Batches: 1 Memory Usage: 332kB
Buffers: shared hit=94
-> Seq Scan on public.leases (cost=0.00..292.99 rows=6835 width=8) (actual time=0.010..4.954 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Filter: (leases.community_id = ANY ('{X}'::bigint[]))
Rows Removed by Filter: 401
Buffers: shared hit=94
CTE month_series
-> Result (cost=0.28..33371.24 rows=1476000 width=24) (actual time=0.065..11.553 rows=1476 loops=1)
Output: units.id, units.community_id, units.bedrooms, units.bathrooms, ((generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval)) - '1 day'::interval)
Buffers: shared hit=712
-> ProjectSet (cost=0.28..7541.24 rows=1476000 width=24) (actual time=0.062..9.869 rows=1476 loops=1)
Output: generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval), units.id, units.community_id, units.bedrooms, units.bathrooms
Buffers: shared hit=712
-> Index Scan using units_pkey on public.units (cost=0.28..146.48 rows=1476 width=16) (actual time=0.038..4.365 rows=1476 loops=1)
Output: units.id, units.number, units.bedrooms, units.bathrooms, units.square_footage, units.community_id, units.building_id, units.created_at, units.updated_at, units.slug, units.status, units.note, units.half_bathrooms, units.display_number
Filter: (units.community_id = ANY ('{X}'::integer[]))
Rows Removed by Filter: 201
Buffers: shared hit=712
CTE unit_rent_month
-> Sort (cost=2524430.27..2528120.27 rows=1476000 width=32) (actual time=8019.918..8019.998 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, ((date_trunc('month'::text, month_series.dt))::date), month_series.dt
Sort Key: month_series.dt, month_series.unit_id
Sort Method: quicksort Memory: 164kB
Buffers: shared hit=990
-> Nested Loop Left Join (cost=1.51..2302560.00 rows=1476000 width=32) (actual time=78.693..8018.611 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, (date_trunc('month'::text, month_series.dt))::date, month_series.dt
Buffers: shared hit=984
-> CTE Scan on month_series (cost=0.00..29520.00 rows=1476000 width=24) (actual time=0.067..13.332 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, month_series.dt
Buffers: shared hit=712
-> Limit (cost=1.51..1.51 rows=1 width=12) (actual time=5.419..5.420 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Buffers: shared hit=272
-> Sort (cost=1.51..1.51 rows=1 width=12) (actual time=5.418..5.418 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Sort Key: last_rent.start DESC
Sort Method: top-N heapsort Memory: 25kB
Buffers: shared hit=272
-> CTE Scan on last_rent (cost=0.00..1.50 rows=1 width=12) (actual time=3.395..5.411 rows=8 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Filter: ((last_rent.start <= month_series.dt) AND (last_rent.unit_id = month_series.unit_id))
Rows Removed by Filter: 12002
Buffers: shared hit=269
Planning time: 3.219 ms
Execution time: 8021.388 ms
Based on what the analyze is saying, I believe the issue to be the subquery and the lateral join. Would anyone help me figure how to optimize this? Happy to share more information if needed.
postgresql postgresql-performance
postgresql postgresql-performance
New contributor
New contributor
New contributor
asked 5 mins ago
sunnyrjunejasunnyrjuneja
1011
1011
New contributor
New contributor
add a comment |
add a comment |
0
active
oldest
votes
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f230516%2fslow-left-join-lateral-in-subquery%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
0
active
oldest
votes
0
active
oldest
votes
active
oldest
votes
active
oldest
votes
sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.
sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.
sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.
sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f230516%2fslow-left-join-lateral-in-subquery%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown