Slow left join lateral in subqueryPostgreSQL 9.2 (PostGIS) performance problemHow can I speed up a Postgres...

Why don't you get burned by the wood benches in a sauna?

I am a loser when it comes to jobs, what possibilities do I have?

Is there a configuration of the 8-puzzle where locking a tile makes it harder?

Trying to make a 3dplot

What is an efficient way to digitize a family photo collection?

Can I legally make a website about boycotting a certain company?

What does an unprocessed RAW file look like?

70s or 80s B-movie about aliens in a family's television, fry the house cat and trap the son inside the TV

What happens if both players misunderstand the game state until it's too late?

Why is Shelob considered evil?

Are there any rules for handling distractions whilst performing skill checks?

Determinant of 3x3 matrix by cofactor expansion

Are all power cords made equal?

bash aliases do not expand even with shopt expand_aliases

Minimum Viable Product for RTS game?

Does しかたない imply disappointment?

How do I add a strong "onion flavor" to the biryani (in restaurant style)?

Why write a book when there's a movie in my head?

How can guns be countered by melee combat without raw-ability or exceptional explanations?

Is the tritone (A4 / d5) still banned in Roman Catholic music?

Is it possible to detect 100% of SQLi with a simple regex?

How can I handle players killing my NPC outside of combat?

When distributing a Linux kernel driver as source code, what's the difference between Proprietary and GPL license?

How to Build a List from Separate Lists



Slow left join lateral in subquery


PostgreSQL 9.2 (PostGIS) performance problemHow can I speed up a Postgres query containing lots of Joins with an ILIKE conditionpostgres explain plan with giant gaps between operationsSlow fulltext search due to wildly inaccurate row estimatespostgresql 9.2 hash join issueSorting killing my postgresql queryHow to make DISTINCT ON faster in PostgreSQL?PostgreSQL query is slow when return LineString dataWhy is this query with WHERE, ORDER BY and LIMIT so slow?How to index two tables for JOINed query optimisation













0















I have three tables: units (apartment units) leases, and line items (a better name would have been recurring rents). A unit has many leases and lease has many line items (associated with each rent change). My goal is given a set of a months, list the unit and the most recent rent for that unit.



The following query does exactly what I would hope but is extremely slow (8 seconds) on a relatively small data set (~1500 units, ~5000 leases, ~15000 line items).



with "last_rent" as (
SELECT "line_items".* FROM (SELECT "line_items".*, "leases".unit_id, row_number() over (
partition by unit_id, date_trunc('month', "line_items".start)
order by "line_items".start desc
)
FROM "line_items" INNER JOIN "leases" ON "leases"."id" = "line_items"."lease_id" WHERE "line_items"."name" = 'RNT' AND "leases"."community_id" IN (X)) as "line_items" WHERE "line_items"."row_number" = 1
), "month_series" as (
SELECT id as unit_id, community_id, bedrooms, bathrooms, generate_series( '2019-03-01', '2019-03-01'::date, interval '1 month' ) - interval '1 day' dt
FROM "units" WHERE "units"."community_id" IN (X) ORDER BY "units"."id" ASC
), "unit_rent_month" as (
select unit_id, community_id, bedrooms, bathrooms, amount_cents, date_trunc('month', dt)::date as period
from "month_series"
left join lateral (
select amount_cents
from "last_rent"
where
"last_rent"."unit_id" = "month_series"."unit_id"
and "last_rent"."start" <= "month_series"."dt"
order by unit_id, "last_rent".start desc
limit 1
) last_rent on true
order by dt, unit_id
)
select *
from "unit_rent_month";


My strategy was to isolate the last rent of each month (in case of early move outs or cancellations) by using a window function. I created another table (month_series) which lists all the unit_ids and the months. I tried doing a lateral join between last rent and month series.



The analyze looks like this:



CTE Scan on unit_rent_month  (cost=2563655.29..2593175.29 rows=1476000 width=24) (actual time=8019.924..8020.607 rows=1476 loops=1)
Output: unit_rent_month.unit_id, unit_rent_month.community_id, unit_rent_month.bedrooms, unit_rent_month.bathrooms, unit_rent_month.amount_cents, unit_rent_month.period
Buffers: shared hit=990
CTE last_rent
-> Subquery Scan on line_items (cost=1682.58..2163.78 rows=60 width=56) (actual time=44.584..63.921 rows=12010 loops=1)
Output: line_items.id, line_items.name, line_items.lease_id, line_items.amount_cents, line_items.amount_currency, line_items.created_at, line_items.updated_at, line_items.start, line_items.expiration, line_items.unit_id, line_items.row_number
Filter: (line_items.row_number = 1)
Rows Removed by Filter: 63
Buffers: shared hit=269
-> WindowAgg (cost=1682.58..2013.40 rows=12030 width=68) (actual time=44.582..60.093 rows=12073 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration, leases.unit_id, row_number() OVER (?), leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone))
Buffers: shared hit=269
-> Sort (cost=1682.58..1712.65 rows=12030 width=56) (actual time=44.524..45.478 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Sort Key: leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.start DESC
Sort Method: quicksort Memory: 2082kB
Buffers: shared hit=269
-> Hash Join (cost=378.43..867.28 rows=12030 width=56) (actual time=7.626..33.901 rows=12073 loops=1)
Output: line_items_1.start, leases.unit_id, date_trunc('month'::text, (line_items_1.start)::timestamp with time zone), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
Inner Unique: true
Hash Cond: (line_items_1.lease_id = leases.id)
Buffers: shared hit=269
-> Seq Scan on public.line_items line_items_1 (cost=0.00..395.25 rows=12736 width=44) (actual time=0.012..5.813 rows=12736 loops=1)
Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration
Filter: ((line_items_1.name)::text = 'RNT'::text)
Rows Removed by Filter: 4884
Buffers: shared hit=175
-> Hash (cost=292.99..292.99 rows=6835 width=8) (actual time=7.537..7.537 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Buckets: 8192 Batches: 1 Memory Usage: 332kB
Buffers: shared hit=94
-> Seq Scan on public.leases (cost=0.00..292.99 rows=6835 width=8) (actual time=0.010..4.954 rows=6837 loops=1)
Output: leases.unit_id, leases.id
Filter: (leases.community_id = ANY ('{X}'::bigint[]))
Rows Removed by Filter: 401
Buffers: shared hit=94
CTE month_series
-> Result (cost=0.28..33371.24 rows=1476000 width=24) (actual time=0.065..11.553 rows=1476 loops=1)
Output: units.id, units.community_id, units.bedrooms, units.bathrooms, ((generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval)) - '1 day'::interval)
Buffers: shared hit=712
-> ProjectSet (cost=0.28..7541.24 rows=1476000 width=24) (actual time=0.062..9.869 rows=1476 loops=1)
Output: generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval), units.id, units.community_id, units.bedrooms, units.bathrooms
Buffers: shared hit=712
-> Index Scan using units_pkey on public.units (cost=0.28..146.48 rows=1476 width=16) (actual time=0.038..4.365 rows=1476 loops=1)
Output: units.id, units.number, units.bedrooms, units.bathrooms, units.square_footage, units.community_id, units.building_id, units.created_at, units.updated_at, units.slug, units.status, units.note, units.half_bathrooms, units.display_number
Filter: (units.community_id = ANY ('{X}'::integer[]))
Rows Removed by Filter: 201
Buffers: shared hit=712
CTE unit_rent_month
-> Sort (cost=2524430.27..2528120.27 rows=1476000 width=32) (actual time=8019.918..8019.998 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, ((date_trunc('month'::text, month_series.dt))::date), month_series.dt
Sort Key: month_series.dt, month_series.unit_id
Sort Method: quicksort Memory: 164kB
Buffers: shared hit=990
-> Nested Loop Left Join (cost=1.51..2302560.00 rows=1476000 width=32) (actual time=78.693..8018.611 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, (date_trunc('month'::text, month_series.dt))::date, month_series.dt
Buffers: shared hit=984
-> CTE Scan on month_series (cost=0.00..29520.00 rows=1476000 width=24) (actual time=0.067..13.332 rows=1476 loops=1)
Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, month_series.dt
Buffers: shared hit=712
-> Limit (cost=1.51..1.51 rows=1 width=12) (actual time=5.419..5.420 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Buffers: shared hit=272
-> Sort (cost=1.51..1.51 rows=1 width=12) (actual time=5.418..5.418 rows=1 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Sort Key: last_rent.start DESC
Sort Method: top-N heapsort Memory: 25kB
Buffers: shared hit=272
-> CTE Scan on last_rent (cost=0.00..1.50 rows=1 width=12) (actual time=3.395..5.411 rows=8 loops=1476)
Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
Filter: ((last_rent.start <= month_series.dt) AND (last_rent.unit_id = month_series.unit_id))
Rows Removed by Filter: 12002
Buffers: shared hit=269
Planning time: 3.219 ms
Execution time: 8021.388 ms


Based on what the analyze is saying, I believe the issue to be the subquery and the lateral join. Would anyone help me figure how to optimize this? Happy to share more information if needed.









share







New contributor




sunnyrjuneja is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

























    0















    I have three tables: units (apartment units) leases, and line items (a better name would have been recurring rents). A unit has many leases and lease has many line items (associated with each rent change). My goal is given a set of a months, list the unit and the most recent rent for that unit.



    The following query does exactly what I would hope but is extremely slow (8 seconds) on a relatively small data set (~1500 units, ~5000 leases, ~15000 line items).



    with "last_rent" as (
    SELECT "line_items".* FROM (SELECT "line_items".*, "leases".unit_id, row_number() over (
    partition by unit_id, date_trunc('month', "line_items".start)
    order by "line_items".start desc
    )
    FROM "line_items" INNER JOIN "leases" ON "leases"."id" = "line_items"."lease_id" WHERE "line_items"."name" = 'RNT' AND "leases"."community_id" IN (X)) as "line_items" WHERE "line_items"."row_number" = 1
    ), "month_series" as (
    SELECT id as unit_id, community_id, bedrooms, bathrooms, generate_series( '2019-03-01', '2019-03-01'::date, interval '1 month' ) - interval '1 day' dt
    FROM "units" WHERE "units"."community_id" IN (X) ORDER BY "units"."id" ASC
    ), "unit_rent_month" as (
    select unit_id, community_id, bedrooms, bathrooms, amount_cents, date_trunc('month', dt)::date as period
    from "month_series"
    left join lateral (
    select amount_cents
    from "last_rent"
    where
    "last_rent"."unit_id" = "month_series"."unit_id"
    and "last_rent"."start" <= "month_series"."dt"
    order by unit_id, "last_rent".start desc
    limit 1
    ) last_rent on true
    order by dt, unit_id
    )
    select *
    from "unit_rent_month";


    My strategy was to isolate the last rent of each month (in case of early move outs or cancellations) by using a window function. I created another table (month_series) which lists all the unit_ids and the months. I tried doing a lateral join between last rent and month series.



    The analyze looks like this:



    CTE Scan on unit_rent_month  (cost=2563655.29..2593175.29 rows=1476000 width=24) (actual time=8019.924..8020.607 rows=1476 loops=1)
    Output: unit_rent_month.unit_id, unit_rent_month.community_id, unit_rent_month.bedrooms, unit_rent_month.bathrooms, unit_rent_month.amount_cents, unit_rent_month.period
    Buffers: shared hit=990
    CTE last_rent
    -> Subquery Scan on line_items (cost=1682.58..2163.78 rows=60 width=56) (actual time=44.584..63.921 rows=12010 loops=1)
    Output: line_items.id, line_items.name, line_items.lease_id, line_items.amount_cents, line_items.amount_currency, line_items.created_at, line_items.updated_at, line_items.start, line_items.expiration, line_items.unit_id, line_items.row_number
    Filter: (line_items.row_number = 1)
    Rows Removed by Filter: 63
    Buffers: shared hit=269
    -> WindowAgg (cost=1682.58..2013.40 rows=12030 width=68) (actual time=44.582..60.093 rows=12073 loops=1)
    Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration, leases.unit_id, row_number() OVER (?), leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone))
    Buffers: shared hit=269
    -> Sort (cost=1682.58..1712.65 rows=12030 width=56) (actual time=44.524..45.478 rows=12073 loops=1)
    Output: line_items_1.start, leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
    Sort Key: leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.start DESC
    Sort Method: quicksort Memory: 2082kB
    Buffers: shared hit=269
    -> Hash Join (cost=378.43..867.28 rows=12030 width=56) (actual time=7.626..33.901 rows=12073 loops=1)
    Output: line_items_1.start, leases.unit_id, date_trunc('month'::text, (line_items_1.start)::timestamp with time zone), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
    Inner Unique: true
    Hash Cond: (line_items_1.lease_id = leases.id)
    Buffers: shared hit=269
    -> Seq Scan on public.line_items line_items_1 (cost=0.00..395.25 rows=12736 width=44) (actual time=0.012..5.813 rows=12736 loops=1)
    Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration
    Filter: ((line_items_1.name)::text = 'RNT'::text)
    Rows Removed by Filter: 4884
    Buffers: shared hit=175
    -> Hash (cost=292.99..292.99 rows=6835 width=8) (actual time=7.537..7.537 rows=6837 loops=1)
    Output: leases.unit_id, leases.id
    Buckets: 8192 Batches: 1 Memory Usage: 332kB
    Buffers: shared hit=94
    -> Seq Scan on public.leases (cost=0.00..292.99 rows=6835 width=8) (actual time=0.010..4.954 rows=6837 loops=1)
    Output: leases.unit_id, leases.id
    Filter: (leases.community_id = ANY ('{X}'::bigint[]))
    Rows Removed by Filter: 401
    Buffers: shared hit=94
    CTE month_series
    -> Result (cost=0.28..33371.24 rows=1476000 width=24) (actual time=0.065..11.553 rows=1476 loops=1)
    Output: units.id, units.community_id, units.bedrooms, units.bathrooms, ((generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval)) - '1 day'::interval)
    Buffers: shared hit=712
    -> ProjectSet (cost=0.28..7541.24 rows=1476000 width=24) (actual time=0.062..9.869 rows=1476 loops=1)
    Output: generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval), units.id, units.community_id, units.bedrooms, units.bathrooms
    Buffers: shared hit=712
    -> Index Scan using units_pkey on public.units (cost=0.28..146.48 rows=1476 width=16) (actual time=0.038..4.365 rows=1476 loops=1)
    Output: units.id, units.number, units.bedrooms, units.bathrooms, units.square_footage, units.community_id, units.building_id, units.created_at, units.updated_at, units.slug, units.status, units.note, units.half_bathrooms, units.display_number
    Filter: (units.community_id = ANY ('{X}'::integer[]))
    Rows Removed by Filter: 201
    Buffers: shared hit=712
    CTE unit_rent_month
    -> Sort (cost=2524430.27..2528120.27 rows=1476000 width=32) (actual time=8019.918..8019.998 rows=1476 loops=1)
    Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, ((date_trunc('month'::text, month_series.dt))::date), month_series.dt
    Sort Key: month_series.dt, month_series.unit_id
    Sort Method: quicksort Memory: 164kB
    Buffers: shared hit=990
    -> Nested Loop Left Join (cost=1.51..2302560.00 rows=1476000 width=32) (actual time=78.693..8018.611 rows=1476 loops=1)
    Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, (date_trunc('month'::text, month_series.dt))::date, month_series.dt
    Buffers: shared hit=984
    -> CTE Scan on month_series (cost=0.00..29520.00 rows=1476000 width=24) (actual time=0.067..13.332 rows=1476 loops=1)
    Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, month_series.dt
    Buffers: shared hit=712
    -> Limit (cost=1.51..1.51 rows=1 width=12) (actual time=5.419..5.420 rows=1 loops=1476)
    Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
    Buffers: shared hit=272
    -> Sort (cost=1.51..1.51 rows=1 width=12) (actual time=5.418..5.418 rows=1 loops=1476)
    Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
    Sort Key: last_rent.start DESC
    Sort Method: top-N heapsort Memory: 25kB
    Buffers: shared hit=272
    -> CTE Scan on last_rent (cost=0.00..1.50 rows=1 width=12) (actual time=3.395..5.411 rows=8 loops=1476)
    Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
    Filter: ((last_rent.start <= month_series.dt) AND (last_rent.unit_id = month_series.unit_id))
    Rows Removed by Filter: 12002
    Buffers: shared hit=269
    Planning time: 3.219 ms
    Execution time: 8021.388 ms


    Based on what the analyze is saying, I believe the issue to be the subquery and the lateral join. Would anyone help me figure how to optimize this? Happy to share more information if needed.









    share







    New contributor




    sunnyrjuneja is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
    Check out our Code of Conduct.























      0












      0








      0








      I have three tables: units (apartment units) leases, and line items (a better name would have been recurring rents). A unit has many leases and lease has many line items (associated with each rent change). My goal is given a set of a months, list the unit and the most recent rent for that unit.



      The following query does exactly what I would hope but is extremely slow (8 seconds) on a relatively small data set (~1500 units, ~5000 leases, ~15000 line items).



      with "last_rent" as (
      SELECT "line_items".* FROM (SELECT "line_items".*, "leases".unit_id, row_number() over (
      partition by unit_id, date_trunc('month', "line_items".start)
      order by "line_items".start desc
      )
      FROM "line_items" INNER JOIN "leases" ON "leases"."id" = "line_items"."lease_id" WHERE "line_items"."name" = 'RNT' AND "leases"."community_id" IN (X)) as "line_items" WHERE "line_items"."row_number" = 1
      ), "month_series" as (
      SELECT id as unit_id, community_id, bedrooms, bathrooms, generate_series( '2019-03-01', '2019-03-01'::date, interval '1 month' ) - interval '1 day' dt
      FROM "units" WHERE "units"."community_id" IN (X) ORDER BY "units"."id" ASC
      ), "unit_rent_month" as (
      select unit_id, community_id, bedrooms, bathrooms, amount_cents, date_trunc('month', dt)::date as period
      from "month_series"
      left join lateral (
      select amount_cents
      from "last_rent"
      where
      "last_rent"."unit_id" = "month_series"."unit_id"
      and "last_rent"."start" <= "month_series"."dt"
      order by unit_id, "last_rent".start desc
      limit 1
      ) last_rent on true
      order by dt, unit_id
      )
      select *
      from "unit_rent_month";


      My strategy was to isolate the last rent of each month (in case of early move outs or cancellations) by using a window function. I created another table (month_series) which lists all the unit_ids and the months. I tried doing a lateral join between last rent and month series.



      The analyze looks like this:



      CTE Scan on unit_rent_month  (cost=2563655.29..2593175.29 rows=1476000 width=24) (actual time=8019.924..8020.607 rows=1476 loops=1)
      Output: unit_rent_month.unit_id, unit_rent_month.community_id, unit_rent_month.bedrooms, unit_rent_month.bathrooms, unit_rent_month.amount_cents, unit_rent_month.period
      Buffers: shared hit=990
      CTE last_rent
      -> Subquery Scan on line_items (cost=1682.58..2163.78 rows=60 width=56) (actual time=44.584..63.921 rows=12010 loops=1)
      Output: line_items.id, line_items.name, line_items.lease_id, line_items.amount_cents, line_items.amount_currency, line_items.created_at, line_items.updated_at, line_items.start, line_items.expiration, line_items.unit_id, line_items.row_number
      Filter: (line_items.row_number = 1)
      Rows Removed by Filter: 63
      Buffers: shared hit=269
      -> WindowAgg (cost=1682.58..2013.40 rows=12030 width=68) (actual time=44.582..60.093 rows=12073 loops=1)
      Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration, leases.unit_id, row_number() OVER (?), leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone))
      Buffers: shared hit=269
      -> Sort (cost=1682.58..1712.65 rows=12030 width=56) (actual time=44.524..45.478 rows=12073 loops=1)
      Output: line_items_1.start, leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
      Sort Key: leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.start DESC
      Sort Method: quicksort Memory: 2082kB
      Buffers: shared hit=269
      -> Hash Join (cost=378.43..867.28 rows=12030 width=56) (actual time=7.626..33.901 rows=12073 loops=1)
      Output: line_items_1.start, leases.unit_id, date_trunc('month'::text, (line_items_1.start)::timestamp with time zone), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
      Inner Unique: true
      Hash Cond: (line_items_1.lease_id = leases.id)
      Buffers: shared hit=269
      -> Seq Scan on public.line_items line_items_1 (cost=0.00..395.25 rows=12736 width=44) (actual time=0.012..5.813 rows=12736 loops=1)
      Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration
      Filter: ((line_items_1.name)::text = 'RNT'::text)
      Rows Removed by Filter: 4884
      Buffers: shared hit=175
      -> Hash (cost=292.99..292.99 rows=6835 width=8) (actual time=7.537..7.537 rows=6837 loops=1)
      Output: leases.unit_id, leases.id
      Buckets: 8192 Batches: 1 Memory Usage: 332kB
      Buffers: shared hit=94
      -> Seq Scan on public.leases (cost=0.00..292.99 rows=6835 width=8) (actual time=0.010..4.954 rows=6837 loops=1)
      Output: leases.unit_id, leases.id
      Filter: (leases.community_id = ANY ('{X}'::bigint[]))
      Rows Removed by Filter: 401
      Buffers: shared hit=94
      CTE month_series
      -> Result (cost=0.28..33371.24 rows=1476000 width=24) (actual time=0.065..11.553 rows=1476 loops=1)
      Output: units.id, units.community_id, units.bedrooms, units.bathrooms, ((generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval)) - '1 day'::interval)
      Buffers: shared hit=712
      -> ProjectSet (cost=0.28..7541.24 rows=1476000 width=24) (actual time=0.062..9.869 rows=1476 loops=1)
      Output: generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval), units.id, units.community_id, units.bedrooms, units.bathrooms
      Buffers: shared hit=712
      -> Index Scan using units_pkey on public.units (cost=0.28..146.48 rows=1476 width=16) (actual time=0.038..4.365 rows=1476 loops=1)
      Output: units.id, units.number, units.bedrooms, units.bathrooms, units.square_footage, units.community_id, units.building_id, units.created_at, units.updated_at, units.slug, units.status, units.note, units.half_bathrooms, units.display_number
      Filter: (units.community_id = ANY ('{X}'::integer[]))
      Rows Removed by Filter: 201
      Buffers: shared hit=712
      CTE unit_rent_month
      -> Sort (cost=2524430.27..2528120.27 rows=1476000 width=32) (actual time=8019.918..8019.998 rows=1476 loops=1)
      Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, ((date_trunc('month'::text, month_series.dt))::date), month_series.dt
      Sort Key: month_series.dt, month_series.unit_id
      Sort Method: quicksort Memory: 164kB
      Buffers: shared hit=990
      -> Nested Loop Left Join (cost=1.51..2302560.00 rows=1476000 width=32) (actual time=78.693..8018.611 rows=1476 loops=1)
      Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, (date_trunc('month'::text, month_series.dt))::date, month_series.dt
      Buffers: shared hit=984
      -> CTE Scan on month_series (cost=0.00..29520.00 rows=1476000 width=24) (actual time=0.067..13.332 rows=1476 loops=1)
      Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, month_series.dt
      Buffers: shared hit=712
      -> Limit (cost=1.51..1.51 rows=1 width=12) (actual time=5.419..5.420 rows=1 loops=1476)
      Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
      Buffers: shared hit=272
      -> Sort (cost=1.51..1.51 rows=1 width=12) (actual time=5.418..5.418 rows=1 loops=1476)
      Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
      Sort Key: last_rent.start DESC
      Sort Method: top-N heapsort Memory: 25kB
      Buffers: shared hit=272
      -> CTE Scan on last_rent (cost=0.00..1.50 rows=1 width=12) (actual time=3.395..5.411 rows=8 loops=1476)
      Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
      Filter: ((last_rent.start <= month_series.dt) AND (last_rent.unit_id = month_series.unit_id))
      Rows Removed by Filter: 12002
      Buffers: shared hit=269
      Planning time: 3.219 ms
      Execution time: 8021.388 ms


      Based on what the analyze is saying, I believe the issue to be the subquery and the lateral join. Would anyone help me figure how to optimize this? Happy to share more information if needed.









      share







      New contributor




      sunnyrjuneja is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.












      I have three tables: units (apartment units) leases, and line items (a better name would have been recurring rents). A unit has many leases and lease has many line items (associated with each rent change). My goal is given a set of a months, list the unit and the most recent rent for that unit.



      The following query does exactly what I would hope but is extremely slow (8 seconds) on a relatively small data set (~1500 units, ~5000 leases, ~15000 line items).



      with "last_rent" as (
      SELECT "line_items".* FROM (SELECT "line_items".*, "leases".unit_id, row_number() over (
      partition by unit_id, date_trunc('month', "line_items".start)
      order by "line_items".start desc
      )
      FROM "line_items" INNER JOIN "leases" ON "leases"."id" = "line_items"."lease_id" WHERE "line_items"."name" = 'RNT' AND "leases"."community_id" IN (X)) as "line_items" WHERE "line_items"."row_number" = 1
      ), "month_series" as (
      SELECT id as unit_id, community_id, bedrooms, bathrooms, generate_series( '2019-03-01', '2019-03-01'::date, interval '1 month' ) - interval '1 day' dt
      FROM "units" WHERE "units"."community_id" IN (X) ORDER BY "units"."id" ASC
      ), "unit_rent_month" as (
      select unit_id, community_id, bedrooms, bathrooms, amount_cents, date_trunc('month', dt)::date as period
      from "month_series"
      left join lateral (
      select amount_cents
      from "last_rent"
      where
      "last_rent"."unit_id" = "month_series"."unit_id"
      and "last_rent"."start" <= "month_series"."dt"
      order by unit_id, "last_rent".start desc
      limit 1
      ) last_rent on true
      order by dt, unit_id
      )
      select *
      from "unit_rent_month";


      My strategy was to isolate the last rent of each month (in case of early move outs or cancellations) by using a window function. I created another table (month_series) which lists all the unit_ids and the months. I tried doing a lateral join between last rent and month series.



      The analyze looks like this:



      CTE Scan on unit_rent_month  (cost=2563655.29..2593175.29 rows=1476000 width=24) (actual time=8019.924..8020.607 rows=1476 loops=1)
      Output: unit_rent_month.unit_id, unit_rent_month.community_id, unit_rent_month.bedrooms, unit_rent_month.bathrooms, unit_rent_month.amount_cents, unit_rent_month.period
      Buffers: shared hit=990
      CTE last_rent
      -> Subquery Scan on line_items (cost=1682.58..2163.78 rows=60 width=56) (actual time=44.584..63.921 rows=12010 loops=1)
      Output: line_items.id, line_items.name, line_items.lease_id, line_items.amount_cents, line_items.amount_currency, line_items.created_at, line_items.updated_at, line_items.start, line_items.expiration, line_items.unit_id, line_items.row_number
      Filter: (line_items.row_number = 1)
      Rows Removed by Filter: 63
      Buffers: shared hit=269
      -> WindowAgg (cost=1682.58..2013.40 rows=12030 width=68) (actual time=44.582..60.093 rows=12073 loops=1)
      Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration, leases.unit_id, row_number() OVER (?), leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone))
      Buffers: shared hit=269
      -> Sort (cost=1682.58..1712.65 rows=12030 width=56) (actual time=44.524..45.478 rows=12073 loops=1)
      Output: line_items_1.start, leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
      Sort Key: leases.unit_id, (date_trunc('month'::text, (line_items_1.start)::timestamp with time zone)), line_items_1.start DESC
      Sort Method: quicksort Memory: 2082kB
      Buffers: shared hit=269
      -> Hash Join (cost=378.43..867.28 rows=12030 width=56) (actual time=7.626..33.901 rows=12073 loops=1)
      Output: line_items_1.start, leases.unit_id, date_trunc('month'::text, (line_items_1.start)::timestamp with time zone), line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.expiration
      Inner Unique: true
      Hash Cond: (line_items_1.lease_id = leases.id)
      Buffers: shared hit=269
      -> Seq Scan on public.line_items line_items_1 (cost=0.00..395.25 rows=12736 width=44) (actual time=0.012..5.813 rows=12736 loops=1)
      Output: line_items_1.id, line_items_1.name, line_items_1.lease_id, line_items_1.amount_cents, line_items_1.amount_currency, line_items_1.created_at, line_items_1.updated_at, line_items_1.start, line_items_1.expiration
      Filter: ((line_items_1.name)::text = 'RNT'::text)
      Rows Removed by Filter: 4884
      Buffers: shared hit=175
      -> Hash (cost=292.99..292.99 rows=6835 width=8) (actual time=7.537..7.537 rows=6837 loops=1)
      Output: leases.unit_id, leases.id
      Buckets: 8192 Batches: 1 Memory Usage: 332kB
      Buffers: shared hit=94
      -> Seq Scan on public.leases (cost=0.00..292.99 rows=6835 width=8) (actual time=0.010..4.954 rows=6837 loops=1)
      Output: leases.unit_id, leases.id
      Filter: (leases.community_id = ANY ('{X}'::bigint[]))
      Rows Removed by Filter: 401
      Buffers: shared hit=94
      CTE month_series
      -> Result (cost=0.28..33371.24 rows=1476000 width=24) (actual time=0.065..11.553 rows=1476 loops=1)
      Output: units.id, units.community_id, units.bedrooms, units.bathrooms, ((generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval)) - '1 day'::interval)
      Buffers: shared hit=712
      -> ProjectSet (cost=0.28..7541.24 rows=1476000 width=24) (actual time=0.062..9.869 rows=1476 loops=1)
      Output: generate_series('2019-03-01 00:00:00-08'::timestamp with time zone, ('2019-03-01'::date)::timestamp with time zone, '1 mon'::interval), units.id, units.community_id, units.bedrooms, units.bathrooms
      Buffers: shared hit=712
      -> Index Scan using units_pkey on public.units (cost=0.28..146.48 rows=1476 width=16) (actual time=0.038..4.365 rows=1476 loops=1)
      Output: units.id, units.number, units.bedrooms, units.bathrooms, units.square_footage, units.community_id, units.building_id, units.created_at, units.updated_at, units.slug, units.status, units.note, units.half_bathrooms, units.display_number
      Filter: (units.community_id = ANY ('{X}'::integer[]))
      Rows Removed by Filter: 201
      Buffers: shared hit=712
      CTE unit_rent_month
      -> Sort (cost=2524430.27..2528120.27 rows=1476000 width=32) (actual time=8019.918..8019.998 rows=1476 loops=1)
      Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, ((date_trunc('month'::text, month_series.dt))::date), month_series.dt
      Sort Key: month_series.dt, month_series.unit_id
      Sort Method: quicksort Memory: 164kB
      Buffers: shared hit=990
      -> Nested Loop Left Join (cost=1.51..2302560.00 rows=1476000 width=32) (actual time=78.693..8018.611 rows=1476 loops=1)
      Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, last_rent.amount_cents, (date_trunc('month'::text, month_series.dt))::date, month_series.dt
      Buffers: shared hit=984
      -> CTE Scan on month_series (cost=0.00..29520.00 rows=1476000 width=24) (actual time=0.067..13.332 rows=1476 loops=1)
      Output: month_series.unit_id, month_series.community_id, month_series.bedrooms, month_series.bathrooms, month_series.dt
      Buffers: shared hit=712
      -> Limit (cost=1.51..1.51 rows=1 width=12) (actual time=5.419..5.420 rows=1 loops=1476)
      Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
      Buffers: shared hit=272
      -> Sort (cost=1.51..1.51 rows=1 width=12) (actual time=5.418..5.418 rows=1 loops=1476)
      Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
      Sort Key: last_rent.start DESC
      Sort Method: top-N heapsort Memory: 25kB
      Buffers: shared hit=272
      -> CTE Scan on last_rent (cost=0.00..1.50 rows=1 width=12) (actual time=3.395..5.411 rows=8 loops=1476)
      Output: last_rent.amount_cents, last_rent.unit_id, last_rent.start
      Filter: ((last_rent.start <= month_series.dt) AND (last_rent.unit_id = month_series.unit_id))
      Rows Removed by Filter: 12002
      Buffers: shared hit=269
      Planning time: 3.219 ms
      Execution time: 8021.388 ms


      Based on what the analyze is saying, I believe the issue to be the subquery and the lateral join. Would anyone help me figure how to optimize this? Happy to share more information if needed.







      postgresql postgresql-performance





      share







      New contributor




      sunnyrjuneja is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.










      share







      New contributor




      sunnyrjuneja is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.








      share



      share






      New contributor




      sunnyrjuneja is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.









      asked 5 mins ago









      sunnyrjunejasunnyrjuneja

      1011




      1011




      New contributor




      sunnyrjuneja is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.





      New contributor





      sunnyrjuneja is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






      sunnyrjuneja is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
      Check out our Code of Conduct.






















          0






          active

          oldest

          votes











          Your Answer








          StackExchange.ready(function() {
          var channelOptions = {
          tags: "".split(" "),
          id: "182"
          };
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function() {
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled) {
          StackExchange.using("snippets", function() {
          createEditor();
          });
          }
          else {
          createEditor();
          }
          });

          function createEditor() {
          StackExchange.prepareEditor({
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader: {
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          },
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          });


          }
          });






          sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.










          draft saved

          draft discarded


















          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f230516%2fslow-left-join-lateral-in-subquery%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown

























          0






          active

          oldest

          votes








          0






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.










          draft saved

          draft discarded


















          sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.













          sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.












          sunnyrjuneja is a new contributor. Be nice, and check out our Code of Conduct.
















          Thanks for contributing an answer to Database Administrators Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid



          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function () {
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f230516%2fslow-left-join-lateral-in-subquery%23new-answer', 'question_page');
          }
          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Discografia di Klaus Schulze Indice Album in studio | Album dal vivo | Singoli | Antologie | Colonne...

          Armoriale delle famiglie italiane (Car) Indice Armi | Bibliografia | Menu di navigazioneBlasone...

          Lupi Siderali Indice Storia | Organizzazione | La Tredicesima Compagnia | Aspetto | Membri Importanti...