Can indices save a giant `WHERE foo IN (…)` query with thousands of entries in the query list?Optimizing...
Coworker asking me to not bring cakes due to self control issue. What should I do?
How to write Muḥammad ibn Mūsā al-Khwārizmī?
Was there a pre-determined arrangment for division of Germany in case it surrendered before any Soviet forces entered its territory?
How can I put a period right after the algorithm's number in the algorithm's title?
Can you say "leftside right"?
What could cause an entire planet of humans to become aphasic?
Using Ansible, how can I take actions on each file in a specific location?
What is wrong with my use of "find -print0"?
How to deal with an underperforming subordinate?
What if I miss a connection and don't have money to book next flight?
Why is it that Bernie Sanders is always called a "socialist"?
I am a giant among ants
How do I add a strong "onion flavor" to the biryani (in restaurant style)?
Promise.all returning empty objects
Including proofs of known theorems in master's thesis
Minimum Viable Product for RTS game?
What does an unprocessed RAW file look like?
How long can the stop in a stop-and-go be?
Sing Baby Shark
Why do single electrical receptacles exist?
Probability X1 ≥ X2
How can I prevent an oracle who can see into the past from knowing everything that has happened?
Is there a way to pause a running process on Linux systems and resume later?
Can I use a single resistor for multiple LED with different +ve sources?
Can indices save a giant `WHERE foo IN (…)` query with thousands of entries in the query list?
Optimizing ORDER BY in a full text search queryHow can I speed up a Postgres query containing lots of Joins with an ILIKE conditionSlow fulltext search due to wildly inaccurate row estimatesHow to index a query with `WHERE field IS NULL`?How can I create a database where entries are compressed files searchable by index?Indices for a join query with datetime rangeDuplicate entries in query with sumHow can I query for terms like “@foo” with Postgres full text search?Scalable way to manage aggregated statistics over time in hierarchy?How to index two tables for JOINed query optimisation
I have some client code hitting my postgres instance with a giant SELECT
statement that is matching against a list of IDs. The query ends up looking like this...
SELECT "id","name","status","regionid" FROM "public"."store" WHERE "regionid" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,$102,$103,$104,$105,$106,$107,$108,$109,$110,$111,$112,$113,$114,$115,$116,$117,$118,$119,$120,$121,$122,$123,$124,$125,$126,$127,$128,$129,$130,$131,$132,$133,$134,$135,$136,$137,$138,$139,$140,$141,$142,$143,$144,$145,$146,$147,$148,$149,$150,$151,$152,$153,$154,$155,$156,$157,$158,$159,$160,$161,$162,$163,$164,$165,$166,$167,$168,$169,$170,$171,$172,$173,$174,$175,$176,$177,$178,$179,$180,$181,$182,$183,$184,$185,$186,$187,$188,$189,$190,$191,$192,$193,$194,$195,$196,$197,$198,$199,$200,$201,$202,$203,$204,$205,$206,$207,$208,$209,$210,$211,$212,$213,$214,$215,$216,$217,$218,$219,$220,$221,$222,$223,$224,$225,$226,$227,$228,$229,$230,$231,$232,$233,$234,$235,$236,$237,$238,$239,$240,$241,$242,$243,$244,$245,$246,$247,$248,$249,$250,$251,$252,$253,$254,$255,$256,$257,$258,$259,$260,$261,$262,$263,$264,$265,$266,$267,$268,$269,$270,$271,$272,$273,$274,$275,$276,$277,$278,$279,$280,$281,$282,$283,$284,$285,$286,$287,$288,$289,$290,$291,$292,$293,$294,$295,$296,$297,$298,$299,$300,$301,$302,$303,$304,$305,$306,$307,$308,$309,$310,$311,$312,$313,$314,$315,$316,$317,$318,$319,$320,$321,$322,$323,$324,$325,$326,$327,$328,$329,$330,$331,$332,$333,$334,$335,$336,$337,$338,$339,$340,$341,$342,$343,$344,$345,$346,$347,$348,$349,$350,$351,$352,$353,$354,$355,$356,$357,$358,$359,$360,$361,$362,$363,$364,$365,$366,$367,$368,$369,$370,$371,$372,$373,$374,$375,$376,$377,$378,$379,$380,$381,$382,$383,$384,$385,$386,$387,$388,$389,$390,$391,$392,$393,$394,$395,$396,$397,$398,$399,$400,$401,$402,$403,$404,$405,$406,$407,$408,$409,$410,$411,$412,$413,$414,$415,$416,$417,$418,$419,$420,$421,$422,$423,$424,$425,$426,$427,$428,$429,$430,$431,$432,$433,$434,$435,$436,$437,$438,$439,$440,$441,$442,$443,$444,$445,$446,$447,$448,$449,$450,$451,$452,$453,$454,$455,$456,$457,$458,$459,$460,$461,$462,$463,$464,$465,$466,$467,$468,$469,$470,$471,$472,$473,$474,$475,$476,$477,$478,$479,$480,$481,$482,$483,$484,$485,$486,$487,$488,$489,$490,$491,$492,$493,$494,$495,$496,$497,$498,$499,$500,$501,$502,$503,$504,$505,$506,$507,$508,$509,$510,$511,$512,$513,$514,$515,$516,$517,$518,$519,$520,$521,$522,$523,$524,$525,$526,$527,$528,$529,$530,$531,$532,$533,$534,$535,$536,$537,$538,$539,$540,$541,$542,$543,$544,$545,$546,$547,$548,$549,$550,$551,$552,$553,$554,$555,$556,$557,$558,$559,$560,$561,$562,$563,$564,$565,$566,$567,$568,$569,$570,$571,$572,$573,$574,$575,$576,$577,$578,$579,$580,$581,$582,$583,$584,$585,$586,$587,$588,$589,$590,$591,$592,$593,$594,$595,$596,$597,$598,$599,$600,$601,$602,$603,$604,$605,$606,$607,$608,$609,$610,$611,$612,$613,$614,$615,$616,$617,$618,$619,$620,$621,$622,$623,$624,$625,$626,$627,$628,$629,$630,$631,$632,$633,$634,$635,$636,$637,$638,$639,$640,$641,$642,$643,$644,$645,$646,$647,$648,$649,$650,$651,$652,$653,$654,$655,$656,$657,$658,$659,$660,$661,$662,$663,$664,$665,$666,$667,$668,$669,$670,$671,$672,$673,$674,$675,$676,$677,$678,$679,$680,$681,$682,$683,$684,$685,$686,$687,$688,$689,$690,$691,$692,$693,$694,$695,$696,$697,$698,$699,$700,$701,$702,$703,$704,$705,$706,$707,$708,$709,$710,$711,$712,$713,$714,$715,$716,$717,$718,$719,$720,$721,$722,$723,$724,$725,$726,$727,$728,$729,$730,$731,$732,$733,$734,$735,$736,$737,$738,$739,$740,$741,$742,$743,$744,$745,$746,$747,$748,$749,$750,$751,$752,$753,$754,$755,$756,$757,$758,$759,$760,$761,$762,$763,$764,$765,$766,$767,$768,$769,$770,$771,$772,$773,$774,$775,$776,$777,$778,$779,$780,$781,$782,$783,$784,$785,$786,$787,$788,$789,$790,$791,$792,$793,$794,$795,$796,$797,$798,$799,$800,$801,$802,$803,$804,$805,$806,$807,$808,$809,$810,$811,$812,$813,$814,$815,$816,$817,$818,$819,$820,$821,$822,$823,$824,$825,$826,$827,$828,$829,$830,$831,$832,$833,$834,$835,$836,$837,$838,$839,$840,$841,$842,$843,$844,$845,$846,$847,$848,$849,$850,$851,$852,$853,$854,$855,$856,$857,$858,$859,$860,$861,$862,$863,$864,$865,$866,$867,$868,$869,$870,$871,$872,$873,$874,$875,$876,$877,$878,$879,$880,$881,$882,$883,$884,$885,$886,$887,$888,$889,$890,$891,$892,$893,$894,$895,$896,$897,$898,$899,$900,$901,$902,$903,$904,$905,$906,$907,$908,$909,$910,$911,$912,$913,$914,$915,$916,$917,$918,$919,$920,$921,$922,$923,$924,$925,$926,$927,$928,$929,$930,$931,$932,$933,$934,$935,$936,$937,$938,$939,$940,$941,$942,$943,$944,$945,$946,$947,$948,$949,$950,$951,$952,$953,$954,$955,$956,$957,$958,$959,$960,$961,$962,$963,$964,$965,$966,$967,$968,$969,$970,$971,$972,$973,$974,$975,$976,$977,$978,$979,$980,$981,$982,$983,$984,$985,$986,$987,$988,$989,$990,$991,$992,$993,$994,$995,$996,$997,$998,$999,$1000,$1001,$1002,$1003,$1004,$1005,$1006,$1007,$1008) ORDER BY "id" [ '00178340-4467-11e6-8a37-219b2acdc346',
'003be4e0-0260-11e6-af27-2d5012c820c0',
'006a6460-db4f-11e5-9c34-e3c0bb24c767',
// snip ]
It seems like this query is especially slow. My initial intuition is to just slap an index on regionid
within the store
table, but I wonder if the database will really continue to scale as the number of regionid
elements that are queried over grows over time.
What is the typical way to scale this over time? Will a simple index on regionid
scalably save me going forwards, or is the WHERE IN
ultimately going to be a crux?
EDIT: Here is the result of an EXPLAIN
query on a shortened dataset:
[
{
"Plan": {
"Startup Cost": 211.01,
"Plans": [
{
"Startup Cost": 34.7,
"Plan Width": 110,
"Plans": [
{
"Startup Cost": 0,
"Plan Width": 0,
"Node Type": "Bitmap Index Scan",
"Index Cond": "(store.regionid = ANY ('{00178340-4467-11e6-8a37-219b2acdc346,003be4e0-0260-11e6-af27-2d5012c820c0,006a6460-db4f-11e5-9c34-e3c0bb24c767,00811b60-4d37-11e6-80df-5b5d32d23050,008d2270-8694-11e6-9802-ed42b1afc98c,014d2350-4d4a-11e6-b004-9d13a9078e9a,01516370-94a3-11e6-8ece-0f437fd4c46e,015654a0-4874-11e6-8a37-219b2acdc346}'::text[]))",
"Plan Rows": 55,
"Parent Relationship": "Outer",
"Index Name": "unique_stores",
"Total Cost": 34.69
}
],
"Recheck Cond": "(store.regionid = ANY ('{00178340-4467-11e6-8a37-219b2acdc346,003be4e0-0260-11e6-af27-2d5012c820c0,006a6460-db4f-11e5-9c34-e3c0bb24c767,00811b60-4d37-11e6-80df-5b5d32d23050,008d2270-8694-11e6-9802-ed42b1afc98c,014d2350-4d4a-11e6-b004-9d13a9078e9a,01516370-94a3-11e6-8ece-0f437fd4c46e,015654a0-4874-11e6-8a37-219b2acdc346}'::text[]))",
"Node Type": "Bitmap Heap Scan",
"Plan Rows": 55,
"Relation Name": "store",
"Alias": "store",
"Output": [
"id",
"name",
"status",
"regionid",
"storeid"
],
"Parent Relationship": "Outer",
"Total Cost": 209.42,
"Schema": "public"
}
],
"Sort Key": [
"store.id",
"store.storeid"
],
"Plan Rows": 55,
"Node Type": "Sort",
"Output": [
"id",
"name",
"status",
"regionid",
"storeid"
],
"Plan Width": 110,
"Total Cost": 211.15
}
}
]
postgresql index
bumped to the homepage by Community♦ 18 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
|
show 5 more comments
I have some client code hitting my postgres instance with a giant SELECT
statement that is matching against a list of IDs. The query ends up looking like this...
SELECT "id","name","status","regionid" FROM "public"."store" WHERE "regionid" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,$102,$103,$104,$105,$106,$107,$108,$109,$110,$111,$112,$113,$114,$115,$116,$117,$118,$119,$120,$121,$122,$123,$124,$125,$126,$127,$128,$129,$130,$131,$132,$133,$134,$135,$136,$137,$138,$139,$140,$141,$142,$143,$144,$145,$146,$147,$148,$149,$150,$151,$152,$153,$154,$155,$156,$157,$158,$159,$160,$161,$162,$163,$164,$165,$166,$167,$168,$169,$170,$171,$172,$173,$174,$175,$176,$177,$178,$179,$180,$181,$182,$183,$184,$185,$186,$187,$188,$189,$190,$191,$192,$193,$194,$195,$196,$197,$198,$199,$200,$201,$202,$203,$204,$205,$206,$207,$208,$209,$210,$211,$212,$213,$214,$215,$216,$217,$218,$219,$220,$221,$222,$223,$224,$225,$226,$227,$228,$229,$230,$231,$232,$233,$234,$235,$236,$237,$238,$239,$240,$241,$242,$243,$244,$245,$246,$247,$248,$249,$250,$251,$252,$253,$254,$255,$256,$257,$258,$259,$260,$261,$262,$263,$264,$265,$266,$267,$268,$269,$270,$271,$272,$273,$274,$275,$276,$277,$278,$279,$280,$281,$282,$283,$284,$285,$286,$287,$288,$289,$290,$291,$292,$293,$294,$295,$296,$297,$298,$299,$300,$301,$302,$303,$304,$305,$306,$307,$308,$309,$310,$311,$312,$313,$314,$315,$316,$317,$318,$319,$320,$321,$322,$323,$324,$325,$326,$327,$328,$329,$330,$331,$332,$333,$334,$335,$336,$337,$338,$339,$340,$341,$342,$343,$344,$345,$346,$347,$348,$349,$350,$351,$352,$353,$354,$355,$356,$357,$358,$359,$360,$361,$362,$363,$364,$365,$366,$367,$368,$369,$370,$371,$372,$373,$374,$375,$376,$377,$378,$379,$380,$381,$382,$383,$384,$385,$386,$387,$388,$389,$390,$391,$392,$393,$394,$395,$396,$397,$398,$399,$400,$401,$402,$403,$404,$405,$406,$407,$408,$409,$410,$411,$412,$413,$414,$415,$416,$417,$418,$419,$420,$421,$422,$423,$424,$425,$426,$427,$428,$429,$430,$431,$432,$433,$434,$435,$436,$437,$438,$439,$440,$441,$442,$443,$444,$445,$446,$447,$448,$449,$450,$451,$452,$453,$454,$455,$456,$457,$458,$459,$460,$461,$462,$463,$464,$465,$466,$467,$468,$469,$470,$471,$472,$473,$474,$475,$476,$477,$478,$479,$480,$481,$482,$483,$484,$485,$486,$487,$488,$489,$490,$491,$492,$493,$494,$495,$496,$497,$498,$499,$500,$501,$502,$503,$504,$505,$506,$507,$508,$509,$510,$511,$512,$513,$514,$515,$516,$517,$518,$519,$520,$521,$522,$523,$524,$525,$526,$527,$528,$529,$530,$531,$532,$533,$534,$535,$536,$537,$538,$539,$540,$541,$542,$543,$544,$545,$546,$547,$548,$549,$550,$551,$552,$553,$554,$555,$556,$557,$558,$559,$560,$561,$562,$563,$564,$565,$566,$567,$568,$569,$570,$571,$572,$573,$574,$575,$576,$577,$578,$579,$580,$581,$582,$583,$584,$585,$586,$587,$588,$589,$590,$591,$592,$593,$594,$595,$596,$597,$598,$599,$600,$601,$602,$603,$604,$605,$606,$607,$608,$609,$610,$611,$612,$613,$614,$615,$616,$617,$618,$619,$620,$621,$622,$623,$624,$625,$626,$627,$628,$629,$630,$631,$632,$633,$634,$635,$636,$637,$638,$639,$640,$641,$642,$643,$644,$645,$646,$647,$648,$649,$650,$651,$652,$653,$654,$655,$656,$657,$658,$659,$660,$661,$662,$663,$664,$665,$666,$667,$668,$669,$670,$671,$672,$673,$674,$675,$676,$677,$678,$679,$680,$681,$682,$683,$684,$685,$686,$687,$688,$689,$690,$691,$692,$693,$694,$695,$696,$697,$698,$699,$700,$701,$702,$703,$704,$705,$706,$707,$708,$709,$710,$711,$712,$713,$714,$715,$716,$717,$718,$719,$720,$721,$722,$723,$724,$725,$726,$727,$728,$729,$730,$731,$732,$733,$734,$735,$736,$737,$738,$739,$740,$741,$742,$743,$744,$745,$746,$747,$748,$749,$750,$751,$752,$753,$754,$755,$756,$757,$758,$759,$760,$761,$762,$763,$764,$765,$766,$767,$768,$769,$770,$771,$772,$773,$774,$775,$776,$777,$778,$779,$780,$781,$782,$783,$784,$785,$786,$787,$788,$789,$790,$791,$792,$793,$794,$795,$796,$797,$798,$799,$800,$801,$802,$803,$804,$805,$806,$807,$808,$809,$810,$811,$812,$813,$814,$815,$816,$817,$818,$819,$820,$821,$822,$823,$824,$825,$826,$827,$828,$829,$830,$831,$832,$833,$834,$835,$836,$837,$838,$839,$840,$841,$842,$843,$844,$845,$846,$847,$848,$849,$850,$851,$852,$853,$854,$855,$856,$857,$858,$859,$860,$861,$862,$863,$864,$865,$866,$867,$868,$869,$870,$871,$872,$873,$874,$875,$876,$877,$878,$879,$880,$881,$882,$883,$884,$885,$886,$887,$888,$889,$890,$891,$892,$893,$894,$895,$896,$897,$898,$899,$900,$901,$902,$903,$904,$905,$906,$907,$908,$909,$910,$911,$912,$913,$914,$915,$916,$917,$918,$919,$920,$921,$922,$923,$924,$925,$926,$927,$928,$929,$930,$931,$932,$933,$934,$935,$936,$937,$938,$939,$940,$941,$942,$943,$944,$945,$946,$947,$948,$949,$950,$951,$952,$953,$954,$955,$956,$957,$958,$959,$960,$961,$962,$963,$964,$965,$966,$967,$968,$969,$970,$971,$972,$973,$974,$975,$976,$977,$978,$979,$980,$981,$982,$983,$984,$985,$986,$987,$988,$989,$990,$991,$992,$993,$994,$995,$996,$997,$998,$999,$1000,$1001,$1002,$1003,$1004,$1005,$1006,$1007,$1008) ORDER BY "id" [ '00178340-4467-11e6-8a37-219b2acdc346',
'003be4e0-0260-11e6-af27-2d5012c820c0',
'006a6460-db4f-11e5-9c34-e3c0bb24c767',
// snip ]
It seems like this query is especially slow. My initial intuition is to just slap an index on regionid
within the store
table, but I wonder if the database will really continue to scale as the number of regionid
elements that are queried over grows over time.
What is the typical way to scale this over time? Will a simple index on regionid
scalably save me going forwards, or is the WHERE IN
ultimately going to be a crux?
EDIT: Here is the result of an EXPLAIN
query on a shortened dataset:
[
{
"Plan": {
"Startup Cost": 211.01,
"Plans": [
{
"Startup Cost": 34.7,
"Plan Width": 110,
"Plans": [
{
"Startup Cost": 0,
"Plan Width": 0,
"Node Type": "Bitmap Index Scan",
"Index Cond": "(store.regionid = ANY ('{00178340-4467-11e6-8a37-219b2acdc346,003be4e0-0260-11e6-af27-2d5012c820c0,006a6460-db4f-11e5-9c34-e3c0bb24c767,00811b60-4d37-11e6-80df-5b5d32d23050,008d2270-8694-11e6-9802-ed42b1afc98c,014d2350-4d4a-11e6-b004-9d13a9078e9a,01516370-94a3-11e6-8ece-0f437fd4c46e,015654a0-4874-11e6-8a37-219b2acdc346}'::text[]))",
"Plan Rows": 55,
"Parent Relationship": "Outer",
"Index Name": "unique_stores",
"Total Cost": 34.69
}
],
"Recheck Cond": "(store.regionid = ANY ('{00178340-4467-11e6-8a37-219b2acdc346,003be4e0-0260-11e6-af27-2d5012c820c0,006a6460-db4f-11e5-9c34-e3c0bb24c767,00811b60-4d37-11e6-80df-5b5d32d23050,008d2270-8694-11e6-9802-ed42b1afc98c,014d2350-4d4a-11e6-b004-9d13a9078e9a,01516370-94a3-11e6-8ece-0f437fd4c46e,015654a0-4874-11e6-8a37-219b2acdc346}'::text[]))",
"Node Type": "Bitmap Heap Scan",
"Plan Rows": 55,
"Relation Name": "store",
"Alias": "store",
"Output": [
"id",
"name",
"status",
"regionid",
"storeid"
],
"Parent Relationship": "Outer",
"Total Cost": 209.42,
"Schema": "public"
}
],
"Sort Key": [
"store.id",
"store.storeid"
],
"Plan Rows": 55,
"Node Type": "Sort",
"Output": [
"id",
"name",
"status",
"regionid",
"storeid"
],
"Plan Width": 110,
"Total Cost": 211.15
}
}
]
postgresql index
bumped to the homepage by Community♦ 18 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
1
How many distinctregionid
values are in thestore
table? If the application is listing out 1000 regions and there are only 1500 regions in total, it's almost certainly going to be faster to do a table scan than to use an index onregionid
. If there are 150,000 different regions, on the other hand, using an index onregionid
would almost certainly be more efficient even if the list grew to a few thousand entries.
– Justin Cave
Nov 2 '16 at 19:17
1
Separate from that, though, long before I got to 1000 elements, I'd seriously think about loading the regions I want to search into a temp table or some other structure so that I wasn't generating thousands of different queries that differed only by the number of bind variables in theIN
list.
– Justin Cave
Nov 2 '16 at 19:17
@JustinCave There is a typical ratio of about 20 stores per region...so it will scale ~20:1. With respect to your second comment, I'll have to do some reading up on that one.
– Jim
Nov 2 '16 at 19:21
For some context (I am still trying to sort this out, myself a bit!), it appears that a client ORM is generating this query through a permissions check. So, a user has permission (enforced at an application level) to access stores from regions with those IDs, and then they are querying for the accessible stores through thisWHERE IN
query (generated with an ORM)
– Jim
Nov 2 '16 at 19:22
That's not the ratio to be concerned with. It doesn't matter whether there are 1, 10, 100, or 1000 stores per region. The important question is what fraction of the rows in the table the query is returning. If there are tens or hundreds of thousands of regions, then an IN list with 1000 regions is only going to return a few percent of the rows in the table and an index would make sense. If the query is going to return most of the rows in the table because the IN list is so long, then an index won't be helpful.
– Justin Cave
Nov 2 '16 at 19:25
|
show 5 more comments
I have some client code hitting my postgres instance with a giant SELECT
statement that is matching against a list of IDs. The query ends up looking like this...
SELECT "id","name","status","regionid" FROM "public"."store" WHERE "regionid" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,$102,$103,$104,$105,$106,$107,$108,$109,$110,$111,$112,$113,$114,$115,$116,$117,$118,$119,$120,$121,$122,$123,$124,$125,$126,$127,$128,$129,$130,$131,$132,$133,$134,$135,$136,$137,$138,$139,$140,$141,$142,$143,$144,$145,$146,$147,$148,$149,$150,$151,$152,$153,$154,$155,$156,$157,$158,$159,$160,$161,$162,$163,$164,$165,$166,$167,$168,$169,$170,$171,$172,$173,$174,$175,$176,$177,$178,$179,$180,$181,$182,$183,$184,$185,$186,$187,$188,$189,$190,$191,$192,$193,$194,$195,$196,$197,$198,$199,$200,$201,$202,$203,$204,$205,$206,$207,$208,$209,$210,$211,$212,$213,$214,$215,$216,$217,$218,$219,$220,$221,$222,$223,$224,$225,$226,$227,$228,$229,$230,$231,$232,$233,$234,$235,$236,$237,$238,$239,$240,$241,$242,$243,$244,$245,$246,$247,$248,$249,$250,$251,$252,$253,$254,$255,$256,$257,$258,$259,$260,$261,$262,$263,$264,$265,$266,$267,$268,$269,$270,$271,$272,$273,$274,$275,$276,$277,$278,$279,$280,$281,$282,$283,$284,$285,$286,$287,$288,$289,$290,$291,$292,$293,$294,$295,$296,$297,$298,$299,$300,$301,$302,$303,$304,$305,$306,$307,$308,$309,$310,$311,$312,$313,$314,$315,$316,$317,$318,$319,$320,$321,$322,$323,$324,$325,$326,$327,$328,$329,$330,$331,$332,$333,$334,$335,$336,$337,$338,$339,$340,$341,$342,$343,$344,$345,$346,$347,$348,$349,$350,$351,$352,$353,$354,$355,$356,$357,$358,$359,$360,$361,$362,$363,$364,$365,$366,$367,$368,$369,$370,$371,$372,$373,$374,$375,$376,$377,$378,$379,$380,$381,$382,$383,$384,$385,$386,$387,$388,$389,$390,$391,$392,$393,$394,$395,$396,$397,$398,$399,$400,$401,$402,$403,$404,$405,$406,$407,$408,$409,$410,$411,$412,$413,$414,$415,$416,$417,$418,$419,$420,$421,$422,$423,$424,$425,$426,$427,$428,$429,$430,$431,$432,$433,$434,$435,$436,$437,$438,$439,$440,$441,$442,$443,$444,$445,$446,$447,$448,$449,$450,$451,$452,$453,$454,$455,$456,$457,$458,$459,$460,$461,$462,$463,$464,$465,$466,$467,$468,$469,$470,$471,$472,$473,$474,$475,$476,$477,$478,$479,$480,$481,$482,$483,$484,$485,$486,$487,$488,$489,$490,$491,$492,$493,$494,$495,$496,$497,$498,$499,$500,$501,$502,$503,$504,$505,$506,$507,$508,$509,$510,$511,$512,$513,$514,$515,$516,$517,$518,$519,$520,$521,$522,$523,$524,$525,$526,$527,$528,$529,$530,$531,$532,$533,$534,$535,$536,$537,$538,$539,$540,$541,$542,$543,$544,$545,$546,$547,$548,$549,$550,$551,$552,$553,$554,$555,$556,$557,$558,$559,$560,$561,$562,$563,$564,$565,$566,$567,$568,$569,$570,$571,$572,$573,$574,$575,$576,$577,$578,$579,$580,$581,$582,$583,$584,$585,$586,$587,$588,$589,$590,$591,$592,$593,$594,$595,$596,$597,$598,$599,$600,$601,$602,$603,$604,$605,$606,$607,$608,$609,$610,$611,$612,$613,$614,$615,$616,$617,$618,$619,$620,$621,$622,$623,$624,$625,$626,$627,$628,$629,$630,$631,$632,$633,$634,$635,$636,$637,$638,$639,$640,$641,$642,$643,$644,$645,$646,$647,$648,$649,$650,$651,$652,$653,$654,$655,$656,$657,$658,$659,$660,$661,$662,$663,$664,$665,$666,$667,$668,$669,$670,$671,$672,$673,$674,$675,$676,$677,$678,$679,$680,$681,$682,$683,$684,$685,$686,$687,$688,$689,$690,$691,$692,$693,$694,$695,$696,$697,$698,$699,$700,$701,$702,$703,$704,$705,$706,$707,$708,$709,$710,$711,$712,$713,$714,$715,$716,$717,$718,$719,$720,$721,$722,$723,$724,$725,$726,$727,$728,$729,$730,$731,$732,$733,$734,$735,$736,$737,$738,$739,$740,$741,$742,$743,$744,$745,$746,$747,$748,$749,$750,$751,$752,$753,$754,$755,$756,$757,$758,$759,$760,$761,$762,$763,$764,$765,$766,$767,$768,$769,$770,$771,$772,$773,$774,$775,$776,$777,$778,$779,$780,$781,$782,$783,$784,$785,$786,$787,$788,$789,$790,$791,$792,$793,$794,$795,$796,$797,$798,$799,$800,$801,$802,$803,$804,$805,$806,$807,$808,$809,$810,$811,$812,$813,$814,$815,$816,$817,$818,$819,$820,$821,$822,$823,$824,$825,$826,$827,$828,$829,$830,$831,$832,$833,$834,$835,$836,$837,$838,$839,$840,$841,$842,$843,$844,$845,$846,$847,$848,$849,$850,$851,$852,$853,$854,$855,$856,$857,$858,$859,$860,$861,$862,$863,$864,$865,$866,$867,$868,$869,$870,$871,$872,$873,$874,$875,$876,$877,$878,$879,$880,$881,$882,$883,$884,$885,$886,$887,$888,$889,$890,$891,$892,$893,$894,$895,$896,$897,$898,$899,$900,$901,$902,$903,$904,$905,$906,$907,$908,$909,$910,$911,$912,$913,$914,$915,$916,$917,$918,$919,$920,$921,$922,$923,$924,$925,$926,$927,$928,$929,$930,$931,$932,$933,$934,$935,$936,$937,$938,$939,$940,$941,$942,$943,$944,$945,$946,$947,$948,$949,$950,$951,$952,$953,$954,$955,$956,$957,$958,$959,$960,$961,$962,$963,$964,$965,$966,$967,$968,$969,$970,$971,$972,$973,$974,$975,$976,$977,$978,$979,$980,$981,$982,$983,$984,$985,$986,$987,$988,$989,$990,$991,$992,$993,$994,$995,$996,$997,$998,$999,$1000,$1001,$1002,$1003,$1004,$1005,$1006,$1007,$1008) ORDER BY "id" [ '00178340-4467-11e6-8a37-219b2acdc346',
'003be4e0-0260-11e6-af27-2d5012c820c0',
'006a6460-db4f-11e5-9c34-e3c0bb24c767',
// snip ]
It seems like this query is especially slow. My initial intuition is to just slap an index on regionid
within the store
table, but I wonder if the database will really continue to scale as the number of regionid
elements that are queried over grows over time.
What is the typical way to scale this over time? Will a simple index on regionid
scalably save me going forwards, or is the WHERE IN
ultimately going to be a crux?
EDIT: Here is the result of an EXPLAIN
query on a shortened dataset:
[
{
"Plan": {
"Startup Cost": 211.01,
"Plans": [
{
"Startup Cost": 34.7,
"Plan Width": 110,
"Plans": [
{
"Startup Cost": 0,
"Plan Width": 0,
"Node Type": "Bitmap Index Scan",
"Index Cond": "(store.regionid = ANY ('{00178340-4467-11e6-8a37-219b2acdc346,003be4e0-0260-11e6-af27-2d5012c820c0,006a6460-db4f-11e5-9c34-e3c0bb24c767,00811b60-4d37-11e6-80df-5b5d32d23050,008d2270-8694-11e6-9802-ed42b1afc98c,014d2350-4d4a-11e6-b004-9d13a9078e9a,01516370-94a3-11e6-8ece-0f437fd4c46e,015654a0-4874-11e6-8a37-219b2acdc346}'::text[]))",
"Plan Rows": 55,
"Parent Relationship": "Outer",
"Index Name": "unique_stores",
"Total Cost": 34.69
}
],
"Recheck Cond": "(store.regionid = ANY ('{00178340-4467-11e6-8a37-219b2acdc346,003be4e0-0260-11e6-af27-2d5012c820c0,006a6460-db4f-11e5-9c34-e3c0bb24c767,00811b60-4d37-11e6-80df-5b5d32d23050,008d2270-8694-11e6-9802-ed42b1afc98c,014d2350-4d4a-11e6-b004-9d13a9078e9a,01516370-94a3-11e6-8ece-0f437fd4c46e,015654a0-4874-11e6-8a37-219b2acdc346}'::text[]))",
"Node Type": "Bitmap Heap Scan",
"Plan Rows": 55,
"Relation Name": "store",
"Alias": "store",
"Output": [
"id",
"name",
"status",
"regionid",
"storeid"
],
"Parent Relationship": "Outer",
"Total Cost": 209.42,
"Schema": "public"
}
],
"Sort Key": [
"store.id",
"store.storeid"
],
"Plan Rows": 55,
"Node Type": "Sort",
"Output": [
"id",
"name",
"status",
"regionid",
"storeid"
],
"Plan Width": 110,
"Total Cost": 211.15
}
}
]
postgresql index
I have some client code hitting my postgres instance with a giant SELECT
statement that is matching against a list of IDs. The query ends up looking like this...
SELECT "id","name","status","regionid" FROM "public"."store" WHERE "regionid" IN ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38,$39,$40,$41,$42,$43,$44,$45,$46,$47,$48,$49,$50,$51,$52,$53,$54,$55,$56,$57,$58,$59,$60,$61,$62,$63,$64,$65,$66,$67,$68,$69,$70,$71,$72,$73,$74,$75,$76,$77,$78,$79,$80,$81,$82,$83,$84,$85,$86,$87,$88,$89,$90,$91,$92,$93,$94,$95,$96,$97,$98,$99,$100,$101,$102,$103,$104,$105,$106,$107,$108,$109,$110,$111,$112,$113,$114,$115,$116,$117,$118,$119,$120,$121,$122,$123,$124,$125,$126,$127,$128,$129,$130,$131,$132,$133,$134,$135,$136,$137,$138,$139,$140,$141,$142,$143,$144,$145,$146,$147,$148,$149,$150,$151,$152,$153,$154,$155,$156,$157,$158,$159,$160,$161,$162,$163,$164,$165,$166,$167,$168,$169,$170,$171,$172,$173,$174,$175,$176,$177,$178,$179,$180,$181,$182,$183,$184,$185,$186,$187,$188,$189,$190,$191,$192,$193,$194,$195,$196,$197,$198,$199,$200,$201,$202,$203,$204,$205,$206,$207,$208,$209,$210,$211,$212,$213,$214,$215,$216,$217,$218,$219,$220,$221,$222,$223,$224,$225,$226,$227,$228,$229,$230,$231,$232,$233,$234,$235,$236,$237,$238,$239,$240,$241,$242,$243,$244,$245,$246,$247,$248,$249,$250,$251,$252,$253,$254,$255,$256,$257,$258,$259,$260,$261,$262,$263,$264,$265,$266,$267,$268,$269,$270,$271,$272,$273,$274,$275,$276,$277,$278,$279,$280,$281,$282,$283,$284,$285,$286,$287,$288,$289,$290,$291,$292,$293,$294,$295,$296,$297,$298,$299,$300,$301,$302,$303,$304,$305,$306,$307,$308,$309,$310,$311,$312,$313,$314,$315,$316,$317,$318,$319,$320,$321,$322,$323,$324,$325,$326,$327,$328,$329,$330,$331,$332,$333,$334,$335,$336,$337,$338,$339,$340,$341,$342,$343,$344,$345,$346,$347,$348,$349,$350,$351,$352,$353,$354,$355,$356,$357,$358,$359,$360,$361,$362,$363,$364,$365,$366,$367,$368,$369,$370,$371,$372,$373,$374,$375,$376,$377,$378,$379,$380,$381,$382,$383,$384,$385,$386,$387,$388,$389,$390,$391,$392,$393,$394,$395,$396,$397,$398,$399,$400,$401,$402,$403,$404,$405,$406,$407,$408,$409,$410,$411,$412,$413,$414,$415,$416,$417,$418,$419,$420,$421,$422,$423,$424,$425,$426,$427,$428,$429,$430,$431,$432,$433,$434,$435,$436,$437,$438,$439,$440,$441,$442,$443,$444,$445,$446,$447,$448,$449,$450,$451,$452,$453,$454,$455,$456,$457,$458,$459,$460,$461,$462,$463,$464,$465,$466,$467,$468,$469,$470,$471,$472,$473,$474,$475,$476,$477,$478,$479,$480,$481,$482,$483,$484,$485,$486,$487,$488,$489,$490,$491,$492,$493,$494,$495,$496,$497,$498,$499,$500,$501,$502,$503,$504,$505,$506,$507,$508,$509,$510,$511,$512,$513,$514,$515,$516,$517,$518,$519,$520,$521,$522,$523,$524,$525,$526,$527,$528,$529,$530,$531,$532,$533,$534,$535,$536,$537,$538,$539,$540,$541,$542,$543,$544,$545,$546,$547,$548,$549,$550,$551,$552,$553,$554,$555,$556,$557,$558,$559,$560,$561,$562,$563,$564,$565,$566,$567,$568,$569,$570,$571,$572,$573,$574,$575,$576,$577,$578,$579,$580,$581,$582,$583,$584,$585,$586,$587,$588,$589,$590,$591,$592,$593,$594,$595,$596,$597,$598,$599,$600,$601,$602,$603,$604,$605,$606,$607,$608,$609,$610,$611,$612,$613,$614,$615,$616,$617,$618,$619,$620,$621,$622,$623,$624,$625,$626,$627,$628,$629,$630,$631,$632,$633,$634,$635,$636,$637,$638,$639,$640,$641,$642,$643,$644,$645,$646,$647,$648,$649,$650,$651,$652,$653,$654,$655,$656,$657,$658,$659,$660,$661,$662,$663,$664,$665,$666,$667,$668,$669,$670,$671,$672,$673,$674,$675,$676,$677,$678,$679,$680,$681,$682,$683,$684,$685,$686,$687,$688,$689,$690,$691,$692,$693,$694,$695,$696,$697,$698,$699,$700,$701,$702,$703,$704,$705,$706,$707,$708,$709,$710,$711,$712,$713,$714,$715,$716,$717,$718,$719,$720,$721,$722,$723,$724,$725,$726,$727,$728,$729,$730,$731,$732,$733,$734,$735,$736,$737,$738,$739,$740,$741,$742,$743,$744,$745,$746,$747,$748,$749,$750,$751,$752,$753,$754,$755,$756,$757,$758,$759,$760,$761,$762,$763,$764,$765,$766,$767,$768,$769,$770,$771,$772,$773,$774,$775,$776,$777,$778,$779,$780,$781,$782,$783,$784,$785,$786,$787,$788,$789,$790,$791,$792,$793,$794,$795,$796,$797,$798,$799,$800,$801,$802,$803,$804,$805,$806,$807,$808,$809,$810,$811,$812,$813,$814,$815,$816,$817,$818,$819,$820,$821,$822,$823,$824,$825,$826,$827,$828,$829,$830,$831,$832,$833,$834,$835,$836,$837,$838,$839,$840,$841,$842,$843,$844,$845,$846,$847,$848,$849,$850,$851,$852,$853,$854,$855,$856,$857,$858,$859,$860,$861,$862,$863,$864,$865,$866,$867,$868,$869,$870,$871,$872,$873,$874,$875,$876,$877,$878,$879,$880,$881,$882,$883,$884,$885,$886,$887,$888,$889,$890,$891,$892,$893,$894,$895,$896,$897,$898,$899,$900,$901,$902,$903,$904,$905,$906,$907,$908,$909,$910,$911,$912,$913,$914,$915,$916,$917,$918,$919,$920,$921,$922,$923,$924,$925,$926,$927,$928,$929,$930,$931,$932,$933,$934,$935,$936,$937,$938,$939,$940,$941,$942,$943,$944,$945,$946,$947,$948,$949,$950,$951,$952,$953,$954,$955,$956,$957,$958,$959,$960,$961,$962,$963,$964,$965,$966,$967,$968,$969,$970,$971,$972,$973,$974,$975,$976,$977,$978,$979,$980,$981,$982,$983,$984,$985,$986,$987,$988,$989,$990,$991,$992,$993,$994,$995,$996,$997,$998,$999,$1000,$1001,$1002,$1003,$1004,$1005,$1006,$1007,$1008) ORDER BY "id" [ '00178340-4467-11e6-8a37-219b2acdc346',
'003be4e0-0260-11e6-af27-2d5012c820c0',
'006a6460-db4f-11e5-9c34-e3c0bb24c767',
// snip ]
It seems like this query is especially slow. My initial intuition is to just slap an index on regionid
within the store
table, but I wonder if the database will really continue to scale as the number of regionid
elements that are queried over grows over time.
What is the typical way to scale this over time? Will a simple index on regionid
scalably save me going forwards, or is the WHERE IN
ultimately going to be a crux?
EDIT: Here is the result of an EXPLAIN
query on a shortened dataset:
[
{
"Plan": {
"Startup Cost": 211.01,
"Plans": [
{
"Startup Cost": 34.7,
"Plan Width": 110,
"Plans": [
{
"Startup Cost": 0,
"Plan Width": 0,
"Node Type": "Bitmap Index Scan",
"Index Cond": "(store.regionid = ANY ('{00178340-4467-11e6-8a37-219b2acdc346,003be4e0-0260-11e6-af27-2d5012c820c0,006a6460-db4f-11e5-9c34-e3c0bb24c767,00811b60-4d37-11e6-80df-5b5d32d23050,008d2270-8694-11e6-9802-ed42b1afc98c,014d2350-4d4a-11e6-b004-9d13a9078e9a,01516370-94a3-11e6-8ece-0f437fd4c46e,015654a0-4874-11e6-8a37-219b2acdc346}'::text[]))",
"Plan Rows": 55,
"Parent Relationship": "Outer",
"Index Name": "unique_stores",
"Total Cost": 34.69
}
],
"Recheck Cond": "(store.regionid = ANY ('{00178340-4467-11e6-8a37-219b2acdc346,003be4e0-0260-11e6-af27-2d5012c820c0,006a6460-db4f-11e5-9c34-e3c0bb24c767,00811b60-4d37-11e6-80df-5b5d32d23050,008d2270-8694-11e6-9802-ed42b1afc98c,014d2350-4d4a-11e6-b004-9d13a9078e9a,01516370-94a3-11e6-8ece-0f437fd4c46e,015654a0-4874-11e6-8a37-219b2acdc346}'::text[]))",
"Node Type": "Bitmap Heap Scan",
"Plan Rows": 55,
"Relation Name": "store",
"Alias": "store",
"Output": [
"id",
"name",
"status",
"regionid",
"storeid"
],
"Parent Relationship": "Outer",
"Total Cost": 209.42,
"Schema": "public"
}
],
"Sort Key": [
"store.id",
"store.storeid"
],
"Plan Rows": 55,
"Node Type": "Sort",
"Output": [
"id",
"name",
"status",
"regionid",
"storeid"
],
"Plan Width": 110,
"Total Cost": 211.15
}
}
]
postgresql index
postgresql index
edited Nov 2 '16 at 19:36
Jim
asked Nov 2 '16 at 19:11
JimJim
1112
1112
bumped to the homepage by Community♦ 18 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
bumped to the homepage by Community♦ 18 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
1
How many distinctregionid
values are in thestore
table? If the application is listing out 1000 regions and there are only 1500 regions in total, it's almost certainly going to be faster to do a table scan than to use an index onregionid
. If there are 150,000 different regions, on the other hand, using an index onregionid
would almost certainly be more efficient even if the list grew to a few thousand entries.
– Justin Cave
Nov 2 '16 at 19:17
1
Separate from that, though, long before I got to 1000 elements, I'd seriously think about loading the regions I want to search into a temp table or some other structure so that I wasn't generating thousands of different queries that differed only by the number of bind variables in theIN
list.
– Justin Cave
Nov 2 '16 at 19:17
@JustinCave There is a typical ratio of about 20 stores per region...so it will scale ~20:1. With respect to your second comment, I'll have to do some reading up on that one.
– Jim
Nov 2 '16 at 19:21
For some context (I am still trying to sort this out, myself a bit!), it appears that a client ORM is generating this query through a permissions check. So, a user has permission (enforced at an application level) to access stores from regions with those IDs, and then they are querying for the accessible stores through thisWHERE IN
query (generated with an ORM)
– Jim
Nov 2 '16 at 19:22
That's not the ratio to be concerned with. It doesn't matter whether there are 1, 10, 100, or 1000 stores per region. The important question is what fraction of the rows in the table the query is returning. If there are tens or hundreds of thousands of regions, then an IN list with 1000 regions is only going to return a few percent of the rows in the table and an index would make sense. If the query is going to return most of the rows in the table because the IN list is so long, then an index won't be helpful.
– Justin Cave
Nov 2 '16 at 19:25
|
show 5 more comments
1
How many distinctregionid
values are in thestore
table? If the application is listing out 1000 regions and there are only 1500 regions in total, it's almost certainly going to be faster to do a table scan than to use an index onregionid
. If there are 150,000 different regions, on the other hand, using an index onregionid
would almost certainly be more efficient even if the list grew to a few thousand entries.
– Justin Cave
Nov 2 '16 at 19:17
1
Separate from that, though, long before I got to 1000 elements, I'd seriously think about loading the regions I want to search into a temp table or some other structure so that I wasn't generating thousands of different queries that differed only by the number of bind variables in theIN
list.
– Justin Cave
Nov 2 '16 at 19:17
@JustinCave There is a typical ratio of about 20 stores per region...so it will scale ~20:1. With respect to your second comment, I'll have to do some reading up on that one.
– Jim
Nov 2 '16 at 19:21
For some context (I am still trying to sort this out, myself a bit!), it appears that a client ORM is generating this query through a permissions check. So, a user has permission (enforced at an application level) to access stores from regions with those IDs, and then they are querying for the accessible stores through thisWHERE IN
query (generated with an ORM)
– Jim
Nov 2 '16 at 19:22
That's not the ratio to be concerned with. It doesn't matter whether there are 1, 10, 100, or 1000 stores per region. The important question is what fraction of the rows in the table the query is returning. If there are tens or hundreds of thousands of regions, then an IN list with 1000 regions is only going to return a few percent of the rows in the table and an index would make sense. If the query is going to return most of the rows in the table because the IN list is so long, then an index won't be helpful.
– Justin Cave
Nov 2 '16 at 19:25
1
1
How many distinct
regionid
values are in the store
table? If the application is listing out 1000 regions and there are only 1500 regions in total, it's almost certainly going to be faster to do a table scan than to use an index on regionid
. If there are 150,000 different regions, on the other hand, using an index on regionid
would almost certainly be more efficient even if the list grew to a few thousand entries.– Justin Cave
Nov 2 '16 at 19:17
How many distinct
regionid
values are in the store
table? If the application is listing out 1000 regions and there are only 1500 regions in total, it's almost certainly going to be faster to do a table scan than to use an index on regionid
. If there are 150,000 different regions, on the other hand, using an index on regionid
would almost certainly be more efficient even if the list grew to a few thousand entries.– Justin Cave
Nov 2 '16 at 19:17
1
1
Separate from that, though, long before I got to 1000 elements, I'd seriously think about loading the regions I want to search into a temp table or some other structure so that I wasn't generating thousands of different queries that differed only by the number of bind variables in the
IN
list.– Justin Cave
Nov 2 '16 at 19:17
Separate from that, though, long before I got to 1000 elements, I'd seriously think about loading the regions I want to search into a temp table or some other structure so that I wasn't generating thousands of different queries that differed only by the number of bind variables in the
IN
list.– Justin Cave
Nov 2 '16 at 19:17
@JustinCave There is a typical ratio of about 20 stores per region...so it will scale ~20:1. With respect to your second comment, I'll have to do some reading up on that one.
– Jim
Nov 2 '16 at 19:21
@JustinCave There is a typical ratio of about 20 stores per region...so it will scale ~20:1. With respect to your second comment, I'll have to do some reading up on that one.
– Jim
Nov 2 '16 at 19:21
For some context (I am still trying to sort this out, myself a bit!), it appears that a client ORM is generating this query through a permissions check. So, a user has permission (enforced at an application level) to access stores from regions with those IDs, and then they are querying for the accessible stores through this
WHERE IN
query (generated with an ORM)– Jim
Nov 2 '16 at 19:22
For some context (I am still trying to sort this out, myself a bit!), it appears that a client ORM is generating this query through a permissions check. So, a user has permission (enforced at an application level) to access stores from regions with those IDs, and then they are querying for the accessible stores through this
WHERE IN
query (generated with an ORM)– Jim
Nov 2 '16 at 19:22
That's not the ratio to be concerned with. It doesn't matter whether there are 1, 10, 100, or 1000 stores per region. The important question is what fraction of the rows in the table the query is returning. If there are tens or hundreds of thousands of regions, then an IN list with 1000 regions is only going to return a few percent of the rows in the table and an index would make sense. If the query is going to return most of the rows in the table because the IN list is so long, then an index won't be helpful.
– Justin Cave
Nov 2 '16 at 19:25
That's not the ratio to be concerned with. It doesn't matter whether there are 1, 10, 100, or 1000 stores per region. The important question is what fraction of the rows in the table the query is returning. If there are tens or hundreds of thousands of regions, then an IN list with 1000 regions is only going to return a few percent of the rows in the table and an index would make sense. If the query is going to return most of the rows in the table because the IN list is so long, then an index won't be helpful.
– Justin Cave
Nov 2 '16 at 19:25
|
show 5 more comments
1 Answer
1
active
oldest
votes
Since Postgres 9.2 index-only scan is supported so, in answer to your question whether an index can help in this situation, I think it definitely might.
Instead of just indexing the regionid
field, create an index on (regionid, id, name, status)
and the query can scan the index instead of the table. Assuming that the table has many more fields than just those four, then this operation should be much quicker.
Don't forget to test whether this impacts on the UPDATE
, DELETE
and INSERT
operations.
More info at Use The Index Luke
@Jim this was an interesting question and I'm keen to hear back whether my proposed solution was any help. Could you share your findings?
– mendosi
Nov 10 '16 at 0:26
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f154081%2fcan-indices-save-a-giant-where-foo-in-query-with-thousands-of-entries-in%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Since Postgres 9.2 index-only scan is supported so, in answer to your question whether an index can help in this situation, I think it definitely might.
Instead of just indexing the regionid
field, create an index on (regionid, id, name, status)
and the query can scan the index instead of the table. Assuming that the table has many more fields than just those four, then this operation should be much quicker.
Don't forget to test whether this impacts on the UPDATE
, DELETE
and INSERT
operations.
More info at Use The Index Luke
@Jim this was an interesting question and I'm keen to hear back whether my proposed solution was any help. Could you share your findings?
– mendosi
Nov 10 '16 at 0:26
add a comment |
Since Postgres 9.2 index-only scan is supported so, in answer to your question whether an index can help in this situation, I think it definitely might.
Instead of just indexing the regionid
field, create an index on (regionid, id, name, status)
and the query can scan the index instead of the table. Assuming that the table has many more fields than just those four, then this operation should be much quicker.
Don't forget to test whether this impacts on the UPDATE
, DELETE
and INSERT
operations.
More info at Use The Index Luke
@Jim this was an interesting question and I'm keen to hear back whether my proposed solution was any help. Could you share your findings?
– mendosi
Nov 10 '16 at 0:26
add a comment |
Since Postgres 9.2 index-only scan is supported so, in answer to your question whether an index can help in this situation, I think it definitely might.
Instead of just indexing the regionid
field, create an index on (regionid, id, name, status)
and the query can scan the index instead of the table. Assuming that the table has many more fields than just those four, then this operation should be much quicker.
Don't forget to test whether this impacts on the UPDATE
, DELETE
and INSERT
operations.
More info at Use The Index Luke
Since Postgres 9.2 index-only scan is supported so, in answer to your question whether an index can help in this situation, I think it definitely might.
Instead of just indexing the regionid
field, create an index on (regionid, id, name, status)
and the query can scan the index instead of the table. Assuming that the table has many more fields than just those four, then this operation should be much quicker.
Don't forget to test whether this impacts on the UPDATE
, DELETE
and INSERT
operations.
More info at Use The Index Luke
edited Nov 3 '16 at 9:38
dezso
22.2k116096
22.2k116096
answered Nov 3 '16 at 3:23
mendosimendosi
1,972520
1,972520
@Jim this was an interesting question and I'm keen to hear back whether my proposed solution was any help. Could you share your findings?
– mendosi
Nov 10 '16 at 0:26
add a comment |
@Jim this was an interesting question and I'm keen to hear back whether my proposed solution was any help. Could you share your findings?
– mendosi
Nov 10 '16 at 0:26
@Jim this was an interesting question and I'm keen to hear back whether my proposed solution was any help. Could you share your findings?
– mendosi
Nov 10 '16 at 0:26
@Jim this was an interesting question and I'm keen to hear back whether my proposed solution was any help. Could you share your findings?
– mendosi
Nov 10 '16 at 0:26
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f154081%2fcan-indices-save-a-giant-where-foo-in-query-with-thousands-of-entries-in%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
How many distinct
regionid
values are in thestore
table? If the application is listing out 1000 regions and there are only 1500 regions in total, it's almost certainly going to be faster to do a table scan than to use an index onregionid
. If there are 150,000 different regions, on the other hand, using an index onregionid
would almost certainly be more efficient even if the list grew to a few thousand entries.– Justin Cave
Nov 2 '16 at 19:17
1
Separate from that, though, long before I got to 1000 elements, I'd seriously think about loading the regions I want to search into a temp table or some other structure so that I wasn't generating thousands of different queries that differed only by the number of bind variables in the
IN
list.– Justin Cave
Nov 2 '16 at 19:17
@JustinCave There is a typical ratio of about 20 stores per region...so it will scale ~20:1. With respect to your second comment, I'll have to do some reading up on that one.
– Jim
Nov 2 '16 at 19:21
For some context (I am still trying to sort this out, myself a bit!), it appears that a client ORM is generating this query through a permissions check. So, a user has permission (enforced at an application level) to access stores from regions with those IDs, and then they are querying for the accessible stores through this
WHERE IN
query (generated with an ORM)– Jim
Nov 2 '16 at 19:22
That's not the ratio to be concerned with. It doesn't matter whether there are 1, 10, 100, or 1000 stores per region. The important question is what fraction of the rows in the table the query is returning. If there are tens or hundreds of thousands of regions, then an IN list with 1000 regions is only going to return a few percent of the rows in the table and an index would make sense. If the query is going to return most of the rows in the table because the IN list is so long, then an index won't be helpful.
– Justin Cave
Nov 2 '16 at 19:25