PostgreSQL: hstore trgm search on fixed keyGROUP BY one column, while sorting by another in PostgreSQLJSONB...

What is an efficient way to digitize a family photo collection?

Can you say "leftside right"?

Is the tritone (A4 / d5) still banned in Roman Catholic music?

Trying to make a 3dplot

What is an explicit bijection in combinatorics?

How to make transparent background from pdf to png

Are one-line email responses considered disrespectful?

Is there any danger of my neighbor having my wife's signature?

Do the speed limit reductions due to pollution also apply to electric cars in France?

Is there a Tezos equivalent of ERC?

How can I handle players killing my NPC outside of combat?

Does しかたない imply disappointment?

Isn't a semicolon (';') needed after a function declaration in C++?

Python to write multiple dataframes and highlight rows inside an excel file

Can a Way of Shadow Monk use Shadow Step to teleport to a dark ceiling and then body slam another creature?

How can I give a Ranger advantage on a check due to Favored Enemy without spoiling the story for the player?

When distributing a Linux kernel driver as source code, what's the difference between Proprietary and GPL license?

Partial derivative with respect to three variables

Why write a book when there's a movie in my head?

How to transport 10,000 terrestrial trolls across ocean fast?

How to deal with an underperforming subordinate?

Is Screenshot Time-tracking Common?

What does "don't have a baby" imply or mean in this sentence?

Is GMW different from Secret Sharing?



PostgreSQL: hstore trgm search on fixed key


GROUP BY one column, while sorting by another in PostgreSQLJSONB vs BRIN IndexHow to properly index hstore tags column to faster search for keysHow to store thousands of properties related to a record in PostgreSQL?Dynamically convert hstore keys into columns for an unknown set of keysHow to implement search for world-wide Wikipedia/OSM places with alternative-namesHow can I update an hstore[] field?JSONB with indexing vs. hstoreupdating a hstore with multiple new columnsAny way to squeeze performance from Postgres JSONB queries?Is there a way to enforce uniqueness of arbitrary key-value pairs in a JSONB object columnTrigram index on all values of a JSONB column













1















I have a table of products with custom properties for each product. So I have props::hstore column in my table. I want the user to be able to search products by key/value by first selecting the key and then entering the value with autocomplete feature. So I need two steps here:




  1. Get all the keys in props field. SELECT DISTINCT with skeys should work here, but I don't understand how to create index.

  2. Find most relavent values for fixed key (autocomplete feature). But seems like gin_trgm_ops can create index only on value not on value for fixed keys.


Also, maybe it can be a good idea to change the hstore with jsonb, but I don't see why it can be more perfomant.










share|improve this question























  • There may be a way after all. How many rows? How many distinct keys total? How many distinct combinations of keys? Your version of Postgres?

    – Erwin Brandstetter
    Feb 6 at 23:25











  • @ErwinBrandstetter around 10kk of rows, 10-20 distinct keys, around 1000 combinations. It's 9.6

    – Ximik
    Feb 7 at 13:26


















1















I have a table of products with custom properties for each product. So I have props::hstore column in my table. I want the user to be able to search products by key/value by first selecting the key and then entering the value with autocomplete feature. So I need two steps here:




  1. Get all the keys in props field. SELECT DISTINCT with skeys should work here, but I don't understand how to create index.

  2. Find most relavent values for fixed key (autocomplete feature). But seems like gin_trgm_ops can create index only on value not on value for fixed keys.


Also, maybe it can be a good idea to change the hstore with jsonb, but I don't see why it can be more perfomant.










share|improve this question























  • There may be a way after all. How many rows? How many distinct keys total? How many distinct combinations of keys? Your version of Postgres?

    – Erwin Brandstetter
    Feb 6 at 23:25











  • @ErwinBrandstetter around 10kk of rows, 10-20 distinct keys, around 1000 combinations. It's 9.6

    – Ximik
    Feb 7 at 13:26
















1












1








1








I have a table of products with custom properties for each product. So I have props::hstore column in my table. I want the user to be able to search products by key/value by first selecting the key and then entering the value with autocomplete feature. So I need two steps here:




  1. Get all the keys in props field. SELECT DISTINCT with skeys should work here, but I don't understand how to create index.

  2. Find most relavent values for fixed key (autocomplete feature). But seems like gin_trgm_ops can create index only on value not on value for fixed keys.


Also, maybe it can be a good idea to change the hstore with jsonb, but I don't see why it can be more perfomant.










share|improve this question














I have a table of products with custom properties for each product. So I have props::hstore column in my table. I want the user to be able to search products by key/value by first selecting the key and then entering the value with autocomplete feature. So I need two steps here:




  1. Get all the keys in props field. SELECT DISTINCT with skeys should work here, but I don't understand how to create index.

  2. Find most relavent values for fixed key (autocomplete feature). But seems like gin_trgm_ops can create index only on value not on value for fixed keys.


Also, maybe it can be a good idea to change the hstore with jsonb, but I don't see why it can be more perfomant.







postgresql index-tuning hstore






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Feb 6 at 14:24









XimikXimik

1155




1155













  • There may be a way after all. How many rows? How many distinct keys total? How many distinct combinations of keys? Your version of Postgres?

    – Erwin Brandstetter
    Feb 6 at 23:25











  • @ErwinBrandstetter around 10kk of rows, 10-20 distinct keys, around 1000 combinations. It's 9.6

    – Ximik
    Feb 7 at 13:26





















  • There may be a way after all. How many rows? How many distinct keys total? How many distinct combinations of keys? Your version of Postgres?

    – Erwin Brandstetter
    Feb 6 at 23:25











  • @ErwinBrandstetter around 10kk of rows, 10-20 distinct keys, around 1000 combinations. It's 9.6

    – Ximik
    Feb 7 at 13:26



















There may be a way after all. How many rows? How many distinct keys total? How many distinct combinations of keys? Your version of Postgres?

– Erwin Brandstetter
Feb 6 at 23:25





There may be a way after all. How many rows? How many distinct keys total? How many distinct combinations of keys? Your version of Postgres?

– Erwin Brandstetter
Feb 6 at 23:25













@ErwinBrandstetter around 10kk of rows, 10-20 distinct keys, around 1000 combinations. It's 9.6

– Ximik
Feb 7 at 13:26







@ErwinBrandstetter around 10kk of rows, 10-20 distinct keys, around 1000 combinations. It's 9.6

– Ximik
Feb 7 at 13:26












2 Answers
2






active

oldest

votes


















1














There is no performant way to get all distinct names of keys used in an entire hstore (or jsonb) column for a table with many rows. You could used a materialized view if the list of distinct keys does not have to be perfectly up to date. But more likely, you should store your key-value pairs in a real relational table rather than in denormalized form.






share|improve this answer































    1














    With only 10-20 distinct keys, I would urgently consider one separate column per key in the same row. Unused keys stay NULL. Related case:




    • JSONB vs BRIN Index


    That said, there is a way to get the list of distinct key names from a document type column like jsonb or hstore comparatively quickly:



    Create a btree expression index on the sorted array of key names (or a concatenated string), traverse it with a recursive CTE to emulate a loose index scan and condense the distinct set of keys from the distinct set of combinations of keys.



    That will read 1000 index tuples for 1000 combinations (ideally in an index-only scan) instead of all 10kk of rows. A difference of factor 10k.



    You need an IMMUTABLE function extracting the array or list of keys for the index - like akeys() for hstore.



    A btree index on an array is rarely useful. This is the rare case.



    Related code examples:




    • GROUP BY one column, while sorting by another in PostgreSQL






    share|improve this answer


























    • I'll add code for an implementation when I find time.

      – Erwin Brandstetter
      Feb 11 at 13:54











    • I've made several perfomance measures and use now id, key, value table with props::int[]. However will check your option as well, thanks.

      – Ximik
      Feb 11 at 13:58











    • @Ximik: If you can provide a minimum demo setup for your current implementation in a fiddle, I can demonstrate my idea on this base - when I get around to it.

      – Erwin Brandstetter
      Feb 11 at 14:02











    Your Answer








    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "182"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229036%2fpostgresql-hstore-trgm-search-on-fixed-key%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    1














    There is no performant way to get all distinct names of keys used in an entire hstore (or jsonb) column for a table with many rows. You could used a materialized view if the list of distinct keys does not have to be perfectly up to date. But more likely, you should store your key-value pairs in a real relational table rather than in denormalized form.






    share|improve this answer




























      1














      There is no performant way to get all distinct names of keys used in an entire hstore (or jsonb) column for a table with many rows. You could used a materialized view if the list of distinct keys does not have to be perfectly up to date. But more likely, you should store your key-value pairs in a real relational table rather than in denormalized form.






      share|improve this answer


























        1












        1








        1







        There is no performant way to get all distinct names of keys used in an entire hstore (or jsonb) column for a table with many rows. You could used a materialized view if the list of distinct keys does not have to be perfectly up to date. But more likely, you should store your key-value pairs in a real relational table rather than in denormalized form.






        share|improve this answer













        There is no performant way to get all distinct names of keys used in an entire hstore (or jsonb) column for a table with many rows. You could used a materialized view if the list of distinct keys does not have to be perfectly up to date. But more likely, you should store your key-value pairs in a real relational table rather than in denormalized form.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Feb 6 at 16:02









        jjanesjjanes

        13.3k817




        13.3k817

























            1














            With only 10-20 distinct keys, I would urgently consider one separate column per key in the same row. Unused keys stay NULL. Related case:




            • JSONB vs BRIN Index


            That said, there is a way to get the list of distinct key names from a document type column like jsonb or hstore comparatively quickly:



            Create a btree expression index on the sorted array of key names (or a concatenated string), traverse it with a recursive CTE to emulate a loose index scan and condense the distinct set of keys from the distinct set of combinations of keys.



            That will read 1000 index tuples for 1000 combinations (ideally in an index-only scan) instead of all 10kk of rows. A difference of factor 10k.



            You need an IMMUTABLE function extracting the array or list of keys for the index - like akeys() for hstore.



            A btree index on an array is rarely useful. This is the rare case.



            Related code examples:




            • GROUP BY one column, while sorting by another in PostgreSQL






            share|improve this answer


























            • I'll add code for an implementation when I find time.

              – Erwin Brandstetter
              Feb 11 at 13:54











            • I've made several perfomance measures and use now id, key, value table with props::int[]. However will check your option as well, thanks.

              – Ximik
              Feb 11 at 13:58











            • @Ximik: If you can provide a minimum demo setup for your current implementation in a fiddle, I can demonstrate my idea on this base - when I get around to it.

              – Erwin Brandstetter
              Feb 11 at 14:02
















            1














            With only 10-20 distinct keys, I would urgently consider one separate column per key in the same row. Unused keys stay NULL. Related case:




            • JSONB vs BRIN Index


            That said, there is a way to get the list of distinct key names from a document type column like jsonb or hstore comparatively quickly:



            Create a btree expression index on the sorted array of key names (or a concatenated string), traverse it with a recursive CTE to emulate a loose index scan and condense the distinct set of keys from the distinct set of combinations of keys.



            That will read 1000 index tuples for 1000 combinations (ideally in an index-only scan) instead of all 10kk of rows. A difference of factor 10k.



            You need an IMMUTABLE function extracting the array or list of keys for the index - like akeys() for hstore.



            A btree index on an array is rarely useful. This is the rare case.



            Related code examples:




            • GROUP BY one column, while sorting by another in PostgreSQL






            share|improve this answer


























            • I'll add code for an implementation when I find time.

              – Erwin Brandstetter
              Feb 11 at 13:54











            • I've made several perfomance measures and use now id, key, value table with props::int[]. However will check your option as well, thanks.

              – Ximik
              Feb 11 at 13:58











            • @Ximik: If you can provide a minimum demo setup for your current implementation in a fiddle, I can demonstrate my idea on this base - when I get around to it.

              – Erwin Brandstetter
              Feb 11 at 14:02














            1












            1








            1







            With only 10-20 distinct keys, I would urgently consider one separate column per key in the same row. Unused keys stay NULL. Related case:




            • JSONB vs BRIN Index


            That said, there is a way to get the list of distinct key names from a document type column like jsonb or hstore comparatively quickly:



            Create a btree expression index on the sorted array of key names (or a concatenated string), traverse it with a recursive CTE to emulate a loose index scan and condense the distinct set of keys from the distinct set of combinations of keys.



            That will read 1000 index tuples for 1000 combinations (ideally in an index-only scan) instead of all 10kk of rows. A difference of factor 10k.



            You need an IMMUTABLE function extracting the array or list of keys for the index - like akeys() for hstore.



            A btree index on an array is rarely useful. This is the rare case.



            Related code examples:




            • GROUP BY one column, while sorting by another in PostgreSQL






            share|improve this answer















            With only 10-20 distinct keys, I would urgently consider one separate column per key in the same row. Unused keys stay NULL. Related case:




            • JSONB vs BRIN Index


            That said, there is a way to get the list of distinct key names from a document type column like jsonb or hstore comparatively quickly:



            Create a btree expression index on the sorted array of key names (or a concatenated string), traverse it with a recursive CTE to emulate a loose index scan and condense the distinct set of keys from the distinct set of combinations of keys.



            That will read 1000 index tuples for 1000 combinations (ideally in an index-only scan) instead of all 10kk of rows. A difference of factor 10k.



            You need an IMMUTABLE function extracting the array or list of keys for the index - like akeys() for hstore.



            A btree index on an array is rarely useful. This is the rare case.



            Related code examples:




            • GROUP BY one column, while sorting by another in PostgreSQL







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited 8 mins ago

























            answered Feb 11 at 13:53









            Erwin BrandstetterErwin Brandstetter

            93k9178292




            93k9178292













            • I'll add code for an implementation when I find time.

              – Erwin Brandstetter
              Feb 11 at 13:54











            • I've made several perfomance measures and use now id, key, value table with props::int[]. However will check your option as well, thanks.

              – Ximik
              Feb 11 at 13:58











            • @Ximik: If you can provide a minimum demo setup for your current implementation in a fiddle, I can demonstrate my idea on this base - when I get around to it.

              – Erwin Brandstetter
              Feb 11 at 14:02



















            • I'll add code for an implementation when I find time.

              – Erwin Brandstetter
              Feb 11 at 13:54











            • I've made several perfomance measures and use now id, key, value table with props::int[]. However will check your option as well, thanks.

              – Ximik
              Feb 11 at 13:58











            • @Ximik: If you can provide a minimum demo setup for your current implementation in a fiddle, I can demonstrate my idea on this base - when I get around to it.

              – Erwin Brandstetter
              Feb 11 at 14:02

















            I'll add code for an implementation when I find time.

            – Erwin Brandstetter
            Feb 11 at 13:54





            I'll add code for an implementation when I find time.

            – Erwin Brandstetter
            Feb 11 at 13:54













            I've made several perfomance measures and use now id, key, value table with props::int[]. However will check your option as well, thanks.

            – Ximik
            Feb 11 at 13:58





            I've made several perfomance measures and use now id, key, value table with props::int[]. However will check your option as well, thanks.

            – Ximik
            Feb 11 at 13:58













            @Ximik: If you can provide a minimum demo setup for your current implementation in a fiddle, I can demonstrate my idea on this base - when I get around to it.

            – Erwin Brandstetter
            Feb 11 at 14:02





            @Ximik: If you can provide a minimum demo setup for your current implementation in a fiddle, I can demonstrate my idea on this base - when I get around to it.

            – Erwin Brandstetter
            Feb 11 at 14:02


















            draft saved

            draft discarded




















































            Thanks for contributing an answer to Database Administrators Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f229036%2fpostgresql-hstore-trgm-search-on-fixed-key%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Discografia di Klaus Schulze Indice Album in studio | Album dal vivo | Singoli | Antologie | Colonne...

            Armoriale delle famiglie italiane (Car) Indice Armi | Bibliografia | Menu di navigazioneBlasone...

            Lupi Siderali Indice Storia | Organizzazione | La Tredicesima Compagnia | Aspetto | Membri Importanti...