


Guid vs INT - Which is better as a primary key?















88















I've been reading about the reasons to use, or not to use, Guid and int.



int is smaller, faster, easy to remember, and keeps a chronological sequence. As for Guid, the only advantage I found is that it is unique. In which cases would a Guid be better than an int, and why?



From what I've seen, int has no flaws except by the number limit, which in many cases are irrelevant.



Why exactly was Guid created? I actually think it has a purpose other than serving as primary key of a simple table. (Any example of a real application using Guid for something?)



(Guid = the UniqueIdentifier type on SQL Server)




























  • 1





    Rather than primary key, I think you mean surrogate key i.e. a key that is not the natural key (the latter being the key we use in the real world). Possibly you mean clustered index.

    – onedaywhen
    Jan 31 '12 at 8:48













  • Also remember the difference between (Primary) KEY and INDEX.

    – Allan S. Hansen
    Jan 7 '14 at 6:55






  • 1





    Also discussed on SO: stackoverflow.com/questions/11033435/…

    – Jon of All Trades
    Jun 22 '15 at 21:41






  • 2





    "int has no flaws except by the number limit, which in many cases are irrelevant.": actually, within this context of INT vs GUID, the upper limit of a signed, 32-bit INT is entirely irrelevant given that the upper limit of a signed, 64-bit BIGINT is well beyond nearly all uses (even more so if you start numbering at the lower limit; and same goes for INT) and it is still half the size of a GUID (8 bytes instead of 16) and sequential.

    – Solomon Rutzky
    Oct 7 '15 at 18:41

















performance sql-server primary-key uniqueidentifier
















asked Jan 5 '11 at 7:46









BrunoLM


















7 Answers


















83














This has been asked in Stack Overflow here and here.



Jeff's post explains a lot about pros and cons of using GUID.




GUID Pros




  • Unique across every table, every database and every server

  • Allows easy merging of records from different databases

  • Allows easy distribution of databases across multiple servers

  • You can generate IDs anywhere, instead of having to roundtrip to the database

  • Most replication scenarios require GUID columns anyway


GUID Cons




  • It is a whopping 4 times larger than the traditional 4-byte index value; this can have serious performance and storage implications if you're not careful

  • Cumbersome to debug (where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}')

  • The generated GUIDs should be partially sequential for best performance (eg, newsequentialid() on SQL Server 2005+) and to enable use of clustered indexes




If performance matters to you and you are not planning to replicate or merge records, then use int and make it auto-increment (an IDENTITY column in SQL Server).
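To make the size figures and the "generate IDs anywhere" point from the lists above a bit more concrete, here is a minimal Python sketch. It uses Python's standard `uuid` and `struct` modules rather than SQL Server itself, and the helper name `new_row_id` is hypothetical, so treat it as an illustration of the sizes involved, not as a description of how SQL Server stores these types.

```python
import struct
import uuid


def new_row_id() -> uuid.UUID:
    """Generate a random (version 4) GUID on the client, with no database call."""
    return uuid.uuid4()


# A GUID is 128 bits, i.e. 16 bytes in binary form.
GUID_SIZE = len(new_row_id().bytes)

# A signed 32-bit integer (comparable to INT) is 4 bytes.
INT_SIZE = struct.calcsize("<i")

# A signed 64-bit integer (comparable to BIGINT) is 8 bytes.
BIGINT_SIZE = struct.calcsize("<q")

if __name__ == "__main__":
    print(GUID_SIZE, INT_SIZE, BIGINT_SIZE)  # 16 4 8
    print(GUID_SIZE // INT_SIZE)             # 4
```

The factor of 4 applies to the key itself; how much it matters in practice depends on how many indexes and foreign keys repeat that key.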



























  • 18





    Another con of the GUID approach is that you cannot use it as an identifier for your end-user. Do you really expect your users to tell you on the phone that they have an issue with Order "BAE7DF4-DDF-3RG-5TY3E3RF456AS10" ? :)

    – Brann
    Jul 8 '11 at 21:22






  • 2





    If you do not use sequential guids, and your primary key is clustered (the SQL Server default), then all your data inserts will be randomly scattered throughout the table, leading to massive fragmentation of your data. That is assuming that the data would normally be inserted in some sort of order, such as chronological.

    – datagod
    Aug 6 '11 at 4:32






  • 6





    Sequential guids are only sequential until the SQL instance is restarted. Then the first value will more than likely be lower than the prior one because of the way that the root value is generated, causing all sorts of problems all over again.

    – mrdenny
    Aug 6 '11 at 6:37








  • 16





    @Brann Ideally you wouldn't be giving your PK values out to end-users in the first place. I know it is somewhat common to do so, and it is something I myself have done in the past before I learned not to. But since it shouldn't be done, that particular reason to prefer INT over GUID isn't a valid one.

    – Solomon Rutzky
    Oct 7 '15 at 18:54






  • 2





    @ChadKuehn Choosing UNIQUEIDENTIFIER over INT because INT has an upper limit is rather poor reasoning since being limitless, while true enough, is not a practical benefit. You can easily double the effective capacity of an INT by starting it at the lower limit (-2.14 billion) instead of at 1. Or, if the full 4.3 billion isn't enough, then start out with a BIGINT that is still only 8 bytes as compared to 16 for the GUID, and it is sequential.

    – Solomon Rutzky
    Oct 7 '15 at 18:59
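The capacity figures quoted in the comment above can be checked with a few lines of arithmetic. This is only a sketch in Python, assuming the usual signed 32-bit and 64-bit ranges for INT and BIGINT; the variable names are made up for illustration.

```python
# Signed 32-bit range (INT) and signed 64-bit range (BIGINT).
INT_MIN, INT_MAX = -2**31, 2**31 - 1
BIGINT_MIN, BIGINT_MAX = -2**63, 2**63 - 1

# Number of ids available when numbering starts at 1.
values_from_one = INT_MAX

# Number of ids available when numbering starts at the lower limit.
values_from_min = INT_MAX - INT_MIN + 1

if __name__ == "__main__":
    print(values_from_one)  # 2147483647
    print(values_from_min)  # 4294967296
    print(BIGINT_MAX)       # 9223372036854775807
```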





















17














If you're synchronizing your data with an external source, a persistent GUID can be much better. A quick example of where we're using GUIDs is a tool that is sent to the customer to crawl their network and do certain classes of auto-discovery, store the records found, and then all the customer records are integrated into a central database back on our end. If we used an integer, we would have 7,398 "1"s, and it'd be a lot harder to keep track of which "1" was which.
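The collision problem described above can be sketched in a few lines of Python. The site data and the `merge_by_key` helper are invented for illustration and have nothing to do with the actual tool; the point is only that locally numbered integer keys overlap when merged, while independently generated GUIDs do not.

```python
import uuid


def merge_by_key(*sources: dict) -> dict:
    """Merge several key -> record mappings; later sources overwrite equal keys."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


# Two customer sites, each numbering its rows 1, 2, ... locally.
site_a_int = {1: "printer (site A)", 2: "router (site A)"}
site_b_int = {1: "laptop (site B)", 2: "switch (site B)"}

# The same rows keyed by GUIDs generated independently at each site.
site_a_guid = {uuid.uuid4(): name for name in site_a_int.values()}
site_b_guid = {uuid.uuid4(): name for name in site_b_int.values()}

merged_int = merge_by_key(site_a_int, site_b_int)
merged_guid = merge_by_key(site_a_guid, site_b_guid)

if __name__ == "__main__":
    print(len(merged_int))   # 2: site B's rows overwrote site A's
    print(len(merged_guid))  # 4: every row survives the merge
```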



























  • 3





    GUIDs are definitely good as external identifiers, and I would keep a non-clustered index on that as the "external key". I would still keep an int as the "internal key", which is the basis for the clustered index and foreign key relationships. If something is going to cross an architectural boundary (e.g. communicating with another app) I do appreciate having something that cannot be mixed up.

    – Greg
    Oct 7 '15 at 18:07



















14














I have used a hybrid approach with success. Tables contain BOTH an auto-increment primary key integer id column AND a guid column. The guid can be used as needed to globally uniquely identify the row and id can be used for queries, sorting and human identification of the row.
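A minimal sketch of this hybrid layout, using Python's built-in `sqlite3` module instead of SQL Server so that it can be run anywhere. The table name `work_item`, its columns, and the helper `add_work_item` are hypothetical; on SQL Server the equivalent would presumably be an IDENTITY column plus a uniqueidentifier column with a unique non-clustered index.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE work_item (
        id   INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal key for joins and sorting
        guid TEXT NOT NULL UNIQUE,               -- global identifier (alternate key)
        name TEXT NOT NULL
    )
    """
)


def add_work_item(name: str) -> tuple:
    """Insert a row: the GUID comes from the application, the id from the database."""
    guid = str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO work_item (guid, name) VALUES (?, ?)", (guid, name)
    )
    return cur.lastrowid, guid


first_id, first_guid = add_work_item("first")
second_id, second_guid = add_work_item("second")

# Look a row up by its global identifier, then use the integer id internally.
row = conn.execute(
    "SELECT id, name FROM work_item WHERE guid = ?", (second_guid,)
).fetchone()

if __name__ == "__main__":
    print(first_id, second_id)  # 1 2
    print(row)                  # (2, 'second')
```

Related tables would reference `id`, so the 16-byte value lives only in the one table that needs to be addressed from outside.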

























  • 3





    What value does the GUID give if the id already is sufficient for humans to identify a row?

    – Martin Smith
    Apr 3 '15 at 18:06






  • 6





    The id identifies the row in this table. The GUID (at least in theory) identifies this row anywhere in the known universe. In my project, Android mobiles each have a structurally identical copy of the table on a local SQLite database. The row and its GUID are each generated on Android. Then, when Android is synchronized to the back-end database, its local row is written to the back-end table without fear of conflicting with rows created from any other Android mobile.

    – rmirabelle
    Apr 3 '15 at 21:46






  • 1





    @MartinSmith I have used this approach myself and it works quite nicely. The GUID is just an alternate key, with a NonClustered index, and is passed in from the application, but only resides in the primary table. All related tables are related via the INT PK. I find it strange that this approach is not much more common given it is the best of both worlds. It seems like most people just prefer to solve problems in very absolutist terms, not realizing that the PK doesn't need to be a GUID in order for the app to still use GUIDs for global uniqueness and/or portability.

    – Solomon Rutzky
    Oct 7 '15 at 19:05






  • 1





    @rmirabelle I had thought about this approach and was hesitating, but your answer convinced me. Basically I'm in a situation where I need to have a unique identifier for a work item (that can come in over the network from anywhere), but I don't want to round-trip to the database first. GUIDs are a good solution for this but I imagine JOINs will become much slower if I don't have a sequential clustered key.

    – easuter
    Oct 31 '15 at 0:52






  • 1





    @easuter I agree with not adding ID fields "just for the sake of it", such as in many-to-many "bridge" tables where the PK should be a composite of the two FKs that are being related. But here it is not a trade-off since the ID field is not merely for the sake of it. Allowing the system to work efficiently is fairly important ;-). AND, I would argue that in your case, since the GUIDs are generated externally, those are not guaranteed unique, even if pragmatically they are. But the responsibility for data integrity is reason enough to have GUID be an alternate key and ID be PK in your case :)

    – Solomon Rutzky
    Oct 31 '15 at 15:26



















1














Some best practices out there still say that you should use the data type that holds the whole set of values you're going to use in as little memory as possible. For instance, if you're storing the number of employees in a small business and you're unlikely to reach 100, then no one would suggest using a bigint when an int (or even a smallint) would do.

Of course, the drawback of this is that it amounts to "Say no to scalability!"





Also, I know this is not totally related, but there's another factor here. When it isn't excessive, I usually recommend using a non-autogenerated primary key, if it makes sense. For instance, if you're saving drivers' information, don't bother creating a new autogenerated "ID" column; just use the license number.



I know this sounds really obvious, but I see that being forgotten quite often.



For context: this part of the answer was written from a data-theoretical point of view, where you want your PK to be the data that uniquely identifies a record. Most of the time we create such identifiers even though they already exist, hence the earlier advice.



However, it is very rare that you can have tight control over these datapoints, and as such, you may need to make corrections or adjustments. You can't do that with primary keys (well, you can, but it can be a pain).



Thanks @VahiD for the clarifications.
































  • Using meaningful primary keys is not recommended at all. Consider this scenario: someone entered the wrong license number and you've used it as a foreign key in 3-4 tables. How do you fix the mistake? Simply editing the license number might not be enough in this case.

    – VahiD
    Oct 7 '15 at 10:39






  • 1





    Funny: I read your comment and I thought "yes, of course", then went back to read my answer and thought "did I say that"? Funny how things change in a couple of years. I was probably coming from a more theoretical background, but unless you have tight control over it (rarely) it does not provide much benefit. I'll update the answer.

    – Alpha
    Oct 7 '15 at 17:13











  • upvote for the development in the years :)

    – VahiD
    Oct 8 '15 at 18:28



















0














Another thing to consider is how GUIDs are generated. mrdenny correctly pointed out that even if newsequentialid() is being used, restarting the instance causes new values to begin with the "holes" left behind in prior processing. Another thing that affects "sequential" GUIDs is the network card. If I remember correctly, the UID of the NIC is used as part of the GUID algorithm. If a NIC is replaced, there is no guarantee that the UID will be a higher value to maintain the sequential aspect of things. I am also not sure how multiple NICs might affect the assignment of values using the algorithm.



Just a thought and I hope I am remembering correctly. Have a great day!

























  • 2





    Welcome to Database Administrators, bobo8734. Could you find some sources for these comments? If you're unsure of them maybe they'd be better served as a comment (when you have the rep for it) than a standalone answer.

    – LowlyDBA
    May 1 '18 at 21:17





















0














Using auto-increment IDs might leak information about your business activity. If you are running a shop and use order_id to publicly identify a purchase, then anybody can find out your monthly number of sales by simple arithmetic.
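The "simple arithmetic" could look like the following Python sketch. The order ids and the function name `orders_between` are made up, and the estimate assumes ids are handed out sequentially without gaps, which is only approximately true for real auto-increment columns.

```python
def orders_between(first_order_id: int, last_order_id: int) -> int:
    """Estimate how many orders were placed between two observed order ids.

    Assumes ids are assigned sequentially with no gaps.
    """
    return last_order_id - first_order_id


# Hypothetical ids seen by a customer who places one order on the 1st of
# two consecutive months.
order_on_march_1 = 10_450
order_on_april_1 = 12_950

estimated_march_sales = orders_between(order_on_march_1, order_on_april_1)

if __name__ == "__main__":
    print(estimated_march_sales)  # 2500
```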





– golopot (new contributor)




























    -6














    Use both



    Use int/bigint for the primary key, as it is easy to maintain and to use in foreign key relations.

    But also add a GUID column, so that every row has a globally unique value as well.

























    • 1





      Explaining your reasoning behind this suggestion wouldn't hurt anyone, I'm sure.

      – Andriy M
      Jan 14 '16 at 6:29











    • A GUID is 36 characters long, which makes it difficult to read when you are searching for a specific case.

      – Abdul Hannan Ijaz
      Jan 14 '16 at 7:57






    • 1





      All right, but that doesn't really explain why the OP should use both int and guid, as you are suggesting in your answer. And besides, I wasn't talking about explaining your suggestion just to me – my point was that you might want to update your answer. By the way, are you aware that another answerer has already suggested the same (more or less) as you?

      – Andriy M
      Jan 14 '16 at 8:18













    • Yup i meant the same thing .. cool BTW :)

      – Abdul Hannan Ijaz
      Jan 14 '16 at 9:22




































    7 Answers
    7






    active

    oldest

    votes








    7 Answers
    7






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    83














    This has been asked in Stack Overflow here and here.



    Jeff's post explains a lot about pros and cons of using GUID.




    GUID Pros




    • Unique across every table, every database and every server

    • Allows easy merging of records from different databases

    • Allows easy distribution of databases across multiple servers

    • You can generate IDs anywhere, instead of having to roundtrip to the database

    • Most replication scenarios require GUID columns anyway


    GUID Cons




    • It is a whopping 4 times larger than the traditional 4-byte index
      value; this can have serious
      performance and storage implications
      if you're not careful

    • Cumbersome to debug (where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}')

    • The generated GUIDs should be partially sequential for best
      performance (eg, newsequentialid() on
      SQL Server 2005+) and to enable use of
      clustered indexes




    If you are certain about performance and you are not planning to replicate or merge records, then use int, and set it auto increment (identity seed in SQL Server).






    share|improve this answer





















    • 18





      Another con of the GUID approach is that you cannot use it as an identifier for your end-user. Do you really expect your users to tell you on the phone that they have an issue with Order "BAE7DF4-DDF-3RG-5TY3E3RF456AS10" ? :)

      – Brann
      Jul 8 '11 at 21:22






    • 2





      If you do not use sequential guids, and your primary key is clustered (the SQL Server defaul) then all your data inserts will be randomly scattered throughout the table, leading to massive fragmentation of your data. That is assuming that the data would normally be inserted in some sort of order, such as chronological.

      – datagod
      Aug 6 '11 at 4:32






    • 6





      Sequential guids are only sequential until the SQL instance is restarted. Then the first value will more than likely be lower than the prior one because of the way that the root value is generated, causing all sorts of problems all over again.

      – mrdenny
      Aug 6 '11 at 6:37








    • 16





      @Brann Ideally you wouldn't be given your PK values out to end-users in the first place. I know it is somewhat common to do so, and it is something I myself have done in the past before I learned not to. But since it shouldn't be done, that particular reason to prefer INT over GUID isn't a valid one.

      – Solomon Rutzky
      Oct 7 '15 at 18:54






    • 2





      @ChadKuehn Choosing UNIQUEIDENTIFIER over INT because INT has an upper limit is rather poor reasoning since being limitless, while true enough, is not a practical benefit. You can easily double the effective capacity of an INT by starting it at the lower limit (-2.14 billion) instead of at 1. Or, if the full 4.3 billion isn't enough, then start out with a BIGINT that is still only 8 bytes as compared to 16 for the GUID, and it is seqeuential.

      – Solomon Rutzky
      Oct 7 '15 at 18:59


















    83














    This has been asked in Stack Overflow here and here.



    Jeff's post explains a lot about pros and cons of using GUID.




    GUID Pros




    • Unique across every table, every database and every server

    • Allows easy merging of records from different databases

    • Allows easy distribution of databases across multiple servers

    • You can generate IDs anywhere, instead of having to roundtrip to the database

    • Most replication scenarios require GUID columns anyway


    GUID Cons




    • It is a whopping 4 times larger than the traditional 4-byte index
      value; this can have serious
      performance and storage implications
      if you're not careful

    • Cumbersome to debug (where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}')

    • The generated GUIDs should be partially sequential for best
      performance (eg, newsequentialid() on
      SQL Server 2005+) and to enable use of
      clustered indexes




    If you are certain about performance and you are not planning to replicate or merge records, then use int, and set it auto increment (identity seed in SQL Server).






    share|improve this answer





















    • 18





      Another con of the GUID approach is that you cannot use it as an identifier for your end-user. Do you really expect your users to tell you on the phone that they have an issue with Order "BAE7DF4-DDF-3RG-5TY3E3RF456AS10" ? :)

      – Brann
      Jul 8 '11 at 21:22






    • 2





      If you do not use sequential guids, and your primary key is clustered (the SQL Server defaul) then all your data inserts will be randomly scattered throughout the table, leading to massive fragmentation of your data. That is assuming that the data would normally be inserted in some sort of order, such as chronological.

      – datagod
      Aug 6 '11 at 4:32






    • 6





      Sequential guids are only sequential until the SQL instance is restarted. Then the first value will more than likely be lower than the prior one because of the way that the root value is generated, causing all sorts of problems all over again.

      – mrdenny
      Aug 6 '11 at 6:37








    • 16





      @Brann Ideally you wouldn't be given your PK values out to end-users in the first place. I know it is somewhat common to do so, and it is something I myself have done in the past before I learned not to. But since it shouldn't be done, that particular reason to prefer INT over GUID isn't a valid one.

      – Solomon Rutzky
      Oct 7 '15 at 18:54






    • 2





      @ChadKuehn Choosing UNIQUEIDENTIFIER over INT because INT has an upper limit is rather poor reasoning since being limitless, while true enough, is not a practical benefit. You can easily double the effective capacity of an INT by starting it at the lower limit (-2.14 billion) instead of at 1. Or, if the full 4.3 billion isn't enough, then start out with a BIGINT that is still only 8 bytes as compared to 16 for the GUID, and it is seqeuential.

      – Solomon Rutzky
      Oct 7 '15 at 18:59
















    83












    83








    83







    This has been asked in Stack Overflow here and here.



    Jeff's post explains a lot about pros and cons of using GUID.




    GUID Pros




    • Unique across every table, every database and every server

    • Allows easy merging of records from different databases

    • Allows easy distribution of databases across multiple servers

    • You can generate IDs anywhere, instead of having to roundtrip to the database

    • Most replication scenarios require GUID columns anyway


    GUID Cons




    • It is a whopping 4 times larger than the traditional 4-byte index value; this can have serious performance and storage implications if you're not careful

    • Cumbersome to debug (where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}')

    • The generated GUIDs should be partially sequential for best performance (e.g., newsequentialid() on SQL Server 2005+) and to enable use of clustered indexes
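
To make the "partially sequential" point concrete, here is a toy Python model, not SQL Server itself. It keeps keys in sorted order, as a clustered index does, and counts how many inserts land somewhere other than the end. Plain byte-order comparison is an assumption; SQL Server orders uniqueidentifier values by its own rules, so treat the numbers as illustrative only.

```python
import bisect
import uuid


def mid_table_inserts(keys):
    """Count inserts that do not land at the end of an ordered index."""
    ordered = []
    count = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        if pos != len(ordered):
            count += 1          # this row had to be placed between existing rows
        ordered.insert(pos, key)
    return count


N = 1000
sequential_keys = list(range(1, N + 1))                 # like an IDENTITY column
random_keys = [uuid.uuid4().bytes for _ in range(N)]    # like random GUIDs

print("sequential:", mid_table_inserts(sequential_keys))
print("random:    ", mid_table_inserts(random_keys))
```

Ever-increasing keys always append, while nearly every random key has to be placed between existing rows, which is what leads to page splits in a real clustered index.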




    If performance is a concern and you are not planning to replicate or merge records, then use an int and set it to auto-increment (an IDENTITY column in SQL Server).
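
As a rough back-of-the-envelope sketch of the size argument, the Python below compares the raw bytes spent on the key alone. The row count and the number of index copies are hypothetical, and page and row overhead are ignored. The index multiplier is there because in SQL Server each nonclustered index row also carries the clustering key.

```python
import struct
import uuid

INT_BYTES = struct.calcsize("<i")       # 4 bytes, like INT
BIGINT_BYTES = struct.calcsize("<q")    # 8 bytes, like BIGINT
GUID_BYTES = len(uuid.uuid4().bytes)    # 16 bytes, like UNIQUEIDENTIFIER


def key_storage_mib(rows, key_bytes, index_copies=1):
    """Raw bytes spent on the key alone, ignoring page and row overhead."""
    return rows * key_bytes * index_copies / (1024 * 1024)


ROWS = 100_000_000   # hypothetical table size
COPIES = 4           # hypothetical: clustered index plus three nonclustered indexes

for name, size in [("int", INT_BYTES), ("bigint", BIGINT_BYTES), ("guid", GUID_BYTES)]:
    print(f"{name:>6}: {key_storage_mib(ROWS, size, COPIES):,.0f} MiB")

# The "generate IDs anywhere" advantage: no database round trip is needed.
new_id = uuid.uuid4()
```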














    edited Mar 29 '18 at 16:33









    ypercubeᵀᴹ











    answered Jan 5 '11 at 8:17









    CoderHawk








    • 18





      Another con of the GUID approach is that you cannot use it as an identifier for your end-user. Do you really expect your users to tell you on the phone that they have an issue with Order "BAE7DF4-DDF-3RG-5TY3E3RF456AS10"? :)

      – Brann
      Jul 8 '11 at 21:22






    • 2





      If you do not use sequential GUIDs, and your primary key is clustered (the SQL Server default), then all your data inserts will be randomly scattered throughout the table, leading to massive fragmentation of your data. That is assuming that the data would normally be inserted in some sort of order, such as chronological.

      – datagod
      Aug 6 '11 at 4:32






    • 6





      Sequential guids are only sequential until the SQL instance is restarted. Then the first value will more than likely be lower than the prior one because of the way that the root value is generated, causing all sorts of problems all over again.

      – mrdenny
      Aug 6 '11 at 6:37








    • 16





      @Brann Ideally you wouldn't be giving your PK values out to end-users in the first place. I know it is somewhat common to do so, and it is something I myself have done in the past before I learned not to. But since it shouldn't be done, that particular reason to prefer INT over GUID isn't a valid one.

      – Solomon Rutzky
      Oct 7 '15 at 18:54






    • 2





      @ChadKuehn Choosing UNIQUEIDENTIFIER over INT because INT has an upper limit is rather poor reasoning since being limitless, while true enough, is not a practical benefit. You can easily double the effective capacity of an INT by starting it at the lower limit (-2.14 billion) instead of at 1. Or, if the full 4.3 billion isn't enough, then start out with a BIGINT that is still only 8 bytes as compared to 16 for the GUID, and it is sequential.

      – Solomon Rutzky
      Oct 7 '15 at 18:59
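
The capacity figures in the last comment can be checked with a few lines of Python. This is only arithmetic on the signed 32-bit and 64-bit ranges; the insert rate used below is a made-up number for illustration. In T-SQL, seeding at the lower limit would be written as IDENTITY(-2147483648, 1).

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1
BIGINT_MAX = 2**63 - 1

values_from_one = INT_MAX                   # identity seeded at 1
values_from_min = INT_MAX - INT_MIN + 1     # identity seeded at INT_MIN

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_until_exhausted(capacity, inserts_per_second):
    """How long a key space lasts at a constant insert rate."""
    return capacity / (inserts_per_second * SECONDS_PER_YEAR)


print(f"INT from 1:       {values_from_one:,} values")
print(f"INT from minimum: {values_from_min:,} values")
# Hypothetical sustained rate of one million inserts per second.
print(f"BIGINT lasts about {years_until_exhausted(BIGINT_MAX, 1_000_000):,.0f} years")
```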































    17





















    If you're synchronizing your data with an external source, a persistent GUID can be much better. A quick example of where we're using GUIDs is a tool that is sent to the customer to crawl their network and do certain classes of auto-discovery, store the records found, and then all the customer records are integrated into a central database back on our end. If we used an integer, we would have 7,398 "1"s, and it'd be a lot harder to keep track of which "1" was which.
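
A minimal Python sketch of that merge problem, with hypothetical names throughout (discover and merge are stand-ins, not the actual tool): each customer site numbers its records independently, and the central store counts how many incoming ids clash.

```python
import uuid


def discover(n, use_guid):
    """Hypothetical stand-in for one customer's discovery run producing n records."""
    return [
        {"id": uuid.uuid4() if use_guid else i, "payload": f"device-{i}"}
        for i in range(1, n + 1)
    ]


def merge(sources):
    """Merge per-customer record lists into one store keyed by id."""
    central = {}
    collisions = 0
    for records in sources:
        for record in records:
            if record["id"] in central:
                collisions += 1     # another customer already used this id
            else:
                central[record["id"]] = record
    return central, collisions


int_store, int_collisions = merge([discover(5, use_guid=False) for _ in range(3)])
guid_store, guid_collisions = merge([discover(5, use_guid=True) for _ in range(3)])

print(len(int_store), int_collisions)
print(len(guid_store), guid_collisions)
```

With integers, every customer after the first collides on every id unless a customer identifier is added to the key; with GUIDs the records merge as they are.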














    edited Jan 6 '14 at 17:41

























    answered Jan 5 '11 at 8:13









    TML








    • 3





      GUIDs are definitely good as external identifiers, and I would keep a non-clustered index of that as the "external key". I would still keep an int as the "internal key", which is the basis for the clustered index and foreign key relationships. If something is going to cross an architectural boundary (e.g. communicating with another app) I do appreciate having something that cannot be mixed up.

      – Greg
      Oct 7 '15 at 18:07

























    14





















    I have used a hybrid approach with success. Tables contain BOTH an auto-increment primary key integer id column AND a guid column. The guid can be used as needed to globally uniquely identify the row and id can be used for queries, sorting and human identification of the row.
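
One way the hybrid layout might look, sketched in Python with the standard sqlite3 module. The table and column names (work_item, work_note) are invented for illustration, and SQLite stands in for whatever engine is actually used, so clustered versus nonclustered index behaviour is not modelled here.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE work_item (
    id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal key for joins and FKs
    guid  TEXT NOT NULL UNIQUE,               -- external, globally unique key
    title TEXT NOT NULL
);
CREATE TABLE work_note (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    work_item_id INTEGER NOT NULL REFERENCES work_item(id),
    body         TEXT NOT NULL
);
""")


def add_item(title):
    """Insert a row; the GUID is generated by the caller, the id by the database."""
    guid = str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO work_item (guid, title) VALUES (?, ?)", (guid, title)
    )
    return cur.lastrowid, guid


item_id, item_guid = add_item("first item")
conn.execute(
    "INSERT INTO work_note (work_item_id, body) VALUES (?, ?)", (item_id, "a note")
)

# Outside callers look rows up by GUID; the join itself runs on the small int key.
row = conn.execute(
    """
    SELECT n.body
    FROM work_item AS w
    JOIN work_note AS n ON n.work_item_id = w.id
    WHERE w.guid = ?
    """,
    (item_guid,),
).fetchone()
print(row)
```

Only the parent table stores the GUID; related tables reference the integer id, which keeps foreign keys and their indexes small.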
















    answered Apr 3 '15 at 17:04









    rmirabelle








    • 3





      What value does the GUID give if the id already is sufficient for humans to identify a row?

      – Martin Smith
      Apr 3 '15 at 18:06






    • 6





      The id identifies the row in this table. The GUID (at least in theory) identifies this row anywhere in the known universe. In my project, Android mobiles each have a structurally identical copy of the table on a local SQLite database. The row and its GUID are each generated on Android. Then, when Android is synchronized to the back-end database, its local row is written to the back-end table without fear of conflicting with rows created from any other Android mobile.

      – rmirabelle
      Apr 3 '15 at 21:46






    • 1





      @MartinSmith I have used this approach myself and it works quite nicely. The GUID is just an alternate key, with a NonClustered index, and is passed in from the application, but only resides in the primary table. All related tables are related via the INT PK. I find it strange that this approach is not much more common given it is the best of both worlds. It seems like most people just prefer to solve problems in very absolutist terms, not realizing that the PK doesn't need to be a GUID in order for the app to still use GUIDs for global uniqueness and/or portability.

      – Solomon Rutzky
      Oct 7 '15 at 19:05






    • 1





      @rmirabelle I had thought about this approach and was hesitating, but your answer convinced me. Basically I'm in a situation where I need to have a unique identifier for a work item (that can come in over the network from anywhere), but I don't want to round-trip to the database first. GUIDs are a good solution for this but I imagine JOINs will become much slower if I don't have a sequential clustered key.

      – easuter
      Oct 31 '15 at 0:52






    • 1





      @easuter I agree with not adding ID fields "just for the sake of it", such as in many-to-many "bridge" tables where the PK should be a composite of the two FKs that are being related. But here it is not a trade-off since the ID field is not merely for the sake of it. Allowing the system to work efficiently is fairly important ;-). AND, I would argue that in your case, since the GUIDs are generated externally, those are not guaranteed unique, even if pragmatically they are. But the responsibility for data integrity is reason enough to have GUID be an alternate key and ID be PK in your case :)

      – Solomon Rutzky
      Oct 31 '15 at 15:26
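
The device-to-back-end synchronisation described in the comments could be sketched as follows. This is an assumption-laden illustration rather than the commenter's actual code: the note table, the helper names, and the use of in-memory SQLite databases for both the phones and the back end are all hypothetical. The UNIQUE constraint on guid is what lets the back end enforce uniqueness for externally generated values.

```python
import sqlite3
import uuid


def make_db():
    """Create one database with the shared table layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE note ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " guid TEXT NOT NULL UNIQUE,"
        " body TEXT NOT NULL)"
    )
    return conn


def create_local(conn, body):
    """Create a row on the device; the GUID is generated locally."""
    conn.execute(
        "INSERT INTO note (guid, body) VALUES (?, ?)", (str(uuid.uuid4()), body)
    )


def sync(device, backend):
    """Copy device rows to the back end, skipping GUIDs it already has."""
    rows = device.execute("SELECT guid, body FROM note").fetchall()
    backend.executemany(
        "INSERT OR IGNORE INTO note (guid, body) VALUES (?, ?)", rows
    )


phone_a, phone_b, backend = make_db(), make_db(), make_db()
create_local(phone_a, "from A")
create_local(phone_b, "from B")

sync(phone_a, backend)
sync(phone_b, backend)
sync(phone_a, backend)   # syncing the same device again adds nothing

local_ids = [db.execute("SELECT id FROM note").fetchone()[0] for db in (phone_a, phone_b)]
backend_count = backend.execute("SELECT COUNT(*) FROM note").fetchone()[0]
print(local_ids, backend_count)
```

Both phones use the same local integer id for their first row, yet the rows do not conflict on the back end because they are matched on the GUID, while the back end assigns its own integer ids.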

























    1














    Some best practices out there still say that you should use the data type that accommodates the whole set of values you're going to store in the least memory possible. For instance, if you're storing the number of employees in a small business and you're unlikely to reach 100, then no one would suggest using a bigint value when int (or even smallint) would do.

    Of course, the drawback is that this is like saying "no" to scalability!
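
As a small illustration of the "smallest type that fits" rule, the Python below picks an integer type from a value range. The answer does not name a DBMS; the ranges used are SQL Server's (tinyint is unsigned there), which is an assumption, and smallest_type is a hypothetical helper.

```python
# (name, storage bytes, minimum, maximum) for SQL Server integer types
INTEGER_TYPES = [
    ("tinyint", 1, 0, 255),
    ("smallint", 2, -2**15, 2**15 - 1),
    ("int", 4, -2**31, 2**31 - 1),
    ("bigint", 8, -2**63, 2**63 - 1),
]


def smallest_type(max_value, min_value=0):
    """Return (name, bytes) of the narrowest type covering [min_value, max_value]."""
    for name, size, low, high in INTEGER_TYPES:
        if low <= min_value and max_value <= high:
            return name, size
    raise ValueError("value range does not fit in bigint")


print(smallest_type(100))              # a head count that stays under 100
print(smallest_type(40_000))           # outgrows smallint
print(smallest_type(3_000_000_000))    # outgrows int
```

The scalability drawback shows up as soon as the assumed maximum is wrong: the second and third calls need a wider type than the first.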





    Also, I know this is not totally related, but there's another factor regarding this. When not excesive, I usually try to recommend to use a non-autogenerated primary key, if it does make sense. For instance, if you're saving driver's information, don't bother in creating a new autogenerated column for "ID", just use the license number.



    I know this sounds really obvious, but I see that being forgotten quite often.



    For context: this part of the answer was written from a data-theory perspective, where you want the PK to be the value that uniquely identifies a record. Most of the time we create such an identifier when one already exists in the data, hence the advice above.



    However, it is very rare to have tight control over these data points, so you may need to make corrections or adjustments. You can't do that with primary keys (well, you can, but it can be a pain).
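As an illustration of why corrections are easier with a surrogate key, here is a small sketch using Python's built-in sqlite3 module. The driver and ticket tables, their columns and the license numbers are all hypothetical; the license number is kept as a unique alternate key rather than the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE driver (
        driver_id      INTEGER PRIMARY KEY,   -- surrogate key, never edited
        license_number TEXT NOT NULL UNIQUE   -- natural key kept as alternate key
    );
    CREATE TABLE ticket (
        ticket_id INTEGER PRIMARY KEY,
        driver_id INTEGER NOT NULL REFERENCES driver(driver_id)
    );
    INSERT INTO driver (driver_id, license_number) VALUES (1, 'AB-12345');
    INSERT INTO ticket (driver_id) VALUES (1), (1), (1);
""")

# Fixing a mistyped license number touches exactly one row; the tickets
# keep pointing at driver_id 1 and need no change.
cur = conn.execute(
    "UPDATE driver SET license_number = ? WHERE driver_id = ?",
    ("AB-12346", 1),
)
rows_changed = cur.rowcount
tickets_for_driver = conn.execute(
    "SELECT COUNT(*) FROM ticket WHERE driver_id = 1"
).fetchone()[0]
```

Had the license number itself been the primary key, the same correction would also have to be propagated to every referencing row in the ticket table.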



    Thanks @VahiD for the clarifications.
































    • Using meaningful primary keys is not recommended at all. Consider this scenario: someone entered the wrong license number and you have used it as a foreign key in 3-4 tables. How do you fix the mistake? Simply editing the license number may not be enough in this case.

      – VahiD
      Oct 7 '15 at 10:39






    • 1





      Funny: I read your comment and I thought "yes, of course", then went back to read my answer and thought "did I say that"? Funny how things change in a couple of years. I was probably coming from a more theoretical background, but unless you have tight control over it (rarely) it does not provide much benefit. I'll update the answer.

      – Alpha
      Oct 7 '15 at 17:13











    • upvote for the development in the years :)

      – VahiD
      Oct 8 '15 at 18:28
























    edited Oct 7 '15 at 17:16

























    answered Aug 6 '11 at 0:57









    Alpha
























    0














    Another point concerns how GUIDs are generated. mrdenny correctly pointed out that even if newsequentialid() is used, restarting the instance causes new values to start in the "holes" left behind by prior processing. Another thing that affects "sequential" GUIDs is the network card. If I remember correctly, the NIC's hardware identifier (its MAC address) is used as part of the GUID algorithm. If a NIC is replaced, there is no guarantee that the new identifier will be a higher value, so the sequential ordering may not be maintained. I am also not sure how multiple NICs affect the values the algorithm assigns.
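The idea of a node identifier being baked into a time-based GUID can be sketched with Python's uuid.uuid1(), which builds a version-1 UUID from a timestamp and a 48-bit node value. This is only a stand-in: it does not reproduce the byte order or exact behavior of SQL Server's newsequentialid(), and the two MAC addresses below are hypothetical.

```python
import uuid

# Hypothetical MAC addresses for an old and a replacement network card.
OLD_NIC = 0x00FF11223344
NEW_NIC = 0x000A11223344  # numerically lower than OLD_NIC

before = uuid.uuid1(node=OLD_NIC)
after = uuid.uuid1(node=NEW_NIC)

# The node value is embedded verbatim in each version-1 UUID, so swapping
# the card changes that part of every GUID generated afterwards.
node_is_embedded = before.node == OLD_NIC and after.node == NEW_NIC

# The replacement card's node happens to be lower than the old one.
new_node_is_lower = after.node < before.node
```

Whether a lower node value actually disturbs the sort order depends on how the database orders the bytes of a GUID, which this sketch does not model.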



    Just a thought and I hope I am remembering correctly. Have a great day!

























    • 2





      Welcome to Database Administrators, bobo8734. Could you find some sources for these comments? If you're unsure of them maybe they'd be better served as a comment (when you have the rep for it) than a standalone answer.

      – LowlyDBA
      May 1 '18 at 21:17




























    answered May 1 '18 at 21:12









    bobo8734





















    0














    Using auto-increment IDs might leak information about your business activity. If you are running a shop and use order_id to publicly identify a purchase, then anybody can work out your monthly number of sales by simple arithmetic: place two orders some time apart and subtract the IDs.
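The arithmetic is simple enough to sketch. The function name, order IDs and dates below are made up for illustration; the only assumption is that IDs increase by one per order.

```python
from datetime import date


def estimate_orders_per_month(first_id: int, first_day: date,
                              second_id: int, second_day: date) -> float:
    """Estimate order volume from two sequential IDs observed on known dates."""
    days = (second_day - first_day).days
    if days <= 0:
        raise ValueError("second observation must be later than the first")
    orders_per_day = (second_id - first_id) / days
    return orders_per_day * 30  # rough 30-day month


# An outsider places one order on 1 January and another on 31 January.
estimate = estimate_orders_per_month(
    10_450, date(2019, 1, 1),
    13_450, date(2019, 1, 31),
)
```

Gaps in the sequence (rolled-back inserts, reserved ranges) make the estimate an upper bound rather than an exact count, but it is usually close enough to be sensitive.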














        answered 7 mins ago









        golopot



























            -6














            Use both.



            Use int/bigint for the primary key, as it is easy to maintain and to use in foreign key relationships.



            But also add a GUID column, so that every row has a globally unique value as well.
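A minimal sketch of the "use both" layout, using Python's built-in sqlite3 and uuid modules. The customer and purchase tables, the public_id column name and the add_customer helper are all hypothetical; SQLite has no native GUID type, so the value is stored as text here.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- compact key used for joins and FKs
        public_id   TEXT NOT NULL UNIQUE,  -- GUID exposed outside the database
        name        TEXT NOT NULL
    );
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    );
""")


def add_customer(name):
    """Insert a customer and return (internal integer id, public GUID)."""
    public_id = str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO customer (public_id, name) VALUES (?, ?)",
        (public_id, name),
    )
    return cur.lastrowid, public_id


internal_id, public_id = add_customer("Ada")
conn.execute("INSERT INTO purchase (customer_id) VALUES (?)", (internal_id,))

# External callers look rows up by GUID; joins still run on the integer key.
row = conn.execute(
    "SELECT c.name, COUNT(p.purchase_id) FROM customer c "
    "JOIN purchase p ON p.customer_id = c.customer_id "
    "WHERE c.public_id = ? GROUP BY c.customer_id",
    (public_id,),
).fetchone()
```

The integer key stays small in every foreign key and index, while the GUID gives each row an identifier that can be shared or merged across systems.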

























            • 1





              Explaining your reasoning behind this suggestion wouldn't hurt anyone, I'm sure.

              – Andriy M
              Jan 14 '16 at 6:29











            • A GUID is 36 characters long, which makes it difficult to read when you are searching for a specific case.

              – Abdul Hannan Ijaz
              Jan 14 '16 at 7:57






            • 1





              All right, but that doesn't really explain why the OP should use both int and guid, as you are suggesting in your answer. And besides, I wasn't talking about explaining your suggestion just to me – my point was that you might want to update your answer. By the way, are you aware that another answerer has already suggested the same (more or less) as you?

              – Andriy M
              Jan 14 '16 at 8:18













            • Yup, I meant the same thing. Cool, BTW :)

              – Abdul Hannan Ijaz
              Jan 14 '16 at 9:22


























            answered Jan 5 '16 at 12:58









            Abdul Hannan Ijaz

























