Guid vs INT - Which is better as a primary key?
I've been reading about the reasons to use, or not use, Guid and int.
int is smaller, faster, easy to remember, and keeps a chronological sequence. As for Guid, the only advantage I found is that it is unique. In which cases would a Guid be better than an int, and why?
From what I've seen, int has no flaws except its number limit, which in many cases is irrelevant.
Why exactly was Guid created? I actually think it has a purpose other than serving as the primary key of a simple table. (Any example of a real application using Guid for something?)
(Guid = the UniqueIdentifier type on SQL Server)
performance sql-server primary-key uniqueidentifier
asked Jan 5 '11 at 7:46
BrunoLM
1
Rather than primary key, I think you mean surrogate key i.e. a key that is not the natural key (the latter being the key we use in the real world). Possibly you mean clustered index.
– onedaywhen
Jan 31 '12 at 8:48
Also remember the difference between (Primary) KEY and INDEX.
– Allan S. Hansen
Jan 7 '14 at 6:55
1
Also discussed on SO: stackoverflow.com/questions/11033435/…
– Jon of All Trades
Jun 22 '15 at 21:41
2
"int has no flaws except by the number limit, which in many cases are irrelevant.": actually, within this context of INT vs GUID, the upper limit of a signed, 32-bit INT is entirely irrelevant given that the upper limit of a signed, 64-bit BIGINT is well beyond nearly all uses (even more so if you start numbering at the lower limit; and same goes for INT) and it is still half the size of a GUID (8 bytes instead of 16) and sequential.
– Solomon Rutzky
Oct 7 '15 at 18:41
7 Answers
This has been asked in Stack Overflow here and here.
Jeff's post explains a lot about pros and cons of using GUID.
GUID Pros
- Unique across every table, every database and every server
- Allows easy merging of records from different databases
- Allows easy distribution of databases across multiple servers
- You can generate IDs anywhere, instead of having to roundtrip to the database
- Most replication scenarios require GUID columns anyway
GUID Cons
- It is a whopping 4 times larger than the traditional 4-byte index value; this can have serious performance and storage implications if you're not careful
- Cumbersome to debug (where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}')
- The generated GUIDs should be partially sequential for best performance (e.g., newsequentialid() on SQL Server 2005+) and to enable use of clustered indexes
If you are certain about performance and you are not planning to replicate or merge records, then use int, and set it to auto-increment (an IDENTITY column in SQL Server).
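To make the "partially sequential" point more concrete, here is a rough Python sketch. It is only a toy model: it assumes a clustered index behaves like a sorted list, and the names insert_positions and fraction_appended are made up for illustration. It counts how many inserts land at the end of the index (cheap appends) versus somewhere in the middle (the kind of insert that tends to cause page splits and fragmentation).

```python
import bisect
import uuid


def insert_positions(keys):
    """For each key, return how far from the end of the sorted index it landed.

    A distance of 0 means the key was appended at the end.
    """
    index = []
    distances = []
    for key in keys:
        pos = bisect.bisect_left(index, key)
        distances.append(len(index) - pos)
        index.insert(pos, key)
    return distances


def fraction_appended(keys):
    """Fraction of inserts that were plain appends at the end of the index."""
    distances = insert_positions(keys)
    return sum(1 for d in distances if d == 0) / len(distances)


# Auto-increment style keys: always larger than everything already stored.
sequential_keys = list(range(1, 2001))
# Random (version 4) GUIDs: each new key can land anywhere in the index.
random_keys = [uuid.uuid4().bytes for _ in range(2000)]

print(fraction_appended(sequential_keys))  # 1.0
print(fraction_appended(random_keys))      # a small fraction; varies per run
```

A real storage engine works with pages rather than a flat list, so treat the numbers as a qualitative picture only.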
18
Another con of the GUID approach is that you cannot use it as an identifier for your end-user. Do you really expect your users to tell you on the phone that they have an issue with Order "BAE7DF4-DDF-3RG-5TY3E3RF456AS10" ? :)
– Brann
Jul 8 '11 at 21:22
2
If you do not use sequential guids, and your primary key is clustered (the SQL Server default), then all your data inserts will be randomly scattered throughout the table, leading to massive fragmentation of your data. That is assuming that the data would normally be inserted in some sort of order, such as chronological.
– datagod
Aug 6 '11 at 4:32
6
Sequential guids are only sequential until the SQL instance is restarted. Then the first value will more than likely be lower than the prior one because of the way that the root value is generated, causing all sorts of problems all over again.
– mrdenny
Aug 6 '11 at 6:37
16
@Brann Ideally you wouldn't be giving your PK values out to end-users in the first place. I know it is somewhat common to do so, and it is something I myself have done in the past before I learned not to. But since it shouldn't be done, that particular reason to prefer INT over GUID isn't a valid one.
– Solomon Rutzky
Oct 7 '15 at 18:54
2
@ChadKuehn Choosing UNIQUEIDENTIFIER over INT because INT has an upper limit is rather poor reasoning since being limitless, while true enough, is not a practical benefit. You can easily double the effective capacity of an INT by starting it at the lower limit (-2.14 billion) instead of at 1. Or, if the full 4.3 billion isn't enough, then start out with a BIGINT that is still only 8 bytes as compared to 16 for the GUID, and it is sequential.
– Solomon Rutzky
Oct 7 '15 at 18:59
If you're synchronizing your data with an external source, a persistent GUID can be much better. A quick example of where we use GUIDs is a tool that is sent to the customer to crawl their network and do certain classes of auto-discovery, store the records found, and then all the customer records are integrated into a central database back on our end. If we used an integer, we would have 7,398 "1"s, and it'd be a lot harder to keep track of which "1" was which.
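A small Python sketch of that merge problem, under made-up assumptions: the host names and the helper functions discover_with_int_keys, discover_with_guid_keys and merge are hypothetical, and plain dicts stand in for the customer and central databases.

```python
import uuid


def discover_with_int_keys(hosts):
    """Each customer site numbers its own records starting at 1."""
    return {i: host for i, host in enumerate(hosts, start=1)}


def discover_with_guid_keys(hosts):
    """Each customer site assigns a freshly generated GUID to every record."""
    return {uuid.uuid4(): host for host in hosts}


def merge(*sites):
    """Merge site records into one central dict and count key collisions."""
    central = {}
    collisions = 0
    for site in sites:
        for key, record in site.items():
            if key in central:
                collisions += 1
            else:
                central[key] = record
    return central, collisions


site_a = ["printer-a", "router-a"]
site_b = ["printer-b", "router-b"]

_, int_collisions = merge(
    discover_with_int_keys(site_a), discover_with_int_keys(site_b)
)
merged, guid_collisions = merge(
    discover_with_guid_keys(site_a), discover_with_guid_keys(site_b)
)

print(int_collisions)   # 2: both sites have a record "1" and a record "2"
print(guid_collisions)  # 0 in practice
print(len(merged))      # 4
```

With integer keys you would need an extra "which customer" column or a renumbering step before merging; with GUIDs the records can be loaded as they are.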
3
GUIDs are definitely good as external identifiers, and I would keep a non-clustered index of that as the "external key". I would still keep an int as the "internal key", which is the basis for the clustered index and foreign key relationships. If something is going to cross an architectural boundary (e.g. communicating with another app) I do appreciate having something that cannot be mixed up.
– Greg
Oct 7 '15 at 18:07
I have used a hybrid approach with success. Tables contain BOTH an auto-increment primary key integer id column AND a guid column. The guid can be used as needed to globally uniquely identify the row, and id can be used for queries, sorting and human identification of the row.
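Here is a minimal sketch of what such a hybrid table could look like, using Python's built-in sqlite3 module rather than SQL Server so that it runs anywhere. The table name work_item, the title column and the add_work_item helper are assumptions for illustration, not part of the original setup.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE work_item (
        id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal key: joins, sorting
        guid  TEXT NOT NULL UNIQUE,               -- global identifier (alternate key)
        title TEXT NOT NULL
    )
    """
)


def add_work_item(conn, title, guid=None):
    """Insert a row; the GUID may be supplied by the caller or generated here."""
    guid = guid or str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO work_item (guid, title) VALUES (?, ?)", (guid, title)
    )
    return cur.lastrowid, guid


first_id, first_guid = add_work_item(conn, "first")
second_id, second_guid = add_work_item(conn, "second")

# Look a row up by its global identifier, then keep using the small id internally.
row = conn.execute(
    "SELECT id, title FROM work_item WHERE guid = ?", (second_guid,)
).fetchone()
print(row)  # (2, 'second')
```

The UNIQUE constraint on guid means a row synchronized twice with the same GUID is rejected instead of silently duplicated, while related tables only ever need to reference the integer id.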
3
What value does the GUID give if the id already is sufficient for humans to identify a row?
– Martin Smith
Apr 3 '15 at 18:06
6
The id identifies the row in this table. The GUID (at least in theory) identifies this row anywhere in the known universe. In my project, Android mobiles each have a structurally identical copy of the table on a local SQLite database. The row and its GUID are each generated on Android. Then, when Android is synchronized to the back-end database, its local row is written to the back-end table without fear of conflicting with rows created from any other Android mobile.
– rmirabelle
Apr 3 '15 at 21:46
1
@MartinSmith I have used this approach myself and it works quite nicely. The GUID is just an alternate key, with a NonClustered index, and is passed in from the application, but only resides in the primary table. All related tables are related via the INT PK. I find it strange that this approach is not much more common given it is the best of both worlds. It seems like most people just prefer to solve problems in very absolutist terms, not realizing that the PK doesn't need to be a GUID in order for the app to still use GUIDs for global uniqueness and/or portability.
– Solomon Rutzky
Oct 7 '15 at 19:05
1
@rmirabelle I had thought about this approach and was hesitating, but your answer convinced me. Basically I'm in a situation where I need to have a unique identifier for a work item (that can come in over the network from anywhere), but I don't want to round-trip to the database first. GUIDs are a good solution for this but I imagine JOINs will become much slower if I don't have a sequential clustered key.
– easuter
Oct 31 '15 at 0:52
1
@easuter I agree with not adding ID fields "just for the sake of it", such as in many-to-many "bridge" tables where the PK should be a composite of the two FKs that are being related. But here it is not a trade-off since the ID field is not merely for the sake of it. Allowing the system to work efficiently is fairly important ;-). AND, I would argue that in your case, since the GUIDs are generated externally, those are not guaranteed unique, even if pragmatically they are. But the responsibility for data integrity is reason enough to have GUID be an alternate key and ID be PK in your case :)
– Solomon Rutzky
Oct 31 '15 at 15:26
Some best practices out there still say that you should use the data type that accommodates the whole set of values you're going to use with the least memory possible. For instance, if you're using it to number the employees of a small business and you're unlikely to ever get to 100, then no one would suggest using a bigint value when an int (or even a smallint) would do.
Of course, the drawback of this is like "Say no to scalability!"
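For reference, the storage sizes and value ranges involved can be worked out with a few lines of Python. This is only arithmetic on the usual sizes (2, 4 and 8 bytes for smallint, int and bigint, 16 bytes for a GUID); the signed_range helper is a name invented for this sketch.

```python
import uuid


def signed_range(n_bytes):
    """Smallest and largest value of a signed integer stored in n_bytes bytes."""
    bits = 8 * n_bytes
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


KEY_BYTES = {"smallint": 2, "int": 4, "bigint": 8}

for name, size in KEY_BYTES.items():
    low, high = signed_range(size)
    print(f"{name:8} {size} bytes  {low} .. {high}")

# A GUID is always 16 bytes, whatever its textual representation looks like.
guid_bytes = len(uuid.uuid4().bytes)
print(f"guid     {guid_bytes} bytes")
```

So a smallint already covers tens of thousands of rows, and moving up to bigint costs only 8 bytes per key, still half of a GUID.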
Also, I know this is not totally related, but there's another factor here. When it isn't excessive, I usually recommend using a non-autogenerated primary key, if it makes sense. For instance, if you're saving driver information, don't bother creating a new autogenerated "ID" column; just use the license number.
I know this sounds really obvious, but I see that being forgotten quite often.
For context: this part of the answer was written from a data-theoretical point of view, where you want your PK to be the piece of data that uniquely identifies a record. Most of the time we create such identifiers when they already exist, hence the advice above.
However, it is very rare that you have tight control over these data points, and as such, you may need to make corrections or adjustments. You can't easily do that with primary keys (well, you can, but it can be a pain).
Thanks @VahiD for the clarifications.
Using meaningful primary keys is not recommended at all. Consider the scenario below: someone entered the wrong license number and you've used this id in 3-4 tables as a foreign key. How do you fix this mistake? Simply editing the license number would not be enough in this case.
– VahiD
Oct 7 '15 at 10:39
1
Funny: I read your comment and I thought "yes, of course", then went back to read my answer and thought "did I say that"? Funny how things change in a couple of years. I was probably coming from a more theoretical background, but unless you have tight control over it (rarely) it does not provide much benefit. I'll update the answer.
– Alpha
Oct 7 '15 at 17:13
upvote for the development in the years :)
– VahiD
Oct 8 '15 at 18:28
Another thing with how GUIDs are generated. mrdenny correctly pointed out that even if newsequentialid() is being used, restarting the instances causes new values to begin with the "holes" left behind in prior processing. Another thing that affects "sequential" GUIDs is the network card. If I remember correctly, the UID of the NIC is used as part of the GUID algorithm. If a NIC is replaced, there is no guarantee that the UID will be a higher value to maintain the sequential aspect of things. I am also not sure how multiple NICs might affect the assignment of values using the algorithm.
Just a thought and I hope I am remembering correctly. Have a great day!
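I can't confirm the details for SQL Server, but the general idea of a network card id being baked into a time-based GUID can be seen with Python's version-1 UUIDs. This is only an analogy: it is an assumption that newsequentialid() behaves in a comparable way, and the two node values below are invented stand-ins for an old and a new NIC.

```python
import uuid

# Hypothetical 48-bit node ids standing in for two network cards.
OLD_NIC = 0x00AA11BB22CC
NEW_NIC = 0x00001122AABB

before = uuid.uuid1(node=OLD_NIC)  # generated with the old card
after = uuid.uuid1(node=NEW_NIC)   # generated after the card was replaced

print(before.node == OLD_NIC)    # True: the node id is embedded in the value
print(after.time > before.time)  # True: the timestamp part still increases
print(after.node < before.node)  # True: the node part went down with the new card
```

Whether the overall sort order survives such a swap depends on which part of the value the database compares first, which is exactly the part I am unsure about.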
2
Welcome to Database Administrators, bobo8734. Could you find some sources for these comments? If you're unsure of them maybe they'd be better served as a comment (when you have the rep for it) than a standalone answer.
– LowlyDBA
May 1 '18 at 21:17
add a comment |
Using auto-increment IDs might leak information about your business activity. If you are running a shop and use order_id to publicly identify a purchase, then anybody can find out your monthly number of sales by simple arithmetic.
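The arithmetic is as simple as it sounds. In this sketch the two order numbers are invented, and estimated_orders_between is a hypothetical helper; the idea is that someone places one order at the start of each month and compares the ids they were given.

```python
def estimated_orders_between(earlier_order_id, later_order_id):
    """Estimate how many orders were placed between two observed order ids.

    Assumes ids are handed out one by one without large gaps.
    """
    return later_order_id - earlier_order_id


# Made-up ids seen by a curious customer on March 1st and April 1st.
order_on_march_1 = 48213
order_on_april_1 = 51790

print(estimated_orders_between(order_on_march_1, order_on_april_1))  # 3577
```

A randomly generated GUID shown to the customer gives away nothing of the sort, even if an integer id is still used internally.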
Use both
Use int/bigint for the primary key, as it is easy to maintain and to use in foreign key relations.
But also add a GUID column, so that every row has a globally unique value as well.
1
Explaining your reasoning behind this suggestion wouldn't hurt anyone, I'm sure.
– Andriy M
Jan 14 '16 at 6:29
A GUID is 36 characters long and will be difficult to read if you are searching for a specific case.
– Abdul Hannan Ijaz
Jan 14 '16 at 7:57
1
All right, but that doesn't really explain why the OP should use both int and guid, as you are suggesting in your answer. And besides, I wasn't talking about explaining your suggestion just to me – my point was that you might want to update your answer. By the way, are you aware that another answerer has already suggested the same (more or less) as you?
– Andriy M
Jan 14 '16 at 8:18
Yup i meant the same thing .. cool BTW :)
– Abdul Hannan Ijaz
Jan 14 '16 at 9:22
edited Mar 29 '18 at 16:33
ypercubeᵀᴹ
answered Jan 5 '11 at 8:17
CoderHawk
18
Another con of the GUID approach is that you cannot use it as an identifier for your end-user. Do you really expect your users to tell you on the phone that they have an issue with Order "BAE7DF4-DDF-3RG-5TY3E3RF456AS10" ? :)
– Brann
Jul 8 '11 at 21:22
2
If you do not use sequential guids, and your primary key is clustered (the SQL Server defaul) then all your data inserts will be randomly scattered throughout the table, leading to massive fragmentation of your data. That is assuming that the data would normally be inserted in some sort of order, such as chronological.
– datagod
Aug 6 '11 at 4:32
6
Sequential guids are only sequential until the SQL instance is restarted. Then the first value will more than likely be lower than the prior one because of the way that the root value is generated, causing all sorts of problems all over again.
– mrdenny
Aug 6 '11 at 6:37
16
@Brann Ideally you wouldn't be given your PK values out to end-users in the first place. I know it is somewhat common to do so, and it is something I myself have done in the past before I learned not to. But since it shouldn't be done, that particular reason to prefer INT over GUID isn't a valid one.
– Solomon Rutzky
Oct 7 '15 at 18:54
2
@ChadKuehn ChoosingUNIQUEIDENTIFIER
overINT
becauseINT
has an upper limit is rather poor reasoning since being limitless, while true enough, is not a practical benefit. You can easily double the effective capacity of anINT
by starting it at the lower limit (-2.14 billion) instead of at 1. Or, if the full 4.3 billion isn't enough, then start out with aBIGINT
that is still only 8 bytes as compared to 16 for the GUID, and it is seqeuential.
– Solomon Rutzky
Oct 7 '15 at 18:59
If you're synchronizing your data with an external source, a persistent GUID can be much better. One example of where we use GUIDs is a tool that we send to customers: it crawls their network, performs certain classes of auto-discovery, and stores the records it finds. All the customer records are then integrated into a central database back on our end. If we used an integer, we would have 7,398 "1"s, and it'd be a lot harder to keep track of which "1" was which.
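A small Python sketch of that collision problem, under invented names (`discover`, `site-a`, `site-b` and the host list are hypothetical, not the actual tool): each site numbers its rows from 1, so merging on the local integer loses records, while merging on the GUID keeps them all.

```python
import uuid


def discover(site_name, hosts):
    """Simulate one customer-side crawl; every site numbers its rows from 1."""
    return [
        {"local_id": i, "guid": str(uuid.uuid4()), "site": site_name, "host": host}
        for i, host in enumerate(hosts, start=1)
    ]


site_a = discover("site-a", ["printer", "router"])
site_b = discover("site-b", ["switch", "nas"])

by_local_id = {}
by_guid = {}
for record in site_a + site_b:
    by_local_id[record["local_id"]] = record  # a later site overwrites an earlier one
    by_guid[record["guid"]] = record          # each record keeps its own slot

print("records kept when keyed by local integer id:", len(by_local_id))
print("records kept when keyed by GUID:", len(by_guid))
```

Keyed by the local integer, only one record per id value survives the merge; keyed by GUID, all four records from the two sites are kept.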
edited Jan 6 '14 at 17:41
answered Jan 5 '11 at 8:13
TML
3
GUIDs are definitely good as external identifiers, and I would keep a non-clustered index of that as the "external key". I would still keep an int as the "internal key", which is the basis for the clustered index and foreign key relationships. If something is going to cross an architectural boundary (e.g. communicating with another app), I do appreciate having something that cannot be mixed up.
– Greg
Oct 7 '15 at 18:07
I have used a hybrid approach with success. Tables contain BOTH an auto-increment primary key integer id column AND a guid column. The guid can be used as needed to globally uniquely identify the row, and id can be used for queries, sorting and human identification of the row.
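A minimal sketch of such a hybrid table, using Python's built-in `sqlite3` module so that it runs anywhere. The table and column names (`work_item`, `work_item_note`, `title`, `body`) are invented for the example, and SQLite stands in for whatever database you actually use.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE work_item (
        id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- internal key: joins, sorting
        guid  TEXT NOT NULL UNIQUE,               -- global identifier (alternate key)
        title TEXT NOT NULL
    );
    CREATE TABLE work_item_note (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        work_item_id INTEGER NOT NULL REFERENCES work_item(id),  -- FK uses the integer
        body         TEXT NOT NULL
    );
    """
)

# The application supplies the GUID; the database assigns the integer id.
item_guid = str(uuid.uuid4())
cur = conn.execute(
    "INSERT INTO work_item (guid, title) VALUES (?, ?)", (item_guid, "first item")
)
item_id = cur.lastrowid
conn.execute(
    "INSERT INTO work_item_note (work_item_id, body) VALUES (?, ?)",
    (item_id, "a note"),
)

# Look the row up by its GUID from outside, but join on the integer key.
row = conn.execute(
    """
    SELECT w.id, n.body
    FROM work_item AS w
    JOIN work_item_note AS n ON n.work_item_id = w.id
    WHERE w.guid = ?
    """,
    (item_guid,),
).fetchone()
print(row)
```

The GUID lives only in the main table; related tables reference the narrow integer id.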
answered Apr 3 '15 at 17:04
rmirabelle
3
What value does the GUID give if the id already is sufficient for humans to identify a row?
– Martin Smith
Apr 3 '15 at 18:06
6
The id identifies the row in this table. The GUID (at least in theory) identifies this row anywhere in the known universe. In my project, Android mobiles each have a structurally identical copy of the table on a local SQLite database. The row and its GUID are each generated on Android. Then, when Android is synchronized to the back-end database, its local row is written to the back-end table without fear of conflicting with rows created from any other Android mobile.
– rmirabelle
Apr 3 '15 at 21:46
1
@MartinSmith I have used this approach myself and it works quite nicely. The GUID is just an alternate key, with a NonClustered index, and is passed in from the application, but only resides in the primary table. All related tables are related via the INT PK. I find it strange that this approach is not much more common given it is the best of both worlds. It seems like most people just prefer to solve problems in very absolutist terms, not realizing that the PK doesn't need to be a GUID in order for the app to still use GUIDs for global uniqueness and/or portability.
– Solomon Rutzky
Oct 7 '15 at 19:05
1
@rmirabelle I had thought about this approach and was hesitating, but your answer convinced me. Basically I'm in a situation where I need to have a unique identifier for a work item (that can come in over the network from anywhere), but I don't want to round-trip to the database first. GUIDs are a good solution for this but I imagine JOINs will become much slower if I don't have a sequential clustered key.
– easuter
Oct 31 '15 at 0:52
1
@easuter I agree with not adding ID fields "just for the sake of it", such as in many-to-many "bridge" tables where the PK should be a composite of the two FKs that are being related. But here it is not a trade-off since the ID field is not merely for the sake of it. Allowing the system to work efficiently is fairly important ;-). AND, I would argue that in your case, since the GUIDs are generated externally, those are not guaranteed unique, even if pragmatically they are. But the responsibility for data integrity is reason enough to have GUID be an alternate key and ID be PK in your case :)
– Solomon Rutzky
Oct 31 '15 at 15:26
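The data-integrity point can be sketched as follows: with a unique constraint on the externally generated GUID, the database rejects a duplicate instead of trusting the client. This again uses SQLite through Python's `sqlite3` module, with a hypothetical `work_item` table; it is an illustration, not the commenter's actual schema.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE work_item ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " guid TEXT NOT NULL UNIQUE,"  # alternate key, enforced by the database
    " title TEXT NOT NULL)"
)

client_guid = str(uuid.uuid4())
conn.execute(
    "INSERT INTO work_item (guid, title) VALUES (?, ?)",
    (client_guid, "from device A"),
)

# A second client (or a retry) submits the same GUID.
duplicate_rejected = False
try:
    conn.execute(
        "INSERT INTO work_item (guid, title) VALUES (?, ?)",
        (client_guid, "from device B"),
    )
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM work_item").fetchone()[0]
print(duplicate_rejected, row_count)
```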
Some best practices still say that you should use the data type that accommodates the whole set of values you're going to use with the least memory possible. For instance, if you're using it to store the number of employees in a small business and you're unlikely to get to 100, then no one would suggest using a bigint when an int (or even a smallint) would do.
Of course, the drawback of this is like "Say no to scalability!"
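As a rough illustration of that sizing rule, the following Python sketch picks the narrowest signed integer type that can hold an expected maximum value. The helper `smallest_type_for` is made up for this example; the byte widths are the usual 2, 4 and 8 bytes for smallint, int and bigint.

```python
# (type name, storage size in bytes), narrowest first
SIGNED_INTEGER_TYPES = [
    ("smallint", 2),
    ("int", 4),
    ("bigint", 8),
]


def smallest_type_for(max_value):
    """Return the narrowest signed type whose upper limit covers max_value."""
    for name, size in SIGNED_INTEGER_TYPES:
        upper_limit = 2 ** (8 * size - 1) - 1
        if max_value <= upper_limit:
            return name, size
    raise ValueError("value does not fit in a 64-bit signed integer")


for expected_max in (100, 40_000, 3_000_000_000):
    print(expected_max, smallest_type_for(expected_max))
```

The trade-off mentioned above still applies: a type chosen this tightly leaves no headroom if the estimate turns out to be wrong.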
Also, I know this is not totally related, but there's another factor here. When it isn't excessive, I usually recommend using a non-autogenerated primary key, if that makes sense. For instance, if you're saving driver information, don't bother creating a new autogenerated column for "ID"; just use the license number.
I know this sounds really obvious, but I see that being forgotten quite often.
For context: this part of the answer was written from a data-theory point of view, where you want your PK to be the unique data identifier for a record. Most of the time we create autogenerated keys when such an identifier already exists, hence the advice above.
However, it is very rare that you can have tight control over these datapoints, and as such, you may need to make corrections or adjustments. You can't do that with primary keys (well, you can, but it can be a pain).
Thanks @VahiD for the clarifications.
edited Oct 7 '15 at 17:16
answered Aug 6 '11 at 0:57
Alpha
Using meaningful primary keys is not recommended at all. Consider this scenario: someone entered the wrong license number and you've used this id in 3-4 tables as a foreign key. How do you fix this mistake? Simply editing the license number might not be enough in this case.
– VahiD
Oct 7 '15 at 10:39
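A minimal sketch of that scenario, using SQLite via Python's `sqlite3` module. The `driver` and `vehicle` tables and their values are invented for illustration. With foreign key enforcement on and no cascade rule, correcting the license number in the parent table is rejected while child rows still reference the old value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript(
    """
    CREATE TABLE driver (
        license_number TEXT PRIMARY KEY,  -- natural key used as the PK
        name           TEXT NOT NULL
    );
    CREATE TABLE vehicle (
        plate          TEXT PRIMARY KEY,
        license_number TEXT NOT NULL REFERENCES driver(license_number)
    );
    INSERT INTO driver VALUES ('AB123', 'Sam');  -- typo: should have been AB124
    INSERT INTO vehicle VALUES ('XYZ-1', 'AB123');
    """
)

update_blocked = False
try:
    conn.execute(
        "UPDATE driver SET license_number = 'AB124' WHERE license_number = 'AB123'"
    )
except sqlite3.IntegrityError:
    # The child row in vehicle still points at 'AB123'.
    update_blocked = True

stored = conn.execute("SELECT license_number FROM driver").fetchone()[0]
print(update_blocked, stored)
```

Fixing the typo then means either declaring cascading updates on every referencing table or updating each table by hand, which is the pain the comment describes.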
1
Funny: I read your comment and I thought "yes, of course", then went back to read my answer and thought "did I say that"? Funny how things change in a couple of years. I was probably coming from a more theoretical background, but unless you have tight control over it (rarely) it does not provide much benefit. I'll update the answer.
– Alpha
Oct 7 '15 at 17:13
upvote for the development in the years :)
– VahiD
Oct 8 '15 at 18:28
Another point concerns how GUIDs are generated. mrdenny correctly pointed out that even if newsequentialid() is being used, restarting the instance can cause new values to start in the "holes" left behind by earlier processing. Something else that affects "sequential" GUIDs is the network card. If I remember correctly, the UID of the NIC is used as part of the GUID algorithm. If a NIC is replaced, there is no guarantee that the new UID will be a higher value, so the sequence may not be preserved. I am also not sure how multiple NICs might affect the values the algorithm assigns.
Just a thought and I hope I am remembering correctly. Have a great day!
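The idea that a NIC-derived identifier is embedded in a time-based GUID can be seen with Python's uuid module. This is only a sketch: uuid.uuid1() produces RFC 4122 version 1 UUIDs, which is not the same thing as SQL Server's newsequentialid(), and the two MAC addresses below are invented.

```python
import uuid

# Hypothetical MAC addresses for the original and the replacement network card.
OLD_NIC = 0x00A0C9112233
NEW_NIC = 0x000C29445566  # numerically lower than OLD_NIC

before = uuid.uuid1(node=OLD_NIC)  # generated "before the NIC swap"
after = uuid.uuid1(node=NEW_NIC)   # generated "after the NIC swap"

print(before)
print(after)

# The last 12 hex digits of a version 1 UUID are the node (NIC) identifier,
# so the later value carries a lower node part than the earlier one.
print(str(before)[-12:], str(after)[-12:])
print(after.time >= before.time, after.node < before.node)
```

Whether this actually breaks the ordering depends on how the database compares GUID bytes, which this sketch does not try to reproduce.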
2
Welcome to Database Administrators, bobo8734. Could you find some sources for these comments? If you're unsure of them maybe they'd be better served as a comment (when you have the rep for it) than a standalone answer.
– LowlyDBA
May 1 '18 at 21:17
answered May 1 '18 at 21:12
bobo8734
Using auto-increment IDs might leak information about your business activity. If you are running a shop and use order_id to publicly identify a purchase, then anybody can find out your monthly number of sales by simple arithmetic.
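The "simple arithmetic" could look like the following sketch. The function name, the order IDs and the dates are all made up; the only assumption is that IDs increase by one per order.

```python
from datetime import date


def estimate_monthly_orders(first_id, first_date, second_id, second_date):
    """Estimate orders per 30 days from two observed auto-increment IDs."""
    days = (second_date - first_date).days
    if days <= 0:
        raise ValueError("second observation must be later than the first")
    return (second_id - first_id) / days * 30


# An outsider places one order on Jan 1 and another on Jan 31 and compares the IDs.
estimate = estimate_monthly_orders(
    10450, date(2019, 1, 1),
    11950, date(2019, 1, 31),
)
print(estimate)
```

A random identifier in the public-facing URL or receipt gives an observer nothing comparable to subtract.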
answered 7 mins ago
golopot
Use both.
Use int/bigint for the primary key, as it is easy to maintain and to use in foreign key relations.
But also add a GUID column, so that every row has a unique value of its own as well.
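A minimal sketch of the "use both" layout, using Python's built-in sqlite3 and uuid modules. The customer table, the public_id column name and the add_customer helper are assumptions made for this example, not something the answer specifies.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        id        INTEGER PRIMARY KEY,   -- compact key for joins and foreign keys
        public_id TEXT NOT NULL UNIQUE,  -- GUID, one per row
        name      TEXT NOT NULL
    )
""")


def add_customer(conn, name):
    """Insert a row; the integer id is assigned by the database, the GUID by us."""
    public_id = str(uuid.uuid4())
    cur = conn.execute(
        "INSERT INTO customer (public_id, name) VALUES (?, ?)",
        (public_id, name),
    )
    return cur.lastrowid, public_id


row_id, public_id = add_customer(conn, "Ana")
found = conn.execute(
    "SELECT id FROM customer WHERE public_id = ?", (public_id,)
).fetchone()
print(row_id, public_id, found)
```

Other tables would reference customer.id, while the GUID can be handed out to external systems.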
1
Explaining your reasoning behind this suggestion wouldn't hurt anyone, I'm sure.
– Andriy M
Jan 14 '16 at 6:29
A GUID is 36 characters long, which makes it difficult to read when you are searching for a specific case.
– Abdul Hannan Ijaz
Jan 14 '16 at 7:57
1
All right, but that doesn't really explain why the OP should use both int and guid, as you are suggesting in your answer. And besides, I wasn't talking about explaining your suggestion just to me – my point was that you might want to update your answer. By the way, are you aware that another answerer has already suggested the same (more or less) as you?
– Andriy M
Jan 14 '16 at 8:18
Yup, I meant the same thing. Cool, BTW :)
– Abdul Hannan Ijaz
Jan 14 '16 at 9:22
answered Jan 5 '16 at 12:58
Abdul Hannan Ijaz
1
Rather than primary key, I think you mean surrogate key i.e. a key that is not the natural key (the latter being the key we use in the real world). Possibly you mean clustered index.
– onedaywhen
Jan 31 '12 at 8:48
Also remember the difference between (Primary) KEY and INDEX.
– Allan S. Hansen
Jan 7 '14 at 6:55
1
Also discussed on SO: stackoverflow.com/questions/11033435/…
– Jon of All Trades
Jun 22 '15 at 21:41
2
"int has no flaws except by the number limit, which in many cases are irrelevant.": actually, within this context of INT vs GUID, the upper limit of a signed, 32-bit INT is entirely irrelevant given that the upper limit of a signed, 64-bit BIGINT is well beyond nearly all uses (even more so if you start numbering at the lower limit; and same goes for INT) and it is still half the size of a GUID (8 bytes instead of 16) and sequential.
– Solomon Rutzky
Oct 7 '15 at 18:41
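For a sense of scale, the limits mentioned in that comment can be worked out in a few lines of Python. The insert rates below are arbitrary assumptions chosen for illustration, and years_until_exhausted is a made-up helper.

```python
INT_MAX = 2**31 - 1     # upper limit of a signed 32-bit INT
BIGINT_MAX = 2**63 - 1  # upper limit of a signed 64-bit BIGINT
BIGINT_BYTES, GUID_BYTES = 8, 16


def years_until_exhausted(max_value, inserts_per_second):
    """Years needed to use up every positive value at a constant insert rate."""
    seconds_per_year = 365 * 24 * 60 * 60
    return max_value / inserts_per_second / seconds_per_year


print(INT_MAX)
print(BIGINT_MAX)
print(years_until_exhausted(INT_MAX, 1_000))         # a busy table on INT
print(years_until_exhausted(BIGINT_MAX, 1_000_000))  # an extreme table on BIGINT
print(GUID_BYTES / BIGINT_BYTES)                     # size ratio per key
```

Starting the numbering at the negative lower limit, as the comment suggests, would roughly double those durations.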