What's the proper design for the table that will hold the processor load stats?
Let's say that I need to implement a widget on the site that allows the user to view processor load stats as a chart. The user can choose how detailed the statistics are: minute, day, month, or year precision. A processor load reading is supplied every second.
How should I design the table that holds the processor load data so that I can query the statistics efficiently (or should I, for example, use a separate table for each statistic)?
The target DBMS is MySQL.
mysql database-design
asked Sep 17 '14 at 10:34
Exander
3 Answers
One idea would be to create a single table and partition it by day of the week (or some other criterion).
Data older than a week gets moved to a history table, which keeps the "live" table from growing too large.
Having a separate table for each statistic is probably not a good idea, since you would need to join them on every query.
When partitioning, keep in mind that MySQL limits a table to 1024 partitions:
http://dev.mysql.com/doc/refman/5.1/en/partitioning-limitations.html#partitioning-limitations-max-partitions
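As a rough sketch of this approach (the table and column names are illustrative, not taken from the question):

```sql
-- Sketch only: raw per-second samples, partitioned by day of week.
CREATE TABLE cpu_load (
  host_id     INT UNSIGNED  NOT NULL,
  sample_time DATETIME      NOT NULL,
  load_pct    DECIMAL(5,2)  NOT NULL,
  PRIMARY KEY (host_id, sample_time)   -- partition column must be in every unique key
)
PARTITION BY HASH (DAYOFWEEK(sample_time))
PARTITIONS 7;

-- A nightly job could then move anything older than a week
-- into a history table with the same structure:
INSERT INTO cpu_load_history
  SELECT * FROM cpu_load
  WHERE sample_time < NOW() - INTERVAL 7 DAY;
DELETE FROM cpu_load
  WHERE sample_time < NOW() - INTERVAL 7 DAY;
```

Note that hashing on `DAYOFWEEK` reuses the same seven partitions every week, which is why the history move is needed to keep the live table small.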
answered Sep 17 '14 at 11:41
user4045978
The data should be partitioned into separate tables by the time frames you mentioned: minute, day, month, year, and so on. The goal is to reduce processing on both the server (MySQL or web) and the client (in JavaScript, for example). You want to fetch the data quickly and display it quickly, right?
This requires either inserting the metric into every table at the moment it is observed or, more likely, inserting into the minute table only and then aggregating the data into the other tables on a schedule. MapReduce tools can help with this aggregation.
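A scheduled rollup of that kind might look like the following sketch (table and column names are hypothetical):

```sql
-- Roll the last hour of minute-level samples up into an hourly table.
-- Assumes load_by_minute and load_by_hour tables with these illustrative columns.
INSERT INTO load_by_hour (host_id, hour_start, avg_load, max_load)
SELECT host_id,
       DATE_FORMAT(sample_time, '%Y-%m-%d %H:00:00') AS hour_start,
       AVG(load_pct),
       MAX(load_pct)
FROM load_by_minute
WHERE sample_time >= NOW() - INTERVAL 1 HOUR
GROUP BY host_id, hour_start;
```

Run from cron or a MySQL event every hour, this keeps the coarser tables small enough to chart directly.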
The problem with doing this in a relational persistence model (MySQL) is that the data is very key/value oriented (http://en.wikipedia.org/wiki/NoSQL#Key-value_stores). You can of course do what you describe in MySQL; I'm assuming you will have a lookup table or a host_id column to relate the metrics to the servers you're monitoring: one huge table of metrics (load) and a table of monitored hosts, with a foreign key (host_id) to constrain and join the two.
The alternative, if you feel this is better suited to a key/value persistence model, is to use something like Redis or DynamoDB (on AWS), where you create a set of tables (minute, hour, day, etc.) for each monitored host and use the timestamp as the key (the value being the load). You can also simply prepend the host_id to the key if you don't want a set of tables per host. Each approach has downsides; the first that comes to mind is that, unless you use a hashing technique, you will keep hitting the same server for a given host. On AWS, for example, key retrieval hashes your key, which spreads load across the table's instances (actually the disks the data is stored on): http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/DataModel.html
There are plenty of other key/value stores as well; MongoDB is another one worth researching: http://en.wikipedia.org/wiki/MongoDB
answered Sep 17 '14 at 16:48
ekeyser
MySQL is overkill for this purpose. There is another very useful piece of software intended precisely for storing this kind of statistic: RRDtool.
RRDtool is not an RDBMS; it has no indexes, no queries, and so on. There is only a circular table (one per database) of predefined capacity that, once full, is overwritten from the head, row by row.
RRDtool stores not the exact readings but averaged values at some granularity. For each table you can define several granularities, and incoming data is processed automatically. You can then request data by timestamp range [begin, end], and RRDtool automatically returns a dataset at the appropriate granularity.
RRDtool has various interfaces that return the data as CSV, XML, and even rather nice plots.
If you want to store load figures, RRDtool is the tool of choice.
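For example, a round-robin database for one-second load samples with minute, hour, and day averages might be created like this (the file name and retention windows are illustrative):

```shell
# One sample per second; keep 1-minute averages for a week,
# hourly averages for a year, daily averages for ten years.
rrdtool create cpu_load.rrd \
  --step 1 \
  DS:load:GAUGE:5:0:U \
  RRA:AVERAGE:0.5:60:10080 \
  RRA:AVERAGE:0.5:3600:8760 \
  RRA:AVERAGE:0.5:86400:3650

# Feed a sample (timestamp:value), then fetch averaged data back:
rrdtool update cpu_load.rrd N:0.42
rrdtool fetch cpu_load.rrd AVERAGE --start -1h
```

The `fetch` call (or `rrdtool graph`) picks the archive whose granularity best matches the requested time range, which is exactly the minute/day/month/year selection the question asks for.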
answered Sep 17 '14 at 22:37
Kondybas