Why is opening a file faster than reading variable content?


In a bash script I need various values from /proc/ files. So far I have dozens of lines that grep the files directly, like this:



grep -oP '^MemFree: *\K[0-9]+' /proc/meminfo


In an effort to make that more efficient I saved the file content in a variable and grepped that:



a=$(</proc/meminfo)
echo "$a" | grep -oP '^MemFree: *\K[0-9]+'


Instead of opening the file multiple times this should just open it once and grep the variable content, which I assumed would be faster – but in fact it is slower:





bash 4.4.19 $ time for i in {1..1000};do grep ^MemFree /proc/meminfo;done >/dev/null
real 0m0.803s
user 0m0.619s
sys 0m0.232s
bash 4.4.19 $ a=$(</proc/meminfo)
bash 4.4.19 $ time for i in {1..1000};do echo "$a"|grep ^MemFree; done >/dev/null
real 0m1.182s
user 0m1.425s
sys 0m0.506s


The same is true for dash and zsh. I suspected the special nature of /proc/ files to be the reason, but when I copy the content of /proc/meminfo to a regular file and use that, the results are the same:



bash 4.4.19 $ cat </proc/meminfo >meminfo
bash 4.4.19 $ time for i in $(seq 1 1000);do grep ^MemFree meminfo; done >/dev/null
real 0m0.790s
user 0m0.608s
sys 0m0.227s


Using a here string to save the pipe makes it slightly faster, but still not as fast as with the files:



bash 4.4.19 $ time for i in $(seq 1 1000);do <<<"$a" grep ^MemFree; done >/dev/null
real 0m0.977s
user 0m0.758s
sys 0m0.268s


Why is opening a file faster than reading the same content from a variable?










  • @l0b0 This assumption is not faulty, the question shows how I came up with it and the answers explain why this is the case. Your edit now makes the answers not answering the title question any more: They don’t say whether that’s the case.

    – dessert
    7 hours ago











  • OK, clarified. Because the heading was wrong in the vast majority of cases, just not for certain memory mapped special files.

    – l0b0
    1 hour ago











  • @l0b0 No, that’s what I’m asking here: “I suspected the special state of /proc/ files as a reason, but when I copy the content of /proc/meminfo to a regular file and use that the results are the same:” It is not special to /proc/ files, reading regular files is faster as well!

    – dessert
    22 mins ago
















bash shell-script shell zsh variable






asked yesterday by dessert; edited 21 mins ago













2 Answers


Running grep -oP '^MemFree: *\K[0-9]+' /proc/meminfo forks a process that executes grep, which opens /proc/meminfo (a virtual file held in memory, so no disk I/O is involved), reads it, and matches the regexp.



The most expensive part of that is forking the process and loading the grep utility and its library dependencies: doing the dynamic linking, opening the locale database, and reading dozens of files that are on disk (but likely cached in memory).



Reading /proc/meminfo is insignificant in comparison: the kernel needs little time to generate the information in there, and grep needs little time to read it.



In:



a=$(</proc/meminfo)


In most shells that support the $(<...) ksh operator, the shell just opens the file and reads its content (stripping the trailing newline characters). bash is different and much less efficient: it forks a process to do the reading and passes the data to the parent via a pipe. But here it's done only once, so it doesn't matter.



In:



printf '%s\n' "$a" | grep '^MemFree'


The shell needs to spawn two processes, which run concurrently and interact with each other via a pipe. Creating and tearing down that pipe, and writing to and reading from it, has a small cost. The much greater cost is spawning an extra process. The scheduling of the processes has some impact as well.
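A small sketch that makes the extra process visible, assuming a POSIX-style shell. The variable names (marker, after_pipe, after_block) are only illustrative. An assignment made on the left-hand side of a pipeline is lost afterwards, because that side ran in a forked child even though it consists only of builtins:

```shell
#!/bin/sh
# Hypothetical demo: all variable names here are illustrative only.
marker=parent

# Left side of the pipeline: only builtins, yet it runs in a forked child,
# so the assignment never reaches the parent shell.
{ marker=child; printf '%s\n' "from left side"; } | cat >/dev/null
after_pipe=$marker
printf 'after pipeline: %s\n' "$after_pipe"      # prints: after pipeline: parent

# Without a pipeline the same compound command runs in the current shell,
# so the assignment sticks.
{ marker=changed; printf '%s\n' "no pipe"; } >/dev/null
after_block=$marker
printf 'after plain block: %s\n' "$after_block"  # prints: after plain block: changed
```

This only shows that the left side is a separate process; it does not measure what that process costs.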



You may find that using the zsh <<< operator makes it slightly quicker:



grep '^MemFree' <<< "$a"


In zsh and bash, that's done by writing the content of $a to a temporary file, which is less expensive than spawning an extra process but will probably not give you any gain compared to getting the data straight off /proc/meminfo. It's still less efficient than your approach of copying /proc/meminfo to disk, as the temp file is written on every iteration.



dash doesn't support here-strings, but its heredocs are implemented with a pipe that doesn't involve spawning an extra process. In:



grep '^MemFree' << EOF
$a
EOF


The shell creates a pipe and forks a process. The child executes grep with its stdin connected to the reading end of the pipe, and the parent writes the content to the other end of the pipe.



But that pipe handling and process synchronisation is still likely to be more expensive than just getting the data straight off /proc/meminfo.



The content of /proc/meminfo is short and takes little time to produce. If you want to save some CPU cycles, remove the expensive parts: forking processes and running external commands.



For example:



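# Read the whole file into $meminfo using only shell builtins (no fork).
IFS= read -rd '' meminfo < /proc/meminfo
memfree=${meminfo#*MemFree:}            # drop everything up to "MemFree:"
memfree=${memfree%%$'\n'*}              # keep only the rest of that line
memfree=${memfree#"${memfree%%[! ]*}"}  # strip the leading spaces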


Avoid bash for this, though: its pattern matching is very inefficient. With zsh -o extendedglob, you can shorten it to:



memfree=${${"$(</proc/meminfo)"##*MemFree: #}%%$'\n'*}
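Since the question mentions needing several values, the same fork-free idea can be stretched to a single pass over the file. The following is only a sketch in POSIX sh, not something from the answer itself: the function name parse_meminfo and the mem_* and sample_* variable names are made up for illustration, and the sample numbers are invented so that it also runs where /proc is absent.

```shell
#!/bin/sh
# Hypothetical helper: the function and variable names are illustrative.
# Reads meminfo-style "Key:   value unit" lines from stdin and stores the
# fields of interest in shell variables, using only builtins (no fork).
parse_meminfo() {
    mem_total= mem_free= mem_available=
    while IFS=': ' read -r key value unit; do
        case $key in
            MemTotal)     mem_total=$value ;;
            MemFree)      mem_free=$value ;;
            MemAvailable) mem_available=$value ;;
        esac
    done
}

# Invented sample input, used so the sketch does not depend on /proc.
parse_meminfo <<'EOF'
MemTotal:       16314380 kB
MemFree:         1234567 kB
MemAvailable:    9876543 kB
EOF
sample_total=$mem_total
sample_free=$mem_free
printf 'sample MemFree: %s kB\n' "$sample_free"

# On Linux, the real file can be read the same way.
if [ -r /proc/meminfo ]; then
    parse_meminfo < /proc/meminfo
    printf 'live MemFree: %s kB\n' "$mem_free"
fi
```

The redirection is applied to the function call itself, so the loop runs in the current shell and the variables remain set afterwards. Piping the data into the function instead would bring the extra process back.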


Note that ^ is special in many shells (Bourne, fish, rc, es, and zsh with the extendedglob option at least), so I'd recommend quoting it. Also note that echo can't be used to output arbitrary data (hence my use of printf above).
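To illustrate the point about echo, here is a minimal sketch with two made-up values (the variable names are illustrative). printf with a fixed '%s\n' format reproduces them verbatim, whereas what echo prints for the same strings depends on the shell and its settings, which is why no expected output is given for the echo lines:

```shell
#!/bin/sh
# Illustrative values only: strings that echo may not reproduce verbatim.
opt='-n'
esc='a\nb'

# printf with a fixed '%s\n' format writes its argument unchanged.
out_opt=$(printf '%s\n' "$opt")
out_esc=$(printf '%s\n' "$esc")

printf 'opt: [%s]\n' "$out_opt"   # prints: opt: [-n]
printf 'esc: [%s]\n' "$out_esc"   # prints: esc: [a\nb]

# echo, by contrast, is shell-dependent: some implementations treat -n as
# an option, and some expand \n inside the argument.
echo "$opt"
echo "$esc"
```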






  • In the case with printf you say the shell needs to spawn two processes, but isn't printf a shell builtin?

    – David Conrad
    yesterday






  • @DavidConrad It is, but most shells don't try to analyze the pipeline for which parts it could run in the current process. It just forks itself and lets the children figure it out. In this case, the parent process forks twice; the child for the left side then sees a built-in and executes it; the child for the right side sees grep and execs.

    – chepner
    yesterday






  • @DavidConrad, the pipe is an IPC mechanism, so in any case the two sides will have to run in different processes. While in A | B, there are some shells like AT&T ksh or zsh that run B in the current shell process if it's a builtin or compound or function command, I don't know of any that runs A in the current process. If anything, to do that, they would have to handle SIGPIPE in a complex way as if A was running in child process and without terminating the shell for the behaviour not to be too surprising when B exits early. It's much easier to run B in the parent process.

    – Stéphane Chazelas
    13 hours ago











  • Bash supports <<<

    – D. Ben Knoble
    2 hours ago



















In your first case you are just running the grep utility against the file /proc/meminfo. /proc is a virtual file system, so the /proc/meminfo file is in memory, and it takes very little time to fetch its content.



But in the second case, you are creating a pipe, then passing the first command's output to the second command using this pipe, which is costly.



The difference comes from /proc (because it is in memory) and from the pipe; see the example below:



time for i in {1..1000};do grep ^MemFree /proc/meminfo;done >/dev/null

real 0m0.914s
user 0m0.032s
sys 0m0.148s


cat /proc/meminfo > file
time for i in {1..1000};do grep ^MemFree file;done >/dev/null

real 0m0.938s
user 0m0.032s
sys 0m0.152s


time for i in {1..1000};do echo "$a"|grep ^MemFree; done >/dev/null

real 0m1.016s
user 0m0.040s
sys 0m0.232s





share|improve this answer

























    Your Answer








    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "106"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f501828%2fwhy-is-opening-a-file-faster-than-reading-variable-content%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    42














    grep -oP '^MemFree: *K[0-9]+' /proc/meminfo forks a process that executes grep that opens /proc/meminfo (a virtual file, in memory, no disk I/O involved) reads it and matches the regexp.



    The most expensive part in that is forking the process and loading the grep utility and its library dependencies, doing the dynamic linking, open the locale database, dozens of files that are on disk (but likely cached in memory).



    The part about reading /proc/meminfo is insignificant in comparison, the kernel needs little time to generate the information in there and grep needs little time to read it.



    In:



    a=$(</proc/meminfo)


    In most shells that support that $(<...) ksh operator, the shell just opens the file and read its content (and strips the trailing newline characters). bash is different and much less efficient in that it forks a process to do that reading and passes the data to the parent via a pipe. But here, it's done once so it doesn't matter.



    In:



    printf '%sn' "$a" | grep '^MemFree'


    The shell needs to spawn two processes, which are running concurrently but interact between each other via a pipe. That pipe creation, tearing down, and writing and reading from it has some little cost. The much greater cost is the spawning of an extra process. The scheduling of the processes has some impact as well.



    You may find that using the zsh <<< operator makes it slightly quicker:



    grep '^MemFree' <<< "$a"


    In zsh and bash, that's done by writing the content of $a in a temporary file, that is less expensive than spawning an extra process, but will probably not give you any gain compared to getting the data straight off /proc/meminfo. That's still less efficient than your approach that copies /proc/meminfo on disk, as the writing of the temp file is done at each iteration.



    dash doesn't support here-strings, but its heredocs are implemented with a pipe that doesn't involve spawning an extra process. In:



     grep '^MemFree' << EOF
    $a
    EOF


    The shell creates a pipe, forks a process. The child executes grep with its stdin as the reading end of the pipe, and the parent writes the content at the other end of the pipe.



    But that pipe handling and process synchronisation is still likely to be more expensive than just getting the data straight off /proc/meminfo.



    The content of /proc/meminfo is short and takes not much time to produce. If you want to save some CPU cycles, you want to remove the expensive parts: forking processes and running external commands.



    Like:



    IFS= read -rd '' meminfo < /proc/meminfo
    memfree=${meminfo#*MemFree:}
    memfree=${memfree%%$'n'*}
    memfree=${memfree#"${memfree%%[! ]*}"}


    Avoid bash though whose pattern matching is very ineficient. With zsh -o extendedglob, you can shorten it to:



    memfree=${${"$(</proc/meminfo)"##*MemFree: #}%%$'n'*}


    Note that ^ is special in many shells (Bourne, fish, rc, es and zsh with the extendedglob option at least), I'd recommend quoting it. Also note that echo can't be used to output arbitrary data (hence my use of printf above).






    share|improve this answer





















    • 4





      In the case with printf you say the shell needs to spawn two processes, but isn't printf a shell builtin?

      – David Conrad
      yesterday






    • 6





      @DavidConrad It is, but most shells don't try to analyze the pipeline for which parts it could run in the current process. It just forks itself and lets the children figure it out. In this case, the parent process forks twice; the child for the left side then sees a built-in and executes it; the child for the right side sees grep and execs.

      – chepner
      yesterday






    • 1





      @DavidConrad, the pipe is an IPC mechanism, so in any case the two sides will have to run in different processes. While in A | B, there are some shells like AT&T ksh or zsh that run B in the current shell process if it's a builtin or compound or function command, I don't know of any that runs A in the current process. If anything, to do that, they would have to handle SIGPIPE in a complex way as if A was running in child process and without terminating the shell for the behaviour not to be too surprising when B exits early. It's much easier to run B in the parent process.

      – Stéphane Chazelas
      13 hours ago











    • Bash supports <<<

      – D. Ben Knoble
      2 hours ago
















    42














    grep -oP '^MemFree: *K[0-9]+' /proc/meminfo forks a process that executes grep that opens /proc/meminfo (a virtual file, in memory, no disk I/O involved) reads it and matches the regexp.



    The most expensive part in that is forking the process and loading the grep utility and its library dependencies, doing the dynamic linking, open the locale database, dozens of files that are on disk (but likely cached in memory).



    The part about reading /proc/meminfo is insignificant in comparison, the kernel needs little time to generate the information in there and grep needs little time to read it.



    In:



    a=$(</proc/meminfo)


    In most shells that support that $(<...) ksh operator, the shell just opens the file and read its content (and strips the trailing newline characters). bash is different and much less efficient in that it forks a process to do that reading and passes the data to the parent via a pipe. But here, it's done once so it doesn't matter.



    In:



    printf '%sn' "$a" | grep '^MemFree'


    The shell needs to spawn two processes, which are running concurrently but interact between each other via a pipe. That pipe creation, tearing down, and writing and reading from it has some little cost. The much greater cost is the spawning of an extra process. The scheduling of the processes has some impact as well.



    You may find that using the zsh <<< operator makes it slightly quicker:



    grep '^MemFree' <<< "$a"


    In zsh and bash, that's done by writing the content of $a in a temporary file, that is less expensive than spawning an extra process, but will probably not give you any gain compared to getting the data straight off /proc/meminfo. That's still less efficient than your approach that copies /proc/meminfo on disk, as the writing of the temp file is done at each iteration.



    dash doesn't support here-strings, but its heredocs are implemented with a pipe that doesn't involve spawning an extra process. In:



     grep '^MemFree' << EOF
    $a
    EOF


    The shell creates a pipe, forks a process. The child executes grep with its stdin as the reading end of the pipe, and the parent writes the content at the other end of the pipe.



    But that pipe handling and process synchronisation is still likely to be more expensive than just getting the data straight off /proc/meminfo.



    The content of /proc/meminfo is short and takes not much time to produce. If you want to save some CPU cycles, you want to remove the expensive parts: forking processes and running external commands.



    Like:



    IFS= read -rd '' meminfo < /proc/meminfo
    memfree=${meminfo#*MemFree:}
    memfree=${memfree%%$'n'*}
    memfree=${memfree#"${memfree%%[! ]*}"}


    Avoid bash though whose pattern matching is very ineficient. With zsh -o extendedglob, you can shorten it to:



    memfree=${${"$(</proc/meminfo)"##*MemFree: #}%%$'n'*}


    Note that ^ is special in many shells (Bourne, fish, rc, es and zsh with the extendedglob option at least), I'd recommend quoting it. Also note that echo can't be used to output arbitrary data (hence my use of printf above).






    share|improve this answer





















    • 4





      In the case with printf you say the shell needs to spawn two processes, but isn't printf a shell builtin?

      – David Conrad
      yesterday






    • 6





      @DavidConrad It is, but most shells don't try to analyze the pipeline for which parts it could run in the current process. It just forks itself and lets the children figure it out. In this case, the parent process forks twice; the child for the left side then sees a built-in and executes it; the child for the right side sees grep and execs.

      – chepner
      yesterday






    • 1





      @DavidConrad, the pipe is an IPC mechanism, so in any case the two sides will have to run in different processes. While in A | B, there are some shells like AT&T ksh or zsh that run B in the current shell process if it's a builtin or compound or function command, I don't know of any that runs A in the current process. If anything, to do that, they would have to handle SIGPIPE in a complex way as if A was running in child process and without terminating the shell for the behaviour not to be too surprising when B exits early. It's much easier to run B in the parent process.

      – Stéphane Chazelas
      13 hours ago











    • Bash supports <<<

      – D. Ben Knoble
      2 hours ago














    42












    42








    42







    grep -oP '^MemFree: *K[0-9]+' /proc/meminfo forks a process that executes grep that opens /proc/meminfo (a virtual file, in memory, no disk I/O involved) reads it and matches the regexp.



    The most expensive part in that is forking the process and loading the grep utility and its library dependencies, doing the dynamic linking, open the locale database, dozens of files that are on disk (but likely cached in memory).



    The part about reading /proc/meminfo is insignificant in comparison, the kernel needs little time to generate the information in there and grep needs little time to read it.



    In:



    a=$(</proc/meminfo)


    In most shells that support that $(<...) ksh operator, the shell just opens the file and read its content (and strips the trailing newline characters). bash is different and much less efficient in that it forks a process to do that reading and passes the data to the parent via a pipe. But here, it's done once so it doesn't matter.



    In:



    printf '%sn' "$a" | grep '^MemFree'


    The shell needs to spawn two processes, which are running concurrently but interact between each other via a pipe. That pipe creation, tearing down, and writing and reading from it has some little cost. The much greater cost is the spawning of an extra process. The scheduling of the processes has some impact as well.



    You may find that using the zsh <<< operator makes it slightly quicker:



    grep '^MemFree' <<< "$a"


    In zsh and bash, that's done by writing the content of $a in a temporary file, that is less expensive than spawning an extra process, but will probably not give you any gain compared to getting the data straight off /proc/meminfo. That's still less efficient than your approach that copies /proc/meminfo on disk, as the writing of the temp file is done at each iteration.



    dash doesn't support here-strings, but its heredocs are implemented with a pipe that doesn't involve spawning an extra process. In:



     grep '^MemFree' << EOF
    $a
    EOF


    The shell creates a pipe, forks a process. The child executes grep with its stdin as the reading end of the pipe, and the parent writes the content at the other end of the pipe.



    But that pipe handling and process synchronisation is still likely to be more expensive than just getting the data straight off /proc/meminfo.



    The content of /proc/meminfo is short and takes not much time to produce. If you want to save some CPU cycles, you want to remove the expensive parts: forking processes and running external commands.



    Like:



    IFS= read -rd '' meminfo < /proc/meminfo
    memfree=${meminfo#*MemFree:}
    memfree=${memfree%%$'\n'*}
    memfree=${memfree#"${memfree%%[! ]*}"}
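
    To make the individual expansions easier to follow, here is the same sequence applied to a hard-coded sample instead of the real /proc/meminfo. It is a sketch under assumptions: the sample values are invented, and the helper variable nl holds a literal newline in place of the $'\n' quoting so that it also runs in shells without that form, such as dash.

```shell
# A literal newline, used instead of $'\n' for portability.
nl='
'
# Hypothetical sample standing in for /proc/meminfo (values are made up).
meminfo="MemTotal:       16314188 kB${nl}MemFree:         1234567 kB${nl}MemAvailable:    9876543 kB${nl}"

memfree=${meminfo#*MemFree:}             # drop everything up to and including "MemFree:"
memfree=${memfree%%"$nl"*}               # keep only the remainder of that line
memfree=${memfree#"${memfree%%[! ]*}"}   # strip the leading spaces
printf '%s\n' "$memfree"
```

    With this sample, memfree ends up as "1234567 kB", and no process is forked along the way.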


    Avoid bash though, whose pattern matching is very inefficient. With zsh -o extendedglob, you can shorten it to:



    memfree=${${"$(</proc/meminfo)"##*MemFree: #}%%$'\n'*}


    Note that ^ is special in many shells (Bourne, fish, rc, es and zsh with the extendedglob option at least), so I'd recommend quoting it. Also note that echo can't be used to output arbitrary data (hence my use of printf above).
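
    As a small illustration of the echo caveat, consider two made-up values (the variable names and strings below are assumptions chosen for the demonstration). What echo prints for them depends on the shell and its settings, whereas printf with a '%s\n' format passes the data through unchanged.

```shell
# Hypothetical values that commonly trip up echo.
a='-n'
b='C:\temp\new'

# Many echo implementations treat -n as an option; some also expand \t and \n.
echo_a=$(echo "$a")

# printf with an explicit format string outputs the data as-is.
printf_a=$(printf '%s\n' "$a")
printf_b=$(printf '%s\n' "$b")

printf 'echo gave:   [%s]\n' "$echo_a"
printf 'printf gave: [%s] and [%s]\n' "$printf_a" "$printf_b"
```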














    edited yesterday by terdon

    answered yesterday by Stéphane Chazelas








    • 4
      In the case with printf you say the shell needs to spawn two processes, but isn't printf a shell builtin?
      – David Conrad, yesterday

    • 6
      @DavidConrad It is, but most shells don't try to analyze the pipeline to see which parts they could run in the current process. They just fork and let the children figure it out. In this case, the parent process forks twice; the child for the left side then sees a builtin and executes it; the child for the right side sees grep and execs it.
      – chepner, yesterday

    • 1
      @DavidConrad, the pipe is an IPC mechanism, so in any case the two sides will have to run in different processes. While in A | B, there are some shells like AT&T ksh or zsh that run B in the current shell process if it's a builtin, compound or function command, I don't know of any that runs A in the current process. If anything, to do that, they would have to handle SIGPIPE in a complex way, as if A were running in a child process and without terminating the shell, for the behaviour not to be too surprising when B exits early. It's much easier to run B in the parent process.
      – Stéphane Chazelas, 13 hours ago

    • Bash supports <<<
      – D. Ben Knoble, 2 hours ago









































    In your first case you are just using the grep utility to find something in the file /proc/meminfo. /proc is a virtual file system, so the /proc/meminfo file is in memory, and it takes very little time to fetch its content.



    But in the second case, you are creating a pipe, then passing the first command's output to the second command using this pipe, which is costly.



    The difference comes from /proc (which is in memory) and the pipe; see the example below:



    time for i in {1..1000};do grep ^MemFree /proc/meminfo;done >/dev/null

    real 0m0.914s
    user 0m0.032s
    sys 0m0.148s


    cat /proc/meminfo > file
    time for i in {1..1000};do grep ^MemFree file;done >/dev/null

    real 0m0.938s
    user 0m0.032s
    sys 0m0.152s


    a=$(</proc/meminfo)
    time for i in {1..1000};do echo "$a"|grep ^MemFree; done >/dev/null

    real 0m1.016s
    user 0m0.040s
    sys 0m0.232s













        edited yesterday by terdon

        answered yesterday by PRY





























