Excluding or including by awk

I have a GTF file (linked in the original post).

With this command one can extract the protein-coding transcripts:

    awk '{if($3=="transcript" && $20=="\"protein_coding\";"){print $0}}' gencode.gtf

How can I do the opposite, that is, exclude the coding parts from this file and keep only the non-coding regions?
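One way to invert the filter, as a minimal sketch rather than a definitive recipe: instead of relying on the position of field 20, split on tabs and test the attribute column (column 9) for the transcript type. The records, the file name `sample.gtf` and the variable `noncoding` below are made up for illustration, and the sketch assumes GENCODE-style attributes of the form `transcript_type "protein_coding"`.

```shell
# Build a tiny made-up GTF (hypothetical records, tab-separated) to try the filter on.
printf 'chr1\tHAVANA\ttranscript\t100\t900\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_type "protein_coding"; transcript_type "protein_coding";\n' > sample.gtf
printf 'chr1\tHAVANA\ttranscript\t2000\t2500\t.\t-\t.\tgene_id "G2"; transcript_id "T2"; gene_type "lincRNA"; transcript_type "lincRNA";\n' >> sample.gtf
printf 'chr1\tHAVANA\texon\t2000\t2500\t.\t-\t.\tgene_id "G2"; transcript_id "T2"; gene_type "lincRNA"; transcript_type "lincRNA";\n' >> sample.gtf

# Keep transcript records whose attribute column does NOT say protein_coding.
noncoding=$(awk -F'\t' '$3=="transcript" && $9 !~ /transcript_type "protein_coding"/' sample.gtf)
printf '%s\n' "$noncoding"
```

The most literal inversion of the original command would be to change `==` to `!=` in the `$20` test, but that only works while the transcript type happens to sit in field 20. Note also that this keeps non-coding transcripts, which is not the same thing as every non-coding region of the genome.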

















      linux wgs bash
















      asked 1 hour ago









      Feresh Teh






















          1 Answer












If you want the non-coding regions of a protein-coding transcript, it sounds like you are looking for the UTRs.

UTR is its own feature type in the GTF file (column 3), so you can select those lines with:

    $ awk -v FS="\t" '$3=="UTR"' gencode.gtf

If the GTF file is compressed, use this instead:

    $ zcat gencode.gtf.gz | awk -v FS="\t" '$3=="UTR"'

BTW: Why are you using such an old release of GENCODE? The current version is v29.
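If a filter like the one above prints nothing, a quick check is to list which feature names column 3 actually contains, since some annotation sources label UTRs differently (Ensembl-style files, for example, use names such as `five_prime_utr`). The following is a small sketch under that assumption; the records, the file name `features.gtf` and the variable `counts` are hypothetical.

```shell
# Made-up, tab-separated sample records (hypothetical), including a header comment line.
printf '##description: toy example\n' > features.gtf
printf 'chr1\tHAVANA\tCDS\t150\t300\t.\t+\t0\tgene_id "G1";\n' >> features.gtf
printf 'chr1\tHAVANA\tUTR\t100\t149\t.\t+\t.\tgene_id "G1";\n' >> features.gtf
printf 'chr1\tHAVANA\tUTR\t301\t400\t.\t+\t.\tgene_id "G1";\n' >> features.gtf

# Tally the values in column 3, skipping "#" header lines; sort for stable output.
counts=$(awk -F'\t' '!/^#/ {n[$3]++} END {for (f in n) print f "\t" n[f]}' features.gtf | sort)
printf '%s\n' "$counts"
```

It is also worth confirming that the field separator is a real tab (`"\t"`): if the backslash is lost, awk splits on the letter `t` and the test on `$3` no longer matches.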













• Sorry, what I actually need is the non-coding regions of the human genome; I only mentioned the coding parts to explain my question.
  – Feresh Teh
  1 hour ago

• Sorry, I tried that but my output is empty.
  – Feresh Teh
  1 hour ago

• As @Wouter tells you, the non-coding region of a genome is the complement of the coding regions. Coding regions have their own feature in the GTF file. You can get them with $ awk -v FS="\t" '$3=="CDS"' gencode.gtf. Reading the manual for bedtools complement is your task.
  – finswimmer
  53 mins ago

• Thank you, but both of your commands return nothing :(
  – Feresh Teh
  47 mins ago
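As a rough sketch of the first step of the complement approach mentioned in the comments, the CDS records can be converted to sorted BED intervals with awk alone. The records and the file names `cds_demo.gtf` and `cds.bed` are made up for illustration.

```shell
# Made-up, tab-separated sample records (hypothetical).
printf 'chr1\tHAVANA\tCDS\t150\t300\t.\t+\t0\tgene_id "G1";\n' > cds_demo.gtf
printf 'chr1\tHAVANA\tUTR\t100\t149\t.\t+\t.\tgene_id "G1";\n' >> cds_demo.gtf
printf 'chr1\tHAVANA\tCDS\t500\t620\t.\t+\t0\tgene_id "G1";\n' >> cds_demo.gtf

# GTF coordinates are 1-based and inclusive; BED is 0-based and half-open,
# so only the start is shifted by one.
awk -F'\t' -v OFS='\t' '$3=="CDS" {print $1, $4-1, $5}' cds_demo.gtf | sort -k1,1 -k2,2n > cds.bed
cat cds.bed
```

The sorted `cds.bed` could then be given to `bedtools complement -i cds.bed -g genome.txt`, where `genome.txt` is a hypothetical two-column file of chromosome names and lengths. The bedtools manual describes the exact sorting requirements.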











          edited 52 mins ago

























          answered 1 hour ago









          finswimmer





























