Resorting data from a multidimensional list


I have a dataset that is generated in a 4D table. A snippet would look something like:



{{{{x,y},{x2,y2},...}...}...} and so on, in code form:



data = Table[{RandomReal[{0, 4}], RandomReal[{0, 1}]}, {InTh, 5}, {InE, 6}, {backEn, 780}, {backAngle, 0, 89}]


Now I want to rebin the data: go over the whole list and, for each bin, sum the y values whose x value lies in that bin's range. Let's say the bin edges are generated by:



driftBins=Table[i,{i,0,4,0.001}]


My current approach is really poor and uses For loops, as I am not yet very familiar with Mathematica.



driftRebinned2 = {};
start = AbsoluteTime[];
For[i = 1, i < Length[driftBins], i++,
  summy = 0;
  For[InTh = 1, InTh <= 5, InTh++,
    For[InE = 1, InE <= 6, InE++, (* InE runs over all 6 entries of data *)
      For[backEn = 1, backEn <= 780, backEn++,
        For[backAngle = 1, backAngle <= 90, backAngle++,
          (* add y if x lies in the half-open bin [driftBins[[i]], driftBins[[i + 1]]) *)
          If[driftBins[[i + 1]] > data[[InTh, InE, backEn, backAngle]][[1]] &&
             driftBins[[i]] <= data[[InTh, InE, backEn, backAngle]][[1]],
            summy = summy + data[[InTh, InE, backEn, backAngle]][[2]]]
        ]
      ]
    ]
  ];
  AppendTo[driftRebinned2, {(driftBins[[i]] + driftBins[[i + 1]])/2, summy}];
];
AbsoluteTime[] - start


Is there a way to optimize this code, or to do it with Table? I tried using Table, but I am not sure how to sum up the y values that lie within a certain x range.



The data generation is just an example. My real data is the result of reading a data file, where the values in the file are weighted by a combination of weighting functions that depend on the four iterators.







list-manipulation table databin














edited 31 mins ago
WaleeK

asked 2 hours ago
WaleeK












  • You could try using Compile along with Internal`Bag. But there is certainly also a high-level approach using tensor routines.
    – Henrik Schumacher
    1 hour ago

  • What is driftBins? It is currently undefined.
    – Henrik Schumacher
    1 hour ago

  • Btw.: Your data can be generated much faster with data = Join[ RandomReal[{0, 4}, {5, 6, 780, 90, 1}], RandomReal[{0, 1}, {5, 6, 780, 90, 1}], 5 ]...
    – Henrik Schumacher
    1 hour ago

  • Sorry, edited it. The driftTable was supposed to be driftBins. Also, the data generation is just an example. My real data comes from data files, weighted by a weighting factor that depends on the iterators, i.e. InTh, InE, backEn, backAngle.
    – WaleeK
    34 mins ago




























2 Answers

























I haven't checked this for correctness, but it runs in about a second:

data = Join[
  RandomReal[{0, 4}, {5, 6, 780, 90, 1}],
  RandomReal[{0, 1}, {5, 6, 780, 90, 1}],
  5
  ];

Because the nested structure of data is not needed anywhere, I flatten it to a matrix. I then use Nearest to find the correct bin for each entry of data1 (exploiting that the bins are equally spaced). Finally, a compiled routine, assemble, adds the contributions for each bin into a large vector.



data1 = Flatten[data, 3];  (* flat list of {x, y} pairs *)

driftBins = Table[i, {i, 0., 4., 0.001}];
bincenters = MovingAverage[driftBins, 2];
binradii = 0.5 (driftBins[[2]] - driftBins[[1]]);
(* for each x value, the index of the bin center within distance binradii *)
idx = Developer`ToPackedArray[
  Nearest[bincenters -> Automatic, data1[[All, 1]], {1, binradii}]
  ];

(* add each y[[i]] to slot idx[[i]] of a zero vector of length n *)
assemble =
  Compile[{{idx, _Integer, 1}, {y, _Real, 1}, {n, _Integer}},
   Block[{a},
    a = Table[0., {n}];
    Do[a[[idx[[i]]]] += y[[i]], {i, 1, Length[idx]}];
    a
    ]
   ];

symmylist = assemble[Flatten[idx], data1[[All, 2]], Length[bincenters]];
driftRebinned2 = Transpose[{bincenters, symmylist}];
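To see what the accumulation step does without compiling anything, here is a minimal uncompiled sketch on made-up toy values. It is not the code above: it swaps Nearest for a plain Floor-based index, which only makes sense for equally spaced bins, and the names edges, xToy, yToy, idxToy and acc are illustrative assumptions.

```mathematica
(* Toy illustration only: hand-picked values, uncompiled *)
edges = Range[0, 3];                 (* bin edges 0, 1, 2, 3, i.e. three bins of width 1 *)
xToy = {0.2, 2.5, 0.9, 1.1, 2.0};
yToy = {1., 2., 4., 8., 16.};

(* bin index from Floor instead of Nearest; assumes equal spacing of width 1 *)
idxToy = Floor[(xToy - First[edges])/1] + 1;   (* {1, 3, 1, 2, 3} *)

(* the same kind of accumulation loop that assemble compiles *)
acc = ConstantArray[0., Length[edges] - 1];
Do[acc[[idxToy[[i]]]] += yToy[[i]], {i, Length[idxToy]}];
acc   (* {5., 8., 18.} *)
```

With Floor, a point lying exactly on a bin edge goes to the bin on its right, which matches the half-open bins used in the question's loop.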


A somewhat more flexible approach uses BinLists:

sum = Compile[{{a, _Real, 2}},
  Total[a[[All, 2]]],
  RuntimeAttributes -> {Listable},
  Parallelization -> True
  ];

symmylist3 = sum[
  Flatten[
   BinLists[Flatten[data, 3], {driftBins}, {{-∞, ∞}}],
   1
   ]
  ];
driftRebinned3 = Transpose[{bincenters, symmylist3}];
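A small toy run may help to see the shape of what BinLists returns here. This is only a sketch on made-up points; the finite y bin {0, 10} is my assumption to keep the example simple, and the uncompiled Total[Last /@ #] stands in for the compiled sum so that an empty bin simply yields 0.

```mathematica
(* Toy illustration only: three x bins of width 1, the last one empty *)
pts = {{0.2, 1.}, {0.7, 2.}, {1.5, 4.}};
binned = BinLists[pts, {{0, 1, 2, 3}}, {{0, 10}}];   (* x bin edges, then one wide y bin *)

(* one list of points per x bin after Flatten; Total[{}] is 0 for empty bins *)
Total[Last /@ #] & /@ Flatten[binned, 1]   (* {3., 4., 0} *)
```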














edited 51 mins ago

answered 1 hour ago
Henrik Schumacher



































              $begingroup$

I'll assume here that the bins are equally spaced.

Starting like @HenrikSchumacher suggests:

data = Join[RandomReal[{0, 4}, {5, 6, 780, 90, 1}], RandomReal[{0, 1}, {5, 6, 780, 90, 1}], 5];
data1 = Flatten[data, 3];

If your data need to be weighted depending on their four indices, you could instead first do something like

weighteddata = MapIndexed[{#1[[1]], f[#1[[2]], #2]} &, data, {4}];

with f some function you define that does the weighting. Then use data1 = Flatten[weighteddata, 3] instead.
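As a sketch of how such an f could look, the following uses a purely hypothetical weight (multiply y by the first index) on a tiny array with the same nesting depth as data. The definition of f and the array small are assumptions for illustration only; the real weighting depends on your data files.

```mathematica
(* Hypothetical weight: scales y by the first index only, for illustration *)
f[y_, {i_, j_, k_, l_}] := y*i;

(* tiny stand-in for data, dimensions {2, 1, 1, 2, 2} *)
small = ConstantArray[{0.5, 1.}, {2, 1, 1, 2}];

(* at level {4} each element is an {x, y} pair and #2 is its index list {i, j, k, l} *)
weighted = MapIndexed[{#1[[1]], f[#1[[2]], #2]} &, small, {4}];
Flatten[weighted, 3]
(* {{0.5, 1.}, {0.5, 1.}, {0.5, 2.}, {0.5, 2.}} *)
```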



For each data point, round the $x$-coordinate down to the nearest thousandth (you may use Round or Ceiling instead of Floor to define the bins, depending on what exactly you need):

data2 = {Floor[#[[1]], 0.001], #[[2]]} & /@ data1;

Gather together all data points with the same $x$-coordinate (rounded down to the nearest thousandth):

A = GatherBy[data2, First];

From these gathered lists, calculate for each bin (i) the $x$-value of the lower bin edge (or the center of the bin, or whatever) and (ii) the sum of the $y$-values (or the mean, or length, or whatever):

B = (#[[1, 1]] -> Total[#[[All, 2]]]) & /@ A;

The results in B are not sorted by bin (GatherBy keeps the groups in order of first appearance).
Make an ordered list of all bins and their sums:

bins = Range[0, 4, 0.001];
Transpose[{bins, Lookup[B, bins, 0]}]

{{0, ...}, {0.001, ...}, {0.002, ...}, ..., {4, ...}}

(The dots stand for the sum values in each bin.)
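The steps above can also be condensed with GroupBy on an integer bin index, which avoids looking up floating-point keys. This is a minimal sketch on made-up toy points, not the code above; pts, width, nbins and sums are illustrative names, and the bin width of 1 is an assumption chosen to keep the numbers readable.

```mathematica
(* Toy illustration only: {x, y} pairs chosen by hand *)
pts = {{0.2, 1.}, {0.7, 2.}, {1.5, 4.}, {1.9, 8.}, {3.1, 16.}};
width = 1;   (* assumed bin width *)
nbins = 4;   (* bins [0,1), [1,2), [2,3), [3,4) *)

(* key: integer bin index of x; value: y; reducer: Total *)
sums = GroupBy[pts, (Floor[First[#]/width] &) -> Last, Total];

(* attach bin centers and fill empty bins with 0 *)
Table[{(k + 1/2) width, Lookup[sums, k, 0]}, {k, 0, nbins - 1}]
(* {{1/2, 3.}, {3/2, 12.}, {5/2, 0}, {7/2, 16.}} *)
```

Because the keys are integers, Lookup matches them exactly, whereas matching machine reals such as 0.003 relies on both sides being computed in the same way.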



If you prefer having the bin centers as the first coordinate instead of the lower bin edges, then you could do

bins = Range[0, 3.999, 0.001];
Transpose[{bins + 0.0005, Lookup[B, bins, 0]}]

{{0.0005, ...}, {0.0015, ...}, {0.0025, ...}, ..., {3.9995, ...}}

I agree with @HenrikSchumacher that BinLists is nicer than my use of GatherBy here.

















              $endgroup$


















                3












                $begingroup$

                I'll assume here that the bins are equally spaced.



                Starting like @HenrikSchumacher suggests:



                data = Join[RandomReal[{0, 4}, {5, 6, 780, 90, 1}], RandomReal[{0, 1}, {5, 6, 780, 90, 1}], 5];
                data1 = Flatten[data, 3];


                If your data need to be weighted depending on their four indices, you could instead first do something like



                weighteddata = MapIndexed[{#1[[1]], f[#1[[2]],#2]}&, data, {4}];


                with f some function you define that does the weighting. Then use data1 = Flatten[weighteddata, 3] instead.



                For each data point, round the $x$-coordinate down to the nearest thousandth (you may use Round or Ceiling instead of Floor to define the bins, depending on what exactly you need):



                data2 = {Floor[#[[1]], 0.001], #[[2]]} & /@ data1;


                Gather together all data points with the same $x$-coordinate (rounded down to the nearest thousandth):



                A = GatherBy[data2, First];


                From these gathered lists, calculate for each bin (i) the $x$-value of the lower bin edge (or the center of the bin, or whatever) and (ii) the sum of the $y$-values (or the mean, or length, or whatever):



                B = (#[[1, 1]] -> Total[#[[All, 2]]]) & /@ A;


                The results in B are in random order.
                Make an ordered list of all bins and their sum:



                bins = Range[0, 4, 0.001];
                Transpose[{bins, Lookup[B, bins, 0]}]



                {{0, ...}, {0.001, ...}, {0.002, ...}, ..., {4, ...}}




                (The dots stand for the sum values in each bin.)



                If you prefer having the bin centers as first coordinate instead of the bin lower limits, then you could do



                bins = Range[0, 3.999, 0.001];
                Transpose[{bins+0.0005, Lookup[B, bins, 0]}]



                {{0.0005, ...}, {0.0015, ...}, {0.0025, ...}, ..., {3.9995, ...}}




                I agree with @HenrikSchumacher that BinLists is nicer than my use of GatherBy here.

















                $endgroup$
































                  answered 49 mins ago (edited 23 mins ago) by Roman




































