gerbil output file comparison #15

chklopp · 2020-06-23T06:45:13Z

Is there an efficient way to compare large gerbil output files in order to retrieve kmers which are only in one of the two input files?

merbert · 2020-07-07T11:52:12Z

Unfortunately, that is not easily possible. One could process the two inputs one after the other with the same hash strategy, then fix and sort the partitions in the output file. After that, the comparison could be done quite efficiently. However, implementing this would be quite complex, and unfortunately I don't have time for it at the moment.
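
To illustrate why sorted output makes the comparison cheap, here is a minimal sketch of a merge-style pass over two sorted k-mer streams. It is not part of gerbil and does not read gerbil's binary output. It assumes the k-mers from each input have already been exported as plain strings, sorted in the same order, with each k-mer appearing once per stream. The function name `unique_kmers` and the labels "A" and "B" are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple


def unique_kmers(a: Iterable[str], b: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (kmer, source) for k-mers present in exactly one of two streams.

    Assumption: both streams are sorted ascending and contain no duplicates.
    """
    it_a, it_b = iter(a), iter(b)
    x: Optional[str] = next(it_a, None)
    y: Optional[str] = next(it_b, None)

    # Advance whichever stream holds the smaller k-mer; equal k-mers are shared.
    while x is not None and y is not None:
        if x == y:
            x = next(it_a, None)
            y = next(it_b, None)
        elif x < y:
            yield x, "A"
            x = next(it_a, None)
        else:
            yield y, "B"
            y = next(it_b, None)

    # Whatever remains in either stream cannot occur in the other one.
    while x is not None:
        yield x, "A"
        x = next(it_a, None)
    while y is not None:
        yield y, "B"
        y = next(it_b, None)


if __name__ == "__main__":
    first = ["AAC", "ACG", "CGT"]
    second = ["AAC", "CGT", "GTA"]
    for kmer, source in unique_kmers(first, second):
        print(kmer, source)
```

Because each stream is read once and only the current k-mer from each side is held in memory, the pass should scale to large files as long as both inputs really are sorted the same way.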
