
Draft: Add color.h test #524

Open
jodavies wants to merge 5 commits into vermaseren:master from jodavies:test-color

Conversation

@jodavies (Collaborator) commented on May 21, 2024

This test seems trickier than expected. Currently:

mpirun -np {1,2} parform: OK
mpirun -np {3,4} parform: hang?
mpirun -np {5,6} parform: crash

valgrind vorm: OK
valgrind tvorm -w2: mostly hangs (takes 100-200s), sometimes finishes in 2s
valgrind tvorm -w4: OK, finishes in 2s

The CI sees valgrind errors in vorm and tvorm that I can't reproduce locally. Edit: I can reproduce them on Ubuntu 20.04 (which the runners use) but not on 22.04.

@jodavies (Collaborator, Author) commented:

I added some print statements to PF_UnpackRedefinedPreVars to try to work out what happens. There are many successful redefines, and then we have:

0 i = 0
0 trying to redefine ik1 (35) to 1
0    loop j = 0; j < 2
0       AC.pfirstnum[0] (ik1c) (37), index 35
0       AC.pfirstnum[1] (adj) (38), index 35

So it fails to find, in AC.pfirstnum, the variable it is trying to redefine. It then evaluates

if ( AC.inputnumbers[j] < inputnumber ) {

with j = 2, causing the "Conditional jump or move depends on uninitialised value(s)" error.

So, why is ik1 no longer in the AC.pfirstnum array? Earlier in the program it was there (and had index 35).
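
For illustration, here is a minimal, self-contained sketch of that access pattern (NOT the actual FORM source; the variable names are hypothetical stand-ins based on the debug output above) which produces exactly this kind of Valgrind report:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* Stand-ins for AC.pfirstnum / AC.inputnumbers: allocated with room to
       spare, but only the first numpfirstnum entries are ever written. */
    int *pfirstnum    = malloc(4 * sizeof(int));
    int *inputnumbers = malloc(4 * sizeof(int));
    int numpfirstnum = 2;
    int index = 35, inputnumber = 1, j;

    pfirstnum[0] = 37; inputnumbers[0] = 0;   /* ik1c */
    pfirstnum[1] = 38; inputnumbers[1] = 0;   /* adj  */

    /* Look for the variable being redefined (ik1, index 35). */
    for ( j = 0; j < numpfirstnum; j++ ) {
        if ( pfirstnum[j] == index ) break;
    }
    /* ik1 is not found, so the loop falls through with j == 2, and the next
       comparison reads inputnumbers[2], which was never written: Valgrind
       reports "Conditional jump or move depends on uninitialised value(s)". */
    if ( inputnumbers[j] < inputnumber ) {
        printf("redefinition accepted\n");
    }

    free(pfirstnum);
    free(inputnumbers);
    return 0;
}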

@tueda (Collaborator) commented on May 22, 2024

Thanks for the investigation. I will look into the ParFORM issue. (You know, in programming, the person you were a month ago is a stranger. Then, the person you were more than 10 years ago is...)

By the way, maybe this (the code that gives the Valgrind error) should be broken up into small unit tests.

@jodavies (Collaborator, Author) commented on Jun 12, 2024

It seems the problem with valgrind and tvorm -w2 is due to the load balancing; the same issue happens with -w3. If I run this test under callgrind, we see that ThreadsProcessor makes hundreds of millions of calls to LoadReadjusted (which steals terms from the working thread and distributes them to the idle threads), each of which also involves locking.

With -w4, there are only ~400 calls to LoadReadjusted.

Edit: it seems to be some kind of race condition, though: if I add a MesPrint in LoadReadjusted, it prints only ~400 times, even when running under valgrind.

The easiest solution is to just disable valgrind for this test...
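
To make the pattern concrete, here is a hedged, self-contained sketch (an assumed structure, not the actual FORM source) of the kind of mutex-protected rebalancing call described above; if the surrounding loop re-triggers it on almost every term, the locking itself dominates the run time under valgrind's serialised thread scheduling:

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t rebalance_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for LoadReadjusted: under the lock, steal terms from
   the loaded worker and hand them to the idle workers. */
static int load_readjusted_sketch(void) {
    int moved = 0;
    pthread_mutex_lock(&rebalance_lock);
    /* ... inspect the per-worker queues and move terms around ... */
    pthread_mutex_unlock(&rebalance_lock);
    return moved;
}

int main(void) {
    /* Stand-in for the master loop in ThreadsProcessor: keep asking for a
       readjustment while terms remain.  In the callgrind run described above
       this happened hundreds of millions of times. */
    long calls = 0;
    for ( long i = 0; i < 1000000L; i++ ) {
        load_readjusted_sketch();
        calls++;
    }
    printf("LoadReadjusted-like calls: %ld\n", calls);
    return 0;
}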

jodavies added a commit to jodavies/form that referenced this pull request Jun 13, 2024
Don't run this test with tform and less than 4 threads, it goes extremely slowly.
See discussion in vermaseren#524 .
@jodavies (Collaborator, Author) commented:

Once #525 is merged we can rebase this on top of it, and the ParFORM tests will also run successfully.

@jodavies (Collaborator, Author) commented:

Here is a simpler way to reproduce the load-balancing issue under valgrind. I cut the expression out of the color test.

This script does basically nothing at all, but on my system valgrind tform -w2 test-lb.frm takes about 30 seconds and results in millions of calls to LoadReadjusted and its pthread mutex. Using 4 threads or regular form, it finishes in a fraction of a second.

#-
#include color.h

Local Q10 =
    + NA*I2R*cR*[cR-cA/2]^3
    - 1/2*NA*I2R*cA*[cR-cA/2]^3
    - 3*NA*I2R*cA*cR*[cR-cA/2]^2
    + 5/2*NA*I2R*cA^2*[cR-cA/2]^2
    + 7/4*NA*I2R*cA^2*cR*[cR-cA/2]
    - 25/12*NA*I2R*cA^3*[cR-cA/2]
    - 1/12*NA*I2R*cA^3*cR
    + 19/48*NA*I2R*cA^4
    + 5*cOlTt(cOlpA1,cOlpA1,cOlpA1,cOlpA1)*[cR-cA/2]
    - 17/6*cOlTt(cOlpA1,cOlpA1,cOlpA1,cOlpA1)*cA
    + 2*cOlTt(cOlpA1,cOlpA1,cOlpA1,N1_?,N2_?)*f(cOlpA1,N1_?,N2_?)*i_
    + cOlTt(cOlpA1,cOlpA1,N1_?,cOlpA1,N2_?)*f(cOlpA1,N1_?,N2_?)*i_
    + cOlTt(cOlpA1,cOlpA1,N1_?,N2_?)*f(cOlpA1,N1_?,N3_?)*f(cOlpA1,N2_?,N3_?)
    - 1/6*cOlTt(N1_?,N2_?,N3_?,N4_?,N5_?)*f(N1_?,N3_?,N6_?)*f(N2_?,N5_?,N7_?)*f(N4_?,N6_?,N7_?)*i_*cA
    + 1/6*cOlTt(N1_?,N2_?,N3_?,N4_?,N5_?)*f(N1_?,N5_?,N6_?)*f(N2_?,N3_?,N7_?)*f(N4_?,N6_?,N7_?)*i_*cA
    - 1/12*cOlTt(N1_?,N2_?,N3_?,N4_?,N5_?)*f(N1_?,N6_?,N7_?)*f(N2_?,N3_?,N6_?)*f(N4_?,N5_?,N7_?)*i_*cA
    - 1/12*cOlTt(N1_?,N2_?,N3_?,N4_?,N5_?)*f(N1_?,N6_?,N7_?)*f(N2_?,N3_?,N7_?)*f(N4_?,N5_?,N6_?)*i_*cA
    - 1/12*cOlTt(N1_?,N2_?,N3_?,N4_?,N5_?)*f(N1_?,N6_?,N7_?)*f(N2_?,N5_?,N6_?)*f(N3_?,N4_?,N7_?)*i_*cA
    - 1/12*cOlTt(N1_?,N2_?,N3_?,N4_?,N5_?)*f(N1_?,N6_?,N7_?)*f(N2_?,N5_?,N7_?)*f(N3_?,N4_?,N6_?)*i_*cA
    - 2/3*cOlTt(N1_?,N2_?,N3_?,N4_?)*f(N1_?,N2_?,N5_?)*f(N3_?,N4_?,N5_?)*cA*[cR-cA/2]
    + 7/18*cOlTt(N1_?,N2_?,N3_?,N4_?)*f(N1_?,N2_?,N5_?)*f(N3_?,N4_?,N5_?)*cA^2
    + 2/3*cOlTt(N1_?,N2_?,N3_?,N4_?)*f(N1_?,N4_?,N5_?)*f(N2_?,N3_?,N5_?)*cA*[cR-cA/2]
    - 7/18*cOlTt(N1_?,N2_?,N3_?,N4_?)*f(N1_?,N4_?,N5_?)*f(N2_?,N3_?,N5_?)*cA^2
   ;
.sort

Print +s;
.end

@jodavies added this to the v4.3.2 milestone on Nov 6, 2024
@coveralls commented on Nov 14, 2024

Coverage Status
coverage: 53.172% (+2.9%) from 50.227% when pulling 9c77706 on jodavies:test-color into 0289435 on vermaseren:master.

@tueda (Collaborator) commented on Nov 15, 2024

Do you eventually want to put color.frm and color.h here?

The former does not comply with the rule in the README (modulo user.frm). The latter is a large file and is originally distributed from the Nikhef site.

Perhaps the rule could be reconsidered and rewritten. Maybe the check.rb script could include functionality to automatically download libraries as needed.

@jodavies (Collaborator, Author) commented:

color.h could be fetched by the CI config as forcer.h currently is, and the test run in the same way? It then doesn't run locally with make check, but is still tested on each commit.

@tueda (Collaborator) commented on Nov 15, 2024

If we want to run the CI tests for color.frm in the same way as we do for forcer.frm (and it is not necessary for everyone to test color.frm locally), then we can configure this in test.yaml.

As for running Valgrind/coverage jobs for color.frm on the CI, I'm currently not sure whether the setup is ready for that, as I forget whether I wrote it in an extensible and general way. Forcer is not compatible with Valgrind, so I didn't need it.

@jodavies (Collaborator, Author) commented:

What do you mean by forcer not being compatible with valgrind?

@tueda (Collaborator) commented on Nov 15, 2024

If my memory is correct, Forcer always hits memory leaks in FORM.

@jodavies (Collaborator, Author) commented:

OK, I will try to test that. Ideally both the color and forcer tests can pass valgrind too.

@tueda (Collaborator) commented on Nov 15, 2024

Here is the issue that is problematic for Forcer with Valgrind: #252

@jodavies (Collaborator, Author) commented:

Currently the forcer test (and the color test, if it moves to the same status) is not part of the coveralls statistics. Perhaps they should also go into that run. At least on my machine, the forcer test takes only 8s when run with vorm.

@jodavies (Collaborator, Author) commented:

Do you think it makes sense to keep things like forcer, color, etc. in separate directories, or would it be better to create a single directory ("extra", for example) containing such tests? Then the CI can run everything in the extra directory (but not for the valgrind run).

@tueda (Collaborator) commented on Nov 29, 2024

I think collecting the extra test cases into an extra directory is a good idea. We have the benchmarks directory in #541, so the same can be done for the extra tests. (Note: formunit/fu.frm is a benchmark test.)

@jodavies (Collaborator, Author) commented on Dec 3, 2024

@tueda how does this seem? Now "extra" runs both the forcer and color tests, and I added them to the coverage run. Further tests which require external libraries, or which otherwise don't fit in the default tests, could also go into "extra".

@tueda (Collaborator) commented on Dec 3, 2024

Oh, now I see what you meant, but just to clarify: (1) You don't want to collect checkpoint and multithreaded into extra, do you? And the big tests will be put in extra. In the future, adding many big tests to extra in this way may slow down this job, and we would need to split it again. (2) Will you add a Valgrind check for extra? See:

# we measure code coverage only for tests checked with Valgrind.

That said, this principle may be open to discussion.

By the way, in the future, duplicated code in the workflows could be rewritten using reusable workflows.

@jodavies (Collaborator, Author) commented on Dec 4, 2024

In principle checkpoint and multithreaded could also move into extra. Then the yaml file is a bit simpler, in that there are fewer "special" jobs. With this structure it is also easier to add tests to extra in the future, since one then only needs to add to the list of libraries to download, if anything (I had in mind tests for mincer, harmpol, summer, Opiter?).

If they take a long time, can't we just split the runs up with the "groups" mechanism that the valgrind tests already use?

In principle it would be nice to add valgrind tests for extra (currently, as you say, we would need to skip the forcer test, and also the color test when running under tform).

Is there a particular reason why the coverage check should only be for the tests which run under valgrind? This isn't actually quite the case at the moment anyway, since a few tests are skipped for the valgrind run but are still run with lcov.

@tueda (Collaborator) commented on Dec 4, 2024

OK, placing all the extra tests in extra is beneficial. I agree.

Is there a particular reason why the coverage check should only be for the tests which run under valgrind?

It depends on how we define the scope of test coverage. Measuring coverage for Valgrind-validated tests ensures that our tests include verification of the absence of memory errors, at least those detectable by Valgrind (though there are currently some #pend_if valgrind? directives, which skip this verification during coverage measurement).

@jodavies (Collaborator, Author) commented on Dec 4, 2024

Right, this does make sense; it is good to know the percentage of code which is both covered and valgrind-error-free. In this case, is there a way to fake the valgrind flag, so that the tests which are marked #pend_if valgrind? are also skipped in the lcov run?

Then all test runs could "just run everything" and rely on the #pend_if flag to do the skipping. We could then have, for example, the color test running under valgrind with vorm but not with tvorm for now.

@tueda (Collaborator) commented on Dec 4, 2024

is there a way to fake the valgrind flag

I think I can add a command-line option for this. --fake-valgrind, or do you have a better name?

@jodavies (Collaborator, Author) commented on Dec 4, 2024

--fake-valgrind sounds fine to me. So once that exists, the next steps would be:

  • move multithreaded and checkpoint to "extra"
  • run the "extra" tests also in the valgrind run, with #pend_if valgrind? where necessary (e.g. the forcer test)
  • run the lcov checks with --fake-valgrind

More generally, is it possible to have a "build" job which caches the executables, which are then used by all of the checks? Currently all 35 valgrind jobs compile form, as I understand it. Maybe this would improve performance a bit, though I expect some of the harder valgrind tests will still be the bottleneck.

tueda added a commit to tueda/form that referenced this pull request Dec 4, 2024
@tueda (Collaborator) commented on Dec 5, 2024

is it possible to have a "build" job which caches the executables, which are then used by all of the checks?

This might be worth a try. But there could be some time overhead from uploading/downloading the executables, and hardware or OS versions may not be guaranteed to fully match across different jobs(?). (I've had a bad experience with Google Colab VMs, though that is a different vendor.)

@jodavies (Collaborator, Author) commented on Dec 5, 2024

OK, then in that case maybe it is fine as it is. At least, the tests only build the binaries that they use, so the build is fairly fast.
