Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make fitting of the GOK more robust in calc_Huntley2006() #664

Merged
merged 3 commits into from
Mar 28, 2025
Merged

Conversation

mcol
Copy link
Contributor

@mcol mcol commented Mar 27, 2025

When fitting the simulated curve with the GOK model fails, the fit is attempted again with 100 different values for D0 and the best fit is used. This should reduce the number of occasions in which the error message "Could not fit simulated
model curve, check suitability of model and parameters" is reported.

When I made the simulated curve of the testcase work, I started getting an error in the fitting of the unfaded curve (the fit failed with "Missing value or an infinity produced when evaluating the model" from nlsLM). I reckon that what happens is that we take the starting values from the exponential model, and in that case the c parameter was negative, so at first I took its absolute value, but then I settled on the following change, which fixed that error and got a solution for the testcase:

-   c = coef(fit_start)[["c"]],
+   c = max(coef(fit_start)[["c"]], 1),

Fixes #660.

@RLumSK
Copy link
Member

RLumSK commented Mar 27, 2025

Thanks @mcol. Could you make als dataset I had sent work?

@mcol
Copy link
Contributor Author

mcol commented Mar 27, 2025

The testcase I've added is derived from the first dataset you sent me, just with values rounded to 2 decimal places. Your original dataset works too, just very slowly (~30 minutes) as it asks for 10000 MC iterations, and each of them now has to fit 100 models for the simulated curve. I got results testing with n.MC = 100 in ~20 seconds.

Copy link
Member

@RLumSK RLumSK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon thorough examination, particularly regarding the extended execution times, it becomes evident that we should consider dividing the computation into distinct processes to harness the capabilities of multicore systems.

@mcol
Copy link
Contributor Author

mcol commented Mar 28, 2025

I've now parallelized the MC computations. The annoying thing is that we cannot have calls to .throw_error() inside the parLapply() block as it generates this sort of messages:

Error in checkForRemoteErrors(val) :
  one node produced an error: attempt to select less than one element in integerOneIndex

The only workable solution I found is to return NA and check outside of the parallel region for errors. Therefore we lose the advantage of returning early in case of lack of fit, as we need to wait for the whole MC run to complete.

@mcol
Copy link
Contributor Author

mcol commented Mar 28, 2025

This takes the time to run 10,000 MC iterations on your example with 10 cores to ~5 minutes.

Copy link
Member

@RLumSK RLumSK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great @mcol!

@RLumSK RLumSK merged commit 7da65e7 into master Mar 28, 2025
4 checks passed
@mcol mcol deleted the issue_660 branch March 28, 2025 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

make calc_Huntley2006() fitting more robust
2 participants