Last isave steps not written to file #10

Closed · vhaasteren opened this issue Mar 31, 2016 · 7 comments

@vhaasteren
Member

With the standard isave=1000, the last 1000 steps are not written to the chain file.

@paulthebaker
Member

paulthebaker commented Aug 17, 2021

This bug is still here. Currently the last isave / thin samples are not written to file, but they are present in sampler._chain. The actual count gets weird if isave is not a multiple of thin, but no error is raised and nothing crashes.

@kdolum
Collaborator

kdolum commented Apr 3, 2023

I discovered this independently just now. Here's an analysis:

Suppose you say you want 100,000 samples, and isave is the default 1000. The first sample is the initial sample. Thus the sampler takes 99,999 steps and finishes. The initial sample has iter=0. The final sample has iter=99,999, and since this isn't a multiple of isave, it isn't written out. The last write takes place at iter=99,000 and writes samples numbered 98,000...98,999. Samples 99,000...99,999 are never written.
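To make the off-by-one concrete, here is a minimal arithmetic sketch of the schedule just described (my own illustration, not the sampler's actual code, and ignoring thin for simplicity, as in the numbers above): writes fire whenever iter is a nonzero multiple of isave and flush the preceding isave samples.

```python
# Sketch of the current write schedule described above (illustration only).
Niter, isave = 100_000, 1_000

written = set()
for it in range(Niter):                        # iter runs 0 .. 99,999
    if it > 0 and it % isave == 0:
        written.update(range(it - isave, it))  # flush samples it-isave .. it-1

print(max(written))              # 98999: the last write stops here
print(Niter - 1 - max(written))  # 1000 samples (99,000 .. 99,999) never reach the file
```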

You might think you could work around this by requesting 100,001 samples, but then it crashes because of issue #20.

In my opinion, the right fix is to make Niter the number of steps that you are requesting, i.e., the number of samples not including the initial one. Then the number of samples written out will be Niter/thin + 1. I would write the initial sample immediately, then call writeToFile with iter = 1000, 2000, ..., 100,000, with each call writing the last isave/thin samples (100 when isave = 1000 and thin = 10), numbered iter - isave + thin, iter - isave + 2*thin, ..., iter.
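For comparison, here is a sketch of the schedule proposed above (again just an illustration of the arithmetic, not the eventual implementation), using isave = 1000 and thin = 10, the values implied by the numbers in this thread:

```python
# Sketch of the proposed write schedule (illustration only).
Niter, isave, thin = 100_000, 1_000, 10

rows_written = 1                            # the initial sample, written up front
for it in range(isave, Niter + 1, isave):   # writeToFile at iter = 1000, 2000, ..., 100000
    rows_written += isave // thin           # each call flushes samples it-isave+thin, ..., it

print(rows_written)                         # 10001 == Niter // thin + 1
```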

@vhaasteren
Member Author

So basically you propose to define Niter as the number of "steps" rather than "posterior evaluations". I think that's a good way to think about it, and I don't mind it at all. I can't think of a case where outputting Niter/thin + 1 samples would suddenly break anything downstream. Usually, when reading in an MCMC chain, you determine the number of samples from the file, not in any other way.

@kdolum, if you feel like making a PR or working with @paulthebaker on #21, then please go ahead.

@kdolum
Collaborator

kdolum commented May 12, 2023

OK. I will make a PR fixing this and also #20. I'll make sure not to lose @paulthebaker's changes in #21.
I propose the following design choices:

  1. If isave is not a multiple of thin, raise an exception; that combination does not make sense.
  2. Suppose Niter = a*isave + b*thin + c, with 0 <= c < thin and b*thin + c < isave. Take exactly Niter steps. This may be helpful to someone who is trying to test the sampling process rather than creating a file for later use. The last c steps will not be recorded anywhere. The file will contain the initial sample plus the a*isave/thin + b samples that we compute (see the sketch after this list).
  3. Only allow resuming when the file contains only complete rows and their count is one plus a multiple of isave/thin. Thus if b is nonzero, or if the process crashed while writing a block, you have to delete the extra rows at the end before resuming. I recently resumed hundreds of runs and never encountered any partially written files. I think problems with crashing while writing are best dealt with by human intervention, and I'd rather avoid needing code that writes isave/thin - b rows to fill out the partial block.
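To illustrate choice 2, here is a small sketch of the bookkeeping (my own illustration with made-up example values, not code from the PR):

```python
# Decompose Niter = a*isave + b*thin + c and count the rows that reach the file,
# as in design choice 2 above. Example values are arbitrary; isave must be a multiple of thin.
Niter, isave, thin = 100_557, 1_000, 10

a, rem = divmod(Niter, isave)                # a complete isave blocks
b, c = divmod(rem, thin)                     # b extra thinned samples, c leftover steps
assert Niter == a * isave + b * thin + c

rows_in_file = 1 + a * (isave // thin) + b   # initial sample + a*isave/thin + b
print(a, b, c, rows_in_file)                 # 100 55 7 10056 -- the last c=7 steps are dropped
```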

If you have any objections to these choices, please let me know.

@chimaerase
Contributor

Thanks for your attention and work on this! For what it's worth, I agree with @kdolum's plan.

I am not an expert on parallel tempering, but as an FYI, a departed colleague of mine recently documented her PTMCMCSampler-based code with the expectation that the number of samples generated should be (niter - burn)/thin. I don't see burn addressed in the discussion above, and I'm not necessarily suggesting that this is how the code should work. It would be very helpful, though, as @kdolum suggests, to know in advance (and have documented) the number of samples that will be written to file. In my use case, this kind of predictability would help to test for and catch things like inconsistent user inputs in derived software.

+1 for clarity / documentation on the # of samples saved

@kdolum
Collaborator

kdolum commented May 18, 2023

I think the problem is that burn is actually the size of the differential evolution buffer, and not a number of samples to burn at the beginning. It is also used for computing an effective sample number. My suggestion would be to change the parameter name to DEsize or something to avoid confusion, and either get rid of burn or have it do what you expected. But I think that should be a separate issue, and I'm reluctant to make such changes myself, because I don't know too much about DE.

@kdolum
Collaborator

kdolum commented Nov 17, 2023

Fixed by #41.

kdolum closed this as completed Nov 17, 2023