We're training models via the CLI on our local HPC. I'm working through the codebase, but I'm wondering: if you do not explicitly set the training and validation indices in the config JSON, is the same train/test split used if you train, stop training, and then resume training?
Also, to clarify: I'm worried that the training set from an earlier "run" can leak into the validation set after resuming, contaminating our validation metrics.
Hey @jmarkow,
Yep, good catch. The logic for selecting the training/validation splits is a bit of a mess because it has to handle all the cases that we support, and it gets more complicated when we load a base checkpoint, since we inherit a lot (but not all?) of the settings.
What I'd recommend to keep it totally clean would be to split up the labels into training and validation (and test?) as separate files.
I'd recommend doing this with our new standalone sleap_io package:
import sleap_io as sio
# Load source labels.
labels = sio.load_file("labels.v001.slp")
# Make splits and export with embedded images.
labels.make_training_splits(n_train=0.8, n_val=0.1, n_test=0.1, save_dir="split1", seed=42)
# Splits will be saved as self-contained SLP package files with images and labels.
labels_train = sio.load_file("split1/train.pkg.slp")
labels_val = sio.load_file("split1/val.pkg.slp")
labels_test = sio.load_file("split1/test.pkg.slp")
Let us know if that works for you!
Cheers,
Talmo
PS: If you're observing that the training/validation indices are not being appropriately used/not used in the former case, let us know so we can open a bug for tracking.
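For readers wondering why a fixed seed matters here, the following is a minimal sketch of a seeded index split using only the Python standard library. It is not SLEAP's or sleap_io's internal logic; the function name `split_indices` and its parameters are hypothetical and exist only for illustration. The point is that the same seed and the same number of items always give the same disjoint train/validation indices, so a split computed once can be reused across runs.

```python
import random


def split_indices(n_items, val_fraction=0.1, seed=42):
    """Return (train_inds, val_inds) as sorted, disjoint lists of ints.

    Hypothetical helper for illustration only; not part of SLEAP or sleap_io.
    """
    inds = list(range(n_items))
    # A private Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(inds)
    # Always hold out at least one item for validation.
    n_val = max(1, round(n_items * val_fraction))
    val_inds = sorted(inds[:n_val])
    train_inds = sorted(inds[n_val:])
    return train_inds, val_inds


# Calling twice with the same seed reproduces the same split.
train_a, val_a = split_indices(100, val_fraction=0.1, seed=42)
train_b, val_b = split_indices(100, val_fraction=0.1, seed=42)
```

Lists produced this way could be written out once and reused, rather than being regenerated each time training is resumed.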
@talmo Ah nice! I assumed this was potentially the case, so we now explicitly define the indices in the JSON file. I'm guessing that's good enough? In the training logs I see indications that explicit indices are being used for training. We set training_inds and validation_inds to lists of integers and then set split_by_inds to True.
I'll load the validation metrics file and ensure that the indices match what we specify in the JSON. Let me know if I'm still potentially missing something here and should still split out into separate files.
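As a quick sanity check on the explicit-indices approach, here is a minimal sketch that confirms the two index lists in a config do not overlap. It assumes the three keys named above sit under `data.labels` in the training config JSON; that nesting is an assumption, so adjust the lookup if your file is laid out differently. The helper name `find_split_leak` and the example config are hypothetical.

```python
import json


def find_split_leak(config):
    """Return the sorted indices that appear in both the training and validation lists.

    Hypothetical helper; assumes the keys live under config["data"]["labels"].
    """
    labels_cfg = config["data"]["labels"]  # assumed nesting, adjust if needed
    if not labels_cfg.get("split_by_inds"):
        raise ValueError("split_by_inds is not enabled, so the index lists may be ignored")
    train = labels_cfg.get("training_inds") or []
    val = labels_cfg.get("validation_inds") or []
    return sorted(set(train) & set(val))


# Made-up example config for illustration; in practice, json.load() your own file.
example_config = json.loads("""
{
  "data": {
    "labels": {
      "split_by_inds": true,
      "training_inds": [0, 1, 2, 3, 4, 5],
      "validation_inds": [6, 7]
    }
  }
}
""")

leak = find_split_leak(example_config)
print("Overlapping indices:", leak)
```

An empty result only shows that the config itself is clean. Comparing against the indices recorded in the validation metrics file, as described above, is still the check that confirms what was actually used after resuming.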