Data for dataset selection #16

bOTdESUU · 2025-03-21T18:54:46Z

Hi, thanks for the nice work!

I am trying to develop a method for curating datasets similar to LIMR. It would be helpful if you could release the data used to calculate the LIMR score, and potentially the model trained on the full dataset, so that I do not need to rerun the RL. If I understand the code correctly, the data should be the ./data/output/math.8k.json file.

hongtangshui · 2025-03-23T13:08:44Z

here

bOTdESUU · 2025-03-23T18:58:36Z

Thanks for the response.

If I understand correctly, the scores JSON contains the alignment scores. What I am interested in instead is the per-epoch rewards of all the samples that were used to calculate the alignment scores, i.e. the rewards r_i^k described in Section 2.2.1. Sorry for the confusion. Please let me know if this is already in the repo, or whether you would be willing to release it.
