How to fine-tune the TabPFN model? #8

Open
setipsh opened this issue Jan 13, 2025 · 2 comments

Comments

setipsh commented Jan 13, 2025

I am using TabPFN for a classification task and would like to fine-tune the pre-trained model to better adapt it to my specific dataset. I have reviewed the documentation and code, but I still have questions about the concrete steps and parameter settings:

- Does fine-tuning TabPFN involve gradient updates?
- How many training epochs are needed?
- Which parameters matter most during fine-tuning?

fif911 (Contributor) commented Feb 2, 2025

Hey, did you find any details? I'm also looking into this now.

LennartPurucker (Collaborator) commented Feb 2, 2025

Heyho, sharing a discussion from Discord (link):

Hi everyone, there are two different things that both can be called fine-tuning:
(1) Fine-tuning TabPFN to do better on one or more datasets in order to generalize better to other, similar datasets. The analogy for LLMs would be fine-tuning GPT/Llama/Mistral to your own data.
(2) Fine-tuning TabPFN to a single dataset, in order to improve its performance (e.g., to tackle large datasets that don't fit into memory). This doesn't have an analogy in LLMs but is specific to tabular data.

For (2), I already created some code, see https://github.com/LennartPurucker/finetune_tabpfn_v2 for more.
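To make case (2) concrete for the original question ("does fine-tuning involve gradient updates?"): yes, it is an ordinary gradient-descent training loop over a loss on your dataset. Below is a minimal stand-in sketch of those mechanics (epochs, learning rate, gradient steps) using a toy logistic-regression model in plain Python. The data, starting weights, learning rate, and epoch count here are all invented for illustration; an actual TabPFN fine-tuning run would update the transformer's weights with PyTorch, e.g. via the finetune_tabpfn_v2 repository above.

```python
# Illustration only: fine-tuning = repeated gradient updates on a loss.
# A tiny logistic-regression "model" stands in for TabPFN so the mechanics
# (epochs, learning rate, gradient steps) are visible without PyTorch.
import math
import random

random.seed(0)

# Toy binary-classification data: label is 1 when x0 + x1 > 1.
X = [[random.random(), random.random()] for _ in range(200)]
y = [1.0 if x[0] + x[1] > 1.0 else 0.0 for x in X]

# "Pre-trained" starting weights (stand-in for a loaded checkpoint).
w = [0.1, -0.1]
b = 0.0

def predict(x):
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid probability of class 1

lr = 0.5      # learning rate: typically the most important knob to tune
epochs = 50   # number of passes over the fine-tuning data

for _ in range(epochs):
    # Full-batch gradient of the cross-entropy loss.
    gw, gb = [0.0, 0.0], 0.0
    for x, t in zip(X, y):
        err = predict(x) - t
        gw[0] += err * x[0]
        gw[1] += err * x[1]
        gb += err
    n = len(X)
    w[0] -= lr * gw[0] / n  # gradient update on each parameter
    w[1] -= lr * gw[1] / n
    b -= lr * gb / n

accuracy = sum((predict(x) > 0.5) == (t > 0.5) for x, t in zip(X, y)) / len(X)
print(f"accuracy after fine-tuning: {accuracy:.2f}")
```

The same loop shape applies when the model is TabPFN itself: swap the toy model for the transformer, the hand-written gradients for autograd, and full-batch descent for mini-batch Adam; learning rate and number of epochs remain the settings that matter most.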
