Hyper-parameter tuning for VDPWI #121

Open
lintool opened this issue May 30, 2018 · 5 comments

@lintool (Member) commented May 30, 2018

According to @daemon, VDPWI works: https://github.com/castorini/Castor/tree/master/vdpwi

But its effectiveness is still below the state of the art (SOTA) because the hyper-parameters haven't been tuned yet.

@daemon (Member) commented May 30, 2018

The old implementation was about 0.5 points off in Pearson's r on the test set; now it's closer to 2 points. The biggest changes from the old implementation are the switch to torchtext and PyTorch 0.4. The model code itself hasn't changed.

@likicode (Member) commented:

I ran 216 experiments over the following grid: decay=[0.99, 0.95], lr=[5e-4, 1e-4], batch_size=[8, 16], momentum=[0, 0.15, 0.05], rnn_hidden_dim=[128, 256, 512], epochs=[10, 15, 20]. In all cases I used RMSProp for optimization, as in the paper.
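
For concreteness, a minimal sketch of enumerating that grid (just the combinatorics; how each configuration is actually launched against the training script is omitted here):

```python
import itertools

# Hyper-parameter grid from the comment above: 2 * 2 * 2 * 3 * 3 * 3 = 216 runs.
grid = {
    "decay": [0.99, 0.95],
    "lr": [5e-4, 1e-4],
    "batch_size": [8, 16],
    "momentum": [0, 0.15, 0.05],
    "rnn_hidden_dim": [128, 256, 512],
    "epochs": [10, 15, 20],
}

keys = list(grid)
configs = [dict(zip(keys, values))
           for values in itertools.product(*(grid[k] for k in keys))]
print(len(configs))  # 216

for config in configs:
    # Each config would be passed to the VDPWI training script, always with
    # RMSProp as the optimizer; the launch command itself is omitted.
    ...
```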

The best Pearson's r on the test set is 0.8707, which is 0.0077 lower than the original result. It is achieved with this parameter setting: --decay 0.95 --lr 0.0005 --optimizer rmsprop --momentum 0 --epochs 15 --batch-size 8 --rnn-hidden-dim 256.

The "nearly best" results (e.g., 0.8678, 0.8667) all share the same core parameters: --lr 5e-4, --batch-size 8, --epochs 15.

I also ran some tests with SGD and Adam; their performance is 1-2 points lower than RMSProp's.
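
For reference, a minimal sketch of the RMSProp setup above in PyTorch, using a stand-in module in place of the real VDPWI model; treating --decay 0.95 as a per-epoch exponential learning-rate decay is an assumption, not a confirmed detail of the repo's training script:

```python
import torch
import torch.nn as nn

# Stand-in module; the real VDPWI model lives in the Castor repo.
model = nn.Linear(10, 1)

# --lr 0.0005 --optimizer rmsprop --momentum 0
optimizer = torch.optim.RMSprop(model.parameters(), lr=5e-4, momentum=0)

# Assumption: --decay 0.95 is a per-epoch exponential learning-rate decay;
# the training script may implement the flag differently.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.95)

for epoch in range(15):  # --epochs 15
    # Dummy training step; replace with the actual VDPWI training loop.
    loss = model(torch.randn(8, 10)).pow(2).mean()  # batch size 8
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```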

@Victor0118 (Member) commented:

Good results!
So it is very close to the original paper, right? That means VDPWI-pytorch works!
By the way, in my experience, SGD with a well-chosen learning rate is usually the best setup.
Could you send a PR to update the readme after you finish the tuning? @likicode

@likicode (Member) commented:

I've updated the readme and sent a PR. @Victor0118

@likicode (Member) commented Jul 3, 2018

I re-ran the best parameter setting (Pearson's r 0.8707) 80 times with different random seeds. The 95% confidence interval is [0.8625, 0.8644]. Among these 80 runs, the highest Pearson's r is 0.8710, obtained with random seed 723.

The parameter setting is: --classifier vdpwi --lr 0.0005 --optimizer rmsprop --epochs 15 --momentum 0 --batch-size 8 --rnn-hidden-dim 256

                 Pearson's r   Spearman's ρ   MSE
Original paper   0.8784        0.8199         0.2329
Our result       0.8710        0.8092         0.2501

I also ran the other promising parameter settings 10 times each with different random seeds, in case we were missing a potentially good setting. Two parameter sets achieve an r value higher than 0.87: a) 0.8705, with 95% confidence interval [0.8621, 0.8667]; b) 0.8702, with 95% confidence interval [0.8588, 0.8682].
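
A minimal sketch of how such a confidence interval could be computed from the per-seed Pearson's r values (placeholder data below; scipy's t-interval is one reasonable choice and may differ from what was actually used here):

```python
import numpy as np
from scipy import stats

# Placeholder data: in practice this array would hold the 80 test-set
# Pearson's r values, one per random seed.
pearson_rs = np.random.normal(loc=0.8635, scale=0.004, size=80)

mean = pearson_rs.mean()
sem = stats.sem(pearson_rs)  # standard error of the mean
lo, hi = stats.t.interval(0.95, len(pearson_rs) - 1, loc=mean, scale=sem)
print(f"95% CI: [{lo:.4f}, {hi:.4f}], best run: {pearson_rs.max():.4f}")
```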

@lintool To sum up, our best result improves by about 2 points after parameter tuning and is also very close to the result of the original Torch implementation.
