Hyper-parameter tuning for VDPWI #121

lintool · 2018-05-30T15:44:56Z

According to @daemon - the VDPWI works https://github.com/castorini/Castor/tree/master/vdpwi

But the effectiveness is still below STOA because the hyper-parameters haven't been tuned yet.

daemon · 2018-05-30T17:29:40Z

The old implementation was about 0.5 points off for Pearson's r on the test set -- now it's closer to 2. The biggest changes from the old impl to now would be using torchtext and PyTorch 0.4. The model code itself hasn't changed.

likicode · 2018-06-27T08:20:52Z

I ran 216 tests on parameters including: decay=[0.99, 0.95], lr=[5E-4, 1E-4], batch_size=[8, 16], momentum=[0, 0.15, 0.05], rnn_hidden_dim=[128, 256, 512], epochs=[10,15,20]. In all cases, I use RMSProp for optimization according to the paper.

The best result on test set for Pearson's r is 0.8707, which is 0.0077 lower than the original result. It's achieved under this param setting: --decay 0.95 --lr 0.0005 --optimizer rmsprop --momentum 0 --epochs 15 --batch-size 8 --rnn-hidden-dim 256.

Among all the "nearly best" results(like 0.8678, 0.8667), they share some same params: --lr 5e-4, --batch-size 8, --epochs 15.

Besides, I also ran some tests with SGD and Adam. Their performance are 1~2 points lower than RMSProp.

Victor0118 · 2018-06-27T13:45:12Z

Good results!
So it is very close to the original paper, right? It means VDPWI-pytorch works!
Btw, according to my experience, SGD + good lr is usually the best setup.
Could you send a PR to update the readme after you finish the tunning? @likicode

likicode · 2018-06-27T14:57:24Z

I've updated the readme and sent PR. @Victor0118

likicode · 2018-07-03T05:06:53Z

I re-ran the best parameter setting (Pearson's r 0.8707) 80 times with different random seeds. The 95% confidence interval is: [0.8625, 0.8644]. Among these 80 results, the highest Pearson'r value is 0.8710 by setting the random seed as 723.

The parameter setting is: --classifier vdpwi --lr 0.0005 --optimizer rmsprop --epochs 15 --momentum 0 --batch-size 8 --rnn-hidden-dim 256

	Pearson's r	Spearman's p	MSE
Original paper	0.8784	0.8199	0.2329
Our result	0.8710	0.8092	0.2501

I also run other parameters 10 times with different random seeds in case missing some potential good settings. There are two parameter sets achieving r value higher than 0.87: a) 0.8705 with 95% confidence interval [0.8621, 0.8667]; b)0.8702 with 95% confidence interval [0.8588, 0.8682].

@lintool To sum up, our best result improves 2 points after parameter tuning, and also very close to the result of the original torch implementation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Hyper-parameter tuning for VDPWI #121

Hyper-parameter tuning for VDPWI #121

lintool commented May 30, 2018

daemon commented May 30, 2018

likicode commented Jun 27, 2018

Victor0118 commented Jun 27, 2018

likicode commented Jun 27, 2018

likicode commented Jul 3, 2018

Hyper-parameter tuning for VDPWI #121

Hyper-parameter tuning for VDPWI #121

Comments

lintool commented May 30, 2018

daemon commented May 30, 2018

likicode commented Jun 27, 2018

Victor0118 commented Jun 27, 2018

likicode commented Jun 27, 2018

likicode commented Jul 3, 2018