PUBDEV-2711: Update DL vignette with pretrained_autoencoder.

MachLearnPort · Mar 4, 2016 · 313105d
1 parent 64679a0
commit 313105d
Showing 1 changed file with 9 additions and 0 deletions.
diff --git a/h2o-docs/src/booklets/v2_2015/source/DeepLearning_Vignette.tex b/h2o-docs/src/booklets/v2_2015/source/DeepLearning_Vignette.tex
@@ -851,6 +851,13 @@ \subsubsection{Stacked Autoencoder}
 
 \url{https://github.com/h2oai/h2o-3/blob/master/h2o-r/tests/testdir_algos/deeplearning/runit_deeplearning_stacked_autoencoder_large.R}.
 
+\subsubsection{Unsupervised Pretraining with Supervised Fine-Tuning}
+Sometimes, there's much more unlabeled data than labeled data. In this case, it can make sense to train an autoencoder model on the unlabeled data and then fine-tune the learned model with the available labels. In H2O, first train a Deep Learning model with \texttt{autoencoder} enabled, then transfer its state to a regular supervised Deep Learning model by specifying \texttt{pretrained\_autoencoder}. You can see an \texttt{R} example here:
+
+\url{https://github.com/h2oai/h2o-3/blob/master/h2o-r/tests/testdir_algos/deeplearning/runit_deeplearning_autoencoder_large.R},
+
+and the corresponding \texttt{Python} example here:
+\url{https://github.com/h2oai/h2o-3/blob/master/h2o-py/tests/testdir_algos/deeplearning/pyunit_autoencoderDeepLearning_large.py}.
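The idea behind the new subsection (pretrain on unlabeled rows, then reuse the learned state for a supervised model) can be illustrated with a small NumPy toy. This is only a sketch of the general technique, not H2O's implementation and not the H2O API: the data, the helper names `train_autoencoder` and `fine_tune`, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: many unlabeled rows, only a few labeled rows.
X_unlabeled = rng.normal(size=(500, 8))
X_labeled = rng.normal(size=(40, 8))
y_labeled = (X_labeled[:, 0] + X_labeled[:, 1] > 0).astype(float)


def train_autoencoder(X, n_hidden, epochs=200, lr=0.05):
    """Fit a one-hidden-layer Tanh autoencoder with full-batch gradient descent."""
    n_in = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_in, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(scale=0.1, size=(n_hidden, n_in))
    b_dec = np.zeros(n_in)
    for _ in range(epochs):
        H = np.tanh(X @ W_enc + b_enc)           # encode
        X_hat = H @ W_dec + b_dec                # decode (linear output)
        err = (X_hat - X) / len(X)               # gradient of mean squared error
        dZ = (err @ W_dec.T) * (1.0 - H ** 2)    # backprop through tanh
        W_dec -= lr * (H.T @ err)
        b_dec -= lr * err.sum(axis=0)
        W_enc -= lr * (X.T @ dZ)
        b_enc -= lr * dZ.sum(axis=0)
    return W_enc, b_enc                          # only the encoder is reused


def fine_tune(X, y, W_enc, b_enc, epochs=200, lr=0.1):
    """Start from the pretrained encoder, add a logistic output, train all layers."""
    W1, b1 = W_enc.copy(), b_enc.copy()          # transfer the pretrained state
    w2, b2 = np.zeros(W1.shape[1]), 0.0          # new supervised output layer
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        p = 1.0 / (1.0 + np.exp(-(H @ w2 + b2)))
        losses.append(-np.mean(y * np.log(p + 1e-12)
                               + (1.0 - y) * np.log(1.0 - p + 1e-12)))
        d_out = (p - y) / len(X)                 # gradient of mean log loss
        dZ = np.outer(d_out, w2) * (1.0 - H ** 2)
        w2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum()
        W1 -= lr * (X.T @ dZ)
        b1 -= lr * dZ.sum(axis=0)
    return (W1, b1, w2, b2), losses


# Step 1: unsupervised pretraining on the unlabeled rows.
W_enc, b_enc = train_autoencoder(X_unlabeled, n_hidden=4)
# Step 2: supervised fine-tuning on the labeled rows.
(W1, b1, w2, b2), losses = fine_tune(X_labeled, y_labeled, W_enc, b_enc)
print(f"log loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In this toy, the hidden layer of the supervised model has the same shape as the autoencoder's hidden layer, which is what makes a direct copy of the weights possible. The linked R and Python tests show the corresponding H2O workflow with \texttt{autoencoder} and \texttt{pretrained\_autoencoder}.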
 
 \section{Parameters}
 \label{sec:Parameters}
@@ -875,6 +882,8 @@ \section{Parameters}
 
 \item \texttt{autoencoder}: Logical. Enables autoencoder. The default is false. Refer to the {\textbf{\nameref{sec:DeepAutoencoders}}} section for more details.
 
+\item \texttt{pretrained\_autoencoder}: (Optional) Pretrained autoencoder model (either an \\ \texttt{H2ODeepLearningModel} or a key) used to initialize the model state of a supervised DL model.
+
 \item \texttt{use\_all\_factor\_levels}: Logical. Uses all factor levels of categorical variables. Otherwise, omits the first factor level without loss of accuracy. Useful for variable importances and auto-enabled for autoencoder.  The default is true. Refer to the {\textbf{\nameref{sec:DeepAutoencoders}}} section for more details.
 
 \item \texttt{activation}: Specifies the nonlinear, differentiable activation function used in the network. The options are \texttt{Tanh, TanhWithDropout, Rectifier, RectifierWithDropout, Maxout,} or \\\texttt{MaxoutWithDropout}. The default is \texttt{Rectifier}. Refer to the {\textbf{\nameref{sssec:ActivationLoss}}} and {\textbf{\nameref{ssec:Regularization}}} sections for more details.