Updated docs
Thilina Rajapakse committed Jul 24, 2021
1 parent 7d8ee7a commit 5aa7921
Showing 3 changed files with 40 additions and 42 deletions.
4 changes: 2 additions & 2 deletions CHANGELOG.md
@@ -5,11 +5,11 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).


-## [0.61.12] - 2021-07-24
+## [0.61.13] - 2021-07-24

### Added

-- Pretraining and finetuning BigBird [whr778](https://github.com/whr778)
+- Pretraining and finetuning BigBird and XLMRoBERTa LMs [whr778](https://github.com/whr778)
## [0.61.10] - 2021-07-13

### Added
76 changes: 37 additions & 39 deletions docs/_docs/20-lm-specifics.md
@@ -2,10 +2,9 @@
title: Language Modeling Specifics
permalink: /docs/lm-specifics/
excerpt: "Specific notes for Language Modeling tasks."
-last_modified_at: 2020/12/08 00:05:36
+last_modified_at: 2021/07/24 13:16:18
toc: true
---

The idea of (probabilistic) language modeling is to estimate the probability of a sentence (or sequence of words). This can be used to find the probability distribution over the next word in a sequence, or over the possible words at a given (masked) position.
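
For illustration (this example is not part of the commit's own changes), the masked-position case can be sketched with the Hugging Face `transformers` fill-mask pipeline, which Simple Transformers builds on; the model name and sentence below are arbitrary placeholders.

```python
# Illustrative sketch only: querying masked-position probabilities with the
# Hugging Face transformers fill-mask pipeline. The model name and sentence
# are arbitrary placeholders.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-cased")
for prediction in fill_mask("Language modeling predicts the [MASK] word."):
    print(prediction["token_str"], prediction["score"])
```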

The commonly used *pre-training* strategies reflect this idea. For example:
@@ -44,26 +43,26 @@ The process of performing Language Modeling in Simple Transformers follows the [
2. Train the model with `train_model()`
3. Evaluate the model with `eval_model()`
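
A minimal sketch of these three steps, assuming placeholder training and evaluation files (`data/train.txt`, `data/test.txt`) and an arbitrary base model:

```python
# A minimal sketch of the create/train/evaluate workflow; the model choice
# and file paths are placeholders, not taken from this commit.
from simpletransformers.language_modeling import LanguageModelingModel

model = LanguageModelingModel("bert", "bert-base-cased")  # 1. create the model
model.train_model("data/train.txt")                       # 2. train
result = model.eval_model("data/test.txt")                # 3. evaluate
print(result)
```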


## Supported Model Types

New model types are regularly added to the library. Language Modeling currently supports the model types given below.

| Model | Model code for `LanguageModelingModel` |
| ---------- | -------------------------------------- |
| BERT | bert |
| BigBird | bigbird |
| CamemBERT | camembert |
| DistilBERT | distilbert |
| ELECTRA | electra |
| GPT-2 | gpt2 |
| Longformer | longformer |
| OpenAI GPT | openai-gpt |
| RoBERTa | roberta |
| XLMRoBERTa | xlmroberta |

**Tip:** The model code is used to specify the `model_type` in a Simple Transformers model.
{: .notice--success}


## ELECTRA Models

The ELECTRA model consists of a generator model and a discriminator model.
@@ -76,45 +75,44 @@ You can configure an ELECTRA model in several ways by using the options below.
- To load a saved ELECTRA model, you can provide the path to the save files as `model_name`.
- However, the pre-trained ELECTRA models made public by Google are available as separate generator and discriminator models. When starting from these models (Language Model fine-tuning), set `model_name` to `electra` and provide the pre-trained models as `generator_name` and `discriminator_name`. These two parameters can also be used to load locally saved generator and/or discriminator models.

```python
from simpletransformers.language_modeling import LanguageModelingModel

model = LanguageModelingModel(
    "electra",
    "electra",
    generator_name="outputs/generator_model",
    discriminator_name="outputs/discriminator_model",
)
```
- When training an ELECTRA language model from scratch, you can define the architecture by using the `generator_config` and `discriminator_config` in the `args` dict. The [default values](https://huggingface.co/transformers/model_doc/electra.html#electraconfig) will be used for any config parameters that aren't specified.

```python
from simpletransformers.language_modeling import LanguageModelingModel

model_args = {
    "vocab_size": 52000,
    "generator_config": {
        "embedding_size": 128,
        "hidden_size": 256,
        "num_hidden_layers": 3,
    },
    "discriminator_config": {
        "embedding_size": 128,
        "hidden_size": 256,
    },
}

train_file = "data/train_all.txt"

model = LanguageModelingModel(
    "electra",
    None,
    args=model_args,
    train_files=train_file,
)
```

Refer to the [Language Modeling Minimal Start](/docs/lm-minimal-start/) for full (minimal) examples.


### Saving ELECTRA models

When using ELECTRA models for downstream tasks, the ELECTRA developers recommend using the discriminator model only. Because of this, Simple Transformers will save the generator and discriminator models separately at the end of training. The discriminator model can then be used for downstream tasks.
@@ -139,7 +137,6 @@ classification_model = ClassificationModel("electra", "outputs/checkpoint-1-epoc
**Note:** Both the `save_discriminator()` and `save_generator()` methods take an optional `output_dir` argument that specifies where the model should be saved.
{: .notice--info}
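
As an illustrative sketch (assuming `model` is the trained ELECTRA `LanguageModelingModel` from the examples above, with placeholder output directories):

```python
# Illustrative sketch: `model` is assumed to be a trained ELECTRA
# LanguageModelingModel; the output directories are placeholders.
model.save_discriminator("outputs/discriminator_model")
model.save_generator("outputs/generator_model")

# The saved discriminator can then back a downstream task.
from simpletransformers.classification import ClassificationModel

classification_model = ClassificationModel("electra", "outputs/discriminator_model")
```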


## Distributed Training

Simple Transformers supports distributed language model training.
@@ -148,6 +145,7 @@ Simple Transformers supports distributed language model training.
{: .notice--success}

You can launch distributed training as shown below.

```bash
python -m torch.distributed.launch --nproc_per_node=4 train_new_lm.py
```
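
The `train_new_lm.py` script itself is not part of this commit; a hypothetical sketch of such a script, assuming the `--local_rank` argument that `torch.distributed.launch` passes to each process is forwarded through the library's `local_rank` model argument, might look like this:

```python
# train_new_lm.py -- a hypothetical sketch of the launched script; the data
# path and base model are placeholders. torch.distributed.launch passes
# --local_rank to every spawned process, and the value is forwarded to the
# model through its args.
import argparse

from simpletransformers.language_modeling import LanguageModelingModel

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=-1)
cli_args = parser.parse_args()

model = LanguageModelingModel(
    "bert",
    "bert-base-cased",
    args={"local_rank": cli_args.local_rank},
)
model.train_model("data/train.txt")
```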
2 changes: 1 addition & 1 deletion setup.py
@@ -5,7 +5,7 @@

setup(
name="simpletransformers",
-version="0.61.12",
+version="0.61.13",
author="Thilina Rajapakse",
author_email="[email protected]",
description="An easy-to-use wrapper library for the Transformers library.",
