
Commit
v1.4, rm unused code and codepaths
PiperOrigin-RevId: 179822701
Ryan Sepassi committed Dec 21, 2017
1 parent 4354f3b commit bac1321
Showing 27 changed files with 723 additions and 1,898 deletions.
4 changes: 2 additions & 2 deletions .travis.yml
@@ -14,9 +14,9 @@ env:
- T2T_DATA_DIR=/tmp/t2t-data
- T2T_TRAIN_DIR=/tmp/t2t-train
script:
- pytest --ignore=tensor2tensor/utils/registry_test.py --ignore=tensor2tensor/utils/trainer_utils_test.py --ignore=tensor2tensor/problems_test.py --ignore=tensor2tensor/tpu/tpu_trainer_lib_test.py
- pytest --ignore=tensor2tensor/utils/registry_test.py --ignore=tensor2tensor/problems_test.py --ignore=tensor2tensor/tpu/tpu_trainer_lib_test.py
- pytest tensor2tensor/utils/registry_test.py
- pytest tensor2tensor/utils/trainer_utils_test.py
- pytest tensor2tensor/tpu/tpu_trainer_lib_test.py
- t2t-datagen 2>&1 | grep translate && echo passed
- python -c "from tensor2tensor.models import transformer; print(transformer.Transformer.__name__)"
- t2t-trainer --registry_help
48 changes: 28 additions & 20 deletions docs/cloud_tpu.md
@@ -3,15 +3,19 @@
Tensor2Tensor supports running on Google Cloud Platform's TPUs, chips specialized
for ML training.

Not all models are supported but we've tested so far with Transformer (sequence
model) as well as Xception (image model).
Models and hparams that are known to work on TPU:
* `transformer` with `transformer_tpu`
* `xception` with `xception_base`
* `resnet50` with `resnet_base`
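
These names are registry keys; each pair maps onto the `--model` and `--hparams_set` flags used in the tutorial command below. A quick way to see everything registered (this command also appears in the repository's CI config) is:
```
# Lists all registered models, hparams sets, and problems
t2t-trainer --registry_help
```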

To run on TPUs, you need to be part of the alpha program; if you're not, these
commands won't work for you currently, but access will expand soon, so get
excited for your future ML supercomputers in the cloud.

## Tutorial: Transformer En-De translation on TPU

Update `gcloud`: `gcloud components update`

Set your default zone to a TPU-enabled zone. TPU machines are only available in
certain zones for now.
```
@@ -40,29 +44,32 @@ gcloud alpha compute tpus create \
To see all TPU instances running: `gcloud alpha compute tpus list`. The
`TPU_IP` should be unique amongst the list and follow the format `10.240.i.2`.
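
As an illustration only (the address below is a placeholder, not output from this commit), the IP from that list can be captured in a shell variable for the later steps:
```
# Placeholder address; substitute the IP reported by `gcloud alpha compute tpus list`
TPU_IP=10.240.1.2
```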

Generate data to GCS
If you already have the data locally, use `gsutil cp` to cp to GCS.
SSH in with port forwarding for TensorBoard
```
DATA_DIR=gs://my-bucket/t2t/data/
t2t-datagen --problem=translate_ende_wmt8k --data_dir=$DATA_DIR
gcloud compute ssh $USER-vm -- -L 6006:localhost:6006
```

SSH in with port forwarding for TensorBoard
Now that you're on the cloud instance, install T2T:
```
gcloud compute ssh $USER-vm -L 6006:localhost:6006
pip install tensor2tensor --user
# If your python bin dir isn't already in your path
export PATH=$HOME/.local/bin:$PATH
```

Now that you're on the cloud instance, install T2T:
Generate data to GCS
If you already have the data, use `gsutil cp` to copy to GCS.
```
pip install tensor2tensor
GCS_BUCKET=gs://my-bucket
DATA_DIR=$GCS_BUCKET/t2t/data/
t2t-datagen --problem=translate_ende_wmt8k --data_dir=$DATA_DIR
```
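
If the data was already generated locally, copying it to the bucket with `gsutil` might look like the sketch below (the local path is an assumed example):
```
# Assumed local path; adjust to wherever the data was generated
gsutil -m cp -r /tmp/t2t-data/* $GCS_BUCKET/t2t/data/
```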

Set up some vars used below. `TPU_IP` and `DATA_DIR` should be the same as what
was used above. Note that the `DATA_DIR` and `OUT_DIR` must be GCS buckets.
```
TPU_IP=<IP of TPU machine>
DATA_DIR=gs://my-bucket/t2t/data/
OUT_DIR=gs://my-bucket/t2t/training/
DATA_DIR=$GCS_BUCKET/t2t/data/
OUT_DIR=$GCS_BUCKET/t2t/training/
TPU_MASTER=grpc://$TPU_IP:8470
```
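
An optional sanity check (assuming `gsutil` is available on the VM) that the generated data actually landed in the bucket:
```
# Should list the files produced by t2t-datagen above
gsutil ls $DATA_DIR
```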

@@ -73,25 +80,26 @@ tensorboard --logdir=$OUT_DIR > /tmp/tensorboard_logs.txt 2>&1 &

Train and evaluate.
```
t2t-tpu-trainer \
--master=$TPU_MASTER \
--data_dir=$DATA_DIR \
--output_dir=$OUT_DIR \
--problems=translate_ende_wmt8k \
t2t-trainer \
--model=transformer \
--hparams_set=transformer_tiny_tpu \
--hparams_set=transformer_tpu \
--problems=translate_ende_wmt8k \
--train_steps=10 \
--eval_steps=10 \
--local_eval_frequency=10 \
--iterations_per_loop=10
--iterations_per_loop=10 \
--master=$TPU_MASTER \
--use_tpu=True \
--data_dir=$DATA_DIR \
--output_dir=$OUT_DIR
```

The above command will train for 10 steps, then evaluate for 10 steps. You can
(and should) increase the number of total training steps with the
`--train_steps` flag. Evaluation will happen every `--local_eval_frequency`
steps, each time for `--eval_steps`. When you increase the number of training
steps, also increase `--iterations_per_loop`, which controls how frequently the
TPU machine returns control to the Python code (1000 seems like a fine number).
TPU machine returns control to the host machine (1000 seems like a fine number).
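
For a longer run, the same command with larger values might look like the sketch below (the step counts are illustrative placeholders, not values from this commit):
```
t2t-trainer \
  --model=transformer \
  --hparams_set=transformer_tpu \
  --problems=translate_ende_wmt8k \
  --train_steps=250000 \
  --eval_steps=10 \
  --local_eval_frequency=1000 \
  --iterations_per_loop=1000 \
  --master=$TPU_MASTER \
  --use_tpu=True \
  --data_dir=$DATA_DIR \
  --output_dir=$OUT_DIR
```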

Back on your local machine, open your browser and navigate to `localhost:6006`
for TensorBoard.
197 changes: 0 additions & 197 deletions docs/example_life.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/index.md
@@ -24,6 +24,6 @@ documentation, from basic tutorials to full code documentation.

## Deep Dive

* [Life of an Example](example_life.md): how all parts of T2T are connected and
* [System Overview](overview.md): how all parts of T2T are connected and
work together
* [Distributed Training](distributed_training.md)
