[Improve] Harmless changes (open-mmlab#284)
* harmless

* remove (object)

* minor

* update apis

* f-string

* update README

* polish docstring

* update onnx unittest

* fix

* add repr

* add repr_str

* fix

* polish docstring
dreamerlin authored Oct 29, 2020
1 parent d6f76cd commit e0692ad
Showing 57 changed files with 344 additions and 219 deletions.
6 changes: 5 additions & 1 deletion README.md
@@ -71,14 +71,18 @@ Details can be found in [benchmark](docs/benchmark.md).
Supported methods for action recognition:
- [x] [TSN](configs/recognition/tsn/README.md)
- [x] [TSM](configs/recognition/tsm/README.md)
- [x] [TSM Non-Local](configs/recognition/i3d)
- [x] [R(2+1)D](configs/recognition/r2plus1d/README.md)
- [x] [I3D](configs/recognition/i3d/README.md)
- [x] [I3D Non-Local](configs/recognition/i3d/README.md)
- [x] [SlowOnly](configs/recognition/slowonly/README.md)
- [x] [SlowFast](configs/recognition/slowfast/README.md)
- [x] [CSN](configs/recognition/csn/README.md)
- [x] [TIN](configs/recognition/tin/README.md)
- [x] [TPN](configs/recognition/tpn/README.md)
- [x] [C3D](configs/recognition/c3d/README.md)
- [x] [OmniSource](configs/recognition/omnisource/README.md)
- [x] [MultiModality: Audio](configs/recognition_audio/resnet/README.md)

Supported methods for action localization:
- [x] [BMN](configs/localization/bmn/README.md)
@@ -100,7 +104,7 @@ The supported datasets are listed in [supported_datasets.md](docs/supported_data
## Get Started

Please see [getting_started.md](docs/getting_started.md) for the basic usage of MMAction2.
There are also tutorials for [finetuning models](docs/tutorials/finetune.md), [adding new dataset](docs/tutorials/new_dataset.md), [designing data pipeline](docs/tutorials/data_pipeline.md), [exporting model to onnx](docs/tutorials/export_model.md) and [adding new modules](docs/tutorials/new_modules.md).
There are also tutorials for [finetuning models](docs/tutorials/finetune.md), [adding new dataset](docs/tutorials/new_dataset.md), [designing data pipeline](docs/tutorials/data_pipeline.md), [exporting model to onnx](docs/tutorials/export_model.md), [customizing runtime settings](docs/tutorials/customize_runtime.md) and [adding new modules](docs/tutorials/new_modules.md).

A Colab tutorial is also provided. You may preview the notebook [here](demo/mmaction2_tutorial.ipynb) or directly [run](https://colab.research.google.com/github/open-mmlab/mmaction2/blob/master/demo/mmaction2_tutorial.ipynb) on Colab.

2 changes: 1 addition & 1 deletion configs/localization/bsn/README.md
@@ -44,7 +44,7 @@ Examples:
```

2. train BSN(PEM) on PGM results.
```python
```shell
python tools/train.py configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py
```

2 changes: 1 addition & 1 deletion configs/recognition/csn/README.md
@@ -31,7 +31,7 @@ doi = {10.1109/ICCV.2019.00565}

Notes:

1. The **gpus** indicates the number of gpu (32G V100) we used to get the checkpoint. It is noteworthy that the configs we provide are used for 8x4 gpus as default.
1. The **gpus** indicates the number of gpu (32G V100) we used to get the checkpoint. It is noteworthy that the configs we provide are used for 8 gpus as default.
According to the [Linear Scaling Rule](https://arxiv.org/abs/1706.02677), you may set the learning rate proportional to the batch size if you use different GPUs or videos per GPU,
e.g., lr=0.01 for 4 GPUs * 2 video/gpu and lr=0.08 for 16 GPUs * 4 video/gpu.
2. The **inference_time** is got by this [benchmark script](/tools/analysis/benchmark.py), where we use the sampling frames strategy of the test setting and only care about the model inference time,
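The linear scaling rule in note 1 can be made concrete with a few lines of Python. This is an editor's sketch, not part of the commit; the helper name and its arguments are made up for illustration.

```python
def scale_lr(base_lr, base_gpus, base_videos_per_gpu, gpus, videos_per_gpu):
    """Scale the learning rate proportionally to the total batch size."""
    return base_lr * (gpus * videos_per_gpu) / (base_gpus * base_videos_per_gpu)

# Reproduces the example above: lr=0.01 for 4 GPUs * 2 videos/gpu
# scales to lr=0.08 for 16 GPUs * 4 videos/gpu.
print(scale_lr(0.01, 4, 2, 16, 4))  # 0.08
```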
@@ -95,7 +95,8 @@
pipeline=test_pipeline))
# optimizer
optimizer = dict(
type='SGD', lr=0.0005, momentum=0.9, weight_decay=0.0001) # 0.0005 for 32g
type='SGD', lr=0.000125, momentum=0.9,
weight_decay=0.0001) # this lr is used for 8 gpus
optimizer_config = dict(grad_clip=dict(max_norm=40, norm_type=2))
# learning policy
lr_config = dict(
@@ -94,7 +94,8 @@
pipeline=test_pipeline))
# optimizer
optimizer = dict(
type='SGD', lr=0.0005, momentum=0.9, weight_decay=0.0001) # 0.0005 for 32g
type='SGD', lr=0.000125, momentum=0.9,
weight_decay=0.0001) # this lr is used for 8 gpus
optimizer_config = dict(grad_clip=dict(max_norm=40, norm_type=2))
# learning policy
lr_config = dict(
27 changes: 22 additions & 5 deletions docs/api.rst
@@ -1,6 +1,11 @@
API Reference
=============

mmaction.apis
-------------
.. automodule:: mmaction.apis
:members:

mmaction.core
-------------

@@ -19,10 +24,22 @@ lr
.. automodule:: mmaction.core.lr
:members:

mmaction.localization
---------------------

localization
^^^^^^^^^^^^
.. automodule:: mmaction.localization
:members:

mmaction.models
---------------

models
^^^^^^
.. automodule:: mmaction.models
:members:

recognizers
^^^^^^^^^^^
.. automodule:: mmaction.models.recognizers
@@ -48,12 +65,16 @@ heads
.. automodule:: mmaction.models.heads
:members:

necks
^^^^^
.. automodule:: mmaction.models.necks
:members:

losses
^^^^^^
.. automodule:: mmaction.models.losses
:members:


mmaction.datasets
-----------------

@@ -62,25 +83,21 @@ datasets
.. automodule:: mmaction.datasets
:members:


pipelines
^^^^^^^^^
.. automodule:: mmaction.datasets.pipelines
:members:


samplers
^^^^^^^^
.. automodule:: mmaction.datasets.samplers
:members:


mmaction.utils
--------------
.. automodule:: mmaction.utils
:members:


mmaction.localization
---------------------
.. automodule:: mmaction.localization
9 changes: 5 additions & 4 deletions docs/getting_started.md
@@ -9,10 +9,10 @@ It is recommended to symlink the dataset root to `$MMACTION2/data`.
If your folder structure is different, you may need to change the corresponding paths in config files.

```
mmaction
mmaction2
├── mmaction
├── tools
├── config
├── configs
├── data
│ ├── kinetics400
│ │ ├── rawframes_train
@@ -24,7 +24,7 @@ mmaction
│ │ ├── rawframes_val
│ │ ├── ucf101_train_list.txt
│ │ ├── ucf101_val_list.txt
│ ├── ...
```
For more information on data preparation, please see [data_preparation.md](data_preparation.md)

@@ -644,4 +644,5 @@ python tools/analysis/eval_metric.py ${CONFIG_FILE} ${RESULT_FILE} [--eval ${EVA
## Tutorials
Currently, we provide some tutorials for users to [finetune model](tutorials/finetune.md),
[add new dataset](tutorials/new_dataset.md), [add new modules](tutorials/new_modules.md).
[add new dataset](tutorials/new_dataset.md), [customize data pipelines](tutorials/data_pipeline.md),
[add new modules](tutorials/new_modules.md), [export a model to ONNX](tutorials/export_model.md) and [customize runtime settings](tutorials/customize_runtime.md).
2 changes: 1 addition & 1 deletion docs/tutorials/customize_runtime.md
@@ -96,7 +96,7 @@ from .my_optimizer import MyOptimizer


@OPTIMIZER_BUILDERS.register_module()
class MyOptimizerConstructor(object):
class MyOptimizerConstructor:

def __init__(self, optimizer_cfg, paramwise_cfg=None):
pass
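For context, a registered optimizer constructor is normally referenced by name from the config so that MMCV's `build_optimizer` picks it up. The snippet below is an editor's sketch under that assumption; `MyOptimizer` and the keyword arguments are hypothetical.

```python
# Hypothetical config fragment: the 'constructor' key selects the registered
# MyOptimizerConstructor, and 'paramwise_cfg' is forwarded to it.
optimizer = dict(
    constructor='MyOptimizerConstructor',
    type='MyOptimizer',
    lr=0.01,
    paramwise_cfg=dict())
```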
2 changes: 1 addition & 1 deletion docs/tutorials/data_pipeline.md
@@ -1,4 +1,4 @@
# Tutorial 3: Custom Data Pipelines
# Tutorial 3: Customize Data Pipelines

## Design of Data Pipelines

4 changes: 2 additions & 2 deletions mmaction/apis/inference.py
@@ -20,8 +20,8 @@ def init_recognizer(config,
Args:
config (str | :obj:`mmcv.Config`): Config file path or the config
object.
checkpoint (str, optional): Checkpoint path. If left as None, the model
will not load any weights. Default: None.
checkpoint (str | None, optional): Checkpoint path. If left as None,
the model will not load any weights. Default: None.
device (str | :obj:`torch.device`): The desired device of returned
tensor. Default: 'cuda:0'.
use_frames (bool): Whether to use rawframes as input. Default:False.
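A brief usage sketch of the behaviour described in this docstring (editor's addition; the config path is just an example of a standard TSN config):

```python
from mmaction.apis import init_recognizer

config = 'configs/recognition/tsn/tsn_r50_1x1x3_100e_kinetics400_rgb.py'

# With checkpoint=None the recognizer is built but no weights are loaded.
model = init_recognizer(config, checkpoint=None, device='cpu')

# Passing a checkpoint path or URL loads pretrained weights instead.
# model = init_recognizer(config, 'path/to/checkpoint.pth', device='cuda:0')
```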
20 changes: 10 additions & 10 deletions mmaction/core/evaluation/accuracy.py
@@ -201,9 +201,9 @@ def pairwise_temporal_iou(candidate_segments, target_segments):
Args:
candidate_segments (np.ndarray): 1-dim/2-dim array in format
[init, end]/[m x 2:=[init, end]].
``[init, end]/[m x 2:=[init, end]]``.
target_segments (np.ndarray): 2-dim array in format
[n x 2:=[init, end]].
``[n x 2:=[init, end]]``.
Returns:
t_iou (np.ndarray): 1-dim array [n] /
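To make the segment formats above concrete, here is a small worked example (editor's addition; the segment boundaries are made up):

```python
import numpy as np
from mmaction.core.evaluation.accuracy import pairwise_temporal_iou

candidate = np.array([2.0, 8.0])               # 1-dim [init, end]
targets = np.array([[1.0, 5.0], [6.0, 10.0]])  # n x 2 := [init, end]

# [2, 8] vs [1, 5]:  overlap 3 / union 7 ~ 0.43
# [2, 8] vs [6, 10]: overlap 2 / union 8 = 0.25
print(pairwise_temporal_iou(candidate, targets))
```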
@@ -255,7 +255,7 @@ def average_recall_at_avg_proposals(ground_truth,
max_avg_proposals (int | None): Max number of proposals for one video.
Default: None.
temporal_iou_thresholds (np.ndarray): 1D array with temporal_iou
thresholds. Default: np.linspace(0.5, 0.95, 10).
thresholds. Default: ``np.linspace(0.5, 0.95, 10)``.
Returns:
tuple([np.ndarray, np.ndarray, np.ndarray, float]):
@@ -266,7 +266,7 @@ over a list of temporal_iou threshold (1D array). This is
over a list of temporal_iou threshold (1D array). This is
equivalent to ``recall.mean(axis=0)``. The ``proposals_per_video``
is the average number of proposals per video. The auc is the area
under AR@AN curve.
under ``AR@AN`` curve.
"""

total_num_videos = len(ground_truth)
@@ -375,7 +375,7 @@ def get_weighted_score(score_list, coeff_list):
n(number of predictions) X num_samples X num_classes
coeff_list (list[float]): List of coefficients, with shape n.
Return:
Returns:
list[np.ndarray]: List of weighted scores.
"""
assert len(score_list) == len(coeff_list)
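The weighted fusion documented here reduces to a coefficient-weighted sum over per-model score arrays. A minimal sketch of the idea (editor's addition; shapes and coefficients are illustrative, and the library function itself returns a list of weighted scores as the docstring states):

```python
import numpy as np

# Two sets of predictions (n = 2) for 3 samples over 4 classes.
score_list = [np.random.rand(3, 4), np.random.rand(3, 4)]
coeff_list = [1.0, 0.5]

fused = sum(c * s for c, s in zip(coeff_list, score_list))
print(fused.shape)  # (3, 4)
```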
@@ -426,12 +426,12 @@ def average_precision_at_temporal_iou(ground_truth,
Args:
ground_truth (dict): Dict containing the ground truth instances.
Key: 'video_id'
Value (np.ndarry): 1D array of 't-start' and 't-end'.
proposals (np.ndarray): 2D array containing the information of proposal
instances, including 'video_id', 'class_id', 't-start', 't-end' and
'score'.
Value (np.ndarray): 1D array of 't-start' and 't-end'.
prediction (np.ndarray): 2D array containing the information of
proposal instances, including 'video_id', 'class_id', 't-start',
't-end' and 'score'.
temporal_iou_thresholds (np.ndarray): 1D array with temporal_iou
thresholds. Default: np.linspace(0.5, 0.95, 10).
thresholds. Default: ``np.linspace(0.5, 0.95, 10)``.
Returns:
np.ndarray: 1D array of average precision score.
13 changes: 7 additions & 6 deletions mmaction/core/evaluation/eval_detection.py
@@ -1,19 +1,20 @@
import json

import numpy as np
from mmcv.utils import get_logger, print_log
from mmcv.utils import print_log

from ...utils import get_root_logger
from .accuracy import interpolated_precision_recall, pairwise_temporal_iou


class ActivityNetDetection(object):
class ActivityNetDetection:
"""Class to evaluate detection results on ActivityNet.
Args:
ground_truth_filename (str): The filename of groundtruth.
Default: None.
prediction_filename (str): The filename of action detection results.
ground_truth_filename (str | None): The filename of groundtruth.
Default: None.
prediction_filename (str | None): The filename of action detection
results. Default: None.
tiou_thresholds (np.ndarray): The thresholds of temporal iou to
evaluate. Default: ``np.linspace(0.5, 0.95, 10)``.
verbose (bool): Whether to print verbose logs. Default: False.
@@ -33,7 +34,7 @@ def __init__(self,
self.tiou_thresholds = tiou_thresholds
self.verbose = verbose
self.ap = None
self.logger = get_logger()
self.logger = get_root_logger()
# Import ground truth and predictions.
self.ground_truth, self.activity_index = self._import_ground_truth(
ground_truth_filename)
2 changes: 1 addition & 1 deletion mmaction/datasets/activitynet_dataset.py
@@ -65,7 +65,7 @@ class ActivityNetDataset(BaseDataset):
Args:
ann_file (str): Path to the annotation file.
pipeline (list[dict | callable]): A sequence of data transforms.
data_prefix (str): Path to a directory where videos are held.
data_prefix (str | None): Path to a directory where videos are held.
Default: None.
test_mode (bool): Store True when building test or validation dataset.
Default: False.
4 changes: 2 additions & 2 deletions mmaction/datasets/audio_dataset.py
@@ -75,7 +75,7 @@ def evaluate(self,
metrics='top_k_accuracy',
topk=(1, 5),
logger=None):
"""Evaluation in rawframe dataset.
"""Evaluation in audio dataset.
Args:
results (list): Output results.
@@ -87,7 +87,7 @@ def evaluate(self,
logger (logging.Logger | None): Logger for recording.
Default: None.
Return:
Returns:
dict: Evaluation results dict.
"""
if not isinstance(results, list):
4 changes: 2 additions & 2 deletions mmaction/datasets/audio_feature_dataset.py
@@ -76,7 +76,7 @@ def evaluate(self,
metrics='top_k_accuracy',
topk=(1, 5),
logger=None):
"""Evaluation in rawframe dataset.
"""Evaluation in audio feature dataset.
Args:
results (list): Output results.
@@ -88,7 +88,7 @@ def evaluate(self,
logger (logging.Logger | None): Logger for recording.
Default: None.
Return:
Returns:
dict: Evaluation results dict.
"""
if not isinstance(results, list):
4 changes: 2 additions & 2 deletions mmaction/datasets/base.py
@@ -23,13 +23,13 @@ class BaseDataset(Dataset, metaclass=ABCMeta):
Args:
ann_file (str): Path to the annotation file.
pipeline (list[dict | callable]): A sequence of data transforms.
data_prefix (str): Path to a directory where videos are held.
data_prefix (str | None): Path to a directory where videos are held.
Default: None.
test_mode (bool): Store True when building test or validation dataset.
Default: False.
multi_class (bool): Determines whether the dataset is a multi-class
dataset. Default: False.
num_classes (int): Number of classes of the dataset, used in
num_classes (int | None): Number of classes of the dataset, used in
multi-class datasets. Default: None.
start_index (int): Specify a start index for frames in consideration of
different filename format. However, when taking videos as input,
2 changes: 1 addition & 1 deletion mmaction/datasets/builder.py
@@ -26,7 +26,7 @@ def build_dataset(cfg, default_args=None):
Args:
cfg (dict): Config dict. It should at least contain the key "type".
default_args (dict, optional): Default initialization arguments.
default_args (dict | None, optional): Default initialization arguments.
Default: None.
Returns:
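A hedged usage sketch of `build_dataset` (editor's addition; the dataset type is a real registered class, but the annotation file, data prefix, and empty pipeline are placeholders):

```python
from mmaction.datasets import build_dataset

dataset = build_dataset(
    dict(
        type='VideoDataset',
        ann_file='data/kinetics400/kinetics400_val_list_videos.txt',
        data_prefix='data/kinetics400/videos_val',
        pipeline=[]),
    default_args=dict(test_mode=True))
```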
2 changes: 1 addition & 1 deletion mmaction/datasets/dataset_wrappers.py
@@ -2,7 +2,7 @@


@DATASETS.register_module()
class RepeatDataset(object):
class RepeatDataset:
"""A wrapper of repeated dataset.
The length of repeated dataset will be ``times`` larger than the original
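The wrapper's behaviour can be summarised in a few lines; this is an editor's paraphrase of the idea, not the actual implementation:

```python
class RepeatDatasetSketch:
    """Repeat `dataset` by indexing back into it `times` times."""

    def __init__(self, dataset, times):
        self.dataset = dataset
        self.times = times

    def __getitem__(self, idx):
        return self.dataset[idx % len(self.dataset)]

    def __len__(self):
        return self.times * len(self.dataset)
```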
6 changes: 3 additions & 3 deletions mmaction/datasets/hvu_dataset.py
@@ -56,8 +56,8 @@ class HVUDataset(BaseDataset):
pipeline (list[dict | callable]): A sequence of data transforms.
tag_categories (list[str]): List of category names of tags.
tag_category_nums (list[int]): List of number of tags in each category.
filename_tmpl: Template for each filename. `filename_tmpl is None`
indicates video dataset is used. Default: None.
filename_tmpl (str | None): Template for each filename. If set to None,
video dataset is used. Default: None.
**kwargs: Keyword arguments for ``BaseDataset``.
"""

@@ -135,7 +135,7 @@ def evaluate(self, results, metrics='mean_average_precision', logger=None):
logger (logging.Logger | None): Logger for recording.
Default: None.
Return:
Returns:
dict: Evaluation results dict.
"""
if not isinstance(results, list):
