This repository contains the official PyTorch implementation of the paper "BEV-DAR: Enhancing Monocular Bird's Eye View Segmentation with Depth-Aware Rasterization".
To use our code, please install the following dependencies:
- torch==1.9.1
- torchvision==0.10.1
- mmcv-full==1.3.15
- CUDA 9.2+
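As a quick sanity check after installation, the following sketch prints the installed versions and confirms that CUDA is visible; the expected versions are the ones listed above.

```python
# Sanity-check the environment (a minimal sketch; adjust to your own setup).
import torch
import torchvision
import mmcv

print("torch:", torch.__version__)              # expected 1.9.1
print("torchvision:", torchvision.__version__)  # expected 0.10.1
print("mmcv-full:", mmcv.__version__)           # expected 1.3.15
print("CUDA available:", torch.cuda.is_available(), "CUDA:", torch.version.cuda)  # CUDA 9.2+
```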
We conduct experiments on nuScenes, Argoverse, and KITTI. Please download the datasets and place them under /data/nuscenes/ and so on. Note that calib.json contains the intrinsic and extrinsic matrices of every image. Refer to the make_labels script to generate the BEV annotations for nuScenes and Argoverse, respectively. The dataset structure looks like:
```
data
├── nuscenes
│   ├── img_dir
│   ├── ann_bev_dir
│   └── calib.json
├── argoversev1.0
│   ├── img_dir
│   ├── ann_bev_dir
│   └── calib.json
└── kitti_processed
    ├── kitti_raw
    │   ├── img_dir
    │   ├── ann_bev_dir
    │   └── calib.json
    ├── kitti_odometry
    │   ├── img_dir
    │   ├── ann_bev_dir
    │   └── calib.json
    └── kitti_object
        ├── img_dir
        ├── ann_bev_dir
        └── calib.json
```
### Prepare calib.json
"calib.json" contains the camera parameters of each image. Readers can generate the "calib.json" file by the instruction of [nuScenes](https://www.nuscenes.org/nuscenes#download), [Argoverse](https://www.argoverse.org/), [Kitti Raw](http://www.cvlibs.net/datasets/kitti/raw_data.php), [Kitti Odometry](http://www.cvlibs.net/datasets/kitti/eval_odometry.php), and [Kitti 3D Object](http://www.cvlibs.net/datasets/kitti/eval_3dobject.php). We also upload *calib.json* for each dataset to [google drive](https://drive.google.com/drive/folders/1Ahaed1OsA1EqlJOCHHN-MQQr2VpF8H7U?usp=sharing) and [Baidu Net Disk](https://pan.baidu.com/s/1wEzHWkazS5vLPZJVjpzHMw?pwd=2022).
## Training
Take nuScenes as an example. To train a semantic segmentation model under a specific configuration, run:
```bash
cd DAR
python -m torch.distributed.launch --nproc_per_node 1 --master_port 14300 tools/train.py ./configs/DAR/DAR_nuscenes.py --work-dir ./models_dir/DAR_nuscenes --launcher pytorch
```
To evaluate the performance, run the following command:
```bash
cd DAR
python -m torch.distributed.launch --nproc_per_node ${NUM_GPU} --master_port ${PORT} tools/test.py ${CONFIG} ${MODEL_PATH} --out ${SAVE_RESULT_PATH} --eval ${METRIC} --launcher pytorch
```
The trained weights for nuScenes are available at https://pan.baidu.com/s/1v6yox2KypJ6Sx9bd53IM1Q?pwd=h1g4
For example, we evaluate the mIoU on nuScenes by running:
```bash
cd DAR
python -m torch.distributed.launch --nproc_per_node 1 --master_port 14300 tools/test.py ./configs/DAR/DAR_nuscenes.py ./models_dir/DAR_nuscenes/iter_90000.pth --out ./results/DAR_nuscenes/DAR_nuscenes.pkl --eval mIoU --launcher pytorch
```
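The --out argument stores the raw test outputs as a pickle file, which can be reloaded for offline analysis. The sketch below uses mmcv.load to read the .pkl file; the exact structure of the stored results depends on the test config, so inspect it before further processing.

```python
# Reload the saved test outputs for offline inspection (a minimal sketch).
import mmcv

results = mmcv.load("./results/DAR_nuscenes/DAR_nuscenes.pkl")
print(type(results), len(results))
```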
To obtain the visualization results of the model, first change the output_type from 'iou' to 'seg' in the test config:
```python
# change the output_type from 'iou' to 'seg'
test_cfg=dict(mode='whole', output_type='seg', positive_thred=0.5)
)
```
Then generate the visualization results by running the following command:
```bash
python -m torch.distributed.launch --nproc_per_node 4 --master_port 14300 tools/test.py ./configs/DAR/DAR_nuscenes.py ./models_dir/DAR_nuscenes/iter_20000.pth --format-only --eval-options "imgfile_prefix=./models_dir/pyva_swin_argoverse" --launcher pytorch
```
Our work is partially based on mmseg and HFT. Thanks for their contributions to the research community.