Table of contents
  1. Installation
  2. Dataset
  3. Training
  4. Inference
  5. Acknowledgments

Detecting Omissions in Geographic Maps through Computer Vision (MAPR'24)

by Phuc Nguyen, Anh Do, and Minh Hoai

Abstract: This paper explores the application of computer vision technologies to the analysis of maps, an area with substantial historical, cultural, and political significance. Our focus is on developing and evaluating a method for automatically identifying maps that depict specific regions and feature landmarks with designated names, a task that involves complex challenges due to the diverse styles and methods used in map creation. We address three main subtasks: differentiating maps from non-maps, verifying the accuracy of the region depicted, and confirming the presence or absence of particular landmark names through advanced text recognition techniques. Our approach utilizes a Convolutional Neural Network and transfer learning to differentiate maps from non-maps, verify the accuracy of depicted regions, and confirm landmark names through advanced text recognition. We also introduce the VinMap dataset, containing annotated map images of Vietnam, to train and test our method. Experiments on this dataset demonstrate that our technique achieves an F1-score of 85.51% for identifying maps excluding specific territorial landmarks. This result suggests practical utility and indicates areas for future improvement.

[Figure: overview]

Details of the model architecture and experimental results can be found in our paper:

@inproceedings{nguyen_vinmap,
  title={Detecting Omissions in Geographic Maps through Computer Vision},
  author={Phuc Nguyen and Anh Do and Minh Hoai},
  year={2024},
  booktitle={International Conference on Multimedia Analysis and Pattern Recognition}
}

Please CITE our paper whenever this repository is used to help produce published results or incorporated into other software.

Installation 🔨

Please refer to the installation guide.

Dataset 📂

The VinMap dataset comprises a total of 6,858 images with diverse resolutions. Among these, 2,000 are non-map images, 2,777 are maps that do not depict Vietnam, and 1,002 are maps of Vietnam that include either the Truong Sa or Hoang Sa islands (866 in Vietnamese and 136 not in Vietnamese). The remaining 1,079 maps of Vietnam contain neither the Truong Sa nor the Hoang Sa islands (291 in Vietnamese and 788 not in Vietnamese). Vietnam maps cover many geographic regions; to direct vision models toward specific map areas such as Truong Sa and Hoang Sa, in line with governmental regulations, VinMap provides box annotations for every Vietnam map containing both the Truong Sa and Hoang Sa islands. These annotations lay the groundwork for advancing map analysis research in Vietnam.
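The counts above can be sanity-checked with a few lines of Python. This is a minimal sketch for illustration only: the category labels and the dictionary layout are assumptions made for this example, not identifiers from the VinMap release. The numbers are the ones stated in the description.

```python
# Image counts as stated in the dataset description.
# The category labels are illustrative names, not identifiers from the VinMap release.
VINMAP_COUNTS = {
    "non_map": 2000,
    "map_not_vietnam": 2777,
    "vietnam_with_islands": 866 + 136,     # in Vietnamese + not in Vietnamese
    "vietnam_without_islands": 291 + 788,  # in Vietnamese + not in Vietnamese
}


def total_images(counts):
    """Sum the per-category image counts."""
    return sum(counts.values())


if __name__ == "__main__":
    for name, n in VINMAP_COUNTS.items():
        print(f"{name}: {n}")
    print("total:", total_images(VINMAP_COUNTS))
```

The four categories add up to the stated total of 6,858 images.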

[Figure: overview]

Details of the dataset construction and experimental results can be found in our paper (cited above).

By downloading the VinMap dataset, USER agrees:

  • to use VinMap for research or educational purposes only.
  • to not distribute VinMap or part of VinMap in any original or modified form.
  • and to cite our MAPR paper above whenever VinMap is employed to help produce published results.

Please refer to the dataset preparation guide.

Training 🏃

After setting up the data, follow this guide for training. (Minimum requirements: 40GB)

Map Classification

Inside the classification directory:

Classification #1: Map Classification

# Classify whether the image is a map or not
python train_map.py

Classification #2: Vietnam Map Classification

# Classify whether the image is a Vietnam map or not
python train_vn.py

Text Detection

Inside the mmocr directory:

# Mask all detected text using Mask R-CNN
python tools/train.py ./configuration/maskrcnn_resnext101_DCN_160e_icdar

Text Recognition

Inside the vietocr directory:

# Recognize masked texts
python train.py

Running the command above automatically converts the data to .lmdb format and trains your model.

Inference 🚀

Running the code will print out information about the map:

  • ('This is VN map. It does not contain (Truong Sa and Hoang Sa).. ALERT')

  • ('This is not VN map.. SKIPPED')

  • ('This is VN map, It contains (Truong Sa or Hoang Sa).. OK')

  • ...
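The messages above line up with the three checks described in the abstract: map vs. non-map, Vietnam vs. another region, and landmark names present or absent. Below is a minimal sketch of that decision logic, assuming the three checks are available as boolean results. The function name `describe_map` and its arguments are hypothetical and are not taken from the repository's code. Because the remaining messages are elided in the list, the sketch returns `None` for the non-map case rather than guessing its wording.

```python
from typing import Optional


def describe_map(is_map: bool, is_vietnam: bool, has_islands: bool) -> Optional[str]:
    """Map three boolean check results to one of the documented messages.

    Hypothetical helper for illustration; the repository's own code may
    structure this differently.
    """
    if not is_map:
        # The message for non-map images is not listed in this README.
        return None
    if not is_vietnam:
        return "This is not VN map.. SKIPPED"
    if has_islands:
        return "This is VN map, It contains (Truong Sa or Hoang Sa).. OK"
    return "This is VN map. It does not contain (Truong Sa and Hoang Sa).. ALERT"


if __name__ == "__main__":
    print(describe_map(is_map=True, is_vietnam=True, has_islands=False))
```

Ordering the checks from cheapest to most specific means the text recognition result only matters once an image is known to be a map of Vietnam.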

Academic version:

(Minimum requirements: 24GB)

We use this version to reproduce the quantitative results of our paper. It includes training and testing code:

export PYTHONPATH='/root/VinMap'
# GPU
python -W ignore single_infer.py --single_infer_image <image_path> --single_infer_path '../temp'
# CPU
python -W ignore single_infer_cpu.py --single_infer_image <image_path> --single_infer_path '../temp'
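To run the commands above over a folder of images, a small wrapper can build one command per file. The following is a sketch under stated assumptions: `build_infer_command` and `run_on_folder` are hypothetical helpers that are not part of the repository, and the set of image extensions is a guess. Only the script names and the `--single_infer_image` and `--single_infer_path` flags come from the commands above.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: which file extensions count as images is not specified in this README.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_infer_command(image_path, out_dir="../temp", use_gpu=True):
    """Build the argument list for one single-image inference call."""
    script = "single_infer.py" if use_gpu else "single_infer_cpu.py"
    return [
        sys.executable, "-W", "ignore", script,
        "--single_infer_image", str(image_path),
        "--single_infer_path", str(out_dir),
    ]


def run_on_folder(folder, out_dir="../temp", use_gpu=True):
    """Run inference on every image file in `folder`, one process per image."""
    for image_path in sorted(Path(folder).iterdir()):
        if image_path.suffix.lower() in IMAGE_SUFFIXES:
            subprocess.run(build_infer_command(image_path, out_dir, use_gpu), check=True)
```

As with the commands above, this would need to be launched from the directory that contains the inference scripts, with `PYTHONPATH` already exported.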

Drag and drop:

# Gradio demo
python gradio_demo.py

Acknowledgments

We sincerely thank the MIC-VN team for data crawling and labeling. We thank Mr. Que Nguyen and his team for their support in testing and deployment. The source code is built upon VietOCR and MMOCR.

Contacts

If you have any questions or suggestions about this repo, please feel free to contact me ([email protected]).

Copyright (c) 2024 VinAI

THE DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE DATA OR THE USE OR OTHER DEALINGS IN THE
DATA.