Intro

Papers that combine CV with NLP tasks, with a focus on Medical Report Generation, Image/Video Captioning, VQA, Anchor-free Object Detection, and Weakly Supervised Segmentation.

Image/Video Captioning
Paragraph Description Generation
Visual Question Answering
Medical Report Generation
Medical Image Processing
Object Detection
Segmentation
Weakly Supervised Segmentation
Metrics
Others

Papers and Codes/Notes

Image Video Captioning

CNN-RNN
- Show and Tell: A Neural Image Caption Generator, Oriol Vinyals et al, CVPR 2015, Google(pdf)
- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, Kelvin Xu et al, ICML 2015(pdf)(code)
- Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge, PAMI 2016(pdf)(code)
- Areas of Attention for Image Captioning, ICCV 2017(pdf)
- Rethinking the Form of Latent States in Image Captioning, ECCV 2018, CUHK(pdf)
- Recurrent Fusion Network for Image Captioning, ECCV 2018, Tencent AI Lab, Fudan University(pdf)
- Move Forward and Tell: A Progressive Generator of Video Descriptions, ECCV 2018, CUHK(pdf)
- Video Paragraph Captioning Using Hierarchical Recurrent Neural Networks, CVPR 2016(pdf)
CNN-CNN
- Convolutional Image Captioning, CVPR 2018(pdf)(code)
Reinforcement Learning
- Improving Reinforcement Learning Based Image Captioning with Natural Language Prior, 2018, Tencent/IBM(pdf)
- End-to-End Video Captioning with Multitask Reinforcement Learning(pdf)
Others
- A Neural Compositional Paradigm for Image Captioning, NIPS 2018, CUHK(pdf)

Paragraph Description Generation

CNN-RNN
- DenseCap: Fully Convolutional Localization Networks for Dense Captioning, Justin Johnson et al, CVPR 2016, Stanford(homepage)(code)
- A Hierarchical Approach for Generating Descriptive Image Paragraphs, Jonathan Krause et al, CVPR 2017, Stanford(homepage)(dense-caption code)
- Recurrent Topic-Transition GAN for Visual Paragraph Generation, ICCV 2017
- Diverse and Coherent Paragraph Generation from Images, ECCV 2018(code)

Visual Question Answering

CNN-RNN
- Multi-level Attention Networks for Visual Question Answering, CVPR 2017
- Motion-Appearance Co-Memory Networks for Video Question Answering, 2018
- Deep Attention Neural Tensor Network for Visual Question Answering, ECCV 2018, HIT
- Question-Guided Hybrid Convolution for Visual Question Answering, Peng Gao et al, ECCV 2018, CUHK(pdf)

Medical Report Generation

CNN-RNN
- Learning to Read Chest X-Rays: Recurrent Neural Cascade Model for Automated Image Annotation, CVPR 2016(pdf)
- TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays, Xiaosong Wang et al, CVPR 2018, NIH(pdf)(author's homepage)
- On the Automatic Generation of Medical Imaging Reports, Baoyu Jing et al., ACL 2018, CMU(pdf)(author's homepage)
- Multimodal Recurrent Model with Attention for Automated Radiology Report Generation, Yuan Xue et al., MICCAI 2018, PSU(pdf)
- Attention-Based Abnormal-Aware Fusion Network for Radiology Report Generation, Xiancheng Xie et al., 2019, Fudan University
- Addressing Data Bias Problems for Chest X-ray Image Report Generation, Philipp Harzig et al., 2019, University of Augsburg(pdf)
Reinforcement Learning
- Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation, Christy Y. Li et al, NIPS 2018, CMU(pdf)(author's homepage)
Knowledge Graph
- Knowledge-Driven Encode, Retrieve, Paraphrase for Medical Image Report Generation, Christy Y. Li et al, AAAI 2019, DU(pdf)
Other
- TextRay: Mining Clinical Reports to Gain a Broad Understanding of Chest X-rays, MICCAI 2018(pdf)
Blogs
- A Survey of Medical Report Generation

Medical Image Processing

Common Datasets

NIH Chest X-ray8/14(download link)(kaggle's download link)
- ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases, CVPR 2017, NIH(pdf)
Open-i Chest X-Ray(download link)
Radiology Objects in COntext(ROCO)
- Radiology Objects in COntext (ROCO): A Multimodal Image Dataset, MICCAI 2018(intro)(pdf)(download)

Medical Tasks

Detection
- CheXNet: Radiologist-Level Pneumonia Detection on Chest X-Rays with Deep Learning, 2018, Andrew Ng
- Attention-Guided Curriculum Learning for Weakly Supervised Classification and Localization of Thoracic Diseases on Chest Radiographs, Yuxing Tang et al, MICCAI-MLMI oral 2018, NIH(pdf)
- DeepRadiologyNet: Radiologist Level Pathology Detection in CT Head Images
- A Method for Detecting Lesion Regions in Lung CT Images
- A Method for Predicting Benign vs. Malignant Lung Tumors Based on Quantitative Radiomics
Enhance
- Super Resolution
  - Image Super-Resolution Using Deep Convolutional Networks
  - Deeply-Recursive Convolutional Network for Image Super-Resolution
Segmentation
- U-Net: Convolutional Networks for Biomedical Image Segmentation, MICCAI 2015
- A 3D Coarse-to-Fine Framework for Automatic Pancreas Segmentation

Object-Detection

Weakly-supervised
- Learning Deep Features for Discriminative Localization, Bolei Zhou et al, CVPR 2016, MIT(pdf)(code)(note)
Anchor-based
- SSD: Single Shot MultiBox Detector, Wei Liu et al, ECCV 2016, UNC Chapel Hill(pdf)(code)(blog)
- YOLO9000: Better, Faster, Stronger, Joseph Redmon et al, CVPR 2017(pdf)(project)(code)
- FPN, Feature Pyramid Networks for Object Detection, Tsung-Yi Lin et al., CVPR 2017, FAIR(pdf)(blog)
Anchor-free
- YOLO, You Only Look Once: Unified, Real-Time Object Detection, Joseph Redmon et al, CVPR 2016(pdf)(note)
- CornerNet, CornerNet: Detecting Objects as Paired Keypoints, Hei Law et al, ECCV 2018, University of Michigan(pdf)(code)(blog)
- FCOS, FCOS: Fully Convolutional One-Stage Object Detection, Zhi Tian et al, ICCV 2019, University of Adelaide(pdf)(code)(blog)
- CenterNet, Objects as Points, Xingyi Zhou et al, 2019, UT Austin(pdf)(code)
Others
- Bag of Freebies for Training Object Detection Neural Networks, Zhi Zhang et al, 2019, Amazon (Mu Li)(pdf)
- Deformable Convolutional Networks, Jifeng Dai et al, ICCV 2017, Microsoft Research Asia(pdf)(code)

Segmentation

Semantic Segmentation
- PSPNet, Pyramid Scene Parsing Network, Hengshuang Zhao et al., CVPR 2017, CUHK(pdf)(code)
Instance Segmentation
- Mask R-CNN, Kaiming He et al, ICCV 2017(Best Paper), Facebook AI Research (FAIR)(pdf)(code)

Weakly Supervised Segmentation

Bounding Box Supervision
- Weakly- and Semi-Supervised Learning of a Deep Convolutional Network for Semantic Image Segmentation, Liang-Chieh Chen et al., ICCV 2015, UCLA(pdf)(deeplab-v1-code)(model)(note)
- BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation, Jifeng Dai et al., ICCV 2015, Microsoft Research(pdf)
- Simple Does It: Weakly Supervised Instance and Semantic Segmentation, Anna Khoreva et al., CVPR 2017, Max Planck Institute for Informatics(pdf)(code)(tf-code)
- Box-driven Class-wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation, Chunfeng Song et al, CVPR 2019, CASIA(pdf)
Image Label Supervision
- Fully Convolutional Multi-Class Multiple Instance Learning, Deepak Pathak et al., ICLR 2015, UC Berkeley(pdf)(note)
- From Image-level to Pixel-level Labeling with Convolutional Networks, Pedro O. Pinheiro et al., CVPR 2015, Idiap Research Institute, Martigny(pdf)(note)
- DSRG, Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing, Zilong Huang et al., CVPR 2018, HUST(pdf)(code)
- SSENet, Self-supervised Scale Equivariant Network for Weakly Supervised Semantic Segmentation, Yude Wang et al., 2019, CAS(pdf)(code)
Others
- DenseCRF, Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials, Philipp Krähenbühl et al., NIPS 2011, Stanford University(pdf)(homepage)(code)
- A Comprehensive Analysis of Weakly-Supervised Semantic Segmentation in Different Image Domains, Lyndon Chan et al., 2019(pdf)
Good References
- JackieZhangdx's WeakSupervisedSegmentationList

Metrics

BLEU
- BLEU: a method for automatic evaluation of machine translation, Kishore Papineni et al, ACL 2002(pdf)
CIDEr
- CIDEr: Consensus-based Image Description Evaluation, CVPR 2015(pdf)(note)

Others

Visual Commonsense Reasoning (VCR)
- From Recognition to Cognition: Visual Commonsense Reasoning, Rowan Zellers et al, 2018, Paul G. Allen School(homepage)(pdf)
Language Model
- Transformer: Attention Is All You Need, Ashish Vaswani et al, NIPS 2017, Google Brain/Research(pdf)(code)(blog)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Jacob Devlin et al, 2018, Google AI Language(pdf)(code)(slides)
- ELMo: Deep contextualized word representations, Matthew E. Peters et al, NAACL 2018, Paul G. Allen School(homepage)(pdf)(code-tf)
Teacher Forcing Policy
- A learning algorithm for continually running fully recurrent neural networks, Ronald J. Williams et al, Neural Computation 1989(pdf)(note)
- Professor Forcing: A New Algorithm for Training Recurrent Networks, Alex Lamb et al, NIPS 2016(pdf)
Classification
- VGG, Very Deep Convolutional Networks for Large-Scale Image Recognition, Karen Simonyan et al., ICLR 2015(pdf)
- Inception, Going Deeper with Convolutions, Christian Szegedy et al, CVPR 2015, Google(pdf)
- ResNet, Deep Residual Learning for Image Recognition, Kaiming He et al, CVPR 2016, Microsoft Research(pdf)(code)(blog)
- SENet: Squeeze-and-Excitation Networks, Jie Hu et al, CVPR 2018, Momenta (a Chinese autonomous-driving company) and Oxford University(pdf)(code)(blog)
