1st International Workshop on Deep Learning
for Document Analysis and Recognition

ICPR 2018, Beijing, China



The technology of document analysis and recognition (DAR) aims to automatically extract information from document images and handwriting by analyzing their structure and textual content. It has a wide range of applications, such as the digitization of books and financial notes and information extraction from Web document images. Recognizing text in images, known as Optical Character Recognition (OCR), is the core task of DAR. Recently, OCR has achieved great success in both scientific research and practical applications across different scenes. A traditional OCR system is heavily pipelined, with hand-designed and highly tuned modules: it usually performs line extraction, word detection, and letter segmentation, and then applies different techniques to each piece of a character to determine what the character is. We have now entered an era of big data, which offers both opportunities and challenges to the field of OCR and DAR. We should seek new OCR and DAR methods that adapt to big data, and also push forward new OCR and DAR applications that benefit from big data.

Deep learning, which is considered one of the most significant breakthroughs in recent pattern recognition and computer vision research, has greatly affected these fields and achieved impressive progress in both academia and industry. Currently, deep learning is widely accepted as an effective OCR solution: a model first learns to detect text lines or words in images, and then recognizes the sequence of characters directly from the extracted text lines or words. Deep learning-based OCR systems avoid the hand-built and highly tuned modules. It is expected that the development of deep learning theories and applications will further influence the field of OCR and DAR.

We organize this workshop to provide a forum for highlighting current research and discussing future trends in deep learning for OCR and DAR.


1. Deep learning for character and text recognition

2. Deep learning for scene text detection and recognition

3. Deep learning for document image processing and segmentation

4. Deep learning for layout analysis

5. Deep learning for writer identification and signature analysis

6. Deep learning for document retrieval

7. Deep learning for context modeling

8. Deep learning for graphics and symbol recognition

9. Deep learning for other DAR tasks

Lianwen Jin, South China University of Technology

Lianwen Jin received the B.S. degree from the University of Science and Technology of China, Anhui, China, and the Ph.D. degree from the South China University of Technology, Guangzhou, China, in 1991 and 1996, respectively. He is a professor in the College of Electronic and Information Engineering at the South China University of Technology. His research interests include handwriting analysis and recognition, optical character recognition, scene text detection and recognition, deep learning, and intelligent systems. He has received the New Century Excellent Talent Program of MOE China Award and the Guangdong Pearl River Distinguished Professor Award. He has authored over 100 scientific papers published in peer-reviewed journals, including IEEE TPAMI, IEEE TNNLS, IEEE TCYBS, IEEE TCSVT, TII, IEEE TMM, IEEE TITS, Pattern Recognition, Neurocomputing, Pattern Recognition Letters, and the International Journal on Document Analysis and Recognition, and in mainstream international conferences, including ICPR, ICDAR, ICFHR, CVPR, AAAI, and IJCAI.

Weilin Huang, Malong Technologies

Dr. Weilin Huang is Chief Scientist of Malong Technologies. He previously worked as a postdoctoral researcher with Prof. Andrew Zisserman and Prof. Alison Noble in the Visual Geometry Group (VGG), University of Oxford, and was an Assistant Professor with the Chinese Academy of Sciences. He received his Ph.D. degree from the University of Manchester, U.K. His research interests include scene text detection and recognition, large-scale image classification, and medical image analysis. He has served as a PC member or reviewer for major computer vision conferences, including ICCV, CVPR, ECCV, MICCAI, and AAAI. His team was the first runner-up in scene recognition at ImageNet 2015 and the winner of the WebVision Challenge at CVPR 2017.


Yongpan Wang, Alibaba Group, China

Xiang Bai, Huazhong University of Science and Technology, China

Cheng-Lin Liu, Institute of Automation of Chinese Academy of Sciences, China

Program Committee

C. V. Jawahar (IIIT Hyderabad)

Dimosthenis Karatzas (Universitat Autònoma de Barcelona)

Lianwen Jin (South China University of Technology)

Xiang Bai (Huazhong University of Science and Technology)

Shijian Lu (Nanyang Technological University)

Cheng-Lin Liu (Institute of Automation of Chinese Academy of Sciences)

Weilin Huang (Malong Technologies)

1. 14:00-14:40 Presentation 1 (TBA)

2. 14:40-15:20 Presentation 2 (TBA)

3. 15:20-16:00 Presentation 3 (TBA)

4. 16:00-16:20 Coffee Break

5. 16:20-17:00 Presentation 4 (TBA)

6. 17:00-17:40 Presentation 5 (TBA)