The technology of document analysis and recognition (DAR) aims to automatically extract information from document images and handwriting by analyzing their structure and textual content. It has a wide range of applications, such as the digitization of books and financial notes and information extraction from Web document images. Recognizing text from images, known as Optical Character Recognition (OCR), is the core task of DAR. Recently, OCR has achieved great success in both scientific research and practical applications across different scenes. A traditional OCR system is heavily pipelined, with hand-designed and highly tuned modules: it is usually composed of line extraction, word detection, and letter segmentation, followed by different techniques applied to each segmented piece to determine which character it is. We have now entered the era of big data, which offers both opportunities and challenges to the field of OCR and DAR. We should seek new OCR and DAR methods that adapt to big data, and also push forward new OCR and DAR applications that benefit from big data.
Deep learning, which is considered one of the most significant breakthroughs in recent pattern recognition and computer vision research, has greatly affected these fields and achieved impressive progress in both academia and industry. Currently, deep learning is widely accepted as an effective OCR solution: a model first learns to detect text lines or words in images, and then recognizes the sequence of characters directly from the extracted text lines or words. The hand-built and highly tuned modules are avoided in a deep learning-based OCR system. It is expected that the development of deep learning theories and applications will further influence the field of OCR and DAR.
We organize this workshop to provide a forum for highlighting current research and discussing future trends in deep learning for OCR and DAR.
1. Deep learning for character and text recognition
2. Deep learning for scene text detection and recognition
3. Deep learning for document image processing and segmentation
4. Deep learning for layout analysis
5. Deep learning for writer identification and signature analysis
6. Deep learning for document retrieval
7. Deep learning for context modeling
8. Deep learning for graphics and symbol recognition
9. Deep learning for other DAR tasks
Lianwen Jin, South China University of Technology
Lianwen Jin received the B.S. degree from the University of Science and Technology of China, Anhui, China, and the Ph.D. degree from the South China University of Technology, Guangzhou, China, in 1991 and 1996, respectively. He is a professor in the College of Electronic and Information Engineering at the South China University of Technology. His research interests include handwriting analysis and recognition, optical character recognition, scene text detection and recognition, deep learning, and intelligent systems. He has received the New Century Excellent Talent Program of MOE China Award and the Guangdong Pearl River Distinguished Professor Award. He has authored over 100 scientific papers, published in peer-reviewed journals such as IEEE TPAMI, IEEE TNNLS, IEEE TCYBS, IEEE TCSVT, TII, IEEE TMM, IEEE TITS, Pattern Recognition, Neurocomputing, Pattern Recognition Letters, and the International Journal on Document Analysis and Recognition, and in mainstream international conferences including ICPR, ICDAR, ICFHR, CVPR, AAAI, and IJCAI.
Weilin Huang, Malong Technologies
Dr. Weilin Huang is Chief Scientist of Malong Technologies. He previously worked as a postdoctoral researcher with Prof. Andrew Zisserman and Prof. Alison Noble in the Visual Geometry Group (VGG), University of Oxford, and was an Assistant Professor with the Chinese Academy of Sciences. He received his Ph.D. degree from the University of Manchester, U.K. His research interests include scene text detection and recognition, large-scale image classification, and medical image analysis. He has served as a PC Member or Reviewer for major computer vision conferences, including ICCV, CVPR, ECCV, MICCAI, and AAAI. His team was the first runner-up in scene recognition at ImageNet 2015 and the winner of the WebVision Challenge at CVPR 2017.
Yongpan Wang, Alibaba Group, China
Xiang Bai, Huazhong University of Science and Technology, China
Cheng-Lin Liu, Institute of Automation of Chinese Academy of Sciences, China
C. V. Jawahar (IIIT Hyderabad)
Dimosthenis Karatzas (Universitat Autònoma de Barcelona)
Lianwen Jin (South China University of Technology)
Xiang Bai (Huazhong University of Science and Technology)
Shijian Lu (Nanyang Technological University)
Cheng-Lin Liu (Institute of Automation of Chinese Academy of Sciences)
Weilin Huang (Malong Technologies)
1. 14:00-14:40 Presentation 1 (TBA)
2. 14:40-15:20 Presentation 2 (TBA)
3. 15:20-16:00 Presentation 3 (TBA)
4. 16:00-16:20 Coffee Break
5. 16:20-17:00 Presentation 4 (TBA)
6. 17:00-17:40 Presentation 5 (TBA)