Chinese text-line detection from web videos with fully convolutional networks
Big Data Analytics20183:2
© The Author(s) 2018
Published: 5 January 2018
In recent years, video becomes the dominant resource of information on the Web, where the text within video usually carries significant semantic
information. Video text extraction and recognition plays an essential role in web multimedia understanding and retrieval for big visual data analytics and applications. To deal with challenging backgrounds and embedding noises, most conventional approaches usually tend to design sophisticated pre-processing and post-progressing steps before and after text detection. In this paper, we present a simple yet powerful pipeline that directly and uniformly detects Chinese text lines for embedded captions from web videos.
In this Chinese text-line detection system, a fully convolutional network with local context is adopted to localize via an end-to-end learning way. The produced caption predictions are with the word level that could be directly fed into the character classifier. Text-line construction is then performed by heuristic strategies. A variety of experiments are conducted on several real-world web video datasets and demonstrated the effectiveness and efficiency of our proposed method.
The proposed system can directly detect the English word and Chinese characters in the caption text-lines without word or character segmentation with the high performance on real-world web video datasets.
Video text detectionText segmentationFully convolutional networksEmbedded captionsWeb videos