Survey on scene text detection based on deep learning

YU Ruo-nan, HUANG Ding-jiang, DONG Qi-wen

Citation: YU Ruo-nan, HUANG Ding-jiang, DONG Qi-wen. Survey on scene text detection based on deep learning[J]. Journal of East China Normal University (Natural Sciences), 2018, (5): 1-16. doi: 10.3969/j.issn.1000-5641.2018.05.001

doi: 10.3969/j.issn.1000-5641.2018.05.001
Funding:

National Natural Science Foundation of China (11501204)

National Natural Science Foundation of China-Guangdong Joint Fund (U1711262)

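Details
    About the authors:

    YU Ruo-nan, female, master's student; research interests: deep learning and object detection. E-mail: yrn130814232@163.com

    Corresponding author:

    HUANG Ding-jiang, male, professor; research interests: machine learning and artificial intelligence, and their use in analyzing and applying big data in cross-disciplinary fields such as computational finance. E-mail: djhuang@dase.ecnu.edu.cn

  • CLC number: TP391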

Survey on scene text detection based on deep learning

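  • Abstract: Against the backdrop of big-data-driven applications and steadily improving computer hardware, deep-learning-based object detection and image segmentation algorithms have broken through the bottlenecks of traditional algorithms and become the mainstream approach in computer vision. Scene text detection, which builds on advances in object detection and image segmentation, has also seen major breakthroughs in recent years. This survey has three main aims: to review progress in scene text detection over the past 5 years; to compare and analyze the strengths and weaknesses of state-of-the-art algorithms; and to summarize the benchmark datasets and evaluation methods relevant to the field.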
  • Fig.  1  Architecture of R-CNN

    Fig.  2  Architecture of the CTPN (Connectionist Text Proposal Network)

    Fig.  3  Examples of scene text detection algorithms based on image segmentation

    Fig.  4  Examples from the ICDAR 2015 incidental scene text dataset

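    Tab.  1  Advantages and limitations of scene text detection algorithms in recent years

    | Category | Algorithm | Advantages | Limitations |
    |---|---|---|---|
    | Object-detection-based | Tian et al. (CTPN) [29] | Fast, good performance | Horizontal text only |
    | Object-detection-based | Zhong et al. (DeepText) [30] | Fast, good performance, synthetic data, few training samples | Horizontal text only |
    | Object-detection-based | Zhang et al. (FEN) [33] | Fast, good performance | Horizontal text only |
    | Object-detection-based | Jiang et al. (R2CNN) [31] | Multi-oriented | Slow |
    | Object-detection-based | Ma et al. (RRPN) [32] | Multi-oriented | Slow |
    | Object-detection-based | Shi et al. (SegLink) [36] | Multi-oriented, long text, fast | Cannot detect text lines with large spacing; cannot detect deformed or curved text |
    | Object-detection-based | Tian et al. (WeText) [37] | Fast, training dataset augmentation | Horizontal text only |
    | Image-segmentation-based | Zhang et al. (Text-Block FCN) [40] | Multi-oriented, multilingual, multi-font | Slow |
    | Image-segmentation-based | He et al. (CCTN) [41] | Multi-oriented, multi-scale, robust | Slow |
    | Image-segmentation-based | Yao et al. (HED-based) [42] | Multi-oriented, multilingual, curved text | Slow; unsuitable for heavily blurred or highlighted text |
    | Image-segmentation-based | Polzounov et al. (Wordfence) [43] | Multi-oriented, multilingual, multi-scale | Slow |
    | Image-segmentation-based | Deng et al. (PixelLink) [44] | Multi-oriented, good performance, robust | Poor on curved text |
    | Image-segmentation-based | Yang et al. (IncepText) [45] | Multi-oriented, good performance, already deployed in an OCR product | Slow |
    | Hybrid | Dai et al. (FTSN) [46] | Multi-oriented, curved text, good performance | Slow |
    | Hybrid | He et al. (DDRN) [47] | Multi-oriented, direct, efficient, one-step post-processing | Poor on text lines with very large spacing, single characters, or complex backgrounds |
    | Hybrid | Jiang et al. (CCP) [48] | Multi-oriented, multilingual | Slow |
    | Hybrid | Zhou et al. (EAST) [49] | Multi-oriented, fast, efficient | Poor on long text |
    | Hybrid | Qin and Manduchi (CSDN) [38] | Multi-oriented, no post-processing required | Poor on curved text |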

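    Tab.  2  Advantages and limitations of end-to-end text detection systems in recent years

    | Algorithm | Year | Advantages | Limitations |
    |---|---|---|---|
    | Wang et al. [6] | 2012 | First to use deep learning in a scene text detection system; good performance; robust | Horizontal text only |
    | Jaderberg et al. [5] | 2014 | Robust; works on images of different resolutions | Horizontal text only; requires large amounts of training data; inefficient |
    | Jaderberg et al. [51] | 2016 | Synthetic scene text images; high recall | Applies only to a given language; horizontal text only |
    | Gupta et al. (FCRN) [52] | 2016 | Synthetic scene text images; fast | Horizontal text only |
    | Liao et al. (TextBoxes) [53] | 2017 | Fast; high accuracy; supports multi-scale input; one-step post-processing | Horizontal text only; poor on widely spaced characters and vertical text |
    | Li et al. [54] | 2017 | First model to attempt integrating scene text detection and recognition into a single network; handles multi-scale text | Horizontal text only; poor recognition of small text |
    | Busta et al. (Deep TextSpotter) [55] | 2017 | Fast; good performance; handles multi-oriented text | Poor on single characters and on short numbers or character fragments |
    | Liao et al. (TextBoxes++) [56] | 2018 | Robust; fast; handles multi-oriented text | Poor on widely spaced characters and vertical text |
    | Bartz et al. (SEE) [57] | 2018 | Simple network architecture; the network learns text detection automatically | Horizontal text only; difficult to train |
    | Liu et al. (FOTS) [58] | 2018 | Robust; fast; handles multi-oriented and long text | Unsuitable when variance within the text region is large or when the text region and background share similar patterns |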

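    Tab.  3  Common datasets for scene text detection

    | Dataset and year | Number of images (training, test) | Orientation (curved) | Language |
    |---|---|---|---|
    | ICDAR 2003 [63] / ICDAR 2005 [64] | 509 (258, 251) | Horizontal | English |
    | ICDAR 2011 [65] | 484 (229, 255) | Horizontal | English |
    | ICDAR 2013 [66] / ICDAR 2015-Focused [67] | 462 (229, 233) | Horizontal | English |
    | ICDAR 2015-Incidental [67] | 1500 (1000, 500) | Multi-oriented | English |
    | ICDAR 2017-MLT [68] | 9000 (7200, 1800) | Multi-oriented | Multilingual |
    | KAIST 2010 [69] | 3000 | Horizontal | English, Korean |
    | SVT 2010 [3] | 350 (100, 250) | Horizontal | English |
    | NEOCR 2011 [70] | 659 | Multi-oriented (curved) | Multilingual |
    | OSTD 2011 [71] | 89 | Multi-oriented | English |
    | MSRA-TD500 2012 [9] | 500 (300, 200) | Multi-oriented | Chinese, English |
    | CUTE80 2014 [72] | 80 | Curved | English |
    | HUST-TR400 2014 [73] | 400 | Multi-oriented | English |
    | USTB-SV1K 2015 [74] | 1000 (500, 500) | Multi-oriented | English |
    | SCUT-FORU-DB 2016 [75] | 3931 | Horizontal | Chinese, English |
    | COCO-Text 2016 [76] | 63686 (43686, 20000) | Multi-oriented | English |
    | RCTW-17 2017 [77] | 12263 (8034, 4229) | Multi-oriented | Chinese |
    | Total-Text 2017 [78] | 1555 (1255, 300) | Multi-oriented (curved) | English |
    | CTW1500 2017 [79] | 1500 (1000, 500) | Multi-oriented (curved) | Chinese, English |
    | CTW 2018 [80] | 32285 | Multi-oriented (curved) | Chinese |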

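    Tab.  4  Performance comparison of scene text detection algorithms

    | Algorithm | ICDAR 2013 P/% | ICDAR 2013 R/% | ICDAR 2013 F/% | ICDAR 2015 Incidental P/% | ICDAR 2015 Incidental R/% | ICDAR 2015 Incidental F/% | MSRA-TD500 P/% | MSRA-TD500 R/% | MSRA-TD500 F/% |
    |---|---|---|---|---|---|---|---|---|---|
    | Tian et al. (CTPN) [29] | 93.00 | 83.00 | 87.70 | 74.22 | 51.56 | 60.85 | / | / | / |
    | Zhong et al. (DeepText) [30] | 87.17 | 82.79 | 84.93 | / | / | / | / | / | / |
    | Zhang et al. (FEN) [33] | 89.30 | 94.10 | 91.60 | / | / | / | / | / | / |
    | Jiang et al. (R2CNN) [31] | 93.55 | 82.59 | 87.73 | 85.62 | 79.63 | 82.54 | / | / | / |
    | Ma et al. (RRPN) [32] | 90.22 | 71.89 | 80.02 | 73.23 | 82.17 | 77.44 | 82.00 | 68.00 | 74.00 |
    | Shi et al. (SegLink) [36] | 87.70 | 83.00 | 85.30 | 73.10 | 76.80 | 75.00 | 86.00 | 70.00 | 77.00 |
    | Tian et al. (WeText) [37] | 84.20 | 80.70 | 82.30 | / | / | / | / | / | / |
    | Zhang et al. (Text-Block FCN) [40] | 88.14 | 74.00 | 80.00 | 70.81 | 43.09 | 53.58 | 83.00 | 67.00 | 74.00 |
    | He et al. (CCTN) [41] | 90.00 | 83.00 | 86.00 | / | / | / | 79.00 | 65.00 | 71.00 |
    | Yao et al. (HED-based) [42] | 89.00 | 80.00 | 84.00 | 72.00 | 59.00 | 65.00 | 63.00 | 62.00 | 60.00 |
    | Polzounov et al. (Wordfence) [43] | 65.00 | 92.00 | 76.00 | / | / | / | / | / | / |
    | Deng et al. (PixelLink) [44] | 87.50 | 88.60 | 88.10 | 85.50 | 82.00 | 83.70 | 83.00 | 73.20 | 77.80 |
    | Yang et al. (IncepText) [45] | / | / | / | 93.80 | 87.30 | 90.50 | 87.50 | 79.00 | 83.00 |
    | Dai et al. (FTSN) [46] | / | / | / | 88.60 | 80.00 | 84.10 | 87.60 | 77.10 | 82.00 |
    | He et al. (DDRN) [47] | 92.00 | 81.00 | 86.00 | 82.00 | 80.00 | 81.00 | 77.00 | 70.00 | 74.00 |
    | Jiang et al. (CCP) [48] | 92.20 | 91.50 | 91.90 | / | / | / | / | / | / |
    | Zhou et al. (EAST) [49] | / | / | / | 83.27 | 78.33 | 80.72 | 87.30 | 67.40 | 76.10 |
    | Qin et al. (CSDN) [38] | 90.00 | 83.00 | 86.00 | 79.00 | 65.00 | 71.00 | / | / | / |

    Note: $P$, $R$, and $F$ denote precision, recall, and $F$-measure, respectively.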
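    The P, R, and F columns of Tab. 4 can be related with a short calculation. The sketch below assumes that F is the balanced F-measure, i.e. the harmonic mean of precision and recall; this page does not state the exact formula, so that definition, the function name `f_measure`, and the choice of rows are assumptions made for illustration only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (harmonic mean) of precision and recall.

    Assumption: equal weighting of P and R. Inputs and output are in percent.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F) copied from the ICDAR 2013 columns of Tab. 4.
rows = {
    "CTPN": (93.00, 83.00, 87.70),
    "SegLink": (87.70, 83.00, 85.30),
    "PixelLink": (87.50, 88.60, 88.10),
}

for name, (p, r, f_reported) in rows.items():
    f_computed = f_measure(p, r)
    print(f"{name}: computed F = {f_computed:.2f}, reported F = {f_reported:.2f}")
```

    Small differences between the computed and reported values are expected, since the published P and R figures are themselves rounded.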
  • [1] ZHU Y, YAO C, BAI X. Scene text detection and recognition: Recent advances and future trends[J]. Front Comput Sci, 2014, 10(1):19-36. http://d.old.wanfangdata.com.cn/Periodical/zggdxxxswz-jsjkx201601003
    [2] YE Q, DOERMANN D. Text detection and recognition in imagery: A survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(7):1480-1500. doi:  10.1109/TPAMI.2014.2366765
    [3] WANG K, BELONGIE S. Word spotting in the wild[C]//Computer Vision-ECCV 2010. Berlin: Springer, 2010: 591-604.
    [4] NEUMANN L, MATAS J. Scene text localization and recognition with oriented stroke detection[C]//2013 IEEE International Conference on Computer Vision. IEEE, 2013: 97-104.
    [5] JADERBERG M, VEDALDI A, ZISSERMAN A. Deep features for text spotting[C]//Computer Vision-ECCV 2014. Cham: Springer, 2014: 512-528.
    [6] WANG T, WU D J, COATES A, et al. End-to-end text recognition with convolutional neural networks[C]//Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012). 2012: 3304-3308.
    [7] EPSHTEIN B, OFEK E, WEXLER Y. Detecting text in natural scenes with stroke width transform[C]//2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2010: 2963-2970.
    [8] MATAS J, CHUM O, URBAN M, et al. Robust wide baseline stereo from maximally stable extremal regions[J]. Image and Vision Computing, 2004, 22:761-767. doi:  10.1016/j.imavis.2004.02.006
    [9] YAO C, BAI X, LIU W, et al. Detecting texts of arbitrary orientations in natural images[C]//2012 IEEE Conference on Computer Vision and Pattern Recognition. 2012: 1083-1090.
    [10] KANG L, LI Y, DOERMANN D. Orientation robust text line detection in natural images[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014: 4034-4041.
    [11] YIN X C, YIN X, HUANG K, et al. Robust text detection in natural scene images[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(5):970-983. doi:  10.1109/TPAMI.2013.182
    [12] YIN X C, PEI W Y, ZHANG J, et al. Multi-orientation scene text detection with adaptive clustering[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9):1930-1937. doi:  10.1109/TPAMI.2014.2388210
    [13] CHO H, SUNG M, JUN B. Canny text detector: Fast and robust scene text localization algorithm[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016: 3566-3573.
    [14] GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation[C]//2014 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2014: 580-587.
    [15] GIRSHICK R. Fast R-CNN[C]//2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2015: 1440-1448.
    [16] REN S, HE K, GIRSHICK R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017(6):1137-1149. http://www.ncbi.nlm.nih.gov/pubmed/27295650
    [17] DAI J, LI Y, HE K, et al. R-FCN: Object detection via region-based fully convolutional networks[C]//Advances in Neural Information Processing Systems 29. NIPS, 2016: 379-387.
    [18] REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: Unified, real-time object detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016: 779-788.
    [19] LIU W, ANGUELOV D, ERHAN D, et al. SSD: Single shot MultiBox detector[C]//European Conference on Computer Vision. Cham: Springer, 2016: 21-37.
    [20] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. Imagenet classification with deep convolutional neural networks[C]//Advances in Neural Information Processing Systems 25. NIPS, 2012: 1097-1105.
    [21] UIJLINGS J R R, VAN DE SANDE K E A, GEVERS T, et al. Selective search for object recognition[J]. International Journal of Computer Vision, 2013, 104(2):154-171. doi:  10.1007/s11263-013-0620-5
    [22] HE K, ZHANG X, REN S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[C]//Computer Vision-ECCV 2014. Cham: Springer, 2014: 346-361.
    [23] REDMON J, FARHADI A. YOLO9000: Better, faster, stronger[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 6517-6525.
    [24] REDMON J, FARHADI A. YOLOv3: An incremental improvement[J]. arXiv preprint, arXiv:1804.02767v1 [cs.CV] 8 Apr 2018. http://cn.arxiv.org/abs/1804.02767
    [25] CIRESAN D, GIUSTI A, GAMBARDELLA L M, et al. Deep neural networks segment neuronal membranes in electron microscopy images[G]//Advances in Neural Information Processing Systems 25. Curran Associates, Inc, 2012: 2843-2851.
    [26] LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015: 3431-3440.
    [27] LI Y, QI H, DAI J, et al. Fully convolutional instance-aware semantic segmentation[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017: 4438-4446.
    [28] HE K, GKIOXARI G, DOLLAR P, et al. Mask R-CNN[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 2980-2988.
    [29] TIAN Z, HUANG W, HE T, et al. Detecting text in natural image with connectionist text proposal network[C]//European Conference on Computer Vision. Cham: Springer, 2016: 56-72.
    [30] ZHONG Z, JIN L, ZHANG S, et al. DeepText: A unified framework for text proposal generation and text detection in natural images[J]. arXiv preprint, arXiv:1605.07314v1 [cs.CV] 24 May 2016.
    [31] JIANG Y, ZHU X, WANG X, et al. R2CNN: Rotational region CNN for orientation robust scene text detection[J]. arXiv preprint, arXiv:1706.09579v2 [cs.CV] 30 Jun 2017. http://cn.arxiv.org/abs/1706.09579
    [32] MA J, SHAO W, YE H, et al. Arbitrary-oriented scene text detection via rotation proposals[J]. arXiv preprint, arXiv:1703.01086v3 [cs.CV] 15 Mar 2018. http://cn.arxiv.org/abs/1703.01086
    [33] ZHANG S, LIU Y, JIN L, et al. Feature enhancement network: A refined scene text detector[J]. arXiv preprint, arXiv:1711.04249v1 [cs.CV] 12 Nov 2017. http://cn.arxiv.org/abs/1711.04249
    [34] GRAVES A, SCHMIDHUBER J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures[J]. Neural Networks, 2005, 18(5/6):602-610. http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=JJ029013030
    [35] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions[J]. arXiv preprint, arXiv:1409.4842v1 [cs.CV] 17 Sep 2014. http://cn.arxiv.org/abs/1409.4842
    [36] SHI B, BAI X, BELONGIE S. Detecting oriented text in natural images by linking segments[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 3482-3490.
    [37] TIAN S, LU S, LI C. WeText: Scene text detection under weak supervision[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 1501-1509.
    [38] QIN S, MANDUCHI R. Cascaded segmentation-detection networks for word-level text spotting[C]//2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). 2017: 1275-1282.
    [39] HU H, ZHANG C, LUO Y, et al. WordSup: Exploiting word annotations for character based text detection[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 4950-4959.
    [40] ZHANG Z, ZHANG C, SHEN W, et al. Multi-oriented text detection with fully convolutional networks[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016: 4159-4167.
    [41] HE T, HUANG W, QIAO Y, et al. Accurate text localization in natural image with cascaded convolutional text network[J]. arXiv preprint, arXiv:1603.09423v1 [cs.CV] 31 Mar 2016. http://cn.arxiv.org/abs/1603.09423
    [42] YAO C, BAI X, SANG N, et al. Scene text detection via holistic, multi-channel prediction[J]. arXiv preprint, arXiv:1606.09002v2 [cs.CV] 5 Jul 2016. http://cn.arxiv.org/abs/1606.09002
    [43] POLZOUNOV A, ABLAVATSKI A, ESCALERA S, et al. Wordfence: Text detection in natural images with border awareness[C]//2017 IEEE International Conference on Image Processing (ICIP). IEEE, 2017: 1222-1226.
    [44] DENG D, LIU H, LI X, et al. PixelLink: Detecting scene text via instance segmentation[J]. arXiv preprint, arXiv:1801.01315v1 [cs.CV] 4 Jan 2018. http://cn.arxiv.org/abs/1801.01315
    [45] YANG Q, CHENG M, ZHOU W, et al. Incep text: A new inception-text module with deformable PSROI pooling for multi-oriented scene text detection[C]//Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI). 2018: 1071-1077.
    [46] DAI Y, HUANG Z, GAO Y, et al. Fused text segmentation networks for multi-oriented scene text detection[J]. arXiv preprint, arXiv:1709.03272v4 [cs.CV] 7 May 2018. http://cn.arxiv.org/abs/1709.03272
    [47] HE W, ZHANG X Y, YIN F, et al. Deep direct regression for multi-oriented scene text detection[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 745-753.
    [48] JIANG F, HAO Z, LIU X. Deep scene text detection with connected component proposals[J]. arXiv preprint, arXiv:1708.05133v1 [cs.CV] 17 Aug 2017. http://cn.arxiv.org/abs/1708.05133
    [49] ZHOU X, YAO C, WEN H, et al. EAST: An efficient and accurate scene text detector[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 2642-2651.
    [50] KIM K H, HONG S, ROH B, et al. PVANET: Deep but lightweight neural networks for real-time object detection[J]. arXiv preprint, arXiv:1608.08021v3 [cs.CV] 30 Sep 2016. http://cn.arxiv.org/abs/1608.08021
    [51] JADERBERG M, SIMONYAN K, VEDALDI A, et al. Reading text in the wild with convolutional neural networks[J]. International Journal of Computer Vision, 2016, 116(1):1-20. doi:  10.1007/s11263-015-0823-z
    [52] GUPTA A, VEDALDI A, ZISSERMAN A. Synthetic data for text localisation in natural images[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016: 2315-2324.
    [53] LIAO M, SHI B, BAI X, et al. TextBoxes: A fast text detector with a single deep neural network[C]//31st AAAI Conference on Artificial Intelligence. 2017: 4161-4167.
    [54] LI H, WANG P, SHEN C. Towards end-to-end text spotting with convolutional recurrent neural networks[C]//2017 IEEE International Conference on Computer Vision (ICCV). IEEE, 2017: 5248-5256.
    [55] BUSTA M, NEUMANN L, MATAS J. Deep textspotter: An end-to-end trainable scene text localization and recognition framework[C]//Computer Vision (ICCV), 2017 IEEE International Conference on. IEEE, 2017: 2223-2231.
    [56] LIAO M, SHI B, BAI X. TextBoxes++: A single-shot oriented scene text detector[J]. IEEE Transactions on Image Processing, 2018, 27(8):3676-3690. doi:  10.1109/TIP.2018.2825107
    [57] BARTZ C, YANG H, MEINEL C. See: Towards semi-supervised end-to-end scene text recognition[J]. arXiv preprint, arXiv:1712.05404v1 [cs.CV] 14 Dec 2017. http://cn.arxiv.org/abs/1712.05404
    [58] LIU X, LIANG D, YAN S, et al. FOTS: Fast oriented text spotting with a unified network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2018: 5676-5685.
    [59] JADERBERG M, SIMONYAN K, VEDALDI A, et al. Synthetic data and artificial neural networks for natural scene text recognition[J]. arXiv preprint, arXiv:1406.2227v4 [cs.CV] 9 Dec 2014. http://cn.arxiv.org/abs/1406.2227
    [60] SHI B, BAI X, YAO C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(11):2298-2304. doi:  10.1109/TPAMI.2016.2646371
    [61] GRAVES A, FERNÁNDEZ S, GOMEZ F, et al. Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks[C]//Proceedings of the 23rd International Conference on Machine Learning. New York: ACM, 2006: 369-376.
    [62] JADERBERG M, SIMONYAN K, ZISSERMAN A. Spatial transformer networks[C]//Advances in Neural Information Processing Systems 27. NIPS, 2015: 2017-2025.
    [63] LUCAS S M, PANARETOS A, SOSA L, et al. ICDAR 2003 robust reading competitions: Entries, results, and future directions[J]. International Journal of Document Analysis and Recognition (IJDAR), 2005, 7(2/3):105-122. http://d.old.wanfangdata.com.cn/NSTLQK/NSTL_QKJJ021047811/
    [64] LUCAS S M. ICDAR 2005 text locating competition results[C]//8th International Conference on Document Analysis and Recognition (ICDAR'05). 2005: 80-84.
    [65] SHAHAB A, SHAFAIT F, DENGEL A. ICDAR 2011 robust reading competition challenge 2: Reading text in scene images[C]//Document Analysis and Recognition (ICDAR), 2011 International Conference on. IEEE, 2011: 1491-1496.
    [66] KARATZAS D, SHAFAIT F, UCHIDA S, et al. ICDAR 2013 robust reading competition[C]//International Conference on Document Analysis and Recognition. IEEE Computer Society, 2013: 1484-1493.
    [67] KARATZAS D, GOMEZ-BIGORDA L, NICOLAOU A, et al. ICDAR 2015 competition on robust reading[C]//International Conference on Document Analysis and Recognition. IEEE, 2015: 1156-1160.
    [68] NAYEF N, YIN F, BIZID I, et al. ICDAR2017 robust reading challenge on multi-lingual scene text detection and script identification-RRC-MLT[C]//2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). 2017: 1454-1459.
    [69] LEE S, CHO M S, JUNG K, et al. Scene text extraction with edge constraint and text collinearity[C]//2010 20th International Conference on Pattern Recognition. 2010: 3983-3986.
    [70] NAGY R, DICKER A, MEYER-WEGENER K. NEOCR: A configurable dataset for natural image text recognition[C]//Camera-Based Document Analysis and Recognition. Berlin: Springer, 2011: 150-163.
    [71] YI C, TIAN Y. Text string detection from natural scenes by structure-based partition and grouping[J]. IEEE Transactions on Image Processing, 2011, 20(9):2594-2605. doi:  10.1109/TIP.2011.2126586
    [72] RISNUMAWAN A, SHIVAKUMARA P, CHAN C S, et al. A robust arbitrary text detection system for natural scene images[J]. Expert Systems with Applications, 2014, 41(18):8027-8048. doi:  10.1016/j.eswa.2014.07.008
    [73] YAO C, BAI X, LIU W. A unified framework for multioriented text detection and recognition[J]. IEEE Transactions on Image Processing, 2014, 23(11):4737-4749. doi:  10.1109/TIP.2014.2353813
    [74] YIN X C, PEI W Y, ZHANG J, et al. Multi-orientation scene text detection with adaptive clustering[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9):1930-1937. doi:  10.1109/TPAMI.2014.2388210
    [75] ZHANG S Y. Deep models and their applications in visual text analysis[D]. Guangzhou: South China University of Technology, 2016. http://cdmd.cnki.com.cn/Article/CDMD-10561-1016770438.htm
    [76] VEIT A, MATERA T, NEUMANN L, et al. COCO-Text: Dataset and benchmark for text detection and recognition in natural images[J]. arXiv preprint, arXiv:1601.07140v2 [cs.CV] 19 Jun 2016.
    [77] SHI B, YAO C, LIAO M, et al. ICDAR2017 competition on reading chinese text in the wild (RCTW-17)[C]//Document Analysis and Recognition (ICDAR), 2017 14th IAPR International Conference on. IEEE, 2017: 1429-1434.
    [78] CHNG C K, CHAN C S. Total-text: A comprehensive dataset for scene text detection and recognition[C]//2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). 2017: 935-942.
    [79] LIU Y L, JIN L W, ZHANG S T, et al. Detecting curve text in the wild: New dataset and new solution[J]. arXiv preprint, arXiv:1712.02170v1 [cs.CV] 6 Dec 2017. http://cn.arxiv.org/abs/1712.02170
    [80] YUAN T L, ZHU Z, XU K, et al. Chinese text in the wild[J]. arXiv preprint, arXiv:1803.00085v1 [cs.CV] 28 Feb 2018. http://cn.arxiv.org/abs/1803.00085
    [81] HUA X S, LIU W Y, ZHANG H J. An automatic performance evaluation protocol for video text detection algorithms[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2004, 14(4):498-507. doi:  10.1109/TCSVT.2004.825538
    [82] WOLF C, JOLION J M. Object count/area graphs for the evaluation of object detection and segmentation algorithms[J]. International Journal of Document Analysis and Recognition (IJDAR), 2006, 8(4):280-296. doi:  10.1007/s10032-006-0014-0
    [83] EVERINGHAM M, ESLAMI S M A, GOOL L V, et al. The pascal visual object classes challenge: A retrospective[J]. International Journal of Computer Vision, 2015, 111(1):98-136. doi:  10.1007/s11263-014-0733-5
Publication history
  • Received:  2018-06-27
  • Published:  2018-09-25