DocScanner: Robust Document Image Rectification with Progressive Learning

Hao Feng, Wengang Zhou, Jiajun Deng, Qi Tian, Houqiang Li
Abstract
Compared to flatbed scanners, portable smartphones are much more convenient for digitizing physical documents. However, such digitized documents are often distorted due to uncontrolled physical deformations, camera positions, and illumination variations. To this end, this work presents DocScanner, a new deep network architecture for document image rectification. Different from existing methods, DocScanner addresses this issue by introducing a progressive learning mechanism. Specifically, DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture. The iterative refinements make DocScanner converge to robust and superior performance, and the lightweight recurrent architecture ensures running efficiency. In addition, having observed the corrupted rectified boundaries in prior works, DocScanner applies a document localization module before the rectification process to explicitly segment the foreground document from the cluttered background. To further improve the rectification quality, a geometric regularization, based on the geometric prior between the distorted and the rectified images, is introduced during training. Extensive experiments are conducted on the Doc3D dataset and the DocUNet benchmark dataset, and the quantitative and qualitative evaluation results verify the effectiveness of DocScanner, which outperforms previous methods on OCR accuracy, image similarity, and our proposed distortion metric by a considerable margin. Furthermore, our DocScanner shows the highest efficiency in inference time and parameter count.
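The progressive mechanism described in the abstract (one running estimate, corrected repeatedly by the same recurrent step) can be illustrated with a toy loop. This is only a sketch under stated assumptions, not the DocScanner network: `progressive_refine`, `toy_update`, and `TARGET` are hypothetical names, the estimate is a short list of numbers rather than an image or warping field, and the update step is a hand-written rule standing in for a learned recurrent unit.

```python
def progressive_refine(estimate, update_step, num_iters=8):
    """Keep a single running estimate and correct it with the same step each iteration."""
    current = list(estimate)
    history = [list(current)]
    for _ in range(num_iters):
        # The shared update step proposes a residual correction to the current estimate.
        residual = update_step(current)
        current = [c + r for c, r in zip(current, residual)]
        history.append(list(current))
    return current, history


# Hypothetical stand-in for a learned recurrent unit: it moves the estimate
# halfway toward a known target, only to show the loop converging.
TARGET = [1.0, -2.0, 0.5]


def toy_update(current):
    return [0.5 * (t - c) for t, c in zip(TARGET, current)]


final, history = progressive_refine([0.0, 0.0, 0.0], toy_update)
# Largest remaining error after each iteration; it halves every step in this toy setup.
errors = [max(abs(t - c) for t, c in zip(TARGET, h)) for h in history]
```

In this toy setting the error shrinks at every iteration because each pass reuses the same small update rule on the latest estimate, which is the general idea behind refining one estimate iteratively instead of predicting the result in a single shot.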