Reading yesterday's news. Layout recognition by segmentation of historical newspaper pages

Christian Schultze (1), Niklas Kerkfeld (1), Kara Kuebart (2) ...+5 ((2) Institut für Geschichtswissenschaft, Universität Bonn)
Jan 2024
Abstract
Newspapers are important sources for historians interested in past societies' cultural values, social structures, and their changes. Since the 19th century, newspapers have been widely available and spread regionally. Today, historical newspapers are digitized but unavailable in a separate metadata-enhanced form. Machine-readable metadata, however, is a prerequisite for a mass statistical analysis of this source. This paper focuses on parsing the complex layout of historic newspaper pages, which today's machines do not understand well. We argue for using neural networks, which require detailed annotated data in large numbers. Our Bonn newspaper dataset consists of 486 pages of the Kölnische Zeitung from the years 1866 and 1924. We propose solving the newspaper-understanding problem by training a U-Net on our new dataset, which delivers satisfactory performance.
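To make the U-Net idea mentioned in the abstract more concrete, the following is a minimal sketch that only traces tensor shapes through a generic U-Net-style encoder-decoder for page segmentation. It is not the authors' architecture: the function name `unet_shape_trace`, the channel-doubling scheme, the depth, the single-channel input, and the number of layout classes are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Stage = Tuple[str, int, int, int]  # (stage name, channels, height, width)


def unet_shape_trace(
    height: int,
    width: int,
    base_channels: int = 16,  # assumed width of the first encoder block
    depth: int = 3,           # assumed number of down/up-sampling steps
    num_classes: int = 4,     # hypothetical number of layout classes
) -> List[Stage]:
    """Trace feature-map shapes through a generic U-Net-style network.

    Illustrative only: all hyperparameters here are assumptions,
    not values taken from the paper.
    """
    factor = 2 ** depth
    if height % factor or width % factor:
        raise ValueError(f"height and width must be divisible by {factor}")

    trace: List[Stage] = []
    skips: List[Tuple[int, int, int]] = []
    h, w = height, width

    # Encoder: each level doubles the channels, stores a skip connection,
    # then halves the spatial resolution.
    for level in range(depth):
        ch = base_channels * 2 ** level
        trace.append((f"enc{level}", ch, h, w))
        skips.append((ch, h, w))
        h, w = h // 2, w // 2

    # Bottleneck at the coarsest resolution.
    ch = base_channels * 2 ** depth
    trace.append(("bottleneck", ch, h, w))

    # Decoder: upsample, concatenate the matching encoder features,
    # then reduce the channels back to the encoder width of that level.
    for level in reversed(range(depth)):
        h, w = h * 2, w * 2
        skip_ch, skip_h, skip_w = skips[level]
        assert (h, w) == (skip_h, skip_w), "skip connection must match in size"
        up_ch = ch // 2  # upsampling step halves the channel count
        trace.append((f"dec{level}_concat", up_ch + skip_ch, h, w))
        ch = skip_ch
        trace.append((f"dec{level}", ch, h, w))

    # Final 1x1 projection: one score per layout class for every pixel.
    trace.append(("logits", num_classes, h, w))
    return trace


if __name__ == "__main__":
    for stage in unet_shape_trace(64, 96):
        print(stage)
```

The point of the sketch is that the output has the same height and width as the input page, so every pixel receives a class score. This is what allows a segmentation network of this kind to mark layout regions directly on a scanned newspaper page.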