Predicting Human Attention using Computational Attention

Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
Mar 2023
Abstract
Most models of visual attention are aimed at predicting either top-down or bottom-up control, as studied using different visual search and free-viewing tasks. We propose Human Attention Transformer (HAT), a single model predicting both forms of attention control. HAT is the new state-of-the-art (SOTA) in predicting the scanpath of fixations made during target-present and target-absent search, and matches or exceeds SOTA in the prediction of taskless free-viewing fixation scanpaths. HAT achieves this new SOTA by using a novel transformer-based architecture and a simplified foveated retina that collectively create a spatio-temporal awareness akin to the dynamic visual working memory of humans. Unlike previous methods that rely on a coarse grid of fixation cells and experience information loss due to fixation discretization, HAT features a dense-prediction architecture and outputs a dense heatmap for each fixation, thus avoiding discretizing fixations. HAT sets a new standard in computational attention, which emphasizes both effectiveness and generality. HAT's demonstrated scope and applicability will likely inspire the development of new attention models that can better predict human behavior in various attention-demanding scenarios.
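The key contrast with grid-based methods can be illustrated with a minimal sketch of greedy scanpath decoding from per-step dense heatmaps. Note that `heatmap_fn` below is a hypothetical stand-in for a model like HAT, not the paper's actual interface; the toy Gaussian model and all parameter names are assumptions for illustration only.

```python
import numpy as np

def predict_scanpath(heatmap_fn, image, n_fixations=6):
    """Greedy scanpath decoding from per-step dense heatmaps.

    `heatmap_fn(image, scanpath)` is a hypothetical stand-in for a
    model that returns a dense (H, W) map scoring the next fixation,
    conditioned on the fixations made so far.
    """
    scanpath = []
    for _ in range(n_fixations):
        heat = heatmap_fn(image, scanpath)            # dense (H, W) map
        y, x = np.unravel_index(np.argmax(heat), heat.shape)
        scanpath.append((int(x), int(y)))             # pixel-precise; no coarse grid cells
    return scanpath

# Toy stand-in model: a Gaussian bump that drifts downward with each fixation.
def toy_heatmap(image, scanpath, size=(64, 64)):
    h, w = size
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = 10 + 5 * len(scanpath), 20
    return np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / 50.0)

path = predict_scanpath(toy_heatmap, image=None, n_fixations=3)
print(path)  # → [(20, 10), (20, 15), (20, 20)]
```

Because each fixation is read off a full-resolution heatmap rather than a fixation-cell grid, no spatial precision is lost to discretization; a grid-based method would instead snap each fixation to the center of its cell.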