
StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized Tokenizer of a Large-Scale Generative Model

Zipeng Xu, Enver Sangineto, Nicu Sebe
Mar 2023
Abstract
Despite the progress made in the style transfer task, most previous work focuses on transferring only relatively simple features like color or texture, while missing more abstract concepts such as overall art expression or painter-specific traits. However, these abstract semantics can be captured by models like DALL-E or CLIP, which have been trained using huge datasets of images and textual documents. In this paper, we propose StylerDALLE, a style transfer method that exploits both of these models and uses natural language to describe abstract art styles. Specifically, we formulate the language-guided style transfer task as a non-autoregressive token sequence translation, i.e., from input content image to output stylized image, in the discrete latent space of a large-scale pretrained vector-quantized tokenizer. To incorporate style information, we propose a Reinforcement Learning strategy with CLIP-based language supervision that ensures stylization and content preservation simultaneously. Experimental results demonstrate the superiority of our method, which can effectively transfer art styles using language instructions at different granularities. Code is available at https://github.com/zipengxuc/StylerDALLE.
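
The two ideas named in the abstract, non-autoregressive token translation and reward-driven training, can be illustrated with a toy sketch. This is not the authors' implementation and makes several assumptions: the translator is a simple lookup table of logits rather than a neural network, the update rule is plain REINFORCE with a moving-average baseline (the abstract says only "a Reinforcement Learning strategy"), and `toy_reward` is a hypothetical stand-in for the CLIP-based style and content scores. All names (`VOCAB`, `SEQ_LEN`, `translate`, `train`, `toy_reward`) are illustrative.

```python
import math
import random

VOCAB = 8      # toy codebook size (assumption; real tokenizers use far more codes)
SEQ_LEN = 16   # toy number of image tokens per image (assumption)


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(src, out):
    # Hypothetical stand-in for CLIP-based supervision:
    # "style" rewards odd token ids, "content" rewards staying in the
    # same bucket (src // 2 == out // 2) as the source token.
    style = sum(t % 2 == 1 for t in out) / len(out)
    content = sum(s // 2 == t // 2 for s, t in zip(src, out)) / len(out)
    return 0.5 * style + 0.5 * content


def translate(logits, src, rng=None):
    # Non-autoregressive: every output token is predicted independently of
    # the other output tokens, so all positions could be decoded in parallel.
    out = []
    for s in src:
        probs = softmax(logits[s])
        if rng is None:   # greedy decoding
            out.append(max(range(VOCAB), key=probs.__getitem__))
        else:             # sampling, used during training
            out.append(rng.choices(range(VOCAB), weights=probs)[0])
    return out


def train(steps=3000, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = [[0.0] * VOCAB for _ in range(VOCAB)]  # logits[src_token][out_token]
    baseline = 0.0
    for _ in range(steps):
        src = [rng.randrange(VOCAB) for _ in range(SEQ_LEN)]
        out = translate(logits, src, rng)
        reward = toy_reward(src, out)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # REINFORCE: d log p(t | s) / d logit_k = 1[k == t] - p_k
        grads = [[0.0] * VOCAB for _ in range(VOCAB)]
        for s, t in zip(src, out):
            probs = softmax(logits[s])
            for k in range(VOCAB):
                grads[s][k] += (1.0 if k == t else 0.0) - probs[k]
        for s in range(VOCAB):
            for k in range(VOCAB):
                logits[s][k] += lr * advantage * grads[s][k]
    return logits


if __name__ == "__main__":
    src = list(range(VOCAB)) * 2
    trained = train()
    out = translate(trained, src)
    print("source tokens:    ", src)
    print("translated tokens:", out)
    print("toy reward:       ", toy_reward(src, out))
```

In the method described above, the source and output tokens would come from the pretrained vector-quantized tokenizer, and the reward would be computed from CLIP similarity between the decoded image and the style description; the sketch only shows how a sequence-level reward can train a translator that emits all tokens at once.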
