
MVTN: Learning Multi-View Transformations for 3D Understanding

Abdullah Hamdi, Faisal AlZahrani, Silvio Giancola, Bernard Ghanem
Dec 2022
Abstract
Multi-view projection techniques have shown themselves to be highly effective in achieving top-performing results in the recognition of 3D shapes. These methods involve learning how to combine information from multiple view-points. However, the camera view-points from which these views are obtained are often fixed for all shapes. To overcome the static nature of current multi-view techniques, we propose learning these view-points. Specifically, we introduce the Multi-View Transformation Network (MVTN), which uses differentiable rendering to determine optimal view-points for 3D shape recognition. As a result, MVTN can be trained end-to-end with any multi-view network for 3D shape classification. We integrate MVTN into a novel adaptive multi-view pipeline that is capable of rendering both 3D meshes and point clouds. Our approach demonstrates state-of-the-art performance in 3D classification and shape retrieval on several benchmarks (ModelNet40, ScanObjectNN, ShapeNet Core55). Further analysis indicates that our approach exhibits improved robustness to occlusion compared to other methods. We also investigate additional aspects of MVTN, such as 2D pretraining and its use for segmentation. To support further research in this area, we have released MVTorch, a PyTorch library for 3D understanding and generation using multi-view projections.
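
To make the contrast between fixed and learned view-points concrete, the following is a minimal illustrative sketch in plain Python, not the authors' implementation and not the MVTorch API. It assumes that a view-point is parameterized by azimuth and elevation angles at a fixed camera distance, and that a learned model outputs bounded per-view offsets in [-1, 1]. The function names (`camera_position`, `circular_views`, `adapt_views`) and the bounds `azim_bound` and `elev_bound` are hypothetical. The sketch covers only the geometry; in the paper the view-points are learned end-to-end through differentiable rendering, which is not reproduced here.

```python
import math


def camera_position(azim_deg, elev_deg, distance=2.2):
    """Convert azimuth/elevation (degrees) and distance into an (x, y, z)
    camera position on a sphere centered at the origin (y axis is up)."""
    azim = math.radians(azim_deg)
    elev = math.radians(elev_deg)
    x = distance * math.cos(elev) * math.sin(azim)
    y = distance * math.sin(elev)
    z = distance * math.cos(elev) * math.cos(azim)
    return (x, y, z)


def circular_views(num_views, elev_deg=35.0):
    """Fixed baseline: evenly spaced azimuths at one elevation,
    identical for every shape."""
    return [(360.0 * i / num_views, elev_deg) for i in range(num_views)]


def adapt_views(base_views, offsets, azim_bound=30.0, elev_bound=20.0):
    """Hypothetical adaptive step: shift each base view by a per-shape
    offset in [-1, 1], scaled by the given angular bounds (assumed values)."""
    adapted = []
    for (azim, elev), (d_azim, d_elev) in zip(base_views, offsets):
        # Clamp offsets so the adapted view stays near its base view.
        d_azim = max(-1.0, min(1.0, d_azim))
        d_elev = max(-1.0, min(1.0, d_elev))
        new_azim = (azim + azim_bound * d_azim) % 360.0
        # Keep elevation away from the poles.
        new_elev = max(-89.0, min(89.0, elev + elev_bound * d_elev))
        adapted.append((new_azim, new_elev))
    return adapted


if __name__ == "__main__":
    base = circular_views(4)
    # Made-up offsets standing in for a network's prediction for one shape.
    offsets = [(0.5, -0.2), (0.0, 0.0), (-1.0, 1.0), (0.3, 0.1)]
    for azim, elev in adapt_views(base, offsets):
        print(camera_position(azim, elev))
```

With all offsets set to zero, `adapt_views` returns the fixed circular configuration unchanged, so the fixed-view setting is a special case of the adaptive one in this sketch.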
