Open Access Te Herenga Waka-Victoria University of Wellington

360° Image Manipulation from Inferred Geometry

thesis
posted on 2025-06-14, 11:48 authored by Kun Huang

This thesis explores innovative methods for enhancing the manipulation and geometric understanding of 360° images in virtual reality (VR) applications, with a focus on depth estimation, surface normal prediction, and multi-task learning. The research presents a comprehensive approach to improving the manipulation of stereo 360° images, particularly in the context of image composition. A novel method is proposed to seamlessly integrate new visual elements into stereo 360° scenes, thereby enhancing user interaction and immersion in VR environments.

To support such advanced editing applications, a new technique for estimating depth maps from monocular 360° images is introduced. This method leverages both local and global scene information, significantly improving the accuracy of depth predictions. This is particularly crucial for applications that require spatial awareness, such as virtual and augmented reality, where accurate depth perception is key to realistic interaction with the virtual environment.

Furthermore, the thesis introduces a hybrid approach that combines convolutional neural networks (CNNs) and Vision Transformers (ViTs) to improve surface normal estimation. This method takes advantage of CNNs' ability to capture fine-grained details and ViTs' strength in modeling global context, resulting in more precise surface geometry analysis from 360° imagery. This enhanced surface normal estimation plays a vital role in better understanding the spatial structure of the scene.

In addition, the research demonstrates the effectiveness of multi-task learning (MTL) for comprehensive scene geometry understanding. By predicting both depth and surface normals simultaneously from monocular 360° images, the proposed MTL framework delivers a more detailed and coherent geometric representation, further contributing to the realism and immersion of VR scenarios.

Through these contributions, the thesis significantly advances the field of 360° image manipulation and scene understanding, offering new tools and methodologies that pave the way for more immersive, interactive, and spatially aware experiences in VR applications.

History

Copyright Date

2025-06-14

Date of Award

2025-06-14

Publisher

Te Herenga Waka—Victoria University of Wellington

Rights License

Author Retains Copyright

Degree Discipline

Computer Graphics

Degree Grantor

Te Herenga Waka—Victoria University of Wellington

Degree Level

Doctoral

Degree Name

Doctor of Philosophy

ANZSRC Socio-Economic Outcome code

220406 Graphics

ANZSRC Type Of Activity code

1 Pure basic research

Victoria University of Wellington Item Type

Awarded Doctoral Thesis

Language

en_NZ

Alternative Language

en_NZ

Victoria University of Wellington School

School of Engineering and Computer Science

Advisors

Zhang, Fanglue; Dodgson, Neil