Effect of Encoded Theories of Visual Perception in Computational Aesthetics
Aesthetics is an important factor in the appreciation and assessment of visual arts. Several theories of visual perception have been proposed and taught. Research on algorithmically modeling human aesthetic assessment is recent and does not cover all aspects of aesthetic appraisal: color and the unverified Rule-of-Thirds composition guideline are the most widely encoded, but other factors have received little attention. Current research has not yet correlated aesthetic assessment methods with theories of visual perception. Encoding these theories can enhance the accuracy and efficiency of computational aesthetics research. This thesis addresses the visual perceptual features of positioning and size, including how they relate to theories of balance and perception. It describes the design and implementation of an optimized saliency generation algorithm that models both the object detection and gaze fixation abilities of the human vision system, allowing the extraction of relevant image features. It also details the design and conduct of four experiments that identify the optimum spatial positioning and size of salient regions for achieving aesthetically pleasing images. Finally, these two features are encoded as “visual balance” image features for aesthetic assessment and tested in a verification experiment with cropped images. Results show that this method produces image crops perceived to be more aesthetically pleasing than those in ground truth datasets, explain the validity of the Rule-of-Thirds guideline in limited situations, and reveal no difference in instinctive aesthetic assessment ability between participants with and without prior arts experience.