Metamers of Deep Networks Reveal Divergence From Human Perceptual Systems

Jenelle Feather, Massachusetts Institute of Technology


Deep neural networks have been embraced as models of sensory systems, instantiating representational transformations that appear to resemble those in the visual and auditory systems. To investigate their similarity to biological systems more thoroughly, we synthesized model metamers: stimuli that produce the same responses at some stage of a network's representation. We generated these for natural stimuli by matching the responses of individual layers of image and audio networks trained to recognize objects and speech, respectively. We then measured whether model metamers were recognizable to human observers, a necessary condition for the model representations to replicate those of humans. Many of the model metamers were unrecognizable to humans. However, the metamer failure modes suggested architectural and task modifications that better aligned the network representations with perception, making the model metamers more recognizable to humans. We further demonstrate that the metamer test can be used to compare the representations learned with different tasks or architectures. The results reveal discrepancies between model and human representations, but also show how model metamers can help guide model refinement.
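As a rough illustration of the synthesis idea described above, the following minimal sketch generates a metamer for a toy model stage by gradient descent on the input. Everything here is an assumption made for illustration: the "layer" is a small random tanh projection rather than a trained image or audio network, and the names `layer`, `synthesize_metamer`, `weights`, and `reference` are hypothetical. It is not the authors' implementation.

```python
import math
import random


def layer(weights, x):
    """Toy stand-in for one network stage: tanh of a linear projection (assumed)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def synthesize_metamer(weights, reference, steps=2000, lr=0.05, seed=1):
    """Start from noise and adjust the input until its layer responses
    match those evoked by the reference stimulus."""
    rng = random.Random(seed)
    target = layer(weights, reference)
    x = [rng.gauss(0.0, 1.0) for _ in reference]  # noise initialization
    history = []
    for _ in range(steps):
        act = layer(weights, x)
        # Squared error between current and target responses
        history.append(sum((a - t) ** 2 for a, t in zip(act, target)))
        # Gradient w.r.t. pre-activations, passed back through tanh
        delta = [2.0 * (a - t) * (1.0 - a * a) for a, t in zip(act, target)]
        # Gradient w.r.t. the input: grad_j = sum_i delta_i * W[i][j]
        grad = [sum(d * row[j] for d, row in zip(delta, weights))
                for j in range(len(x))]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    history.append(sum((a - t) ** 2 for a, t in zip(layer(weights, x), target)))
    return x, history


rng = random.Random(0)
n_in, n_out = 8, 4  # more inputs than outputs, so many inputs share one response
weights = [[rng.gauss(0.0, 0.3) for _ in range(n_in)] for _ in range(n_out)]
reference = [rng.gauss(0.0, 1.0) for _ in range(n_in)]

metamer, history = synthesize_metamer(weights, reference)
print("initial response mismatch:", history[0])
print("final response mismatch:  ", history[-1])
print("largest input difference: ",
      max(abs(m - r) for m, r in zip(metamer, reference)))
```

Because this toy stage maps eight inputs to four responses, the optimized input can match the reference's responses while remaining a different stimulus, which is the defining property of a metamer. Whether such a stimulus is recognizable to a human observer is the separate behavioral question the abstract addresses.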

Abstract Author(s): Jenelle Feather, Alex Durango, Josh McDermott