Visual Experience as Theory

Aug 16, 2020

Consider the image above. With a glance, you know many things about it. If I asked you to describe it, you would likely come up with something like: “It is a grid with short red horizontal lines at the centers of cells. White lines consisting of the left and top boundaries of grid cells zigzag up and to the right across the grid, but there is also a horizontal band halfway down where all four boundaries appear as lines, and a vertical band halfway across with the same property.” All of these aspects are available in your experience of the image within a second or so of seeing it. It feels as if you see the image “as it is”, including these structural aspects. But this is not the right account, as the following examples will show.

Have a look at this pair of images.

Can you spot differences between left and right? Maybe, but you would have to spend time on it. There are two dozen differences between the two. A version in which the differences are displayed in black appears at the end of this note. The fact that no difference enters awareness on seeing the images means that your visual awareness does not contain representations of all the individual segments that appear in the image. What does happen is this. As you move your eyes across the image (via saccades), the few segments that are in foveal range and are granted attention can be said to be present in visual awareness at that moment, but only at that moment. They are forgotten almost immediately when the eye moves on.

Your visual mind contains theories of images, not copies thereof. For example, your theory of the image at the top of this note was something like the zigzag theory given above. Similarly, your theories of the next two images ran something like “a grid of short white lines randomly oriented on a red field”. You could not see any difference between the images, because the same theory applies to both.
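To make the point concrete, here is a minimal Python sketch of how two different images can share one theory. Everything in it is hypothetical: the grid encoding, the `theory_of` function, and the choice of what the theory records are assumptions made for illustration, not a model of actual visual processing.

```python
import random

ORIENTATIONS = ["horizontal", "vertical", "diagonal_up", "diagonal_down"]


def random_grid(rows, cols, seed):
    """Build a grid whose cells each hold one randomly oriented segment."""
    rng = random.Random(seed)
    return [[rng.choice(ORIENTATIONS) for _ in range(cols)] for _ in range(rows)]


def theory_of(grid):
    """Return a coarse description of the grid (a stand-in for a 'theory').

    Assumption: the theory records only the grid size and whether the
    orientations look mixed, never the orientation of any particular cell.
    """
    kinds = {cell for row in grid for cell in row}
    texture = "randomly oriented" if len(kinds) > 1 else "uniform"
    return (len(grid), len(grid[0]), texture)


def count_differences(a, b):
    """Count cells whose segments differ between two same-sized grids."""
    return sum(x != y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


left = random_grid(10, 10, seed=1)
right = [row[:] for row in left]

# Change two dozen distinct cells, echoing the two dozen differences in the pair.
rng = random.Random(2)
all_cells = [(r, c) for r in range(10) for c in range(10)]
for r, c in rng.sample(all_cells, 24):
    current = right[r][c]
    right[r][c] = rng.choice([o for o in ORIENTATIONS if o != current])

print(theory_of(left) == theory_of(right))
print(count_differences(left, right))
```

In this toy setup the two grids differ in 24 cells, yet the coarse description is the same for both, so a viewer that holds only the description has nothing to compare.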

At this point, I’ll note that the zigzag theory is slightly wrong. If you look at the middle of the top of the lower left quadrant, you’ll see that there is a missing zig two lines down. Also, in the lower right quadrant, if you look along the diagonal, about two-thirds of the distance from upper left to lower right you’ll see a vertical rather than horizontal red line. So, your theory will need elaboration. This shows that your awareness of image A is limited by the fovea as well.

This is how vision works: at any given time, you have a theory of what is present in your visual field. As you redirect your gaze via saccades, you check the new foveal images against your theory of what they should contain. Usually, the match is good (because a billion years of evolution has tuned you well), and no theory revision is required. (Karl Friston’s free energy principle provides an account of how and why our theories, for the most part, predict incoming sensory data so well.)
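The check-and-revise cycle just described can be sketched in a few lines of Python. This is only a toy under stated assumptions: the dictionary-based `theory`, the helper names `predict`, `revise`, and `look_around`, and the 4x4 image are all invented for illustration, and the sketch makes no claim about how the brain (or the free energy principle) actually implements the cycle.

```python
def predict(theory, position):
    """What the theory says should be seen at a position."""
    return theory["exceptions"].get(position, theory["rule"])


def revise(theory, position, observed):
    """Return a new theory that records an exception at one position."""
    exceptions = dict(theory["exceptions"])
    exceptions[position] = observed
    return {"rule": theory["rule"], "exceptions": exceptions}


def look_around(image, theory, fixations):
    """Check each fixated cell against the theory; revise on a mismatch."""
    revisions = 0
    for position in fixations:
        observed = image[position]  # foveal detail at this fixation
        if observed != predict(theory, position):
            theory = revise(theory, position, observed)
            revisions += 1
        # The foveal detail itself is not stored; only the theory persists.
    return theory, revisions


# A 4x4 field of horizontal segments with one vertical exception,
# loosely echoing the stray vertical red line in image A.
image = {(r, c): "horizontal" for r in range(4) for c in range(4)}
image[(2, 3)] = "vertical"

theory = {"rule": "horizontal", "exceptions": {}}
fixations = [(0, 0), (1, 2), (2, 3), (3, 1), (2, 3)]
theory, revisions = look_around(image, theory, fixations)

print(revisions)             # 1
print(theory["exceptions"])  # {(2, 3): 'vertical'}
```

Most fixations match the prediction and change nothing. The single mismatch triggers one revision, and the second visit to the same cell then matches the revised theory.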

But what, more precisely, is contained in our awareness of an image? In the jargon of philosophy, such an awareness (such an appearance within the mind) is called a phenomenon, and the study thereof phenomenology. So we distinguish between the image itself (an array of pixels), and our awareness of it (a phenomenon). The phenomenon associated with an image includes what I have called a “theory”, such detail as is available from the fovea, and the vague shapes transmitted by peripheral vision. In addition, a phenomenon associated with an image can be characterized as a hierarchical structure. Points of color (pixels) are at the bottom of the hierarchy; then, in the case of the phenomenon arising from image A, we ascend to short line segments, followed by the aggregates: zigzags and linear assemblies of segments. Other images have other hierarchies, often quite deep. What I have called the theory assigns certain properties and relations to the elements at each level, such as parallelism and repetition-with-variation. The structure, in the case of image A, would contain propositions such as “white lines consisting of the left and top boundaries of grid cells zigzag up and to the right across the grid”. Some aspects of the structures are accessible to introspection (as in the zigzag case), but most are not. There is evidence about the latter from a number of sources, for example, from studies of texture perception.

This sort of structure can be represented mathematically, using the tools of geometry and mathematical logic. Possibly, in some distant future, a detailed correlation will be found between such a structure and brain states. (Such correlations have been found already, but only at a very coarse level.) So we have three topics to think about: image A, phenomenon A, and mathematical structure A.

This standpoint might work for the kinds of simple, grid-based, abstract images that we have been considering, but what about the realm of all images encountered by humans? Of course, the images we see in everyday life are almost always two-dimensional projections of a three-dimensional world, either directly, or via photographs and paintings. It seems likely that what we perceive in those images (their phenomena) has the same nature as the phenomena that arise from the simple images. We see details in the foveal field, and also theorize patterns of various sorts (repetitions, theories of illumination, theories of what familiar objects appear in the image and where, and so forth), but formulated in three dimensions.

I don’t mean that the mathematical structure which corresponds to a phenomenon presents itself to us as equations and logical formulae. Rather, the structure supports the operations that our mind applies to it, such as noting patterns of repetition in the image, recognizing familiar structure such as zigzags, and predicting outcomes of visual input as the eye moves through saccades. In the three-dimensional case, the operations involve noting where things are in space, and, for example, predicting their appearance as we approach them. The way in which the structure and its operations are coded (if that is the right word) is unknown, but they are partly visible to the inner eye of the phenomenologist (which is the same as any other eye, but apprehends its targets as phenomena rather than things in the external world).

The account that I have just given differs from what we feel is happening. We feel, in gazing at the images A, B and C, that what is present to us is the entire image as it actually is. On the contrary, the fact is that detail is missing except in the foveal area, and it is an illusion to feel that our awareness includes detail elsewhere. What dominates is what I have called our theory of the image. The illusion of complete awareness is probably buttressed by the possibility of shifting our gaze to any part of an image at any time. However, when the shift takes place, the preceding bit of foveal detail is immediately and completely forgotten, so there is no time at which the whole image is present to our awareness, except at the level of theory.

Invisible Patterns

Just because a pattern, however simple, is present in an image, does not mean that it will show up in the associated phenomenon. The phenomenon contains only those patterns which our visual minds are able to recognize and describe. These explicit descriptions are needed for awareness of the patterns. All that we see is built out of them. This is another sense in which it is incorrect to think that we see images as they are — it is better to say, with Anaïs Nin, that we see them as we are.

Here is a demonstration of this point. Do you see a pattern in this image?

There is one, and it is very simple. See the end of this note for a version of the image in which the pattern elements are displayed in black.

Mathematical Realism

Whether correlations will be discovered between brain states and the mathematical structures which formulate phenomena remains to be seen. But, in any case, the mathematical structures, if found, would not be the same as brain states. They reside in the universe of mathematics. This is where mathematical realism comes in.

Mathematical realism is the philosophical position that mathematical objects exist in their own right, independent of human thought. A mathematical realist can visualize a sort of cloud of mathematics around physical things. This cloud consists of the mathematical objects that exhibit systematic correlations to the things in question. This supports the following way of talking about the mathematical structure of physical reality. One interprets “physical thing X exhibits mathematical structure Y” as meaning “there exists the mathematical structure Y that exhibits such and such correlation to the physical object X”. This way of putting matters takes the realist view towards the mathematical object: it is an existent in its own right which correlates to the physical.

So, if one is a mathematical realist (as I am), then structure A simply resides in the mathematical universe. A question then arises. Is phenomenon A the same, identical thing as structure A? Maybe that’s how the world works: our awareness is constituted by mathematical structures (the Integrated Information Theorists think so). That is, the structures are not the subject matter of our perceptions, but are those perceptions. (A more detailed treatment of this topic can be found here.)

But this cannot be the whole story, for a basic reason: qualia. “Qualia” is the philosophical name for the basic “feels” that arise in experience, such as color or pain. Qualia cannot be explained by mathematics. But maybe phenomenon A consists of structure A painted, as it were, by qualia.

This line of inquiry might be mistaken. But phenomena are something and made of something — and that something is not brain tissue. That they are partly made of mathematical structure is made plausible by the fact that their structural features can be captured mathematically.

Postscript


Written by chrisGoad