# “Visual awareness” by Koenderink

What follows is an excerpt from the ebook Visual awareness by Jan Koenderink. The book is part of a collection, published by The Clootcrans Press.

What does it mean to be “visually aware”? One thing, due to Franz Brentano (1838-1917), is that all awareness is awareness of something. One says that awareness is intentional. This does not mean that the something exists otherwise than in awareness. For instance, you are visually aware in your dreams, when you hallucinate a golden mountain, remember previous visual awareness, or have pre-visions. However, the case that you are visually aware of the scene in front of you is fairly generic.

The mainstream account of what happens in such a generic case is this: the scene in front of you really exists (as a physical object) even in the absence of awareness. Moreover, it causes your awareness. In this (currently dominant) view the awareness is a visual representation of the scene in front of you. To the degree that this representation happens to be isomorphic with the scene in front of you the awareness is veridical. The goal of visual awareness is to present you with veridical representations. Biological evolution optimizes veridicality, because veridicality implies fitness.  Human visual awareness is generally close to veridical. Animals (perhaps with exception of the higher primates) do not approach this level, as shown by ethological studies.

JUST FOR THE RECORD these silly and incoherent notions are not something I ascribe to!

But it neatly sums up the mainstream view of the matter as I read it.

The mainstream account is incoherent, and may actually be regarded as unscientific. Notice that it implies an externalist and objectivist God’s Eye view (the scene really exists and physics tells how), that it evidently misinterprets evolution (for fitness does not imply veridicality at all), and that it is embarrassing in its anthropocentricity. All this should appear to you as in the worst of taste if you call yourself a scientist.  [p. 2-3]

___________________

I hold similar views, most recently expressed in the post Ideology in the vision theater (but not with the same mastery as Koenderink, of course). Recall that “computing with space”, which is the main theme of this blog/open notebook, is about rigorously understanding (and maybe using) the “computation” done by the visual brain, with the purpose of understanding what space IS. This is formulated in arXiv:1011.4485 as “Plato’s hypothesis”:

(A) reality emerges from a more primitive, non-geometrical, reality in the same way as
(B) the brain constructs (understands, simulates, transforms, encodes or decodes) the image of reality, starting from intensive properties (like a bunch of spiking signals sent by receptors in the retina), without any use of extensive (i.e. spatial or geometric) properties.
___________________
Never mind my motivations; the important message is that Koenderink’s critique is a hard-science point of view about a hard-science piece of research. It is not just a lexical game (although I recognize the value of such games as well, but as a mathematician I am naturally inclined towards hard science).

# Ideology in the vision theater

Thanks to Kenneth Olwig for suggesting that ideology may be related to the argument from the post On the exterior homunculus fallacy. More precisely, Olwig points to the following quote from The German Ideology by Karl Marx and Friedrich Engels:

If in all ideology men and their circumstances appear upside-down as in a camera obscura, this phenomenon arises just as much from their historical life-process as the inversion of objects on the retina does from their physical life-process. In direct contrast to German philosophy which descends from heaven to earth, here we ascend from earth to heaven. That is to say, we do not set out from what men say, imagine, conceive, nor from men as narrated, thought of, imagined, conceived, in order to arrive at men in the flesh. We set out from real, active men, and on the basis of their real life-process we demonstrate the development of the ideological reflexes and echoes of this life-process. The phantoms formed in the human brain are also, necessarily, sublimates of their material life-process, which is empirically verifiable and bound to material premises.

One of the first posts of this blog was The Cartesian Theater: philosophy of mind versus aerography, where I use the article “All that is landscape is melted into air: the `aerography’ of ethereal space”, Environment and Planning D: Society and Space 2011, volume 29, pages 519 – 532 by Olwig in order to argue that Dennett’s notion of the Cartesian Theater shows only one half of the whole homunculus fallacy. Indeed, Dennett’s theater is a theater in a box, the invention of Inigo Jones, already designed around the king (or homunculus, in Dennett’s argument), using geometrical perspective to give an appearance of reality to an artificial construct, the scenic space.

With any homunculus, I argue, comes also a scenic space, which has to be taken into account in any theory of mind, because it is just as artificial and leads to the same kind of fallacy as the homunculus. In the posts Towards aerography, or how space is shaped to comply with the perceptions of the homunculus and Theatron as an eye I further develop the subject by trying to see what becomes of the homunculus fallacy if we use not the theater in a box, but the old Greek theater instead (and apparently it stops being a fallacy, as homunculi and designed scenic spaces melt into oblivion and the gnomon, the generator of self-similarity, comes to the attention). Finally, in the post On the exterior homunculus fallacy I argue that the original homunculus fallacy does not depend on whether the homunculus is inside or outside the brain, which leads me to suppose that the neuroscientist who studies a fly’s vision system is an exterior homunculus with respect to the fly, and the lab is the scenic space of this homunculus. It means that any explanation of fly vision which makes use of arguments which are not physically embedded in the fly brain (like knowledge about the Euclidean structure of space) makes sense for the experimenter, but cannot be the real explanation, because the fly does not have a lab with a small homunculus inside its head.

Which brings me to the relation with ideology, which is more than a given point of view: it is a theater in a box which invites the infected host to take the place of the homunculus, watch the show and form an opinion based on the privileged position it occupies. But the opinion can be only one, carefully designed by the author of the ideology, the scenographer.
The scenic space needs an Inigo Jones; Inigo Jones is the ignored dual of the homunculus-king. He does not use magic in order to organize the show for the king, but he adds meaning. In the case of an ideology (a word which has as root a Greek word meaning “to see”, thanks again to Olwig for this) the added meaning is intentional. In the case of a neuroscientist who experiments on the vision system of a fly (what a king) it is unintended, but still present, in the form of assumptions which lead the experimenter to an explanation of fly vision which is different from what the fly does when seeing (likely an evolving graph with neurons as nodes and synapses as edges, which modifies itself according to the input, without any exterior knowledge about the experimenter’s lab and techniques).

# On the exterior homunculus fallacy

If we think about a homunculus outside the brain, the homunculus fallacy still functions.

This post continues the Vision theater series (part I, part II, part III, part IV, part V, part VI) and also links to the recent post Another discussion about computing with space.

The homunculus argument is a fallacy arising most commonly in the theory of vision. One may explain (human) vision by noting that light from the outside world forms an image on the retinas in the eyes and something (or someone) in the brain looks at these images as if they are images on a movie screen (this theory of vision is sometimes termed the theory of the Cartesian Theater: it is most associated, nowadays, with the psychologist David Marr). The question arises as to the nature of this internal viewer. The assumption here is that there is a ‘little man’ or ‘homunculus’ inside the brain ‘looking at’ the movie.

The reason why this is a fallacy may be understood by asking how the homunculus ‘sees’ the internal movie. The obvious answer is that there is another homunculus inside the first homunculus’s ‘head’ or ‘brain’ looking at this ‘movie’. But how does this homunculus see the ‘outside world’? In order to answer this, we are forced to posit another homunculus inside this other homunculus’s head and so forth. In other words, we are in a situation of infinite regress. The problem with the homunculus argument is that it tries to account for a phenomenon in terms of the very phenomenon that it is supposed to explain.

Suppose instead that the homunculus is outside the brain. Think, for example, about the experimenter doing research on your vision. The fallacy functions just as well, because now we have another homunculus (outside the brain) who looks at the movie screen (i.e. the measurements he performed on your visual system, in the medium controlled by him). “But how does this homunculus see the ‘outside world’?” Infinite regress again.

If you think that is outrageous, then let me give you an example. The exterior homunculus (experimenter) explains your vision by interpreting the controlled space he put you in (the lab) and the measurements he performed. When he makes this interpretation he relies on:

• physical laws
• geometrical assumptions
• statistical assumptions

at least. Suppose that the experimenter says: “the subject [i.e. you] was presented with a red apple, at distance $d$, at coordinates $x,y,z$. By the physical laws of optics and by the geometrical setting of the controlled lab we know that the sensor $S$ of the retina of the left eye was stimulated by the light coming from the apple. We recorded a pattern of activity in the regions $A, B, C$ of the brain, and we know from other knowledge (and statistical assumptions) that $A$ is busy with recognition of fruits, $B$ is involved in contour recognition and $C$ with memories from childhood.” I agree that this is a completely bogus simplification of what a real experimenter would say, but bear with me when I claim that the knowledge used by the experimenter for explaining how you see the apple has not much to do with the way you see and recognize the apple. In the course of the explanation, the experimenter used knowledge about the laws of optics, used measurements which are outside you, like coordinates and geometric settings in the lab, and even notions from the experimenter’s head, such as “red”, “apple” and “contours”.

Should the experimenter not rely on physical laws? Or on geometrical assumptions (for instance, that the lab sits in a piece of Euclidean 3d space)? Of course he can rely on those. We recognize physical laws as physical because they are invariant with respect to the observer (i.e. they change in a predictable way when the observer changes). Likewise, we recognize geometrical assumptions as geometrical because they are invariant with respect to the parametrization (which in the lab appears as the privilege of the observer).

But just as optics can explain only what happens to the light until it hits the retina, and no more, the assumptions in the head of the experimenter, even the physical and geometrical ones, cannot be used as an explanation for the way you see. Simply put, it is much more likely that you don’t have a lab in your head which sits in a Euclidean space, with an apple, a lamp and rules for measuring distances and angles.

You may say that everybody knows that apples are not red, and that this is a cheap shot, because apples scatter light of all frequencies and it just happens that the sensors in our retinas are more sensitive to some frequencies than to others. Obvious. However, it seems that not many recognize that contours are as immaterial as colors: they are in the mind, not in reality, as Koenderink writes in Theory of “Edge-Detection”, JJ Koenderink, Analysis for Science, Engineering and Beyond, 2012.

The explanation of vision which uses an exterior homunculus leads to infinite regress unless we also explain how the exterior homunculus thinks about all these exterior facts, such as lab coordinates, rational deductions from the laws of physics and so on. It is outrageous, but there is no other way.

Let’s forget about experiments on you and let’s think about experiments on fly vision. Any explanation of fly vision which uses knowledge which is not, somehow, embodied in the fly brain, falls into the (exterior)  homunculus fallacy.

So what can be done instead? Should we rely on magic, or say that no explanation is possible, because any explanation will be issued by an exterior homunculus? Of course not. When studying vision, nobody in their right mind doubts the laws of optics. They are science (i.e. reproducible and falsifiable). But they don’t explain all of vision, only the first, physical step. Likewise, we should strive to give explanations of vision which are scientific, but which do not appeal to a ghost, in the machine or outside the machine.

Up to now, I think that is the best justification for the effort to understand space in a way which is not passive.

# Another discussion about computing with space

Computing with space vs space computing.

Space (the real space we all share) is not made of points. A point is an abstraction, the unattainable goal of a thought experiment, an atom of thought. Or a string of numbers (when we think with coordinates). Quantum physics tells us we can’t perform, even in principle, a physical experiment with the goal of exactly localizing the position of an object in space.

That’s pop philosophy. It might even be wrong (for example, what quantum physics tells us is that we can’t perform physical experiments for exactly localizing a particle in the phase space (position, momentum), not in the real space, whatever that means).
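For reference, the phase space statement alluded to here is usually written as the position-momentum uncertainty relation. The notation is the standard textbook one, not taken from this post: $\Delta x$ and $\Delta p$ are the standard deviations of position and momentum along the same axis, and $\hbar$ is the reduced Planck constant.

$$\Delta x \, \Delta p \;\ge\; \frac{\hbar}{2}$$

The bound constrains joint localization in the $(x, p)$ plane. By itself it does not forbid making $\Delta x$ as small as one likes, at the price of a large $\Delta p$.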

That’s also the turf of theoretical physicists: there are several theories about the structure of space, with varying degrees of mathematical soundness. I shall not go further in this direction.

Instead, I want to make a case for a biology-inspired point of view. I made it before, repeatedly, starting with More than discrete or continuous: a bird’s view, but now I have a few more tools to tackle it, and a bit more patience not to hurry to conclusions.

So, if you prefer the red pill, then read this. Can you think about space in terms of what it does, not what it is? Can you describe space as seen by a fly, or by a toddler, or do you need to stick to Cartesian conventions and then fall into the trap of continuous vs discrete, and so on?

Think like this: you are a fly and you have $10^5$ neurons and $10^7$ synapses. You are very good at flying by using about 10-20 actuators, and you see really well because most of your brain is busy with that. Now, where in that brain, and how exactly, is there room for a representation of a Euclidean 3d space? Mind you that humans have very little idea about how fly brains are capable of doing this and also that, with their huge brains and their fast computers (much faster and much bigger than a fly’s brain), they have not yet succeeded in making a robot with the same competences as a fly. (They will make one, there is no magic involved, but the constraints are really hard: an autonomous fly which can survive with an energy consumption comparable to that of a real fly, without exterior computing or human help, in a natural environment, for 24 hours, find food, avoid traps and eventually mate.)

So, after this motivating example, I state my hypothesis: whatever space is (and that’s a very hard and old problem), let’s not think about it passively (like a robot fly which is driven by some algorithms which use advanced human knowledge about euclidean geometry, systems of coordinates and the laws of mechanics) as being a receptacle, a something. Let’s think about space as described by what you can do in it.

The fly, for example, cannot possibly have a passive representation of space in the brain (and for us it is the same), but it does have the possibility to manipulate its actuators as a function of what it sees (i.e. of what its receptors perceive and send further to the brain) and of the state of its brain (and maybe of the history of that state, i.e. memory, stored in a mysterious way in the same tiny brain). However, actuators, sensors, brain and the environment are just one system; there is no ghost in, or outside, that fly machine.

My hypothesis is that for the fly, that’s space. For us it is the same, but we are far more complex than the fly. However, deep in our brains there are “patterns” (are they assemblies of neurons, are they patterns of synaptic activity, is it chemical, electric, …?) which are very basic (a child learns to see in the first months) and which are space, for us.

Now I’ll get mathematical. There are spaces everywhere in math, for example when we say: that’s a vector space, that’s a manifold, or even this is a group, a category, and so on. We talk like this, but what we actually have (in the mind) is not a manifold, or a vector space, a group or a category, but some short collection of patterns (rules, moves, axioms) which can be applied to the said objects. And that is enough for doing mathematics. This can be formalized: for example, it is enough to have some simple rules involving gates with two inputs and one output (the dilations), and we can prove that these simple rules describe all the concepts associated with any vector space, without using any external knowledge at any moment. A dilation is simply the pattern of activities related to map making.
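A minimal sketch of what such a gate can look like in the simplest case. It assumes the usual dilation of a vector space, $\delta^x_\varepsilon y = x + \varepsilon (y - x)$; the function names `dilate` and `approx_sum` are illustrative and come from no library. Composing three dilation gates gives an approximate sum based at $x$, which tends to $u + v - x$ as $\varepsilon \to 0$, so addition is recovered from dilations alone.

```python
from fractions import Fraction

def dilate(eps, x, y):
    """Two-input gate: dilation of coefficient eps, based at x, applied to y."""
    return tuple(xi + eps * (yi - xi) for xi, yi in zip(x, y))

def approx_sum(eps, x, u, v):
    """Approximate sum of u and v based at x, built only from dilation gates."""
    base = dilate(eps, x, u)          # shrink u towards x
    inner = dilate(eps, base, v)      # shrink v towards the new base point
    return dilate(1 / eps, x, inner)  # blow the result up again from x

# Hypothetical sample points in the plane; exact rational arithmetic.
x, u, v = (0, 0), (1, 2), (3, -1)
for k in (1, 3, 6):
    eps = Fraction(1, 10**k)
    print(eps, approx_sum(eps, x, u, v))
```

In this flat example the error is exactly $\varepsilon (u - x)$, so the printed points approach $u + v - x = (4, 1)$ as $\varepsilon$ decreases.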

So, according to my hypothesis, a generic vector space is this collection of rules. When it comes to its dimension, there are supplementary relations to be added in order to be able to say that we speak about a 3d vector space, but it will always be about a generic 3d vector space. There is no concrete 3d space. When we say, for example, that we live in a 3d space, what we really say is that some of the things we can do in a generic 3d space can also be done in reality (i.e. we can perform experiments showing this, although there are studies showing that our brain is almost as bad at perceiving relations in real space which correspond to theorems in geometry as it is at logical reasoning, when we force it to do that).

Conclusion for this part: there may or may not be a passive space; the important thing is that when we think with space and about space, what we really do is use a collection of primitive patterns of thought about it.

Now, going to the Turing machine, it lacks space. Space can be used for enumeration, for example, but the Turing machine avoids this by supposing there is a tape (an ordered set). It is proved that enumeration (i.e. the thing which most resembles space in the world of Turing machines) does not matter, in the sense that it can be arbitrarily changed and still the conclusions and definitions from the field do not change. This is like saying that the Turing machine is a geometrical object. But there is no geometrical description of a Turing machine (as far as I know) which does not use enumeration. This is like saying that CS people can understand the concept of a sphere in terms of atlases, parametrizations and changes between those, but they can’t define spheres without them. Geometers can, and for this reason they can speak about what is intrinsically geometric about the sphere and what is only an artifact of the chosen coordinate system. In this sense, geometers are like flies: they know what can be done on a sphere without needing to know anything about coordinate systems.
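The difference between what is intrinsic and what is an artifact of the chart can be made concrete with a small sketch. Everything here is an illustrative assumption: the two polar charts (one with its pole on the $z$ axis, one on the $x$ axis), the function names and the sample points. The coordinates of a point change with the chart, while the great-circle distance between two points does not.

```python
import math

def chart_z_to_xyz(theta, phi):
    """Polar chart with pole on the z axis -> point of the unit sphere."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def xyz_to_chart_x(p):
    """The same point, read in a polar chart whose pole is on the x axis."""
    x, y, z = p
    theta = math.acos(max(-1.0, min(1.0, x)))  # clamp against rounding
    phi = math.atan2(z, y)
    return theta, phi

def chart_distance(a, b):
    """Great-circle distance from (colatitude, longitude) pairs of one chart."""
    (t1, p1), (t2, p2) = a, b
    c = (math.cos(t1) * math.cos(t2)
         + math.sin(t1) * math.sin(t2) * math.cos(p1 - p2))
    return math.acos(max(-1.0, min(1.0, c)))

# Two points given in the z-pole chart (hypothetical values).
a_z, b_z = (0.3, 1.0), (1.2, 2.5)
# The same two points, re-expressed in the x-pole chart.
a_x = xyz_to_chart_x(chart_z_to_xyz(*a_z))
b_x = xyz_to_chart_x(chart_z_to_xyz(*b_z))

d_z = chart_distance(a_z, b_z)
d_x = chart_distance(a_x, b_x)
print(a_z, a_x)  # the coordinates differ between the charts
print(d_z, d_x)  # the distances agree up to rounding
```

Note that the sketch still goes through coordinates at every step, which is exactly the CS-style description; it only checks that the answer does not depend on which chart was used.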

# Right angles everywhere (II), about the gnomon

In this post I shall write about the gnomon. According to Wikipedia,

The gnomon is the part of a sundial that casts the shadow. Gnomon (γνώμων) is an ancient Greek word meaning “indicator”, “one who discerns,” or “that which reveals.”

In the next figure are collected the minimal ingredients needed for understanding the gnomon: the sun, a vertical shape and its horizontal shadow.

That is the minimal model of the ancient Greek visual universe: the sun, a man and his shadow on the beach. It is a speculation, but to me a gnomon seems to be a visual atom.

The Pythagoreans extracted the pattern from this minimal visual universe and used it to give an explanation of human vision, described by the next figure.

Here the sun is replaced by the eye (of a god, initially, but the pattern might apply to a mortal also), the light rays emanating from the sun are identified with the lines of vision (from here comes the misconception that the ancient Greeks really believed that the eyes shoot rays which illuminate the field of vision) and the indivisible pair man-shadow becomes the L-shape of a gnomon. An atom of vision.

Here comes a second level of understanding the gnomon, also of Pythagorean flavor. I cite again from the wiki page:

Hero defined a gnomon as that which, added to an entity (number or shape), makes a new entity similar to the starting entity.

This justifies Euclid’s picture of the gnomon, as a generator of self-similarity:

(image taken from the wiki page on gnomon)
So maybe the word “atom” is less appropriate than “generator”. In conclusion, according to the ancient Greeks, a gnomon (be it a triple sun-man-shadow or a pair eye and elementary L-shape) is the generator of visual perception, via the mechanism of self-similarity.
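Hero’s definition can be checked on a classical numerical example, the odd numbers as gnomons of squares: adding an L-shaped border of $2n + 1$ unit cells to an $n \times n$ square gives an $(n+1) \times (n+1)$ square, which is again a square. A minimal sketch (the function names are illustrative):

```python
def square_gnomon(n):
    """Cells in the L-shaped border that turns an n x n square into (n+1) x (n+1)."""
    return 2 * n + 1

def grow_square(steps):
    """Start from a 1 x 1 square and repeatedly add gnomons; return the cell counts."""
    cells = 1
    history = [cells]
    for n in range(1, steps + 1):
        cells += square_gnomon(n)  # the shape stays similar to the initial square
        history.append(cells)
    return history

print(grow_square(4))  # [1, 4, 9, 16, 25]
```

Every entry of the list is a perfect square, so the figure obtained after each step is similar to the starting one, which is the self-similarity the definition asks for.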

In their architecture, they tried to make this obvious, readable. Because it’s scalable (due to the relation with self-similarity), the architectural solution of constructing with gnomons invaded the world.

# Right angles everywhere (I)

Look at almost any building in the contemporary city: it is constructed from right angles, assembled into rectangles, assembled into boxes. We expect, in fact, a room to have a rectangular floor, with vertical walls meeting at right angles. Exceptions are due either to architectural fancies or to historical constraints or mistakes.

When a kid draws a house, it looks like a rectangle, with the  triangle of the roof on top.

Is this normal? Where does this obsession with the right angle come from?

The answer is that behind any right angle a gnomon is hidden. We build like this because we are Pythagoras’ children, living by the rules and categories of our cultural ancestors, the ancient Greeks.

Let’s see:
(I) In ancient times, or in places far from the Greeks (and Babylonians), other architectural forms were preferred, like the roundhouse. Here’s a Scottish broch (image taken from this wiki page)

and here’s a Buddhist stupa (image taken from the wiki page)

Another ancient building form is the step pyramid, like the Great Ziggurat of Ur (image taken from the last wiki page)

or the Egyptian pyramids, or any other famous pyramid in the world (there are plenty of them, in very different cultural frames).

Here is a Sardinian Nuraghe

Conclusion: round, conical, pyramidal is the rule, there are no right angles there!

Until the Greeks: here’s the Parthenon

It is made of gnomons, here’s one (from the wiki page)

# The gnomon in the greek theater of vision, I

In the post Theatron as an eye I proposed the Greek Theater, or Theatron (as opposed to the “theater in a box”, or Cartesian Theater, see further) as a good model for vision.

Any model of vision should avoid the homunculus fallacy. What looks less understood is that any good model of vision should also avoid the scenic space fallacy. The Cartesian Theater argument against the existence of the homunculus is not, by construction, an argument against the scenic space. Yet, in the Cartesian Theater, homunculus and scenic space come into existence as a pair. As a conclusion, it seems that there could not be a model of vision which avoids the homunculus but does not avoid the scenic space. This observation is confirmed by facts: there is no good, rigorous model of vision to date, because all proposed models rely on the a priori existence of a scenic space. There is, on the contrary, a great quantity of experimental data and partial theoretical models which show just how complex the problem of vision is. But, essentially, from a mathematician’s viewpoint, it is not known how to even formulate the problem of vision.

In the influential paper “The brain a geometry engine” J. Koenderink proposes that (at least a part of) the visual mechanism is doing a kind of massively parallel computation, by using an embodiment of the geometry of jet spaces (the Euclidean infinitesimal geometry of a smooth manifold) of the scenic space. Jean Petitot continues along this idea, by proposing a neurogeometry of vision based essentially on the sub-riemannian geometry of those jet spaces. This is an active mathematical area of research, see for example “Anthropomorphic image reconstruction via hypoelliptic diffusion”, by Ugo Boscain et al.

Sub-riemannian geometry is one of my favorite mathematical subjects, because it is just a particular model of a metric space with dilations. Such spaces are somehow fundamental for the problem of vision, I think. Why? Because behind them there is a purely relational formalism, called “emergent algebra”, which allows one to understand “understanding space” in a purely relational way. Thus I hope emergent algebras could be used in order to formulate the problem of vision as the problem of computing with space, which in turn could be used for getting a good model of vision.

To my surprise, some time ago I found that this very complex subject has a respectable age, starting with Pythagoras and Plato! This is how I came to write this blog, as an effort to disseminate what I progressively understand.

This brings me back to the theater and, finally, to the gnomon. I cite from the previous wiki link:

Hero defined a gnomon as that which, added to an entity (number or shape), makes a new entity similar to the starting entity.

In the Greek theater, a gnomon sits in the center of the orchestra (which is the circular place where things happen in the Greek theater, later replaced by the scene in the theater in a box). Why?