Tag Archives: vision

The front end visual system performs like a distributed GLC computation

In this post I want to explain why the Distributed GLC model of computation can be seen as a proof of principle that it is possible to rigorously describe some complex functioning of the brain as computation.

If you are not aware that this is a problem, then please know that whether what brains do is computation is very controversial. On one side there are rigorous notions of computation (expressed in terms of Turing machines, or in terms of lambda calculus, for example) which are used with full competence in CS. On the other side, in (some parts of) neuroscience the word “computation” is used in a non-rigorous sense, not because the neuroscience specialists are incapable of understanding computation in the rigorous CS sense, but because real brains are far harder to make sense of than paper computers. Nevertheless, (some) CS specialists believe (without much real evidence) that brains compute in the CS sense, and (some) neuroscience specialists believe that their vague notions of computation deserve to bear this name, even if they do not look like computation in the rigorous CS sense.

OK, I shall concentrate on a particular example which I think is extremely interesting.

In the article by Kappers, A.M.L.; Koenderink, J.J.; Doorn, A.J. van, Basic Research Series (1992), pp. 1 – 23,

Local Operations: The Embodiment of Geometry

the authors introduce the notion of the “Front End Visual System”. From section 1, here are quotes indexed by me with (1), (2), (3).

(1) “Vision […] is sustained by a dense, hierarchically nested and heterarchically juxtaposed tangle of cyclical processes.”

(2) In this chapter we focus upon the interface between the light field and those parts of the brain nearest to the transduction stage. We call this the “visual front end”.

(3) Of course, the exact limits of the interface are essentially arbitrary, but nevertheless the notion of such an interface is valuable.

Comments:

  • (2) is the definition of the front end
  • (3) is a guard against a possible entry path of the homunculus into the brain
  • (1) has the very nice expression “dense tangle of cyclical processes”; I will come back to it!

Let’s pass to the main part of interest: what does the front end do? Quotes from section 1, indexed by me with (a), … (e):

  • (a) the front end is a “machine” in the sense of a syntactical transformer (or “signal processor”)
  • (b) there is no semantics (reference to the environment of the agent). The front end merely processes structure
  • (c) the front end is precategorical,  thus – in a way – the front end does not compute anything
  • (d) the front end operates in a bottom up fashion. Top down commands based upon semantical interpretations are not considered to be part of the front end proper
  • (e) the front end is a deterministic machine […]  all output depends causally on the (total) input from the immediate past.

Comments and reformulations, indexed by (I), … (IV)

  • (I) the front end is a syntactical transformer, it processes structure [from (a), (b)]
  • (II) there is no semantics [from (b)]; semantical interpretations are not part of the front end [from (d)]
  • (III) the front end does not compute, in the sense that there is no categorical, diagram-chasing type of computing [not formulated in terms of signals processed by gates?] [from (c)]
  • (IV) there is a clear mechanism, based on something like a “dense tangle of cyclical processes” which processes the total input (from the light field) from the immediate past [from (e) and (1)]

These (I)-(IV) are exactly the specifications of a distributed computation with GLC actors, namely:

  • a distributed, asynchronous, rigorously defined computation
  • based on local graph rewrites which are purely syntactic transformers (a toy sketch follows after this list), a counterpart of both the “dense tangle of cyclical processes” and of “processes structure”
  • there is no semantics, because there are no names or values which decorate the arrows of the GLC graphs, nor do they travel through the nodes of such graphs. There is no evaluation procedure needed for the computation with GLC actors
  • the computation with GLC actors starts from an initial graph (structure), which may also use external constructs (the cores are the equivalent of the light field which triggers chemical reactions in the retina, which are then processed by the front end)
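Here is a toy sketch of what computation by purely local graph rewrites means (my illustration, in the spirit of GLC and of interaction nets, not the actual GLC implementation; the node and port conventions are my own assumptions). Nodes have three ports, edges carry no names or values, and the beta-like move looks only at the two nodes involved:

#include <stdio.h>

#define MAXN 16

enum Kind { FREE, LAM, APP };

typedef struct { int node, port; } End;      /* a half-edge */

enum Kind kind[MAXN];
End peer[MAXN][3];                           /* peer[n][p] = the other end */

void link_(int an, int ap, int bn, int bp) {
    peer[an][ap] = (End){bn, bp};
    peer[bn][bp] = (End){an, ap};
}

/* Beta-like move: APP node a, whose port 0 (function) meets port 0
   (principal) of LAM node l, vanishes together with l; the four
   remaining half-edges are rewired pairwise, purely locally. */
void beta(int a, int l) {
    End arg = peer[a][1], res = peer[a][2];  /* APP: argument, result */
    End var = peer[l][1], body = peer[l][2]; /* LAM: variable, body */
    link_(arg.node, arg.port, var.node, var.port);
    link_(body.node, body.port, res.node, res.port);
    kind[a] = kind[l] = FREE;                /* both nodes vanish */
}

int main(void) {
    /* an APP node (0) whose function port meets a LAM node (1);
       the free ends 2..5 stand for the surrounding graph */
    kind[0] = APP; kind[1] = LAM;
    link_(0, 0, 1, 0);
    link_(0, 1, 2, 0);   /* APP argument <-> end 2 */
    link_(0, 2, 3, 0);   /* APP result   <-> end 3 */
    link_(1, 1, 4, 0);   /* LAM variable <-> end 4 */
    link_(1, 2, 5, 0);   /* LAM body     <-> end 5 */
    beta(0, 1);
    printf("end 2 -> end %d, end 3 -> end %d\n",
           peer[2][0].node, peer[3][0].node);   /* prints 4 and 5 */
    return 0;
}

No value is evaluated and no global controller is consulted: the rewrite sees only the two nodes and their six ports, which is the sense in which such a computation merely “processes structure”.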

This is no coincidence! One of the reasons for building GLC was exactly to make sense of the front end visual system.

In conclusion:

  • yes, there is a way to rigorously describe what the front end does as computation in the CS sense, although
  • this notion of computation has some unique features: no evaluation, graph-rewrite based, asynchronous, distributed, purely local. No semantics needed, no global notions or global controllers, neither in space nor in time.

“Visual awareness” by Koenderink

What follows is an excerpt from the ebook Visual awareness by Jan Koenderink. The book is part of a collection published by The Clootcrans Press.

What does it mean to be “visually aware”? One thing, due to Franz Brentano (1838-1917), is that all awareness is awareness of something. One says that awareness is intentional. This does not mean that the something exists otherwise than in awareness. For instance, you are visually aware in your dreams, when you hallucinate a golden mountain, remember previous visual awareness, or have pre-visions. However, the case that you are visually aware of the scene in front of you is fairly generic.

The mainstream account of what happens in such a generic case is this: the scene in front of you really exists (as a physical object) even in the absence of awareness. Moreover, it causes your awareness. In this (currently dominant) view the awareness is a visual representation of the scene in front of you. To the degree that this representation happens to be isomorphic with the scene in front of you the awareness is veridical. The goal of visual awareness is to present you with veridical representations. Biological evolution optimizes veridicality, because veridicality implies fitness.  Human visual awareness is generally close to veridical. Animals (perhaps with exception of the higher primates) do not approach this level, as shown by ethological studies.

JUST FOR THE RECORD these silly and incoherent notions are not something I ascribe to!

But it neatly sums up the mainstream view of the matter as I read it.

The mainstream account is incoherent, and may actually be regarded as unscientific. Notice that it implies an externalist and objectivist God’s Eye view (the scene really exists and physics tells how), that it evidently misinterprets evolution (for fitness does not imply veridicality at all), and that it is embarrassing in its anthropocentricity. All this should appear to you as in the worst of taste if you call yourself a scientist.  [p. 2-3]

___________________

I hold similar views, last expressed in the post Ideology in the vision theater (but not with the same mastery as Koenderink, of course). Recall that “computing with space“, which is the main theme of this blog/open notebook, is about rigorously understanding (and maybe using) the “computation” done by the visual brain, in order to understand what space IS. This is formulated in arXiv:1011.4485 as the “Plato’s hypothesis”:

(A) reality emerges from a more primitive, non-geometrical, reality in the same way as
(B) the brain constructs (understands, simulates, transforms, encodes or decodes) the image of reality, starting from intensive properties (like a bunch of spiking signals sent by receptors in the retina), without any use of extensive (i.e. spatial or geometric) properties.
___________________
Never mind my motivations; the important message is that Koenderink’s critique is a hard-science point of view about a hard-science piece of research. It is not just a lexical game (although I recognize the value of such games as well, but as a mathematician I am naturally inclined towards hard science).

Ideology in the vision theater

Thanks to Kenneth Olwig for suggesting that ideology may be related to the argument from the post On the exterior homunculus fallacy. More precisely, Olwig points to the following quote from The German Ideology by Karl Marx and Friedrich Engels:

If in all ideology men and their circumstances appear upside-down as in a camera obscura, this phenomenon arises just as much from their historical life-process as the inversion of objects on the retina does from their physical life-process. In direct contrast to German philosophy which descends from heaven to earth, here we ascend from earth to heaven. That is to say, we do not set out from what men say, imagine, conceive, nor from men as narrated, thought of, imagined, conceived, in order to arrive at men in the flesh. We set out from real, active men, and on the basis of their real life-process we demonstrate the development of the ideological reflexes and echoes of this life-process. The phantoms formed in the human brain are also, necessarily, sublimates of their material life-process, which is empirically verifiable and bound to material premises.

One of the first posts of this blog was The Cartesian Theater: philosophy of mind versus aerography, where I use the article “All that is landscape is melted into air: the ‘aerography’ of ethereal space”, Environment and Planning D: Society and Space 2011, volume 29, pages 519 – 532, by Olwig, in order to argue that Dennett’s Cartesian Theater notion shows only one half of the whole homunculus fallacy. Indeed, Dennett’s theater is a theater in a box, the invention of Inigo Jones, designed around the king (the homunculus, in Dennett’s argument), using geometrical perspective to give an appearance of reality to an artificial construct, the scenic space.

With any homunculus, I argue, comes also a scenic space, which has to be taken into account in any theory of mind, because it is just as artificial and leads to the same kind of fallacy as the homunculus. In the posts Towards aerography, or how space is shaped to comply with the perceptions of the homunculus and Theatron as an eye I further develop the subject by trying to see what becomes of the homunculus fallacy if we use not the theater in a box but the old Greek theater instead (and apparently it stops being a fallacy, as homunculi and designed scenic spaces melt into oblivion and the gnomon, the generator of self-similarity, comes to the attention). Finally, in the post On the exterior homunculus fallacy I argue that the original homunculus fallacy does not depend on whether the homunculus is inside or outside the brain, which leads me to suppose that the neuroscientist who studies a fly’s vision system is an exterior homunculus with respect to the fly, and the lab is the scenic space of this homunculus. It means that any explanation of fly vision which makes use of arguments not physically embedded in the fly brain (like knowledge about the euclidean structure of the space) makes sense for the experimenter, but cannot be the real explanation, because the fly does not have a lab with a small homunculus inside its head.

Which brings me to the relation with ideology, which is more than a given point of view: it is a theater in a box which invites the infected host to take the place of the homunculus, watch the show and form an opinion based on the privileged position it occupies. But there can be only one opinion, carefully designed by the author of the ideology, the scenographer.
The scenic space needs an Inigo Jones; Inigo Jones is the ignored dual of the homunculus-king. He does not use magic in order to organize the show for the king, but he adds meaning. In the case of an ideology (a word whose root is a Greek word meaning “to see”, thanks again to Olwig for this) the added meaning is intentional, but in the case of a neuroscientist who experiments on the vision system of a fly (what a king) it is unintended, yet still present, under the form of assumptions which lead the experimenter to an explanation of fly vision different from what the fly does when seeing (likely an evolving graph with neurons as nodes and synapses as edges, which modifies itself according to the input, without any exterior knowledge about the experimenter’s lab and techniques).

On the exterior homunculus fallacy

If we think about a homunculus outside the brain, the homunculus fallacy still functions.

This post continues the Vision theater series  part I, part II, part III, part IV, part V, part VI  and also links to the recent Another discussion about computing with space .

According to wikipedia:

The homunculus argument is a fallacy arising most commonly in the theory of vision. One may explain (human) vision by noting that light from the outside world forms an image on the retinas in the eyes and something (or someone) in the brain looks at these images as if they are images on a movie screen (this theory of vision is sometimes termed the theory of the Cartesian Theater: it is most associated, nowadays, with the psychologist David Marr). The question arises as to the nature of this internal viewer. The assumption here is that there is a ‘little man’ or ‘homunculus‘ inside the brain ‘looking at’ the movie.

The reason why this is a fallacy may be understood by asking how the homunculus ‘sees’ the internal movie. The obvious answer is that there is another homunculus inside the first homunculus’s ‘head’ or ‘brain’ looking at this ‘movie’. But how does this homunculus see the ‘outside world’? In order to answer this, we are forced to posit another homunculus inside this other homunculus’s head and so forth. In other words, we are in a situation of infinite regress. The problem with the homunculus argument is that it tries to account for a phenomenon in terms of the very phenomenon that it is supposed to explain.

Suppose instead that the homunculus is outside the brain. Think, for example, about the experimenter doing research on your vision. The fallacy functions as well, because now we have another homunculus (outside the brain) who looks at the movie screen (i.e. the measurements he performed on your visual system, in the medium controlled by him). “But how does this homunculus see the ‘outside world’?” Infinite regress again.

If you think that is outrageous, then let me give you an example. The exterior homunculus (the experimenter) explains your vision by interpreting the controlled space he put you in (the lab) and the measurements he performed. When he does this interpretation he relies on:

  • physical laws
  • geometrical assumptions
  • statistical assumptions

at least. Suppose that the experimenter says: “to the subject [i.e. you] was presented a red apple, at distance d, at coordinates x,y,z. By the physical laws of optics and by the geometrical setting of the controlled lab we know that the sensor S of the retina of the left eye was stimulated by the light coming from the apple. We recorded a pattern of activity in the regions A, B, C of the brain, which we know from other knowledge (and statistical assumptions) to mean that A is busy with recognition of fruits, B is involved in contour recognition and C with memories from childhood.” I agree that this is a completely bogus simplification of what the real experimenter would say, but bear with me when I claim that the knowledge used by the experimenter for explaining how you see the apple has not much to do with the way you see and recognize the apple. In the course of the explanation, the experimenter used knowledge about the laws of optics, used measurements which are outside you, like coordinates and geometric settings in the lab, and even notions from the experimenter’s head, such as “red”, “apple” and “contours”.

Should the experimenter not rely on physical laws? Or on geometrical assumptions (like the lab being in a piece of euclidean 3d space)? Of course he can rely on those. Because, in the case of physical laws, we recognize them as physical precisely because they are invariant (i.e. change in a predictable way) with respect to the observer. Because, in the case of geometrical assumptions, we recognize them as geometrical because they are invariant with respect to the parametrization (which in the lab appears as the privilege of the observer).

But, just as optics can explain only what happens with the light until it hits the retina and no more, the assumptions in the head of the experimenter, even physical and geometrical ones, cannot be used as an explanation for the way you see. Because, simply put, it is much more likely that you don’t have a lab in your head sitting in a euclidean space, with an apple, a lamp and rules for measuring distances and angles.

You may say that everybody knows that apples are not red; that’s a cheap shot, because apples scatter light of all frequencies and it just happens that the sensors in our retina are more sensitive to some frequencies than to others. Obvious. However, it seems that not many recognize that contours are as immaterial as colors; they are in the mind, not in reality, as Koenderink writes in Theory of “Edge-Detection”, JJ Koenderink – Analysis for Science, Engineering and Beyond, 2012.

The explanation of vision which uses an exterior homunculus becomes an infinite regress unless we also explain how the exterior homunculus thinks about all these exterior facts, such as lab coordinates, rational deductions from laws of physics and so on. It is outrageous, but there is no other way.

Let’s forget about experiments on you and let’s think about experiments on fly vision. Any explanation of fly vision which uses knowledge which is not, somehow, embodied in the fly brain, falls into the (exterior)  homunculus fallacy.

So what can be done instead? Should we rely on magic, or say that no explanation is possible, because any explanation will be issued by an exterior homunculus? Of course not. When studying vision, nobody in their right mind doubts the laws of optics. They are science (i.e. reproducible and falsifiable). But they don’t explain all of vision, only the first, physical step. Likewise, we should strive to give explanations of vision which are scientific, but which do not appeal to a ghost, in the machine or outside the machine.

Up to now, I think this is the best justification for the effort of understanding space in a non-passive way.

Another discussion about computing with space

Computing with space vs space computing.

Space (the real space we all share) is not made of points. A point is an abstraction, the unattainable goal of a thought experiment, an atom of thought. Or a string of numbers (when we think with coordinates). Quantum physics tells us we can’t perform, even in principle, a physical experiment with the goal of exactly localizing the position of an object in space.

That’s pop philosophy. It might even be wrong (for example, what quantum physics tells us is that we can’t perform physical experiments localizing a particle in phase space (position, momentum), not in real space, whatever that means).

That’s also the turf of theoretical physicists; there are several theories about the structure of space, with various degrees of mathematical soundness. I shall not go further in this direction.

Instead, I want to make a case for a biology-inspired point of view. I made it before, repeatedly, starting with More than discrete or continuous: a bird’s view, but now I have a few more tools to tackle it, and a bit more patience not to hurry to conclusions.

So, if you prefer the red pill, then read this. Can you think about space in terms of what it does, not what it is? Can you describe space as seen by a fly, or by a toddler, or do you need to stick to cartesian conventions and then fall into the trap of continuous vs discrete, and so on?

Think like this: you are a fly and you have 10^5 neurons and 10^7 synapses. You are very good at flying by using about 10-20 actuators, and you see really well because most of your brain is busy with that. Now, where in that brain, and how exactly, is there place for a representation of a euclidean 3d space? Mind you, humans have very little idea about how fly brains are capable of doing this, and also, with their huge brains and their fast computers (much faster and much bigger than a fly’s brain), they have not yet succeeded in making a robot with the same competences as a fly. (They will make one, there is no magic involved, but the constraints are really hard: an autonomous fly which can survive with an energy consumption comparable to that of a real fly, without exterior computing or human help, in a natural environment, for 24hrs, find food, avoid traps and eventually mate.)

So, after this motivating example, I state my hypothesis: whatever space is (and that’s a very hard and old problem), let’s not think about it passively (like a robot fly driven by algorithms which use advanced human knowledge about euclidean geometry, systems of coordinates and the laws of mechanics), as a receptacle, a something. Let’s think about space as described by what you can do in it.

The fly, for example, cannot possibly have a passive representation of space in the brain (and for us it is the same), but it does have the possibility to manipulate its actuators as a function of what it sees (i.e. of what its receptors perceive and send further to the brain) and of the state of its brain (and maybe of the history of that state, i.e. memory, stored in a mysterious way in the same tiny brain). However, actuators, sensors, brain and the environment are just one system; there is no ghost in, or outside, that fly machine.

My hypothesis is that for the fly, that’s space. For us it is the same, but we are far more complex than the fly. However, deep in our brains there are “patterns” (are they assemblies of neurons, are they patterns of synaptic activity, is it chemical, electric, …?) which are very basic (a child learns to see in the first months) and which are space, for us.

Now I’ll get mathematical. There are spaces everywhere in math, for example when we say: that’s a vector space, that’s a manifold, or even this is a group, a category, and so on. We speak like this, but what we actually have (in the mind) is not a manifold, or a vector space, a group or a category, but some short collection of patterns (rules, moves, axioms) which can be applied to the said objects. And that is enough for doing mathematics. This can be formalized: for example, it’s enough to have some simple rules involving gates with two inputs and one output (the dilations), and we can prove that these simple rules describe all the concepts associated with any vector space, moreover without using any external knowledge at any moment. A dilation is simply the pattern of activities related to map making.
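To give the flavour, here is a sketch in the familiar model, where the vector space structure is already given (the formulas below are my illustration; the intrinsic story is in the papers on dilation structures). A dilation of coefficient \varepsilon, based at x, is the gate with inputs x, y and output

\delta^{x}_{\varepsilon} y = x + \varepsilon (y - x)

and a short computation shows that

\delta^{x}_{\varepsilon^{-1}} \delta^{\delta^{x}_{\varepsilon} y}_{\varepsilon} z = y + z - x - \varepsilon (y - x) \rightarrow y + z - x  as \varepsilon \rightarrow 0,

which, for x = 0, is just the vector addition y + z: the operations of the vector space emerge from dilation gates alone, without external knowledge.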

So, according to my hypothesis, a generic vector space is this collection of rules. When it comes to its dimension, there are supplementary relations to be added before we can say that we speak about a 3d vector space, but it will always be about a generic 3d vector space. There is no concrete 3d space; when we say, for example, that we live in a 3d space, what we really say is that some of the things we can do in a generic 3d space can also be done in reality (i.e. we can perform experiments showing this, although there are again studies showing that our brain is almost as bad at perceiving relations in real space which correspond to theorems in geometry as it is when we force it to do logical reasoning).

Conclusion for this part: there may or may not be a passive space; the important thing is that when we think with space and about space, what we really do is use a collection of primitive patterns of thought about it.

Now, turning to the Turing machine: it lacks space. Space can be used for enumeration, for example, but the Turing machine avoids this by supposing there is a tape (an ordered set). It is proved that enumeration (i.e. the thing which most resembles space in the world of Turing machines) does not matter, in the sense that it can be arbitrarily changed and still the conclusions and definitions of the field do not change. This is like saying that the Turing machine is a geometrical object. But there is no geometrical description of a Turing machine (as far as I know) which does not use enumeration. This is like saying that CS people can understand the concept of a sphere in terms of atlases, parametrizations and changes between those, but they can’t define spheres without them. Geometers can, and for this reason they can speak about what is intrinsically geometric about the sphere and what is only an artifact of the chosen coordinate system. In this sense, geometers are like flies: they know what can be done on a sphere without needing to know anything about coordinate systems.

Unlimited detail is a sorting algorithm

This is a continuation of the post “Unlimited detail challenge…“. I am still waiting for a real solution to emerge from the various intriguing ideas. Here I want to comment a bit about the kind of algorithm UD might be.

Bruce Dell compared it with a search algorithm, more specifically with a search algorithm for words. As everybody knows, words form an ordered set, with the lexicographic order.

Moreover, Dell explains that the point of UD is to pick from the huge database (which might be on a server, while the algorithm runs on a computer elsewhere, so the algorithm has to be, in a sense, an online algorithm) only the 3D atoms which are visible on the screen, then render only those atoms.

Therefore, imagine that we have a huge database of 3D atoms with some characteristics (like color) and a separate database made of the coordinates of these atoms. The UD algorithm solves a “sorting” problem in the second database. (I put “sorting” in quotes because there is no total order fully compatible with the solution of the ray-casting problem, in the sense of one that remains the same when the POV is changed.) Once this “sorting” part is done, the algorithm asks for the characteristics of only those points and proceeds to the rendering part, which is almost trivial.

By looking at sorting algorithms, one then expects a time estimate of the kind O((\log N)^{2}) (for a given, fixed number of screen pixels), where N is the number of 3D atoms.

There are even sorting algorithms with better asymptotics, like O(\log N), but they win only for really huge numbers of atoms, like 2^{6000}; such is the AKS sorting network.

So, the bottom line is this: think about UD as being a kind of sorting algorithm.

Unlimited detail challenge: the most easy formulation

Thanks to Dave H. for noticing the new Euclideon site!

Now, I propose that you think about the easiest formulation of unlimited detail.

You live in a 2D world, and you have a 1D screen with 4 pixels. You look at the world, through the screen, using a 90deg frustum. The 2D world is finite, with diameter N, measured in the unit lengths of the pixels (so the screen has length 4 and your eye is at distance 2 from the screen). The world contains atoms which are located at integer coordinates in the plane. No two atoms are in the same place. Each atom has attached to it at most P bits, representing its colour. Your position is given by a pair of integer coordinates, and the screen points towards N, S, E, W only.

Challenge: give a greedy algorithm which, given your position and the screen direction, chooses the 4 atoms from the world which are visible through the screen, in at most O(log N) steps.

Hints:

  • think about what “visible” means in this setting
  • use numbers written in base 2 creatively, as words (one possible ingredient is sketched below).
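Here is one possible ingredient, a minimal sketch under my own assumptions (not a full solution: the frustum geometry, i.e. which strips of atoms a given pixel can see, is left out). If, for every vertical strip of the world, the atoms’ y coordinates are kept sorted, then “the first atom visible northwards from the eye” is a binary search on a word of log N bits, hence O(log N) steps:

#include <stdio.h>

/* smallest index i with ys[i] >= y0, or n if none: O(log n) steps */
static int first_at_least(const int *ys, int n, int y0)
{
    int lo = 0, hi = n;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (ys[mid] >= y0) hi = mid; else lo = mid + 1;
    }
    return lo;
}

int main(void)
{
    /* sorted y coordinates of the atoms in one vertical strip */
    int ys[] = {-7, -2, 3, 8, 21, 40};
    int n = sizeof ys / sizeof ys[0];
    int eye_y = 5;
    int i = first_at_least(ys, n, eye_y + 1);
    if (i < n)
        printf("nearest atom north of the eye: y = %d\n", ys[i]); /* 8 */
    else
        printf("no atom northwards\n");
    return 0;
}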

_______________

The Teaser 2D UD might help.

I’m not giving this challenge because I am secretive… but because I enjoy collaborations.

Teaser: 2D UD

Here are two images which may (or may not) give an idea about another fast algorithm for real-time rendering of 3D point cloud scenes (but beware, the images are drawn for the baby model of 2D point cloud scenes). The secret lies in the database.

I have a busy schedule in the next weeks and I have to get this out of my system. Therefore, if anybody gets it then please send me a comment here. Has this been done before? Does it work?

The images now: the first has no name

[image: eucludeon]

The second image is a photo of the Stoa Poikile, taken from here:

[image: Stoa_Poikile]

Hint: this is a solution for the ray shooting problem (read it) which eliminates trigonometry, shooting rays and computing intersections, and uses only the addition operation (once the database is well prepared); moreover, the database organized as in the pictures cannot be bigger than the original one (thus it is also a compression of the original database).

_______________

See the solution given  by JX of an unlimited detail algorithm here and here.

Diorama, Myriorama, Unlimited detail-orama

Let me tell, in plain words, the explanation by JX about how a UD algorithm might work (it is not just an idea, it is supported by proof and experiments; go and see this post).

It is too funny! It is the computer version of a diorama. It is an unlimited-detail-orama.

Before giving the zest of JX’s explanation, let’s think: did you ever see a totally artificial construction which, when you look at it, tricks your mind into believing you are looking at an actual, vast piece of landscape, full of infinite detail? Yes, right? This is a serious thing, actually; it poses a lot of questions about how much the 3D visual experience of a mind-bogglingly huge database of 3D points can be compressed.

Indeed, JX explains that his UD type algorithm has two parts:

  • indexing: start with a database of 3D points, like a laser scan. Then produce another database of cubemaps centered in a net of equally spaced “centerpoints” which cover the 3D scene. The cubemaps are done at screen resolution, obtained as a projection of the scene on a reasonably small cube centered at the centerpoint. You may keep these cubemaps in various ways, one of which is by linking the centerpoint with the visible 3D points. Compress (several techniques suggested). For this part of the algorithm there is no time constraint; it is done before the real-time rendering part (a sketch of the projection step follows after this list).
  • real-time rendering: input where the camera is, get only the points seen from the closest centerpoint, get the cubemap, improve it by using previous cubemaps and/or neighbouring cubemaps. Take care to fill the holes which appear when the point of view changes.
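Here is a minimal sketch of the projection step as I read it (my reconstruction; names, sizes and conventions are assumed, this is not JX’s code): project scene points onto one face of a small cubemap centered at a centerpoint, keeping per texel the nearest point (a z-buffer of 3D points, not of screen pixels). The other five faces are handled the same way.

#include <stdio.h>

#define RES 8                      /* tiny face resolution, for the demo */

typedef struct { float x, y, z; } Pt;

Pt    face[RES][RES];              /* stored point per texel */
float depth[RES][RES];             /* its distance along +z (0 = empty) */

/* splat one point onto the +z face of a cubemap centered at c */
void splat(Pt p, Pt c)
{
    float dx = p.x - c.x, dy = p.y - c.y, dz = p.z - c.z;
    if (dz <= 0) return;                         /* behind this face */
    float u = dx / dz, v = dy / dz;              /* 90deg frustum */
    if (u < -1 || u >= 1 || v < -1 || v >= 1) return;
    int i = (int)((u + 1) * 0.5f * RES);
    int j = (int)((v + 1) * 0.5f * RES);
    if (depth[j][i] == 0 || dz < depth[j][i]) {  /* keep the nearest */
        depth[j][i] = dz;
        face[j][i] = p;
    }
}

int main(void)
{
    Pt c = {0, 0, 0};
    Pt scene[] = {{0.1f, 0.1f, 2}, {0.1f, 0.1f, 5}, {-1, 0.5f, 3}};
    for (int k = 0; k < 3; k++) splat(scene[k], c);
    /* two points project to the same texel; the nearer one (z = 2) wins */
    printf("center texel depth: %.1f\n", depth[RES / 2][RES / 2]);
    return 0;
}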

Now, let me show you that this has been done before, in meatspace. And even with animation! Go and read this, it is too funny:

  • The Daguerre Dioramas. Here is (actually an improved version of) your cubemap, JX (image taken from the linked wiki page):

[image: Diorama_diagram]

  • But maybe you don’t work in the geospatial industry and you don’t have render farms and huge data available. Then you may use a Myriorama, with palm trees, gravel, statues, themselves rendered as dioramas (image taken from the linked wiki page):

[image: Myriorama_cards]

  • Would you like to do animation? Here it is, look at the nice choo-choo train (polygon-rendered, at a scale):

[image: ExeterBank_modelrailway]

(image taken from this wiki page)

Please, JX, correct me if I am wrong.

Discussion about how a UD algorithm might work

I offer this post for discussions around UD type algorithms. I shall update this post, each time indicating the original comment with the suggested updates.

[The rule concerning comments on this blog is that the first time you comment, I have to approve it. I keep the privilege of not accepting or deleting comments which are not constructive.]

For other posts here on the subject of UD see the dedicated tag unlimited detail.

I propose that you start from this comment by JX; then we may work on it to make it clear (even for a mathematician). Thank you, JX, for this comment!

I arranged the comment a bit [what is written between brackets is my commentary]. I numbered each paragraph, for easiness.

Now I worked and thought enough to reveal all the details, lol. [see this comment by JX]
I may disappoint you: there’s not much mathematics in what I did. JUST SOME VERY ROUGH BRUTE-FORCE TRICKS.

1) In short: I render cubemaps, but not cubemaps of pixels – cubemaps of 3d points visible from some center.

2) When the camera is in that cubemap’s center, all points are projected and no holes are visible. When the camera moves, the world realistically changes in perspective, but the hole count increases. I combine a few snapshots at a time to decrease the hole count, and I also use a simple hole-filling algorithm. My hole-filling algorithm sometimes gives the same artifacts as in non-cropped UD videos (bottom and right sides).

[source JX #2] (link to the artifacts image) These artifacts can appear after applying the hole-filling algorithm from left to right and then from top to bottom; this is why they show up only on the right and bottom sides. Another case is viewport clipping of groups of points arranged into a grid: link from my old experiment with such groups.

This confirms that UD has holes too, and that the claim “exactly one point for each pixel” isn’t true.
3) I used words like “special”, “way”, “algorithm” etc. just to fog the truth a bit. And there are some problems (with disk space) which, as I understand, don’t really bother UD. [that’s why they moved to the geospatial industry] So probably my idea is very far from UD’s secret. Yes, it allows rendering huge point clouds, but it is stupid and I’m sure now it was done before. Maybe there is a possibility to take some ideas from my engine and improve them, so here is the explanation:
4) Yes, I too started this project with this idea: “indexing is the key”. You say to the database: “camera position is XYZ, give me the points”. And there are files in the database with separated points; the database just picks up a few files and gives them to you. It just can’t be slow. It may only be very heavyweight (impossible to store so many “panoramas”).

5) I found that instead of keeping _screen pixels_ (as for panoramas) for each millimeter of camera position, it is possible to keep actual _point coordinates_ (like a single laser scanner frame) and project them again and again while the camera moves, filling holes with other points; and the camera step between those files may be far bigger than millimeters (as with stereo pairs: to see a volumetric image you only need two distant “snapshots”).

6) By “points linked with each other” I meant a bunch of points linked to some central point (by points I mean points _visible_ from the central point).

7) What is a central point? Think of it as a laser scanner frame. The scanner is static and catches points around itself. Point density near the scanner is high, and vice versa.

8) So again: my engine just switches gradually between virtual “scanner” snapshots of points relative to some center. During the real-time presentation, for each frame a few snapshots are projected: more points projected from the nearest snapshots, fewer from the far ones.

9) The total point count isn’t very big, so real time isn’t impossible. Some holes appear; a simple algorithm fills them using only color and z-buffer data.
10) I receive frames (or snapshots) by projecting all the points using a perspective matrix; I use fov 90 and a 256×256 or 512×512 point buffer (like a z-buffer, but it stores the point position XYZ relative to the scanner).

11) I do this six times to receive a cubemap. The maximum number of points in the frame is 512x512x6. I can easily do color interpolation for the overlapped points; I don’t pick the color of a point from one place. This makes the data interleaved and repeated.

12) The next functions allow me to compress point coordinates in snapshots to 16-bit values. Why it works: because we don’t need much precision for distant points; they often don’t change screen position when moved by small steps.

#include <stdint.h>
#include <math.h>

/* expand: decompress a stored 16-bit coordinate back to 32 bits.
   The parameter y controls the quantization curve: precision is
   high near zero and coarse for distant points. */
int32_t expand(int16_t x, float y)
{
    int8_t sign = 1;
    if (x < 0) { sign = -1; x = -x; }
    return (x + x * (x * y)) * sign;
}

/* shrink: compress a 32-bit coordinate to 16 bits; the inverse of
   expand, obtained by solving y*x^2 + x - z = 0 for x. */
int16_t shrink(int32_t z, float y)
{
    int8_t sign = 1;
    if (z < 0) { sign = -1; z = -z; }
    return ((sqrtf(4 * y * z + 1) - 1) / (2 * y)) * sign;
}
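To see what the quantization does, here is a small round-trip demo (my addition, not part of JX’s engine; the curvature value y = 0.001 is an assumed parameter, and the program is meant to be compiled together with the two functions above):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    const float y = 0.001f;    /* assumed quantization curvature */
    int32_t z = 150000;        /* a "distant" 32-bit coordinate */
    int16_t c = shrink(z, y);  /* 16-bit stored value */
    int32_t r = expand(c, y);  /* reconstructed coordinate */
    printf("original %d -> stored %d -> restored %d (error %d)\n",
           (int)z, (int)c, (int)r, (int)(z - r));
    return 0;
}

The absolute error grows with the distance, which is exactly the point: distant points move by many units before their screen position changes.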

13) I also compress colors to 16 bit, normals to one 24-bit value, and I add a shader number (8 bit) to the point. So one point in a snapshot consists of: 16bit*3 position + 24bit normal + 16bit color + 8bit shader.

14) There must be some ways to compress it more (store colors in a texture (lossy jpeg), make some points share shader and normals). An uncompressed snapshot full of points (this may be an indoor snapshot): 512x512x6 = 18Mb, 256x256x6 = 4.5Mb.

Of course, after lzma compression (the engine reads directly from the ulzma output, which is fast) it can be up to 10 times smaller, but sometimes only 2-3 times. AND THIS IS A PROBLEM. I’m afraid UD has a smarter way to index its data.

For a 320×240 screen resolution, 512×512 is enough; 256×256 too, but there will be more holes and quality will suffer.

To summarize engine’s workflow:

15) Snapshot building stage. Render all scene points (any speed-up may be used here: octrees or, what I currently use, dynamic point skipping according to the last point’s distance to the camera) into snapshots and compress them. The step between snapshots affects data weight AND rendering time AND quality. There’s not much sense in making the step as small as 1 point, or even 100 points. After this, the scene is no longer needed; or I should say, the scene won’t be used for realtime rendering.

16) Rendering stage. Load the snapshots nearest to the camera and project points from them (more points for closer snapshots, fewer for distant ones; 1 main snapshot + ~6-8 additional used at a time; I am still not sure about this scheme and change it often). Backfa.. point culling applied. Shaders applied. Fill holes. Constantly update the snapshot array according to the camera position.

17) If I restrict camera positions, it is possible to “compress” a huge point cloud level into a relatively small database. But in other cases my database will be many times greater than the original point cloud scene. [See comments JX#2, JX#3, chorasimilarity#4, chorasimilarity#5. Here is an eye-candy image of an experiment by JX, see JX#2:]

[image: eye_candy_by_JX]

Next development steps may be:

18) dynamic camera step during snapshot building (it may be better to do more steps when more points are close to the camera (simple to count during projection) and fewer steps when the camera is in the air above the island, for example),

19) better snapshot compression (jpeg, maybe delta-coding for points), octree involvement during snapshot building.

20) But as I realized the disk memory problems, my interest is fading.

Any questions?

UD question

I try to formulate the question of how Unlimited Detail works like this:

Let D be a database of 3D points, containing information about M points. Let also S be the image on the screen, say with N pixels. Problem:

  • reorganize the database D to obtain another database D’ with at most O(M) bits, such that
  • starting from D’ and a finite (say 100 bytes) word there exists an algorithm which finds the image on the screen in O(N log(M)) time.

Is this reasonable?

For example, take N=1. The finite word means the position and orientation of the screen in the 3D world of the database. If the M points admitted a representation as a number (a euclidean-invariant hash function?) of order M^a (i.e. polynomial in the number of points), then it would be reasonable to expect D’ to have dimension of order O(log(M)), so in this case simply by traversing D’ we get the time O(log(M)) = O(N log(M)). Even if we cannot make D’ be O(log(M)) large, maybe the algorithm still takes O(log(M)) steps, simply because M is approximately the volume, so the diameter in 3D space is roughly between M^(1/3) and M, or because, due to the scaling of the perspective, the algorithm may still hop through D’ in geometric, not arithmetic, steps.

The second remark is that there is no restriction on the time necessary for transforming D into D’.

Unlimited detail and 3D portal engines, or else real-time path tracing

Here are two new small pieces which might, or might not, add to the understanding of how the Unlimited Detail – Euclideon algorithm might work. (The last post on this subject is Unlimited detail, software point cloud renderers; you may want to read it.)

3D-portal engines: From this 1999 page “Building a 3D portal engine“, several quotes (boldfaced by me):

Basically, a portal based engine is a way to overcome the problem of the incredible big datasets that usually make up a world. A good 3D engine should run at a decent speed, no matter what the size of the full world is; speed should be relative to the amount of detail that is actually visible. It would of course be even better if the speed would only depend on the number of pixels you want to draw, but since apparently no one has found an algorithm that does that, we’ll go for the next best thing.

A basic portal engine relies on a data set that represents the world. The ‘world’ is subdivided in areas, that I call ‘sectors’. Sectors are connected through ‘portals’, hence the name ‘Portal Engine’. The rendering process starts in the sector that the camera is in. It draws the polygons in the current sector, and when a portal is encountered, the adjacent sector is entered, and the polygons in that sector are processed. This would of course still draw every polygon in the world, assuming that all sectors are somehow connected. But, not every portal is visible. And if a portal is not visible, the sector that it links to doesn’t have to be drawn. That’s logical: A room is only visible if there’s a line of sight from the camera to that room, that is not obscured by a wall.

So now we have what we want: If a portal is invisible, tracing stops right there. If there’s a huge part of the world behind that portal, that part is never processed. The number of polygons that are actually processed is thus almost exactly equal to the number of visible polygons, plus the inserted portal polygons.

By now it should also be clear where portals should be inserted in a world: Good spots for portals are doors, corridors, windows and so on. That also makes clear why portal engines suck at outdoor scenes: It’s virtually impossible to pick good spots for portals there, and each sector can ‘see’ virtually every other sector in the world. Portal rendering can be perfectly combined with outdoor engines though: If you render your landscape with another type of engine, you could place portals in entrances of caves, buildings and so on. When the ‘normal’ renderer encounters a portal, you could simply switch to portal rendering for everything behind that portal. That way, a portal engine can even be nice for a ‘space-sim’…

So let’s dream and ask if there is any way to construct the database for the 3D scene such that the rendering process becomes an algorithm for finding the right portals, one for each pixel maybe. Something to think about. The database is not a tree, but from the input given by the position of the viewer, the virtually available portals (which could be just pointers attached to faces of octrees, say, which point to the faces of smaller cubes visible from the bigger face, seen as a portal) organize themselves into a tree. Therefore the matter of finding what to put on a screen pixel could be solved by a search algorithm.

As a small bonus, here is the link to a patent of Euclideon Pty. Ltd. : An alternate method for the child rejection process in regards to octree rendering – AU2012903094.

Or else, real-time path tracing. Related to Brigade 2, read here, and a video:

Biological vision as a problem of Fully Homomorphic Encryption

One revelation after another! After learning about MOOCs, I am now reading Craig Gentry’s PhD thesis on fully homomorphic encryption.

Before loading more information, let me say that biological vision could, it seems to me, be regarded as a fully homomorphic encryption problem.

Explanation: the problem of biological vision is the following. We have an organism, say a human or a fly. Through vision, the outer space is encrypted as a physical dynamical system in the brain, in a way which is basically unknown. However, the encrypted information is so good that the brain can compute, based on it, some function, which is then sent to the motor system, which in turn modifies the outer space efficiently (the human kills the fly, or the fly avoids the human).

During this process there is no decryption, because there is no space, or image, in the brain (see the homunculus fallacy).

Therefore, the encryption used by the brain has to be a fully homomorphic encryption!

You may imagine how amazed I am by reading Gentry’s description, which I quote, with holes and my emphasis, from page 2 of his thesis:

Imagine you have an encryption scheme with a “noise parameter” attached to each ciphertext, where encryption outputs a ciphertext with small noise – say, less than n – but decryption works as long as the noise is less than some threshold N \gg n. Furthermore, imagine you have algorithms […] that can take ciphertexts E(a) and E(b) and compute E(a+b) and E(a*b), but at the cost of adding or multiplying the noise parameters. This immediately gives a “somewhat homomorphic” encryption scheme […]. Now suppose that you have an algorithm Recrypt that takes a ciphertext E(a) with noise N' < N and outputs a “fresh” ciphertext E(a) that also encrypts a, but which has noise parameter smaller than N^{1/2}.

[…] It turns out that a somewhat homomorphic encryption scheme that has this self-referential property of being able to handle circuits that are deeper than its own decryption circuit – in which case we say the somewhat homomorphic encryption scheme is “bootstrappable” – is enough to obtain the Recrypt algorithm, and thereby fully homomorphic encryption!
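To make the noise story concrete, here is a toy sketch (my addition): a drastically simplified symmetric scheme over the integers, in the spirit of the integer-based schemes related to Gentry’s work, not his actual construction, and of course completely insecure at these sizes. A bit m is hidden as c = p*q + 2*r + m; adding or multiplying ciphertexts adds or multiplies the noise 2*r + m, exactly the behaviour described above:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

static const int64_t P = 10007;        /* odd secret key (toy size) */

int64_t encrypt(int m)                 /* m is a bit, 0 or 1 */
{
    int64_t q = rand() % 1000 + 1;     /* random multiplier */
    int64_t r = rand() % 5;            /* small positive noise */
    return P * q + 2 * r + m;
}

int decrypt(int64_t c)                 /* works while noise < P/2 */
{
    int64_t v = c % P;                 /* recover the noise term */
    return (int)(v % 2);               /* its parity is the bit */
}

int main(void)
{
    int64_t ca = encrypt(1), cb = encrypt(1);
    printf("1 XOR 1 = %d\n", decrypt(ca + cb));  /* prints 0 */
    printf("1 AND 1 = %d\n", decrypt(ca * cb));  /* prints 1 */
    return 0;
}

Each homomorphic operation grows the noise; after too many multiplications the noise passes P/2 and decryption fails, which is exactly why Gentry’s Recrypt/bootstrapping step is the whole game.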

In order to understand my enthusiasm, here is again a link to exploring space slides, see also the posts concerning approximate structures.

Digital materialization, Euclideon and fractal image compression (III)

Bruce Dell definitely oriented his Euclideon-Unlimited Detail technique towards the geospatial industry, see for example the conference summary of the International Lidar mapping forum, Denver 2013, where he will speak about “The impact of unlimited processing power on the geospatial industry”.

He participated in the 2012 Brisbane International Geospatial Forum, July 8-11; here is the presentation program with abstracts. His talk abstract, with the same name, gives a very interesting video link.

The points I understood and want to stress are the following:

– he speaks about the “two bridges” between the virtual world and the real world; this is really very close to the Digital Materialization philosophy. So I guessed such a link correctly (here and here), from outside both interested parties. My question to the DM groups is: are you going to do something about this, for example collaborations with Euclideon? And even more: does the geospatial industry have things to learn from DM (I think it does)?

– there is definitely a Euclideon format which “may take a while” to obtain from the format of the laser scans used by the geospatial industry. In the video Dell shows an image with big racks of computers needed to do the format conversion. My question is: is he using some part of the fractal image compression idea? (Maybe, for example, Dell is not using freps, but he might use ideas from fractal image compression in his data structure.) Again, for the DM community, I have a question: given that you use huge files, maybe you can use some Euclideon tricks to ease their use, and blend them with freps?

– the Euclideon algorithm (whose fundamental part is the data structure of the Euclideon format) really works well. Looking at the images from the presentation, I ask myself whether the name “Euclideon” comes from some clever embedding of the Euclidean group of isometries into the data structure. I feel there must be something obvious about principal bundles … 🙂 which model both an observer in euclidean space AND the data acquired by laser scans … To think about.

Digital materialization, Euclideon and fractal image compression (II)

For the background see the previous post Digital materialization, Euclideon and fractal image compression.

As I wrote before, this may have been done already; please tell me if so.

The idea is that once a 3D image is encoded by a kind of fractal image compression algorithm, the problem of attributing a “3D atom” of the 3D image to a “2D pixel” of the screen becomes a search problem in a tree, as the Unlimited Detail – Euclideon algorithm maybe is. The kind of encoding I am writing about may have been done by, or may be useful for, the groups working on “Digital materialization”.

UPDATE: Here is a youtube video made by the AEROmetrex company, “a leader in photogrammetric solutions” which “is launching its latest technological service aero3Dpro today: a complete 3D modelling service for Australian and overseas users of geospatial and 3D data”.

In the comments, they write they are using Geoverse/Unlimited Detail/Euclideon technology and they mention that

For a 15Gb 3D model the Euclideon Unlimited Format file is about 2 Gb

Below I detail my speculation that Euclideon may use a variant of fractal image compression in its format.

I do not want to trespass any boundaries or pretend I am doing anything new; that is why I repeat my request to tell me if anything like this has been done (and who did it).

1. Let me first recall a famous theorem by Hutchinson, concerning iterated function systems.

Let (X,d) be a complete metric space and let H(X) be the collection of compact sets in (X,d). On the space H(X) we may put the Hausdorff distance between subsets of X, defined in the following way. For any \varepsilon > 0 and for any set A \subset X, the \varepsilon neighbourhood of A is the set

A_{\varepsilon} = \left\{ y \in X \mbox{ : } \exists x \in A, \, d(x,y) \leq \varepsilon \right\} = \cup_{x \in A} B(x,\varepsilon)

where B(x, \varepsilon) is the closed ball centered at x, of radius \varepsilon.

The Hausdorff distance between sets A, B \subset X is then

d_{H}(A,B) = \inf \left\{ \varepsilon > 0 \mbox{ : } A \subset B_{\varepsilon} , B \subset A_{\varepsilon} \right\}

The important fact here is that (H(X), d_{H}) is a complete metric space. Moreover, if X is compact, then H(X) is compact too.

An iterated function system (IFS) on the compact space (X,d) is a finite collection of transformations of X, say a_{1}, ... , a_{N}, such that every a_{i} is a contraction: there is r_{i} \in (0,1) such that

d(a_{i}(x), a_{i}(y)) \leq r_{i} d(x,y) for any x,y \in X.

The Hutchinson operator associated to the IFS is the transformation T : H(X) \rightarrow H(X),

T(A) = \cup_{i = 1, ..., N} a_{i}(A).

Hutchinson’s theorem says that T is a contraction with respect to d_{H}, therefore it has a unique fixed point, i.e. a compact set A \subset X such that T(A) = A, which is the same as

A = \cup_{i = 1, ..., N} a_{i}(A).

2. Michael Barnsley had the idea of using this result for fractal image compression. In a few words, fractal image compression is any algorithm which solves the inverse problem: given A \in H(X), find an IFS which has A as a fixed point. In this way the set A (which represents the image, for example as the graph of the function which associates to any pixel an RGB vector of colors) is “encoded” by the functions of the IFS. More specifically, we may take X to be (a compact subset of) \mathbb{R}^{n} and look for an IFS of affine contractions. Then the set A is encoded in (or compressed to) the set of coefficients of the affine transformations of the IFS.
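As a toy illustration of the direct (decoding) side, here is a minimal sketch (my addition): the “chaos game” renders the attractor of the Sierpinski IFS, i.e. the unique fixed point A of the Hutchinson operator for the three contractions a_{i}(x) = (x + v_{i})/2, by iterating randomly chosen maps:

#include <stdio.h>
#include <stdlib.h>

#define W 64
#define H 32

int main(void)
{
    static char grid[H][W];                 /* zero-initialized */
    const double vx[3] = {0.0, 1.0, 0.5};   /* triangle vertices v_i */
    const double vy[3] = {0.0, 0.0, 1.0};
    double x = 0.3, y = 0.3;                /* arbitrary start point */
    for (int k = 0; k < 100000; k++) {
        int i = rand() % 3;                 /* pick one contraction */
        x = (x + vx[i]) / 2;                /* a_i is a 1/2-contraction */
        y = (y + vy[i]) / 2;
        if (k > 20)                         /* skip the transient */
            grid[(int)(y * (H - 1))][(int)(x * (W - 1))] = 1;
    }
    for (int r = H - 1; r >= 0; r--) {      /* print with y pointing up */
        for (int c = 0; c < W; c++)
            putchar(grid[r][c] ? '*' : ' ');
        putchar('\n');
    }
    return 0;
}

A fractal compression algorithm runs in the opposite direction: given the picture, find the contractions a_{i}.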

3. Without going into the last detail, let us see this construction from the point of view of “Digital materialization”. The idea behind it is to characterise a subset of X by a function \phi: X \rightarrow \mathbb{R} such that x \in A if and only if \phi(x) \leq 0. If A is described by \phi and B is described by \psi, then

A \cup B is described by \min \left\{ \phi, \psi \right\}  and

A \cap B is described by \max \left\{ \phi , \psi \right\} and

– for any bijective transformation a  of X, the set a(A) is described by \phi \circ a^{-1}.

Thus, starting from a library of functions \phi, \psi, ... and from a library of bijective transformations of the space (like translations, rotations, etc., where this makes sense), we may describe a huge collection of sets by a tree, which has nodes decorated with \min, \max and edges decorated with compositions \phi \circ a, with \phi from the library of “textures” [my choice of name, maybe not a good one, correct me please] \phi, \psi, ... and a from the library of “legal” bijective transformations.

In fact, we may refine this description a bit by giving: [note added later: is this some version of “Image coding based on a fractal theory of iterated contractive image transformations”, A.E. Jacquin, IEEE Trans. on Image Processing, 1992?]

– a master shape \phi, for example taken from the equation of a ball of radius 1, if we suppose that X is a euclidean space. Let M be the set described by \phi.

– a collection of “locators” [I’m totally inventing names here], that is, a finite collection of (affine, say) contractions of X which send M into a subset of M

– a potentially infinite collection of “textures”, one for every IFS constructed from a finite set of locators and the master shape \phi. Therefore, to any finite collection of locators a_{1}, ... , a_{p} we associate the “texture”

\psi = \min \left\{ \phi \circ a_{1}^{-1} , ... , \phi \circ a_{p}^{-1} \right\}.

– a finite collection of “legitimate translations”, which are just affine (say) transformations with the property that they move M to  sets which do not intersect with M, and such that they generate a group which is roughly equivalent with the space X (simply put, if X is \mathbb{R}^{n} then the group generated by legitimate translations contains a \mathbb{Z}^{n}).

Given an IFS constructed from compositions of “locators” and “legitimate translations”, say a_{1}, ... , a_{M}, there is a Hutchinson-like operator which associates to any “shape” \phi (a function obtained by applying a finite number of “min” and “max” operations to a finite collection of functions obtained from the master shape composed with locators and legitimate translations) the new shape

T(\phi) = \min \left\{ \phi \circ a_{1}^{-1} , ... , \phi \circ a_{M}^{-1} \right\}.

Notice that, in terms of the tree which describes the shapes, the tree which describes T(\phi) is easily (recursively) obtained from the tree which describes \phi.

Now, a compression algorithm associated to this data is any algorithm which solves the following problem: given a (compact) set A \subset X and an \varepsilon > 0, find an IFS constructed from “locators” and “legitimate translations” which has its fixed point inside A_{\varepsilon}.

By using locators and legitimate translations, one may define “textures” and “shapes” at a scale.

The big problem is to find efficient algorithms, but once such an algorithm is used to encode a shape (which might be time intensive) the decoding is easy!

Carl Einstein on Picasso and the visual brain

The article “Carl Einstein, Daniel-Henry Kahnweiler, Cubism, and the Visual Brain” made me realize that cubism, invented by Picasso and Gris, was probably a logical step further along the path towards the investigation of vision opened by the impressionists.

Indeed, far from being a game of abstraction, multiple viewpoints and other rubbish, it appears that cubism, at least in its first stages, represents the effort to understand the first stages of vision, as they happen in the (artist’s) brain. I find this story amazing; it shows how far a brilliant mind (Picasso’s) could go ahead of its time.

Here is a reproduction of the painting “Guitarist”, by Picasso, 1910, taken from the cited article:

It would be interesting to compare the statistics of edges, corners and blobs (actually blobs are higher-level features) in cubist paintings from this period with the statistics of the same features in databases of natural images. My bet is that they are very close.

Digital Materialization, Euclideon and Fractal Image Compression

Are these:

Digital Materialization (DM)

Euclideon (Unlimited Detail)

Fractal Image Compression

related? Could this rather simple (from a pure math viewpoint) research program be doable, and moreover, has it already been DONE by anybody?

Let me explain. I began to wonder after searching with Google for

– “Digital materialization” AND “euclideon”   – no answers

– “Digital materialization” AND “Fractal image compression” – no answers

– “Fractal image compression” AND “euclideon” – no answers

1. A quote from the Digital Materialization Group homepage:

Digital Materialization (DM) can loosely be defined as two-way direct communication or conversion between matter and information that enable people to exactly describe, monitor, manipulate and create any arbitrary real object. DM is a general paradigm alongside a specified framework that is suitable for computer processing and includes: holistic, coherent, volumetric modeling systems; symbolic languages that are able to handle infinite degrees of freedom and detail in a compact format; and the direct interaction and/or fabrication of any object at any spatial resolution without the need for “lossy” or intermediate formats.

DM systems possess the following attributes:

  • realistic – correct spatial mapping of matter to information
  • exact – exact language and/or methods for input from and output to matter
  • infinite – ability to operate at any scale and define infinite detail
  • symbolic – accessible to individuals for design, creation and modification

As far as I understand, this works based on Function Representation (FREP), see HyperFun.org. The idea is to define an object in \mathbb{R}^{3}, say, by a function F: \mathbb{R}^{3} \rightarrow \mathbb{R}, which describes the object as the set of points x where F(x) \leq 0. Start with a small library of functions (for example polynomials) and then construct other functions by using min, max, and so on. Therefore an object is described by a tree, with leaves decorated by functions and nodes decorated by min, max, … operations. This is a very simplistic, but fundamentally precise description. The main point is that there is no object (defined for example by a mesh in space); instead, we may check whether a point x in space belongs to the object or not by evaluating the sign of F(x). Moreover, translations, rotations, dilations (or any other easy-to-implement change of parametrization of the space) are implemented by composing the function F which describes the object with the change of coordinates.

In particular, we may easily pass to polar coordinates based on the point of view and stay in this  coordinate system for visualization.
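To illustrate the description above, here is a minimal Python sketch (all names are mine; a toy model of the idea, not the HyperFun language itself):

import math

# A primitive: the object is the set of points p with F(p) <= 0.
def ball(radius):
    return lambda p: math.sqrt(sum(c * c for c in p)) - radius

def halfspace(n, d):     # points p with <n, p> <= d
    return lambda p: sum(ni * pi for ni, pi in zip(n, p)) - d

# Set operations become min/max on the defining functions.
def union(F, G):        return lambda p: min(F(p), G(p))
def intersection(F, G): return lambda p: max(F(p), G(p))

# A change of coordinates phi acts by composition: the new object is
# the set of points p such that phi(p) lies in the old object.
def transformed(F, phi):
    return lambda p: F(phi(p))

# Lower half of the unit ball; membership is just a sign evaluation.
F = intersection(ball(1.0), halfspace((0.0, 0.0, 1.0), 0.0))
print(F((0.0, 0.0, -0.5)) <= 0)   # True: inside
print(F((0.0, 0.0, 0.5)) <= 0)    # False: outside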

2. Fractal image compression is based on the fact that any compact set (like an image, described as a compact set in the 5D space = 2D (spatial) times 3D (RGB levels)) can be approximated arbitrarily well by the fixed point of an iterated function system.
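For reference (standard facts, recalled here for convenience): if a_{1}, \dots, a_{p} are contractions of a complete metric space, then the Hutchinson operator H(A) = a_{1}(A) \cup \dots \cup a_{p}(A) is a contraction in the Hausdorff distance on compact sets, hence it has a unique fixed point, the attractor of the IFS. The compression side rests on the collage theorem: if the Hausdorff distance between A and H(A) is at most \varepsilon, then the distance from A to the attractor is at most \varepsilon / (1 - \lambda), where \lambda < 1 is the contraction factor.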

This brings me to

PROBLEM 1: Is there any version of the Hutchinson theorem about fixed points of IFS, with the space of compact sets replaced by a space of FREP functions? Correspondingly, is there a “FREP  compression algorithm”?

My guess is that the answer is YES. Let’s assume it is so.

3. In the article "Digital Foundry vs. Unlimited Detail" the following 2008 post by Bruce Dell is reported:

Hi every one , I’m Bruce Dell (though I’m not entirely sure how I prove that on a forum)

Any way: firstly the system isn’t ray tracing at all or anything like ray tracing. Ray tracing uses up lots of nasty multiplication and divide operators and so isn’t very fast or friendly.
Unlimited Detail is a sorting algorithm that retrieves only the 3d atoms (I wont say voxels any more it seems that word doesn’t have the prestige in the games industry that it enjoys in medicine and the sciences) that are needed, exactly one for each pixel on the screen, it displays them using a very different procedure from individual 3d to 2d conversion, instead we use a mass 3d to 2d conversion that shares the common elements of the 2d positions of all the dots combined. And so we get lots of geometry and lots of speed, speed isn’t fantastic yet compared to hardware, but its very good for a software application that’s not written for dual core. We get about 24-30 fps 1024*768 for that demo of the pyramids of monsters. The media is hyping up the death of polygons but really that’s just not practical, this will probably be released as “backgrounds only” for the next few years, until we have made a lot more tools to work with.

Assuming that we take a database representing a very complex 3D object (like a piece of landscape) and we convert it, by using a FREP compression algorithm, into a tree as described at point 1, then it becomes easy to imagine how the Unlimited Detail algorithm might work.

Problem 2: Given a FREP representation of a collection of 3D objects, describe an efficient sorting algorithm which uses the representation and outputs the part of the union of objects visible from a given point at infinity.

Conclusion: unless this is utter gibberish, the modus operandi of an "unlimited detail" algorithm could be the following:

1)- start with a database of a collection of 3d objects and compress it into a FREP format

2)- perform a "mass 3d to 2d conversion", by using a solution of problem 2, in polar coordinates centered at the viewpoint.
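For what it is worth, here is a guess at how step 2)- might look per pixel, as a minimal Python sketch. It assumes, beyond anything stated above, that the function F is a signed distance bound, so that sphere tracing (Hart's algorithm) applies; all names are mine, and this is certainly not a claim about Euclideon's actual method:

import math

def sphere_trace(F, origin, direction, t_max=100.0, eps=1e-4):
    # March along origin + t*direction; F(p) is assumed to bound from
    # below the distance from p to the object {F <= 0}.
    t = 0.0
    while t < t_max:
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = F(p)
        if d < eps:
            return t          # hit: the one "3d atom" for this pixel
        t += d                # safe step: no surface closer than d
    return None               # miss

F = lambda p: math.sqrt(sum(c * c for c in p)) - 1.0       # unit ball
print(sphere_trace(F, (0.0, 0.0, -3.0), (0.0, 0.0, 1.0)))  # about 2.0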

Right angles everywhere (II), about the gnomon

In this post I shall write about the gnomon. According to wikipedia,

The gnomon is the part of a sundial that casts the shadow. Gnomon (γνώμων) is an ancient Greek word meaning “indicator”, “one who discerns,” or “that which reveals.”

In the next figure are collected the minimal ingredients needed for understanding the gnomon: the sun, a vertical shape and its horizontal shadow.

That is the minimal model of the ancient greek visual universe: the sun, a man and his shadow on the beach. It is a speculation, but to me a gnomon seems to be a visual atom.

Pythagoreans extracted the pattern from this minimal visual universe and used it to give an explanation of human vision, described by the next figure.

Here the sun is replaced by the eye (of a god, initially, but the pattern might apply to a mortal as well), the light rays emanated by the sun are assimilated to the lines of vision (whence the misconception that the ancient greeks really believed that the eyes shoot rays which illuminate the field of vision) and the indivisible pair man-shadow becomes the L-shape of a gnomon. An atom of vision.

Here comes a second level of understanding the gnomon, also of Pythagorean flavor. I cite again from the wiki page:

Hero defined a gnomon as that which, added to an entity (number or shape), makes a new entity similar to the starting entity.

This justifies Euclid's picture of the gnomon, as a generator of self-similarity:

(image taken from the wiki page on gnomon)
So maybe the word "atom" is less appropriate than "generator". In conclusion, according to the ancient greeks, a gnomon (be it a triple sun-man-shadow or a pair eye – elementary L-shape) is the generator of visual perception, via the mechanism of self-similarity.

In their architecture, they tried to make this obvious, readable.  Because it’s scalable (due to the relation with self-similarity), the architectural solution of constructing with gnomons  invaded the world.

Right angles everywhere (I)

Related: The gnomon in the greek theater of vision, I.

Look at almost any building in the contemporary city: it is constructed from right angles, assembled into rectangles, assembled into boxes. We expect, in fact, a room to have a rectangular floor, with vertical walls meeting at right angles. Exceptions are due either to architectural fancies or to historical constraints or mistakes.

When a kid draws a house, it looks like a rectangle, with the  triangle of the roof on top.

Is this normal? Where does this obsession with the right angle come from?

The answer is that behind any right angle hides a gnomon. We build like this because we are Pythagoras' children, living by the rules and categories of our cultural ancestors, the ancient greeks.

Let’s see:
(I) In ancient times, or in places far from the greeks (and babylonians), other architectural forms were preferred, like the roundhouse. Here's a Scottish broch (image taken from this wiki page)

and here’s a Buddhist stupa (image taken from the wiki page)

Another ancient building form is the step pyramid, like the Great Ziggurat of Ur (image taken from the last wiki page)

or the egyptian pyramids, or any other famous  pyramid in the world (there are plenty of them, in very different cultural frames).

Here is a Sardinian Nuraghe

Conclusion: round, conical, pyramidal is the rule, there are no right angles there!

Until the greeks: here’s the Parthenon

It is made of gnomons, here’s one (from the wiki page)

Next time, about gnomons.

Mass connected processing?

In this Unlimited Detail technology description the term "mass connected processing" appears. Searching the net for it, one finds this post, from which I cite:

“By the looks of what they are saying, the areas and level of real time software performance they are talking about, it is likely to be the same methods that I came up with back around 1997 (when I was also in Brisbane), or not far off of it, as the problem reduces down to single 100% efficient methods. ”

Does anybody know what this is all about?

A discussion about the structure of visual space

In August I discovered the blog The structure of visual space group and I was impressed by the post The Maya theory of (visual) perception.  A discussion with Bill Rosar started in the comments, which I consider interesting enough to rip it from there and put it here.

This discussion may also help to better understand some of the material from my blog. Several  links were added in order to facilitate this. Here is the exchange of comments.

Me: "I just discovered this blog and I look forward to reading it in detail.

Re: “What I am calling into question is the ontological status of a physical world existing beyond the senses.”

Also in relation to the mind-body dualism, I think there is a logical contradiction in the "Cartesian Theater" argument by Dennett, due to the fact that Dennett's theater is a theater in a box, already designed for a dualist homunculus-stage space perception (in contradistinction to the older, original Greek Theater)."

Bill Rosar: “Thank you for your posting, Marius Buliga, and welcome! It is great to have a mathematician join us, for obvious reasons, especially since you are interested in problems in geometry.

Your idea of the eye as a “theatron” is interesting, though I do not believe that the brain is computing anything, for the simple reason that it is not a computer, and doesn’t behave like one, as some neuroscientists are now publicly saying. It is people who perform computations, not brains.

Raymond Tallis, who posted “The Disappearance of Appearance” here two years ago, went to some pains to articulate the fallacious reasoning behind the computational metaphor of mind and brain in his marvelous little book WHY THE MIND IS NOT A COMPUTER.

It has long been a truism in cognitive psychology that we do not see our retinal images, and the “function” or process of vision is probably very different from the creation of images, because there is no image in the brain, nor anything like one. If anything, the pattern of stimulation on the retinae is “digested” by the visual system, broken down rather like food is into nutrients (as an alternative, think of chemical communication among insects).

To my knowledge, Descartes did not invoke the analogy of a theater for vision (or perception in general), so for Dennett to construe his ideas on such an analogy is dubious at the outset and, in this instance, just seems to make for a straw man. For that matter, Dennett does not seem to understand the reasons for dualism very fully, and as nearly as I can determine, never bothered to acquaint himself with the excellent volume edited by John Smythies and John Beloff, THE CASE FOR DUALISM (1989). His ill-informed refutations just strike me as facile and unconvincing (and his computational theory of mind has been roundly rejected by Ray Tallis as being fallacious).

My own invoking of theater here as an analogy is to reality itself, not just perception, and is therefore quite different from the view Dennett imputes to Cartesian dualism, though. I propose that physics studies the stagecraft of a reality that only (fully) exists when perceived–which is closer to Berkeley than Descartes, and is a view consistent with John Wheeler’s “observer-participant” model of the universe.

Theoretical physicist Saul-Paul Sirag advanced a “many realities” alternative to the Everett-Wheeler “many worlds” hypothesis, arguing that other realities are mathematically possible. That is why I have tendered the provocative notion that the reality we know is a sort of construction, one that is maintained by the physical constants–or so it seems. Sirag argued that it is not the only possible reality for that reason, and that the constants are comparable to the “chains” that hold the cave dwellers captive to the shadow play on the wall.

I propose instead that the senses are part of the reality-making “mechanism,” and that vision has more the character of a resolving mechanism than a picture-making one (not quite like the Bohm-Pribram holographic reality/brain analogy, though). That gets rid of the homunculus problem, because it turns the perception process inside out: The person and homunculus are one and the same, and visual space is just where it appears to be, viz. in front of us, not a picture made by the visual system in the brain. The forerunner of this view was James Culbertson. The flaw is that it requires a rejection or modification of the causal theory of perception, as we have discussed here. But causality is a metaphysical principle, not a physical one, and perhaps in this context at least requires some close scrutiny, just as Culbertson gave it.”

Me: "Re: "…for Dennett to construe his ideas on such an analogy is dubious at the outset and, in this instance, just seems to make for a straw man." This is my impression also, but what can we learn from this about vision?

As a mathematician, maybe, I am quite comfortable with vagueness. What I get from the greek theater/theater in a box argument is that the homunculus is as artificial as the scenic space, or the outer, physical space. These two notions come in pairs: either one has both, or none. The positive conclusion of the argument is that we have to go higher: there is a relation, akin to a map-territory relation, which has on one side the homunculus and on the other side the space.

Let me elaborate a bit on the map-territory relation. What is a map of a territory? It is the outcome of a collection of procedures agreed by the cartographer and the map reader. The cartographer wanders through the territory and constructs a map by some procedure, say by measuring angles and distances using some apparatus. The cartographer follows a convention of representation of the results of his experiments on a piece of paper, let us call this convention “euclidean geometry” (but it might be “quantum mechanics” as well, or “relativity theory”…). The map reader knows that such convention exists and moreover, at least concerning basic facts, he knows how to read the map by applying the convention (for example, the reader of the map of a city, say, knows that straight lines are shortest on the maps as well as across the city). We may say that the map-territory relation (correspondence between points from the territory – pixels from the map) IS the collection of agreed procedures of representation of experiments of the cartographer on the map. The relation between the particular map and the particular territory is just an outcome of this map-territory relation.

Looking at this level, instead of speaking about the perception of the exterior by the homunculus, it is maybe more reasonable to speak, like in "The structure of visual spaces" by J.J. Koenderink, A.J. van Doorn, Journal of Mathematical Imaging and Vision, Volume 31, Issue 2-3 (2008), pp. 171-187, about the structure of the visual space as being the result of a controlled hallucination, based on prior experiences which led to coherent results."

Bill Rosar: "Thank you, Marius! What can we learn from Dennett's faulty analysis of vision, you ask? The "moral of the story" IMO is that any model based on computation presupposes that we know how people perform computations–or how the human mind does–which is something presently unknown to us, because we don't really know what the "mind" really is–it's just a name. All a computer does is automate a procedure we humans perform. To assume that Nature makes computers strikes me as a classic example of anthropomorphism, and Ray Tallis would agree. How then to get beyond that fallacy? Or, in the case of vision, to echo John Wheeler's style of formulating foundational problems in physics, "How do you get vision without vision?"–that is, how to understand vision without presupposing it? That's quite a feat!

A few months ago when Bob French and I were last debating some of these points I suggested that we turn to the evolution of the eye and see what that tells us. Conveniently the evolution of the eyes has been one of Richard Dawkins’ favorite examples to refute the idea of “intelligent design”.

In light of all the questions the account Dawkins raises but leaves unanswered, intelligent design seems to make more sense (I offer no opinion on that myself). So it is a question of what the simplest eyes do and how the organisms possessing them use them. There is a nice little video on YouTube that highlights all that Dawkins does not explain in his simplest account of the evolution of the eye.

As for the map-territory analogy you suggest, it is comparable to the idea of “cortical maps” but shares the same conceptual pitfall as that of the perspective projection analogy I gave above, because as I noted, unlike being able to compare the flat perspective projection (map) with the 3-D *visual space* of which it is (supposedly) a projection, we cannot do that with visual space in relation to putative physical space, which lies beyond our senses. It seems to me that we are to some extent each trapped solipsistically within our own perceptual world.

Koenderink’s idea just seems like nonsense to me, because we don’t even really know what hallucinations are any more than how a hallucinatory space is created relative to our “normal” waking visual space (BTW we invited Koenderink to join the blog a few years ago, but he never replied). The *concept* of a hallucination is only useful when one has some non-hallucinatory experience to which to compare it–thus the same problem as the projection analogy above.

Trouble is we seem to be *inside* the system we are trying to understand, and therefore cannot assume an Archimedean point outside it from which to better grasp it (one of the fundamental realizations Einstein had in developing the theory of relativity, i.e., relativity is all *within* the system = universe).

As for visual space being non-Euclidean or not, I called into question many years ago the interpretation of the data upon which all theories of the geometry of visual space are based, because the "alley experiments" never took into account changes of projection on the retinae as a function of eye movement, i.e., the angles of objects projected on the retina are constantly changing as the eyes move. This has never been modeled mathematically, but it should be. Just look at the point where a wall meets the ceiling and run your eyes along its length, back and forth. You will notice that the angle of the line changes as you move your eyes along it.

Yes, the space and homunculus are an inseparable pair IMO–just look at Wheeler’s symbolic representation of the observer-participant universe (the eye looking at the U).”

Bill Rosar: “I should hasten to emend my remarks above by stating that when we speak of “eyes” and “brains” such objects are only known to us by perception. So like any physical object, we cannot presuppose their existence as such separate from our perception of them–except by an act of a kind of faith (belief), much as we believe that the sun will rise every morning. Therefore talking about their “function” etc. is still all resting upon perceptions, without which we would have no knowledge of anything, ergo, something like Aristotle’s dictum “There is nothing in the mind that was not first in the senses.” Are there eyes and brains that exist independently of perceptions of them?”

Me: “Dear Bill, thank you for the interesting comments! I have several of my own (please feel free to edit the post if it is too long, boring or otherwise repellent for the readers of this blog):

1. It looks to me that we agree more than my faulty style of exposition shows: one cannot base an explanation of how space is "re-constructed" in the brain on the structure of the physical space, full stop. It may be that what we call the structure of physical space is formed by features selected as significant by our brain, in the same way as a wind pipe extracts a fundamental note from random noise (thank you Neal Stephenson).

2. We both agree (as well as Koenderink, see his “Brain a geometry engine”) that, as you write, “the senses are part of the reality-making “mechanism,” and that vision has more the character of a resolving mechanism than a picture-making one”.

3. Concerning "computing": it is just a word. In the sense that "computing" is defined as something which could be done by Turing machines, or expressed in lambda calculus, etc., I believe too that the brain is not computing. With effort and a lot of dissipation, it seems that the brain is able to compute in this sense, but naturally it does not. (It would be an interesting project to experimentally "measure" this dissipation, starting for example from a paper by Mark Changizi, "Harnessing vision for computation"; here is the link to a pdf.)

4. But if we enlarge the meaning of the word "computing" then it may as well turn out that the brain does compute. The interesting question for a mathematician is: find a definition of "computation in an enlarged sense" which fits with what the brain does in relation to vision. This is a project dear to me, I don't want to bother you with it (unless you are interested), which might eventually have real-world applications. I started it with the paper "Computing with space, a tangle formalism for chora and difference" and I reached the goal of connecting this (as a matter of proof of principle, not because I believe that the brain really computes in the classical sense of the word) with lambda calculus in the paper "Local and global moves on locally planar trivalent graphs, lambda calculus and lambda-Scale".
(By the way, I cannot solve the problem of where to submit a paper like “Computing with space…”)

5. Concerning "hallucination": as previously, it is just a word. What I think is likely to be true is that, even if the brain does not have direct access to the physical space, it may learn a language of primitives of this space, by some bayesian or other, unknown, procedure. This is akin to saying that we may explain why we see (suppose, for the sake of the discussion) a euclidean 3d space not by using as hypothesis the fact that the physical space has this structure, but because our brains learn a family of primitives of such a structure and then lay in front of our eyes a "hallucination" which is constructed by the consistent use of those primitives."

Bill Rosar: “Thanks for these stimulating thoughts and ideas, Marius. Not to worry about the length of your blog postings. Mine are often (too) long, too. My remarks will be in two parts. This is part I.

When John Smythies and I started this blog (which was really intended to be a “think tank” rather than a blog), we agreed that, following the lead of Einstein, it may be necessary to re-examine fundamental concepts of space and geometry (not to mention time), thus John’s very first posting about Jean Nicod’s work in this regard, and a number of mine which followed.

One of these fundamental concepts that calls for closer scrutiny is space itself, or, to be more precise, the nature of *spatial extension,* both of which are abstractions, especially in mathematics (in this regard see Graham Nerlich’s excellent monograph, “The Shape of Space”).

We need to better understand the basis of those two abstractions–space and extension–IMO if we are to make progress on the nature of visual space, or the other sensory modalities that occupy perceptual space as a whole (auditory, tactile, olfactory, gustatory). Abstractions reflect both what they omit and what they assume, and it is the assumptions that we especially need to examine here. While clearly visual space is extended, what about smell? Are smells extended in space?

What we find is that there is a *hierarchy* in perceptual space, one that in man is dominated by visual sensation–what has been called the "dominant visual matrix" by psychologists studying perception. Even sounds are referred to visual loci ("localized"), and I think that can be said of smells, too. But in and of themselves it is not clear that even auditory sensations are extended in the same way that visual sensations are, because it is as if when a sound is gone, that part of the "soundscape" is also gone, but that which remains is visual space. In visual space an object may disappear, but the locus it occupied does not also disappear. For example, though we can point to the *visual* source of a sound we hear, we do not point to a sound–even the phrase sounds strange, and ordinary language reveals much about the nature of the perceptual world–or what the man of the street calls the "physical world.""

Bill Rosar: “Part II.

If that is so, why should we assume that physical space has all the properties of visual space and is perhaps not more like smell? Physics is making one big assumption!

I will always remember what Caltech mathematician Richard M. Wilson told me when I consulted him many years ago on ideas I had about how the geometry of visual space reflects changing perspective projections on the retinas. He said, “Keep it simple!” By that he meant being parsimonious and not jumping into fancy mathematical formulations without necessity. I am suggesting that we need to keep the mathematical apparatus here to a minimum, lest its elegance obscure the deeper truth we are seeking–just as Einstein cautioned.

So when we talk about the brain, I think we need to be mindful of what Ray Tallis says about it in his posting “The Disappearance of Appearance,” and just *how* we know about the brain, because we cannot talk about the extended world of physical space and exclude the brain itself from that as a (presumably) physically extended biophysical object. It is not that there is the physical world and there is the brain apart from it.

This ultimately becomes question-begging, because in talking about the brain, we are presupposing physical space, rather than explaining how we have arrived at the notion of physical space and extension. Certainly physical science would deny that physical space is created by the brain. Yet David Bohm would say that physics is largely based on an optical lens-like conception of the physical world, but that physical reality may be more like a hologram (now once again a popular analogy in cosmology because of Leonard Susskind’s theory).

Of course when Karl Pribram then talks about the brain being a mechanism that resolves the holonomic reality (“implicate order”) into a hologram or holographic image (“explicate order”), he forgets that the brain itself would presumably be part of the same holonomic implicate order, and would therefore be resolving itself. By what special power can it perform that trick?

So the very “picture” we have of the brain itself is no different from any other physical entity, as Ray Tallis has been at pains to show.

For now, I’m going to rest with just these rejoinders, and return to your other points later.”

Computing with space: done!

The project of giving a meaning to the "computing" part of "Computing with space" is done, via the \lambda-Scale calculus and its graphic lambda calculus (still in preview mode).

_______________________

UPDATE (09.01.2013): There is now a web tutorial about graphic lambda calculus on this blog.  At some point a continuation of “Computing with space …” will follow, with explicit use of this calculus, as well as applications which were mentioned only briefly, like why the diagram explaining the “emergence” of the Reidemeister III move gives a discretized notion of scalar curvature for a metric space with dilations.

_______________________

Explanations.  In the “Computing with space…” paper I claimed that:

1. – there is a “computing” part hidden behind the idea of emergent algebras

2. – which is analogous to the hypothetical computation taking place in the front-end visual system.

Part 1 is essentially done. The graphic version of \lambda-Scale is in fact very powerful, because it contains as sectors:

– lambda calculus

– (discrete, abstract) differential calculus

– the formalism of tangle diagrams.

These "sectors" appear as subsets S of graphs in GRAPH (see the preview paper for definitions), for which the condition G \in S is global, together with respective selections of local or global graphic moves (from those available on GRAPH) which transform elements of S into elements of S.

For example, for lambda calculus the relevant set is \lambda-GRAPH and the moves are (ASSOC) and  the graphic \beta move (actually, in this way we obtain a formalism a bit nicer than lambda calculus; in order to obtain exactly lambda calculus we have to add the stupid global FAN-OUT and global pruning moves).

For differential calculus we need to restrict to graphs like those in \lambda-GRAPH, but also admitting dilation gates. We may directly go to \lambda-Scale, which contains lambda calculus (made weaker by adding the (ext) rules, corresponding to \eta-conversion) and differential calculus (via emergent algebras). The moves are (ASSOC), graphic \beta move, (R1), (R2), (ext1), (ext2) and, if we want a dumber version,  some global FAN-OUT and pruning moves.

For tangle diagrams see the post Four symbols and wait for the final version of the graphic calculus paper.
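To fix ideas, here is a tiny Python sketch of the common pattern behind all these sectors (the encoding is mine and merely stands in for the actual GRAPH formalism): a sector is a global membership predicate together with a list of moves preserving it, and computation is repeated application of the moves.

def normalize(graph, moves, in_sector, max_steps=10000):
    # Apply the first applicable move until none applies. A move
    # returns a new graph, or None when it does not apply.
    assert in_sector(graph)
    for _ in range(max_steps):
        for move in moves:
            result = move(graph)
            if result is not None:
                assert in_sector(result)   # moves must send S into S
                graph = result
                break
        else:
            return graph                   # normal form reached
    raise RuntimeError("step budget exhausted")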

SO now, I declare part 1 CLOSED. It amounts to patiently writing up all the details, which is an interesting activity in itself.

Part 2 is open, albeit now I have much more hope of giving a graph model for the front end of the visual system which does not rely on assumptions about the geometric structure of the space, linear algebra, tasks and other niceties of the existing models.

UPDATE 02.07.2012. I put the graphical formalism paper on arXiv; it should appear on 03.07.2012. I left out of the paper a big chunk of very intriguing facts about various possible crossing definitions, for another paper.

Leap motion, another example of computing with space

After the post on Unlimited Detail, here is another example of something which may be seen as (facilitating the) computing with space: the Leap, by Leap Motion.

I see a difference and a common point, when I look at both.

Difference: The Leap, by Leap Motion, is a finished, or almost finished, product, while Unlimited Detail, by Euclideon, is still in development.

Common point: In both cases it is stressed that the respective products are outcomes of mathematical breakthroughs!

A geometric viewpoint on computation?

Let me try to explain what I am trying to do in this work related to “computing with space“. The goal is to understand the process of emergence, in its various precise mathematical forms, like:

– how does the dynamics of a big number of particles become the dynamics of a continuous system? Apart from the physics BS of neglecting infinities, I know of very few mathematically correct approaches. From my mixed background in calculus of variations and continuum mechanics, I can mention an example of such an approach in the work of Andrea Braides on the \Gamma-convergence of the energy functional of a discrete system to the energy functional of a continuous system, and on atomistic models of solids.

– how to endow a metric space (like a fractal, or sub-riemannian space) with a theory of differential calculus? Translated: how to invent “smoothness” in spaces where there is none, apparently? Because smoothness is certainly emergent. This is part of the field of non-smooth calculus.

– how to explain the profound resemblance between the geometrical results of Gromov on groups with polynomial growth and the combinatorial results of Breuillard, Green, Tao on approximate groups? In both cases a nilpotent structure emerges from considering larger and larger scales. The word "explain" means here: identify a general machine at work in both results.

– how to explain the way our brain deals with visual input? This is a clear case of emergence, because the input is the excitation of some receptors of the retina, while the output is almost completely not understood, except that we all know that we see objects which are moving, and complex geometrical relations among them. A fly sees as well; read From insect vision to robot vision by N. Franceschini, J.M. Pichon, C. Blanes. Related to this paper, I cite from the abstract (boldfaced by me):

We designed, simulated, and built a complete terrestrial creature which moves about and avoids obstacles solely by evaluating the relative motion between itself and the environment. The compound eye uses an array of elementary motion detectors (EMDs) as smart, passive ranging sensors. Like its physiological counterpart, the visuomotor system is based on analogue, continuous-time processing and does not make use of conventional computers. It uses hardly any memory to adjust the robot's heading in real time via a local and intermittent visuomotor feedback loop.

More generally, there seems to be a "computation" involved in vision, massively parallel and taking very few steps (up to six), but it is not understood how this is a computation in the mathematical, or computer science, sense. Conversely, the visual performances of any device based on computer-science computation are, up to now, dwarfed by any fly.

I identified a “machine of emergence” which is in work in some of the examples given above. Mathematically, this machine should have something to do with emergent algebras, but what about the computation part?

Probably geometers reason like flies: by definition, a geometrical statement is invariant up to the choice of maps. A sphere is not, geometrically speaking, a particular atlas of maps on the sphere. For a geometer, reproducing whatever he does by using ad-hoc enumeration by natural numbers, combinatorics and Turing machines is nonsense, because it is profoundly not geometrical.

On the other hand, the powerful use and control of abstraction is appealing to the geometer. This justifies the effort to import abstraction techniques from computer science and to replace the non-geometrical stuff by … whatever is more of a geometrical character.

For the moment, such efforts are mostly a source of frustration, a familiar feeling for any mathematician.

But at some point, in these times of profound changes in mathematics as well as in society, from all these collective efforts will emerge something beautiful, clear and streamlined.

Geometry of imaginary spaces, by Koenderink

This post is about the article “Geometry of imaginary spaces“,   Journal of  Physiology – Paris, 2011, in press, by Jan Koenderink.

Let me first quote from the abstract (boldfaced  by me):

“Imaginary space” is a three-dimensional visual awareness that feels different from what you experience when you open your eyes in broad daylight. Imaginary spaces are experienced when you look “into” (as distinct from “at”) a picture for instance.

Empirical research suggests that imaginary spaces have a tight, coherent structure, that is very different from that of three-dimensional Euclidean space.

[he proposes the structure of a bundle E^{2} \times A^{1} \rightarrow E^{2}, with basis the euclidean plane, "the visual field", and fiber the 1-dimensional affine line, "the depth domain"]

I focus on the topic of how, and where, the construction of such geometrical structures, that figure prominently in one’s awareness, is implemented in the brain. My overall conclusion—with notable exceptions—is that present day science has no clue.

What is remarkable in this paper? Many, many things; here are just three quotes:

–  (p. 3) “in the mainstream account”, he writes, “… one starts from samples of … the retinal “image”. Then follows a sequence of image operations […] Finally there is a magic step: the set of derived images turns into a “representation of the scene in front of you”. “Magic” because image transformations convert structures into structures. Algorithms cannot convert mere structure into quality and meaning, except by magic. […] Input structure is not intrinsically meaningful, meaning needs to be imposed (magically) by some arbitrary format.”

– (p. 4) “Alternatives to the mainstream account have to […] replace inverse optics with “controlled hallucination” [related to this, see the post “The structure of visual space“]

– (p. 5) “In the mainstream account one often refers to the optical structure as “data”, or “information”. This is thoroughly misleading because to be understood in the Shannon (1948) sense of utterly meaningless information. As the brain structures transform the optical structure into a variety of structured neural activities, mainstream often uses semantic terms to describe them. This confuses facts with evidence. In the case of an “edge detector” (Canny, 1986) the very name suggests that the edge exists before being detected. This is nonsensical, the so-called edge detector is really nothing but a “first order directional derivative operator” (Koenderink and van Doorn, 1992). The latter term is to be preferred because it describes the transformation of structure into structure, whereas the former suggests some spooky operation” [related to this, see the tag archive “Map is the territory“]

Related to my  spaces with dilations, let me finally quote from the “Final remarks”:

The psychogenetic process constrains its articulations through probing the visual front end. This part of the brain is readily available for formal descriptions that are close to the neural hardware. The implementation of the group of isotropic similarities, a geometrical object that can  easily be probed through psychophysical means, remains fully in the dark though.

Scaled lambda epsilon

My first attempt to introduce a scaled version of lambda epsilon turned out to be wrong, but now I think I have found a way. It is a bit trickier than I thought. Let me explain.

In lambda epsilon calculus we have three operations (which are not independent), namely the lambda abstraction, the application and the emergent algebra (one parameter family of) operation(s), called dilations. If we want to obtain a scaled version then we have to “conjugate” with dilations. Looking at terms as being syntactic trees, this amounts to:

– start with a term A and a scale \varepsilon \in \Gamma,

– transform a tree T such that FV(T) \cap FV(A) = \emptyset,  into a tree A_{\varepsilon}[T], by conjugating with A \circ_{\varepsilon} \cdot.

This can be done by recursively defining the transform T \mapsto A_{\varepsilon}[T]. Graphically, we would like to transform the elementary syntactic trees of the three operations into this:


The problem is that, while (c) is just the familiar scaled dilation, the scaled \lambda from (a) does not make sense, because A \circ_{\varepsilon} u is not a variable. Also, the scaled application (b) is somewhat mysterious.

The solution is to exploit the fact that it makes sense to make substitutions of the form B[ A \circ_{\varepsilon} u := C], because of the invertibility of dilations. Indeed, A \circ_{\varepsilon} u = C is equivalent to u = A \circ_{\varepsilon^{-1}} C, therefore we may define B[ A \circ_{\varepsilon} u := C] to mean B[u := A \circ_{\varepsilon^{-1}} C].

If we look at the rule (ext2) here, the discussion about substitution becomes:

Therefore the correct scaled lambda, instead of (a)  from the first figure, should be this:

The term (syntactic tree) from the LHS should be seen as a notation for the term from the RHS.

And you know what? The scaled application, (b) from the first figure, becomes less mysterious, because we can prove the following.

1. Any u \in X \setminus FV(A) defines a relative variable u^{\varepsilon}_{A} := A \circ_{\varepsilon} u (remark that relative variables are terms!). The set of relative variables is denoted by X_{\varepsilon}(A).

2. The function B \mapsto A_{\varepsilon}[B] is defined for any term B \in T such that FV(A) \cap FV(B) = \emptyset. The definition is this:

–  A_{\varepsilon}[A] = A,

–  A_{\varepsilon}[u] = u for any u \in X \setminus FV(A)

– A_{\varepsilon}[ B \mu C] = A \circ_{\varepsilon^{-1}} ((A \circ_{\varepsilon} A_{\varepsilon}[B]) \mu (A \circ_{\varepsilon} A_{\varepsilon}[C])) for any B, C \in T such that FV(A) \cap (FV(B) \cup FV(C)) = \emptyset

–  A_{\varepsilon}[ u \lambda B] is given by:

3. B is a scaled term, notation B \in T_{\varepsilon} (A), if there is a term B' \in T such that FV(A) \cap FV(B') = \emptyset and such that B = A_{\varepsilon}[B'].

4. Finally, the operations on scaled terms are these:

– for any \mu \in \Gamma and B, C \in T_{\varepsilon}(A) the scaled application (of coefficient \mu) is

B \mu^{\varepsilon}_{A} C = A \circ_{\varepsilon^{-1}} ((A \circ_{\varepsilon} B) \mu (A \circ_{\varepsilon} C))

– for any scaled variable  u^{\varepsilon}_{A} \in X_{\varepsilon}(A)  and any scaled term B \in T_{\varepsilon}(A) the scaled abstraction is

5. With this, we can prove that (u^{\varepsilon}_{A} \lambda^{\varepsilon}_{A} B) 1^{\varepsilon}_{A} C = (u \lambda B) 1 C = B[u := C], which is remarkable, I think!
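Points 1–4 translate into a recursion on syntactic trees. Here is a symbolic Python sketch (the tuple encoding is mine; the abstraction case, defined by the figure above, is left out):

# Terms as nested tuples:
#   ("var", u)            a variable
#   ("app", mu, B, C)     the application  B mu C
#   ("dil", eps, A, B)    the dilation     A o_eps B
# "eps_inv" stands for the inverse scale eps^{-1}.

def scale(A, eps, eps_inv, B):
    # Return A_eps[B]; assumes FV(A) and FV(B) are disjoint.
    if B == A:
        return A                              # A_eps[A] = A
    if B[0] == "var":
        return B                              # A_eps[u] = u
    if B[0] == "app":                         # the  B mu C  case
        _, mu, P, Q = B
        return ("dil", eps_inv, A,
                ("app", mu,
                 ("dil", eps, A, scale(A, eps, eps_inv, P)),
                 ("dil", eps, A, scale(A, eps, eps_inv, Q))))
    if B[0] == "lam":
        raise NotImplementedError  # the u lambda B case: see the figure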

The neuron unit

I am updating the paper on \lambda \epsilon almost daily. It is the first time I am doing such a thing; it may be interesting to see what comes out of this way of writing.

The last addition is something I was thinking about for a long time, something which is probably well known in some circles, but maybe not. It is about eliminating variables (names) from such a calculus. This has been done in several ways; here is another (or the same as a previous one?).

The idea is simple. Let T be any term and x \in Var(T) a variable. Look at the syntactic tree of T, then glue to all leaves decorated by x the leaves of a tree with nodes consisting of FAN-OUT gates.

Further on I shall identify syntactic trees with terms. I shall add to such trees a new family (of trees), constructed from the elementary tree depicted at (a) in the next figure. At (b) we see an example of such a tree. We consider also the trivial tree (c).

We have to think about such trees as devices for multiplying the occurrences of a variable. I call them FAN-OUT trees. All these trees, with the exception of the trivial one (c), are planar binary trees. We shall add the following rule of associativity:

(ASSOC) any two FAN-OUT trees with the same number of leaves are identified.

This rule will be applied under the following form: we are free to pass from a FAN-OUT tree to an equivalent one which has the same number of leaves. The name “associativity” comes from the fact that a particular instance of this rule (which deserves the name  “elementary associativity move”) is this:

With the help of these FAN-OUT trees we shall replace the multiple occurrences of a variable by such trees. Let us see what the rules of \lambda \varepsilon calculus become when using this notation.

\alpha conversion is no longer needed, because variables no longer have names. Instead, we are free to graft usual trees to FAN-OUT trees. This way, instead of terms we shall use "neurons".

Definition: The forgetful form (b) of a syntactic tree (a) (of a term) is the tree with the variable names deleted.

A neuron is the result of gluing the root of a forgetful form of a syntactic tree to the root of a FAN-OUT tree, like in the following figure.

The axon of the neuron is the FAN-OUT tree. The dendrites of the neuron are the undecorated edges of the forgetful form of the syntactic tree. A dendrite is bound if it is a left edge pointing to a node decorated by \lambda. For any bound dendrite, the set of dependent dendrites consists of those dendrites of the sub-tree starting from the right edge of the \lambda node (where the bound dendrite is connected via a left edge) which are moreover not bound. Otherwise a dendrite is free. The soma of the neuron is the forgetful form of the syntactic tree.
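As a sanity check on these definitions, here is a toy Python sketch (encoding mine, plain lambda terms without the dilation gates) of the passage to the forgetful form. By (ASSOC) the only datum a FAN-OUT tree carries is its number of leaves, so a count per variable is all that needs to be kept.

from collections import Counter

def forgetful(term):
    # term: ("var", x) | ("app", f, a) | ("lam", x, body).
    # Returns the nameless tree and a count of free occurrences,
    # i.e. how many leaves each variable's FAN-OUT tree needs.
    if term[0] == "var":
        return ("dendrite",), Counter({term[1]: 1})
    if term[0] == "app":
        f, cf = forgetful(term[1])
        a, ca = forgetful(term[2])
        return ("app", f, a), cf + ca
    if term[0] == "lam":
        body, cb = forgetful(term[2])
        n = cb.pop(term[1], 0)       # occurrences of the bound variable
        return ("lam", n, body), cb  # n dependent dendrites to wire up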

Substitution. We are free to connect leaves of axons of neurons with dendrites of other neurons.
In order to explain the substitution we have to add the following rule:

(subst) The leaves of an axon cannot be glued to more than one bound dendrite of another neuron. If a leaf of the axon of the neuron A is connected to a bound dendrite of the neuron B, then it has to be the leftmost leaf of the axon of A. Moreover, in this case all other leaves of A which are connected with B have to be connected only via dendrites which are dependent on the mentioned bound dendrite of B, possibly after adding leaves to the axon of A by using (ASSOC).

Substitution is therefore assimilated to connecting neurons.

New (beta*). The rule (beta*) takes the following graphical form. In the next figure appear only the leaf of the neuron A connecting to the \lambda node (the other leaves of the axon of A are not drawn) and only some of the dendrites depending on the bound one, relative to the \lambda node which is figured.

The neuron A may have other dendrites, not figured. According to the definition of the neuron, A together with the \lambda node and adjacent edges forms a bigger neuron. Also figured is another leaf of the axon of the neuron B, which may point to another neuron. Finally, in the RHS, the bound dendrite loses all dependents.

Important remark. Notice that there may be multiple outcomes of (subst) and (beta*)! Indeed, this is due to the fact that connections could be made in several ways, by using all or only part of the dependent dendrites. Because of that, it seems that this version of the calculus is richer than the previous one, but I am not at all sure if this is the case.

Another thing to remark is that the "=" sign in this version of these rules is reflexive and transitive, but not symmetric. Previously the "=" sign was supposed to be symmetric.

(R1)  That is easy, we use the emergent algebra gates:

(R2)    Easy as well:

The rules (ext1), (ext2) take obvious forms.

Therefore, if we think about computation as reduction to a normal form (if any), in this graphical notation with "neurons" computation amounts to re-wiring neurons or changing the wiring inside the soma of some neurons.

Variables disappeared, at the price of introducing FAN-OUT trees.

As concerns the remark previously made, we could obtain a calculus which is clearly equivalent to \lambda \varepsilon by modifying the definition of the neuron in the following way. In order to clearly specify which are the dependent dendrites, we could glue to any bound dendrite a FAN-OUT tree, such that the leaves of this tree connect again with a set of dependent dendrites of the same neuron. In this way, substitution and (beta*) will amount to erasing such a FAN-OUT tree and then performing the moves, as previously explained, but using this time all the dependent dendrites which were connected to the bound dendrite by the erased tree.

The gnomon in the greek theater of vision, I

In the post Theatron as an eye I proposed the Greek Theater, or Theatron (as opposed to the "theater in a box", or Cartesian Theater, see further) as a good model for vision.

Any model of vision should avoid the homunculus fallacy. What looks less understood is that any good model of vision should avoid the scenic space fallacy. The Cartesian Theater argument against the existence of the homunculus is not, by construction, an argument against the scenic space. Or, in the Cartesian Theater, homunculus and scenic space come into existence as a pair. As a conclusion, it seems that there could not be a model of vision which avoids the homunculus but does not avoid the scenic space. This observation is confirmed by facts: there is no good, rigorous model of vision to date, because all proposed models rely on the a priori existence of a scenic space. There is, on the contrary, a great quantity of experimental data and theoretical partial models which show just how complex the problem of vision is. But, essentially, from a mathematician's viewpoint, it is not known how to even formulate the problem of vision.

In the influential paper "The brain a geometry engine" J. Koenderink proposes that (at least a part of) the visual mechanism is doing a kind of massively parallel computation, by using an embodiment of the geometry of jet spaces (the euclidean infinitesimal geometry of a smooth manifold) of the scenic space. Jean Petitot continues along this idea, by proposing a neurogeometry of vision based essentially on the sub-riemannian geometry of those jet spaces. This is an active area of mathematical research, see for example "Anthropomorphic image reconstruction via hypoelliptic diffusion", by Ugo Boscain et al.

Sub-riemannian geometry is one of my favorite mathematical subjects, because it is just a particular model of a metric space with dilations. Such spaces are somehow fundamental for the problem of vision, I think. Why? Because behind them there is a purely relational formalism, called "emergent algebra", which allows one to understand "understanding space" in a purely relational way. Thus I hope emergent algebras could be used in order to formulate the problem of vision as the problem of computing with space, which in turn could be used for getting a good model of vision.

To my surprise, some time ago I found that this very complex subject has a respectable age, starting with Pythagoras and Plato! This is how I came to write this blog, as an effort to disseminate what I progressively understand.

This brings me back to the theater and, finally, to gnomon. I cite from previous wiki link:

Hero defined a gnomon as that which, added to an entity (number or shape), makes a new entity similar to the starting entity.

In the greek theater, a gnomon sits in the center of the orchestra (which is the circular place where things happen in the greek theater, later replaced by the scene in the theater in a box). Why?

Three problems and a disclaimer

In this post I want to summarize the list of problems I am currently thinking about. This is not a list of regular mathematical problems, see the disclaimer on style written at the end of the post.

Here is the list:

1. what is “computing with space“? There is something happening in the brain (of a human or of a fly) which is akin to a computation, but is not a logical computation: vision. I call this “computing with space”. In the head there are a bunch of neurons chirping one to another, that’s all. There is no euclidean geometry, there are no a priori coordinates (or other extensive properties), there are no problems to solve for them neurons, there is  no homunculus and no outer space, only a dynamical network of gates (neurons and their connections). I think that a part of an answer is the idea of emergent algebras (albeit there should be something more than this).  Mathematically, a closely related problem is this: Alice is exploring a unknown space and then sends to Bob enough information so that Bob could “simulate” the space in the lab. See this, or this, or this.

Application: give the smallest hint of a purely relational  model of vision  without using any a priori knowledge of the (euclidean or other) geometry of outer space or any  pre-defined charting of the visual system (don’t give names to neurons, don’t give them “tasks”, they are not engineers).

2. non-commutative Baker-Campbell-Hausdorff formula. From the solution of Hilbert's fifth problem we know that any locally compact topological group without small subgroups can be endowed with the structure of an "infinitesimally commutative" normed group with dilations. This is true because one-parameter subgroups and Gleason metrics are used to solve the problem. The BCH formula then solves another problem: from the infinitesimal structure of a (Lie) group (that is, the vector space structure of the tangent space at the identity and the manifold structure of the Lie group) and from supplementary infinitesimal data (that is, the Lie bracket), construct the group operation.

The problem of the non-commutative BCH is the following: suppose you are in a normed group with dilations. Then construct the group operation from the infinitesimal data (the conical group structure of the tangent space at identity and the dilation structure) and supplementary data (the halfbracket).

The classical BCH formula corresponds to the choice of the dilation structure coming from the manifold structure of the Lie group.
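For reference, the classical series begins with

x y = x + y + \frac{1}{2} [x,y] + \frac{1}{12} [x,[x,y]] - \frac{1}{12} [y,[x,y]] + \dots

(the standard Baker-Campbell-Hausdorff expansion, recalled here for convenience; all higher order terms are iterated brackets).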

In the case of a Carnot group (or a conical group), the non-commutative BCH formula should be trivial (i.e. x y = x \cdot y, the equivalent of xy = x+y in the case of a commutative Lie group, where by convention we neglect all “exp” and “log” in formulae).

3. give a notion of curvature which is meaningful for sub-riemannian spaces. I propose the pair curvdimension–curvature of a metric profile. There is a connection with problem 1: there is a link between the curvature of the metric profile and the "emergent Reidemeister 3 move" explained in section 6 of the computing with space paper. Indeed, at page 36 there is this figure. Yes, R^{x}_{\epsilon \mu \lambda} (u,v) w is a curvature!

Disclaimer on style. I am not a problem solver, in the sense that I don't usually like to find the solution of an already formulated problem. Instead, what I like to do is to understand some phenomenon and prove something about it in the simplest way possible. When thinking about a subject, I like to polish the partial understanding I have by renouncing the use of any "impure" tools, that is, any (mathematical) fact which is foreign to the subject. I know that this is not the usual way of doing the job, but sometimes less is more.

Spacebook: a facebook for space

I follow the work of Mark Changizi on vision. Previously I mentioned one of his early papers on this subject, "Harnessing vision for computation".

One of the applications of computing with space  could be to SHARE THE SPATIAL EXPERIENCE ON THE WEB.

Background. When I was writing the paper on the problem of computing with space, I stumbled upon this article by Mark in Psychology Today

The Problem With the Web and E-Books Is That There’s No Space for Them

The title says a lot. I was intrigued by the following passage

“My personal library serves as extension of my brain. I may have read all my books, but I don’t remember most of the information. What I remember is where in my library my knowledge sits, and I can look it up when I need it. But I can only look it up because my books are geographically arranged in a fixed spatial organization, with visual landmarks. I need to take the integral of an arctangent? Then I need my Table of Integrals book, and that’s in the left bookshelf, upper middle, adjacent to the large, colorful Intro Calculus book.”

So I posted the following comment:  Is your library my library?

“Good point, but you have converted a lot of time into understanding, exploring and using the space of your library. To me the brain-spatial interface of your library is largely incomprehensible. I have to spend time in order to reconstruct it in my head.

Then, your excellent suggestion may give somebody the idea to do a “facebook” for our personal libraries. How to share spatial competences, that is a question!”

In the section 2.7 (“Spacebook”) of the paper on computing with space I mention this as an intriguing application of this type of computing (the name itself was suggested by Mark Changizi after I sent him a first version of the paper).

What more? Again, from browsing Mark Changizi's site, I learned that this problem of the non-spatiality (say) of e-books has in fact measurable effects. Indeed, see this article by Maia Szalavitz

Do E-Books Make It Harder to Remember What You Just Read?

Nice! But in order to build a spacebook we first need to understand the primitives of space (as represented in the human brain) and then how to "port" them by using the web.

Theatron as an eye

I want to understand what "computing with space" might be. By making a parallel with usual computation, there are three ingredients which need to be identified: what are the computing with space equivalents of

1. the universal computing gate (in usual computing this is the transistor)

2. the universal machine (in usual computing this is the Turing machine)

3. what is the universal machine doing by using its arrangement of universal computing gates (in usual computing this is the algorithm).

I think that (3) is (an abstraction of) the activity of map making, or space exploration. The result of this activity is coded by a dilation structure, but I have no idea HOW such a result is achieved. Once obtained, though, a mathematical model of the space is the consequence of a priori assumptions (that we can, in principle, repeat the map making operations indefinitely) which lead to the emergent algebraic and differential structure of the space.

The universal gate (1), I think, is the dilation gate, or the map-territory relation.

Today I want to pave the way to the discovery of the universal machine (2). This is related to my previous posts The Cartesian Theater: philosophy of mind versus aerography and Towards aerography, or how space is shaped to comply with the perceptions of the homunculus.

My take is that the Greek Theater, or Theatron (as opposed to the "theater in a box", or Cartesian Theater), is a good model for a universal machine.

For today, I just want to point to the similarities between the theatron and the eye.

The following picture represents the main parts of the theatron (the ancient Greek meaning of "theatron" is "place of seeing"). In black are written the names of the theatron parts and in red the names of the corresponding parts of the eye, according to the proposed similarity.

Let me proceed with the meaning of these words:

– Analemmata means the pedestal of a sundial (related to analemma and analemmatic sundial; basically a theatron is an analemmatic sundial, with the chorus as the gnomon). I suggest paralleling this with the choroid of the eye.

– Diazomata (diazoma means "belt"), proposed to be similar to the retina.

– Prohedria (front seating) is the privilege of sitting in the first few rows at the bottom of the viewing area. Similar to the fovea (small pit), responsible for sharp central vision.

– Skene (tent), the stage building, meant to HIDE the workings of the actors which are not part of the show, as well as the masks and other materials. When a character dies, it happens behind the skene. Eventually, the skene killed the chorus and became the stage. The eye equivalent of this is the iris.

– Parodos (para – beside, against, and ode – song), the entrance of the chorus. The eye equivalent is the crystalline lens.

– Orchestra, the ancient Greek stage, is the place where the chorus acts, the center of the Greek theater. Here we pass to abstraction: the eye correspondent is the visual field.

Combinatorics versus geometric…

… is like using Roman numerals versus using a positional numeral system, like the Hindu-Arabic numerals we all know very well. And there is evidence that our brain is NOT using combinatorial techniques, but geometric ones; see below.

What is this post about? Well, it is about the problem of using knowledge about topological groups in order to study discrete approximate groups, as Tao proposes in his new course; it is about discrete finitely generated groups with polynomial growth which, as Gromov taught us, become nilpotent Lie groups when seen from far away; and so on. Only that there is a long way towards these subjects, so please bear with me a little longer.

This is part of a larger project to try to understand approximate groups, as well as normed groups with dilations, in a more geometric way. One point of interest is understanding the solution to Hilbert's fifth problem from a more general perspective, and this passes by NOT using combinatorial techniques from the start, even if the solution given by Gleason-Montgomery-Zippin is one of the most beautiful mathematical gems.

What is combinatorial about this solution? Well, it reduces (in a brilliant way) the problem to counting, by using objects which are readily at hand in any topological group, namely the one-parameter subgroups. There is nothing wrong with this, only that, from this point of view, Gromov's theorem on groups with polynomial growth appears as magical. Where is this nilpotent structure coming from?

As written in a previous post, Hilbert's fifth problem without one-parameter subgroups, Gromov's theorem is saying a profound geometrical thing about a finitely generated group with polynomial growth: seen from far away, this group is self-similar, that is a conical group, or a contractible group w.r.t. any of its dilations. That is all! The rest is just a result of Siebert. This structure is deeply hidden in the proof, and one of my purposes is to understand where it is coming from. A way of NOT understanding this is to use huge chunks of mathematical candy in order to make this structure appear by magic.

I cannot claim that I understand this, or that I have a complete solution; instead, for this post, I looked for an analogy, and I think I have found one.

It is the one from the first lines of the post.

Indeed, what is wrong, by analogy, with the Roman numeral system? Nothing, actually: we have generators, like I, V, X, L, C, D, M, and relations, like IIII = IV, and so on (yes, they are not generators and relations exactly in the group sense). The problems appear when we want to do complex operations, like the addition of large numbers. Then we have to be really clever and use very intelligent and profound combinatorial arguments in order to efficiently manage all the cancellations and whatnot coming from the massive use of relations. Relations live at very small scale; we have to bridge the way towards large scales, therefore we have to approximate the errors by counting in different ways and to be very clever about these ways.

Another solution, historically preferred, was to use a positional numeral system, which is more geometric, because it exploits a large scale property of the natural numbers: their collection is (approximately) self-similar. Indeed, take (as another kind of generators, again not in the group sense) a small set, like B = {0, 1, 2, …, 8, 9}, and count in base 10, which goes roughly like this (a sketch in code follows the description): take a (big, preferably) natural number X and do the following

– initialize i = 1,

– find the smallest natural power a_{i} of 10 such that 10^{-a_{i}} X has a norm smaller than 10, then pick the element k_{i} of B which minimizes the distance to 10^{-a_{i}} X,

– subtract (from the right or from the left, it does not matter here because addition of natural numbers is commutative) 10^{a_{i}} k_{i} from X, and rename the result X,

– repeat until X \in B and finish the algorithm by taking X as the last digit.

In the end (remember, I said "roughly") represent X as a string which codes the sequence of pairs (a_{i}, k_{i}).
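Here is a minimal Python sketch of this (almost correct) algorithm. One simplification is mine: the "element of B nearest to 10^{-a_{i}} X" is replaced by plain truncation, which avoids the bookkeeping needed when the nearest digit rounds up.

```python
def positional_pairs(X, base=10):
    """Greedy extraction of the pairs (a_i, k_i): find the scale a_i at
    which X looks smaller than the base, pick the digit k_i at that
    scale, subtract, repeat."""
    pairs = []
    while X >= base:
        a = 0
        while X // base ** (a + 1) > 0:  # smallest a with base**(-a) * X < base
            a += 1
        k = X // base ** a               # digit at scale a (truncation, not rounding)
        pairs.append((a, k))
        X -= k * base ** a
    pairs.append((0, X))                 # the last digit
    return pairs

print(positional_pairs(30478))  # [(4, 3), (2, 4), (1, 7), (0, 8)]
```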

The advantage of this representation of natural numbers is that we can do, with controlled precision, the addition of big numbers, approximately. Indeed, take two very big numbers X, Y and take another number, like 10. Then for any natural n define

(X+Y) approx(n) = 10^{n} ( [X]_{n} + [Y]_{n} )

where [X]_{n} is the number represented by the string of X with the last n digits discarded (that is, [X]_{n} = \lfloor 10^{-n} X \rfloor).
If n is small compared with the number of digits of X and Y, then (X+Y) approx(n) is close to the true X+Y, but the computational effort for calculating (X+Y) approx(n) is much smaller than the one for calculating X+Y.
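A sketch of this truncated addition (my illustration; the error is smaller than 2 \cdot 10^{n}):

```python
def approx_add(X, Y, n, base=10):
    """Truncated addition: keep only the leading digits of X and Y,
    add them, then rescale. The result differs from X + Y by less
    than 2 * base**n."""
    scale = base ** n
    return scale * (X // scale + Y // scale)

X, Y = 30478123, 99911222
print(approx_add(X, Y, 4))  # 130380000, close to X + Y = 130389345
```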

Once we have this large scale self-similarity, we may exploit it by using the more geometrical positional numeral system instead of the Roman numeral system; that is my analogy. Notice that in this (almost correct) algorithm 10^{a} X is not understood as X + X + \dots + X, repeated 10^{a} times.

Let me now explain why the positional numeral system is more geometric, by giving a neuroscience argument, besides what I wrote in this previous post: “How not to get bored, by reading Gromov and Tao” (mind the comma!).

I reproduce from the wiki page on "Subitizing and counting":

Subitizing, coined in 1949 by E.L. Kaufman et al.[1] refers to the rapid, accurate, and confident judgments of number performed for small numbers of items. The term is derived from the Latin adjective subitus (meaning “sudden”) and captures a feeling of immediately knowing how many items lie within the visual scene, when the number of items present falls within the subitizing range.[1] Number judgments for larger set-sizes were referred to either as counting or estimating, depending on the number of elements present within the display, and the time given to observers in which to respond (i.e., estimation occurs if insufficient time is available for observers to accurately count all the items present).

The accuracy, speed, and confidence with which observers make judgments of the number of items are critically dependent on the number of elements to be enumerated. Judgments made for displays composed of around one to four items are rapid,[2] accurate[3] and confident.[4] However, as the number of items to be enumerated increases beyond this amount, judgments are made with decreasing accuracy and confidence.[1] In addition, response times rise in a dramatic fashion, with an extra 250–350 ms added for each additional item within the display beyond about four.[5]

This is a brain competence which is spatial (geometrical) in nature, as evidenced by Simultanagnosia:

Clinical evidence supporting the view that subitizing and counting may involve functionally and anatomically distinct brain areas comes from patients with simultanagnosia, one of the key components of Balint’s syndrome.[14] Patients with this disorder suffer from an inability to perceive visual scenes properly, being unable to localize objects in space, either by looking at the objects, pointing to them, or by verbally reporting their position.[14] Despite these dramatic symptoms, such patients are able to correctly recognize individual objects.[15] Crucially, people with simultanagnosia are unable to enumerate objects outside the subitizing range, either failing to count certain objects, or alternatively counting the same object several times.[16]

From the wiki description of simultanagnosia:

Simultanagnosia is a rare neurological disorder characterized by the inability of an individual to perceive more than a single object at a time. It is one of three major components of Bálint’s syndrome, an uncommon and incompletely understood variety of severe neuropsychological impairments involving space representation (visuospatial processing). The term “simultanagnosia” was first coined in 1924 by Wolpert to describe a condition where the affected individual could see individual details of a complex scene but failed to grasp the overall meaning of the image.[1]

I rest my case.

Unlimited detail: news

There is news regarding the "Unlimited detail" technology developed by Euclideon.

To be clear, I don’t think it’s a scam. It may be related to what I am describing in the paper Maps of metric spaces, which has the abstract:

This is a pedagogical introduction covering maps of metric spaces, Gromov-Hausdorff distance and its “physical” meaning, and dilation structures as a convenient simplification of an exhaustive database of maps of a metric space into another. See arXiv:1103.6007 for the context.

This is pure speculation, but it looks to me that it all has to do with manipulations of maps in the space of screen pixels, along the lines of using scale stable and viewpoint stable zoom sequences (definitions 4.1-4.4) and (the groupoid of) transformations between these.

But how exactly? I would really much like to know!

Some time ago, after seeing the demos (check the link to the wiki page of unlimited detail), I tried to learn more about the mathematical details, but without success (which is understandable).

Now Bruce Dell released new demos and an interview!

Don't be fooled by the fractal look of the territory! It probably has more to do with the fact that in order to use Unlimited Detail one first needs a territory to render, so, in my opinion, the guys generated one by using some fractal tricks.


UPDATE 06.09.2012: After a bit more than a year, Euclideon has morphed into Euclideon:Geoverse. It looks less and less like a scam; what do you think, Markus Persson?

Still looking forward to learning how exactly Unlimited Detail works, though.

Towards aerography, or how space is shaped to comply with the perceptions of the homunculus

In the previous post

The Cartesian Theater: philosophy of mind versus aerography

I explained why the Cartesian Theater does not describe well the appearance of the homunculus.

A “Cartesian theater”, Dennett proposes, is any theory about what happens in one’s mind which can be reduced to the model of a “tiny theater in the brain where a homunculus … performs the task of observing all the sensory data projected on a screen at a particular instant, making the decisions and sending out commands.”

This leads to infinite regression, therefore any such theory is flawed. One has to avoid the appearance of the homunculus in one’s theory, as a consequence.

The homunculus itself may appear from apparently innocuous assumptions, such as the introduction of any limen (or threshold), like supposing that (from Consciousness Explained (1991), p. 107)

“…there is a crucial finish line or boundary somewhere in the brain, marking a place where the order of arrival equals the order of “presentation” in experience because what happens there is what you are conscious of.”

As a consequence, such assumptions are flawed. There is no limen, no boundary inside the brain (strangely, the assumption of a boundary which separates the individual from the environment disturbs nobody, excepting Varela, Maturana, or second order cybernetics).

In the previous post I argued, based on my understanding of the excellent paper of Kenneth R Olwig

“All that is landscape is melted into air: the `aerography’ of ethereal space”, Environment and Planning D: Society and Space 2011, volume 29, pages 519 – 532,

that the "Cartesian theater" model is misleading because it neglects to notice that what happens on stage is as artificial as the homunculus spectator, while, at the same time, the theater itself (a theater in a box) is designed for perception.

Therefore, while everybody (?) accepts that there is no homunculus in the brain, at the same time nobody seems to be bothered that perception data are always modeled as if they come from the stage of the Cartesian theater.

For example, few would disagree that we see a 3-dimensional, euclidean world. But this is obviously not what we see, and the proof is that we can easily be tricked by stereoscopy. These are the visual data (together with other, more subtle cues: auditory, posture and whatnot) which the brain uses to reconstruct the world as seen by a homunculus, a homunculus created by our illusory image that there is a boundary between us (me, you) and the environment.

You would say: nobody in their right mind denies that the world is 3d, at least our familiar everyday world, not quantum or black holes or other inventions of physicists. I don't deny it; I just notice, as in this previous post, that space is perceived the way it is based on prior knowledge, that is, because prior "controlled hallucinations" consistently led to coherent interpretations.

The idea is that in fact there are two things to avoid: one is the homunculus and the other one is the scenic space.

The "scenic space" is itself a model of the real space (does this exist?) and it leads itself to infinite regression. We "learn space" by relating to it and modeling it in our brains. I suppose that everything (inside and outside the brain) complies with the same physical laws, and that the rational explanation for the success of the "3d scenic space" (which is consistent with our educated perception, but also with physical phenomena in our world, at least at human scale and range) should come from the understanding that brain processes are as physical as a falling apple and as mathematical as perspective is.

Topographica, the neural map simulator

The following speaks for itself:

Topographica neural map simulator

“Topographica is a software package for computational modeling of neural maps, developed by the Institute for Adaptive and Neural Computation at the University of Edinburgh and the Neural Networks Research Group at the University of Texas at Austin. The project is funded by the NIMH Human Brain Project under grant 1R01-MH66991. The goal is to help researchers understand brain function at the level of the topographic maps that make up sensory and motor systems.”

From the Introduction to the user manual:

"The cerebral cortex of mammals primarily consists of a set of brain areas organized as topographic maps (Kaas et al. 1997; Van Essen et al. 2001). These maps contain systematic two-dimensional representations of features relevant to sensory, motor, and/or associative processing, such as retinal position, sound frequency, line orientation, or sensory or motor motion direction (Blasdel 1992; Merzenich et al. 1975; Weliky et al. 1996). Understanding the development and function of topographic maps is crucial for understanding brain function, and will require integrating large-scale experimental imaging results with single-unit studies of individual neurons and their connections."

One of the Tutorials is about the Kohonen model of self-organizing maps, mentioned in the post  Maps in the brain: fact and explanations.

Entering “chora”, the infinitesimal place

There is a whole discussion around the key phrases “The map is not the territory” and “The map is the territory”. From the wiki entry on the map-territory relation, we learn that Korzybski‘s dictum “the map is not the territory” means that:

A) A map may have a structure similar or dissimilar to the structure of the territory,

B) A map is not the territory.

Bateson, in “Form, Substance and Difference” has a different take on this: he starts by explaining the pattern-substance dichotomy

Let us go back to the original statement for which Korzybski is most famous—the statement that the map is not the territory. This statement came out of a very wide range of philosophic thinking, going back to Greece, and wriggling through the history of European thought over the last 2000 years. In this history, there has been a sort of rough dichotomy and often deep controversy. There has been violent enmity and bloodshed. It all starts, I suppose, with the Pythagoreans versus their predecessors, and the argument took the shape of “Do you ask what it’s made of—earth, fire, water, etc.?” Or do you ask, “What is its pattern?” Pythagoras stood for inquiry into pattern rather than inquiry into substance.1 That controversy has gone through the ages, and the Pythagorean half of it has, until recently, been on the whole the submerged half.

Then he states his point of view:

We say the map is different from the territory. But what is the territory? […] What is on the paper map is a representation of what was in the retinal representation of the man who made the map–and as you push the question back, what you find is an infinite regress, an infinite series of maps. The territory never gets in at all.

Always the process of representation will filter it out so that the mental world is only maps of maps of maps, ad infinitum.

At this point Bateson puts a very interesting footnote:

Or we may spell the matter out and say that at every step, as a difference is transformed and propagated along its pathways, the embodiment of the difference before the step is a “territory” of which the embodiment after the step is a “map.” The map-territory relation obtains at every step.

Inspired by Bateson, I want to explore from the mathematical side the point of view that there is no difference between the map and the territory, but instead the transformation of one into another can be understood by using tangle diagrams.

Let us imagine that the exploration of the territory provides us with an atlas, a collection of maps, mathematically understood as a family of two operations (an “emergent algebra”). We want to organize this spatial information in a graphical form which complies with Bateson’s footnote: map and territory have only local meaning in the graphical representation, being only the left-hand-side (and r-h-s respectively) of the “making map” relation.

Look at the following figure:

In the figure on the left, the "v" which decorates an arc represents a point in the "territory", that is the l.h.s. of the relation; the "u" represents a "pixel in the map", that is the r.h.s. of the relation. The relation itself is represented by a crossing decorated by an epsilon, the "scale" of the map.

The opposite crossing, in the figure on the right, is the inverse relation.

Imagine now a complex diagram, with lots of crossings, decorated by various scale parameters, and segments decorated with points from a space X which is seen both as territory (to explore) and map (of it).

In such a diagram the convention map-territory can be only local, around each crossing.

There is, though, a diagram which could unambiguously serve as a symbol for "the place (near) the point x, at scale epsilon":

In this diagram, all crossings which are not decorated have “epsilon” as a decoration, but this decoration can be unambiguously placed near the decoration “x” of the closed arc. Such a diagram will bear the name “infinitesimal place (or chora) x at scale epsilon”.
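To make the crossing relation concrete, here is a minimal sketch in the model case X = R^{n}, where the dilation of coefficient epsilon based at x is delta^{x}_{\epsilon}(u) = x + \epsilon (u - x). A crossing decorated by epsilon sends a territory point to its map pixel; the opposite crossing is the inverse dilation. (The euclidean model is only for illustration; the tangle formalism does not presuppose it.)

```python
import numpy as np

def delta(x, eps, u):
    """Dilation of coefficient eps based at x: the 'make map' crossing."""
    return x + eps * (u - x)

def delta_inv(x, eps, u):
    """Inverse dilation: the opposite crossing, back from map to territory."""
    return x + (u - x) / eps

x = np.array([0.0, 0.0])    # base point of the map-making relation
v = np.array([1.0, 2.0])    # a point in the "territory"
eps = 0.1                   # the scale of the map
u = delta(x, eps, v)        # the corresponding "pixel in the map"
assert np.allclose(delta_inv(x, eps, u), v)   # territory recovered exactly
```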

The structure of visual space

Mark Changizi has an interesting post, "The Visual Nerd in You Understands Curved Space", where he explains that spherical geometry is relevant for visual perception.

At some point he writes a paragraph which triggered my post:

Your visual field conforms to an elliptical geometry!

(The perception I am referring to is your perception of the projection, not your perception of the objective properties. That is, you will also perceive the ceiling to objectively, or distally, be a rectangle, each angle having 90 degrees. Your perception of the objective properties of the ceiling is Euclidean.)

Is it true that our visual perception senses Euclidean space?

Look at this very interesting project

The structure of optical space under free viewing conditions

and especially at this paper:

The structure of visual spaces by J.J. Koenderink, A.J. van Doorn, Journal of mathematical imaging and vision, Volume: 31, Issue: 2-3 (2008), pp. 171-187

In particular, one of the very nice things this group is doing is to experimentally verify the perception of true facts in projective geometry (like the Pappus theorem).

From the abstract of the paper (boldface mine):

The “visual space” of an optical observer situated at a single, fixed viewpoint is necessarily very ambiguous. Although the structure of the “visual field” (the lateral dimensions, i.e., the “image”) is well defined, the “depth” dimension has to be inferred from the image on the basis of “monocular depth cues” such as occlusion, shading, etc. Such cues are in no way “given”, but are guesses on the basis of prior knowledge about the generic structure of the world and the laws of optics. Thus such a guess is like a hallucination that is used to tentatively interpret image structures as depth cues. The guesses are successful if they lead to a coherent interpretation. Such “controlled hallucination” (in psychological terminology) is similar to the “analysis by synthesis” of computer vision.

So, the space is perceived to be euclidean based on prior knowledge, that is because prior controlled hallucinations led consistently to coherent interpretations.

Maps in the brain: fact and explanations

From wikipedia

Retinotopy describes the spatial organization of the neuronal responses to visual stimuli. In many locations within the brain, adjacent neurons have receptive fields that include slightly different, but overlapping portions of the visual field. The position of the center of these receptive fields forms an orderly sampling mosaic that covers a portion of the visual field. Because of this orderly arrangement, which emerges from the spatial specificity of connections between neurons in different parts of the visual system, cells in each structure can be seen as forming a map of the visual field (also called a retinotopic map, or a visuotopic map).

See also tonotopy for sounds and the auditory system.

The existence of retinotopic maps is a fact; the problem is to explain how they appear and how they function without falling into the homunculus fallacy, see my previous post.

One of the explanations of the appearance of these maps is given by Teuvo Kohonen.

Browse this paper (for precise statements), The Self-Organizing Map, or get a blurry impression from this wiki page. The last paragraph of section B. Brain Maps reads:

It thus seems as if the internal representations of information in the brain are generally organized spatially.

Here are some quotes from the same section, which should raise the attention of a mathematician to the sky:

Especially in higher animals, the various cortices in the cell mass seem to contain many kinds of “map” […] The field of vision is mapped “quasiconformally” onto the primary visual cortex. […] in the visual areas, there are line orientation and color maps. […] in the auditory cortex there are the so-called tonotopic maps, which represent pitches of tones in terms of the cortical distance […] at the higher levels the maps are usually unordered, or at most the order is a kind of ultrametric topological order that is not easy interpretable.

Typical for self-organizing maps is that they use (see wiki page) “a neighborhood function to preserve the topological properties of the input space”.

From the connectionist viewpoint, this neighbourhood function is implemented by lateral connections between neurons.
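For concreteness, here is a minimal sketch of a Kohonen self-organizing map. The grid size, the linear decay schedules and the Gaussian neighbourhood are my illustrative choices, a toy version of the algorithm in the paper above; the Gaussian factor nb plays the role of the lateral connections.

```python
import numpy as np

def train_som(data, grid=(10, 10), iters=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Minimal Kohonen SOM: a 2D grid of weight vectors self-organizes so
    that neighbouring nodes respond to neighbouring inputs."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    gy, gx = np.mgrid[0:h, 0:w]            # grid coordinates of the nodes
    for t in range(iters):
        x = data[rng.integers(len(data))]  # random input sample
        # best matching unit: node whose weight vector is closest to x
        d = np.linalg.norm(weights - x, axis=2)
        by, bx = np.unravel_index(np.argmin(d), d.shape)
        # decaying learning rate and neighbourhood radius
        lr = lr0 * (1 - t / iters)
        sigma = sigma0 * (1 - t / iters) + 1e-3
        # Gaussian neighbourhood function on the grid ("lateral connections")
        nb = np.exp(-((gy - by) ** 2 + (gx - bx) ** 2) / (2 * sigma ** 2))
        weights += lr * nb[:, :, None] * (x - weights)
    return weights

# toy usage: organize random 3d points (e.g. colors) on a 2d map
som = train_som(np.random.default_rng(1).random((500, 3)))
```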

For more details see for example Maps in the Brain: What Can We Learn from Them? by Dmitri B. Chklovskii and Alexei A. Koulakov. Annual Review of Neuroscience 27: 369-392 (2004).

Also browse Sperry versus Hebb: Topographic mapping in Isl2/EphA3 mutant mice by Dmitri Tsigankov and Alexei A. Koulakov.

Two comments:

1. The use of a neighbourhood function is much more than just a way of preserving topological information. I tentatively propose that such neighbourhood functions appear out of the need to organize spatial information, as explained in the pedagogical paper from the post Maps of metric spaces.

2. Reasoning only on discretizations (hexagonal or otherwise) of the plane is plain wrong, but this is a problem encountered in many, many other places. It is wrong because it introduces the (euclidean) space through the back door (well, this and happily using an L^2 space).

The Cartesian Theater: philosophy of mind versus aerography

It looks to me like there is something wrong with the "Cartesian Theater" term.

Short presentation of the Cartesian Theater, according to wikipedia (see previous link):

The Cartesian theater is a derisive term coined by philosopher Daniel Dennett to pointedly refer to a defining aspect of what he calls Cartesian materialism, which he considers to be the often unacknowledged remnants of Cartesian dualism in modern materialistic theories of the mind.

Descartes originally claimed that consciousness requires an immaterial soul, which interacts with the body via the pineal gland of the brain. Dennett says that, when the dualism is removed, what remains of Descartes’ original model amounts to imagining a tiny theater in the brain where a homunculus (small person), now physical, performs the task of observing all the sensory data projected on a screen at a particular instant, making the decisions and sending out commands.

Needless to say, any theory of mind which can be reduced to the Cartesian Theater is wrong because it leads to the homunculus fallacy: the homunculus has a smaller homunculus inside which is observing the sensory data, which has a smaller homunculus inside which …

This homunculus problem is very important in vision. More about this in a later post.

According to Dennett, the problem with the Cartesian theater point of view is that it introduces an artificial boundary (from Consciousness Explained (1991), p. 107)

“…there is a crucial finish line or boundary somewhere in the brain, marking a place where the order of arrival equals the order of “presentation” in experience because what happens there is what you are conscious of.”

As far as I understand, this boundary creates a duality: on one side is the homunculus, on the other side is the stage where the sensory data are presented. In particular this boundary acts as a distinction, like in the calculus of indications of Spencer-Brown's Laws of Form.

This distinction creates the homunculus, hence the homunculus fallacy. Neat!

Why do I think there is something wrong with this line of thought? Because of the "theater" term. Let me explain.

The following is based on the article of Kenneth R Olwig

“All that is landscape is melted into air: the `aerography’ of ethereal space”, Environment and Planning D: Society and Space 2011, volume 29, pages 519 – 532.

but keep in mind that what is written below represents my interpretation of some parts of the article, according to my understanding, and not the author's point of view.

There has been a revolution in theater, started by

“…the early-17th-century court masques (a predecessor of opera) produced by the author Ben Jonson (the leading author of the day after Shakespeare) together with the pioneering scenographer and architect Inigo Jones.
The first of these masques, the 1605 Masque of Blackness (henceforth Blackness), has a preface by Jonson containing an early use of landscape to mean scenery and a very early identification of landscape with nature (Olwig, 2002, page 80), and Jones's scenography is thought to represent the first theatrical use of linear perspective in Britain (Kernodle, 1944, page 212; Orgel, 1975)." (p. 521)

So? Look!

"From the time of the ancient Greeks, theater had largely taken place outside in plazas and market places, where people could circle around, or, as with the ancient Greco-Roman theater or Shakespeare's Globe, in an open roofed arena. Jones's masques, by contrast, were largely performed inside a fully enclosed rectangular space, giving him control over both the linear-focused geometrical perspectival organization of the performance space and the aerial perspective engendered by the lighting (Gurr, 1992; Orrell, 1985)." (p. 522, my emphasis)

“Jonson’s landscape image is both enframed by, and expressive of, the force of the lines of perspective that shoot forth from “the eye” – notably the eye of the head of state who was positioned centrally for the best perspectival gaze.” (p. 523, my emphasis)

“Whereas theater from the time of the ancient Greeks to Shakespeare’s Globe was performed in settings where the actor’s shadow could be cast by the light of the sun, Jones’s theater created an interiorized landscape in which the use of light and the structuring of space created an illusion of three dimensional space that shot from the black hole of the individual’s pupil penetrating through to a point ending ultimately in ethereal cosmic infinity. It was this space that, as has been seen, and to use Eddington’s words, has the effect of “something like a turning inside out of our familiar picture of the world” (Eddington, 1935, page 40). It was this form of theater that went on to become the traditional `theater in a box’ viewed as a separate imagined world through a proscenium arch.” (p. 526, my emphasis)

I am coming to the last part of my argument: Dennett's Cartesian Theater is a "theater in a box". In this type of theater there is a boundary,

“… scenic space separated by a limen (or threshold) from the space of the spectators – today’s `traditional’ performance space [on liminality see Turner (1974)]” (p. 522)

a distinction, as in Dennett's argument. We may also identify the homunculus side of the distinction with the head of state.

But this is not all.

Compared with the ancient Greek theater, the "theater in a box" takes into account the role of the spectator as the one who perceives what is played on stage.

Secondly, the scenic space is not “what happens there”, as Dennett writes, but a construction already, a controlled space, a map of the territory and not the territory itself.

Conclusion: in my view (contradict me please!) the existence of the distinction (limen) in the "Cartesian theater", which creates the homunculus problem, is superficial. More important is the fact that the "Cartesian theater", as a "theater in a box", is already a representation of perception, having on one side of the limen a homunculus and on the other side a scenic space which is not the "real space" (as, for example, the collection of electric sparks sent by the sensory organs to the brain) but instead is as artificial as the homunculus, being a space created and controlled by the scenographer.

Litmus test: repeat the reasoning of Dennett after replacing the "theater in a box" preconception of the "theater" with the older theater from the time of the ancient Greeks. Can you do it?

On the beautiful idea of “aerography”, later.

Koenderink and Changizi

Jan Koenderink is a leading researcher in vision. He proposed the concept of "scale-space representation" in relation to the understanding of how the front-end visual system works.

His paper “The brain a geometry engine” starts with:

According to Kant, spacetime is a form of the mind. If so, the brain must be a geometry engine. This idea is taken seriously, and consequently the implementation of space and time in terms of machines is considered. This enables one to conceive of spacetime as really "embodied."

Later he writes:

There may be a point in holding that many of the better-known brain processes are most easily understood in terms of differential geometrical calculations running on massively parallel processor arrays whose nodes can be understood quite directly in terms of multilinear operators (vectors, tensors, etc).
In this view brain processes in fact are space.

This is a very interesting idea! As far as I understand, Koenderink is saying that somehow brain processes involved in vision and (external) space are similar!

In my opinion this is something to explore. However, my take is that this superb idea is clouded by his reliance on the linear algebra and differential calculus of the exterior euclidean space (see "vectors, tensors, etc", as well as his derivation of the gaussian filter from invariance with respect to the same euclidean structure). If said brain processes are space, and if those brain processes are a kind of computation (in a sense to be explained later), then space should appear as the result of a computation in the front-end visual system. No euclidean a priori!
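To fix the idea of a "scale-space representation", here is a minimal euclidean sketch, precisely the kind of a priori construction I am objecting to, but useful as a reference point (the scipy call and the toy image are my choices):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def scale_space(image, scales=(1.0, 2.0, 4.0, 8.0)):
    """Gaussian scale-space: the same image observed at a range of inner
    scales, each level a convolution with a Gaussian of width s."""
    return {s: gaussian_filter(image, sigma=s) for s in scales}

# toy "retinal input": a random contrast pattern
image = np.random.default_rng(0).random((64, 64))
levels = scale_space(image)   # one blurred copy per scale
```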

Are those brain processes a kind of computation? The answer depends on what computation means. Anyway, nobody doubts that logical boolean computations are orthodox computations.

See then the following paper by Mark Changizi, "Harnessing vision for computation", or check this Wired post:

"Scientists Build Visual Circuits to Harness your Brain's GPU"

The abstract of the paper is:

Might it be possible to harness the visual system to carry out artificial computations, somewhat akin to how DNA has been harnessed to carry out computation? I provide the beginnings of a research programme attempting to do this. In particular, new techniques are described for building 'visual circuits' (or 'visual software') using wire, NOT, OR, and AND gates in a visual modality such that our visual system acts as 'visual hardware' computing the circuit, and generating a resultant perception which is the output.

My conclusion: this is experimental proof that at least some brain processes related to vision can do something which simulates logical computation.
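To see what is being simulated: the visual stimuli encode ordinary boolean circuits, wires carrying truth values through NOT, OR and AND gates. The encoding of gates as visual patterns is Changizi's; the abstract circuit below is merely my illustration of what the "visual hardware" is asked to evaluate.

```python
# An ordinary boolean circuit of the kind Changizi's visual circuits encode.
def NOT(a): return not a
def OR(a, b): return a or b
def AND(a, b): return a and b

def circuit(x, y, z):
    # example circuit: (x OR y) AND (NOT z)
    return AND(OR(x, y), NOT(z))

print(circuit(True, False, False))  # True
```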

Computing with space

This is the first in a series of postings concerning computing with space. I shall try to give a gentle introduction to – and later a discussion around – the ideas from the paper

Computing with space: a tangle formalism for chora and difference

We shall talk about:

– mathematics of metric spaces with dilations

– Bateson's viewpoint that the map is the territory, as opposed to Korzybski's dictum "the map is not the territory",

– Plato's Timaeus 48e-53c, where he introduces the concept of "chora", which means "space" or "place",

– research in the neuroscience of vision, like Jan Koenderink's paper "The brain a geometry engine",

and many others.

Older papers of mine on this subject: arXiv:1009.5028, "What is a space? Computations in emergent algebras and the front end visual system", and arXiv:1007.2362, "Introduction to metric spaces with dilations".