# Example: Gromov-Hausdorff distances and the Heisenberg group, PART 1

This post continues the previous one “Quantum physics and the Gromov-Hausdorff distance“.

Let me take an example. We are in the following Heisenberg group (this is really a physicist Heisenberg group): the semidirect product $G = \mathbb{R}^{2} \times S^{1}$. Elements of the group have the form

$X = (x, e^{2\pi iz})$ with $x \in \mathbb{R}^{2}$ and $z \in \mathbb{R} / \mathbb{Z}$

(think that $z \in [0,1]$ with the identification $o = 1$).
The group operation is given by:

$X \cdot Y = (x,e^{2\pi i z}) \cdot (y, e^{2 \pi i u}) = (x+y, e^{2 \pi i (u+z+ \frac{1}{2} \omega(x,y))})$

where $\omega: \mathbb{R}^{2} \times \mathbb{R}^{2} \rightarrow \mathbb{R}$ is the “symplectic form” (or area form)

$\omega(x,y) = x_{1} y_{2} - x_{2} y_{1}$

Remark 1. The pair $(\mathbb{R}^{2}, \omega)$ is a symplectic (linear) space. As we well know, the Hamiltonian mechanics lives in such spaces, therefore we may think about $(\mathbb{R}^{2}, \omega)$ as being a “classical”, or “large scale” (see further the justification, in PART 2) phase space of a mechanical system with one degree of freedom.

The group $G$ is generated by the subset $D = \mathbb{R}^{2} \times \left\{ 1 \right\}$, more precisely any element of $G$ can be expressed as a product of four elements of $D$ (in a non-unique way).

What is the geometry of $G$, seen as generated by $D$? In order to understand this, we may put an euclidean norm on $D$ (identified with $\mathbb{R}^{2}$):

$\| (x, 1) \| = \| x\|$, where $\|x\|^{2} = x_{1}^{2} + x_{2}^{2}$ for example.

Then we define “horizontal strings” and their “length”: a string $w = X_{1} ... X_{n}$ of elements of $G$ is horizontal if for any two successive elements of the string, say $X_{i}, X_{i+1}$ we have

$X_{i}^{-1} \cdot X_{i+1} \in D$, where $X^{-1}$ denotes the inverse of $X$ with respect to the group operation. Also, we have to ask that $X_{1} \in D$.

The length of the horizontal string $w = X_{1} ... X_{n}$ is defined as:

$l(w) = \|X_{1}\| + \| X_{1}^{-1} \cdot X_{2}\| + .... + \|X_{n-1}^{-1} \cdot X_{n}\|$. The source of the string $w$ is the neutral element $s(w) = E = (0,1)$ and the target of the string is $t(w) = X_{1}\cdot ... \cdot X_{n}$.

OK, then let us define the “group norm” of an element of $G$, which is an extension of the norm defined on $D$. A formula for this would be:

$\| X\| = \, inf \left\{ l(w) \mbox{ : } t(w) = X \right\}$.

Small technicality: it is not clear to me if this definition is really good as it is, but we may improve it by the following procedure coming from the definition of the Hausdorff measure. Let us introduce the “finesse” of a horizontal string, given by

$fin(w) = \max \left\{ \|X_{1}\| , \| X_{1}^{-1} \cdot X_{2}\| , ... , \|X_{n-1}^{-1} \cdot X_{n}\| \right\}$

and then define, for any $\varepsilon > 0$, the quantity:

$\| X\|_{\varepsilon} = \, inf \left\{ l(w) \mbox{ : } t(w) = X \mbox{ and } fin(w) < \varepsilon \right\}$.

The correct definition of the group norm is then

$\| X\| = \, sup \left\{\| X\|_{\varepsilon} \mbox{ : } \varepsilon > 0 \right\}$.

With words, that means: for a given “scale” $\varepsilon > 0$, take discrete paths from $E$ to $X$, made by “small” (norm smaller than $\varepsilon$) horizontal increments, and then take the infimum of the length of such curves. You get $\| X\|_{\varepsilon}$. Go with $\varepsilon$ to $o$ and get the norm $\| X\|_{\varepsilon}$.

Up to some normalization, the bigger is the norm of an element of $G$, the bigger is the infimal length of a horizontal curve which expresses it, therefore the group norm gives a quantitative estimate concerning how the group element is generated.

In disguise, this norm is nothing but a sub-riemannian distance!

# Quantum physics and the Gromov-Hausdorff distance: more than discrete or continuous

In face of the question “is reality digital or analog?”, my first reaction was “how can one be so unimaginative?”. Actually, I know that there is at least another possibility, thanks to some mathematical results, like Pansu’ Rademacher theorem for Carnot groups and to the Gromov-Hausdorff distance.
I saw this question announced as the subject of a FQXI essay contest and I decided to give a try to explain a bit what I have in mind. It was interesting to do this. I think it was also a bit disconcerting, because it feels a lot like social interacting during the participation in a MMORPG.

More than discrete or continous: a bird’s view

The idea is that any experiment is like making a map of a part of the reality (say at the quantum scale, for example) into another (the results, as we read them in the lab). Estimating distances are proposed as a good analogy of a measurement. Or, it may be possible that there are a priori limits of the precision (accuracy, etc) of the map (any map obtained by a exploiting a physical process) due to the fact that at different scales the space is different, like metric spaces are.

Up to some normalization, the Planck length may be just (proportional to) the Gromov-Hausdorff distance between the (unit ball in ) the Heisenberg group (as a Carnot group) and the euclidean phase space of the same system.

Reality (whatever that means, say the “real” phase space of a system) may be discrete, continuous, both or none, but at different scales, up to a (small) error, the Heisenberg group or the euclidean space, respectively, may be close to it (to the small part of reality explored). Then, as a consequence of Pansu’ Rademacher theorem, it follows that one cannot make a perfect (read “bi-lipschitz”) map of a part of reality into the other, but instead, there is a lower limit of the accuracy of any such map, quantitatively proportional with a Gromov-Hausdorff distance.

In order to explain this in more detail, I took the view that anyway the differential calculus (which is the mathematical base of physics) is an abstraction of the activity of map-making, or chorography.

The “bird’s view” is a play with a quote from Plato, I let you discover it, at the end of the paper.

# Drawings

This week-end post is light, dedicated to painting. However, for me painting is a serious matter, not at all a hobby. When I was very young, I struggled to decide what to do when I shall be a grown-up: physics or math? Painting came to me out of the blue (as most of good things in my life) when I was 10. Later I entertained the idea to become a painter instead of a mathematician, but I have been told that I shall live a life of poverty, most likely. I was impressed by this argument and became a researcher, so now I am very rich indeed.

Some day I shall be a painter. As doing this requires 24 hours in a day, is for later.

Here are some unworthy experiments with Photoshop, done during my stay in Lausanne. The point is that one can produce anything starting from a random picture taken from the net and deforming (patches of) it by applying available algorithms in the mentioned program. In a way, there is not one line drawn, but only a whirl of local transformations of the original photo.

For example, one of the first drawing with photoshop was this (a part of it is put as the front image of the blog): it is a snail shell

but it means a lot of words, especially regarding to the status of space.

This drawing is a deformed picture of a dancer (so much deformed that at large scale the picture looks like it is no longer a topological transformation of an image of a simply connected dancer)

This one is originally a photo of two cats:

Finally, here you may recognize a photo from Venice, with boats floating, beyond of their stability points , in the sky.

# Combinatorics versus geometric…

… is like using roman numerals versus using a positional numeral system, like the hindu-arabic numerals we all know very well. And there is evidence that our brain is NOT using combinatorial techniques, but geometric, see further.

What is this post about? Well, it is about the problem of using knowledge concerning topological groups in order to study discrete approximate groups, as Tao proposes in his new course, it is about discrete finitely generated groups with polynomial growth which, as Gromov taught us, when seen from far away they become nilpotent Lie groups, and so on. Only that there is a long way towards these subjects, so please bear me a little bit more.

This is part of a larger project to try to understand approximate groups, as well as normed groups with dilations, in a more geometric way. One point of interest is understanding the solution to the Hilbert’s fifth problem from a more general perspective, and this passes by NOT using combinatorial techniques from the start, even if they are one of the most beautiful mathematical gems which is the solution given by Gleason-Montgomery-Zippin to the problem.

What is combinatorial about this solution? Well, it reduces (in a brilliant way) the problem to counting, by using objects which are readily at hand in any topological group, namely the one-parameter subgroups. There is nothing wrong about this, only that, from this point of view, Gromov’s theorem on groups with polynomial growth appears as magical. Where is this nilpotent structure coming from?

As written in a previous post, Hilbert’s fifth problem without one-parameter subgroups, Gromov’ theorem is saying a profound geometrical thing about a finitely generated group with polynomial growth: that seen from far away this group is self-similar, that is a conical group, or a contractible group w.r.t. any of its dilations. That is all! the rest is just a Siebert’ result. This structure is deeply hidden in the proof and one of my purposes is to understand where it is coming from. A way of NOT understanding this is to use huge chunks of mathematical candy in order to make this structure appear by magic.

I cannot claim that I understand this, that I have a complete solution, but instead, for this post, I looked for an analogy and I think I have found one.

It is the one from the first lines of the post.

Indeed, what is wrong, by analogy, with the roman numeral system? Nothing, actually, we have generators, like I, V, X, L, C, D, M, and relations, like IIII = IV, and so on (yes, they are not generators and relations exactly like in a group sense). The problems appear when we want to do complex operations, like addition of large numbers. Then we have to be really clever and use very intelligent and profound combinatorial arguments in order to efficiently manage all the cancellations and whatnot coming from the massive use of relations. Relations are at very small scale, we have to bridge the way towards large scales, therefore we have to approximate the errors by counting in different ways and to be very clever about these ways.

Another solution for this, historically preferred, was to use a positional number system, which is more geometric, because it exploits a large scale property of natural numbers, which is that their collection is (approximately) self-similar. Indeed, take (as another kind of generators, again not in a group sense), a small set, like B={0, 1, 2, …, 8, 9} and count in base 10, which goes roughly like this: take a (big, preferably) natural number $X$ and do the following

– initialize $i = 1$,

– find the smallest natural power $a_{i}$ of 10 such that $10^{-a_{i}} X$ has a norm smaller than 10, then pick the element $k_{i}$ of $B$ which minimizes the distance to $10^{-a_{i}} X$,

– substract (from the right or the left, it does not matter here because addition of natural numbers is commutative) $10^{a_{i}} k_{i}$ from $X$, and rename the result by $X$,

-repeat until $X \in B$ and finish the algorithm by taking the last digit as $X$.

In the end (remember, I said “roughly”) represent $X$ as a string which codes the string of pairs $(a_{i}, k_{i})$.

The advantage of this representation of natural numbers is that we can do, with controlled precision, the addition of big numbers, approximately. Indeed, take two very big numbers $X, Y$ and take another number, like $10$. Then for any natural $n$ define

$(X+Y)$ approx(n) $= 10^{n} ( [X]_{n} + [Y]_{n})$

where $[X]_{n}$ is the number which is represented as the truncation of the string which represents $X$ up to the first $n$ letters, counting from left to right.
If $n$ is small compared with $X, Y$, then $(X+Y)$ approx(n) is close to the true $X+Y$, but the computational effort for calculating $(X+Y)$ approx(n) is much smaller than the one for calculating $X+Y$.

Once we have this large scale self-similarity, then we may exploit it by using the more geometrical positional numeral system instead of the roman numeral system, that is my analogy. Notice that in this (almost correct) algorithm $10^{a} X$ is not understood as $X+X+....+X$ $10^{a}$ times.

Let me now explain why the positional numeral system is more geometric, by giving a neuroscience argument, besides what I wrote in this previous post: “How not to get bored, by reading Gromov and Tao” (mind the comma!).

I reproduce from the wiki page on “Subitizing and counting

Subitizing, coined in 1949 by E.L. Kaufman et al.[1] refers to the rapid, accurate, and confident judgments of number performed for small numbers of items. The term is derived from the Latin adjective subitus (meaning “sudden”) and captures a feeling of immediately knowing how many items lie within the visual scene, when the number of items present falls within the subitizing range.[1] Number judgments for larger set-sizes were referred to either as counting or estimating, depending on the number of elements present within the display, and the time given to observers in which to respond (i.e., estimation occurs if insufficient time is available for observers to accurately count all the items present).

The accuracy, speed, and confidence with which observers make judgments of the number of items are critically dependent on the number of elements to be enumerated. Judgments made for displays composed of around one to four items are rapid,[2] accurate[3] and confident.[4] However, as the number of items to be enumerated increases beyond this amount, judgments are made with decreasing accuracy and confidence.[1] In addition, response times rise in a dramatic fashion, with an extra 250–350 ms added for each additional item within the display beyond about four.[5]

This is a brain competence which is spatial (geometrical) in nature, as evidenced by Simultanagnosia:

Clinical evidence supporting the view that subitizing and counting may involve functionally and anatomically distinct brain areas comes from patients with simultanagnosia, one of the key components of Balint’s syndrome.[14] Patients with this disorder suffer from an inability to perceive visual scenes properly, being unable to localize objects in space, either by looking at the objects, pointing to them, or by verbally reporting their position.[14] Despite these dramatic symptoms, such patients are able to correctly recognize individual objects.[15] Crucially, people with simultanagnosia are unable to enumerate objects outside the subitizing range, either failing to count certain objects, or alternatively counting the same object several times.[16]

From the wiki description of simultanagnosia:

Simultanagnosia is a rare neurological disorder characterized by the inability of an individual to perceive more than a single object at a time. It is one of three major components of Bálint’s syndrome, an uncommon and incompletely understood variety of severe neuropsychological impairments involving space representation (visuospatial processing). The term “simultanagnosia” was first coined in 1924 by Wolpert to describe a condition where the affected individual could see individual details of a complex scene but failed to grasp the overall meaning of the image.[1]

I rest my case.

# Baker-Campbell-Hausdorff polynomials and Menelaus theorem

This is a continuation of the previous post on the noncommutative BCH formula. For the “Menelaus theorem” part see this post.

Everything is related to “noncommutative techniques” for approximate groups, which hopefully will apply sometimes in the future to real combinatorial problems, like the Tao’ project presented here, and also to the problem of understanding curvature (in non-riemannian settings), see a hint here, and finally to the problem of higher order differential calculus in sub-riemannian geometry, see this for a comment on this blog.

Remark: as everything this days can be retrieved on the net, if you find in this blog something worthy to include in a published paper, then don’t be shy and mention this. I believe strongly in fair practices relating to this new age of scientific collaboration opened by the www, even if in the past too often ideas which I communicated freely were taken in published papers without attribution. Hey, I am happy to help! but unfortunately I have an ego too (not only an ergobrain, as any living creature).

For the moment we stay in a Lie group , with the convention to take the exponential equal to identity, i.e. to consider that the group operation can be written in terms of Lie brackets according to the BCH formula:

$x y = x + y + \frac{1}{2} [x,y] + \frac{1}{12}[x,[x,y]] - \frac{1}{12}[y,[y,x]]+...$

For any $\varepsilon \in (0,1]$ we define

$x \cdot_{\varepsilon} y = \varepsilon^{-1} ((\varepsilon x) (\varepsilon y))$

and we remark that $x \cdot_{\varepsilon} y \rightarrow x+y$ uniformly with respect to $x,y$ in a compact neighbourhood of the neutral element $e=0$. The BCH formula for the operation labeled with $\varepsilon$ is the following

$x \cdot_{\varepsilon} y = x + y + \frac{\varepsilon}{2} [x,y] + \frac{\varepsilon^{2}}{12}[x,[x,y]] - \frac{\varepsilon^{2}}{12}[y,[y,x]]+...$

$BCH^{0}_{\varepsilon} (x,y) = x \cdot_{\varepsilon} y$

and $BCH^{0}_{0}(x,y) = \lim_{\varepsilon \rightarrow 0} BCH^{0}_{\varepsilon}(x,y) = x + y$.

Define the “linearized dilation$\delta^{x}_{\varepsilon} y = x + \varepsilon (-x+y)$ (written like this on purpose, without using the commutativity of the “+” operation; due to limitations of my knowledge to use latex in this environment, I am shying away to put a bar over this dilation, to emphasize that it is different from the “group dilation”, equal to $x (\varepsilon(x^{-1}y))$).

Consider the family of $\beta > 0$ such that there is an uniform limit w.r.t. $x,y$ in compact set of the expression

$\delta_{\varepsilon^{-\beta}}^{BCH^{0}_{\varepsilon}(x,y)} BCH^{0}_{0}(x,y)$

and remark that this family has a maximum $\beta = 1$. Call this maximum $\alpha_{0}$ and define

$BCH^{1}_{\varepsilon}(x,y) = \delta_{\varepsilon^{-\alpha_{1}}}^{BCH^{0}_{\varepsilon}(x,y)} BCH^{0}_{0}(x,y)$

and $BCH^{1}_{0}(x,y) = \lim_{\varepsilon \rightarrow 0} BCH^{1}_{\varepsilon}(x,y)$.

Let us compute $BCH^{1}_{0}(x,y)$:

$BCH^{1}_{0}(x,y) = x + y + \frac{1}{2}[x,y]$

and also remark that

$BCH^{1}_{\varepsilon}(x,y) = x+y + \varepsilon^{-1} ( -(x+y) + (x \cdot_{\varepsilon} y))$.

We recognize in the right hand side an expression which is a relative of what I have called in the previous post an “approximate bracket”, relations (2) and (3). A better name for it is a halfbracket.

We may continue indefinitely this recipe. Namely for any natural number $i\geq 1$ we first define the maximal number $\alpha_{i}$ among all $\beta > 0$ with the property that the (uniform) limit exists

$\lim_{\varepsilon \rightarrow 0} \delta_{\varepsilon^{-\beta}}^{BCH^{i}_{\varepsilon}(x,y)} BCH^{i}_{0}(x,y)$

Generically we shall find $\alpha_{i} = 1$. We define then

$BCH^{i+1}_{\varepsilon}(x,y) = \delta_{\varepsilon^{-\alpha_{i}}}^{BCH^{i}_{\varepsilon}(x,y)} BCH^{i}_{0}(x,y)$

and $BCH^{i+1}_{0}(x,y) = \lim_{\varepsilon \rightarrow 0} BCH^{i+1}_{\varepsilon}(x,y)$.

It is time to use Menelaus theorem. Take a natural number $N > 0$. We may write (pretending we don’t know that all $\alpha_{i} = 1$, for $i = 0, ... N$):

$x \cdot_{\varepsilon} y = BCH^{0}_{\varepsilon}(x,y) = \delta^{BCH^{0}_{0}(x,y)}_{\varepsilon^{\alpha_{0}}} \delta^{BCH^{1}_{0}(x,y)}_{\varepsilon^{\alpha_{1}}} ... \delta^{BCH^{N}_{0}(x,y)}_{\varepsilon^{\alpha_{N}}} BCH^{N+1}_{\varepsilon}(x,y)$

Let us denote $\alpha_{0} + ... + \alpha_{N} = \gamma_{N}$ and introduce the BCH polynomial $PBCH^{N}(x,y)(\mu)$ (the variable of the polynomial is $\mu$), defined by: $PBCH^{N}(x,y)(\mu)$ is the unique element of the group with the property that for any other element $z$ (close enough to the neutral element) we have

$\delta^{BCH^{0}_{0}(x,y)}_{\mu^{\alpha_{0}}} \delta^{BCH^{1}_{0}(x,y)}_{\mu^{\alpha_{1}}} ... \delta^{BCH^{N}_{0}(x,y)}_{\mu^{\alpha_{N}}} z = \delta^{PBCH^{N}(x,y)(\mu)}_{\mu^{\gamma_{N}}} z$

Such an element exists and it is unique due to (Artin’ version of the) Menelaus theorem.

Remark that $PBCH^{N}(x,y)(\mu)$ is not a true polynomial in $\mu$, but it is a rational function of $\mu$ which is a polynomial up to terms of order $\mu^{\gamma_{N}}$. A straightforward computation shows that the BCH polynomial (up to terms of the mentioned order) is a truncation of the BCH formula up to terms containing $N-1$ brackets, when we take $\mu =1$.

It looks contorted, but written this way it works verbatim for normed groups with dilations! There are several things which are different in detail. These are:

1. the coefficients $\alpha_{i}$ are not equal to $1$, in general. Moreover, I can prove that the $\alpha_{i}$ exist (as a maximum of numbers $\beta$ such that …) for a sub-riemannian Lie group, that is for a Lie group endowed with a left-invariant dilation structure, by using the classical BCH formula, but I don’t think that one can prove the existence of these numbers for a general group with dilations! Remark that the numbers $\alpha_{i}$ are defined in a similar way as Hausdorff dimension is!

2. one has to define noncommutative polynomials, i.e. polynomials in the frame of Carnot groups (at least). This can be done, it has been sketched in a previous paper of mine, Tangent bundles to sub-riemannian groups, section 6.

UPDATE: (30.10.2011) See the post of Tao

Associativity of the Baker-Campbell-Hausdorff formula

where a (trained) eye may see the appearance of several ingredients, in the particular commutative case, of the mechanism of definition of the BCH formula.

The associativity is rephrased, in a well known way,  in proposition 2 as a commutativity of say left and  right actions. From there signs of commutativity (unconsciously assumed) appear:  the obvious first are the “radial  homogeneity  identities”, but already at this stage a lot of familiar  machinery is put in place and the following is more and more heavy of  the same. I can only wonder:  is this  all necessary? My guess is: not. Because for starters, as explained here and in previous posts, Lie algebras are of a commutative blend, like the BCH formula. And (local, well known from the beginning) groups are not.

# Principles: randomness/structure or emergent from a common cause?

I think both.

Finally Terence Tao presented a sketch of his project relating Hilbert’ fifth problem and approximate groups. For me the most interesting part is his Theorem 12 and its relation with Gromov’ theorem on finitely generated groups with polynomial growth.

A bit dissapointing (for me) is that he seems to choose to rely on “commutative” techniques and, not surprising, he is bound to get results valid only in riemannian geometry (or spaces of Alexander type) AND also to regard the apparition of nilpotent structures as qualitatively different from smooth structures.

For clarity reasons, I copy-paste his two “broad principles”

The above three families of results exemplify two broad principles (part of what I like to call “the dichotomy between structure and randomness“):

• (Rigidity) If a group-like object exhibits a weak amount of regularity, then it (or a large portion thereof) often automatically exhibits a strong amount of regularity as well;
• (Structure) This strong regularity manifests itself either as Lie type structure (in continuous settings) or nilpotent type structure (in discrete settings). (In some cases, “nilpotent” should be replaced by sister properties such as “abelian“, “solvable“, or “polycyclic“.)

Let me contrast with my

Principle of common cause: an uniformly continuous algebraic structure has a smooth structure because both structures can be constructed from an underlying emergent algebra (introduced here).

(from this previous post).

While his “dichotomy between structure and randomness“ principle is a very profound and mysterious one, the second part (Structure) is only an illusion created by psychological choices (I guess). Indeed, both (Lie) smooth structure and nilpotence are just “noncommutative (local) linearity”, as explained previously. Both smooth structure and “conical algebraic” structure (nilpotent in particular) stem from the same underlying dilation structure. What is most mysterious, in my opinion, be it in the “exact” or “approximate” group structure, is HOW a (variant of) dilation structure appears from non-randomness (presumably as a manifestation of Tao randomness/structure principle), i.e. HOW just a little bit of approximate self-similarity bootstraps to a dilation structure, or emergent algebra, which then leads to various flavors of “smoothness” and “nilpotence”.

# Penrose’ combinatorial space time as chora

Roger Penrose, among other extraordinary things he did, proposed an approach to combinatorial space-time by way of spin-networks. Here is a link to his amazing paper

Roger Penrose, Angular momentum: an approach to combinatorial space-time, in Quantum Theory and Beyond, ed. T. Bastin, Cambridge University Press, Cambridge, 1971.