To comment or not to comment, that is the question

Some comments to Gowers’ post “Why I’ve also joined the good guys” prompt me to write a third reaction note. I want to understand why there is so much discussion around the utility of comments on articles “published” (i.e. selected from arxiv or other free OA repositories) in epijournals.

UPDATE: For epijournals see Episciences.org and also the blog post  Episciences: de quoi s’agit-il?.

UPDATE 2: Read “Comments in epijournals: we may learn from Wikipedia” for a constructive proposal concerning comments (and peer-reviews as well).

I take as examples the comments by Izabella Laba and Mike Taylor. Here they are:

Izabella Laba, link to comment:

I would not submit a paper to a journal that would force me to have a mandatory comment page on every article. I have written several long posts already on this type of issues, so here I’ll only say that this is my well considered opinion based on my decades of experience in mathematics, several years of blogging, and following (and sometimes commenting on) blogs with comment sections of varying quality. No amount of talk about possible fixes etc. will make me change my mind.

Instead, I want to mention a few additional points.

1) A new journal needs to develop a critical mass of authors. While having comment pages for articles may well attract some authors, making them mandatory pages will likely turn off just as many. In particular, the more senior and established authors are less likely to worry about the journal being accepted by promotion committees etc, but also less likely to have the time and inclination to manage and moderate discussion pages.

2) It is tempting to think that every paper would have a lively, engaging and productive comment page. In reality, I expect that this would only happen for a few articles. The majority of papers might get one or two lazy comments. The editors would have to spend time debating whether this or that lazy comment is negative enough or obnoxious enough to be removed, in response to the inevitable requests from the authors; but the point is that no greater good was achieved by having the comment page in the first place.

3) It is also tempting to think that such comment pages would contain at least a reasonably comprehensive summary of follow-up work (Theorem 1 was extended to a wider class of functions in [A], Conjecture 2 was proved in [B], and the range of exponents in Theorem 3 was proved to be sharp in [C]). But I don’t believe that this will happen. When I write an article, it is my job to explain clearly and informatively how my results relate to existing literature. It is *not* my job to also post explanations of that on multiple comment pages for cited articles, I certainly would not have the time to do that, and I’m not convinced that we could always count on the existence of interested and willing third parties.

A better solution would be to allow pingbacks (say, from the arXiv), so that the article’s journal page shows also the list of articles citing it. Alternatively, authors and editors might be allowed to add post-publication notes of this type (separate from the main article).

4) Related to this, but from a broader perspective: what is it that journals are supposed to accomplish, aside from providing a validation stamp? The old function of disseminating information has already been taken over by the internet. I believe that the most important thing that journals should be doing now is consolidating information, improving the quality of it, raising the signal to noise ratio.

I can see how this goal would be served by having a small number of discussion pages where the commenters are knowledgeable and engaged. In effect, these pages would serve as de facto expository papers in a different format. I do not think that having a large number of comment pages with one or two comments on them would have the same effect. It would not consolidate information – instead, it would diffuse it further.

On a related note, since I mentioned expository papers – it would be excellent to have a section for those. Right now, the journal market for expository papers is very thin: basically, it’s either the Monthly (limited range of topics) or the AMS Bulletin (very small number of papers, each one some sort of a “big deal”). But there is no venue, for instance, for the type of expository papers that researchers often write when they try to understand something themselves. (Except maybe for conference proceedings, but this is not a perfect solution, for many reasons.)

I will likely have more thoughts on it – if so, I’ll post a longer version of this on my own blog.

Mike Taylor, link to comment:

“I would not submit a paper to a journal that would force me to have a mandatory comment page on every article … No amount of talk about possible fixes etc. will make me change my mind.”

I am sorry to hear that. Without in the slightest expecting or intending to change your mind, I’ll say this: I can easily imagine that within a few more years, I will be refusing to submit to journals that do not have a comment page on my article. From my perspective, the principal purpose of publishing an article is to catalyse discussion and further work. I am loath to waste my work on venues that discourage this.

“It is tempting to think that every paper would have a lively, engaging and productive comment page. In reality, I expect that this would only happen for a few articles. The majority of papers might get one or two lazy comments.”

The solution to this is probably for us to write more interesting papers.

I totally agree with Mike Taylor, and I am tempted to add that authors not willing to accept comments on their articles will deserve a future Darwin award for publication policies. But surely it is their right to lower the chances of their research producing descendants.

Say you are a film maker. What do you want?

  • a) to not allow your film to be seen, because some of the critics may not appreciate it
  • b) to disseminate your film as much as possible and to learn from the critics and the public about its possible weak and strong points

If the movie world were like the current academic world, then most film makers would choose a), because it would not matter whether the film is good or bad; what would matter is only how many films you made and, among them, how many were supported by governmental grants.

A second argument for allowing comments is Wikipedia. It is clear to (almost) anybody that Wikipedia would not be what it is if it were based only on the 500-1000 regular editors (see the wiki page on Aaron Swartz and Wikipedia). Why is it then impossible to imagine that we can make comments on articles a very useful feature of epijournals? Simply by importing some of the well-proven rules from Wikipedia concerning contributors!

On the reasons for such reactions, which disregard reality, another time. I shall just point to the fact that it is still difficult to accept models of thinking based not on pyramidal, bureaucratic organizational structures but on massive networked collaboration. Pre-internet, the pyramidal organization was the most efficient; post-internet it makes no sense, because the cost of organizing (the Coase cost) has dropped to almost nil.

But thought reflexes are still alive, because we are only human.

Discussion about how a UD algorithm might work

I offer this post for discussions around UD-type algorithms. I shall update this post, each time indicating the original comment with the suggested updates.

[The rule concerning comments on this blog is that the first time you comment, I have to approve it. I keep the privilege of not accepting, or of deleting, comments which are not constructive]

For other posts here on the subject of UD see the dedicated tag unlimited detail.

I propose that we start from this comment by JX; then we may work on it to make it clear (even for a mathematician). Thank you JX for this comment!

I have arranged the comment a bit [what is written between brackets is my comment]. I numbered each paragraph for ease of reference.

Now I worked and thought enough to reveal all the details, lol. [see this comment by JX]
I may disappoint you: there’s not much mathematics in what I did. JUST SOME VERY ROUGH BRUTE-FORCE TRICKS.

1) In short: I render cubemaps, but not of pixels – cubemaps of 3D points visible from some center.

2) When the camera is at that cubemap’s center, all points project and no holes are visible. When the camera moves, the world changes realistically in perspective, but the hole count increases. I combine a few snapshots at a time to decrease the hole count, and I also use a simple hole-filling algorithm. My hole-filling algorithm sometimes gives the same artifacts as in the non-cropped UD videos (bottom and right sides).

[source JX #2] ( link to the artifacts image ) These artifacts can appear after applying the hole-filling algorithm from left to right and then from top to bottom; that is why they appear only on the right and bottom sides. Another case is viewport clipping of groups of points arranged into a grid: link from my old experiment with such groups.
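[JX does not spell out the hole-filling pass, but the artifact pattern above pins down its general shape. Here is a minimal sketch, my reconstruction and not JX’s code, of a fill that sweeps left to right and then top to bottom; because color information only flows rightward and downward, errors pile up toward the right and bottom edges, exactly where the artifacts show up:]

#include <stdint.h>

#define W 512
#define H 512
#define EMPTY 0u  /* assumed sentinel: no point projected to this pixel */

/* Sketch of a one-pass hole filler: smear colors left-to-right, then
   top-to-bottom. Information only flows right/down, so mistakes
   accumulate on the right and bottom sides of the image. */
void fill_holes(uint32_t color[H][W])
{
    for (int y = 0; y < H; y++)              /* left-to-right pass */
        for (int x = 1; x < W; x++)
            if (color[y][x] == EMPTY)
                color[y][x] = color[y][x - 1];

    for (int x = 0; x < W; x++)              /* top-to-bottom pass */
        for (int y = 1; y < H; y++)
            if (color[y][x] == EMPTY)
                color[y][x] = color[y - 1][x];
}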

This confirms that UD has holes too, and its claim of “exactly one point for each pixel” isn’t true.

3) I used words like “special”, “way”, “algorithm” etc. just to fog the truth a bit. And there are some problems (with disk space) which don’t really bother UD, as I understand. [that’s why they moved to the geospatial industry] So probably my idea is very far from UD’s secret. Yes, it allows rendering huge point clouds, but it is stupid and I’m sure now that it was done before. Maybe there is a possibility to take some ideas from my engine and improve them, so here is the explanation:
4) Yes, I too started this project with the idea that “indexing is the key”. You say to the database: “camera position is XYZ, give me the points”. And there are files in the database with separated points; the database just picks up a few files and gives them to you. It just can’t be slow. It may only be very heavy-weight (it is impossible to store that many “panoramas”).

5) I found that instead of keeping _screen pixels_ (as for panoramas) for each millimeter of camera position, it is possible to keep actual _point coordinates_ (like a single laser-scanner frame) and project them again and again while the camera moves, filling holes with other points; and the camera step between those files may be far bigger than millimeters (just as, for stereo pairs, you only need two distant “snapshots” to see a volumetric image).
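[To make 5) concrete, here is a minimal sketch of the reprojection step, under simplifying assumptions of mine: points already in camera space, a pinhole camera at the origin looking down +z with fov 90, and a 512×512 target. The nearest point wins each pixel via the z-buffer; untouched pixels stay empty and go to the hole filler:]

#include <stddef.h>
#include <stdint.h>

#define W 512
#define H 512
#define EMPTY 0u

typedef struct { float x, y, z; uint32_t color; } Point;

/* Project stored 3D points through a pinhole camera; keep the nearest
   point per pixel. zbuf must be initialized to a large value and img
   to EMPTY before the call. */
void project(const Point *pts, size_t n,
             float zbuf[H][W], uint32_t img[H][W])
{
    const float f = W / 2.0f;             /* focal length for fov 90 */
    for (size_t i = 0; i < n; i++) {
        if (pts[i].z <= 0.0f) continue;   /* behind the camera */
        int sx = (int)(f * pts[i].x / pts[i].z + W / 2.0f);
        int sy = (int)(f * pts[i].y / pts[i].z + H / 2.0f);
        if (sx < 0 || sx >= W || sy < 0 || sy >= H) continue;
        if (pts[i].z < zbuf[sy][sx]) {    /* nearest point wins */
            zbuf[sy][sx] = pts[i].z;
            img[sy][sx]  = pts[i].color;
        }
    }
}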

6) By “points linked with each other” I meant a bunch of points linked to some central point (by points I mean points _visible_ from the central point).

7) What is a central point? Think of it as a laser-scanner frame. The scanner is static and captures points around itself. Point density is high near the scanner and decreases with distance.

8) So again: my engine just switches gradually between virtual “scanner” snapshots of points relative to some center. During a real-time presentation, for each frame a few snapshots are projected: more points from the nearest snapshots, fewer from the far ones.

9) The total point count isn’t very big, so real-time isn’t impossible. Some holes appear; a simple algorithm fills them using only color and z-buffer data.

10) I obtain frames (or snapshots) by projecting all the points using a perspective matrix; I use fov 90 and a 256×256 or 512×512 point buffer (like a z-buffer, but it stores the point position XYZ relative to the scanner).

11) I do this six times to obtain a cubemap. The maximum number of points in a frame is 512x512x6. I can easily do color interpolation for the overlapped points; I don’t pick the color of a point from one place. This makes the data interleaved and repeated.
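[A small detail of 10)-11) worth making explicit: with fov 90 the six frames tile the full sphere of directions, so each scanner-relative point lands on exactly one cubemap face, the one of its dominant coordinate axis. A sketch, with a face numbering that is my own convention:]

#include <math.h>

/* Which of the six 90-degree fov faces does the scanner-relative
   point (x,y,z) belong to? The dominant axis decides. */
int cube_face(float x, float y, float z)
{
    float ax = fabsf(x), ay = fabsf(y), az = fabsf(z);
    if (ax >= ay && ax >= az) return (x > 0) ? 0 : 1;  /* +X / -X */
    if (ay >= az)             return (y > 0) ? 2 : 3;  /* +Y / -Y */
    return                           (z > 0) ? 4 : 5;  /* +Z / -Z */
}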

12) The next functions allow me to compress point coordinates in snapshots to 16-bit values. Why it works: because we don’t need high precision for distant points; they often don’t change screen position when moved by small steps.

#include <math.h>
#include <stdint.h>

/* expand: decompress a 16-bit coordinate. Computes sign(x)*(|x| + y*|x|^2),
   so values near zero keep near-unit precision while large (distant)
   values are spread over the int32 range. */
int32_t expand(int16_t x, float y)
{
    int8_t sign = 1;
    if (x < 0) { sign = -1; x = -x; }
    return (int32_t)((x + x * (x * y)) * sign);
}

/* shrink: compress a 32-bit coordinate to 16 bits. Inverts expand by
   solving y*t^2 + t = |z| for t, i.e. t = (sqrtf(4*y*z + 1) - 1)/(2*y). */
int16_t shrink(int32_t z, float y)
{
    int8_t sign = 1;
    if (z < 0) { sign = -1; z = -z; }
    return (int16_t)(((sqrtf(4 * y * z + 1) - 1) / (2 * y)) * sign);
}
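[The two functions are inverses of each other: expand computes sign(x)*(|x| + y*|x|^2), and shrink recovers |x| by solving the quadratic y*t^2 + t = |z|, whence t = (sqrt(4yz + 1) - 1)/(2y). A small test harness, to be compiled together with the two functions above; the value of y is an arbitrary choice of mine:]

#include <stdio.h>

/* Round-trip check: shrink(expand(x, y), y) should return x up to the
   rounding error of the float square root. */
int main(void)
{
    const float y = 1.0f / 1024.0f;
    for (int i = -30000; i <= 30000; i += 5000) {
        int16_t x = (int16_t)i;
        int32_t z = expand(x, y);
        printf("x=%6d  expanded=%9d  back=%6d\n",
               (int)x, (int)z, (int)shrink(z, y));
    }
    return 0;
}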

13) I also compress colors to 16 bits and normals to one 24-bit value, and I add a shader number (8 bits) to the point. So one point in a snapshot consists of: 16-bit × 3 position + 24-bit normal + 16-bit color + 8-bit shader.

14) There must be some ways to compress it more (store colors in a texture (lossy JPEG), make some points share shader and normals). An uncompressed snapshot full of points (this may be an indoor snapshot) takes 512x512x6 = 18 MB, or 256x256x6 = 4.5 MB.
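[For concreteness, here is what the 12-byte point of 13) could look like as a C struct. The field order and the exact color/normal encodings are my assumptions; only the bit budget (3×16 + 24 + 16 + 8 = 96 bits) comes from the text. The comment reproduces the sizes quoted in 14):]

#include <stdint.h>

#pragma pack(push, 1)
typedef struct {
    int16_t  pos[3];    /* shrink()-compressed, scanner-relative XYZ */
    uint8_t  normal[3]; /* 24-bit packed normal */
    uint16_t color;     /* 16-bit color, e.g. RGB565 */
    uint8_t  shader;    /* shader number */
} SnapPoint;            /* sizeof(SnapPoint) == 12 bytes */
#pragma pack(pop)

/* Size check: 512*512*6 points * 12 bytes = 18,874,368 bytes (~18 MB);
   256*256*6 * 12 = 4,718,592 bytes (~4.5 MB), matching 14). */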

Of course, after lzma compression (the engine reads directly from the ulzma output, which is fast) it can be up to 10 times smaller, but sometimes only 2-3 times. AND THIS IS A PROBLEM. I’m afraid UD has a smarter way to index its data.

For a 320×240 screen resolution, 512×512 is enough; 256×256 works too, but there will be more holes and quality will suffer.

To summarize the engine’s workflow:

15) Snapshot-building stage. Render all scene points (any speed-up may be used here: octrees or, what I currently use, dynamic point skipping according to the last point’s distance to the camera) into snapshots and compress them. A smaller step between snapshots increases data weight AND rendering time AND quality. There’s not much sense in making the step as small as 1 point, or even 100 points. After this, the scene is no longer needed; or I should say, the scene won’t be used for real-time rendering.

16) Rendering stage. Load the snapshots nearest to the camera and project points from them: more points from closer snapshots, fewer from distant ones; 1 main snapshot + ~6-8 additional ones are used at a time (I am still not sure about this scheme and change it often). Backfa..point culling is applied. Shaders are applied. Holes are filled. The snapshot array is constantly updated according to the camera position.
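[Here is a sketch of one frame of the rendering stage 16), as a single translation unit. Every helper is an assumed stand-in for machinery described above, not JX’s actual code; the 1 main + ~7 extra snapshot scheme follows his rough numbers:]

#include <stdint.h>

#define W 512
#define H 512

typedef struct Snapshot Snapshot;   /* a decompressed point snapshot */

/* Assumed helpers standing in for the machinery described above: */
int  load_nearest(const float pos[3], Snapshot **out, int max);
void clear_buffers(float zbuf[H][W], uint32_t img[H][W]);
void project_share(const Snapshot *s, const float pos[3], float share,
                   float zbuf[H][W], uint32_t img[H][W]);
void fill_holes(uint32_t img[H][W]);

/* Project a large share of points from near snapshots and a smaller
   share from far ones, then fill the remaining holes. */
void render_frame(const float cam_pos[3],
                  float zbuf[H][W], uint32_t img[H][W])
{
    Snapshot *snaps[8];                     /* 1 main + ~7 additional */
    int n = load_nearest(cam_pos, snaps, 8);

    clear_buffers(zbuf, img);
    for (int i = 0; i < n; i++) {
        float share = (i == 0) ? 1.0f : 1.0f / (float)(i + 1);
        project_share(snaps[i], cam_pos, share, zbuf, img);
    }
    fill_holes(img);
}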

17) If I restrict camera positions, it is possible to “compress” a huge point-cloud level into a relatively small database. But in other cases my database will be many times greater than the original point-cloud scene. [See comments JX#2, JX#3, chorasimilarity#4, chorasimilarity#5. Here is an eye-candy image of an experiment by JX, see JX#2:]

eye_candy_by_JX

Next development steps may be:

18) dynamic camera step during snapshot building (it may be better to take more steps where more points are close to the camera (simple to count during projection) and fewer steps when the camera is in the air above the island, for example),

19) better snapshot compression (jpeg, maybe delta-coding for points), octree involvement during snapshot building.

20) But as I realized the disk-space problems, my interest has been falling.

Any questions?

UD question

I shall try to formulate the question of how Unlimited Detail works like this:

Let D be a database of 3D points, containing information about M points. Let also S be the image on the screen, say with N pixels. Problem:

  • reorganize the database D to obtain another database D’ with at most O(M) bits, such that
  • starting from D’ and a finite (say 100 bytes) word there exists an algorithm which finds the image on the screen in O(N log(M)) time.

Is this reasonable?

For example, take N=1. The finite word encodes the position and orientation of the screen in the 3D world of the database. If the M points admitted a representation as a number (a euclidean-invariant hash function?) of order M^a (i.e. polynomial in the number of points), then it would be reasonable to expect D’ to have size of order O(log(M)), so in this case simply by traversing D’ we get the time O(log(M)) = O(N log(M)). Even if we cannot make D’ as small as O(log(M)), maybe the algorithm still takes O(log(M)) steps, simply because M is approximately the volume, so the diameter in 3D space is roughly between M^(1/3) and M, or, due to the scaling of perspective, the algorithm may still hop through D’ in geometric, not arithmetic, steps.
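This is not a guess at UD’s secret, only a calibration of the question: the standard structure that meets the stated budget is an octree over the M points. Its depth is O(log(M)), so a root-to-leaf descent costs O(log(M)), and one descent per pixel gives O(N log(M)). A minimal sketch of the descent in C, where the node layout is an assumption and a pixel is reduced to a point query (a real renderer would walk the view ray instead):

#include <stdint.h>

typedef struct Node {
    struct Node *child[8];   /* NULL where the octant holds no detail */
    uint32_t     color;      /* representative color of this cell */
} Node;

/* Descend to the deepest stored cell around the query point (x,y,z),
   starting from the root cell centered at (cx,cy,cz) with half-size
   half. Each level halves the cell, so the loop runs O(log(M)) times. */
uint32_t locate(const Node *n, float x, float y, float z,
                float cx, float cy, float cz, float half)
{
    for (;;) {
        int k = (x > cx) | ((y > cy) << 1) | ((z > cz) << 2);
        if (!n->child[k]) return n->color;   /* no finer detail stored */
        half *= 0.5f;
        cx += (x > cx) ? half : -half;
        cy += (y > cy) ? half : -half;
        cz += (z > cz) ? half : -half;
        n = n->child[k];
    }
}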

The second remark is that there is no restriction on the time necessary for transforming D into D’.

AZ open access

Instead of Diamond OA (as mentioned in Tim Gowers’ very interesting “Why I’ve also joined the good guys“) I suggest that a better and more inspiring name for this yet mysterious idea of epijournals would be

AZ open access

or open access from A to Z. There is another, better justification for this name; see the end of the post!

The Diamond and Gold names just betray that many people don’t get the idea that, in the age of the net, it is not good to base one’s business model on the SCARCITY OF GOODS. Gold and diamonds are valuable because they are scarce. Information, on the contrary, is abundant and it thrives on being shared. Google got it, for example; they are not doing badly, right? Therefore, why base the publishing business model on the idea of making information scarce, in order to have value? You already have value, because value itself is just a kind of carrier of information.

The name AZ OA is a tribute. It means:

Aaron SwartZ Open Access.

Good news from the good guys

The very recent post of Gowers, “Why I’ve also joined the good guys”, is good news! It is about a platform for “epijournals”, which in common (broken, in my case) English means a system for peer-reviewing arxiv articles.

UPDATE: For epijournals see Episciences.org and also the blog post  Episciences: de quoi s’agit-il?.

If you have seen previous posts here on this subject, then you can imagine I am very excited about this! I immediately posted a comment; it was awaiting moderation but has just appeared, so here it is for posterity:

Congratulations, let’s hope that it will work (however, I don’t understand the secrecy behind the idea). For some time I have been trying to push an idea which emerged from several discussions, described here: Peer-review turned on its head has market value (also see Peer-review is Cinderella’s lost shoe), with very valuable contributions from readers, showing that the model may be viable, as a sort of relative of the pico-publication idea.

Secrecy (if there is any, or maybe I am just uninformed) is not a good idea, because no matter how smart someone is, there is always a smarter idea waiting to germinate in someone else’s head. It is obvious that:

  • a public discussion about this new model will improve it beyond the imagination of the initiators, or it will show its weaknesses (if any), just as in the case of a public discussion about an encryption protocol, say. If you want the idea to stand, then discuss it publicly,
  • the model has to provide an incentive for researchers to do peer reviews. There are two aspects to this: 1) researchers are doing peer reviews for free anyway, for the old-time journals, so maybe the publishers themselves could consider organizing the peer-review process, 2) anything is possible once you persuade enough people that it’s a good idea.
  • any association with expired reflexes (like vanity publication, or counting the first few bits of articles, like ISI, for the sake of HR departments) will harm the project. In this respect see the excellent post MOOCs teach OA a lesson by Eric Van de Velde, where it is discussed why the idea of Massively Open Online Courses (MOOCs) has had much more success in such a short time than the OA movement.

Enough for now; I am looking forward to hearing more about epijournals.

UPDATE: There is no technical reason to ignore some of the eprints which are already on arxiv. By this I mean the following question: are epijournals considering peer-reviewing only new arxiv eprints, or is there any interest in peer-reviewing existing eprints?

UPDATE 2: This comment by Benoît Régent-Kloeckner clarifies who the team behind epijournals is. I reproduce the comment here:

I can clarify a bit the “epi-team” composition. Jean-Pierre Demailly tried to launch a similar project some years ago, but it had much less institutional support and did not work out. More recently, Ariane Rolland heard about this attempt and, having contacts at CCSD, made them meet with Jean-Pierre. That’s the real beginning of the episciences project, which I joined a bit later. The names you should add are the people involved in the CCSD: Christine Berthaud, head of CCSD, Laurent Capelli, who is coding the software right now, and Agnès Magron, who is working on the communication with Ariane.

Gnomonic cubes: a simultaneous view of the extended graphic beta move

Recall that the extended beta move is equivalent to the following pair of moves:

beta_beta_star

where the first move is the graphic beta move and the second move is the dual of the beta move, where duality is (still loosely) defined by the following diagram:

correspondence_1

In this post I want to show you that it is possible to view these two moves simultaneously. For that I need to introduce the gnomonic cube. (Gnomons have appeared several times on this blog, in expected or unexpected places; consult the dedicated tag “gnomon“.)

From the wiki page about the gnomon, we see that

A three dimensional gnomon is commonly used in CAD and computer graphics as an aid to positioning objects in the virtual world. By convention, the X axis direction is colored red, the Y axis green and the Z axis blue.

3DGraphicsGnomon

(image taken from the mentioned wiki page, then converted to jpg)

A gnomonic cube is then just a cube with colored faces. I shall color the faces of the gnomonic cube with symbols of the gates from graphic lambda calculus! Here is the construction:

gnomonic_cube_2

So, to each gate is associated a color, for drawing convenience. The upper part of the picture describes how the faces of the cube are decorated. (Notice the double appearance of the \Upsilon gate, the one used as a FAN-OUT.) The lower part of the picture gives 4 different views of the gnomonic cube. Each face of the cube is associated with a color; each color is associated with a gate.

Here comes the simultaneous view of the pair of moves which form, together, the extended beta move.

gnomonic_cube_3

This picture describes a kind of 3D move: the pair of gnomonic cubes connected by the blue lines can be replaced by the pair of red lines, and conversely.

If you project to the UP face of the dotted big cube, then you get the graphic beta move. The UP view is the viewpoint of lambda calculus (metaphorically speaking).

If you project to the RIGHT face, then you get the dual of the graphic beta move. The RIGHT view is the viewpoint of emergent algebras (…).

Instead of 4 gates (or 5, if we count \varepsilon^{-1} as different from \varepsilon), there is only one: the gnomonic cube. Nice!

Animals in lambda calculus II. The monoid structure and patterns

This is a continuation of “Animals in lambda calculus“. You may need to consult the web tutorial on graphic lambda calculus.

Animals are particular graphs in GRAPH. They were introduced in the post mentioned above, where I wrote that animals have the meaning of a kind of transformation of terms in lambda calculus. I shall come back to this at the end of this post, and a future post will develop the subject.

The anatomy of an animal is described in the next figure.

animal_4

The animal therefore has a body, which is a graph in GRAPH with only one output, decorated in the figure with “OUT”. The allowed gates in the body are: the \lambda gate, the \curlywedge gate, and the termination and \Upsilon gates.

To the body is grafted an \Upsilon tree, with the root decorated in the figure with “IN”. The said tree is called “the spine” of the animal, and the grafting points are called “insertions”; they are figured by small green disks.

An animal may have a trivial spine (no \Upsilon gate, only a wire). The most trivial animal is the wire itself, with the body containing just a wire and with no insertion points. Let’s call this animal “the identity”.

Animals may compose with one another, simply by grafting the “OUT” of one animal to the “IN” of the other. The body, the spine and the insertions of the composed animal are described in the next figure.

animal_5

The moral is that the spine and the insertions of the composed animal are inherited from the first animal in the composition, except when the first animal has a trivial spine (a case not depicted in the figure).

The pleasant thing is that the set of animals is a monoid under composition, with the identity animal as neutral element. That is: the composition of animals is associative and the identity is the neutral element.

In the first post about animals it is explained that the graphic beta move transforms an animal into an animal.  Indeed, the animal

animal_1

is transformed into the animal

animal_3_p

The graphic beta move goes in both directions. For example, the identity animal is transformed by the graphic beta move into the following animal:

animal_3_pp

With this monoid structure in mind, we may ask if there is any category structure behind it, such that animals are (decorations of) arrows. Otherwise said, is there any interesting category which admits a functor from it to the monoid of animals, seen as a category with one object and animals as arrows?

An interesting candidate is the category of “patterns”; the subject will be developed in a future post. But here I shall explain, in a not totally rigorous way, what a pattern is.

The category of patterns has the untyped lambda terms as objects (with some care about the use of alpha equivalence) and patterns as arrows. A category may be defined only in terms of its arrows, therefore let me say what a pattern should be.

A bit loosely speaking, a pattern is a pair (A[u:=B], B), where A, B are untyped lambda calculus terms and u is a variable. I call it a pattern because it really is one. Namely, start from the following data: a term A, a variable u (which appears in A … ) and another term B. They generate a new term A[u:=B], in which all the occurrences of B that were caused by the substitution u:=B are marked somehow.

If you think a bit in graphic lambda calculus terms, this is almost an animal, with the “IN” decorated by B and the “OUT” decorated by A[u:=B].

Composition of animals corresponds to composition of patterns, namely (C[v:= A[u:=B]], A[u:=B]) (A[u:=B], B) = (D[u:=B], B), where D is a term which is defined (with care) such that D[u:=B] = C[v:= A[u:=B]].
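Read as arrows, this is just the matching of sources and targets in a category whose objects are terms: a pattern (A[u:=B], B) is an arrow from B to A[u:=B]. In display form (my paraphrase of the formula above, with the same care needed in defining D):

B \xrightarrow{(A[u:=B],\, B)} A[u:=B] \xrightarrow{(C[v:=A[u:=B]],\, A[u:=B])} C[v:=A[u:=B]] = D[u:=B],

so the composite is the arrow (D[u:=B], B) from B to D[u:=B].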

Clearly, the definition of patterns and their composition is not at all rigorous, but it will be made so in a future post.