All posts by chorasimilarity


A second opinion on “Slay peer review” article

“It is no good just finding particular instances where peer review has failed because I can point you to specific instances where peer review has been very successful,” she said.
She feared that abandoning peer review would make scientific literature no more reliable than the blogosphere, consisting of an unnavigable mass of articles, most of which were “wrong or misleading”.

This is a quote from one of the most interesting articles I read these days: “Slay peer review ‘sacred cow’, says former BMJ chief” by Paul Jump.

I commented previously about replacing peer review with validation by reproducibility, but now I want to concentrate on this quote, which, according to the author of the article, was made by “Georgina Mace, professor of biodiversity and ecosystems at University College London”.

This is the argument in favor of the current peer review system. Opposed to it, and the main subject of the article, is

“Richard Smith, who edited the BMJ between 1991 and 2004, told the Royal Society’s Future of Scholarly Scientific Communication conference on 20 April that there was no evidence that pre-publication peer review improved papers or detected errors or fraud.”

I am very much convinced by this, but let’s think coldly.

The argument for peer review is that the majority of peer-reviewed articles are correct, while the blogosphere is “an unnavigable mass of articles, most of which [are] wrong or misleading”.

The argument against peer review is that

“Richard Smith, who edited the BMJ between 1991 and 2004, told the Royal Society’s Future of Scholarly Scientific Communication conference on 20 April that there was no evidence that pre-publication peer review improved papers or detected errors or fraud.
Referring to John Ioannidis’ famous 2005 paper “Why most published research findings are false”, Dr Smith said “most of what is published in journals is just plain wrong or nonsense”. He added that an experiment carried out during his time at the BMJ had seen eight errors introduced into a 600-word paper that was sent out to 300 reviewers.
“No one found more than five [errors]; the median was two and 20 per cent didn’t spot any,” he said. “If peer review was a drug it would never get on the market because we have lots of evidence of its adverse effects and don’t have evidence of its benefit.””

And moreover:

“peer review was too slow, expensive and burdensome on reviewers’ time. It was also biased against innovative papers and was open to abuse by the unscrupulous. He said science would be better off if it abandoned pre-publication peer review entirely and left it to online readers to determine “what matters and what doesn’t”.”

Which I interpret as confidence in the blogosphere-like medium.

Where is the truth? In the middle, as usual.

Here is my opinion, please form yours.

Potentially, the new medium comes with new and better means to do research. An important part of research is communication, and it is clear that the old system is already obsolete. It is kept artificially alive by authority and business interests.

However, it is also true that the majority of what is accessible via the new medium is of very bad quality and unreliable.

To make another comparison, continuing the one about the fall of academic painters and the rise of the impressionists:
the majority of the work of academic painters was good but not brilliant (reliable but not innovative enough), while the majority of non-academic painters produced crappy cute paintings which average people LOVE. Even now the average opinion is that anybody can doodle and use random colours to make these new paintings (which are 100 years old, by the way). That is life.

You can’t blame a non-affiliated painter for showing his work in the same venue where you find all the cats, kids, wrinkled old people and cute places.

On the science side, we live in a sea of crap which is loved by average people, and which is the lifeblood of the so-called attention economy.

It is difficult for a researcher, who is a particular species of creator, to find other people willing to spend time not only to share original ideas (which are disliked by default, because they are strange), but also to invest work into understanding, into creating.

That is why I believe that:
– there have to be social incentives for these people (and attention-economy thinking does not help here, being instead a vector of propagation for big-budget PR, lolcats and life-wisdom quotes)
– and the creators of new scientific content have to provide, as much as possible, the means for self-validation of their work.


A comparison of two models of computation: chemlambda and Turing machines

The purpose is to understand clearly what this story is about. The simplest stuff, OK? In order to feel it in familiar situations.

Chemlambda is a collection of rules about rewrites done on pieces of files in a certain format. Without an algorithm which tells it which rewrite to use, where and when, chemlambda does nothing.

In the sophisticated version of the Distributed GLC proposal, this algorithmic part uses the Actor Model idea. Too complicated! Let’s go simpler!

The simplest algorithm for using the collection of rewrites from chemlambda is the following:
  1. take a file (in the format called “mol”, see later)
  2. look for all patterns in the file which can be used for rewrites
  3. if there are different patterns which overlap, then pick a side (by using an ordering of graph rewrites, like the precedence rules in arithmetic)
  4. apply all the rewrites at once
  5. repeat (either until there is no rewrite possible, or a given number of times, or forever)
 To spice things just a bit, consider the next simple algorithm, which is like the one described, only that we add at step 2:
  •  for every identified pattern flip a coin to decide to keep it or ignore it further in the algorithm
The reason is that randomness is the simplest way to say: who knows if I can do this rewrite when I want? Maybe I have only a part of the file on my computer, or maybe a friend has half of the pattern and I have the other half, so I have to talk with him first and agree to make the rewrite together. Who knows? Flip a coin, then.
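For concreteness, here is a minimal sketch of this loop in Python. It is not the project's code (the actual scripts are written in sh and awk, see below); the arguments find_patterns, resolve_conflicts and apply_rewrite are placeholders standing for the chemlambda pattern matching and graph rewrites.

```python
import random

def reduce(graph, find_patterns, resolve_conflicts, apply_rewrite,
           max_cycles, randomized=False):
    """Run the 'stupid' algorithm: deterministic, or random if randomized=True."""
    for _ in range(max_cycles):
        # steps 1-2: look for all patterns in the graph which can be rewritten
        patterns = find_patterns(graph)
        # the extra step: flip a coin for every identified pattern
        if randomized:
            patterns = [p for p in patterns if random.random() < 0.5]
        # step 3: if patterns overlap, pick a side by a fixed precedence
        patterns = resolve_conflicts(patterns)
        # step 4: apply all the surviving rewrites at once
        for p in patterns:
            apply_rewrite(graph, p)
        # step 5: repeat, or stop early if no rewrite is possible
        if not patterns:
            break
    return graph
```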

Now, proven facts.

Chemlambda with the stupid deterministic algorithm is Turing universal. Which means that, implicitly, this is a model of computation. Everything is prescribed from the top to the bottom. It is on a par with a Turing machine, or with a RAM model.

Chemlambda with the stupid random algorithm also seems to be Turing universal, but I don’t yet have a proof of this. There is a reason to believe it is as powerful as the stupid deterministic one, but I won’t go there.

So the right image to have is that chemlambda with the  described algorithm can do anything any computer can.

The first question is: how? For example, how does chemlambda compare with a Turing machine? If it sits at this basic level then it is incomprehensible, because we humans can’t make sense of a scroll of bytecode unless we are highly trained in this very specialised task.

All computers do the same thing: they crunch machine code. No matter which high-level language you use to write a program, it is compiled and eventually there is machine code which is executed, and that is the level we are speaking of.

It does not matter which language you use, eventually all is machine code. There is a huge architectural tower and we are on top of it, but in the basement all looks the same. The tower is here to make it easy for us to use the superb machine. It is not needed otherwise; it is only for our comfort.

This is very puzzling when we look at chemlambda, because it is claimed that chemlambda has something to do with lambda calculus, and lambda calculus is the prototype of a functional programming language. So it appears that chemlambda should be associated with higher meaning and clever thinking, and abstraction of the abstraction of the abstraction.

No, from the point of view of the programmer.

Yes, from the point of view of the machine.

In order to compare chemlambda with a TM we have to put them in the same terms. You can easily express a TM as a rewrite system, such that it works with the same stupid deterministic algorithm (a minimal sketch follows).
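To make that claim concrete, here is a small sketch (mine, not from the chemlambda repository) of one Turing machine step written as a local rewrite of the pattern (state, symbol under the head); the little bit-flipping machine and its transition table are made up for the example.

```python
def tm_step(tape, pos, state, delta):
    """One TM step: rewrite the local pattern (state, tape[pos]) using delta."""
    new_state, new_symbol, move = delta[(state, tape[pos])]  # the rewrite rule
    tape[pos] = new_symbol                                    # local rewrite
    pos += 1 if move == 'R' else -1
    return tape, pos, new_state

# a made-up machine that flips bits until it reads a blank '_'
delta = {
    ('flip', '0'): ('flip', '1', 'R'),
    ('flip', '1'): ('flip', '0', 'R'),
    ('flip', '_'): ('halt', '_', 'R'),
}
tape, pos, state = list('0110_'), 0, 'flip'
while state != 'halt':
    tape, pos, state = tm_step(tape, pos, state, delta)
print(''.join(tape))  # prints 1001_
```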

I have not yet written this down, but the conclusion is obvious: chemlambda can do lambda calculus with one rewrite, while a Universal Turing Machine needs about 20 rewrites to do what TMs do.

Wait, what about distributivity, propagation, the fanin, all the other rewrites?
They are common, they just form a mechanism for signal transduction and duplication!
Chemlambda is much simpler than TM.

So you can use chemlambda directly, at this metal level, to perform lambda calculus. This is explained here.

And I highly recommend trying to play with it by following the instructions.

You need a linux system, or any system where you have sh and awk.


1. get the gh-pages branch of the repository (via the “fork me on github” link)
2. unzip it and go to the directory “dynamic”
3. open a shell and run the main shell script with bash
4. you will get a prompt and a list of files with the extension .mol; write the name of one of them, in the form file.mol
5. you get file.html. Open it with a browser with js enabled. For reasons I don’t understand, it works much better in safari, chromium and chrome than in firefox.

When you look at the result of the computation you see an animation, which is the equivalent of watching a TM head running here and there on a tape. It does not make much sense at first, but you can convince yourself that it works and get a feeling about how it does it.

Once you get this feeling I will be very glad to discuss more!

Recall that all this is related  to the most stupid algorithm. But I believe it helps a lot to understand how to build on it.

Yes, “slay peer review” and replace it by reproducibility

Via Graham Steel the awesome article Slay peer review ‘sacred cow’, says former BMJ chief.

“Richard Smith, who edited the BMJ between 1991 and 2004, told the Royal Society’s Future of Scholarly Scientific Communication conference on 20 April that there was no evidence that pre-publication peer review improved papers or detected errors or fraud. […]

“He said science would be better off if it abandoned pre-publication peer review entirely and left it to online readers to determine “what matters and what doesn’t”.

“That is the real peer review: not all these silly processes that go on before and immediately after publication,” he said.”

That’s just a part of the article, go read the counter arguments by Georgina Mace.

Form your own opinion about this.

Here is mine.

In the post Reproducibility vs peer review I write

“The only purpose of peer review is to signal that at least one, two, three or four members of the professional community (peers) declare that they believe that the said work is valid. Validation by reproducibility is much more than this peer review practice. […]

Compared to peer review, which is only a social claim that somebody from the guild checked it, validation through reproducibility is much more, even if it does not provide means to absolute truths.”

There are several points to mention:

  • the role of journals is irrelevant to anybody other than publishers and their fellow academic bureaucrats, who work together to maintain this crooked system for their own $ advantage.
  • indeed, an article should give by itself the means to validate its content
  • which means that the form of the article has to change from the paper version to a document which contains data, programs, everything which may help to validate the content written with words
  • and the validation process (aka post review) has to be put on a par with the activity of writing articles. Even if an article comes with all the means to validate it (like the process described in Reproducibility vs peer review), the validation requires work and is by itself an activity akin to the one reported in the article. More than this, the validation may or may not work out as the author of the work supposes, but in any case it leads to new scientific content.

In theory this sounds great, but in practice it may be very difficult to provide a work with the means of validation (of course, up to the external resources used in the work, like for example other works).

My answer is that concretely it is possible to do this, and I offer as an example my article Molecular computers, which is published online and comes with a repository containing all the means to confirm or refute what is written in the article.

The real problem is social. In such a system the bored researcher has to spend more than 10 minutes, tops, to read an article he or she intends to use.

It is then much easier, socially, to use the current, unscientific system of replacing validation by arguments from authority.

As well, the monkey system (you scratch my back and I’ll scratch yours) which is behind most peer reviews (just think about the extreme specialisation of research, which makes it almost certain that a reviewer competes or collaborates with the author), well, that monkey system will no longer function.

This is an even bigger problem than the fact that publishing and academic bean counting will soon be obsolete.

So my forecast is that we shall keep a mix of authority-based validation (read: “peer review”) and validation by reproducibility, for some time.

The authority, though, will take another blow.

Which is in favour of research. It is also economically sound, if you think that today probably the majority of research funding goes to researchers whose work passes peer review, but not validation.


All successful computation experiments with lambda calculus in one list

What you see in these links: I take a lambda term, transform it into an artificial molecule, then let it reduce randomly, with no evaluation strategy. That is what I call the most stupid algorithm. The amazing thing is that it works.
You don’t have to believe me, because you can check it independently, by using the programs available in the github repo.

Here is the list of demos where lambda terms are used.

Now, details of the process:
– I take a lambda term and I draw the syntactic tree
– this tree has as leaves the variables, bound and free. These are eliminated by two tricks, one for the bound variables, the other for the free ones. A bound variable is eliminated by replacing it with a new arrow in the graph, going from one leg of the lambda abstraction node to the leaf where the variable appears. If the same bound variable appears in more places, then some fanout (FO) nodes are inserted. The free variables get the same treatment, by adding a tree of FO nodes for each free variable. If a bound variable does not appear anywhere, then add a termination (T) node.
– in this way the graph obtained is no longer a tree. It is mostly a trivalent graph, with some free ends, and it is an oriented graph. There is one free end corresponding to an “in” arrow for each free variable, and there is only one free end corresponding to an “out” arrow, coming from the root of the syntactic tree.
– I give a unique name to each arrow in the graph
– then I write the “mol file” which represents the graph, as a list of nodes and the names of the arrows connected to the nodes (thus an application node A which has the left leg connected to the arrow “a”, the right leg connected to the arrow “b” and the out leg connected to “c” is described by one line, “A a b c”, for example).

OK, now I have the mol file, I run the scripts on it and then I look at the output.

What is the output?

The scripts take the mol file and transform it into a collection of associative arrays (that’s why I’m using awk) which describe the graph.
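As an illustration, here is how such a mol file could be read into dictionaries in Python; the real scripts do the analogous thing with awk associative arrays, and the two-line mol fragment below is made up, using only the A-node convention described above.

```python
def parse_mol(text):
    nodes = []   # one entry per mol line: (node type, list of arrow names)
    arrows = {}  # arrow name -> list of (node index, port position)
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        nodes.append((fields[0], fields[1:]))
        for pos, name in enumerate(fields[1:]):
            arrows.setdefault(name, []).append((len(nodes) - 1, pos))
    return nodes, arrows

nodes, arrows = parse_mol("A a b c\nA c d e\n")
print(nodes)        # [('A', ['a', 'b', 'c']), ('A', ['c', 'd', 'e'])]
print(arrows['c'])  # [(0, 2), (1, 0)] -- the arrow 'c' links the two nodes
```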

Then they apply the algorithm which I call “stupid”, because it really is stupidly simplistic: do a predetermined number of cycles, where in each cycle do the following:
– identify the places (called patterns) where a chemlambda rewrite is possible (these are pairs of lines in the mol file, so pairs of nodes in the graph)
– then, as you identify a pattern, flip a coin; if the coin gives “0” then block the pattern and propose a change in the graph
– when you finish all this, update the graph
– some rewrites involve the introduction of 2-valent nodes called “Arrow”. Eliminate them in an inner cycle called the “COMB cycle”, i.e. comb the arrows (a sketch of this combing follows below)
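Here is a small sketch of the combing, on the (node type, arrow names) representation used in the parsing sketch above. It assumes an Arrow line has the form “Arrow in out”; the port order of the other nodes does not matter for the renaming.

```python
def comb(nodes):
    """Remove every 2-valent 'Arrow' node and glue its two arrow names into one."""
    while True:
        i = next((k for k, (t, _) in enumerate(nodes) if t == 'Arrow'), None)
        if i is None:
            return nodes
        _, (a_in, a_out) = nodes.pop(i)
        # every other use of the out arrow now gets the in arrow's name
        nodes = [(t, [a_in if p == a_out else p for p in ports])
                 for t, ports in nodes]

# example: the arrow between the A node and the FO node passes through an Arrow node
print(comb([('A', ['a', 'b', 'c']), ('Arrow', ['c', 'd']), ('FO', ['d', 'e', 'f'])]))
# -> [('A', ['a', 'b', 'c']), ('FO', ['c', 'e', 'f'])]
```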

As you see, there is absolutely no care about the correctness of the intermediary graphs. Do they represent lambda terms? Generically no!
Are there any variables which are passed, or evaluations of terms done in some clever order (eager, lazy, etc.)? Not at all. There are no variables other than the names of the arrows of the graph, and these have the property that each name appears twice in the mol file (once in an “in” port, once in an “out” port). When a pattern is replaced these names disappear, and the names of the arrows of the new pattern are generated on the fly, for example by a counter of arrows.

The scripts do the computation and then they stop. There is a choice to be made about how to see the computation and the results.
One obvious choice would be to see the computation as a sequence of mol files, corresponding to the sequence of graphs. Then one could use another script to transform each mol file into a graph (via, say, a json file) and use some graph visualiser to see the graph. This was the choice made in the first scripts.
Another choice is to make an animation of the computation, by using d3.js. Nodes which are eliminated are first freed of links and then they vanish, while new nodes appear, are linked with their ports, then linked with the rest of the graph.
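Either way, the starting point is a json-like description of the graph. Here is a sketch (mine, not one of the original scripts) of how the parsed graph could be dumped as the kind of nodes/links json a d3.js force layout expects; only the “nodes”, “links”, “source”, “target” field names follow the usual d3 convention, everything else is illustrative.

```python
import json

def to_d3_json(nodes, arrows):
    """nodes, arrows: as produced by the parse_mol sketch earlier."""
    d3_nodes = [{"id": i, "type": t} for i, (t, _) in enumerate(nodes)]
    d3_links = [{"source": ends[0][0], "target": ends[1][0], "arrow": name}
                for name, ends in arrows.items() if len(ends) == 2]
    return json.dumps({"nodes": d3_nodes, "links": d3_links}, indent=2)

# with the two-line mol fragment from before; only the internal arrow 'c'
# becomes a link, the free ends are omitted for brevity
print(to_d3_json([('A', ['a', 'b', 'c']), ('A', ['c', 'd', 'e'])],
                 {'c': [(0, 2), (1, 0)]}))
```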

This is what you see in the demos. The scripts produce a html file, which has inside a js script which uses d3.js. So the output of the scripts is the visualisation of the computation.

Recall that the algorithm of computation is random, therefore it is highly likely that different runs of the algorithm give different animations. In the demos you see one such animation, but you can take the scripts from the repo and make your own.

What is amazing is that they give the right results!

It is perhaps bizarre to look at the computation and not make any sense of it. What happens? Where is the evaluation of this term? Who calls whom?

None of this happens. The algorithm just does what I explained. And since there are no calls, no evaluations, no variables passed from here to there, you won’t see any of that.

That is because the computation does not work by the IT paradigm of signals sent through wires and gates, but by what chemists call signal transduction. This is a pervasive phenomenon: a molecule enters into chemical reactions with others, and there is a cascade of further chemical reactions which propagate and produce the result.

About what you see in the visualisations.
Because the graph is oriented, and because the legs of the trivalent nodes are differentiated (i.e. for example there might be a left leg, a right leg and an out leg, which for symmetry is described as a middle.out leg), I want to turn it into an unoriented graph.
This is done by replacing each trivalent node by 4 nodes, and each free end or termination node by 2 nodes.
For trivalent nodes there will be one main node and 3 other nodes which represent the legs. These are called “ports”. There is a colour-coded notation, and the choice made is to represent the nodes A (application) and FO with the main node coloured green, and L (lambda) and FI (fanin, which exists in chemlambda only) with red (actually in the demos this is a purple),
and so on. The port nodes are coloured yellow for the “in” ports and blue for the “out” ports. The “left”, “right”, “middle” types are encoded by the radius of the ports.
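As a sketch of this expansion (the colours are taken from the description above; the concrete radius values and the leg list of each node type are made up for the example):

```python
MAIN_COLOR = {"A": "green", "FO": "green", "L": "red", "FI": "red"}  # "and so on"
PORT_COLOR = {"in": "yellow", "out": "blue"}
PORT_RADIUS = {"left": 3, "right": 5, "middle": 7}  # the radius encodes the leg type

def expand_trivalent(node_id, node_type, legs):
    """legs: three (leg name, direction) pairs, e.g. ('left', 'in')."""
    display = [{"id": node_id, "color": MAIN_COLOR.get(node_type, "gray")}]
    for leg, direction in legs:
        display.append({"id": node_id + ":" + leg,
                        "color": PORT_COLOR[direction],
                        "radius": PORT_RADIUS[leg]})
    return display

# example: an application node with two 'in' legs and a middle 'out' leg
print(expand_trivalent("n0", "A", [("left", "in"), ("right", "in"), ("middle", "out")]))
```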


Suppose this. What then?

 I want to understand how a single molecule interacts with others, chemically. You have to agree that this is a worthy goal.
What I say is this. By using a collection of made-up molecules and made-up chemical reactions, I proved that with the stupid deterministic algorithm I can do anything, and I checked by experiment that, if I design the initial molecule well, I can do anything I set out to do with the stupid random algorithm (a molecule which randomly encounters enzymes which rewrite it by chemical reactions). For me, the molecule is not the process, it is just a bunch of atoms and bonds. But I proved I can do anything with it, without any lab supervision.

Which is relevant because any real cell does that. It has no supervision, no goals, no understanding; it is nothing else than a collection of chemicals which interact randomly.

My hypothesis is the following. There is a transformation from the made-up chemlambda molecules to real chemistry, which TRANSFORMS:
– node into real molecule
– port into real molecule
– bond into real molecule

together with some other real molecules, called here “enzymes”, one per type of graph rewrite,

such that

– a graph rewrite G, which replaces a configuration LT of two nodes and one bond by a configuration RT, TRANSFORMS into the chemical reaction between the enzyme G and the real-chemistry image of LT, which produces the real-chemistry image of RT and the enzyme G (perhaps carrying away some other reaction products, so that the number of atoms is conserved).
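Written as a reaction scheme (just a compact restatement of the hypothesis above, nothing more):

TRANSFORM(LT) + enzyme G  →  TRANSFORM(RT) + enzyme G  (+ possibly some small byproducts, so that the number of atoms is conserved)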

The argument for that hypothesis is that the rewrites are so simple, compared with real chemistry of biomolecules, that there have to exist such reactions.

This is explained in the Molecular computers article.

Suppose that the hypothesis is confirmed. Either by identifying the TRANSFORM from scratch (i.e. by using chemistry knowledge to identify classes of reactions and chemicals which can model chemlambda), or by finding the enzymes G and the real molecules corresponding to node, port and bond in some fundamental biochemical processes (that would be even more wonderful).

Suppose this. What then?

Then, say I have the goal to design a molecule which does something inside a cell, when injected in the body. It does it by itself, in the cell medium. What it does can always (or in principle) be expressed as an algorithm, as a computation.

I use chemlambda and TRANSFORM to design the molecule and check that, once I have it, it does the job. It is of course a problem to build it in reality, but for this I have Craig Venter’s printer, the digital biological converter.

So I print it and that’s all. When injected in the body, once arrived in the cell, it does the job.

Other possibilities would open up in case some formalism like chemlambda (i.e. one using individual molecules and rewrites, along with trivial random algorithms) is identified in real biochemistry. This would enormously help the understanding of biochemical processes, because instead of working empirically, like now, when we work at the level of functions of molecules (knowing well that the same molecule does different things in different contexts and that the molecule-function association is very fragile in biology), we might work in the opposite direction, from using functions as black boxes to being able to build functions. Even to go outside functions and understand chemistry as computation directly, not only as a random medium for encoding our theoretical notions of computation.

See more about this at the chemlambda index


Busy beaver’s random church

The demo linked here would surely look impossible.

One computer, WHILE IT WORKS, is multiplied into 3 computers which each continue to work. This feat is done without any higher control, and no conflicts ever appear. All the computers work asynchronously (mimicked here by randomness). Moreover, they eventually come to read and write on the same tape.

There are no calls, nothing you have seen before. Everything is local, no matter how you slice it into separate parts.


OMG the busy beaver goes into abstract thinking

So-called higher capacities, like abstraction, are highly hyped. Look:
What happens when you apply the Church number 3 to a busy beaver? You get 3 busy beavers on the same tape.
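For reference, the Church numeral 3 is the lambda term λf.λx.f(f(f x)): applied to something, it iterates it three times. So, applied to the busy beaver molecule, it produces three copies of the machine working on the same tape, which is what the demo shows.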

Details will be added into the article Turing machines, chemlambda style.

If you want to experiment, then click on “fork me on github” and copy the gh-pages branch of the repo. Then look in the dynamic folder for the shell script. In a terminal run it with bash, then type “church3bb.mol”. You will get the file church3bb.html, which you can see by using a js-enabled browser.
The sh script calls an awk script, which produces the html file. The awk script is check_1_mov2_rand_metabo_bb.awk. Open it with a text editor and you will see at the beginning all kinds of parameters which you can change (before calling the sh script), so that you may alter the duration and the speed, or switch between the deterministic and random algorithms.
Finally, you also need a mol file to play with. For this demo the mol file church3bb.mol has been used. You can also open it with a text editor and play with it.

UPDATE: I will tweak it more in the next days, but the idea I want to communicate is that a TM can be seen as chemistry, like in chemlambda, and it interacts very well with the rest of the formalism. So you have these two pillars of computation on the same footing, together, despite the impression that they are somehow very different, one like hardware and the other like software.