Tag Archives: validation

The price of publishing with GitHub, Figshare, G+, etc

Three years ago I posted The price of publishing with arXiv. If you look at my arXiv articles then you’ll notice that I barely posted on arXiv.org since then. Instead I went into territory which is even less recognized as serious by a big part of academia. I used:

The effects of this choice are put in front of my homepage, so go there to read them. (Besides, it is a good exercise to remember how to click on links and use them, that lost art from the age when internet was free.)

In this post I want to explain what is the price I paid for these choices and what I think now about them.

First, it is a very stressful way of living. I am not joking, as you know stress comes from realizing that there are many choices and one has to choose. Random reward from the social media is addictive. The discovery that there is a way to get out from the situation which keeps us locked into the legacy publishing system (validation). The realization that the problem is not technical but social. A much more cynical view of the undercurrents of the social life of researchers.

The feeling that I can really change the world with my research. The worries that some possible changes might be very dangerous.

The debt I owe concerning the scarcity of my explanations. The effort to show only the aspects I think are relevant, putting aside those who are not. (Btw, if you look at my About page then you’ll read “This blog contains ideas from the future”. It is true because I already pruned the 99% of the paths leading nowhere interesting.)

The desire to go much deeper, the desire to explain once again what and why, to people who seem either lacking long term attention capability or having shallow pet theories.

Is like fishing for Moby Dick.

Eisen’ “parasitic green OA” is the apt name for Harnad’ flawed definition of green OA, but all that is old timers disputes, the future is here and different than both green and gold OA

See this post and the replies on G+ at https://plus.google.com/+MariusBuliga/posts/efzia2KxVzo.

My short description of the situation: the future is here, and it is not gold OA (nor the flawed green OA definition which ignores arXiv). So, visually:


It has never occurred to me that putting an article in a visible place (like arXiv.org) is parasitic green OA+Michael B. Eisen  calls it parasitic because he supposes that this has to come along with the real publication. But what if not?

[Added: Eisen writes in the body of the post that he uses the definition given by Harnad to green OA, which ignores the reality. It is very conveniently for gold OA to have a definition of green OA which does not apply to the oldest (1991) and fully functional example of a research communication experiment which is OA and green: the arXiv.org.]
Then, compared to that, gold OA appears as a progress.

I think gold OA, in the best of cases, is a waste of money for nothing.

A more future oriented reply has +Mike Taylor
who sees two possible futures, green (without the assumption from Eisen post) and gold.

I think that the future comes faster. It is already here.

Relax. Try validation instead peer review. Is more scientific.

Definition. Peer-reviewed article: published by the man who saw the man who claims to have read it, but does not back the claim with his name.

The reviewers are not supermen. They use the information from the traditional article. The only thing they are supposed to do is that they read it. This is what they use to give their approval stamp.

Validation means that the article provides enough means so that the readers can reproduce the research by themselves. This is almost impossible with  an article in the format inherited from the time when it was printed on paper. But when the article is replaced by a program which runs in the browser, which uses databases, simulations, whatever means which facilitate the validation, then the reader can, if he so wishes, make a scientific motivated opinion about this.

Practically the future has come already and we see it on Github. Today. Non-exclusively. Tomorrow? Who knows?

Going back to the green-gold OA dispute, and Elsevier recent change of sharing and hosting articles (which of course should have been the real subject of discussions, instead of waxing poetic about OA, only a straw man).

This is not even interesting. The discussion about OA revolves around who has the copyright and who pays (for nothing).

I would be curious to see discussions about DRM, who cares who has the copyright?

But then I realised that, as I wrote at the beginning of the post, the future is here.

Here to invent it. Open for everybody.

I took the image from this post by +Ivan Pierre and modified the text.


Don’t forget to read the replies from the G+ post.


Real or artificial chemistries? Questions about experiments with rings replications

The replication mechanism for circular bacterial chromosomes is known. There are two replication forks which propagate in two directions, until they meet again somewhere and the replication is finished.


[source, found starting from the wiki page on circular bacterial chromosomes]

In the artificial chemistry chemlambda something similar can be done. This leads to some interesting questions. But first, here is a short animation which describes the chemlambda simulation.


The animation has two parts, where the same simulation is shown. In the first part some nodes are fixed, in order to ease the observation of the propagation of the replication, which is like the propagation of the replication forks. In the second part there is no node fixed, which makes easy to notice that eventually we get two ring molecules from one.


If the visual evidence convinced you, then it is time to go to the explanations and questions.

But notice that:

  • The replication of circular DNA molecules is done with real chemistry
  • The replication of the circular molecule from the animation is done with an artificial chemistry model.

The natural question to ask is: are these chemistries the same?

The answer may be more subtle than a simple yes or no. As more visual food for thinking, take a look at a picture from the Nature Nanotechnology Letter “Self-replication of DNA rings” http://www.nature.com/nnano/journal/vaop/ncurrent/full/nnano.2015.87.html by Junghoon Kim, Junwye Lee, Shogo Hamada, Satoshi Murata & Sung Ha Park


[this is Figure 1 from the article]

This is a real ring molecule, made of patterns which, themselves are made of DNA. The visual similarity with the start of the chemlambda simulation is striking.

But this DNA ring is not a DNA ring as in the first figure. It is made by humans, with real chemistry.

Therefore the boldfaced question can be rephrased as:

Are there real chemical assemblies which react as of they are nodes and bonds of the artificial chemistry?

Like actors in a play, there could be a real chemical assembly which plays the role of a red atom in the artificial chemistry, another real chemical assembly which plays the role of a green atom, another for a small atom (called “port”) and another for a bond between these artificial chemistry atoms.

From one side, this is not surprising, for example a DNA molecule is figured as a string of letters A, C, T, G, but each letter is a real molecule. Take A (adenine)



Likewise, each atom from the artificial chemistry (like A (application), L (lambda abstraction), etc) could be embodied in real chemistry by a real molecule. (I am not suggesting that the DNA bases are to be put into correspondence with artificial chemistry atoms.)

Similarly, there are real molecule which could play the role of bonds. As an ilustration (only), I’ll take the D-deoxyribose, which is involved into the backbone structure of a DNA molecule.



So it turns out that it is not so easy to answer to the question, although for a chemist may be much easier than for a mathematician.


 0. (few words about validation)  If you have the alternative to validate what you read, then it is preferable to authority statements or hearsay from editors. Most often they use anonymous peer-reviews which are unavailable to the readers.

Validation means that the reader can make an opinion about a piece of research by using the means which the author provides. Of course that if the means are not enough for the reader, then it is the author who takes the blame.

The artificial chemistry  animation has been done by screen recording of the result of a computation. As the algorithm is random, you can produce another result by following the instructions from
I used the mol file model1.mol and quiner.sh. The mol file contains the initial molecule, in the mol format. The script quiner.sh calls the program quiner.awk, which produces a html and javascript file (called model1.html), which you can see in a browser.

I  added text to such a result and made a demo some time ago


(when? look at the history in the github repo, for example: https://github.com/chorasimilarity/chemlambda-gui/commits/gh-pages/dynamic/model1.html)

1. Chemlambda is a model of computation based on artificial chemistry which is claimed to be very closed to the way Nature computes (chemically), especially when it comes to the chemical basis of computation in living organisms.
This is a claim which can be validated through examples. This is one of the examples. There are many other more; the chemlambda collection shows some of them in a way which is hopefully easy to read (and validate).
A stronger claim, made in the article Molecular computers (link further), is that chemlambda is real chemistry in disguise, meaning that there exist real chemicals which react according to the chemlambda rewrites and also according to the reduction algorithm (which does random rewrites, as if produced by random encounters of the parts of the molecule with invisible enzymes, one enzyme type per rewrite type).
This claim can be validated by chemists.

2. In the animation the duplication of the ring molecule is achieved, mostly, by the propagation of chemlambda DIST rewrites. A rewrite from the family DIST typically doubles the number of nodes involved (from 2 to 4 nodes), which vaguely suggest that DIST rewrites may be achieved by a DNA replication mechanism.
(List of rewrites here:
http://chorasimilarity.github.io/chemlambda-gui/dynamic/moves.html )

So, from my point of view, the question I have is: are there DNA assemblies for the nodes, ports and bonds of chemlambda molecules?

3. In the first part of the animation you see the ring molecule with some (free in FRIN and free out FROUT) nodes fixed. Actually you may attach more graphs at the free nodes (yellow and magenta 1-valent nodes in the animation).

You can clearly see the propagation of the DIST rewrites. In the process, if you look closer, there are two nodes which disappear. Indeed, the initial ring has 9 nodes, the two copies have 7 nodes each. That is because the site where the replication is initiated (made of two nodes) is not replicated itself. You can see the traces of it as a pair of two bonds which connect, each, a free in with a free out node.

In the second part of the animation, the same run is repeated, this time without fixing the FRIN and FROUT nodes before the animation starts. Now you can see the two copies, each with 7 nodes, and the remnants of the initiation site, as a two pairs of FRIN-FROUT nodes.


Artificial life, standard computation tests and validation

In previous posts from the chemlambda collection  I wrote about the simulation of various behaviours of living organisms by the artificial chemistry called chemlambda.
There are more to show in this direction, but there is already an accumulation of them:
As the story is told backwards, from present to the past, there will be more about reproduction later.
Now, that is one side of the story: these artificial microbes or molecules manifest life characteristics, stemming from an universal, dumb simple algorithm, which does random rewrites as if the molecule encounters invisible rewriting enzymes.


So, the mystery is not in the algorithm. The algorithm is only a sketchy physics.

But originally this formalism has been invented for computation.


It does pass very well standard computation steps, as well as nonstandard ones (from the point of view of biologists, who perhaps don’t hold enough sophisticated views as to differentiate between boring boolean logic gates and recursive but not primitive recursive functions like the Ackermann function).

In the following animation you see a few seconds screenshot of the computation of a factorial function.


Recall that the factorial is something a bit more sophisticated than a AND boolean gate, but it is primitively recursive, so is less sophisticated than the Ackermann function.


During these few seconds there are about 5-10 rewrites. The whole process has several hundreds of them.

How are they done? Who decides the order? How are criteria satisfied or checked, what is incremented, when does the computation stop?

That is why it is wonderful:

  • the rewrites are random
  • nobody decides the order, there’s no plan
  • there are no criteria to check (like equality of values), there are no values to be incremented or otherwise to be passed
  • the computation stops when there are no possible further rewrites (thus, according to the point of view used with the artificial life forms, the computation stops when the organism dies)

Then, how it works? Everything necessary is in the graphs and the rewrite patterns they show.

It is like in Nature, really. In Nature there is no programmer, no function, no input and output, no higher level. All these are in the eyes of the human observer, who then creates a story which has some predictive power if it is a scientific one.

All I am writing can be validated by anybody wishing to do it. Do not believe me, it is not at all necessary to appeal to authority here.

So I conclude:  this is a system of a simplified world which manifest life like behaviours and universal computation power, in the presence of randomness. In this world there is no plan, there is no control and there are no externally imposed goals.

Very unlike the Industrial Revolution thinking!

This thread of posts can be seen at once in the chemlambda collection

If you want to validate this wordy post yourself then go to the github repository and read the instructions

The live computation of the factorial is here

Which means that if you want to produce another random computation of the factorial then you have to use the file
lisfact_2_mod.mol and to follow the instructions.


It is time to cast doubt on any peer reviewed but not validated research article

Any peer reviewed article which does not come at least with the reviews has only a social validation. With reviews which contain only value judgements, grammar corrections and impossible to validate assertions, there is not much more trust added.

As to the recourse to experts… what are we, a guild of wizards? It is true because somebody says some anonymous experts have  been consulted and they say it’s right or wrong?

Would you take a pill based on the opinion of an anonymous expert that it cures your disease?

Would you fly in a plane whose flight characteristics have been validated by the hear-say of unaccountable anonymous experts?

What is more than laughable is that it seems that mathematics is the field with the most wizards, full of experts who are willingly exchanging private value opinions, but who are reluctant to make them in public.

Case by case, building on concrete examples, in an incremental manner, it is possible to write articles which can be validated by using the means they provide (and any other available), by anyone willing to do it.

It is time to renounce at this wizardry called peer review and to pass to a more rigorous approach.

Hard, but possible. Of course that the wizards will complain. After all they are in material conflict of interests, because they are both goalkeepers and arbiters, both in academic and editorial committees.

But again, why should we be happy with “it’s worthy of publication or not because I say so, but do not mention my name” when there is validation possible?

The wizardry costs money, directed to compliant students, produces no progress elsewhere than in the management metrics, kills or stalls research fields where the advance is made harder than it should because of the mediocrity of these high, but oh so shy in public experts who are where they are because in their young time the world was more welcoming with researchers.



A second opinion on “Slay peer review” article

“It is no good just finding particular instances where peer review has failed because I can point you to specific instances where peer review has been very successful,” she said.
She feared that abandoning peer review would make scientific literature no more reliable than the blogosphere, consisting of an unnavigable mass of articles, most of which were “wrong or misleading”.
This is a quote from one of the most interesting articles I read these days: “Slay peer review ‘sacred cow’, says former BMJ chief” by Paul Jump.
I commented previously about replacing peer-review with validation by reproducibility
but now I want to concentrate on this quote, which, according to the author of the article,  has been made by “Georgina Mace, professor of biodiversity and ecosystems at University College London”.This is the pro argument in favour of the actual peer review system. Opposed to it, and main subject of the article, is”Richard Smith, who edited the BMJ between 1991 and 2004, told the Royal Society’s Future of Scholarly Scientific Communication conference on 20 April that there was no evidence that pre-publication peer review improved papers or detected errors or fraud.”

I am very much convinced by this, but let’s think coldly.

Pro peer review is that a majority of peer reviewed articles is formed by correct articles, while a majority of  “the blogosphere [is] consisting of an unnavigable mass of articles, most of which were “wrong or misleading””.

Contrary to peer review is that, according to “Richard Smith, who edited the BMJ between 1991 and 2004” :

“there was no evidence that pre-publication peer review improved papers or detected errors or fraud.”
“Referring to John Ioannidis’ famous 2005 paper “Why most published research findings are false”, Dr Smith said “most of what is published in journals is just plain wrong or nonsense”. […]
“If peer review was a drug it would never get on the market because we have lots of evidence of its adverse effects and don’t have evidence of its benefit.””
and moreover:

“peer review was too slow, expensive and burdensome on reviewers’ time. It was also biased against innovative papers and was open to abuse by the unscrupulous. He said science would be better off if it abandoned pre-publication peer review entirely and left it to online readers to determine “what matters and what doesn’t”.”

Which I interpret as confidence in the blogosphere-like medium.

Where is the truth? In the middle, as usual.

Here is my opinion, please form yours.

The new medium comes with new, relatively better means to do research. An important part of the research involves communication, and it is clear that the old system is already obsolete. It is kept artificially alive by authority and business interests.

However, it is also true that a majority of productions which are accessible via the new medium are of a very bad quality and unreliable.

To make another comparison, in the continuation of the one about the fall of academic painters and the rise of impressionists
a majority of the work of academic painters was good but not brilliant (reliable but not innovative enough), a majority of non academic painters produce crappy cute paintings which average people LOVE to see and comment about.
You can’t accuse a non affiliated painter that he shows his work in the same venue where you find all the cats, kids, wrinkled old people and cute places.

Science side, we live in a sea of crappy content which is loved by the average people.

The so  called attention economy consists mainly in shuffling this content from a place to another. This is because liking and sharing content is a different activity than creating content. Some new thinking is needed here as well, in order to pass over the old idea of scarce resources which are made available by sharing them.

It is difficult for a researcher, who is a particular species of a creator, to find other people willing to spend time not only to share original ideas (which are not liked because strange, by default), but also to invest  work into understanding it, into validating it, which is akin an act of creation.

That is why I believe that:
– there have to be social incentives for these researchers  (and that attention economy thinking is not helping this, being instead a vector of propagation for big budget PR and lolcats and life wisdom quotes)
– and that the creators of new scientific content have to provide as many as possible means for self-validation of their work.