Transparency is superior to trust

I am fascinated by this quote. I think it’s the most beautiful quote, in its terseness, I’ve seen in a long time. I wish I had invented it!

It is not, though, the motto of Wikileaks; it’s taken from the section on Reproducibility of this Open Science manifesto.

To me, this quote means that validation is superior to peer review.

It is also significant that the quote says nothing about the publishing aspects of Open Science. That is because, I believe, we should split publishing from the discussion about Open Science.

Publishing, scientific publishing I mean, is simply irrelevant at this point. The strong part of Open Science, the new, original idea it brings forth is validation.

Sci-Hub acted as the great leveler, as concerns scientific publication. No interested reader cares, at this point, if an article is hostage behind a paywall or if the author of the article paid money for nothing to a Gold OA publisher.

Scientific publishing is finished. You have to be realistic about this thing.

But science communication is a far greater subject of interest. And validation is one major contribution to a superior scientific method.

Chemlambda for the people (with context)

UPDATE: Here are the associated slides. The context of this is very weird and it still continues as I write this update (Aug 6 2017).

UPDATE 2: With even more details, on Medium: How I became a face model for TED.

In Jan 2017 I was contacted by a curator of TEDGlobal 2017 and asked if I would like to give a talk. I was very surprised and I had split feelings:

(positive) this could boost even more the interest in my proposal of molecular computers based on chemical reactions which mimic interaction nets rewrites (see [Knitting-Crown]-Keep-Calm-And-Use-Rna-For-Interaction-Nets)

(negative) this is a subject which is fundamental science with possible worrying consequences,

(positive) but maybe I could talk about my personal experience with Open Science?

(negative) definitely not the touchy-feely kind, there’s nothing to be happy about yet, the fight for OS is tough and it continues,

so I agreed.

In May 2017 I was told that I would indeed talk at the TED event. Yay, because since January I had realised I could use this talk for several good purposes. Something was fishy though (my feelings are complex, you see?), there was this question in the back of my head: OK, so this is the best public talks organization. We are in 2017. They need 4 months to start organizing? Hm, OK.

And from that point on I fell into a bureaucratic nightmare.

They baptized me BULIGIA for some time, even if my google mail is [...]. They sent me official invitations and other stuff… To me? Nah, to BULIGIA. I sent them something like 5-6 polite mails until they finally figured out what was wrong.

I was joking with my friends: maybe they think I’m Italian. Marius Buligia! Somebody said that “buligia” sounds like the name of a disease. So I’m going to spread the buligia disease in the fancy circles of TED. Funny! So be it.

Then they told me that I’ll have 6 min. … right, what can I do with less than 900 words? I can talk about Open Science. I quickly wrote a draft, gave it the name ringo.txt 😉 and sent it to them.

This draft is now available as Open Science is RWX science. A quote from the original:

“I use everything available to turn this project into Open Science. You name it: old form articles, html and javascript articles, research blog, Github repository, Figshare data repository, Google collection, this 🙂 TED talk”

No, they wanted me to talk about chemlambda. Moreover (and this was one expression they kept repeating, Borg like) GitHub, Figshare, micromanagement, etc are “insider words”. Don’t use insider words.

Well, this is strange because one thing which made me very interested in this talk was the audience. You see, apparently the best thing about a main TED event is the audience in the room. I could see that they fall into one of these categories: founders of the main web or computer services, people rich from less well known but clever businesses involving computers, representatives of investment funds, NY or Silicon Valley intellectuals, and others in a proportion of 10 percent. (And the speakers.)

So maybe a regular editor thinks that it reaches a bigger audience if the talk is made for people using 1500 words, drooling over their keyboard while they look at the talk with empty eyes.

Does not apply here, right?

Moreover the bigger audience is not reached, because presently there are huge audiences for anything interesting. There are so many people on the Net today that you don’t have to go to the barely human level to attract them. No public talks show has a 2 billion audience. If they have several million people interested in a talk, that’s great. And there are always several million intelligent people interested in a public talk on science, computing, biology, whatever, who are bored to death by the stupidity of the generic shows proposed by the regular media.

OK, so what can I do with less than 900 words to explain chemlambda? Images, of course 🙂

I offered them a choice between a second version of the script (more chemlambda, less Open Science) and a bolder one based on the Internet of Smells story.

They picked the first choice and after talking with the science editor (who’s a nice guy), after encountering some more “insider words” remarks, I said what the heck and just sent them a script based on the Internet of Smells.

They liked it a lot! It is basically what became “Chemlambda for the people”.

Great! This took them almost 2 months. (The year is 2017.)

I was very flexible concerning their suggestions not because I was forced to, but because I wanted to take all this as a challenge and also to learn from this interaction.

I prepared the slides, sent them and waited for the first rehearsal. At this point they had my script, which they liked, and my slides, html with high res movies and animations included.

For the rehearsal they proposed to use something professional (they said); I won’t give the name.

Then I had a first rehearsal with the TED team, where they did not use my slides. They asked me to use screen capture from my laptop, so I couldn’t see what they saw. I could barely hear them (and see them) but I imagined that they knew what they were doing.

For those quick to point out that maybe the connection was bad because I was in Romania, think again. Romania, and especially Bucharest, is one of those little places in the world with one of the top speed web connections. Moreover, many people whom I talk to by video know that we can have a decent and alive exchange even if I’m based in Romania. So leave your stupid racism at the door please and continue.

I was very worried inside, something was wrong. I was screaming at them and speaking very slowly, because the conditions I experienced were like wind howling during a storm.

And the movies and animations I had prepared, which they had but did not use? I didn’t know then, but learned later, that the screencast turned them into random renderings at a frequency of 1 per second.

Anyway, after this professional rehearsal they were not happy to learn that I’ll be away for 2 weeks. People with small kids know that you have to make reservations months before the vacation.

Next day I received the edited side of the transmission (so I could see what they saw, hear what they heard).

Finally, just before leaving for the vacation I had a (normal, not professional) video talk with the main curator and the scientific editor, where they said that there was not enough time to prepare the talk more and that they had decided to not let me talk. But I could still participate in the event, in the audience.

That hurt! For those who don’t know me, I’m a professional mathematician. I have given hundreds of public talks, most of them in English or French, and I have given university courses, so I definitely never had a problem with public speaking. My pride is hurt!

Never mind, it’s their show, I said. I’ll think about coming and let them know. Then I mailed them and refused to come just to be in the audience.

During the vacation I started to have doubts about all this. Wait a moment, that was not a rehearsal, that was a setup, or it looked like that. Or maybe it’s my pride, again? Hm…

I was still receiving their general announcements and I saw they made public the list of speakers.

I did not want to link to their site, for privacy reasons, but I could still use a search engine. Surprise! I appear as a speaker on CNN. What’s this?

I went to the CNN site and I was not there. However, Google caches the page seen by the crawler. I got it and … I was there. For some time, then the page was edited.

I saved the cached page. Photos, like this one

Screenshot from 2017-08-06 14:37:06

are still available, see this link (archived version).

OK, so they made a mistake, right? They just sent to CNN the list of speakers (after they took me off), but they forgot to take me off from that list.

Back home I opened the cached html of the CNN page and looked closer. Do you notice something weird here?

Screenshot from 2017-08-05 15:20:05

Yes, I appear twice, haha. At position 33 and then at 44.

OK, mistakes are made all the time, even by the most professional teams. Forget about it. Let’s stop thinking about wtf all this was.

Yesterday, Aug 5, more than 2 weeks after they announced the speakers, I visited the Tedglobal2017 site.

Somebody familiar was looking at me. Yeah, that guy, second row in the middle!

Screenshot from 2017-08-05 15:04:05


Huh? Naah, my face again. Is it? Let’s look closer, using a small window like on a phone or tablet. It gives this (for the original image see this link and the archived version).

Screenshot from 2017-08-05 20:38:40

Yep, that’s me!

Now enjoy the script (slightly modified) and don’t forget to look at the slides.


We can program a computer to do anything. What if we had the same power over the molecules of our bodies? Let’s imagine how this could change our lives.

For example… this version of the scenario [3].

Adam and Eve meet at a party. She likes him. Her sniffer ring can sense Adam’s biomolecules floating in the air between them. One of them triggers a warning. Eve forwards the warning to Adam’s phone.

Back home, Adam files a bug report with his internet slash health provider. The bug report contains his biological ID and the DNA code received by the warning message.

The bug report is opened.

The ID and DNA code are converted to a digital chemistry. Technical staff manipulate this chemistry, as hackers about to debug a program in Neuromancer style.

“still he’d see the matrix in his sleep, bright lattices of logic unfolding across that colorless void”
William Gibson, Neuromancer

“Things like making lists, just, fold up inside themselves. Come out the other way around. Crazy things.”
Pseudo — William Gibson

They find a digital molecule which solves Adam’s problem. A medicine. They convert the solution back to a DNA code which they send to Adam’s router.

The router can turn DNA code back into real biomolecules. Why? It’s a Venter 9000 digital-to-biological converter. Version one looks like this [1].

It is a bit larger than a router, for the moment. But in a few years the 9000 version will be in everybody’s home.

The router emits these biomolecules into Adam’s bedroom. They enter the body and so the bug report is solved, the medicine is delivered and Adam is in perfect health again.

Can we really do this?

I think so. There are 3 steps to take.

Step 1. Build a digital chemistry which we can program. In a digital chemistry, data and programs are all graph-like structures, digital molecules which “fold up inside themselves and come out the other way around”, only they do it randomly, like in real chemistry.

We would create and manipulate digital molecules as if we write programs made from a very few elementary bricks. Then we could simulate their behaviour on a computer, to be sure they work right.

Step 2. Use Nature to simulate this digital chemistry. There’s no computer as powerful as Nature, let’s use it. Find a digital-to-biological dictionary from the elementary bricks of the digital chemistry to real biomolecular bricks.

Step 3. Build digital-to-biological converters and biological-to-digital sensors. Craig Venter gave us the first generic digital-to-biological converter (DBC). Sensors as performant as Eve’s sniffer ring, as a part of the Internet of Things, are possible.

OK, so the program is simple. Let’s do it right away!

Well, I’m not a chemist, I’m a mathematician, and I built a digital chemistry which does work like real chemistry. It is indeed inspired by stuff related to Lisp and Haskell (but goes in wild directions). It is called chemlambda [2], it is an Open Science project and I hope it can be used in reality.

Molecules in chemlambda are graphs made of colored nodes and links between them. The chemical reactions are done by enzymes rewiring small patterns in these graphs.

Chemlambda is Turing universal, meaning that you can translate any computer program into one of these molecules and execute it via random digital chemical reactions.

In my simulations I used things like the Ackermann function or the factorial, but think: any program! You could do anything with Nature’s computer.

More generally, going far outside the small world of computer programs interesting for the neighbourhood programmer, you could design molecules from first principles.

Instead of shooting in the dark by doing many experiments with real world molecules, kind of like a barbarian who finds new uses for the tiny things discovered in a clock workshop, instead of this you could design what you need, then turn it into reality.

Colonize Mars? Deposit all Netflix shows in lichen spores?

Just applications.

Some frightening, of course.

But: understand life at molecular level? What a worthy goal. This may (or may not) help.

All of this works only if step 2 is realized; that is the bottleneck.

I am very willing to try step 2 of the program. I think this can be done by a combination of clever searches in available chemical databases and collaborative work.

After all, chemlambda is an Open Science project. That means it may scale, with luck.


[1] Digital-to-biological converter for on-demand production of biologics, Kent S Boles, Krishna Kannan, John Gill, Martina Felderman, Heather Gouvis, Bolyn Hubby, Kurt I Kamrud, J Craig Venter and Daniel G Gibson
see also Motherboard article

[2] The chemlambda repository README is the entry point to the project.

[3] Internet of Smells

The Library of Alexandra

“Hint: Sci-Hub was created to open papers that are not available online at all. You cannot find these papers in Google or in open access” [tweet by @Sci_Hub]

“Public Resource will make extracts of the Library of Alexandra available shortly, will present the issues to publishers and governments.” [tweet by Carl Malamud]



More experiments with Open Science

I still don’t know which format is best for Open Science. I’m long past the article format, for obvious reasons. Validation is a good word and concept because you don’t have to rely absolutely on the opinions of others, and that’s how the world works. This is not the whole story, though.

I am very fortunate to be a mathematician, not a biologist or biochemist. Still I long for the good format for Open Science, even if, as a mathematician, I don’t have the problems biologists or chemists have, namely loads and loads of experimental data and empirical approaches. I do have a world of my own to experiment with, where I do have loads of data and empirical constructs. My mind, my brain are real and I could understand myself by using tools of chemists and biologists to explore the outcomes of my research. Funny right? I can look at myself from the outside.

That is why I chose to not jump directly to making Hydrogen, but instead to treat the chemlambda world, again, as a guinea pig for Open Science.

There are 427 well written molecules in the chemlambda library of molecules on Github. There are 385 posts in the chemlambda collection on Google+, most of them with animations from simulations of those molecules. It is a world; how big is it?

It is easy to make first a one page direct access to the chemlambda collection. It is more fun to build a phylogenetic tree of the molecules, based on their genes. That’s what I am doing now, based on a work in progress.

Each molecule can be decomposed into “genes”, say by a sequencer program. Then one can use a distance between these genes to estimate first how they cluster and later to make a phylogenetic tree.
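To make the distance step concrete, here is a minimal sketch in Python. It is not one of my custom scripts: it assumes a gene can be written as a plain string, the gene strings below are made-up placeholders, and it uses the textbook Levenshtein edit distance, which may differ in details from the distance behind the heatmap.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimal number of single-symbol
    insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def distance_matrix(items):
    """Symmetric matrix of pairwise edit distances (the data of a heatmap)."""
    n = len(items)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = edit_distance(items[i], items[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical gene strings, for illustration only.
genes = ["LAAF", "LAF", "AFOE", "LAAF"]
matrix = distance_matrix(genes)
```

A heatmap is then just a picture of `matrix`, with one row and one column per item.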

Here is the first heatmap (using the edit distance between single occurrences of genes in molecules) of the 427 molecules.


It is a screenshot, proving that my custom programs work 🙂 (one understands more by writing some scripts than by taking tools ready made from others, at least at this stage of research).

By using the edit distance I can map the explored chemlambda molecules. In the following image the 427 molecules from the library are represented as nodes and for each pair of molecules at an edit distance at most 20 there is a link. The nodes are in a central gravitational field, each node has the same charge and the links between nodes act as springs.


This is a screenshot of the result, showing clusters and the trees connecting them. Not very sophisticated, but enough to give a sense of the explored territory. In the curated collection, such a map would be useful to navigate through the molecules, as well as for giving ideas about which parts are not as well explored. I have not yet made clear which parts of the map cover lambda terms, which cover quines, etc.

Moreover, I see structure! The 427 molecules are made of copies of 605 different linear “genes” (i.e. sticks with colored ends) and 38 ring shaped ones. (It is easy to prove that lambda terms have no rings, when turned into molecules.) There are some interesting curved features visible in the edit distance of the sticks.
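The linking rule behind this map (a link for each pair at edit distance at most 20) can be sketched in a few lines of Python. This is only an illustration under assumptions: the distance matrix is a toy one, the function names are made up, and the sketch finds the clusters as connected components without doing the spring and charge layout.

```python
from collections import deque


def threshold_graph(dist, max_distance=20):
    """Link every pair of nodes whose distance is at most max_distance."""
    n = len(dist)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= max_distance:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def clusters(adjacency):
    """Connected components of the graph, found by breadth-first search."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in sorted(adjacency[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(component))
    return components


# Toy symmetric distance matrix for 4 hypothetical molecules.
toy = [
    [0, 5, 40, 60],
    [5, 0, 18, 55],
    [40, 18, 0, 70],
    [60, 55, 70, 0],
]
graph = threshold_graph(toy)
groups = clusters(graph)
```

In the toy example molecules 0 and 2 are far apart, but they end up in the same cluster because both are close to molecule 1; this is how chains and trees appear in the map.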


They don’t look random enough.

It is clear that a phylogenetic tree is within reach; then what else but to connect the G+ collection posts with the molecules used, arranged along the tree…?

Can I discover which molecules are coming from lambda terms?

Can I discover how my mind worked when building these molecules?

Which are the neglected sides, the blind places?

I hope to be able to tell by the numbers.

Which brings me to the main subject of this post: which is a good format for an Open Science piece of research?

Right now I am in between two variants, which may turn out to not be as different as they seem. An OS research vehicle could be:

  • like a viable living organism, literally
  • or like a viable world, literally.

Only the future will tell which is which. Maybe both!

Chemlambda will be curated

Chemlambda appeared out of frustration that nobody understands and sees what I do, so I had to write it. The same with the chemlambda collection: I’ll curate it and put it in one easy to figure out place. It’s doable. The only fear I have about this is to be sucked again into this highly hallucinatory universe, which is almost real now and will be really real soon.

So for the moment here’s a page which allows you to go directly to any of the chemlambda collection posts.

Maybe this will improve my karma so I’ll be prepared to do pure hydrogen. The initial trials look very promising and despite the apparent simplicity (what? make required mathematics, space, physics and this simple atom, all invoked from abstract nonsense like in a super geometric Lisp) the hydrogen project is more difficult because there is no precedent.

So wish me luck 🙂

Update the Panton Principles please

There is a big contradiction between the text of The Panton Principles and the List of the Recommended Conformant Licenses. It appears that it is intentional, I’ll explain in a moment why I write this.

This contradiction is very bad for the Open Science movement. That is why, please, update your principles.

Here is the evidence.

1. The second of the Panton Principles is:

“2. Many widely recognized licenses are not intended for, and are not appropriate for, data or collections of data. A variety of waivers and licenses that are designed for and appropriate for the treatment of data are described here. Creative Commons licenses (apart from CCZero), GFDL, GPL, BSD, etc are NOT appropriate for data and their use is STRONGLY discouraged.

*Use a recognized waiver or license that is appropriate for data.*”

As you can see, the authors clearly state that “Creative Commons licenses (apart from CCZero) … are NOT appropriate for data and their use is STRONGLY discouraged.”

2. However, if you look at the List of Recommended Licenses, surprise:

Creative Commons Attribution Share-Alike 4.0 (CC-BY-SA-4.0) is recommended.

3. The CC-BY-SA-4.0 is important because it has a very clear anti-DRM part:

“You may not offer or impose any additional or different terms or conditions on, or apply any Effective Technological Measures to, the Licensed Material if doing so restricts exercise of the Licensed Rights by any recipient of the Licensed Material.” [source CC 4.0 licence: in Section 2/Scope/a. Licence grant/5]

4. The anti-DRM is not a “must” in the Open Definition 2.1. Indeed, the Open Definition clearly uses “must” in some places and “may” in other places. See

“2.2.6 Technical Restriction Prohibition

The license may require that distributions of the work remain free of any technical measures that would restrict the exercise of otherwise allowed rights. ”

5. I asked why this is here. Rufus Pollock, one of the authors of The Panton Principles and of the Open Definition 2.1, answered:

“Hi that’s quite simple: that’s about allowing licenses which have anti-DRM clauses. This is one of the few restrictions that an open license can have.”

My reply:

“Thanks Rufus Pollock but to me this looks like allowing as well any DRM clauses. Why don’t include a statement as clear as the one I quoted?”


“Marius: erm how do you read it that way? “The license may prohibit distribution of the work in a manner where technical measures impose restrictions on the exercise of otherwise allowed rights.”

That’s pretty clear: it allows licenses to prohibit DRM stuff – not to allow it. “[Open] Licenses may prohibit …. technical measures …”


“Marius: so are you saying your unhappy because the Definition fails to require that all “open licenses” explicitly prohibit DRM? That would seem a bit of a strong thing to require – its one thing to allow people to do that but its another to require it in every license. Remember the Definition is not a license but a set of principles (a standard if you like) that open works (data, content etc) and open licenses for data and content must conform to.”

I gather from this exchange that indeed the anti-DRM is not one of the main concerns!

6. So, until now, what do we have? Principles and definitions which aim to regulate what Open Data means, but which avoid taking an anti-DRM stance. At the same time they strongly discourage the use of an anti-DRM license like CC-BY-4.0. However, on a page which is not as visible they recommend, among others, CC-BY-4.0.

It is one thing to say: “you may use anti-DRM licenses for Open Data”. It means almost nothing; it’s up to you, not important for them. But they write that all CC licenses except CCZero are bad! Notice that CC0 does not have anything anti-DRM.

Conclusion. This ambiguity has to be settled by the authors. Or not, it is up to them. For me this is a strong signal that we witness one more attempt to tweak a well-intended movement for cloudy purposes.

The Open Definition 2.1. ends with:

Richard Stallman was the first to push the ideals of software freedom which we continue.

You don’t say, really? Maybe it is the moment for a less ambiguous Free Science.

computing with space | open notebook

