Tag Archives: cost of knowledge

Which side do you take: ASAPbio or Alphabet?

Times are changing fast, and old-time thinking dies hard and ugly. We have winners and losers.

Winners' side: the #ASAPbio hashtag signals that biologists are ready to adopt the arXiv model of research communication.

“On Feb. 29, Carol Greider of Johns Hopkins University became the third Nobel Prize laureate biologist in a month to do something long considered taboo among biomedical researchers: She posted a report of her recent discoveries to a publicly accessible website, bioRxiv, before submitting it to a scholarly journal to review for “official’’ publication.” [source: NYTimes]

OK, many people are still confused about “preprints”, mainly because the fake open movement, Gold OA, enshrined this preprint-postprint distinction in the brains of honest researchers looking for Internet Age ways of communication. The goal was, without doubt, to throw shade over the well-known arXiv model (which used the name “eprints” before Gold OA was a thing), and to preserve for a while longer the obsolete print business (as witnessed by their immediate association with the legacy publishers against Sci-Hub). But now, with bioRxiv, there seems to be a historical shift.

Biologists are already leaders in imagining ways to share and validate big volumes of data. They are certainly aware that sharing all data, aka Open Science, is a necessary part of the scientific method.

The scientific publishing industry, a largely useless business in the Internet Age, is the only loser in this story.

Losers' side: Alphabet, the parent company of Google, is selling Boston Dynamics. To be clear, BD is one of those research groups that does hard, real-impact research. According to Bloomberg:

“Executives at Google parent Alphabet Inc., absorbed with making sure all the various companies under its corporate umbrella have plans to generate real revenue, concluded that Boston Dynamics isn’t likely to produce a marketable product in the next few years and have put the unit up for sale, according to two people familiar with the company’s plans.”

I am sure that there are all sorts of reasons for this move. Short-term reasons.

Why is this important for researchers? Because it shows that it is time to seriously acknowledge that, technically:

  • we need more data and access to research data, not more filters and bottlenecks in the way of research communication
  • and research data on the Internet is a particular kind of big data. Not even very big compared with the really big data collected, archived and used today by big commercial companies. It is technically possible for it to be managed by those who are the most interested, the researchers. This is not a new idea. Besides arXiv, and now bioRxiv, see for example Bjorn Brembs' calls for a modern scientific infrastructure.
  • Big commercial companies are not reliable for that. At any point they might dump us. Social media venues for research news and discussions are great, but the infrastructure is not to be trusted when it comes to the management of research data.

 

 

Sci-Hub is not tiny, nor special interest

“Last year, the tiny special-interest academic-paper search-engine Sci-Hub was trundling along in the shadows, unnoticed by almost everyone.” [source: SV-POW!, Barbra Streisand, Elsevier, and Sci-Hub]

According to the info available in the article Meet the Robin Hood of science, by Simon Oxenham:

[Sci-Hub] “works in two stages, firstly by attempting to download a copy from the LibGen database of pirated content, which opened its doors to academic papers in 2012 and now contains over 48 million scientific papers.”

“The ingenious part of the system is that if LibGen does not already have a copy of the paper, Sci-hub bypasses the journal paywall in real time by using access keys donated by academics lucky enough to study at institutions with an adequate range of subscriptions. This allows Sci-Hub to route the user straight to the paper through publishers such as JSTOR, Springer, Sage, and Elsevier. After delivering the paper to the user within seconds, Sci-Hub donates a copy of the paper to LibGen for good measure, where it will be stored forever, accessible by everyone and anyone. ”

“As the number of papers in the LibGen database expands, the frequency with which Sci-Hub has to dip into publishers’ repositories falls and consequently the risk of Sci-Hub triggering its alarm bells becomes ever smaller. Elbakyan explains, “We have already downloaded most paywalled articles to the library … we have almost everything!” This may well be no exaggeration.”
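
The flow described in these quotes is, in software terms, a cache-aside pattern: check the archive first, fetch through donated access only on a miss, then write the copy back. Here is a minimal JavaScript sketch of that logic; the "archive" is just an in-memory Map and the fetch is a stub, so this is only an illustration of the described flow, not Sci-Hub's actual code.

```javascript
// Sketch of the two-stage flow described above: check the archive first,
// fetch through donated access only on a miss, then donate the copy back.
// The archive is an in-memory Map and the fetch is a stand-in stub.
const archive = new Map();                            // plays the role of LibGen

async function fetchThroughDonatedAccess(doi) {       // stub for the real-time fetch
  return `pdf bytes for ${doi}`;
}

async function getPaper(doi) {
  if (archive.has(doi)) return archive.get(doi);      // stage 1: archive hit
  const paper = await fetchThroughDonatedAccess(doi); // stage 2: paywall fetch
  archive.set(doi, paper);                            // store the copy for everyone
  return paper;
}
```

The point of the write-back step is the one Elbakyan makes above: the more the archive grows, the less often the second stage is needed.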

Is that tiny? I don't think so. I have at hand the comparisons I made in ArXiv is 3 times bigger than all megajournals taken together and, if we trust the publicly available numbers (a quick back-of-the-envelope calculation follows the list), then:

  • Sci-Hub is tiny
  • arXiv.org is minuscule with about 1/40 of what (is declared as) available in Sci-Hub
  • all the gold OA journals have no more than 1/100 of the “tiny” baseline, therefore they are, taken together, infinitesimal
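
Taking the quoted figure of about 48 million papers as the baseline, the stated fractions translate into rough absolute sizes. This is only a back-of-the-envelope sketch from the fractions above, not exact counts:

```javascript
// Rough sizes implied by the fractions in the list above (not exact counts).
const sciHub = 48e6;          // ~48 million papers, the quoted figure
const arxiv  = sciHub / 40;   // ~1.2 million, the "minuscule" arXiv
const goldOA = sciHub / 100;  // ~0.48 million, all gold OA journals together
console.log({ sciHub, arxiv, goldOA });
```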

Do I feel a dash of envy? A subtle spin in favor of gold OA? Maybe because Alexandra Elbakyan is from Kazakhstan? More likely it is only an unfortunate formulation, but the thing is that if this info is true, then it's huge.

UPDATE: Putting aside all legal aspects, where I'm not competent to have an opinion, it appears that the collection of 48 million paywalled articles is the result of the collective behaviour of individuals who “donated” (or whatever the correct word is) them.

My opinion is that this collective behaviour shows a massive vote against the system. It is not even intended to be a vote; people (i.e. individual researchers) just help one another. Compare this behaviour with that of academic managers and of all kinds of institutions which a) manage public funds and negotiate prices with publishers, b) use metrics based on commercial publishers to distribute public funds as grants and promotions.

On one side there is the reality of individual researchers, who create and want to read what others like them created (from public funds, basically), and on the other side there is this system in academia which rewards compliance with an obsolete medium of dissemination of knowledge (presently turned upside down and replaced with a system which puts paywalls around the research articles, it's amazing).

Of course, I am not discussing here whether Sci-Hub is legal, or whether commercial publishers are doing anything wrong from a legal point of view.

All this seems to me very close to the disconnection between politicians and regular people. These academic managers are like politicians now: the system ignores that it is possible to gauge the real opinion of people almost in real time, and instead pretends that everything is OK, on paper.

 

____________________

How I hit a wall when I used open access and open source practices while applying for a job

UPDATE 11.10.2015. What happened since the beginning of the “contest”? Nothing. My guess is that they are going to follow the exact literal sense of their announcement. It is a classic sign of cronyism. They write 3 times that they are going to judge according to the file submitted (the activity of the candidate as it looks from the file), but they give no criteria other than the ones from an old law. In my case I satisfy these criteria, of course, but later on they write about “candidates considered eligible”, which literally means candidates that an anonymous board considers eligible, and not simply candidates eligible according to the mentioned criteria.

Conclusion: this is not news, it is dog bites man.

I may be wrong. But in case I'm right, then the main subject (namely, what happens in a real situation with open access practices in the case of a job opening) looks like a frivolous, alien complaint.

The split between:
– a healthy, imaginative, forward-looking community of individuals and
– a Kafkaesque old world of bureaucratic cronies
is growing bigger here in my country.

__________

UPDATE 14.10.2015: Suppositions confirmed. The results have been announced today, only verbally; the rest is shrouded in mystery. Absolutely no surprise. Indeed, faced with the reality of local management, my comments about open access and open source practices are like talking about a TV show to cavemen.

Not news.

There is a statement I want to make, for those who read this and have access to info about Romanians only from the media, which is, sadly, almost entirely negative.

It would be misleading to judge the local mathematicians (or other creative people, say) from these sources. There is nothing wrong with many Romanian people. On the contrary, these practices, which show textbook signs of corruption, are typical of the managers of state institutions in this country. They are the ones to blame. What you see in the media is the effect of the usual handshake between bad leadership and poverty.

This sadly manifests itself everywhere in the state institutions of Romania, in ways far beyond the ridiculous.

So next time you interact with one such manager, don't forget who they are and what they are really doing.

I am not going to pursue a crusade against corruption in Romania, because I have better things to do. Maybe I'm wrong and what is missing is more people doing exactly this. But the effect of corrupt practices is that the state institution becomes weaker and weaker. So, for psychohistorical reasons 🙂 there is no need to fight dying institutions.

Let’s look to the future, let’s do interesting stuff!

________________________

This is real: there are job openings at the Institute of Mathematics of the Romanian Academy, announced in the pdf file

http://www.imar.ro/~cjoita/IMAR/Concurs-anunt-2015.pdf

The announcement is in Romanian, but you may notice that they refer to a law from 2003, which asks for a CV, research memoir, list of publications and ten documents, from kindergarten to PhD. On paper.

That is only the ridiculous side of bureaucracy, but the real problems were elsewhere.

There is no mention of selection criteria or of the members of the committee, but it is written 3 times in the announcement that every candidate's work will be considered only as it appears from the file submitted.

They also ask that the scientific part of the submission, so to speak, be sent by email to two addresses which you can find in the announcement.

So I did all the work and I hit a wall when I submitted by email.

I sent them the following links:

– my homepage which has all the info needed (including links to all relevant work)
http://imar.ro/~mbuliga/

– link to my arxiv articles
http://arxiv.org/a/buliga_m_1
because all my published articles (and all my cited articles, published or not) are available at arXiv

– link to the chemlambda repository for the programming, demos, etc part
https://github.com/chorasimilarity/chemlambda-gui/blob/gh-pages/dynamic/README.md

I was satisfied that I had finished this, when I got a message from DanTimotin@imar.ro telling me that I have to send them, as attachments, the pdf files of at least 5 relevant articles.

In the paper file I put 20+ of these articles (selected from 60+), but they also wanted the pdf files.

I don't have the pdfs of many of the legacy-published articles because they are useless for open access: you can't distribute them publicly.
Moreover, I keep the relevant work I do as open as possible.

Finally, how could I send the content of the GitHub repository? Or the demos?

So I replied by protesting the artificial difference he makes between a link and the content available at that link, and I sent a selection of 20 articles with links to their arXiv versions.

He replied with a message announcing that if I want my submission to be considered then I have to send 5 pdfs attached.

I visited Dan Timotin in person to talk and to understand why a link is different from the content available at that link.

He told me that these are the rules.

He told me that he is going to send the pdfs to the members of the committees, and that it might happen that they don't have access to the net when they look at the work of the candidate.

He told me that they can’t be sure that the arXiv version is the same as the published version.

He had nothing to say about the programming/demo/animations part.

He told me that nobody will read the paper file.

I asked if he was OK with me making this weird practice public, and he agreed.

Going back to my office, I managed to find 9 pdfs of the published articles. In many other cases my institute does not have a subscription to the journals where my articles appeared, so I don't think it is fair to be asked to buy back my own work, only because of the whims of one person.

Therefore I sent Dan Timotin a last message in which I attached these 9 pdfs, explained that I can't access the others, and firmly demanded that all the links sent previously be forwarded to the (mysterious, anonymous, net-deprived, and lacking public criteria) committee, otherwise I would consider this an abuse.

I wrote that I regret this useless discussion, provoked by the lack of transparency and by the hiding behind an old law, which should not stop a committee of mathematicians from judging the work of a candidate as it is, and not as it appears after an abuse of filtering.

After a couple of hours he replied that he would send the files and the links to the members of the committee.

I have to believe his word.

That is what happens, in practice, with open access and open science, at least in some places.

What could be done?

Should I wait for the last bureaucrat to stop passively supporting the publishing industry by actively opposing open access practices?

Should I wait for all politicians to get fake PhDs under the supervision of a very complacent local Academia?

Should I feel ashamed of being abused?

ArXiv is 3 times bigger than all megajournals taken together

 How big are the “megajournals” compared to arXiv?
I use data from the article

[1] Have the “mega-journals” reached the limits to growth? by Bo-Christer Björk, https://dx.doi.org/10.7717/peerj.981, table 3

and the arXiv monthly submission rates

[2] http://arxiv.org/stats/monthly_submissions

To have a clear comparison I shall look at the window 2010-2014.

Before showing the numbers, there are some things to add.

1.  I saw the article [1] via the post by +Mike Taylor

[3] Have we reached Peak Megajournal? http://svpow.com/2015/05/29/have-we-reached-peak-megajournal/

I invite you to read it, it is interesting as usual.

2. Usually, the activity of counting articles is that dumb thing which managers hide behind in order not to be accountable for their decisions.
Counting articles is a very lossy compression technique, which associates to an article a very small number of bits.
I indulged in this activity because of the discussions in the G+ post

[4] https://plus.google.com/+MariusBuliga/posts/efzia2KxVzo

and its clone

[4′] Eisen’ “parasitic green OA” is the apt name for Harnad’ flawed definition of green OA, but all that is old timers disputes, the future is here and different than both green and gold OA https://chorasimilarity.wordpress.com/2015/05/28/eisen-parasitic-green-oa-is-the-apt-name-for-harnad-flawed-definition-of-green-oa-but-all-that-is-old-timers-disputes-the-future-is-here-and-different-than-both-green-and-gold-oa/

These discussions made me realize that the arXiv model is carefully edited out of reality by the creators and core supporters of green OA and gold OA.

[see more about this in the G+ variant of the post https://plus.google.com/+MariusBuliga/posts/RY8wSk3wA3c ]

Now, let's see those numbers. Just how big is that arXiv thing compared to the “megajournals”?

From [1]  the total number of articles per year for “megajournals” is

2010:  6,913
2011:  14,521
2012:   25,923
2013:  37,525
2014:  37,794
2015:  33,872

(for 2015 the number represents  “the articles published in the first quarter of the year multiplied by four” [1])

ArXiv: (based on counting the monthly submissions listed in [2])

2010: 70,131
2011: 76,578
2012: 84,603
2013: 92,641
2014:  97,517
2015:  100,628  (by the same procedure as in [1])
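
To make the comparison explicit, here is the sum over the common 2010-2014 window and the resulting ratio, computed directly from the figures listed above:

```javascript
// Totals over the common 2010-2014 window, from the figures listed above.
const megajournals = [6913, 14521, 25923, 37525, 37794];  // from [1], table 3
const arxiv        = [70131, 76578, 84603, 92641, 97517]; // from [2]

const sum = xs => xs.reduce((a, b) => a + b, 0);
console.log(sum(megajournals));              // 122676
console.log(sum(arxiv));                     // 421470
console.log(sum(arxiv) / sum(megajournals)); // ≈ 3.4
```

The ratio over 2010-2014 comes out at roughly 3.4, which is where the “3 times” in the title comes from.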

This shows that arXiv is about 3 times bigger than all the megajournals taken together, despite the fact that:
– it is not a publisher
– it does not ask for APCs
– it covers fields far less attractive and prolific than the megajournals.

And that is because:
– arXiv answers a real demand from researchers: to communicate their work fast and reliably to their fellows, in a way which respects their authorship
– it is also a reaction of support for what most of them think of as “green OA”, namely putting their work where it is away from the publishers' locks.

_____________________________________

Eisen's “parasitic green OA” is the apt name for Harnad's flawed definition of green OA, but all that is old-timers' disputes; the future is here and different from both green and gold OA

See this post and the replies on G+ at https://plus.google.com/+MariusBuliga/posts/efzia2KxVzo.

My short description of the situation: the future is here, and it is not gold OA (nor the flawed green OA definition which ignores arXiv). So, visually:

[image: a modified version of an image from a post by +Ivan Pierre, linked at the end]

It has never occurred to me that putting an article in a visible place (like arXiv.org) is parasitic green OA. +Michael B. Eisen calls it parasitic because he supposes that this has to come along with the real publication. But what if not?

[Added: Eisen writes in the body of the post that he uses Harnad's definition of green OA, which ignores the reality. It is very convenient for gold OA to have a definition of green OA which does not apply to the oldest (1991) and fully functional example of a research communication experiment which is OA and green: arXiv.org.]

Then, compared to that, gold OA appears as progress.
http://www.michaeleisen.org/blog/?p=1710

I think gold OA, in the best of cases, is a waste of money.

A more future-oriented reply comes from +Mike Taylor
http://svpow.com/2015/05/26/green-and-gold-the-possible-futures-of-open-access/
who sees two possible futures, green (without the assumption from Eisen's post) and gold.

I think that the future comes faster. It is already here.

Relax. Try validation instead of peer review. It is more scientific.

Definition. Peer-reviewed article: published by the man who saw the man who claims to have read it, but does not back the claim with his name.

The reviewers are not supermen. They use the information from the traditional article. The only thing they are supposed to do is read it. That is what they base their stamp of approval on.

Validation means that the article provides enough means for the readers to reproduce the research by themselves. This is almost impossible with an article in the format inherited from the time when it was printed on paper. But when the article is replaced by a program which runs in the browser, which uses databases, simulations, whatever means facilitate the validation, then the reader can, if he so wishes, form a scientifically motivated opinion about it.

Practically, the future has come already and we see it on GitHub. Today. Non-exclusively. Tomorrow? Who knows?

Going back to the green-gold OA dispute, and to Elsevier's recent change of its policy on sharing and hosting articles (which of course should have been the real subject of discussion, instead of waxing poetic about OA, only a straw man).

This is not even interesting. The discussion about OA revolves around who has the copyright and who pays (for nothing).

I would be curious to see discussions about DRM; who cares who has the copyright?

But then I realised that, as I wrote at the beginning of the post, the future is here.

Here to invent it. Open for everybody.

I took the image from this post by +Ivan Pierre and modified the text.
https://plus.google.com/+IvanPierreKilroySoft/posts/BiPbePuHxiH

_____________

Don’t forget to read the replies from the G+ post.

____________________________________________________

Screen recording of the reading experience of an article which runs in the browser

The title probably needs parsing:

SCREEN RECORDING {
  READING {
    PROGRAM EXECUTION {
      RESEARCH ARTICLE }}}

An article which runs in the browser is a program (e.g. html and javascript) which is executed by the browser. The reader has access to the article as a program, to the data and other programs which have been used for producing the article, and to all the other articles which are cited.

The reader becomes the reviewer. The reader can validate, if he wishes, any piece of research which is communicated in the article.

The reader can see or interact with the research communicated. By having access to the data and programs which have been used, the reader can produce other instances of the same research (i.e. virtual experiments).

In the case of the article presented as an example, embedded in the article are animations of an Ackermann function computation and of the building of a molecular structure. These are produced by an algorithm which has randomness in its composition, therefore the reader may produce OTHER instances of these examples, which may or may not be consistent with the text of the article. The reader may change parameters or produce completely new virtual experiments, or use the programs as part of the toolbox for another piece of research.
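
As a toy illustration of the shape of this interaction (and nothing more: this is not the actual chemlambda code), an article page can expose its virtual experiment as an ordinary JavaScript function with explicit parameters, callable from the browser console:

```javascript
// Toy sketch of an "article as program": the virtual experiment is an
// ordinary function with explicit parameters, callable from the console.
function runExperiment({ steps = 1000, bias = 0.5 } = {}) {
  // Stand-in "experiment": a random walk whose outcome differs from run to
  // run, just as the randomized graph rewrites produce different animations.
  let state = 0;
  for (let i = 0; i < steps; i++) {
    state += Math.random() < bias ? 1 : -1;   // one random rewrite step
  }
  return state;                               // the reader inspects or reuses this
}

// The reader can re-run the experiment, or modify it:
console.log(runExperiment());                           // another instance, same setup
console.log(runExperiment({ steps: 5000, bias: 0.6 })); // a changed experiment
```

In the actual article the embedded animations and the linked programs play this role; the sketch only shows how a reader can rerun, vary, and reuse an experiment instead of merely reading about it.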

The experience of the reader is therefore:

  • unique, because of the complete freedom to browse, validate, produce, experiment
  • not limited to reading
  • active, not passive
  • leading to trust, in the sense that the reader does not have to rely on hearsay from anonymous reviewers

In the following video there is a screen recording of these possibilities, done for the article

M. Buliga, Molecular computers, 2015, http://chorasimilarity.github.io/chemlambda-gui/dynamic/molecular.html

This is the future of research communication.

____________________________________________________________

It is time to cast doubt on any peer reviewed but not validated research article

Any peer-reviewed article which does not come with at least the reviews has only a social validation. With reviews which contain only value judgements, grammar corrections and impossible-to-validate assertions, not much more trust is added.

As to the recourse to experts… what are we, a guild of wizards? Is it true because somebody says some anonymous experts have been consulted and they say it's right or wrong?

Would you take a pill based on the opinion of an anonymous expert that it cures your disease?

Would you fly in a plane whose flight characteristics have been validated by the hearsay of unaccountable anonymous experts?

What is more than laughable is that mathematics seems to be the field with the most wizards, full of experts who willingly exchange value opinions in private, but who are reluctant to make them in public.

Case by case, building on concrete examples, in an incremental manner, it is possible to write articles which can be validated, by anyone willing to do it, using the means they provide (and any others available).

It is time to renounce this wizardry called peer review and to pass to a more rigorous approach.

Hard, but possible. Of course the wizards will complain. After all, they are in a material conflict of interest, because they are both goalkeepers and arbiters, in both academic and editorial committees.

But again, why should we be happy with “it's worthy of publication or not because I say so, but do not mention my name” when validation is possible?

The wizardry costs money directed to compliant students, produces no progress other than in management metrics, and kills or stalls research fields, where advancement is made harder than it should be by the mediocrity of these lofty, but oh so shy in public, experts, who are where they are because in their youth the world was more welcoming to researchers.

Enough!

_____________________________________________________________