Tag Archives: arXiv

The price of publishing with GitHub, Figshare, G+, etc

Three years ago I posted The price of publishing with arXiv. If you look at my arXiv articles then you’ll notice that I have barely posted on arXiv.org since then. Instead I went into territory which is even less recognized as serious by a big part of academia. I used GitHub, Figshare, G+ and the like.

The effects of this choice are displayed up front on my homepage, so go there to read them. (Besides, it is a good exercise to remember how to click on links and use them, that lost art from the age when the internet was free.)

In this post I want to explain the price I paid for these choices and what I now think about them.

First, it is a very stressful way of living. I am not joking: as you know, stress comes from realizing that there are many choices and that one has to choose. Random reward from social media is addictive. The discovery that there is a way out of the situation which keeps us locked into the legacy publishing system (validation). The realization that the problem is not technical but social. A much more cynical view of the undercurrents of the social life of researchers.

The feeling that I can really change the world with my research. The worries that some possible changes might be very dangerous.

The debt I owe concerning the scarcity of my explanations. The effort to show only the aspects I think are relevant, putting aside those which are not. (Btw, if you look at my About page then you’ll read “This blog contains ideas from the future”. It is true because I have already pruned the 99% of the paths leading nowhere interesting.)

The desire to go much deeper, the desire to explain once again what and why, to people who seem either to lack long-term attention or to hold shallow pet theories.

It is like fishing for Moby Dick.

Which side do you take: ASAPbio or Alphabet?

Times are changing fast and old time thinking dies hard and ugly. We have winners and losers.

Winners side: the #ASAPbio hashtag signals that biologists are ready to adopt the arXiv model of research communication.

“On Feb. 29, Carol Greider of Johns Hopkins University became the third Nobel Prize laureate biologist in a month to do something long considered taboo among biomedical researchers: She posted a report of her recent discoveries to a publicly accessible website, bioRxiv, before submitting it to a scholarly journal to review for “official’’ publication.” [source: NYTimes]

OK, many people are still confused about “preprints”, mainly because the fake open movement Gold OA etched the preprint-postprint distinction into the brains of honest researchers looking for Internet Age ways of communication. The goal was, without doubt, to cast a shadow over the well known arXiv model (which used the name “eprints” before Gold OA was a thing), and to preserve for a while longer the obsolete print business (as witnessed by their immediate association with the legacy publishers against Sci-Hub). But now, with bioRxiv, there seems to be a historical shift.

Biologists are already leaders in imagining ways to share and validate big volumes of data. They are certainly aware that sharing all data, aka Open Science, is a necessary part of the scientific method.

The scientific publishing industry, a largely useless business in the Internet Age, is the only loser in this story.

Losers side: Alphabet, the parent company of Google, sells Boston Dynamics. To be clear, BD is one of those research groups which do high-impact, hard research. According to Bloomberg:

“Executives at Google parent Alphabet Inc., absorbed with making sure all the various companies under its corporate umbrella have plans to generate real revenue, concluded that Boston Dynamics isn’t likely to produce a marketable product in the next few years and have put the unit up for sale, according to two people familiar with the company’s plans.”

I am sure that there are all sorts of reasons for this move. Short term reasons.

Why is this important for researchers? Because it shows that it is time to seriously acknowledge that, technically:

  • we need more data and access to research data, not more filters and bottlenecks in the way of research communication
  • and research data on the Internet is a particular kind of big data. It is not even very big compared with the really big data collected, archived and used today by big commercial companies. It is technically possible for it to be managed by those who are the most interested, the researchers. This is not a new idea. Besides arXiv, and now bioRxiv, see for example Bjorn Brembs' calls for a modern scientific infrastructure.
  • Big commercial companies are not reliable for that. At any point they might dump us. Social media venues for research news and discussions are great, but the infrastructure is not to be trusted when it comes to the management of research data.



ArXiv is 3 times bigger than all megajournals taken together

How big are the “megajournals” compared to arXiv?
I use data from the article

[1] Have the “mega-journals” reached the limits to growth? by Bo-Christer Björk https://dx.doi.org/10.7717/peerj.981 , table 3

and the arXiv monthly submission rates

[2] http://arxiv.org/stats/monthly_submissions

To have a clear comparison I shall look at the window 2010-2014.

Before showing the numbers, there are some things to add.

1.  I saw the article [1] via the post by +Mike Taylor

[3] Have we reached Peak Megajournal? http://svpow.com/2015/05/29/have-we-reached-peak-megajournal/

I invite you to read it, it is interesting as usual.

2. Usually, counting articles is that dumb thing which managers hide behind in order not to be accountable for their decisions.
Counting articles is a very lossy compression technique, which associates to an article a very small number of bits.
I indulged in this activity because of the discussions from the G+ post

[4] https://plus.google.com/+MariusBuliga/posts/efzia2KxVzo

and its clone

[4′] Eisen’ “parasitic green OA” is the apt name for Harnad’ flawed definition of green OA, but all that is old timers disputes, the future is here and different than both green and gold OA https://chorasimilarity.wordpress.com/2015/05/28/eisen-parasitic-green-oa-is-the-apt-name-for-harnad-flawed-definition-of-green-oa-but-all-that-is-old-timers-disputes-the-future-is-here-and-different-than-both-green-and-gold-oa/

These discussions made me realize that the arXiv model is carefully edited out from reality by the creators and core supporters of green OA and gold OA.

[see more about this in the G+ variant of the post https://plus.google.com/+MariusBuliga/posts/RY8wSk3wA3c ]

Now, let’s see those numbers. Just how big is that arXiv thing compared to “megajournals”?

From [1] the total number of articles per year for “megajournals” is

2010: 6,913
2011: 14,521
2012: 25,923
2013: 37,525
2014: 37,794
2015: 33,872

(for 2015 the number represents  “the articles published in the first quarter of the year multiplied by four” [1])

ArXiv: (based on counting the monthly submissions listed in [2])

2010: 70,131
2011: 76,578
2012: 84,603
2013: 92,641
2014: 97,517
2015: 100,628 (by the same procedure as in [1])

This shows that arXiv is 3 times bigger than all the megajournals taken together, despite the fact that:
– it is not a publisher
– it does not ask for APCs
– it covers fields far less attractive and prolific than those covered by the megajournals.

And that is because:
– arXiv answers a real demand from researchers: to communicate their work quickly and reliably to their fellows, in a way which respects their authorship
– it is also a show of support for what most of them think “green OA” is, namely putting their work where it is out of reach of the publishers’ locks.
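
For those who want to check the arithmetic behind the "3 times" claim, here is a minimal sketch in Python. It only uses the yearly totals listed above, restricted to the 2010-2014 window (the 2015 figures are extrapolations, so they are left out). The variable and function names are mine, chosen for illustration.

```python
# Yearly article counts copied from the two lists above (2010-2014 window).
# The names below are illustrative, not taken from [1] or [2].
megajournals = {2010: 6913, 2011: 14521, 2012: 25923, 2013: 37525, 2014: 37794}
arxiv = {2010: 70131, 2011: 76578, 2012: 84603, 2013: 92641, 2014: 97517}


def window_ratio(a, b):
    """Ratio of the total counts of a to the total counts of b over all years."""
    return sum(a.values()) / sum(b.values())


def yearly_ratios(a, b):
    """Ratio a/b for each year present in a (b must contain the same years)."""
    return {year: a[year] / b[year] for year in sorted(a)}


if __name__ == "__main__":
    print("megajournals total:", sum(megajournals.values()))
    print("arXiv total:", sum(arxiv.values()))
    print("ratio over the window: %.2f" % window_ratio(arxiv, megajournals))
    for year, ratio in yearly_ratios(arxiv, megajournals).items():
        print(year, "%.2f" % ratio)
```

Over the whole window the ratio comes out at about 3.4. The year-by-year ratios fall from about 10 in 2010 to about 2.6 in 2014, so the gap was narrowing while the megajournals were still growing.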


Github laudatio: negative Coase cost

This is a record of a mind changing experience I had with Github, one which will manifest in the months to come. (Well, it’s time to move on, to move further, new experiences await, I like to do this…)

There is nothing more here than what is in this ephemeral Google+ post, but it is enough to get the idea.

And it’s controversial, although obvious.

“I  just got hooked by github.io . Has everything, is a dream came true. Publishing? arXiv? pfff…. I know, everybody knows this already, let me enjoy the thought, for the moment. Then it will be some action.

Continuing with github and publishing, this is a worthy subject (although I believe that practically github already dwarfed legacy publishing, academia and arXiv). Here is an excerpt from a post from 2011
“- Publishing is central to Academia, but its publishing system is outclassed by what Open Source software developers have in GitHub

– GitHub’s success is not just about openness, but also a prestige economy that rewards valuable content producers with credit and attention

– Open Science efforts like arXiv and PLoS ONE should follow GitHub’s lead and embrace the social web”

I am aware about the many efforts about publishing via github, I only wonder if that’s not like putting a horse in front of a rocket.

On the other side, there is so much to do, now that I feel I’ve seen rock solid proof that academia, publishing and all that jazz is walking dead, with the last drops of arterial blood splatting around from the headless body. “


Negative Coase cost?



The price of publishing with arXiv

This is a very personal post. It is emotionally triggered by looking at this old question, Downsides of using the arXiv, and by reading the recent The coming Calculus MOOC Revolution and the end of math research.

What I think? That a more realistic reason for a possible end (read: shrinking) of math research comes from thinking that there are any downsides of using the arXiv. That there are any downsides of using an open peer review system. It comes from those who are moderately in favour of open research until they sit on a committee or until it comes to protecting their own little church from strange ideas.

And from others, an army of good but not especially creative researchers, a high mediocracy (high because selected, however) who will probably sink research for a time, because in the long term a lot of mediocre research results add up to noise. But in the short term, this is a very good business: write many mediocre, correct articles, hide them behind a paywall and influence the research policy to favour the number (and not the content) of those.

What I think is that it will happen exactly as it happened with the academic painters, a while ago.

You know that I’m right.

Now, because the net is not subtle enough, let me say it plainly: these people are right, from a social point of view, that there is a price for not behaving as they expect. So indulge me while I explain the price I paid for using the arXiv as my principal means of publication.

The advantage: I had a lot of fun. I wrote articles which contain more than one idea, or which use more than one field of research. I wrote articles on subjects which genuinely interest me, or articles which contain more questions than answers. I wrote articles which were not especially designed to solve problems, but to open them. I changed fields about once every 3-4 years.

The price: I was told that I don’t have enough published articles. I lost a lot of citations, either because the citation was incorrectly done, or because the databases (like ISI) don’t count them well (not that I care, really). Because I change fields (for those who know me, it’s clear that I don’t do this randomly, but because there are connections between fields) I seem to come from nowhere and go nowhere. Socially and professionally, it is very bad for one’s career to do what I did. Most of the articles I sent for publication (to legacy publishers) spent incredible amounts of time there, and most of the refusals were of the type “seems OK but maybe another journal” or “is OK but our journal …”. I am incredibly unlucky (i.e. statistically incredible under the null hypothesis) when it comes to publishing in legacy journals.

But, let me stress this, I survived. And I still have lots of ideas, better than before, and I’m using dissemination tools (like this blog) and I am still having a lot of fun.

So, it’s your choice: recall why you started to do research, what dreams you had. I don’t believe that you dreamed, as a kid, of writing a lot of ISI papers about a lot of arcane problems of others, in order to attract grant financing from bureaucrats who count your social influence.