(information from the Data CampL Jupyter And R Markdown)
In his talk, J.J Allaire, confirms that the efforts in R
itself for reproducible research, the efforts of Emacs
to combine text code and input, the Pandoc
, Markdown
and knitr
projects, and computational notebooks have been evolving in parallel and influencing each other for a lot of years. He confirms that all of these factors have eventually led to the creation and development of notebooks for R
.
Firstly, computational notebooks have quite a history: since the late 80s, when Mathematica’s front end was released, there have been a lot of advancements. In 2001, Fernando Pérez started developing IPython, but only in 2011 the team released the 0.12 version of IPython was realized. The SageMath project began in 2004. After that, there have been many notebooks. The most notable ones for the data science community are the Beaker (2013), Jupyter (2014) and Apache Zeppelin (2015).
Then, there are also the markup languages and text editors that have influenced the creation of RStudio’s notebook application, namely, Emacs, Markdown, and Pandoc. Org-mode was released in 2003. It’s an editing and organizing mode for notes, planning and authoring in the free software text editor Emacs. Six years later, Emacs org-R was there to provide support for R users. Markdown, on the other hand, was released in 2004 as a markup language that allows you to format your plain text in such a way that it can be converted to HTML or other formats. Fast forward another couple of years, and Pandoc was released. It’s a writing tool and as a basis for publishing workflows.
Lastly, the efforts of the R community to make sure that research can be reproducible and transparent have also contributed to the rise of a notebook for R. 2002, Sweave was introduced in 2002 to allow the embedding of R code within LaTeX documents to generate PDF files. These pdf files combined the narrative and analysis, graphics, code, and the results of computations. Ten years later, knitr was developed to solve long-standing problems in Sweave and to combine features that were present in other add-on packages into one single package. It’s a transparent engine for dynamic report generation in R. Knitr allows any input languages and any output markup languages.
Also in 2012, R Markdown was created as a variant of Markdown that can embed R code chunks and that can be used with knitr to create reproducible web-based reports. The big advantage was and still is that it isn’t necessary anymore to use LaTex, which has a learning curve to learn and use. The syntax of R Markdown is very similar to the regular Markdown syntax but does have some tweaks to it, as you can include, for example, LaTex equations.
The OSF gives you free accounts for collaboration around files and other research artifacts.
Create an account and log in at: http://osf.io/
Each account can have up to 5 GB of files without any problem, and it remains private until you make it public.
You can add other authorized users as well.
Once you are ready to make the project public, you can do so in two ways:
You can also cut DOIs for resources so that you can put them in papers.
The OSF is archival (they have a sustainability fund that guarantees their data will remain around for something like 30 years) and many publications will accept them as a place to make data public if you use a registration as above.
OSF Meetings - free poster and presentation sharing
OSF for institutions - branded for institutions, with logins/passwords tied into your institutional login.
The Dryad Digital Repository (aka Dryad) is a curated resource that makes the data underlying scientific publications discoverable, freely reusable, and citable. Dryad provides a general-purpose home for a wide diversity of datatypes.
Dryad’s vision is to promote a world where research data is openly available, integrated with the scholarly literature, and routinely re-used to create knowledge. They provide the infrastructure for, and promote the re-use of, data underlying the scholarly literature.
Dryad is governed by a nonprofit membership organization. Membership is open to any stakeholder organization, including but not limited to journals, scientific societies, publishers, research institutions, libraries, and funding organizations.
Publishers are encouraged to facilitate data archiving by coordinating the submission of manuscripts with submission of data to Dryad. Learn more about submission integration.
Dryad originated from an initiative among a group of leading journals and scientific societies in evolutionary biology and ecology to adopt a joint data archiving policy (JDAP) for their publications, and the recognition that easy-to-use, sustainable, community-governed data infrastructure was needed to support such a policy.
Zenodo is derived from Zenodotus, the first librarian of the Ancient Library of Alexandria and father of the first recorded use of metadata, a landmark in library history.
The OpenAIRE project, in the vanguard of the open access and open data movements in Europe was commissioned by the EC to support their nascent Open Data policy by providing a catch-all repository for EC funded research. CERN, an OpenAIRE partner and pioneer in open source, open access and open data, provided this capability and Zenodo was launched in May 2013.