Jupyter notebook is an interactive web application that allows you to type and edit lines of code and see the output. The software requires Python installation, but currently supports interaction with over 40 languages.
Literate programming is the basic idea behind dynamic documents and was proposed by Donald Knuth in 1984. Originally, it was for mixing the source code and documentation of software development together. Today, we will create dynamic documents in which program or analysis code is run to produce output (e.g. tables, plots, models, etc) and then are explained through narrative writing.
The 3 steps of Literate Programming:
So that leaves us, the writers, with 2 steps which includes writing:
Note #1: R Notebooks and Jupyter notebooks are very similar! They are two sides of the same coin. We suggest that you adopt which ever one makes more sense to you and is in a layout that has a lower barrier for you to learn.
Markdown is a system for writing simple, readable text that is easily converted to HTML. Markdown essentially is two things:
Main goal of Markdown:Make the syntax of the raw (pre-HTML) document as readable possible.
Would you rather read this code in HTML?
<body>
<section>
<h1>Fresh Berry Salad Recipe</h1>
<ul>
<li>Blueberries</li>
<li>Strawberries</li>
<li>Blackberries</li>
<li>Raspberries</li>
</ul>
</section>
</body>
Or this code in Markdown?
# Fresh Berry Salad Recipe
* Blueberries
* Strawberries
* Blackberries
* Raspberries
If you are human, the Markdown code is definitely easier to read! Let us take a moment to soak in how much easier our lives are/will be because Markdown exists! Thank you John Gruber and Aaron Swartz (RIP) for creating Markdown in 2004!
In this case, “notebook” or “notebook documents” denote documents that contain both code and rich text elements, such as figures, links, equations, etc. Because of the mix of code and text elements, these documents are the ideal place to bring together an analysis description and its results as well as they can be executed perform the data analysis in real time. These documents are produced by the Jupyter Notebook App.
As a fun note, “Jupyter” is a loose acronym meaning Julia, Python, and R. These programming languages were the first target languages of the Jupyter application, but nowadays, the notebook technology also supports many other languages.
The main components of the whole Jupyter environment are, on one hand, the notebooks themselves and the application. On the other hand, you also have a notebook kernel (that is the language interpreter that will be executing the code in the background) and a notebook dashboard.
And there you have it: the Jupyter Notebook - there are also several examples of Jupyter notebooks that you can see/browse.
There are the official and detailed installation notes, and you can also have a quick look at step-by-step guide and some references here.
Generally, you’ll need to install Python (which is a prerequisite). The general recommendation is that you use the Anaconda distribution to install both Python and the notebook application.
After installation, the only thing necessary is to actually start the notebook. This can be done at command line using the following command:
jupyter notebook
After running the command, you will see a bunch of information on the command line window, and at the same time, a new page will open on your browser that will look like the following:
There are three main tabs Files
, Running
and Clusters
. You’ll be mostly using the first two (when not in the actual notebook):
Files
: is the listing of your current working directory. When you first launch the notebook, the directory is the same where you launched the app.Running
: is a list of all active notebooks, i.e. notebooks that have been running commands through one of the available kernels.Clusters
: this is a listing of all clusters that are available for a back-end execution (will be empty, unless you have connected the Jupyter Notebook app to a cluster)Creating a Notebook is as straightforward as clicking on the New
button on the top right, and selecting the kernel (i.e. the engine that will be interpreting our commands).
Note Jupyter really shines for Python and Julia notebooks. R users usually go the RMarkdown, which is much more optimized for R (as opposed to Jupyter). Eventually however, it all comes down to personal preference (or lab inheritance…)
All notebooks look like this in the beginning:
You’ll notice the following points:
In []
section in the middle. This is called a cell
and is essentially an interface where you can put your code, text, markdown, etc. Simply put, every cell that is an input is marked as In
followed by an increasing number that corresponds to the relative order that the particular cell was executed.After writing some code/text in the currently active cell, the main keyboard command to remember is how to execute the code.
Shift-Enter
: Executes the code and creates a new cell underneath.Ctrl-Enter
: Executes the code without creating a new cell.Type the following code into a cell and then hit Shift-Enter
.
2 + 3
You will see the following screen (or similar):
First of all, you may have briefly seen an asterisk after In
, i.e. In [ * ]
. The asterisk means that the kernel is currently trying to run the code, so you should be waiting for the output. After successful execution, the *
will change to the next number of the Cell (1
in our instance), and the output of the command will be visible below (5
in our case). Finally, as we executed the code with Shift-Enter
, a brand new cell has been created for us.
At this point, we can rename our Notebook, by clicking on the Untitled
entry, and let’s rename it to Jupyter-is-fun
Well done! You’ve just created you first Jupyter Notebook!
A Jupyter Notebook can support multiple Kernels at the same time. Let’s try and run a cell using Python 3. Change the cell type to Python 3
(or Python 2
if you have a different version installed), and type:
print("Hello World!")
(if you’re in Python 2, type print "Hello World!"
)
You’ll notice that the output now is Hello World!
, while at the same time, the numbering of the cell has reset to 1
- this is because each kernel recognizes it’s own cells with the order of execution.
Finally, let’s do a quick loop:
elements = ['oxygen', 'nitrogen', 'argon']
for char in elements:
print(char)
Here, we have created a list of 3 elements, and we assigned the list to a variable aptly named elements
. Next, we created a loop
structure, where each of the elements in the elements
list is assigned to the variable char
in turn, and the commands in the loop (i.e. print(char)
) is executed for that element. You should see a final output similar to this screen:
One of the most powerful things in Jupyter is the fact that you can write both text and code in the same notebook - much like a real Lab notebook where you have your text notes and your equations/figures/etc.
Let’s try and put some text in our notebook. To do that, we need to tell Jupyter that the cell should be interpreted as text (Markdown-formatted) and not as code. Click on the empty cell (it should have a green outline), and then go to Cell
-> Cell Type
-> Markdown
. You will notice that the In [ ]
indicator just disappeared, as there will be no need to execute something (and therefore no output will be produced).
Let’s copy the following text into the cell:
# Writing Notebooks
We can write lots of formatted text here, using the [Markdown syntax](https://en.wikipedia.org/wiki/Markdown). It is an easy way to write pretty text easily and efficiently.
## Formatting
It does support several common formatting styles:
- It can do **bold**
- It can do _italics_
- It can also do sub lists
* with items one
* two
* and three
It also allows to write [LaTex](https://www.latex-project.org) equations, like that:
$$c = \sqrt{a^2 + b^2}$$
Pretty neat, right?
If you press Shift-Enter
after putting this text, it should look like that:
You’ll also notice that, by default, Jupyter has changed the type of the new cell to R
, so you won’t have to change types constantly, but only when needed.
For more instructions, the Help
menu has a good tour and detailed information. Notebooks can be downloaded locally by going to the File
menu, then selecting Download
and choosing a file type to download, and it supports both pdf
and html
as file type choices.
You can also share the entire file that you have just created (there should be a file named Jupyter-is-fun.ipynb
in your working directory). You can even grab the one we created right now from here.
That’s it!
According to the people that created and support Jupyter, it has now ~3 million users worldwide, and over 500k Notebooks on GitHub - so huge success!
The same team is now working on JupyterLab (currently in alpha), essentially the next generation of the Jupyter Notebook application. You can read more here and also have a look at their recent talk at SciPy2016. So, stay tuned!