Boosting your data science workflow with vim+tmux

Photo by SpaceX on Unsplash

Like most of my peers, I started my career in data science working within the Jupyter ecosystem. Jupyter is a great environment: it is easy to set up and offers useful off-the-shelf features.

At some point, however, I felt that I needed to move beyond it. Some limitations inherent to notebooks started to kill my productivity. To name just a few:

  • Versioning notebooks is problematic. I was constantly afraid of pushing notebook outputs containing client data online. In addition, it is pretty inconvenient to have git track changes in the outputs while the code itself is unchanged.
  • Even though progress has been made recently, editing capabilities, like advanced search-and-replace, are still very limited in the Jupyter environment.
  • As a project grows, it is common to have several notebooks open simultaneously. Navigating from notebook to notebook is a pain, especially if they lurk in an ocean of browser tabs. The browser is for googling, not for developing, for code’s sake!

I was therefore actively looking for alternatives and, luckily, I met a bunch of cool dudes who taught me the old-school way of developing code. It relies on the vim + tmux combo, which pairs a powerful terminal-based editor with a terminal multiplexer. Together they provide advanced editing capabilities as well as the interactivity required for data exploration. In addition, this workflow can be operated entirely from the keyboard, which saves a substantial amount of time since you no longer need to constantly switch between the keyboard and the mouse.

You may wonder why I didn’t consider using an IDE like PyCharm. Well, there are two main reasons. Firstly, IDEs are not really portable and, as a consultant, I tend to work in many different environments. Secondly, and more importantly, it looks so much cooler to work on a dark screen where you can execute code and move from pane to pane at (almost) the speed of light.

This post first aims at guiding you through the setup of a basic, but functional, data science environment based on vim + tmux. I will then showcase how such a setup can boost your productivity in your projects.

Disclaimer: You will need basic familiarity with vim to follow this post. If you are a complete novice, maybe take a look at this article first and then come back.

Tmux

Tmux is a command-line tool that enables multiple windows and panes within a single terminal window. Technically, it is called a terminal multiplexer. On Debian-based systems, installation is as simple as sudo apt-get install tmux. You can then create your first session with:

tmux new -s mycoolproject

Within a session, you can control windows and panes using the prefix command ctrl+b followed by a specific key. For instance, ctrl+b " (the double-quote key) splits the current window into two panes stacked one above the other. You can then navigate between panes using ctrl+b followed by an arrow key.

A big advantage of tmux is that it allows you to run multiple sessions in parallel. This is very convenient for quickly switching between projects without risking mixing up your scripts. You can detach from your current session using ctrl+b d, list the existing sessions with tmux ls and attach to a different session with tmux a -t <session name>.

Quick overview of tmux capabilities.

It is fairly easy to customize tmux: you simply need to edit the config file .tmux.conf located in your home directory. For instance, many people like to rebind the prefix command to ctrl+a.

I only aimed at providing a brief overview of tmux here, but if you want to learn more, there are plenty of great tutorials out there. It is also worth taking a look at this cheat sheet.

❤ vim ❤

I like Vim. Really. Vim is one of those things, like black coffee, Sunday morning jogging or Godard movies, that can feel a bit harsh at first but become more and more enjoyable with time and practice.

Some say that the learning curve is steep. It’s true. But it is also extremely rewarding when you start to master new shortcuts or macros that considerably improve your productivity.

Vim is a highly customizable text editor that runs directly in the terminal. Vim (or at least a minimal vi) ships by default with most Unix-like systems, so there is usually nothing to install. The basic configuration has limited capabilities, but you can quickly add features (like syntax highlighting, auto-completion, etc.) by tuning the .vimrc or adding plugins to it. The .vimrc is the configuration file located in your home directory that is loaded when the editor starts.

I’ve made a simple .vimrc available in this repo. It will help you replicate the steps described below. However, I strongly recommend setting up your own config file to get a better feel for the vim spirit.

We will use three plugins in the course of this tutorial:

  • vimux, which enables vim to interact with tmux
  • vim-pyShell, a wrapper around vimux specifically designed to ease the use of ipython
  • vim-cellmode, which provides MATLAB-like code block execution for ipython

The easiest way to install plugins is through a plugin manager. I personally use vim-plug, but there are plenty of other good options.

Vim-plug is easy to install. It only requires a single bash command:

curl -fLo ~/.vim/autoload/plug.vim --create-dirs https://raw.github.com/junegunn/vim-plug/master/plug.vim

You then just need to specify the desired plugins in your .vimrc between call plug#begin() and call plug#end() as illustrated in the snapshot below.

First lines of .vimrc

To install the plugins, reload your .vimrc (for example by restarting vim) and execute the command :PlugInstall. Then restart vim once more so that the freshly installed plugins are loaded.

Code execution

Once our plugins are up and running, we can start to send instructions from vim to the terminal.

Within a tmux session, open a python script with vim. In normal mode, you can fire up an ipython terminal by calling the dedicated function from the newly installed plugins with the command :call StartPyShell(). By default, this will create a pane at the bottom of the screen and start an ipython session.

Code can be executed either by:

  • sending instructions line by line. To do this, move your cursor to the desired line and run the command :call RunPyShellSendLine().
  • sending code blocks delimited with ##{/##}. In this case, move your cursor into the block and run the command :call RunTmuxPythonCell(0).
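To make the block syntax concrete, here is a minimal sketch of a script split into two cells with the ##{/##} markers mentioned above. The variable names and values are made up for illustration. Since the markers are plain Python comments, the file also runs unchanged as a regular script.

```python
import statistics

##{
# First cell: build a small toy dataset (hypothetical values).
measurements = [2.0, 4.0, 4.0, 5.0, 7.0]
##}

##{
# Second cell: summarize the data defined in the first cell.
mean_value = statistics.mean(measurements)
median_value = statistics.median(measurements)
print(f"mean={mean_value:.2f}, median={median_value}")
##}
```

With the cursor anywhere inside the second cell, sending the block executes only those lines in the ipython pane, so the first cell has to be sent at least once beforehand for measurements to exist.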

Sending commands directly from vim to the shell with vimux

This is already pretty cool, but it actually requires quite a lot of typing. Can we do better?

Boosting your productivity with the relevant mappings

Automating repetitive tasks: this is the secret to shortening development time and hence boosting your productivity. And the good news is that vim is really good at that.

The main idea is to create mappings for the most common tasks. Let’s take a closer look at how to actually implement mappings. Again, this is done in the .vimrc. In the snippet below, lines 2 and 3 map the shortcuts ,ss and ,sk to the ipython start and stop commands, respectively, while the second block defines the mappings for code execution.

It is often said that most of the time in a data science project is devoted to data preparation. This step relies heavily on dataframe manipulations. Hence, defining mappings associated with basic operations like:

,sdh
,sdi
,spp
,sph
,so
,sl

will save you a lot of time. In addition, you are not polluting your script with numerous prints and outputs, since the inspection is performed by passing the variable/object under the cursor to a backend function. No additional typing needed.
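The post does not show what such a backend function looks like, so the following is only a rough sketch of the idea, not the plugin's actual implementation. The helper name summarize and its max_items parameter are hypothetical. It relies on duck typing, so any object exposing a head() method (a pandas DataFrame, for instance) gets a preview through that method.

```python
def summarize(obj, max_items=5):
    """Return a short printable summary of obj (hypothetical helper)."""
    lines = [f"type: {type(obj).__name__}"]

    # Report the length when the object supports len().
    try:
        lines.append(f"length: {len(obj)}")
    except TypeError:
        pass

    # Objects exposing head() are previewed through it;
    # plain lists and tuples are previewed by slicing.
    head = getattr(obj, "head", None)
    if callable(head):
        lines.append(str(head(max_items)))
    elif isinstance(obj, (list, tuple)):
        lines.append(repr(obj[:max_items]))

    return "\n".join(lines)


print(summarize(list(range(10))))
```

A mapping would then only need to send something like print(summarize(<word under cursor>)) to the ipython pane, which keeps the inspection calls out of the script itself.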

Let’s see those mappings in action!

A few mappings were enough to really boost my productivity!

Concluding thoughts

Combining the advanced editing capabilities of vim with a few well-designed mappings has really enhanced my productivity. This workflow helps me meet the tight deadlines inherent to my job. It is true that it requires a substantial initial investment, but I am convinced that the payback is much higher, both in terms of time saved and in terms of working comfort.

What keeps amazing me about vim is its endless customization possibilities. So be creative, start hacking your .vimrc and implement your own mappings!

Thomas Carette was the first person to introduce me to vim and tmux. He was also kind enough to proofread this post, and his feedback was extremely valuable. Thank you Thomas!
