The Complete Data Science Setup (macOS)
One of the funnest (and most frustrating) parts of data science is the vast array of tools available to us. It can be overwhelming to know where to start. Every now and then I like to completely wipe my computer clean and reinstall everything from scratch. This helps clean up my computer and makes sure everything is running smoothly.
This is a living document that captures my most up-to-date setup. My setup is inspired by the University of British Columbia Data Science Program, which provides helpful setup guides for three operating systems (macOS, Windows, and Ubuntu).
My guide currently covers the following areas.
I choose to use the Python distribution Miniconda from Anaconda. I use Miniconda as opposed to Anaconda because it is a stripped-down version. Anaconda comes with a lot of software that I do not plan to use, such as Spyder and Orange.
After I have Miniconda installed, I work on setting up my Python environment. I like to leave the root environment as is and create a new environment called ds_base (data science base). I then load in my favourite data science libraries. Note that I never use pip install in ds_base. If there is a package that I cannot install through conda, I will clone ds_base and then install the desired package in the clone. I do this to avoid breaking the installation of ds_base. Conda recommends:
- Care should be taken to avoid running pip in the root environment.
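The clone-then-install workflow described above can be sketched roughly as follows. This is a minimal sketch, not a definitive recipe: the function name clone_and_pip_install, the environment name ds_scratch, and the package name some-package are hypothetical placeholders, and the function only prints a message if conda is not on the PATH.

```shell
# Clone ds_base into a separate environment and pip install there,
# so that ds_base itself is never touched by pip.
# NOTE: the function name and the names in the usage line are hypothetical.
clone_and_pip_install() {
    new_env="$1"   # name for the cloned environment, e.g. ds_scratch
    package="$2"   # package that is not available through conda

    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found; install miniconda first"
        return 1
    fi

    # copy every package from ds_base into the new environment
    conda create --name "$new_env" --clone ds_base --yes || return 1

    # run pip inside the clone without needing to activate it
    conda run --name "$new_env" python -m pip install "$package"
}

# usage: clone_and_pip_install ds_scratch some-package
```

If the pip install breaks something, the clone can be deleted and ds_base is unaffected.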
Steps to set up Python:
- Follow the official instructions to install Miniconda.
- Create a new conda environment. You can either add packages as you wish, or get started by basing your data science environment off of my own. Run one of the following commands in your shell.
# option 1: start from scratch
conda create --name ds_base

# option 2: start with my suggested packages
# download my environment_ds_base.yml file from github
curl -o environment_ds_base.yml https://gist.githubusercontent.com/SamEdwardes/ae9fd4582d5fe213c5e2c43b68a78e12/raw/7d8c163a8d0da96602133d739d92c67337d9223a/environment_ds_base.yml
# create environment from yml file
conda env create -f environment_ds_base.yml
Below is a complete list of packages in my environment
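Once the environment exists, you can also print its package list locally. The following is a minimal sketch that assumes conda is installed and that the environment is named ds_base as above. The function name show_ds_base_packages and the output file name ds_base_export.yml are hypothetical.

```shell
# Print the packages installed in ds_base and save a shareable copy.
# NOTE: the function name and the output file name are hypothetical.
show_ds_base_packages() {
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found; install miniconda first"
        return 1
    fi

    # list every package and version in the environment
    conda list --name ds_base

    # write the environment definition to a yml file
    conda env export --name ds_base > ds_base_export.yml
}

# usage: show_ds_base_packages
```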
Homebrew is an open source package manager for macOS and Linux. Whenever possible I try to install software using Homebrew, as it helps keep everything organized. Some of my favourite software that I download from Homebrew includes:
tree: Allows you to view directories and files in a tree like structure
gh: The GitHub.com CLI. Great for quickly checking out PRs, or creating issues from the command line.
autojump: Allows you to quickly jump between directories in the command line (GitHub README).
node and npm: I never use these directly, but lots of other tools seem to rely on them.
Steps to install Homebrew:
# download and install homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
# install some of my favourite packages
brew install tree
brew install github/gh/gh
brew install autojump
brew install node
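After the installs finish, it can be handy to confirm that each tool is actually on the PATH. The helper below is a small sketch of that check. The function name check_tools is hypothetical, and it only looks for executables; it does not verify versions.

```shell
# Report which of the given tools are on the PATH.
# NOTE: check_tools is a hypothetical helper name, not part of Homebrew.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok:      $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    # exit status is the number of missing tools (0 means all found)
    return "$missing"
}

# usage: check_tools brew tree gh autojump node npm
```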
Terminal and ZSH
Instead of the default macOS Terminal app I use iTerm2. It provides a lot of different options for themes, uses tabs, and allows you to have a split layout on one tab.
I also use ZSH instead of bash. ZSH is now the default shell in macOS, but if you are operating on an older system it may be bash. Here is a good article from stackabuse.com comparing the two.
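If you are not sure which shell you are on, a quick check like the one below can tell you. This is a minimal sketch: it reads the SHELL environment variable, assumes zsh lives at /bin/zsh (which is where macOS ships it), and only prints the switch command rather than running it.

```shell
# Print the login shell and, if it is not zsh, show the command to switch.
# Nothing is changed automatically: chsh will ask for your password.
current_shell="$(basename "${SHELL:-unknown}")"

if [ "$current_shell" = "zsh" ]; then
    echo "already using zsh"
else
    echo "current shell is $current_shell; to switch run: chsh -s /bin/zsh"
fi
```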
Lastly I customize ZSH with another tool called Oh My ZSH. The tool allows you to extend the usefulness of ZSH by adding additional features, plugins, and themes.
Steps to set up iTerm2, ZSH, and Oh My ZSH
Install iTerm2 from https://www.iterm2.com/
I will assume that you are already using ZSH, but if you are not, update to the latest version of macOS and ZSH should be the default. Here is a guide from How-To-Geek to help switch between shells.
Install Oh My ZSH using:
sh -c "$(curl -fsSL https://raw.github.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
Edit the ~/.zshrc file to add some customizations. Every time you open a new command line window/tab, or refresh your current command line by calling zsh, this file will be run. I add a combination of functions and aliases for commonly performed tasks. You can copy and paste the below into your own .zshrc file if you wish, or just add your own.
- I also add the following lines to the file to enable my desired theme and plugins:
plugins=(git autojump)
ZSH_THEME="blinks"
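To illustrate the kind of functions and aliases that can go in a .zshrc file, here is a minimal sketch. The names mkcd, dsb, and gs are hypothetical examples written for this illustration and are not necessarily what is in my own file. The only thing carried over from above is the ds_base environment name.

```shell
# Hypothetical examples of .zshrc helpers; rename or replace as you like.

# make a directory (including parents) and move into it in one step
mkcd() {
    mkdir -p "$1" && cd "$1"
}

# activate the data science environment created earlier
alias dsb="conda activate ds_base"

# shorter git status
alias gs="git status"
```

After saving the file, open a new tab or call zsh so the changes are picked up.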
Work in progress…
Visual Studio Code, or VS Code for short, is one of the most popular editors at the moment. I enjoy listening to the Talk Python to Me podcast and it seems like 99% of guests use VS Code now. VS Code is great because it is very lightweight, but has a ton of powerful extensions.
Steps to set up VS Code
- Download from https://code.visualstudio.com/
- Install extensions. I currently use the following extensions. You can download them by searching for each one in the extensions sidebar.
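Extensions can also be installed from the command line instead of the sidebar. The sketch below assumes the code command is on your PATH (VS Code can add it through the "Shell Command: Install 'code' command in PATH" entry in the command palette). The function name install_vscode_extensions is hypothetical, and ms-python.python is used only as an example ID.

```shell
# Install a list of VS Code extensions by ID from the command line.
# NOTE: install_vscode_extensions is a hypothetical helper name.
install_vscode_extensions() {
    if ! command -v code >/dev/null 2>&1; then
        echo "code command not found; add it to your PATH from VS Code first"
        return 1
    fi

    for ext in "$@"; do
        # --install-extension takes the publisher.name ID of the extension
        code --install-extension "$ext"
    done
}

# usage (ms-python.python is the Microsoft Python extension, as an example):
# install_vscode_extensions ms-python.python
```

Running code --list-extensions afterwards prints the IDs of everything installed, which is a convenient way to record your own list.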