Deploying Python projects can be a pain - especially with Python 3.5. Anaconda is the emerging replacement for pip/virtualenv deploys, with its scope expanding past Python packages to binaries like redis. Tragically, its documentation is still… maturing. With conda deploys being such a huge feature for me, I had to write about it!
In this post I’ll talk about deploying on Amazon EC2 AMIs (running Amazon Linux), but in theory everything here should apply to any Unix platform where Anaconda runs.
Conda environments are isolated boxes in which you can install and execute your software on many platforms. Creating a conda env is simple:
$ conda create -n $ENV_NAME python=3.5 numpy toolz etc
The packages at the end are the conda packages added to the environment to start - don’t worry about getting these right the first time. You won’t be able to specify pip packages here, but will be able to add them after with pip directly. After creating the env, you’ll need to activate it:
$ . activate $ENV_NAME
This changes where Anaconda symbolic links point (for python, pip, etc), giving you a fresh, isolated environment. Now that you have the env active, you can run pip install and conda install commands - these will install the packages into the env, and their installed versions will be recorded as dependencies when you export it.
Recreating your environment on another machine is easy - you just need to export it to a YAML file and send it over. You can export the active conda env using:
$ conda env export > environment.yml
This environment YAML file is a full description of the conda env (including python version!), which will allow you to easily deploy on a fresh instance somewhere on EC2 or other IaaS providers. In order to create an environment somewhere else from a YAML file, just use:
$ conda env create -f environment.yml -n $ENV_NAME
You can add the --force switch on there if you don’t care about tromping on some flowers (it replaces any existing env of the same name).
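If you would rather not tromp on anything, a gentler pattern is to update the env when it already exists and create it only when it doesn't. Here is a minimal sketch, assuming the first column of conda env list is the env name; ensure_env is a hypothetical helper name, not something conda ships:

```shell
# Hypothetical helper: create the env from a YAML file, or update it in
# place if an env with that name already exists.
# Assumption: the first column of `conda env list` is the env name.
ensure_env() {
    ee_name="$1"
    ee_file="${2:-environment.yml}"
    if conda env list | awk '{print $1}' | grep -Fqx -- "$ee_name"; then
        # Env exists: bring it in line with the file instead of recreating it
        conda env update -n "$ee_name" -f "$ee_file"
    else
        conda env create -n "$ee_name" -f "$ee_file"
    fi
}
```

You would call it as ensure_env $ENV_NAME environment.yml, and it should be safe to run repeatedly from a deploy script.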
“This is all well and fine - but how does this help me deploy on EC2?”
I’m glad you asked! Continuum also makes a minimal, portable conda installer for quick deploys, called miniconda. You can simply download miniconda on your instance, run it, and add it to the path, and you have everything you need to install and run Anaconda projects.
The biggest value here is that you can spin up an instance on EC2 or schedule a cron task on Jenkins and have it easily bootstrap in isolation. All you’ve gotta do is download and run miniconda, create your environment, and resolve the dependencies. The steps to doing this are few:
$ git clone https://my.git.project.com/whevs.git; cd whevs
$ curl https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -o miniconda.sh
$ sh miniconda.sh -b -p $HOME/miniconda
$ export PATH="$HOME/miniconda/bin:$PATH"
$ conda env create -f environment.yml -n $PROJECT_NAME
$ . activate $PROJECT_NAME
$ python main.py # and boom goes the dynamite
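For a cron job or Jenkins task that runs over and over, you probably don't want to re-download miniconda every time. The download-and-install step can be wrapped so that it is skipped when conda is already there. This is only a sketch: install_miniconda and MINICONDA_URL are names I made up, and the URL is the same one used in the steps above.

```shell
# Hypothetical helper: install miniconda under a prefix unless it is already there.
MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh"

install_miniconda() {
    mc_prefix="${1:-$HOME/miniconda}"
    if [ -x "$mc_prefix/bin/conda" ]; then
        echo "miniconda already present at $mc_prefix"
        return 0
    fi
    mc_installer="$(mktemp)"
    mc_status=0
    # curl -f fails on HTTP errors instead of saving an error page;
    # installer flags: -b is batch mode (no prompts), -p is the install prefix
    curl -fsSL "$MINICONDA_URL" -o "$mc_installer" \
        && sh "$mc_installer" -b -p "$mc_prefix" \
        || mc_status=1
    rm -f "$mc_installer"
    return "$mc_status"
}
```

After install_miniconda "$HOME/miniconda" succeeds, the export PATH, conda env create, and activate steps carry on exactly as before.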
Also - for posterity - the format for conda environment.yml files is:
name: ekg-predict
dependencies:
- pip=7.1.2=py35_0
- python=3.5.1=0
- toolz=0.7.4=py35_0
- wheel=0.26.0=py35_1
- zlib=1.2.8=0
- pip:
  - click==6.2
You certainly don’t need to write them by hand, but sometimes it’s helpful.
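One thing worth knowing about the format: the trailing bits like =py35_0 are build strings, and they can be platform-specific. If the file needs to move between different kinds of machines, a looser export may travel better. Below is a sketch, assuming your conda version supports the --no-builds flag on conda env export and may write a machine-specific prefix: line; export_env is a hypothetical helper name.

```shell
# Hypothetical helper: export an env without build strings and without the
# machine-specific "prefix:" line that some conda versions append.
# Assumption: `conda env export` supports --no-builds in your conda version.
export_env() {
    ex_name="$1"
    ex_file="${2:-environment.yml}"
    conda env export -n "$ex_name" --no-builds \
        | grep -v '^prefix:' > "$ex_file"
}
```

The tradeoff is that the env you recreate elsewhere matches on package versions but not necessarily on exact builds.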
I hope this has helped! Know of any other great conda patterns, or other great Python project management tools? Let me know!
By Stuart Axelbrooke, who does data science and text analytics.