Use Python for Data Mining

Here are my notes on setting up a Python environment for data mining, with the configuration steps I used, to share with you.

I mainly followed these two blog posts; you can follow them too, with some minor adjustments.

  1. Getting started with python for data scientists
  2. INSTALL PYTHON, NUMPY, SCIPY, AND MATPLOTLIB ON MAC OS X

Modules

And here is what I did:

sudo pip install nose
sudo pip install jinja2
sudo pip install tornado
sudo pip install pyzmq
sudo pip install ipython
sudo pip install pyreadline
sudo pip install pygments
sudo pip install numpy
sudo pip install scipy
sudo pip install cython
sudo pip install pandas
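
To see which of the packages above actually installed, a short script can try importing each one. This is only a sketch: the `MODULES` list and the `check_modules` helper are names made up for illustration, and the import names are assumed from the usual package-to-module mapping (pyzmq imports as `zmq`, ipython as `IPython`, cython as `Cython`). pyreadline is left out because it mainly targets Windows.

```python
import importlib

# Import names for the packages installed above. A few differ from the
# pip package name (pyzmq -> zmq, ipython -> IPython, cython -> Cython).
MODULES = ["nose", "jinja2", "tornado", "zmq", "IPython",
           "pygments", "numpy", "scipy", "Cython", "pandas"]


def check_modules(names):
    """Return {name: version string, or None if the import failed}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Missing or broken package: record it and keep going.
            report[name] = None
        else:
            # Not every module exposes __version__.
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in check_modules(MODULES).items():
        status = "MISSING" if version is None else version
        print("{:10s} {}".format(name, status))
```

Any line marked MISSING points at the package to look into, which in my case would have been pandas.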

Error with Pandas

But I ran into a strange problem: I couldn't install pandas on OS X 10.9. I googled the errors and tried many solutions from Stack Overflow.

Then, on GitHub, Tom Augspurger told me that virtualenv or conda might work.

Fixed

I tried virtualenv, and it does work.

pip install virtualenv
virtualenv ENV
source ENV/bin/activate
pip install pandas
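
If you are not sure whether the environment is really active when you run pip, Python itself can tell you. A minimal sketch, assuming the usual behaviour that older virtualenv releases set `sys.real_prefix` while venv and newer virtualenv make `sys.base_prefix` differ from `sys.prefix`; the function name `in_virtualenv` is made up for illustration.

```python
import sys


def in_virtualenv():
    """Return True if this interpreter is running inside a virtual environment."""
    # Older virtualenv sets sys.real_prefix; venv and newer virtualenv
    # expose the original interpreter location as sys.base_prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


if __name__ == "__main__":
    print("prefix:", sys.prefix)
    print("inside a virtualenv:", in_virtualenv())
```

Run it with the same `python` that pip belongs to. If it reports that you are not in a virtualenv, the activate step did not take effect in that shell.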

Python 3.3 installed with pythonbrew also works, so that is an alternative.

pythonbrew switch Python-3.3.1
pip install pandas

And have fun!