Training/Exploration:
- mrjob on my company's dedicated cluster
- Local instance of PostgreSQL to rapidly rebuild training sets
- Python, mostly for combining the data and for transformations that are a pain in mrjob/Postgres
- R for most model building, unless I want to try something off the beaten path, in which case I'll use SciPy + PyCUDA
Production:
- Hive
- MSSQL
- Python, which calls R models and functions through rpy2
- Tableau
I mostly use Vim as my text editor and Git for version control, on a 2012 MacBook Pro.