When working in R, I like to use [RStudio](https://www.rstudio.com). I think it's the best IDE for R, and it makes iterating on code quick and easy. RStudio includes a package manager that can help you install the packages listed here:
The following two packages are installed a little differently.
First, install [Bioconductor](http://bioconductor.org/).
Then install [GEOquery](http://genomicsclass.github.io/book/pages/GEOquery.html).
Those two packages are used in [get_gse_data.r](./mkdataset/get_gse_data.r), which can get any GSE, given the GSE ID and gene symbol column name.
## Python Dependencies
I used a virtual environment created with [virtualenv](https://pypi.python.org/pypi/virtualenv). If you want to use one as well, I recommend [this installation tutorial](http://docs.python-guide.org/en/latest/dev/virtualenvs/).
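If you are unsure whether the virtual environment is active, a small check like the one below can tell you. It is a minimal sketch and not part of this repository; the function name `in_virtualenv` is made up for illustration. It relies on `sys.real_prefix` (set by older virtualenv releases) and `sys.base_prefix` (set by `venv` and newer virtualenv releases).

```python
import sys


def in_virtualenv():
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases expose the system prefix as sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases keep the system prefix in sys.base_prefix,
    # so it differs from sys.prefix only while an environment is active.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


# Usage (hypothetical): report the state before installing anything.
# print("virtual environment active:", in_virtualenv())
```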
> **NOTE**: Using a virtual environment doesn't allow you to use [matplotlib](http://matplotlib.org/faq/virtualenv_faq.html) directly. You need to map it to your system's copy of **matplotlib** because the graphics libraries that create window frames are closely tied to the operating system.
The prominent packages are:
To install all the dependencies quickly and easily, use [`pip`](https://pypi.python.org/pypi/pip/):

    pip install -r requirements.txt
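After installing, it can be useful to confirm that nothing from `requirements.txt` is still missing. The sketch below is one possible way to do that, assuming Python 3.8 or newer (for `importlib.metadata`) and a file made of plain `name==version` style lines. The helper names `requirement_name` and `missing_requirements` are hypothetical and do not exist in this repository.

```python
import re
from importlib import metadata


def requirement_name(line):
    """Extract the distribution name from one requirements line, or None."""
    # Drop trailing comments and surrounding whitespace.
    line = line.split("#", 1)[0].strip()
    # Skip blank lines and pip options such as "-r other.txt" or "-e .".
    if not line or line.startswith("-"):
        return None
    # The name is the leading run of letters, digits, dots, underscores, hyphens.
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return match.group(0) if match else None


def missing_requirements(lines):
    """Return the names from `lines` that are not installed in this environment."""
    missing = []
    for line in lines:
        name = requirement_name(line)
        if name is None:
            continue
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Usage (assumes requirements.txt is in the current folder):
# with open("requirements.txt") as handle:
#     print("missing:", missing_requirements(handle))
```

This only checks that each package is present; it does not compare installed versions against the pinned ones.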
## How to Run any Script
Just navigate to the folder containing the script, and run it directly.
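The "change into the script's folder, then run it" step can also be wrapped in a small helper. This is only a sketch under assumptions: the names `INTERPRETERS`, `build_command`, and `run_script` are hypothetical, R scripts are launched with `R CMD BATCH`, and Python scripts are launched with whichever interpreter is running the helper.

```python
import os
import subprocess
import sys

# Assumed mapping from file extension to launcher command; adjust to your setup.
INTERPRETERS = {
    ".r": ["R", "CMD", "BATCH"],
    ".py": [sys.executable],
}


def build_command(script_path):
    """Return (command, working_directory) for running a script from its own folder."""
    folder, filename = os.path.split(os.path.abspath(script_path))
    extension = os.path.splitext(filename)[1].lower()
    if extension not in INTERPRETERS:
        raise ValueError("no interpreter configured for " + filename)
    return INTERPRETERS[extension] + [filename], folder


def run_script(script_path):
    """Run the script with its folder as the working directory; return the exit code."""
    command, folder = build_command(script_path)
    return subprocess.call(command, cwd=folder)


# Usage (hypothetical call from the repository root):
# run_script("mkdataset/get_gse_data.r")
```

Running from the script's own folder keeps any relative paths inside the script working the same way as when you navigate there by hand.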
If you're using RStudio, all you need to do is **source** the script. There is a button for that in the top right corner of the editor window. Otherwise, from the command line:

    R CMD BATCH <name_of_script>
Just run it directly from the command line. Assuming that your environment is prepared, and you have all the dependencies, all you have to do is