Put everything together for the addendum. The README should explain mostly everything that is needed. All the data and scripts have been modified to use relative paths rather than absolute ones. Using the best configuration of the GAN. Created a requirements.txt for installing the Python dependencies.
Showing 20 changed files with 972 additions and 175 deletions.
@@ -0,0 +1,57 @@

Data Wrangling & GAN
=================================

## R Dependencies

When using R, I like to use [RStudio](https://www.rstudio.com). I think it's the best IDE for R, and it makes iterating on code quick and easy. RStudio includes a package manager that can help you install the packages listed here:

* dplyr
* ggplot2

The following two packages are installed a little differently. First install [Bioconductor](http://bioconductor.org/), then install [GEOquery](http://genomicsclass.github.io/book/pages/GEOquery.html) through it.

Those two packages are used in [get_gse_data.r](./mkdataset/get_gse_data.r), which can download any GEO series (GSE), given the GSE ID and the name of the gene symbol column.

## Python Dependencies

I used a virtual environment created with [virtualenv](https://pypi.python.org/pypi/virtualenv). If you want to use one as well, I recommend [this installation tutorial](http://docs.python-guide.org/en/latest/dev/virtualenvs/).

> **NOTE**: Using a virtual environment doesn't allow you to use [matplotlib](http://matplotlib.org/faq/virtualenv_faq.html) directly. You need to map it to your system's copy of **matplotlib**, because the graphics libraries that create window frames are closely tied to the operating system.

The prominent packages are:

* numpy
* Pandas
* Scikit-Learn
* TensorFlow
* Keras
* keras_adversarial

To install all the dependencies quickly and easily, use [`pip`](https://pypi.python.org/pypi/pip/):

```bash
pip install -r requirements.txt
```

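After installing, it can help to confirm that the packages listed above are actually visible to the interpreter in your environment. The following is a minimal sketch, not part of the repository: the helper `find_missing` is a hypothetical name, and the import names are assumptions based on the package list (note that Scikit-Learn is imported as `sklearn`).

```python
import importlib.util

# Import names assumed from the package list in this README.
# Scikit-Learn is imported as "sklearn"; the others match their listed names.
REQUIRED_MODULES = [
    "numpy",
    "pandas",
    "sklearn",
    "tensorflow",
    "keras",
    "keras_adversarial",
]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat any lookup failure as "not available".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

This only checks that each module can be located; it does not import them or check their versions.
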
## How to Run any Script

Navigate to the folder containing the script and run it from there; the scripts and data use relative paths.

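The reason the starting folder matters is that a relative path is interpreted against the directory you run the script from. Here is a small sketch of that behaviour, assuming a made-up layout: the folders `/project` and `/project/gan` and the file `data/input.csv` are hypothetical and only there for illustration.

```python
from pathlib import PurePosixPath


def resolve_from(working_dir, relative_path):
    """Join a relative path onto the directory it is interpreted against."""
    return PurePosixPath(working_dir) / relative_path


# Hypothetical layout: the same relative path points at different files
# depending on the directory the script is started from.
from_script_folder = resolve_from("/project/gan", "data/input.csv")
from_project_root = resolve_from("/project", "data/input.csv")

print(from_script_folder)  # /project/gan/data/input.csv
print(from_project_root)   # /project/data/input.csv
```

If a script is started from the wrong folder, its relative paths will point at files that most likely do not exist, which is why running it from its own folder is the safe choice.
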
### R

If you're using RStudio, all you need to do is **source** the script. There is a button for that in the top right corner of the editor window. Otherwise, from the command line:

```bash
R CMD BATCH <name_of_script>
```

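If you would rather drive the R scripts from Python, the same command can be wrapped with `subprocess`. This is only a sketch under assumptions: `batch_command` and `run_r_script` are hypothetical helpers, not part of this repository, and the script is run from its own folder so that its relative paths still resolve.

```python
import shutil
import subprocess
from pathlib import Path


def batch_command(script_path):
    """Build the `R CMD BATCH` command for one script, using only its file name."""
    return ["R", "CMD", "BATCH", Path(script_path).name]


def run_r_script(script_path):
    """Run an R script from its own folder.

    Returns the exit code, or None if R is not on the PATH
    or the script file does not exist.
    """
    script = Path(script_path).resolve()
    if shutil.which("R") is None or not script.is_file():
        return None
    # cwd is the script's folder so that its relative paths resolve as expected.
    completed = subprocess.run(batch_command(script), cwd=script.parent)
    return completed.returncode


if __name__ == "__main__":
    # Only prints the command that would be run for a script named in this README.
    print(batch_command("mkdataset/get_gse_data.r"))
```

`R CMD BATCH` writes the script's console output to a `.Rout` file rather than to the terminal, so that file is the place to look if the exit code is not zero.
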
### Python

Just run it directly from the command line. Assuming that your environment is prepared and you have all the dependencies, all you have to do is:

```bash
python gan.py
```