Update README.md
rim17004 authored Apr 12, 2024
1 parent 1e78133 commit e297b92
Showing 1 changed file with 17 additions and 0 deletions.
17 changes: 17 additions & 0 deletions README.md
@@ -1 +1,18 @@
This repository contains two branches: MicroVI and MicroVI-retraining. MicroVI contains the base code for running MicroVI with 5-fold cross-validation 100 times. MicroVI-retraining contains the code for running MicroVI with data augmentation: after the model is trained in each run, a portion of the learned latent space is sampled and used to fine-tune the trained model.
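
For readers unfamiliar with the evaluation scheme, here is a minimal sketch of how 5-fold cross-validation repeated 100 times can be organised. It is not taken from run.py: the function name `repeated_kfold_indices`, the reshuffle before every repeat, and the seed handling are all assumptions made for illustration.

```python
import random


def repeated_kfold_indices(n_samples, n_folds=5, n_repeats=100, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper: the actual splitting logic in run.py may differ.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # fresh shuffle for every repeat
        for fold in range(n_folds):
            # Every n_folds-th shuffled sample, starting at `fold`, is held out.
            test_idx = indices[fold::n_folds]
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield repeat, fold, train_idx, test_idx
```

Within one repeat, each sample is held out exactly once, so 100 repeats give 100 full passes over the data with different fold assignments.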

The run.py file within each project is the main entry point. A sample shell script is provided in myjob.sh. The data for each project is provided in the NEW_DATASETS folder, which is subdivided into POMP (also called M-DIET, our dataset used for linear regression) and DOMA (also called M-AGE, our dataset used for classification/logistic regression).

To choose which setting to run (i.e., linear regression or classification), modify the first ten lines of the main function within run.py:
## Linear Regression - uncomment below:
# dataset = 'pomp'

## Logistic Regression (classification) - uncomment below:
dataset = 'doma'

pct_supervised = 100 # Choose pct supervision from: 0, 5, 10, 25, 50, or 100
alpha_list = [1.0] # Choose the weight of the supervision term in the loss function from: 0.1, 0.25, 0.5, 1.0
covariate_ablation = True # Choose whether to include or exclude covariates: if True, exclude covariates; if False, include covariates

That is, set the desired dataset, percent supervision, alpha value, and whether to include covariates.
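
Since the allowed values are listed only in comments, a small check before a long job can catch a mistyped setting. The sketch below is an illustration and not part of run.py: `check_settings` and the constant names are hypothetical, and the allowed values are copied from the comments above.

```python
# Allowed values as listed in the comments of run.py's main function.
ALLOWED_DATASETS = {"pomp": "linear regression", "doma": "classification"}
ALLOWED_PCT_SUPERVISED = (0, 5, 10, 25, 50, 100)
ALLOWED_ALPHAS = (0.1, 0.25, 0.5, 1.0)


def check_settings(dataset, pct_supervised, alpha_list, covariate_ablation):
    """Validate the four settings and return them as a dict (hypothetical helper)."""
    if dataset not in ALLOWED_DATASETS:
        raise ValueError(f"dataset must be one of {sorted(ALLOWED_DATASETS)}")
    if pct_supervised not in ALLOWED_PCT_SUPERVISED:
        raise ValueError(f"pct_supervised must be one of {ALLOWED_PCT_SUPERVISED}")
    for alpha in alpha_list:
        if alpha not in ALLOWED_ALPHAS:
            raise ValueError(f"alpha must be one of {ALLOWED_ALPHAS}, got {alpha}")
    if not isinstance(covariate_ablation, bool):
        raise ValueError("covariate_ablation must be True or False")
    return {
        "dataset": dataset,
        "task": ALLOWED_DATASETS[dataset],
        "pct_supervised": pct_supervised,
        "alpha_list": list(alpha_list),
        # covariate_ablation=True means covariates are excluded.
        "include_covariates": not covariate_ablation,
    }
```

For example, `check_settings("doma", 100, [1.0], True)` describes a fully supervised classification run with covariates excluded.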

run.py creates a nested set of Master_Results and dataset folders according to the settings above; the results CSV for the 100 CV runs is stored there.
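
As a rough illustration of what such a nested layout could look like, the sketch below builds one folder per setting under Master_Results. Only the Master_Results and dataset levels come from the description above; the remaining sub-folder names, the file name `results.csv`, and the helper `results_csv_path` are hypothetical and may not match what run.py actually writes.

```python
from pathlib import Path


def results_csv_path(root, dataset, pct_supervised, alpha, covariate_ablation):
    """Create a nested results folder and return the CSV path inside it.

    Hypothetical layout: the folder names below the dataset level are assumptions.
    """
    covariates = "no_covariates" if covariate_ablation else "with_covariates"
    folder = (
        Path(root)
        / "Master_Results"
        / dataset
        / f"pct_supervised_{pct_supervised}"
        / f"alpha_{alpha}"
        / covariates
    )
    folder.mkdir(parents=True, exist_ok=True)  # safe to call on every run
    return folder / "results.csv"
```

Keeping each setting in its own folder means runs with different supervision levels or alpha values do not overwrite one another.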
