Commit
Showing 1 changed file with 17 additions and 0 deletions.
@@ -1 +1,18 @@
This repository contains two branches: MicroVI and MicroVI-retraining. The MicroVI branch contains the base code, which repeats 5-fold cross-validation of MicroVI 100 times. The MicroVI-retraining branch contains the code for MicroVI with data augmentation: after each training run, a portion of the learned latent space is sampled and used to fine-tune the trained model.

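As an illustration of what "5-fold cross-validation, 100 times" means, here is a minimal sketch in plain Python. It is not taken from the repository: the function name `repeated_kfold_indices`, the reshuffle before each repeat, and the seed handling are all assumptions made for the example.

```python
import random


def repeated_kfold_indices(n_samples, n_folds=5, n_repeats=100, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper for illustration; not part of the repository.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # new random partition for every repeat
        for fold in range(n_folds):
            # Every n_folds-th shuffled index goes into this fold's test set.
            test_idx = indices[fold::n_folds]
            test_set = set(test_idx)
            train_idx = [i for i in indices if i not in test_set]
            yield repeat, fold, train_idx, test_idx


splits = list(repeated_kfold_indices(n_samples=20))
print(len(splits))  # 100 repeats x 5 folds = 500 train/test splits
```

Within one repeat, every sample lands in exactly one test fold; the repeats differ only in how the samples are shuffled.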
The run.py file within each project is the entry point. A sample shell script is provided in myjob.sh. The data for each project are in the NEW_DATASETS folder, which is subdivided into POMP (also called M-DIET, the dataset used for linear regression) and DOMA (also called M-AGE, the dataset used for classification, i.e., logistic regression).

To choose which setting to run (i.e., linear regression or classification), modify the first ten lines of the main function within run.py:
## Linear Regression - uncomment below:
# dataset = 'pomp'

## Logistic Regression (classification) - uncomment below:
dataset = 'doma'

pct_supervised = 100  # Percent supervision; choose from 0, 5, 10, 25, 50, or 100
alpha_list = [1.0]  # Weight of supervision in the loss function; choose from 0.1, 0.25, 0.5, 1.0
covariate_ablation = True  # True excludes covariates; False includes them

That is, set the desired dataset, percent supervision, alpha value, and whether to include covariates.

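The allowed values listed in the comments above can be checked before launching a long job. The sketch below is one way to do that; `check_settings` and the three constant names are hypothetical and do not exist in run.py, and the accepted values are copied from the comments in the settings block.

```python
# Allowed values as documented in the run.py settings comments.
ALLOWED_DATASETS = {"pomp", "doma"}
ALLOWED_PCT_SUPERVISED = {0, 5, 10, 25, 50, 100}
ALLOWED_ALPHAS = {0.1, 0.25, 0.5, 1.0}


def check_settings(dataset, pct_supervised, alpha_list, covariate_ablation):
    """Raise ValueError for undocumented settings; return the task name.

    Hypothetical helper for illustration; not part of run.py.
    """
    if dataset not in ALLOWED_DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if pct_supervised not in ALLOWED_PCT_SUPERVISED:
        raise ValueError(f"unsupported pct_supervised: {pct_supervised!r}")
    for alpha in alpha_list:
        if alpha not in ALLOWED_ALPHAS:
            raise ValueError(f"unsupported alpha: {alpha!r}")
    if not isinstance(covariate_ablation, bool):
        raise ValueError("covariate_ablation must be True or False")
    # 'pomp' is the linear regression dataset, 'doma' the classification one.
    return "linear regression" if dataset == "pomp" else "classification"


print(check_settings("doma", 100, [1.0], True))  # classification
```

Failing early on a mistyped value is cheaper than discovering it after a batch of cross-validation runs has started.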
run.py creates a nested set of Master_Results and dataset folders according to the settings above. The results CSV for the 100 CV runs is stored there.
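The exact folder nesting is not spelled out here, so the following is only a sketch of how a settings-dependent results folder could be built with `pathlib`. The function `results_dir`, the subfolder names (`pct_…`, `alpha_…`, `no_covariates`, `with_covariates`), and their order are assumptions for illustration; only the Master_Results and dataset levels are named in the text above.

```python
from pathlib import Path
import tempfile


def results_dir(root, dataset, pct_supervised, alpha, covariate_ablation):
    """Create and return a nested results folder.

    Hypothetical layout for illustration; run.py may nest folders differently.
    """
    covariates = "no_covariates" if covariate_ablation else "with_covariates"
    path = (
        Path(root)
        / "Master_Results"
        / dataset
        / f"pct_{pct_supervised}"
        / f"alpha_{alpha}"
        / covariates
    )
    # parents=True creates every missing level; exist_ok=True allows reruns.
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    out = results_dir(tmp, "doma", 100, 1.0, True)
    print(out.relative_to(tmp))
```

Encoding each setting in the path keeps results from different configurations from overwriting one another.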