HealthInfoLab/MicroVI

This repository contains two branches: MicroVI and MicroVI-retraining. MicroVI contains the base code for running MicroVI with 5-fold cross-validation, repeated 100 times. MicroVI-retraining contains the code for running MicroVI with data augmentation: after each run of the model is trained, a portion of the learned latent space is sampled and used to fine-tune the trained model.
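As a rough illustration of the evaluation protocol, the sketch below generates train/test index splits for 5-fold cross-validation repeated 100 times. It assumes that "100 times" means 100 independent reshuffles of the data, each followed by a full 5-fold pass; the function name `repeated_kfold_indices` and its parameters are hypothetical and are not taken from run.py.

```python
import random


def repeated_kfold_indices(n_samples, n_splits=5, n_repeats=100, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Hypothetical helper for illustration; run.py may split the data differently.
    """
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # fresh shuffle for every repeat
        for fold in range(n_splits):
            # Every n_splits-th shuffled sample goes to this fold's test set.
            test_idx = indices[fold::n_splits]
            test_set = set(test_idx)
            train_idx = [i for i in indices if i not in test_set]
            yield repeat, fold, train_idx, test_idx


# 100 repeats x 5 folds on a toy dataset of 20 samples.
splits = list(repeated_kfold_indices(n_samples=20, n_splits=5, n_repeats=100))
print(len(splits))
```

Within one repeat, each sample lands in exactly one test fold, so every sample is held out once per repeat.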

The main entry point of each project is run.py. A sample shell script is provided in myjob.sh. The data for each project is in the NEW_DATASETS folder, which has two subfolders: one with microbiome data from mice on different diets (m_diet, the dataset used for linear regression) and one with microbiome data measured from mice at different ages (m_age, the dataset used for classification/logistic regression).

To choose which setting to run (i.e., linear regression or classification), modify the first ten lines of the main function within run.py:

## Linear Regression - uncomment below:
# dataset = 'm_diet'

## Logistic Regression (classification) - uncomment below:
dataset = 'm_age'

pct_supervised = 100 # Percent supervision; choose from: 0, 5, 10, 25, 50, or 100
alpha_list = [1.0] # Weight of the supervision term in the loss function; choose from: 0.1, 0.25, 0.5, 1.0
covariate_ablation = True # True excludes covariates; False includes them

That is, set the desired dataset, percent supervision, alpha value, and whether to include covariates.
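Because a mistyped setting could silently start a long run with the wrong configuration, it may help to check the values before launching. The sketch below validates the four settings against the choices listed above; `validate_settings` and the `ALLOWED_*` constants are hypothetical names introduced here and are not part of run.py.

```python
# Allowed values as listed in the settings comments above.
ALLOWED_DATASETS = {"m_diet": "linear regression", "m_age": "classification"}
ALLOWED_PCT = (0, 5, 10, 25, 50, 100)
ALLOWED_ALPHA = (0.1, 0.25, 0.5, 1.0)


def validate_settings(dataset, pct_supervised, alpha_list, covariate_ablation):
    """Return the task implied by `dataset`, or raise ValueError on a bad setting.

    Hypothetical helper for illustration; run.py does not necessarily do this.
    """
    if dataset not in ALLOWED_DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if pct_supervised not in ALLOWED_PCT:
        raise ValueError(f"pct_supervised must be one of {ALLOWED_PCT}")
    for alpha in alpha_list:
        if alpha not in ALLOWED_ALPHA:
            raise ValueError(f"alpha must be one of {ALLOWED_ALPHA}, got {alpha}")
    if not isinstance(covariate_ablation, bool):
        raise ValueError("covariate_ablation must be True or False")
    return ALLOWED_DATASETS[dataset]


# The settings shown above select the classification task.
task = validate_settings("m_age", 100, [1.0], True)
print(task)
```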

run.py creates a nested set of Master_Results and dataset folders according to the settings above. The results CSV of the 100 CV runs is stored there.
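To make the idea of a settings-dependent results folder concrete, here is a minimal sketch of how such a nested path could be built. Only the Master_Results folder and the dataset name come from the description above; the remaining folder names (`pct_…`, `alpha_…`, `no_covariates`, `with_covariates`) and the function `results_dir` are assumptions made for illustration, and the layout that run.py actually produces may differ.

```python
import tempfile
from pathlib import Path


def results_dir(root, dataset, pct_supervised, alpha, covariate_ablation):
    """Create and return a nested results folder for one combination of settings.

    Hypothetical layout: everything below Master_Results/<dataset> is assumed.
    """
    cov = "no_covariates" if covariate_ablation else "with_covariates"
    path = (
        Path(root)
        / "Master_Results"
        / dataset
        / f"pct_{pct_supervised}"
        / f"alpha_{alpha}"
        / cov
    )
    path.mkdir(parents=True, exist_ok=True)  # safe to call on repeated runs
    return path


# Demonstrate inside a temporary directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    out = results_dir(tmp, "m_age", 100, 1.0, True)
    print(out.relative_to(tmp))
```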
