Could you share your HPC jobs? #14
Comments
Luis replied to me on WhatsApp: /gpfs/scratchfs1/hfp14002/lrm22005 is the path I gave you access to. I need to recreate the environment on HPC because I deleted my entire Anaconda installation to find out whether the error was related to it, and I have not tried again since. /home/lrm22005/ML_Notebooks/Arrhytmia_GP |
Dear @lrm22005 , HPC does not allow me to access your personal home folder. I think you should use Git to synchronize your HPC jobs. Could you please create an HPC branch in this repo and synchronize all your HPC jobs? Thanks!
|
Dear @lrm22005 , I am able to run the code on my Linux server. I will check how to run it on my Google Colab.
|
Dear @lrm22005 , After fixing the tqdm error, I encountered a new error after finishing one active learning cycle.
Error message:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/mnt/r/ENGR_Chon/Dong/Python/B_ML_Project-Luis/B_ML_Project/BML_project/ss_main.py in line 85
     82 print("Final Test Metrics:", results['test_metrics'])
     84 if __name__ == "__main__":
---> 85 main()

/mnt/r/ENGR_Chon/Dong/Python/B_ML_Project-Luis/B_ML_Project/BML_project/ss_main.py in line 50, in main()
     47 # Active Learning Iterations
     48 for iteration in tqdm(range(active_learning_iterations), desc='Active Learning', unit='iteration', leave=True):
     49 # Perform uncertainty sampling to select new samples from the validation set
---> 50 uncertain_sample_indices = stochastic_uncertainty_sampling(model, likelihood, val_loader, n_samples=batch_size, n_batches=5)
     52 # Update the training loader with uncertain samples
     53 train_loader = update_train_loader_with_uncertain_samples(train_loader, uncertain_sample_indices, batch_size)

File /mnt/r/ENGR_Chon/Dong/Python/B_ML_Project-Luis/B_ML_Project/BML_project/active_learning/ss_active_learning.py:23, in stochastic_uncertainty_sampling(gp_model, gp_likelihood, val_loader, n_samples, n_batches, n_components)
     21 gp_likelihood.eval()
     22 uncertain_sample_indices = []
---> 23 sampled_batches = random.sample(list(val_loader), n_batches) # Randomly sample n_batches from val_loader
     25 with torch.no_grad():
     26 for batch in sampled_batches:
     27 # reduced_data = apply_tsne(batch['data'].reshape(batch['data'].size(0), -1), n_components=n_components)
     28 # reduced_data_tensor = torch.Tensor(reduced_data).to(device)

File ~/anaconda3/envs/CS330_torch/lib/python3.11/random.py:456, in Random.sample(self, population, k, counts)
...
--> 456 raise ValueError("Sample larger than population or is negative")
    457 result = [None] * k
    458 setsize = 21 # size of a small set minus size of an empty list

ValueError: Sample larger than population or is negative
Output is truncated. |
I got this simple error outside the active learning. I will fix it later. |
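For reference, `random.sample(population, k)` raises this `ValueError` whenever `k` is larger than `len(population)`. One possible explanation, which I have not verified against the data, is that `val_loader` yields fewer than `n_batches=5` batches at that point. Below is a minimal sketch of a guard under that assumption. The helper `sample_batches`, its `seed` parameter, and `fake_val_loader` are hypothetical names for illustration only and are not part of the repo.

```python
import random


def sample_batches(loader, n_batches, seed=None):
    """Randomly pick up to n_batches batches from an iterable loader.

    Hypothetical helper: caps the sample size at the number of
    available batches so random.sample never raises ValueError.
    """
    batches = list(loader)              # materialize to learn the population size
    k = min(n_batches, len(batches))    # never request more batches than exist
    rng = random.Random(seed)           # local RNG so the global state is untouched
    return rng.sample(batches, k)


# Stand-in for a val_loader that only yields 3 batches (assumption for the demo).
fake_val_loader = [{"data": [i]} for i in range(3)]

sampled = sample_batches(fake_val_loader, n_batches=5, seed=0)
print(len(sampled))  # 3: capped at the number of available batches
```

If silently sampling fewer batches is not acceptable, the same check could raise a clearer error that reports `len(batches)` and `n_batches`, which would make the root cause visible immediately.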
Dear Luis,
Could you please share the HPC jobs that you submitted to run the Bayesian variational inference but that failed to execute? If you did not use Git to track them, could you please give me read permission to see the code and the data in the scratch folder?
The code version I am referring to is this one: 9388a46#diff-35df281ed874530b6cf6b4e0d3c4c3a4431dfcada9ec97875c73f57af5a4598e
Thanks!