Hello, great work. I have some confusion about the train/test data split. do_kfold.py and do_stratified_kfold.py use sklearn KFold to split the data randomly, producing only a train set and a test set. For the S645 dataset, the paper says: "It is crucial to consider these non-binders as outliers during model development and evaluation to ensure model accuracy and robustness." Does handling these outliers mean directly deleting the entries with ddG == 8 from S645? Finally, could you please provide more detailed documentation on training or inference? Thanks.
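
To make the question concrete, here is a minimal sketch of the interpretation I have in mind: drop the ddG == 8 entries first, then do a random k-fold split into train and test. It is only my guess, not the repository's actual code. The record layout, the `ddG` key, and the helper names `drop_non_binders` and `shuffled_kfold` are hypothetical, the data is a toy example, and the plain-Python splitter is a stand-in for a shuffled sklearn KFold so that the snippet runs without dependencies.

```python
import random

# Assumption: non-binders in S645 are the entries recorded with ddG == 8.
NON_BINDER_DDG = 8.0


def drop_non_binders(records, cutoff=NON_BINDER_DDG):
    """Return only the records whose ddG differs from the non-binder value."""
    return [r for r in records if r["ddG"] != cutoff]


def shuffled_kfold(n_samples, n_splits=10, seed=0):
    """Yield (train_idx, test_idx) pairs from a random k-fold split.

    Stand-in for a shuffled k-fold splitter: every index lands in exactly
    one test fold, and there is no separate validation set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_splits folds of near-equal size.
    folds = [indices[i::n_splits] for i in range(n_splits)]
    for k in range(n_splits):
        test_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx


# Toy data (not S645): every fifth entry is marked as a non-binder.
records = [{"id": i, "ddG": 8.0 if i % 5 == 0 else 0.1 * i} for i in range(20)]

kept = drop_non_binders(records)
splits = list(shuffled_kfold(len(kept), n_splits=4))
```

If the outliers are instead meant to stay in the data and be treated differently during training or evaluation, then the filtering step above would not apply, and that is the part I would like clarified.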