- The recommended Python version is `3.10.12`.
- Please ensure the packages are installed before running the code. This can be done with the provided `requirements.txt` file: `pip install -r requirements.txt`
- The Python and package versions do not have to match exactly, but we highly recommend using the same versions, since results may differ slightly otherwise.
- The results on Kaggle and in the report were generated using the HKU CS GPU Farm for Teaching. Please refer to Section 3.1 of the report for more details.
- There is a `gp_utils.py` file that contains functions shared between the two track notebooks. Please make sure to put it in the same directory as the notebooks.
- Please put the data files in a `data/` folder in the same directory as the notebooks.
  - The notebooks will search for:
    - `data/ehr_preprocessed_seq_by_cat_embedding.pkl`
    - `data/notes.csv`
    - `data/train.csv`
    - `data/valid.csv`
    - `data/test.csv`
  - Since image files are not used for training in the Track 2 submission, there is no need to include them to replicate the results shown on Kaggle.
- The notebooks are:
  - `Group23_Track1.ipynb`: for Track 1
  - `Group23_Track2.ipynb`: for Track 2
    - If the notebook has been modified, the configuration should be set to `USE_IMAGE = False` and `USE_TEXT = True` to reproduce the result on the Kaggle leaderboard. This should be the default configuration as well.
- The results of the notebooks will be saved to the `output/` folder in the same directory as the notebooks, with names `track{1|2}_{timestamp}.csv`.
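
The setup steps above can be checked with a small script before opening the notebooks. The following is only a sketch and is not part of the project: the helper names (`python_version_matches`, `missing_data_files`, `latest_output`) are hypothetical, and it assumes it is run from the directory that contains the notebooks. The file names come from the list above. Because the exact timestamp format in the output names is not specified here, the latest result is picked by file modification time rather than by parsing the name.

```python
import sys
from pathlib import Path

# Values taken from the notes above; the helper names are hypothetical.
RECOMMENDED_PYTHON = (3, 10, 12)
REQUIRED_DATA_FILES = [
    "ehr_preprocessed_seq_by_cat_embedding.pkl",
    "notes.csv",
    "train.csv",
    "valid.csv",
    "test.csv",
]


def python_version_matches(recommended=RECOMMENDED_PYTHON):
    """Return True if the running interpreter is exactly the recommended version."""
    return tuple(sys.version_info[:3]) == tuple(recommended)


def missing_data_files(notebook_dir="."):
    """Return the expected file names that are absent from <notebook_dir>/data/."""
    data_dir = Path(notebook_dir) / "data"
    return [name for name in REQUIRED_DATA_FILES if not (data_dir / name).is_file()]


def latest_output(track, notebook_dir="."):
    """Return the most recently modified output/track{track}_*.csv, or None."""
    out_dir = Path(notebook_dir) / "output"
    candidates = list(out_dir.glob(f"track{track}_*.csv"))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    if not python_version_matches():
        # A mismatch is only a warning: the versions do not have to be identical.
        print(f"Note: running Python {sys.version.split()[0]}, recommended is 3.10.12")
    missing = missing_data_files()
    if missing:
        print("Missing from data/:", ", ".join(missing))
    else:
        print("All expected data files found.")
```

After a notebook has finished, `latest_output(1)` or `latest_output(2)` returns the path of the newest result file for that track, or `None` if nothing has been written yet.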
vinkami/stat3612-project