This project evaluates a portfolio of machine learning algorithms for liver disease classification. The study consists of three experimental scenarios:
- Multiclass classification of the largest dataset
- Binary classification of the largest dataset
- Binary classification of all 3 datasets

Install `uv` with snap:

    sudo snap install astral-uv --classic

The following datasets are being used:
To evaluate the experiments, first set up the environment:

    source setupenv

The `liver` command is then available, and you can run `liver -h` to see what it does:

    $ liver -h
    usage: liver [-h] --experiment {1,2,3} [--debug]
                 [--learners-group {logistic-regression,random-forest,tree,gradient-boosting,neural-network,svm} [{logistic-regression,random-forest,tree,gradient-boosting,neural-network,svm} ...]]
                 [--config {default,global,experiment1,experiment2,experiment3}] [--plot-only]

    options:
      -h, --help            show this help message and exit
      --experiment {1,2,3}
      --debug               Enable debug logging
      --learners-group {logistic-regression,random-forest,tree,gradient-boosting,neural-network,svm} [{logistic-regression,random-forest,tree,gradient-boosting,neural-network,svm} ...]
                            Run specific family(ies) of learners.
      --config {default,global,experiment1,experiment2,experiment3}
                            Configuration to use for the experiment
      --plot-only           Plot only on already existing results.

To check the generated results/report, run:

    wslview results/experiment<experiment>-<config>.csv
    wslview reports/experiment<experiment>-<config>.html

Configuration files are located in the `configs` folder.
The structure is as follows:

    {
        "logistic-regression": {
            "penalty": [
                "l2"
            ],
            "C": [
                1.0
            ],
            "class_weight": [
                null
            ]
        },
        "random-forest": {...},
        "svm": {...},
        "gradient-boosting": {...},
        "tree": {...},
        "neural-network": {...}
    }

where the parameters for each learner are added.
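
Because every parameter maps to a list of values, one plausible reading is that each learner entry describes a grid of candidate settings. The sketch below works under that assumption, which this README does not confirm. It parses an inline copy of the `logistic-regression` entry (with a second `C` value added purely for illustration) and expands it into one keyword dictionary per combination. The helper name `expand_grid` is hypothetical.

```python
import itertools
import json

# Inline copy of the logistic-regression entry from the structure above.
# The extra C value (0.1) is an assumption, added only so that the grid
# contains more than one combination.
CONFIG_TEXT = """
{
    "logistic-regression": {
        "penalty": ["l2"],
        "C": [0.1, 1.0],
        "class_weight": [null]
    }
}
"""


def expand_grid(params):
    """Yield one dict per combination of the listed parameter values."""
    names = list(params)
    for values in itertools.product(*(params[name] for name in names)):
        yield dict(zip(names, values))


config = json.loads(CONFIG_TEXT)
grid = list(expand_grid(config["logistic-regression"]))
for candidate in grid:
    print(candidate)
```

Each resulting dictionary could then be passed as keyword arguments to the matching learner constructor.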
- Disabled GUI options should usually be represented by omitting the parameter from the JSON, not by passing `null`. A JSON `null` becomes `None` in Python and is only valid for parameters whose API explicitly accepts `None`, such as `class_weight`, `max_depth`, or `random_state`.
- `Orange.evaluation.testing.sample()` uses a different splitting implementation/row-selection logic than Orange GUI’s Data Sampler widget, so the same `n=0.8`, `stratified=True`, and `random_state=42` do not guarantee the same train/test rows.
- Make `default` JSON files for Orange and for Python, i.e. `default-ow` and `default-py`.
- Make a full default JSON with all parameters described for each learner.
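
The note above on `null` versus omitted parameters can be illustrated with the standard `json` module. In this sketch `make_tree` is a hypothetical stand-in for a learner constructor, not a real Orange or scikit-learn API, and the `min_samples_split` check is only an assumed example of a parameter that rejects `None`.

```python
import json


def make_tree(max_depth=None, min_samples_split=2):
    """Hypothetical stand-in for a learner constructor.

    max_depth accepts None (no limit); min_samples_split must be an int.
    """
    if not isinstance(min_samples_split, int):
        raise TypeError(
            f"min_samples_split must be an int, got {min_samples_split!r}"
        )
    return {"max_depth": max_depth, "min_samples_split": min_samples_split}


# JSON null is parsed as Python None.
explicit_null = json.loads('{"max_depth": null}')

# Valid: max_depth explicitly accepts None.
tree_a = make_tree(**explicit_null)

# Preferred way to disable an option: omit the key so the default applies.
omitted = json.loads("{}")
tree_b = make_tree(**omitted)

# Invalid: null passed to a parameter that does not accept None.
bad = json.loads('{"min_samples_split": null}')
try:
    make_tree(**bad)
    rejected = False
except TypeError:
    rejected = True

print(tree_a, tree_b, rejected)
```

Omitting the key keeps the constructor's own default, whereas an explicit `null` overrides it with `None` and fails wherever `None` is not an accepted value.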