Experimental machine learning study for liver disease classification

This project evaluates a portfolio of machine learning algorithms for liver disease classification. The study consists of three experimental scenarios:

Multiclass classification of the largest dataset
Binary classification of the largest dataset
Binary classification of all 3 datasets

Prerequisites

sudo snap install astral-uv --classic

Datasets

The following datasets are being used:

How to run?

To run the experiments, first set up the environment:

source setupenv

This makes the liver command available. Run liver -h to see its options:

$ liver -h
usage: liver [-h] --experiment {1,2,3} [--debug]
             [--learners-group {logistic-regression,random-forest,tree,gradient-boosting,neural-network,svm} [{logistic-regression,random-forest,tree,gradient-boosting,neural-network,svm} ...]]
             [--config {default,global,experiment1,experiment2,experiment3}] [--plot-only]

options:
  -h, --help            show this help message and exit
  --experiment {1,2,3}
  --debug               Enable debug logging
  --learners-group {logistic-regression,random-forest,tree,gradient-boosting,neural-network,svm} [{logistic-regression,random-forest,tree,gradient-boosting,neural-network,svm} ...]
                        Run specific family(ies) of learners.
  --config {default,global,experiment1,experiment2,experiment3}
                        Configuration to use for the experiment
  --plot-only           Plot only on already existing results.
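The help output above is consistent with an argparse parser along the following lines. This is a reconstruction from the help text only, not the project's actual source: the names `build_parser`, `LEARNER_GROUPS`, and `CONFIGS` are hypothetical, and parsing `--experiment` as an integer is an assumption.

```python
import argparse

# Choices copied from the help text; the constant names are hypothetical.
LEARNER_GROUPS = [
    "logistic-regression", "random-forest", "tree",
    "gradient-boosting", "neural-network", "svm",
]
CONFIGS = ["default", "global", "experiment1", "experiment2", "experiment3"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="liver")
    # Assumption: the experiment number is parsed as an int.
    parser.add_argument("--experiment", type=int, choices=[1, 2, 3], required=True)
    parser.add_argument("--debug", action="store_true", help="Enable debug logging")
    # nargs="+" allows one or more learner families after the flag.
    parser.add_argument("--learners-group", nargs="+", choices=LEARNER_GROUPS,
                        help="Run specific family(ies) of learners.")
    parser.add_argument("--config", choices=CONFIGS,
                        help="Configuration to use for the experiment")
    parser.add_argument("--plot-only", action="store_true",
                        help="Plot only on already existing results.")
    return parser


# Equivalent to: liver --experiment 2 --learners-group svm tree --config global
args = build_parser().parse_args(
    ["--experiment", "2", "--learners-group", "svm", "tree", "--config", "global"]
)
print(args.experiment, args.learners_group, args.config, args.plot_only)
```

Several learner families can follow a single --learners-group flag, and --experiment is the only required option.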

To open the generated results and report (wslview assumes a WSL environment), run:

wslview results/experiment<experiment>-<config>.csv
wslview reports/experiment<experiment>-<config>.html

Configuration files

Configuration files are located in the configs folder. The structure is as follows:

{
    "logistic-regression": {
        "penalty": [
            "l2"
        ],
        "C": [
            1.0
        ],
        "class_weight": [
            null
        ]
    },
    "random-forest": {...},
    "svm": {...},
    "gradient-boosting": {...},
    "tree": {...},
    "neural-network": {...}
}

Each top-level key names a learner group, and its object lists the parameter values for that learner.
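Because every parameter holds a list of values, one plausible reading is that the lists are candidate values combined into a parameter grid. The README does not state how the project consumes them, so the sketch below is an assumption. `expand_grid` is a hypothetical helper, and the extra values (`0.1`, `"balanced"`) are added purely for illustration.

```python
import itertools
import json

# Trimmed-down config in the same shape as the example above.
# The second value in "C" and "class_weight" is illustrative, not from the project.
CONFIG_TEXT = """
{
    "logistic-regression": {
        "penalty": ["l2"],
        "C": [0.1, 1.0],
        "class_weight": [null, "balanced"]
    }
}
"""


def expand_grid(params: dict) -> list[dict]:
    """Return one dict per combination of the listed candidate values."""
    names = list(params)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*(params[name] for name in names))
    ]


config = json.loads(CONFIG_TEXT)
grid = expand_grid(config["logistic-regression"])
for candidate in grid:
    print(candidate)
```

With one, two, and two values in the three lists, the grid contains four candidate settings. A config with single-element lists, like the example above, yields exactly one.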

Known issues and limitations

Options that are disabled in the GUI should usually be represented by omitting the parameter from the JSON, not by passing null. A JSON null becomes None in Python, which is valid only for parameters whose API explicitly accepts None, such as class_weight, max_depth, or random_state.
Orange.evaluation.testing.sample() selects rows differently from the Data Sampler widget in the Orange GUI, so the same settings (n=0.8, stratified=True, random_state=42) do not guarantee the same train/test rows.
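The difference between omitting a parameter and passing null can be seen with the standard json module alone. In this minimal sketch, `make_learner` is a hypothetical stand-in for a learner constructor, and the values are scalars, not lists, to keep the example short.

```python
import json


def make_learner(penalty="l2", C=1.0, class_weight=None):
    """Hypothetical stand-in for a learner constructor with keyword defaults."""
    # C has no meaningful None value, so an explicit None is rejected.
    if C is None:
        raise TypeError("C must be a float, got None")
    return {"penalty": penalty, "C": C, "class_weight": class_weight}


# C is omitted, so the constructor default applies; null is fine for class_weight.
omitted = json.loads('{"class_weight": null}')
# C is present as null, so Python receives C=None, overriding the default.
explicit_null = json.loads('{"C": null}')

print(make_learner(**omitted))

try:
    make_learner(**explicit_null)
except TypeError as exc:
    print("rejected:", exc)
```

Omitting the key leaves the default untouched, while null replaces it with None, which only some parameters accept.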

TODO

Create separate default JSON files for Orange and for Python, i.e. default-ow and default-py.
Create a full default JSON file that describes all parameters for each learner.
