SLSDT

Stochastic Local Search Decision Tree

This repository is for my first scientific initiation project.

About

A decision tree is a predictive modelling approach used in machine learning, data mining and statistics. In a decision tree, each internal node represents a test on a feature and each terminal (or leaf) node represents a class label. An oblique decision tree is a variation of the traditional decision tree that allows multivariate tests in its internal nodes, in the form of a combination of the features.
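To make the difference concrete, here is a minimal sketch of the two kinds of node test. It assumes the oblique test is a weighted sum of the features compared against a threshold, which is the usual form. The function names, weights and thresholds are illustrative and are not part of the slsdt package.

```python
# Illustrative only: these helpers are not part of the slsdt package.


def axis_parallel_test(sample, feature_index, threshold):
    """Univariate test, as used by a traditional decision tree node."""
    return sample[feature_index] <= threshold


def oblique_test(sample, weights, threshold):
    """Multivariate test: a weighted sum of the features against a threshold."""
    return sum(w * x for w, x in zip(weights, sample)) <= threshold


sample = [5.1, 3.5]  # e.g. sepal length and sepal width

# Traditional node: looks at a single feature.
goes_left_axis = axis_parallel_test(sample, feature_index=0, threshold=5.5)

# Oblique node: combines both features (1.0 * 5.1 - 2.0 * 3.5 is negative).
goes_left_oblique = oblique_test(sample, weights=[1.0, -2.0], threshold=0.0)
```

An axis-parallel test can only split the feature space with lines parallel to the axes, while the oblique test can split it with a line at any angle.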

Our research, SLSDT, is a method for inducing oblique decision trees. It uses a stochastic local search method called Late Acceptance Hill-Climbing (LAHC) to search for the best combination of features in each internal node.
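For readers unfamiliar with LAHC, the sketch below shows a generic version of the heuristic minimizing a toy cost function. It is not the SLSDT implementation: the function names, the history length, the perturbation step and the toy cost are all assumptions chosen for illustration. In SLSDT the cost would presumably be some measure of split quality at a node.

```python
import random


def lahc_minimize(cost, initial, neighbor, history_length=20, iterations=2000, seed=0):
    """Generic Late Acceptance Hill-Climbing for a minimization problem."""
    rng = random.Random(seed)
    current = initial
    current_cost = cost(current)
    best, best_cost = current, current_cost
    # The history starts filled with the initial cost.
    history = [current_cost] * history_length

    for step in range(iterations):
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        slot = step % history_length
        # Late acceptance: take the candidate if it is no worse than the
        # current solution or than the cost stored history_length steps ago.
        if candidate_cost <= current_cost or candidate_cost <= history[slot]:
            current, current_cost = candidate, candidate_cost
        if current_cost < best_cost:
            best, best_cost = current, current_cost
        history[slot] = current_cost
    return best, best_cost


def toy_cost(weights):
    """Squared distance to an arbitrary target vector (illustration only)."""
    target = (1.0, -2.0)
    return sum((w - t) ** 2 for w, t in zip(weights, target))


def perturb(weights, rng):
    """Move one randomly chosen weight by a small random amount."""
    changed = list(weights)
    index = rng.randrange(len(changed))
    changed[index] += rng.uniform(-0.5, 0.5)
    return changed


best_weights, best_cost = lahc_minimize(toy_cost, [0.0, 0.0], perturb)
```

Comparing against an older cost rather than only the current one lets the search accept some worsening moves early on, which helps it escape poor local optima.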

This project also provides a utility to read CSV files and convert them to the format accepted by the SLSDT method, as well as a utility to load the datasets included in the package.
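As a rough idea of what such a conversion involves, the sketch below separates a CSV file with a header row into a feature matrix and a label list using only the standard library. The helper name `split_features_and_labels` is hypothetical, and the exact format that SLSDT expects is not described here, so treat the output shape as an assumption.

```python
import csv
import io


def split_features_and_labels(csv_file, class_column):
    """Hypothetical helper: return (X, y) from an open CSV file with a header row."""
    reader = csv.DictReader(csv_file)
    X, y = [], []
    for row in reader:
        y.append(row[class_column])
        # Every column except the class column is treated as a numeric feature.
        X.append([float(value) for name, value in row.items() if name != class_column])
    return X, y


# A tiny in-memory file standing in for something like iris.csv.
sample_csv = io.StringIO(
    "sepal_length,sepal_width,class\n"
    "5.1,3.5,Iris-setosa\n"
    "7.0,3.2,Iris-versicolor\n"
)
X, y = split_features_and_labels(sample_csv, "class")
```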

How to use

  1. Install
pip3 install slsdt
  2. read_csv
from slsdt.reader_csv import read_csv

X, y = read_csv("iris.csv", "class")
  3. load_dataset
from slsdt.datasets import load_dataset

X, y = load_dataset("iris")
  4. slsdt
from sklearn.model_selection import train_test_split
from slsdt.slsdt import SLSDT

# split train and test data (X and y come from read_csv or load_dataset)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

clf = SLSDT()

clf.fit(X_train, y_train)

results = clf.predict(X_test)

print(f"Accuracy: {sum(results == y_test) / len(y_test)}")

Iris example oblique split

from sklearn import datasets
from slsdt.slsdt import SLSDT

iris = datasets.load_iris()

X = iris.data[:, :2]  # we only take the first two features: sepal length and sepal width
y = iris.target

mark = y != 2

# we only take the 0 (Iris-setosa) and 1 (Iris-versicolor) class labels
X = X[mark]
y = y[mark]

clf = SLSDT()

clf.fit(X, y)

clf.print_tree()

result = clf.predict(X)

print(result)
print(result == y)

Plot iris oblique split

Plotted with Matplotlib using the results obtained above.
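To draw an oblique split in two dimensions, the boundary line has to be recovered from the split's weights and threshold. The sketch below computes two points on that line. The weights and threshold are hypothetical values, not output from SLSDT, and how to read the learned split out of a fitted tree is not covered here.

```python
def boundary_points(weights, threshold, x_min, x_max):
    """Return two (x1, x2) points on the line w1 * x1 + w2 * x2 = threshold."""
    w1, w2 = weights
    if w2 == 0:
        raise ValueError("vertical boundary: x1 = threshold / w1 for every x2")
    # Solve for x2 at each end of the plotted x1 range.
    return [(x1, (threshold - w1 * x1) / w2) for x1 in (x_min, x_max)]


# Hypothetical split values, chosen only for illustration.
points = boundary_points(weights=(1.0, -2.0), threshold=-1.0, x_min=4.0, x_max=7.0)
```

The x and y coordinates of these points can then be passed to Matplotlib's `plot` on top of a scatter plot of the samples.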

How to contribute

  • Leave a ⭐ if you liked the project
  • Fork this project
  • Clone your fork: git clone your-fork-url && cd slsdt
  • Create a branch with your features: git checkout -b my-features
  • Commit your changes: git commit -m 'feat: My new features'
  • Push your branch: git push origin my-features

License

This project is licensed under the EPL 2.0 License - see the LICENSE file for details.
