ValueError when EBM has excluded features #29

Description

@aswer-svg

Hi,
I get the following error when converting an EBM that has excluded features to ONNX with ebm2onnx:

import pandas as pd
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier

np.random.seed(42)
n_samples = 1000

age = np.random.randint(16, 80, n_samples)
income = np.random.randint(20000, 120000, n_samples)
region = np.random.choice(['US', 'TR', 'UK'], n_samples)

logit = (
    -3.0 +
    (age * 0.05) +
    (income * 0.00002) +
    np.where(region == 'US', 0.8, np.where(region == 'TR', -0.5, 0.0)) +
    np.where((age < 30) & (region == 'US'), 1.2, 0)
)

probabilities = 1 / (1 + np.exp(-logit))

y = np.random.binomial(n=1, p=probabilities)

df = pd.DataFrame({'age': age, 'income': income, 'region': region})

ebm = ExplainableBoostingClassifier(
    interactions=[(0, 2)],
    exclude=[0],
    random_state=42
)

ebm.fit(df, y)

import onnx
import ebm2onnx

onnx_model = ebm2onnx.to_onnx(
    model=ebm,
    dtype=ebm2onnx.get_dtype_from_pandas(df)
)
onnx.save_model(onnx_model, 'ebm_model.onnx')

ValueError: could not convert string to float: 'TR'
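
For anyone triaging: the message is the one Python raises when a string such as a category label is cast to float. My guess (an assumption, since the full traceback is not shown) is that the `region` categories are being treated as numeric values somewhere during conversion. The sketch below reproduces the same message in isolation. `parse_continuous_cuts` is a hypothetical helper written for this illustration and is not part of ebm2onnx or interpret.

```python
# Hypothetical illustration only: this is not ebm2onnx or interpret code.
def parse_continuous_cuts(values):
    """Parse bin boundaries as floats, as a continuous feature would require."""
    return [float(v) for v in values]


# Numeric cut points parse without trouble.
numeric_cuts = parse_continuous_cuts(["29.5", "45.0"])

# Category labels like those in the `region` column do not.
try:
    parse_continuous_cuts(["TR", "UK", "US"])
except ValueError as exc:
    message = str(exc)

print(message)
```

If that reading is right, it may be worth checking whether `exclude=[0]` changes which feature index or feature type is paired with `region` during conversion.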
