midlearn.MIDRegressor

class midlearn.MIDRegressor(params_main: int | None = None, params_inter: int | None = None, penalty: float = 0, link: str | None = None, kernel_type: int | list[int] = 1, encoding_frames: dict = {}, model_terms: str | list[str] | None = None, singular_ok: bool = False, mode: int = 1, method: int | str | None = None, centering_penalty: float = 1000000.0, na_action: str | None = 'na.omit', verbosity: int = 1, split: Literal['quantile', 'uniform'] = 'quantile', digits: int | None = None, lump: Literal['none', 'rank', 'order', 'auto'] = 'none', others: str = 'others', sep: str = '>', max_nelements: int | None = 1000000000.0, nil: float = 1e-07, tol: float = 1e-07, **kwargs)

Stand-alone Maximum Interpretation Decomposition regressor.

__init__(params_main: int | None = None, params_inter: int | None = None, penalty: float = 0, link: str | None = None, kernel_type: int | list[int] = 1, encoding_frames: dict = {}, model_terms: str | list[str] | None = None, singular_ok: bool = False, mode: int = 1, method: int | str | None = None, centering_penalty: float = 1000000.0, na_action: str | None = 'na.omit', verbosity: int = 1, split: Literal['quantile', 'uniform'] = 'quantile', digits: int | None = None, lump: Literal['none', 'rank', 'order', 'auto'] = 'none', others: str = 'others', sep: str = '>', max_nelements: int | None = 1000000000.0, nil: float = 1e-07, tol: float = 1e-07, **kwargs)

Create a MID model.

Parameters:

params_main (int or None, optional) – An integer specifying the maximum number of sample points for main effects. This corresponds to the ‘k[1]’ argument in R’s midr::interpret().
params_inter (int or None, optional) – An integer specifying the maximum number of sample points for interactions. This corresponds to the ‘k[2]’ argument in R’s midr::interpret().
penalty (float, optional) – The regularization penalty for pseudo-smoothing, corresponding to the ‘lambda’ argument in R’s midr::interpret(). Defaults to 0.
link (str or None, optional) – A string naming the link function, e.g., “logit”, “probit”, “identity”, “log”, “sqrt”, “inverse”. Corresponds to the ‘link’ argument in R.
kernel_type (int or list[int], optional) – The type of encoding. Effects of quantitative variables are modeled as piecewise linear functions if 1 (default), and as step functions if 0. If a list, kernel_type[0] is for main effects and kernel_type[1] is for interactions. Corresponds to the ‘type’ argument in R.
encoding_frames (dict, optional) – A dictionary of encoding frames to apply to specific variables. Advanced feature corresponding to the ‘frames’ argument in R.
model_terms (str, list[str] or None, optional) – The set of component functions to be modeled, given either as a list of term labels (e.g., [“x1”, “x2”, “x1:x2”]) or as a single string in R's formula syntax. Corresponds to the ‘terms’ argument in R.
singular_ok (bool, optional) – If False (default), a singular fit raises an error. Corresponds to the ‘singular.ok’ argument in R.
mode (int, optional) – An integer specifying the method of calculation. If 1 (default), the centering constraints are treated as penalties. If 2, the constraints are used to reduce the number of free parameters. Corresponds to the ‘mode’ argument in R.
method (int, str or None, optional) – An integer or a string specifying the method for solving the least squares problem. Possible values include:
    - 0 or “qr”: column-pivoted QR
    - 1 or “unpivoted.qr”: unpivoted QR
    - 2 or “llt”: LLT Cholesky
    - 3 or “ldlt”: LDLT Cholesky
    - 4 or “svd”: singular value decomposition
    - 5 or “eigen”: eigenvalue-eigenvector decomposition
    If None (default), the R default is used. Corresponds to the ‘method’ argument in R.
centering_penalty (float, optional) – The penalty factor for centering constraints (used only when mode=1). Corresponds to the ‘kappa’ argument in R. Defaults to 1e+06.
na_action (str or None, optional) – A string specifying the method of NA handling. Corresponds to the ‘na.action’ argument in R. Defaults to ‘na.omit’.
verbosity (int, optional) – The level of verbosity. 0: fatal, 1: warning (default), 2: info, 3: debug. Corresponds to the ‘verbosity’ argument in R.
split ({'quantile', 'uniform'}, default 'quantile') – The splitting strategy for numeric variables. ‘quantile’ creates bins/knots based on data density; ‘uniform’ creates equally spaced bins/knots. Corresponds to the ‘split’ argument in R.
digits (int or None, optional) – The rounding digits for encoding numeric variables (used when kernel_type=1). Corresponds to the ‘digits’ argument in R. Defaults to None.
lump ({'none', 'rank', 'order', 'auto'}, default 'none') – The lumping strategy for high-cardinality factors. ‘rank’ keeps the top k levels; ‘order’ merges adjacent levels preserving order; ‘auto’ selects based on variable type. Corresponds to the ‘lump’ argument in R.
others (str, optional) – The label for the catch-all level used when lump is active (e.g., ‘rank’). Corresponds to the ‘others’ argument in R. Defaults to ‘others’.
sep (str, optional) – The separator string used when merging ordered factor levels (e.g., “Level1>Level3”). Corresponds to the ‘sep’ argument in R. Defaults to ‘>’.
max_nelements (int or None, optional) – The maximum number of elements of the design matrix. Corresponds to the ‘max.nelements’ argument in R (midr >= 0.5.3). Defaults to 1e+09.
nil (float, optional) – A threshold for the intercept and coefficients to be treated as zero. Corresponds to the ‘nil’ argument in R. Defaults to 1e-07.
tol (float, optional) – A tolerance for the singular value decomposition. Corresponds to the ‘tol’ argument in R. Defaults to 1e-07.
**kwargs (dict) – Additional keyword arguments passed directly to the underlying midr::interpret() function. These can include arguments not explicitly listed here, such as `interactions` (bool), which auto-includes all second-order interactions.
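The `split` and `kernel_type` parameters above can be easier to picture with a small standalone illustration. The following is a conceptual sketch only: it does not call midlearn or midr, the helper names (`uniform_knots`, `quantile_knots`, `linear_weights`, `step_weights`) are hypothetical, and the exact knot placement and encoding used by midr::interpret() may differ. It assumes the knots are distinct.

```python
import bisect
import statistics


def uniform_knots(values, k):
    """Return k equally spaced knots spanning the observed range."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (k - 1)
    return [lo + i * step for i in range(k)]


def quantile_knots(values, k):
    """Return k knots at evenly spaced quantiles (denser where data is dense)."""
    inner = statistics.quantiles(values, n=k - 1, method="inclusive")
    return [float(min(values))] + inner + [float(max(values))]


def _interval_index(x, knots):
    """Index j of the interval [knots[j], knots[j + 1]] containing x."""
    return min(bisect.bisect_right(knots, x) - 1, len(knots) - 2)


def linear_weights(x, knots):
    """Piecewise linear encoding: split the weight between the two nearest knots."""
    x = min(max(x, knots[0]), knots[-1])  # clamp to the knot range
    j = _interval_index(x, knots)
    t = (x - knots[j]) / (knots[j + 1] - knots[j])
    weights = [0.0] * len(knots)
    weights[j], weights[j + 1] = 1.0 - t, t
    return weights


def step_weights(x, knots):
    """Step encoding: one-hot indicator of the interval containing x."""
    x = min(max(x, knots[0]), knots[-1])  # clamp to the knot range
    weights = [0.0] * (len(knots) - 1)
    weights[_interval_index(x, knots)] = 1.0
    return weights


values = [0, 1, 2, 3, 4, 5, 6, 7, 100]  # skewed toy data
q_knots = quantile_knots(values, k=5)
u_knots = uniform_knots(values, k=5)
print("quantile knots:", q_knots)
print("uniform knots: ", u_knots)
print("piecewise linear encoding of 3:", linear_weights(3, q_knots))
print("step encoding of 3:            ", step_weights(3, q_knots))
```

On this skewed toy data, the quantile knots cluster in the 0 to 6 range where most observations lie, while the uniform knots are spread evenly up to 100.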

Methods

`__init__`([params_main, params_inter, ...])	Create a MID model.
`breakdown`(**kwargs)	Create MIDBreakdown object from the fitted estimator.
`conditional`(variable, **kwargs)	Create MIDConditional object from the fitted estimator.
`effect`(term, x[, y])	Evaluate a single MID component function for new data.
`fit`(X, y[, sample_weight])	Fit the MID model to the response y on predictors X.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`importance`(**kwargs)	Create MIDImportance object from the fitted estimator.
`interactions`(term)	Extract a pd.DataFrame representing the interaction of the specified 'term'.
`main_effects`(term)	Extract a pd.DataFrame representing the main effect of the specified 'term'.
`plot`(term[, style, theme, intercept, ...])	Visualize the estimated main or interaction effect of a fitted MID model with plotnine.
`predict`(X)	Predict target values for new data X using the fitted MID model.
`predict_terms`(X)	Predict the contribution of each term for new data X.
`r_predict`(X[, output_type, terms])	A low-level method to call the R predict.mid function.
`score`(X, y[, sample_weight])	Return coefficient of determination on test data.
`set_fit_request`(*[, sample_weight])	Configure whether metadata should be requested to be passed to the `fit` method.
`set_params`(**params)	Set the parameters of this estimator.
`set_score_request`(*[, sample_weight])	Configure whether metadata should be requested to be passed to the `score` method.
`terms`(**kwargs)	Extract term labels from the fitted model.
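To show how per-term contributions (as described for `predict_terms`) relate to a prediction (as described for `predict`), here is a toy sketch that does not use midlearn. The intercept and component functions are made up for illustration, and the sketch assumes an identity link, so the prediction is simply the intercept plus the sum of the term contributions.

```python
# Hypothetical component functions for a toy two-feature model.
# In a fitted MID model these would be estimated from data; here they are made up.
INTERCEPT = 10.0

TERMS = {
    "x1": lambda row: 2.0 * (row["x1"] - 1.0),                  # main effect of x1
    "x2": lambda row: -1.0 if row["x2"] == "a" else 1.0,        # main effect of x2
    "x1:x2": lambda row: 0.5 * row["x1"] if row["x2"] == "b" else 0.0,  # interaction
}


def term_contributions(row):
    """Per-term contributions for one observation."""
    return {label: effect(row) for label, effect in TERMS.items()}


def predict_one(row):
    """Prediction as the intercept plus the sum of all term contributions."""
    return INTERCEPT + sum(term_contributions(row).values())


row = {"x1": 3.0, "x2": "b"}
contributions = term_contributions(row)
prediction = predict_one(row)
print(contributions)
print(prediction)
```

With a non-identity `link`, the additive structure would presumably hold on the link scale rather than on the response scale; that case is not covered by this sketch.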

Attributes

`fitted_matrix`	A pandas DataFrame showing the breakdown of the fitted values into the effects of the component functions.
`fitted_values`	A NumPy array of the fitted values.
`intercept`	The intercept of the fitted model.
`ratio`	The ratio of the sum of squared errors between the target model predictions and the fitted values to the sum of squared deviations of the target model predictions.
`residuals`	A NumPy array of the working residuals.
`weights`	Sample weights used to fit the model.
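The definition of `ratio` can be written out as a short computation. This is a minimal unweighted sketch, assuming the deviations are taken from the mean of the target predictions; the function name `variation_ratio` and the toy numbers are hypothetical, and the value stored on a fitted estimator may additionally account for sample weights.

```python
def variation_ratio(target, fitted):
    """Sum of squared errors divided by the total sum of squares of target."""
    mean = sum(target) / len(target)
    sse = sum((t - f) ** 2 for t, f in zip(target, fitted))
    sst = sum((t - mean) ** 2 for t in target)
    return sse / sst


target = [1.0, 2.0, 3.0, 4.0]   # toy target model predictions
fitted = [1.1, 1.9, 3.2, 3.8]   # toy fitted values
ratio = variation_ratio(target, fitted)
print(round(ratio, 6))
```

Under these assumptions, a smaller ratio means the fitted values reproduce the target predictions more closely, and 1 minus the ratio plays the role of an R-squared of the fitted values against the target predictions.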