midlearn.MIDRegressor

class midlearn.MIDRegressor(params_main: int | None = None, params_inter: int | None = None, penalty: float = 0, link: str | None = None, kernel_type: int | list[int] = 1, encoding_frames: dict = {}, model_terms: str | list[str] | None = None, singular_ok: bool = False, mode: int = 1, method: int | None = None, centering_penalty: float = 1000000.0, na_action: str | None = 'na.omit', verbosity: int = 1, encoding_digits: int | None = 3, use_catchall: bool = False, catchall: str = '(others)', max_nelements: int | None = 1000000000.0, nil: float = 1e-07, tol: float = 1e-07, **kwargs)

Stand-alone Maximum Interpretation Decomposition regressor.

__init__(params_main: int | None = None, params_inter: int | None = None, penalty: float = 0, link: str | None = None, kernel_type: int | list[int] = 1, encoding_frames: dict = {}, model_terms: str | list[str] | None = None, singular_ok: bool = False, mode: int = 1, method: int | None = None, centering_penalty: float = 1000000.0, na_action: str | None = 'na.omit', verbosity: int = 1, encoding_digits: int | None = 3, use_catchall: bool = False, catchall: str = '(others)', max_nelements: int | None = 1000000000.0, nil: float = 1e-07, tol: float = 1e-07, **kwargs)

Create a MID model.

Parameters:
  • params_main (int or None, optional) – An integer specifying the maximum number of sample points for main effects. This corresponds to the ‘k[1]’ argument in R’s midr::interpret().

  • params_inter (int or None, optional) – An integer specifying the maximum number of sample points for interactions. This corresponds to the ‘k[2]’ argument in R’s midr::interpret().

  • penalty (float, optional) – The regularization penalty for pseudo-smoothing, corresponding to the ‘lambda’ argument in R’s midr::interpret(). Defaults to 0.

  • link (str or None, optional) – A character string specifying the link function, e.g., “logit”, “probit”, “identity”, “log”, “sqrt”, “inverse”. Corresponds to the ‘link’ argument in R.

  • kernel_type (int or list[int], optional) – The type of encoding. Effects of quantitative variables are modeled as piecewise linear functions if 1 (default), and as step functions if 0. If a list, kernel_type[0] is for main effects and kernel_type[1] is for interactions. Corresponds to the ‘type’ argument in R.

  • encoding_frames (dict, optional) – A dictionary of encoding frames to apply to specific variables. Advanced feature corresponding to the ‘frames’ argument in R.

  • model_terms (str, list[str] or None, optional) – Either a string in R’s formula syntax or a list of term labels (e.g., [“x1”, “x2”, “x1:x2”]) specifying the set of component functions to be modeled. Corresponds to the ‘terms’ argument in R.

  • singular_ok (bool, optional) – If False (default), a singular fit is treated as an error. Corresponds to the ‘singular.ok’ argument in R.

  • mode (int, optional) – An integer specifying the method of calculation. If 1 (default), centralization constraints are treated as penalties. If 2, constraints are used to reduce the number of free parameters. Corresponds to the ‘mode’ argument in R.

  • method (int or None, optional) – An integer specifying the method for solving the least squares problem. Non-negative values are passed to RcppEigen::fastLmPure(), and negative values to stats::lm.fit(). If None, the R default is used. Corresponds to the ‘method’ argument in R.

  • centering_penalty (float, optional) – The penalty factor for centering constraints (used only when mode=1). Corresponds to the ‘kappa’ argument in R. Defaults to 1e+06.

  • na_action (str or None, optional) – A string specifying the method of NA handling. Corresponds to the ‘na.action’ argument in R. Defaults to ‘na.omit’.

  • verbosity (int, optional) – The level of verbosity. 0: fatal, 1: warning (default), 2: info, 3: debug. Corresponds to the ‘verbosity’ argument in R.

  • encoding_digits (int or None, optional) – The rounding digits for encoding numeric variables (used when kernel_type=1). Corresponds to the ‘encoding.digits’ argument in R. Defaults to 3.

  • use_catchall (bool, optional) – If True, less frequent levels of qualitative variables are replaced by the ‘catchall’ level. Corresponds to ‘use.catchall’ in R. Defaults to False.

  • catchall (str, optional) – The catchall level string to use when use_catchall=True. Corresponds to the ‘catchall’ argument in R. Defaults to ‘(others)’.

  • max_nelements (int or None, optional) – The maximum number of elements of the design matrix. Corresponds to the ‘max.nelements’ argument in R (midr >= 0.5.3). Defaults to 1e+09.

  • nil (float, optional) – A threshold for the intercept and coefficients to be treated as zero. Corresponds to the ‘nil’ argument in R. Defaults to 1e-07.

  • tol (float, optional) – A tolerance for the singular value decomposition. Corresponds to the ‘tol’ argument in R. Defaults to 1e-07.

  • **kwargs (dict) – Additional keyword arguments passed directly to the underlying midr::interpret() function. These can include arguments not explicitly listed here, such as interactions (bool), which automatically includes all second-order interactions.
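The correspondence between the Python keywords and the R argument names stated in the list above can be summarized in code. The sketch below is only an illustration of that naming correspondence: the helper to_r_argument_names and the dictionary R_ARGUMENT_NAMES are hypothetical names introduced here, and nothing in this page says the library performs the translation this way.

```python
# Python keyword -> R argument name, as stated in the parameter descriptions.
R_ARGUMENT_NAMES = {
    "penalty": "lambda",
    "link": "link",
    "kernel_type": "type",
    "encoding_frames": "frames",
    "model_terms": "terms",
    "singular_ok": "singular.ok",
    "mode": "mode",
    "method": "method",
    "centering_penalty": "kappa",
    "na_action": "na.action",
    "verbosity": "verbosity",
    "encoding_digits": "encoding.digits",
    "use_catchall": "use.catchall",
    "catchall": "catchall",
    "max_nelements": "max.nelements",
    "nil": "nil",
    "tol": "tol",
}


def to_r_argument_names(py_kwargs):
    """Hypothetical helper: rename Python keywords to their R counterparts."""
    r_args = {}
    # params_main and params_inter correspond to k[1] and k[2] of the single
    # R argument `k`, so they are collected into one pair.
    if "params_main" in py_kwargs or "params_inter" in py_kwargs:
        r_args["k"] = (py_kwargs.get("params_main"), py_kwargs.get("params_inter"))
    for name, value in py_kwargs.items():
        if name in ("params_main", "params_inter"):
            continue
        # Names without a listed counterpart are kept unchanged, in the same
        # spirit as **kwargs being forwarded to midr::interpret().
        r_args[R_ARGUMENT_NAMES.get(name, name)] = value
    return r_args


example = to_r_argument_names(
    {"params_main": 25, "penalty": 0.1, "singular_ok": True, "interactions": True}
)
print(example)
```

Reading the mapping this way makes it easier to look up the matching entry in the midr::interpret() documentation for any keyword used on the Python side.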

Methods

__init__([params_main, params_inter, ...])

Create a MID model.

breakdown(**kwargs)

Create a MIDBreakdown object from the fitted estimator.

conditional(variable, **kwargs)

Create a MIDConditional object from the fitted estimator.

effect(term, x[, y])

Evaluate a single MID component function for new data.

fit(X, y[, sample_weight])

Fit the MID model to the response y on predictors X.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

importance(**kwargs)

Create a MIDImportance object from the fitted estimator.

interactions(term)

Extract a pd.DataFrame representing the interaction of the specified 'term'.

main_effects(term)

Extract a pd.DataFrame representing the main effect of the specified 'term'.

plot(term[, style, theme, intercept, ...])

Visualize the estimated main or interaction effect of a fitted MID model with plotnine.

predict(X)

Predict target values for new data X using the fitted MID model.

predict_terms(X)

Predict the contribution of each term for new data X.

r_predict(X[, output_type, terms])

A low-level method to call the R predict.mid function.

score(X, y[, sample_weight])

Return coefficient of determination on test data.

set_fit_request(*[, sample_weight])

Configure whether metadata should be requested to be passed to the fit method.

set_params(**params)

Set the parameters of this estimator.

set_score_request(*[, sample_weight])

Configure whether metadata should be requested to be passed to the score method.

terms(**kwargs)

Extract term labels from the fitted model.
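A typical workflow built from the methods listed above might look like the following sketch. It uses only the constructor keywords and method signatures shown on this page. Everything else is an assumption: that midlearn and its R backend are installed, that X_train and X_test are pandas DataFrames with columns named x1 and x2 (hypothetical names), and that the chosen parameter values are purely illustrative. The helper name fit_and_inspect is also hypothetical, and the return types of the individual methods are not specified beyond the summaries above.

```python
def fit_and_inspect(X_train, y_train, X_test, y_test):
    """Fit a MIDRegressor and collect a few outputs named in the method list."""
    # Imported lazily so that defining this function does not require
    # midlearn (or a working R installation) to be available.
    import midlearn

    model = midlearn.MIDRegressor(
        params_main=25,                     # illustrative value only
        penalty=0.01,                       # illustrative value only
        model_terms=["x1", "x2", "x1:x2"],  # hypothetical column names
    )
    model.fit(X_train, y_train)

    return {
        "predictions": model.predict(X_test),
        "term_contributions": model.predict_terms(X_test),
        "r_squared": model.score(X_test, y_test),
        "term_labels": model.terms(),
        "x1_main_effect": model.main_effects("x1"),
    }
```

Calling fit_and_inspect(X_train, y_train, X_test, y_test) would then return the predictions, the per-term contributions, the coefficient of determination on the test data, the term labels, and the main effect of x1 in one dictionary.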

Attributes

fitted_matrix

A pandas DataFrame showing the breakdown of the fitted values into the effects of the component functions.

fitted_values

A NumPy array of the fitted values.

intercept

The intercept of the fitted model.

ratio

The ratio of the sum of squared errors between the target model predictions and the fitted values to the sum of squared deviations of the target model predictions.

residuals

A NumPy array of the working residuals.

weights

Sample weights used to fit the model.
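The definition of ratio given above can be written out as a small calculation. The sketch below is plain Python, not the library's implementation. It assumes unweighted observations and that the "squared deviations" are measured from the mean of the target model predictions. The function name squared_error_ratio is hypothetical.

```python
def squared_error_ratio(target_predictions, fitted_values):
    """Sum of squared errors divided by sum of squared deviations (unweighted)."""
    n = len(target_predictions)
    mean_target = sum(target_predictions) / n
    # Numerator: squared errors between target predictions and fitted values.
    sse = sum((t - f) ** 2 for t, f in zip(target_predictions, fitted_values))
    # Denominator: squared deviations of the target predictions from their mean
    # (assumption: deviations are taken around the mean).
    sst = sum((t - mean_target) ** 2 for t in target_predictions)
    return sse / sst


target = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 1.5, 3.5, 3.5]
ratio = squared_error_ratio(target, fitted)
print(ratio)  # 0.2
```

Under these assumptions, a ratio close to 0 means the fitted values reproduce the target model predictions closely, and 1 minus the ratio equals the coefficient of determination on the training data.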