causal_testing.estimation.abstract_regression_estimator

This module contains the RegressionEstimator, which is an abstract class for concrete regression estimators.

Module Contents

Classes

RegressionEstimator

A Linear Regression Estimator is a parametric estimator which restricts the variables in the data to a linear combination of parameters and functions of the variables.

Attributes

logger

causal_testing.estimation.abstract_regression_estimator.logger
class causal_testing.estimation.abstract_regression_estimator.RegressionEstimator(base_test_case: causal_testing.testing.base_test_case.BaseTestCase, treatment_value: float, control_value: float, adjustment_set: set, df: pandas.DataFrame = None, effect_modifiers: dict[causal_testing.specification.variable.Variable, Any] = None, formula: str = None, alpha: float = 0.05, query: str = '')

Bases: causal_testing.estimation.abstract_estimator.Estimator

A Linear Regression Estimator is a parametric estimator which restricts the variables in the data to a linear combination of parameters and functions of the variables (note these functions need not be linear).

abstract property regressor

The regressor to use, e.g. ols or logit. This should be a property accessible with self.regressor. Define it as a class attribute, regressor = …, outside of __init__, not as self.regressor = …; otherwise you’ll get a “cannot instantiate with abstract method” error.
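The difference between the two ways of defining regressor can be reproduced with the standard abc module alone. The following is a minimal sketch, not the library's code: AbstractEstimator, FakeOLS, GoodEstimator and BadEstimator are hypothetical stand-ins introduced only for illustration.

```python
from abc import ABC, abstractmethod


class FakeOLS:
    """Hypothetical stand-in for a statsmodels model class."""


class AbstractEstimator(ABC):
    """Hypothetical stand-in for an abstract estimator base class."""

    @property
    @abstractmethod
    def regressor(self):
        """The model class that concrete subclasses fit."""


class GoodEstimator(AbstractEstimator):
    # Class-level assignment replaces the abstract property,
    # so the subclass is concrete and can be instantiated.
    regressor = FakeOLS


class BadEstimator(AbstractEstimator):
    def __init__(self):
        # Never reached: instantiation is refused before __init__ runs,
        # because the class itself still has an abstract member.
        self.regressor = FakeOLS


good = GoodEstimator()

try:
    BadEstimator()
    error = None
except TypeError as exc:
    error = str(exc)
```

Here good.regressor resolves to FakeOLS, while constructing BadEstimator raises a TypeError. The exact wording of that message varies between Python versions.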

setup_covariates()

Parse the formula and set up the covariates from the design matrix so that they can be used with the statsmodels array API. This means the formula is parsed only once. The formula API, by contrast, parses it every time the regression model is fitted, which can add considerable overhead when using causal test adequacy.

add_modelling_assumptions()

Add modelling assumptions to the estimator. These are stored as a list of strings describing the assumptions that must hold for the resulting causal inference to be considered valid.

fit_model(data=None) → statsmodels.regression.linear_model.RegressionResultsWrapper

Run a regression of the outcome on the treatment and adjustment set, using self.regressor, and return the fitted model.

Returns:

The model after fitting to data.

_predict(data=None, adjustment_config: dict = None) → pandas.DataFrame

Estimate the outcomes under control and treatment.

Parameters:

data – The data to use, defaults to self.df. Controllable for bootstrap sampling.

adjustment_config – The values of the adjustment variables to use.

Returns:

The estimated outcome under control and treatment, with confidence intervals in the form of a dataframe with columns “predicted”, “se”, “ci_lower”, and “ci_upper”.