causal_testing.estimation.linear_regression_estimator

This module contains the LinearRegressionEstimator for estimating continuous outcomes.

Module Contents

Classes

LinearRegressionEstimator

A Linear Regression Estimator is a parametric estimator which restricts the variables in the data to a linear combination of parameters and functions of the variables.

Attributes

logger

causal_testing.estimation.linear_regression_estimator.logger
class causal_testing.estimation.linear_regression_estimator.LinearRegressionEstimator(base_test_case: causal_testing.testing.base_test_case.BaseTestCase, treatment_value: float, control_value: float, adjustment_set: set, df: pandas.DataFrame = None, effect_modifiers: dict[causal_testing.specification.variable.Variable, Any] = None, formula: str = None, alpha: float = 0.05, query: str = '')

Bases: causal_testing.estimation.abstract_regression_estimator.RegressionEstimator

A Linear Regression Estimator is a parametric estimator which restricts the variables in the data to a linear combination of parameters and functions of the variables (note these functions need not be linear).
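As an illustration of "linear in the parameters, not necessarily in the variables", the following minimal sketch fits a model with a squared term by ordinary least squares. It uses plain NumPy rather than this class, and the synthetic data and variable names are assumptions made for the example.

```python
import numpy as np

# Synthetic data (hypothetical): y depends on x through a squared term.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=200)
y = 1.0 + 2.0 * x + 3.0 * x**2 + rng.normal(scale=0.1, size=200)

# Design matrix whose columns are functions of x: 1, x and x^2.
# The model is non-linear in x but still linear in the parameters.
X = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares estimate of the three parameters.
params, *_ = np.linalg.lstsq(X, y, rcond=None)
print(params)
```

The recovered parameters should lie close to the values used to generate the data, even though the fitted curve is a parabola.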

regressor
gp_formula(ngen: int = 100, pop_size: int = 20, num_offspring: int = 10, max_order: int = 0, extra_operators: list = None, sympy_conversions: dict = None, seeds: list = None, seed: int = 0)

Use Genetic Programming (GP) to infer the regression equation from the data.

Parameters:
  • ngen – The maximum number of GP generations to run for.

  • pop_size – The GP population size.

  • num_offspring – The number of offspring per generation.

  • max_order – The maximum polynomial order to use, e.g. max_order=2 will give polynomials of the form ax^2 + bx + c.

  • extra_operators – Additional operators for the GP (defaults are +, *, log(x), and 1/x). Operations should be of the form (fun, numArgs), e.g. (add, 2).

  • sympy_conversions – Dictionary of conversions of extra_operators for sympy, e.g. "mul": lambda *args_: "Mul({},{})".format(*args_).

  • seeds – Seed individuals for the population (e.g. if you think that the relationship between X and Y is probably logarithmic, you can put that in).

  • seed – Random seed for the GP.
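The shapes of extra_operators and sympy_conversions described above can be sketched as follows. This only builds the two arguments and checks the conversion string; it does not call gp_formula. That the dictionary key should match the operator function's name is an assumption based on the "mul" example.

```python
from operator import mul

# One extra operator in the documented (fun, numArgs) form:
# multiplication takes two arguments.
extra_operators = [(mul, 2)]

# Matching sympy conversion, keyed by the operator's name (assumed convention).
sympy_conversions = {"mul": lambda *args_: "Mul({},{})".format(*args_)}

# The conversion turns the operator's arguments into a sympy expression string.
rendered = sympy_conversions["mul"]("x1", "x2")
print(rendered)
```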

estimate_coefficient() → causal_testing.estimation.effect_estimate.EffectEstimate

Estimate the unit average treatment effect of the treatment on the outcome. That is, the change in outcome caused by a unit change in treatment.

Returns:

The unit average treatment effect and the 95% Wald confidence intervals.
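A rough sketch of the idea, independent of this class: in a linear model the unit effect is the coefficient on the treatment column, and a Wald interval is the estimate plus or minus a critical value times its standard error. The data, the variable names and the use of a normal critical value are assumptions made for the example; the library's own computation may differ in detail.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)
n = 500
z = rng.normal(size=n)                      # adjustment variable (hypothetical)
t = 0.5 * z + rng.normal(size=n)            # treatment
y = 2.0 * t + 1.0 * z + rng.normal(size=n)  # outcome; true unit effect is 2.0

# Fit y ~ 1 + t + z by ordinary least squares.
X = np.column_stack([np.ones(n), t, z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Standard errors from the usual OLS covariance estimate.
residuals = y - X @ beta
sigma2 = residuals @ residuals / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# Wald interval at level alpha, using a normal critical value.
alpha = 0.05
crit = NormalDist().inv_cdf(1 - alpha / 2)
coef = beta[1]
ci_low, ci_high = coef - crit * se[1], coef + crit * se[1]
print(coef, ci_low, ci_high)
```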

estimate_ate() → causal_testing.estimation.effect_estimate.EffectEstimate

Estimate the average treatment effect of the treatment on the outcome. That is, the change in outcome caused by changing the treatment variable from the control value to the treatment value.

Returns:

The average treatment effect and the 95% Wald confidence intervals.

estimate_risk_ratio(adjustment_config: dict = None) → causal_testing.estimation.effect_estimate.EffectEstimate

Estimate the risk ratio of the treatment on the outcome. That is, the ratio of the expected outcome at the treatment value to the expected outcome at the control value.

Returns:

The risk ratio and the 95% Wald confidence intervals.
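A minimal sketch of a risk ratio computed from a fitted linear model, assuming the ratio is taken as the predicted outcome at the treatment value over the predicted outcome at the control value. It uses plain NumPy and synthetic data; none of the names below come from the library.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
t = rng.uniform(0, 10, size=n)                     # treatment (hypothetical)
y = 5.0 + 1.5 * t + rng.normal(scale=0.5, size=n)  # outcome

# Fit y ~ 1 + t by ordinary least squares.
X = np.column_stack([np.ones(n), t])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Predicted outcomes at the control and treatment values.
control_value, treatment_value = 2.0, 4.0
control_outcome = beta @ np.array([1.0, control_value])
treatment_outcome = beta @ np.array([1.0, treatment_value])

# Risk ratio: treatment outcome relative to control outcome.
risk_ratio = treatment_outcome / control_outcome
print(risk_ratio)
```

With the generating model above, the true ratio is (5 + 1.5 × 4) / (5 + 1.5 × 2) = 11 / 8, so the estimate should land near that value.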

estimate_ate_calculated(adjustment_config: dict = None) → causal_testing.estimation.effect_estimate.EffectEstimate

Estimate the ATE of the treatment on the outcome. That is, the change in outcome caused by changing the treatment variable from the control value to the treatment value. Here, the expected outcomes under control and treatment are calculated explicitly and the difference between them is taken. This allows custom terms such as squares, inverses, and products to be included in the model.

Parameters:
  • adjustment_config – The configuration of the adjustment set as a dict mapping variable names to their values. N.B. Every variable in the adjustment set MUST have a value in order to estimate the outcome under control and treatment.

Returns:

The average treatment effect and the 95% Wald confidence intervals.
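To show why predicting the outcomes explicitly helps with custom terms, the sketch below fits a model with a squared treatment term and takes the ATE as the difference in predicted outcomes (the usual definition), holding the adjustment variable at a fixed value. It uses plain NumPy; the data, the design and predict helpers, and the variable name "z" are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
z = rng.uniform(0, 1, size=n)   # adjustment variable (hypothetical)
t = rng.uniform(0, 3, size=n)   # treatment
y = 1.0 + 2.0 * t**2 + 4.0 * z + rng.normal(scale=0.2, size=n)

def design(t_values, z_values):
    """Build the design matrix with columns 1, t^2 and z."""
    t_arr = np.atleast_1d(np.asarray(t_values, dtype=float))
    z_arr = np.atleast_1d(np.asarray(z_values, dtype=float))
    return np.column_stack([np.ones_like(t_arr), t_arr**2, z_arr])

beta, *_ = np.linalg.lstsq(design(t, z), y, rcond=None)

# Every adjustment variable needs a value to predict an outcome.
adjustment_config = {"z": 0.5}
control_value, treatment_value = 1.0, 2.0

def predict(t_value):
    """Predicted outcome at t_value with adjustments held fixed."""
    return (design(t_value, adjustment_config["z"]) @ beta)[0]

# ATE as the difference between the two predicted outcomes.
ate = predict(treatment_value) - predict(control_value)
print(ate)
```

Because the treatment enters through a squared term, no single coefficient gives the effect of moving from 1.0 to 2.0; here the true value is 2 × (4 − 1) = 6, which the difference in predictions should recover approximately.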

_get_confidence_intervals(model, treatment)