:py:mod:`causal_testing.estimation.genetic_programming_regression_fitter`
=========================================================================

.. py:module:: causal_testing.estimation.genetic_programming_regression_fitter

.. autoapi-nested-parse::

   This module contains a genetic programming implementation to infer the functional
   form between the adjustment set and the outcome.


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   causal_testing.estimation.genetic_programming_regression_fitter.GP


Functions
~~~~~~~~~

.. autoapisummary::

   causal_testing.estimation.genetic_programming_regression_fitter.reciprocal
   causal_testing.estimation.genetic_programming_regression_fitter.mut_insert
   causal_testing.estimation.genetic_programming_regression_fitter.create_power_function


.. py:function:: reciprocal(x: float) -> float

   Return the reciprocal of the input.

   :param x: Float to reciprocate.
   :return: 1/x


.. py:function:: mut_insert(expression: deap.gp.PrimitiveTree, pset: deap.gp.PrimitiveSet)

   NOTE: This is a temporary workaround. This function is copied verbatim from
   ``gp.mutInsert``. It seems DEAP forgot to import ``isclass`` from ``inspect``, so
   their function raises an error saying that "isclass is not defined". A couple of
   lines are not covered by tests, but that should not matter because this is
   (1) a temporary workaround until a new version of DEAP is released, and
   (2) not our code.

   Inserts a new branch at a random position in *expression*. The subtree at the
   chosen position is used as a child node of the created subtree, so this is a true
   insertion rather than a replacement. The original subtree becomes one of the
   children of the newly inserted primitive, but not necessarily the first: its
   position is selected at random if the new primitive has more than one child.

   :param expression: The normal or typed tree to be mutated.
   :param pset: The pset object defining the variables and constants.
   :return: A tuple of one tree.


.. py:function:: create_power_function(order: int)

   Creates a power operator and its corresponding sympy conversion.

   :param order: The order of the power, e.g. `order=2` will give x^2.
   :return: A pair consisting of the power function and the sympy conversion.


.. py:class:: GP(df: pandas.DataFrame, features: list, outcome: str, max_order: int = 0, extra_operators: list = None, sympy_conversions: dict = None, seed=0)

   Object to perform genetic programming.

   .. py:method:: split(individual: deap.gp.PrimitiveTree) -> list

      Split an expression into its components, e.g. 2x + 4y - xy -> [2x, 4y, xy].

      :param individual: The expression to be split.
      :return: A list of the equation's components that are linearly combined into
               the full equation.


   .. py:method:: _convert_prim(prim: deap.gp.Primitive, args: list) -> str

      Convert primitives to sympy format.

      :param prim: A GP primitive, e.g. add
      :param args: The list of arguments
      :return: A sympy-compatible string representing the function,
               e.g. add(x, y) -> Add(x, y).


   .. py:method:: _stringify_for_sympy(expression: deap.gp.PrimitiveTree) -> str

      Return the expression as a sympy-compatible string.

      :param expression: The expression to be simplified.
      :return: A sympy-compatible string representing the equation.


   .. py:method:: simplify(expression: deap.gp.PrimitiveTree) -> sympy.core.Expr

      Simplify an expression by applying mathematical equivalences.

      :param expression: The expression to simplify.
      :return: The simplified expression as a sympy Expr object.


   .. py:method:: repair(expression: deap.gp.PrimitiveTree) -> deap.gp.PrimitiveTree

      Use linear regression to infer the coefficients of the linear components of the
      expression. Named "repair" since a "repair operator" is quite common in GP.

      :param expression: The expression to process.
      :return: The expression with constant coefficients, or the original expression
               if that fails.


   .. py:method:: fitness(expression: deap.gp.PrimitiveTree) -> float

      Evaluate the fitness of a candidate expression according to the error between
      the estimated and observed values. Low values are better.

      :param expression: The candidate expression to evaluate.
      :return: The fitness of the individual.


   .. py:method:: make_offspring(population: list, num_offspring: int) -> list

      Create the next generation of individuals.

      :param population: The current population.
      :param num_offspring: The number of new individuals to generate.
      :return: A list of num_offspring new individuals generated through crossover
               and mutation.


   .. py:method:: run_gp(ngen: int, pop_size: int = 20, num_offspring: int = 10, seeds: list = None, repair: bool = True) -> deap.gp.PrimitiveTree

      Execute Genetic Programming to find the best expression using a mu+lambda
      algorithm.

      :param ngen: The maximum number of generations.
      :param pop_size: The population size.
      :param num_offspring: The number of new individuals per generation.
      :param seeds: Seed individuals for the initial population.
      :param repair: Whether to run the linear regression repair operator
                     (defaults to True).
      :return: The best candidate expression.


   .. py:method:: mutate(expression: deap.gp.PrimitiveTree) -> deap.gp.PrimitiveTree

      Mutate individuals to replicate the small changes in DNA that occur in natural
      reproduction. A node will randomly be inserted, removed, or replaced.

      :param expression: The expression to mutate.
      :return: The mutated expression.
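
The mu+lambda scheme that ``run_gp`` is documented to use can be illustrated with a minimal, standard-library-only sketch. This is not the module's implementation: it evolves plain coefficient tuples ``(a, b)`` for the line ``a*x + b`` rather than DEAP expression trees, and the names ``mu_plus_lambda``, ``crossover``, and the toy ``fitness`` and ``mutate`` below are hypothetical stand-ins chosen for illustration. The parameter names ``ngen``, ``pop_size``, and ``num_offspring`` mirror the ``run_gp`` signature only loosely.

```python
import random


def fitness(coeffs, data):
    """Sum of squared errors between estimated and observed values (lower is better)."""
    a, b = coeffs
    return sum((a * x + b - y) ** 2 for x, y in data)


def mutate(coeffs, rng, scale=0.5):
    """Return a copy with one randomly chosen coefficient perturbed (toy mutation)."""
    child = list(coeffs)
    index = rng.randrange(len(child))
    child[index] += rng.gauss(0, scale)
    return tuple(child)


def crossover(parent1, parent2, rng):
    """Uniform crossover: take each coefficient from one of the two parents."""
    return tuple(rng.choice(pair) for pair in zip(parent1, parent2))


def mu_plus_lambda(data, ngen, pop_size=20, num_offspring=10, seed=0):
    """Toy mu+lambda loop (hypothetical helper, not part of causal_testing)."""
    rng = random.Random(seed)
    population = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(pop_size)]
    history = []  # best fitness after each generation
    for _ in range(ngen):
        offspring = []
        for _ in range(num_offspring):
            parent1, parent2 = rng.sample(population, 2)
            offspring.append(mutate(crossover(parent1, parent2, rng), rng))
        # "Plus" selection: parents and offspring compete for the pop_size slots.
        pool = population + offspring
        population = sorted(pool, key=lambda c: fitness(c, data))[:pop_size]
        history.append(fitness(population[0], data))
    return population[0], history


# Observations generated from y = 2x + 1 (illustrative data only).
data = [(x, 2 * x + 1) for x in range(-5, 6)]
best, history = mu_plus_lambda(data, ngen=50)
```

Because the parents stay in the selection pool alongside their offspring, the best fitness recorded in ``history`` can never get worse from one generation to the next. That elitism is the defining property of the "plus" variant, as opposed to a mu,lambda scheme in which only offspring survive.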