Internals#
This reference provides detailed documentation for modules and classes that are important to developers who want to include formulae in their library.
matrices#
These objects are not intended to be used by end users. But developers working with formulae will need some familiarity with them, especially to take advantage of features like obtaining a design matrix from an existing design evaluated with new data.
- class formulae.matrices.ResponseMatrix(term)[source]#
Representation of the response matrix of a model.
- Parameters:
term (Response) – The term that represents the response in the model.
- design_matrix#
A 2-dimensional numpy array containing the values of the response.
- Type:
np.array
- name#
The name of the response term.
- Type:
str
- class formulae.matrices.CommonEffectsMatrix(terms)[source]#
Representation of the design matrix for the common effects of a model.
- Parameters:
terms (list) – A list of Term objects.
- design_matrix#
A 2-dimensional numpy array containing the values of the design matrix.
- Type:
np.array
- evaluated#
Indicates if the terms have been evaluated at least once. The terms must have been evaluated before calling self.evaluate_new_data() because we must know the kind of each term to correctly handle the new data passed and the terms here.
- Type:
bool
- terms#
A dictionary that holds all the terms passed at instantiation. The keys are given by the term names.
- Type:
dict
- __getitem__(term)[source]#
Get the sub-matrix that corresponds to a given term.
- Parameters:
term (str) – The name of the term.
- Returns:
matrix – A 2-dimensional numpy array that represents the sub-matrix corresponding to the term passed.
- Return type:
np.array
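As a rough illustration of how a term name can map to a block of columns, the sketch below keeps a dictionary of column slices next to a small matrix stored as a list of rows. The names slices, design_matrix, and sub_matrix are hypothetical and only mirror the idea; this is not formulae's implementation.

```python
# Hypothetical sketch: a design matrix whose columns are grouped by term.
design_matrix = [
    [1.0, 0.5, 1.0, 0.0],
    [1.0, 1.5, 0.0, 1.0],
    [1.0, 2.5, 0.0, 0.0],
]

# Each term name maps to the columns it occupies (assumed layout).
slices = {
    "Intercept": slice(0, 1),
    "x": slice(1, 2),
    "g": slice(2, 4),
}


def sub_matrix(matrix, term):
    """Return the columns of `matrix` that belong to `term`."""
    cols = slices[term]
    return [row[cols] for row in matrix]


g_block = sub_matrix(design_matrix, "g")
x_block = sub_matrix(design_matrix, "x")
```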
- evaluate(data, env)[source]#
Obtain design matrix for common effects.
Evaluates self.model inside the data mask provided by data and updates self.design_matrix. This method also sets the values of self.data and self.env. It also populates the dictionary self.slices…
- Parameters:
data (pandas.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- evaluate_new_data(data)[source]#
Evaluates common terms with new data and returns a new instance of CommonEffectsMatrix.
This method is intended to be used to obtain design matrices for new data and obtain out of sample predictions. Stateful transformations are properly handled if present in any of the terms, which means parameters involved in the transformation are not overwritten with the new data.
- Parameters:
data (pandas.DataFrame) – The data frame where variables are taken from
- Returns:
new_instance – A new instance of CommonEffectsMatrix whose design matrix is obtained with the values in the new data set.
- Return type:
CommonEffectsMatrix
- class formulae.matrices.GroupEffectsMatrix(terms)[source]#
Representation of the design matrix for the group specific effects of a model.
The sub-matrix that corresponds to a specific group effect can be accessed by self[term_name], for example self["1|g"].
- Parameters:
terms (list) – A list of GroupSpecificTerm objects.
- design_matrix#
The design matrix in CSR format.
- Type:
scipy.sparse.csr_matrix
- evaluated#
Indicates if the terms have been evaluated at least once. The terms must have been evaluated before calling self.evaluate_new_data() because we must know the kind of each term to correctly handle the new data passed and the terms here.
- Type:
bool
- terms#
A dictionary that holds all the group specific terms. The keys are given by the term names.
- Type:
dict
- __getitem__(term)[source]#
Get the sub-matrix that corresponds to a given term.
- Parameters:
term (str) – The name of a group specific term.
- Returns:
matrix – The sub-matrix corresponding to the term passed.
- Return type:
scipy.sparse.csr_matrix
- evaluate(data, env)[source]#
Evaluate group specific terms.
This evaluates self.terms inside the data mask provided by data and the environment env. It updates self.design_matrix with the result from the evaluation of each term.
This method also sets the values of self.data and self.env. It also populates the dictionary self.terms_info with information related to each term, such as the kind, the columns and rows they occupy in the design matrix, and the names of the columns.
- Parameters:
data (pandas.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- evaluate_new_data(data)[source]#
Evaluates group specific terms with new data and returns a new instance of GroupEffectsMatrix.
This method is intended to be used to obtain design matrices for new data and obtain out of sample predictions. Stateful transformations are properly handled if present in any of the group specific terms, which means parameters involved in the transformation are not overwritten with the new data.
- Parameters:
data (pandas.DataFrame) – The data frame where variables are taken from
- Returns:
new_instance – A new instance of GroupEffectsMatrix whose design matrix is obtained with the values in the new data set.
- Return type:
GroupEffectsMatrix
terms#
These are internal components of the model that are not expected to be used by end users. Developers won’t (normally) need to access these objects either. But reading this documentation may help you understand how formulae works, with both its advantages and disadvantages.
- class formulae.terms.Variable(name, level=None, is_response=False)[source]#
Representation of a variable in a model Term.
This class and Call are the atomic components of a model term.
- Parameters:
name (str) – The identifier of the variable.
level (str) – The level to use as reference. Allows the notation variable["level"] to indicate which event should be modeled as success in binary response models. Can only be used with response terms. Defaults to None.
is_response (bool) – Indicates whether this variable represents a response. Defaults to False.
- eval_categoric(x, spans_intercept)[source]#
Finishes evaluation of a categoric variable.
Converts the intermediate values in x into a numpy array of shape (n, p), where n is the number of observations and p the number of dummy variables used in the numeric representation of the categorical variable.
- Parameters:
x (np.ndarray or pd.Series) – The intermediate values of the variable.
spans_intercept (bool) – Indicates if the encoding of categorical variables spans the intercept or not. Omitted when the variable is numeric.
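The effect of spans_intercept on the number of dummy columns p can be sketched with plain lists. This is only an illustration under a stated assumption: that a True value means every level gets its own column (full encoding) and a False value drops the first level as the reference (reduced encoding). The function encode_categoric is hypothetical.

```python
def encode_categoric(values, spans_intercept):
    """Return (levels, rows) for a dummy encoding of `values`.

    Assumption: when `spans_intercept` is True every level gets a column
    (full encoding); otherwise the first level is dropped and acts as the
    reference (reduced encoding).
    """
    levels = sorted(set(values))
    kept = levels if spans_intercept else levels[1:]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return levels, rows


observed = ["b", "a", "c", "a"]
levels, full = encode_categoric(observed, spans_intercept=True)
_, reduced = encode_categoric(observed, spans_intercept=False)
```

With three levels, the full encoding has p = 3 columns and the reduced encoding has p = 2.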
- eval_new_data(data_mask)[source]#
Evaluates the variable with new data.
This method evaluates the variable within a new data mask. If this object is categorical, original encoding is remembered (and checked) when carrying out the new evaluation.
- Parameters:
data_mask (pd.DataFrame) – The data frame where variables are taken from
- Returns:
result – The rules for the shape of this array are the rules for self.eval_numeric() and self.eval_categoric(). The first applies for numeric variables, the second for categoric ones.
- Return type:
np.array
- eval_new_data_categoric(x)[source]#
Evaluates the variable with new data when variable is categoric.
This method also checks the levels observed in the new data frame are included within the set of the levels of the original data set. If not, an error is raised.
- Parameters:
x (np.ndarray or pd.Series) – The intermediate values of the variable.
- Returns:
result – Numeric numpy array of shape (n, p), where n is the number of observations and p the number of dummy variables used in the numeric representation of the categorical variable.
- Return type:
np.array
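A minimal sketch of the level check described above, assuming the original levels were stored as a list during the first evaluation. The function name, the use of ValueError, and the reduced-encoding rule are assumptions made for illustration, not formulae's actual code.

```python
def encode_new_data(values, original_levels, spans_intercept):
    """Encode `values` with the levels remembered from the original data."""
    unseen = set(values) - set(original_levels)
    if unseen:
        # Assumed error type; the text only says that an error is raised.
        raise ValueError(f"Levels not present in the original data: {sorted(unseen)}")
    kept = original_levels if spans_intercept else original_levels[1:]
    return [[1 if value == level else 0 for level in kept] for value in values]


# The new data only contains "a" and "c", yet the number of columns is
# still determined by the original levels.
encoded = encode_new_data(["c", "a"], ["a", "b", "c"], spans_intercept=False)
```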
- eval_numeric(x)[source]#
Finishes evaluation of a numeric variable.
Converts the intermediate values in x into a 1d numpy array.
- Parameters:
x (np.ndarray or pd.Series) – The intermediate values of the variable.
- property labels#
Obtain labels of the columns in the design matrix associated with this Variable
- set_data(spans_intercept=None)[source]#
Obtains and stores the final data object related to this variable.
- Parameters:
spans_intercept (bool) – Indicates if the encoding of categorical variables spans the intercept or not. Omitted when the variable is numeric.
- set_type(data_mask)[source]#
Determines the type of the variable.
Looks for the name of the variable in data_mask and sets the .kind property to "numeric" or "categoric" depending on the type of the variable. It also stores the result of the intermediate evaluation in self._intermediate_data.
- Parameters:
data_mask (pd.DataFrame) – The data frame where variables are taken from
- property var_names#
Returns the name of the variable as a set.
This is used to determine which variables of the data set being used are actually used in the model. This allows us to subset the original data set and only raise errors regarding missing values when the missingness happens in variables used in the model.
- class formulae.terms.Call(call, is_response=False)[source]#
Representation of a call in a model Term.
This class and Variable are the atomic components of a model term.
This object supports stateful transformations defined in formulae.transforms. A transformation of this type defines its parameters the first time it is called, and then can be used to recompute the transformation with memorized parameter values. This behavior is useful when implementing a predict method and using transformations such as center(x) or scale(x). center(x) memorizes the value of the mean, and scale(x) memorizes both the mean and the standard deviation.
- Parameters:
call (formulae.terms.call_resolver.LazyCall) – The call expression returned by the parser.
is_response (bool) – Indicates whether this call represents a response. Defaults to False.
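The idea of a stateful transformation can be sketched with a small callable class. The Center class below is hypothetical and written only for illustration; it is not the implementation found in formulae.transforms.

```python
class Center:
    """Hypothetical stateful transform that memorizes the mean."""

    def __init__(self):
        self.mean = None  # set on the first call only

    def __call__(self, x):
        if self.mean is None:
            self.mean = sum(x) / len(x)
        return [value - self.mean for value in x]


center = Center()
train = center([1.0, 2.0, 3.0])  # first call memorizes the mean (2.0)
new = center([10.0, 20.0])       # later calls reuse the memorized mean
```

Because the mean is not recomputed on the second call, new data is centered around the value learned from the original data, which is what a predict method needs.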
- accept(visitor)[source]#
Accept method called by a visitor.
Visitors are those available in call_utils.py, and are used to work with call terms.
- eval_categoric(x, spans_intercept)[source]#
Finishes evaluation of categoric call.
First, it checks whether the intermediate evaluation returned is ordered. If not, it creates a category where the levels are those observed in the variable. They are sorted according to sorted() rules.
Then, it determines the reference level as well as all the other levels. If the variable is a response, the value returned is a dummy with 1s for the reference level and 0s elsewhere. If it is not a response variable, it determines the matrix of dummies according to the levels and the encoding passed.
- Parameters:
x (np.ndarray or pd.Series) – The intermediate values of the variable.
spans_intercept (bool) – Indicates if the encoding of categorical variables spans the intercept or not. Omitted when the variable is numeric.
- eval_new_data(data_mask)[source]#
Evaluates the function call with new data.
This method evaluates the function call within a new data mask. If the transformation applied is a stateful transformation, it uses the proper object that remembers all parameters or settings that may have been set in a first pass.
- Parameters:
data_mask (pd.DataFrame) – The data frame where variables are taken from
- Returns:
result – The rules for the shape of this array are the rules for self.eval_numeric() and self.eval_categoric(). The first applies for numeric calls, the second for categoric ones.
- Return type:
np.array
- eval_new_data_categoric(x)[source]#
Evaluates the call with new data when the result of the call is categoric.
This method also checks the levels observed in the new data frame are included within the set of the levels of the result of the original call. If not, an error is raised.
- Parameters:
x (np.ndarray or pd.Series) – The intermediate values of the variable.
- Returns:
result – Numeric numpy array of shape (n, p), where n is the number of observations and p the number of dummy variables used in the numeric representation of the categorical variable.
- Return type:
np.array
- eval_numeric(x)[source]#
Finishes evaluation of a numeric call.
Converts the intermediate values of the call into a numpy array of shape (n, 1), where n is the number of observations. This method is used both in self.set_data and in self.eval_new_data.
- Parameters:
x (np.ndarray or pd.Series) – The intermediate values resulting from the call.
- Returns:
result – A dictionary with keys "value" and "kind". The first contains the result of the evaluation, and the latter is equal to "numeric".
- Return type:
dict
- property labels#
Obtain labels of the columns in the design matrix associated with this Call
- set_data(spans_intercept=False)[source]#
Finishes the evaluation of the call according to its type.
It does not support multi-level categoric responses yet. If self.is_response is True and the variable is of a categoric type, this method returns a 1d array of 0-1 instead of a matrix.
In practice, it just completes the evaluation that started with self.set_type().
- Parameters:
spans_intercept (bool) – Indicates if the encoding of categorical variables spans the intercept or not. Omitted when the variable is numeric.
- set_type(data_mask, env)[source]#
Evaluates function and determines the type of the result of the call.
Evaluates the function call and sets the .kind property to "numeric" or "categoric" depending on the type of the result. It also stores the intermediate result of the evaluation in ._intermediate_data to prevent us from computing the same thing more than once.
- Parameters:
data_mask (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- property var_names#
Returns the names of the variables involved in the call, not including the callee.
This is used to determine which variables of the data set being used are actually used in the model. This allows us to subset the original data set and only raise errors regarding missing values when the missingness happens in variables used in the model.
Uses a visitor of class CallVarsExtractor that walks through the components of the call and returns a list with the name of the variables in the call.
- Returns:
result – A list of strings with the names of the variables in the call, not including the name of the callee.
- Return type:
list
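To give a feel for what walking a call and skipping the callee means, here is a sketch that swaps in Python's standard ast module in place of formulae's own parser and CallVarsExtractor visitor. The function call_var_names is hypothetical, and formulae expressions are not guaranteed to be valid Python, so this is only an analogy.

```python
import ast


def call_var_names(expression):
    """Collect variable names used in `expression`, skipping callees."""
    tree = ast.parse(expression, mode="eval")
    callees = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            # Mark every node that makes up the called function itself.
            callees.update(id(child) for child in ast.walk(node.func))
    names = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and id(node) not in callees
    }
    return sorted(names)


names = call_var_names("np.log(x + scale(y, center=z))")
```

Here np, log, and scale are treated as parts of callees, so only the data variables remain.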
- class formulae.terms.Term(*components)[source]#
Representation of a model term.
Terms are made of one or more components. Components are instances of Variable or Call. Terms with only one component are known as main effects and terms with more than one component are known as interaction effects. The order of the interaction is given by the number of components in the term.
- data#
The values associated with the term as they go into the design matrix.
- Type:
np.ndarray
- kind#
Indicates the type of the term. Can be one of "numeric", "categoric", or "interaction".
- Type:
str
- name#
The name of the term as it was originally written in the model formula.
- Type:
str
- __add__(other)[source]#
Addition operator. Analogous to set union.
"x + x" is equal to just "x".
"x + y" is equal to a model with both x and y.
"x + (y + z)" adds x to a model already containing y and z.
- __matmul__(other)[source]#
Simple interaction operator.
This operator is actually invoked as : but internally passed as @ because there is no : operator in Python.
"x : x" is equal to "x".
"x : y" is the interaction between "x" and "y".
"x:(y:z)" is equal to "x:y:z".
"(x:y):u" is equal to "x:y:u".
"(x:y):(u + v)" is equal to "x:y:u + x:y:v".
- __mul__(other)[source]#
Full interaction operator.
This operator includes both the interaction as well as the main effects involved in the interaction. It is a shortcut for x + y + x:y.
"x * x" is equal to "x".
"x * y" is equal to "x + y + x:y".
"x:y * u" is equal to "x:y + u + x:y:u".
"x:y * u:v" is equal to "x:y + u:v + x:y:u:v".
"x:y * (u + v)" is equal to "x:y + u + v + x:y:u + x:y:v".
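The expansion rules for the simple and full interaction operators can be mimicked with tuples of component names. This is a toy model under stated assumptions: terms are plain tuples, and the helpers interact, add, full_interaction, and show are hypothetical names that do not exist in formulae.

```python
def interact(a, b):
    """Simple interaction `a:b`; repeated components collapse, so x:x is x."""
    return a + tuple(c for c in b if c not in a)


def add(*terms):
    """Union of terms that keeps first-seen order, so x + x is x."""
    model = []
    for term in terms:
        if term not in model:
            model.append(term)
    return model


def full_interaction(a, b):
    """`a * b` expands to `a + b + a:b`."""
    return add(a, b, interact(a, b))


def show(model):
    """Render a list of terms using formula notation."""
    return " + ".join(":".join(term) for term in model)


x, y, u, v = ("x",), ("y",), ("u",), ("v",)
xy, uv = interact(x, y), interact(u, v)
expanded = show(full_interaction(xy, uv))
```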
- __or__(other)[source]#
Group-specific operator. Creates a group-specific term.
Intercepts are implicitly added.
"x|g" is equal to "(1|g) + (x|g)".
Distributive over the right hand side.
"(x|g + h)" is equal to "(1|g) + (1|h) + (x|g) + (x|h)".
- __pow__(other)[source]#
Power operator.
It leaves the term as it is. For a power in the math sense do I(x ** n) or {x ** n}.
- __sub__(other)[source]#
Subtraction operator. Analogous to set difference.
"x - x" returns an empty model.
"x - y" returns the term "x".
"x - (y + z)" returns the term "x".
- __truediv__(other)[source]#
Division interaction operator.
"x / x" is equal to just "x".
"x / y" is equal to "x + x:y".
"x / z:y" is equal to "x + x:z:y".
"x / (z + y)" is equal to "x + x:z + x:y".
"x:y / u:v" is equal to "x:y + x:y:u:v".
"x:y / (u + v)" is equal to "x:y + x:y:u + x:y:v".
- eval_new_data(data)[source]#
Evaluates the term with new data.
Calls .eval_new_data() method on each component in the term and combines the results appropriately.
- Parameters:
data (pd.DataFrame) – The data frame where variables are taken from
- Returns:
result – The values resulting from evaluating this term using the new data.
- Return type:
np.array
- get_component(name)[source]#
Returns a component by name.
- Parameters:
name (str) – The name of the component to return.
- Returns:
component – The component with name name.
- Return type:
Variable or Call
- property labels#
Obtain labels of the columns in the design matrix associated with this Term
- property levels#
Obtain levels of the columns in the design matrix associated with this Term
It is like .labels, without the name of the terms
- set_data(spans_intercept)[source]#
Obtains and stores the final data object related to this term.
Calls .set_data() method on each component in the term. Then, it uses the .data attribute on each of them to build self.data and self.metadata.
- Parameters:
encoding (dict or bool) – Indicates if it uses full or reduced encoding when the type of the variable is categoric.
- set_type(data, env)[source]#
Set type of the components in the term.
Calls .set_type() method on each component in the term. For those components of class Variable it only passes the data mask. For Call objects it also passes the evaluation environment.
- Parameters:
data (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- property spans_intercept#
Does this term span the intercept?
True if all the components span the intercept
- property var_names#
Returns the name of the variables in the term as a set.
Loops through each component and updates the set with the .var_names of each component.
- Returns:
var_names – The names of the variables involved in the term.
- Return type:
set
- class formulae.terms.GroupSpecificTerm(expr, factor)[source]#
Representation of a group specific term.
Group specific terms are of the form (expr | factor). The expression expr is evaluated as a model formula with only common effects and produces a model matrix following the rules for common terms. factor is inspired by factors in R, but here it is evaluated as an ordered pandas.CategoricalDtype object.
The pipe operator | works as in the R package lme4. As its authors say: “One way to think about the vertical bar operator is as a special kind of interaction between the model matrix and the grouping factor. This interaction ensures that the columns of the model matrix have different effects for each level of the grouping factor”
- Parameters:
- data#
The values associated with the term as they go into the design matrix.
- Type:
np.ndarray
- metadata#
Metadata associated with the term. If "numeric" or "categoric" it holds additional information in the component .data attribute. If "interaction", the keys are the name of the components and the values are dictionaries holding the metadata.
- Type:
dict
- kind#
Indicates the type of the term. Can be one of "numeric", "categoric", or "interaction".
- Type:
str
- eval_new_data(data)[source]#
Evaluates the term with new data.
Converts the variable in factor to the type remembered from the first evaluation and produces the design matrix for this grouping, calls .eval_new_data() on self.expr to obtain the design matrix for the expr side, then computes the design matrix corresponding to the group specific effect.
- Parameters:
data (pd.DataFrame) – The data frame where variables are taken from.
- Returns:
Zi
- Return type:
scipy.sparse.csr_matrix
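One way to picture the combination of the expr matrix and the grouping factor is a row-wise Kronecker product between the indicator row of the group and the corresponding row of the expr matrix. The sketch below uses dense Python lists in place of a CSR matrix, and both the function name and the column ordering are assumptions made for illustration.

```python
def group_design_matrix(X, groups, levels):
    """Build Z row by row as the Kronecker product of J[i] and X[i].

    X: list of rows from the `expr` side, shape (n, p).
    groups: the grouping factor value for each of the n rows.
    levels: the levels remembered from the first evaluation.
    """
    Z = []
    for x_row, group in zip(X, groups):
        indicator = [1 if group == level else 0 for level in levels]
        Z.append([j * x for j in indicator for x in x_row])
    return Z


X = [[1, 0.5], [1, 1.5], [1, 2.5]]  # intercept and slope columns
Z = group_design_matrix(X, ["a", "b", "a"], levels=["a", "b"])
```

Each row of Z is zero everywhere except in the block of columns that belongs to the row's group, so every level of the grouping factor gets its own copy of the expr columns.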
- property name#
Obtain string representation of the name of the term.
- Returns:
name – The name of the term, such as 1|g or var|g.
- Return type:
str
- property var_names#
Returns the name of the variables in the term as a set.
Obtains both the variables in the expr as well as the variables in factor.
- Returns:
var_names – The names of the variables involved in the term.
- Return type:
set
- class formulae.terms.Intercept[source]#
Internal representation of a model intercept.
- __add__(other)[source]#
Addition operator.
Generally this operator is used to explicitly add an intercept to a model. There may be cases where the result is not a Model, or does not contain an intercept.
"1 + 0" and "1 + (-1)" return an empty model.
"1 + 1" returns a single intercept.
"1 + x" and "1 + (x|g)" return a model with both the term and the intercept.
"1 + (x + y)" adds an intercept to the model given by x and y.
- __or__(other)[source]#
Group-specific interaction-like operator. Creates a group-specific intercept.
This operation is usually surrounded by parentheses. They are not actually required, but they are always used because | has lower precedence than any of the other operators except ~.
This operator is distributed over the right-hand side, which means (1|g + h) is equivalent to (1|g) + (1|h).
- __sub__(other)[source]#
Subtraction operator.
This operator removes an intercept from a model if the given model has an intercept.
"1 - 1" returns an empty model.
"1 - 0" and "1 - (-1)" return an intercept.
"1 - (x + y)" returns the model given by x and y unchanged.
"1 - (1 + x + y)" returns the model given by x and y, removing the intercept.
- eval_new_data(data)[source]#
Returns data for a new intercept.
The length of the new intercept is given by the number of rows in data.
- set_data(encoding)[source]#
Creates data for the intercept.
It sets self.data equal to a numpy array of ones of shape (self.len, 1).
- property var_names#
Returns empty set, no variables are used in the intercept.
- class formulae.terms.NegatedIntercept[source]#
Internal representation of the opposite of a model intercept.
This object is created whenever we use "0" or "-1" in a model formula. It is not expected to appear in a final model. It’s here to help us make operations using the Intercept and decide when to keep it and when to drop it.
- __add__(other)[source]#
Addition operator.
Generally this operator is used to explicitly remove an intercept from a model.
"0 + 1" returns an empty model.
"0 + 0" returns a negated intercept.
"0 + x" returns a model that includes the negated intercept.
"0 + (x + y)" adds the negated intercept to the model given by x and y.
Even if the final result contains the negated intercept, for example if we do something like "y ~ 0 + x + y + 0", the Model that is obtained removes any negated intercepts that may have been left. They just don’t make sense in a model.
- class formulae.terms.Response(term)[source]#
Representation of a response term.
It is mostly a wrapper around Term.
- Parameters:
term (Term) – The term we want to take as response in the model. Must contain only one component.
- __add__(other)[source]#
Modelled as operator.
The operator is ~, but since it is not an operator in Python, we internally replace it with +. It means the LHS is taken as the response, and the RHS as the predictor.
- property var_names#
Returns the name of the variables in the response as a set.
- class formulae.terms.Model(*terms, response=None)[source]#
Representation of a model.
- Parameters:
- __add__(other)[source]#
Addition operator. Analogous to set union.
Adds terms to the model and returns the model.
- Returns:
self – The same model object with the added term(s).
- Return type:
Model
- __matmul__(other)[source]#
Simple interaction operator.
"(x + y) : (u + v)" is equal to "x:u + x:v + y:u + y:v".
"(x + y) : u" is equal to "x:u + y:u".
"(x + y) : f(u)" is equal to "x:f(u) + y:f(u)".
- Returns:
model – A new instance of the model with all the interaction terms computed.
- Return type:
Model
- __mul__(other)[source]#
Full interaction operator.
"(x + y) * (u + v)" is equal to "x + y + u + v + x:u + x:v + y:u + y:v".
"(x + y) * u" is equal to "x + y + u + x:u + y:u".
- Returns:
model – A new instance of the model with all the interaction terms computed.
- Return type:
Model
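The way interaction operators distribute over sums of terms can be sketched with itertools.product. Terms are represented as plain strings here, and the two helper functions are hypothetical; they only reproduce the expansions listed for the simple and full interaction operators on models.

```python
from itertools import product


def interact_models(left, right):
    """`(a + b) : (u + v)`: interact every left term with every right term."""
    return [f"{a}:{b}" for a, b in product(left, right)]


def full_interact_models(left, right):
    """`(a + b) * (u + v)`: main effects followed by all interactions."""
    return left + right + interact_models(left, right)


simple = " + ".join(interact_models(["x", "y"], ["u", "v"]))
full = " + ".join(full_interact_models(["x", "y"], ["u"]))
```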
- __or__(other)[source]#
Group specific term operator.
Only models "0 + x" arrive here.
"(0 + x | g)" is equal to "(x|g)".
"(0 + x | g + y)" is equal to "(x|g) + (x|y)".
There are several edge cases to handle here. See in-line comments.
- Returns:
model – A new instance of the model with all the terms computed.
- Return type:
Model
- __pow__(other)[source]#
Power of a set made of Term objects.
Computes all interactions up to order n between the terms in the set.
"(x + y + z) ** 2" is equal to "x + y + z + x:y + x:z + y:z".
- Returns:
model – A new instance of the model with all the terms computed.
- Return type:
Model
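Computing all interactions up to order n is a combinatorial expansion, which the following sketch reproduces with itertools.combinations. The function power and the string representation of terms are assumptions made for illustration only.

```python
from itertools import combinations


def power(terms, n):
    """All main effects plus interactions of order 2..n between `terms`."""
    model = []
    for order in range(1, n + 1):
        model.extend(":".join(combo) for combo in combinations(terms, order))
    return model


squared = " + ".join(power(["x", "y", "z"], 2))
cubed = power(["x", "y", "z"], 3)
```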
- __sub__(other)[source]#
Subtraction operator. Analogous to set difference.
"(x + y) - (x + u)" is equal to "y".
"(x + y) - x" is equal to "y".
"(x + y + (1 | g)) - (1 | g)" is equal to "x + y".
- Returns:
self – The same model object with the removed term(s).
- Return type:
Model
- __truediv__(other)[source]#
Division interaction operator.
"(x + y) / z" is equal to "x + y + x:y:z".
"(x + y) / (u + v)" is equal to "x + y + x:y:u + x:y:v".
- Returns:
model – A new instance of the model with all the terms computed.
- Return type:
Model
- add_response(term)[source]#
Add response term to model description.
This method is called when something like "y ~ x + z" appears in a model formula.
This method is called via special methods such as Response.__add__().
- Returns:
self – The same model object but now with a response term.
- Return type:
Model
- add_term(term)[source]#
Add term to model description.
The term added can be of class Intercept, Term, or GroupSpecificTerm. It appends the new term object to the list of common terms or group specific terms as appropriate.
This method is called via special methods such as __add__().
- Returns:
self – The same model object but now containing the new term.
- Return type:
Model
- property common_components#
Components in common terms in the model.
- Returns:
components – A list containing all components from common terms in the model.
- Return type:
list
- eval(data, env)[source]#
Evaluates terms in the model.
- Parameters:
data (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- set_types(data, env)[source]#
Set the type of the terms in the model.
Calls .set_type() method on each term in the model.
- Parameters:
data (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- property terms#
Terms in the model.
- Returns:
terms – A list containing both common and group specific terms.
- Return type:
list
- property var_names#
Get the name of the variables in the model.
- Returns:
var_names – The names of all variables in the model.
- Return type:
set
call_resolver#
- class formulae.terms.call_resolver.LazyValue(value, lexeme)[source]#
Lazy representation of a value in Python.
This object holds a value (a string or a number). It returns its value only when it is evaluated via .eval().
- Parameters:
value (str or numeric) – The value it holds.
lexeme (str) – The string that generated the value it holds
- class formulae.terms.call_resolver.LazyVariable(name)[source]#
Lazy variable name.
The variable represented in this object does not hold any value until it is explicitly evaluated within a data mask and an evaluation environment.
- Parameters:
name (str) – The name of the variable it represents.
- eval(data_mask, env)[source]#
Evaluates variable.
First it looks for the variable in data_mask. If not found there, it looks in env. Then it just returns the value the variable represents in either the data mask or the evaluation environment.
- Parameters:
data_mask (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- Returns:
The value represented by this name in either the data mask or the environment.
- Return type:
result
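The lookup order described for this method can be sketched with two dictionaries standing in for the data frame and the environment. The function eval_variable and the NameError raised when nothing is found are assumptions; the text above does not say what happens when the name is missing from both places.

```python
def eval_variable(name, data_mask, env):
    """Look up `name` in `data_mask` first and fall back to `env`."""
    if name in data_mask:
        return data_mask[name]
    if name in env:
        return env[name]
    # Assumed behavior for a name found in neither place.
    raise NameError(f"Variable '{name}' was not found")


data_mask = {"x": [1, 2, 3]}      # stands in for a data frame's columns
env = {"x": "shadowed", "k": 10}  # stands in for the environment

from_data = eval_variable("x", data_mask, env)
from_env = eval_variable("k", data_mask, env)
```

The data mask takes precedence, so the value of x in env is never returned.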
- class formulae.terms.call_resolver.LazyOperator(op, *args)[source]#
Unary and binary lazy operators.
Function calls like a + b are converted into a LazyOperator that is resolved when you explicitly evaluate it.
- Parameters:
op (builtin_function_or_method) – An operator in the operator built-in module. It can be one of add, pos, sub, neg, pow, mul, and truediv.
args – One or two lazy instances.
- eval(data_mask, env)[source]#
Evaluates the operation.
Evaluates the arguments involved in the operation, calls the Python operator, and returns the result.
- Parameters:
data_mask (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- Returns:
The value obtained from the operator call.
- Return type:
result
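A minimal sketch of deferred operator evaluation follows. The three classes are hypothetical stand-ins written for illustration, with dictionaries used in place of the data frame and the environment; they are not the classes from call_resolver.

```python
import operator


class LazyConstant:
    """Hypothetical stand-in for a lazy value."""

    def __init__(self, value):
        self.value = value

    def eval(self, data_mask, env):
        return self.value


class LazyName:
    """Hypothetical stand-in for a lazy variable backed by dictionaries."""

    def __init__(self, name):
        self.name = name

    def eval(self, data_mask, env):
        return data_mask[self.name] if self.name in data_mask else env[self.name]


class LazyOp:
    """Stores an operator and its lazy arguments until `eval` is called."""

    def __init__(self, op, *args):
        self.op = op
        self.args = args

    def eval(self, data_mask, env):
        values = [arg.eval(data_mask, env) for arg in self.args]
        return self.op(*values)


# Represents `-(a + 2)`; nothing is computed until `eval` runs.
expr = LazyOp(operator.neg, LazyOp(operator.add, LazyName("a"), LazyConstant(2)))
result = expr.eval({"a": 5}, {})
```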
- class formulae.terms.call_resolver.LazyCall(callee, args, kwargs)[source]#
Lazy representation of a function call.
This class represents a function that can be a stateful transform (a function with memory) whose arguments can also be stateful transforms.
To evaluate these functions we don’t create a string representing Python code and let eval() run it. We take care of all the steps of the evaluation to make sure all the possibly nested stateful transformations are handled correctly.
- Parameters:
callee (str) – The name of the function
args (list) – A list of lazy objects that are evaluated when calling the function this object represents.
kwargs (dict) – A dictionary of named arguments that are evaluated when calling the function this object represents.
- eval(data_mask, env)[source]#
Evaluate the call.
This method first evaluates all its arguments, which are themselves lazy objects, and then proceeds to evaluate the call it represents.
- Parameters:
data_mask (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- Returns:
The result of the call evaluation.
- Return type:
result
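The evaluation order described above, arguments first and then the call itself, can be sketched as follows. The classes Name and CallSketch are hypothetical, the callee is assumed to be looked up by name in a dictionary that plays the role of the environment, and stateful transforms are left out to keep the example short.

```python
class Name:
    """Hypothetical lazy variable: data mask first, then environment."""

    def __init__(self, name):
        self.name = name

    def eval(self, data_mask, env):
        return data_mask[self.name] if self.name in data_mask else env[self.name]


class CallSketch:
    """Evaluates its lazy arguments first, then calls the named function."""

    def __init__(self, callee, args, kwargs):
        self.callee = callee
        self.args = args
        self.kwargs = kwargs

    def eval(self, data_mask, env):
        func = env[self.callee]  # assumed: functions live in the environment
        args = [arg.eval(data_mask, env) for arg in self.args]
        kwargs = {key: arg.eval(data_mask, env) for key, arg in self.kwargs.items()}
        return func(*args, **kwargs)


env = {"total": sum, "times": lambda value, factor: value * factor, "k": 2}

# Represents the nested call `times(total(x), factor=k)`.
inner = CallSketch("total", [Name("x")], {})
call = CallSketch("times", [inner], {"factor": Name("k")})
result = call.eval({"x": [1, 2, 3]}, env)
```

Because arguments are themselves lazy objects, nested calls are resolved from the inside out without building or executing a string of Python code.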