Internals#
This reference provides detailed documentation for modules and classes that are important to developers who want to include formulae in their library.
matrices
#
These objects are not intended to be used by end users. However, developers working with formulae will need some familiarity with them, especially to take advantage of features such as obtaining a design matrix from an existing design evaluated with new data.
- class formulae.matrices.ResponseMatrix(term)[source]#
Representation of the response matrix of a model.
- Parameters
term (Response) – The term that represents the response in the model.
- design_matrix#
A 2-dimensional numpy array containing the values of the response.
- Type
np.array
- name#
The name of the response term.
- Type
string
- class formulae.matrices.CommonEffectsMatrix(terms)[source]#
Representation of the design matrix for the common effects of a model.
- Parameters
terms (list) – A list of
Term
objects.
- design_matrix#
A 2-dimensional numpy array containing the values of the design matrix.
- Type
np.array
- evaluated#
Indicates if the terms have been evaluated at least once. The terms must have been evaluated before calling self.evaluate_new_data() because we must know the kind of each term to correctly handle the new data passed and the terms here.
- Type
bool
- terms#
A dictionary that holds all the terms passed at instantiation. The keys are given by the term names.
- Type
dict
- __getitem__(term)[source]#
Get the sub-matrix that corresponds to a given term.
- Parameters
term (string) – The name of the term.
- Returns
matrix – A 2-dimensional numpy array that represents the sub-matrix corresponding to the term passed.
- Return type
np.array
- evaluate(data, env)[source]#
Obtain design matrix for common effects.
Evaluates self.model inside the data mask provided by data and updates self.design_matrix. This method also sets the values of self.data and self.env. It also populates the dictionary self.slices …
- Parameters
data (pandas.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
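The relationship between a design matrix, a per-term lookup dictionary and `__getitem__` can be sketched in plain Python. This is only an illustration and not the library's code: the layout of `slices` (term name mapped to a column slice) is an assumption, and `evaluated_terms` and `sub_matrix` are hypothetical names.

```python
# Hypothetical stand-in: each term is already evaluated to a list of columns.
evaluated_terms = {
    "Intercept": [[1.0, 1.0, 1.0]],
    "x": [[0.5, 1.5, 2.5]],
    "g": [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],  # two dummy columns
}

columns = []
slices = {}  # assumed layout: term name -> slice of column indices
for name, term_columns in evaluated_terms.items():
    start = len(columns)
    columns.extend(term_columns)
    slices[name] = slice(start, len(columns))

# Row-major design matrix: one row per observation.
design_matrix = [list(row) for row in zip(*columns)]


def sub_matrix(name):
    """Return the columns of the design matrix that belong to one term."""
    return [row[slices[name]] for row in design_matrix]
```

With this layout, `sub_matrix("g")` returns only the two dummy columns of `g`, which is the kind of lookup that indexing the matrix object by term name provides.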
- evaluate_new_data(data)[source]#
Evaluates common terms with new data and returns a new instance of CommonEffectsMatrix.
This method is intended to be used to obtain design matrices for new data and obtain out-of-sample predictions. Stateful transformations are properly handled if present in any of the terms, which means parameters involved in the transformation are not overwritten with the new data.
- Parameters
data (pandas.DataFrame) – The data frame where variables are taken from
- Returns
new_instance – A new instance of CommonEffectsMatrix whose design matrix is obtained with the values in the new data set.
- Return type
CommonEffectsMatrix
- class formulae.matrices.GroupEffectsMatrix(terms)[source]#
Representation of the design matrix for the group specific effects of a model.
The sub-matrix that corresponds to a specific group effect can be accessed by self[term_name], for example self["1|g"].
- Parameters
terms (list) – A list of
GroupSpecificTerm
objects.
- design_matrix#
A 2-dimensional numpy array with the values of the design matrix.
- Type
np.array
- evaluated#
Indicates if the terms have been evaluated at least once. The terms must have been evaluated before calling self.evaluate_new_data() because we must know the kind of each term to correctly handle the new data passed and the terms here.
- Type
bool
- terms#
A dictionary that holds all the group specific terms. The keys are given by the term names.
- Type
dict
- __getitem__(term)[source]#
Get the sub-matrix that corresponds to a given term.
- Parameters
term (string) – The name of a group specific term.
- Returns
matrix – A 2-dimensional numpy array that represents the sub-matrix corresponding to the term passed.
- Return type
np.array
- evaluate(data, env)[source]#
Evaluate group specific terms.
This evaluates self.terms inside the data mask provided by data and the environment env. It updates self.design_matrix with the result from the evaluation of each term.
This method also sets the values of self.data and self.env. It also populates the dictionary self.terms_info with information related to each term, such as the kind, the columns and rows they occupy in the design matrix and the names of the columns.
- Parameters
data (pandas.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- evaluate_new_data(data)[source]#
Evaluates group specific terms with new data and returns a new instance of GroupEffectsMatrix.
This method is intended to be used to obtain design matrices for new data and obtain out-of-sample predictions. Stateful transformations are properly handled if present in any of the group specific terms, which means parameters involved in the transformation are not overwritten with the new data.
- Parameters
data (pandas.DataFrame) – The data frame where variables are taken from
- Returns
new_instance – A new instance of GroupEffectsMatrix whose design matrix is obtained with the values in the new data set.
- Return type
GroupEffectsMatrix
terms
#
These are internal components of the model that are not expected to be used by end users. Developers won’t (normally) need to access these objects either. But reading this documentation may help you understand how formulae works, with both its advantages and disadvantages.
- class formulae.terms.Variable(name, level=None, is_response=False)[source]#
Representation of a variable in a model Term.
This class and Call are the atomic components of a model term.
- Parameters
name (string) – The identifier of the variable.
level (string) – The level to use as reference. Allows the notation variable["level"] to indicate which event should be modeled as success in binary response models. Can only be used with response terms. Defaults to None.
is_response (bool) – Indicates whether this variable represents a response. Defaults to False.
- eval_categoric(x, spans_intercept)[source]#
Finishes evaluation of a categoric variable.
Converts the intermediate values in x into a numpy array of shape (n, p), where n is the number of observations and p the number of dummy variables used in the numeric representation of the categorical variable.
- Parameters
x (np.ndarray or pd.Series) – The intermediate values of the variable.
spans_intercept (bool) – Indicates if the encoding of categorical variables spans the intercept or not. Omitted when the variable is numeric.
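The effect of `spans_intercept` on the number of dummy columns can be illustrated with a small plain-Python sketch. It assumes that an encoding that spans the intercept keeps one column per level, while the other encoding drops the first sorted level as reference; `encode_categoric` is a hypothetical helper, not a formulae function.

```python
def encode_categoric(values, spans_intercept):
    """Return (rows, kept_levels) for a dummy encoding of `values`."""
    levels = sorted(set(values))
    # Assumption: the reduced encoding drops the first level (the reference).
    kept = levels if spans_intercept else levels[1:]
    rows = [[1.0 if v == level else 0.0 for level in kept] for v in values]
    return rows, kept


values = ["b", "a", "c", "a"]
full, full_levels = encode_categoric(values, spans_intercept=True)
reduced, reduced_levels = encode_categoric(values, spans_intercept=False)
```

Here `full` has shape (4, 3) and its rows always sum to one, so its columns add up to an intercept column. `reduced` has shape (4, 2) and can be combined with an explicit intercept without redundancy.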
- eval_new_data(data_mask)[source]#
Evaluates the variable with new data.
This method evaluates the variable within a new data mask. If this object is categorical, original encoding is remembered (and checked) when carrying out the new evaluation.
- Parameters
data_mask (pd.DataFrame) – The data frame where variables are taken from
- Returns
result – The rules for the shape of this array are the rules for self.eval_numeric() and self.eval_categoric(). The first applies to numeric variables, the second to categoric ones.
- Return type
np.array
- eval_new_data_categoric(x)[source]#
Evaluates the variable with new data when variable is categoric.
This method also checks that the levels observed in the new data frame are included within the set of levels of the original data set. If not, an error is raised.
- Parameters
x (np.ndarray or pd.Series) – The intermediate values of the variable.
- Returns
result – Numeric numpy array of shape (n, p), where n is the number of observations and p the number of dummy variables used in the numeric representation of the categorical variable.
- Return type
np.array
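The level check described above can be sketched as follows. This is a simplified illustration, not the library's implementation: `encode_new_data` and its arguments are hypothetical, and the exception type raised by formulae may differ from the `ValueError` used here.

```python
def encode_new_data(new_values, original_levels, kept_levels):
    """Encode new values with the columns remembered from the first pass."""
    unseen = set(new_values) - set(original_levels)
    if unseen:
        # New data may only contain levels that were present originally.
        raise ValueError(f"Levels not seen in the original data: {sorted(unseen)}")
    return [[1.0 if v == level else 0.0 for level in kept_levels] for v in new_values]


# The original data had levels a, b, c and a reduced encoding (a as reference).
encoded = encode_new_data(["c", "a"], ["a", "b", "c"], ["b", "c"])

try:
    encode_new_data(["d"], ["a", "b", "c"], ["b", "c"])
    unseen_level_rejected = False
except ValueError:
    unseen_level_rejected = True
```

Because the columns come from the remembered levels rather than from the new data, the result keeps the same number of columns even when some levels are absent from the new data.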
- eval_numeric(x)[source]#
Finishes evaluation of a numeric variable.
Converts the intermediate values in x into a 1d numpy array.
- Parameters
x (np.ndarray or pd.Series) – The intermediate values of the variable.
- property labels#
Obtain labels of the columns in the design matrix associated with this Variable
- set_data(spans_intercept=None)[source]#
Obtains and stores the final data object related to this variable.
- Parameters
spans_intercept (bool) – Indicates if the encoding of categorical variables spans the intercept or not. Omitted when the variable is numeric.
- set_type(data_mask)[source]#
Determines the type of the variable.
Looks for the name of the variable in data_mask and sets the .kind property to "numeric" or "categoric" depending on the type of the variable. It also stores the result of the intermediate evaluation in self._intermediate_data.
- Parameters
data_mask (pd.DataFrame) – The data frame where variables are taken from
- property var_names#
Returns the name of the variable as a set.
This is used to determine which variables of the data set being used are actually used in the model. This allows us to subset the original data set and only raise errors regarding missing values when the missingness happens in variables used in the model.
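How a set of variable names can restrict the missing-value check to the columns a model actually uses is sketched below in plain Python. The data, the `var_names` set and the `is_missing` helper are hypothetical stand-ins for illustration only.

```python
import math

# Hypothetical data set: "unused" is not part of the model.
data = {
    "y": [1.0, 2.0, 3.0],
    "x": [0.1, float("nan"), 0.3],
    "unused": [float("nan"), 5.0, 6.0],
}
var_names = {"y", "x"}  # what the union of the terms' var_names might look like


def is_missing(value):
    """True for None and for floating point NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


# Only columns used in the model take part in the check.
used = {name: column for name, column in data.items() if name in var_names}
n_rows = len(data["y"])
rows_with_missing = [
    i for i in range(n_rows) if any(is_missing(column[i]) for column in used.values())
]
```

Row 0 has a missing value only in `unused`, so it is not flagged. Row 1 is flagged because `x` is used in the model.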
- class formulae.terms.Call(call, is_response=False)[source]#
Representation of a call in a model Term.
This class and Variable are the atomic components of a model term.
This object supports stateful transformations defined in formulae.transforms. A transformation of this type defines its parameters the first time it is called, and then can be used to recompute the transformation with memorized parameter values. This behavior is useful when implementing a predict method and using transformations such as center(x) or scale(x). center(x) memorizes the value of the mean, and scale(x) memorizes both the mean and the standard deviation.
- Parameters
call (formulae.terms.call_resolver.LazyCall) – The call expression returned by the parser.
is_response (bool) – Indicates whether this call represents a response. Defaults to False.
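The idea of a stateful transformation can be illustrated with a minimal, self-contained sketch. The `Center` class below is a hypothetical stand-in written for this example. It is not the implementation found in formulae.transforms.

```python
class Center:
    """Toy stateful transform: memorizes the mean on its first call."""

    def __init__(self):
        self.mean = None

    def __call__(self, x):
        if self.mean is None:
            # First call: learn the parameter from the data.
            self.mean = sum(x) / len(x)
        # Later calls reuse the memorized mean instead of recomputing it.
        return [value - self.mean for value in x]


center = Center()
train = center([1.0, 2.0, 3.0])  # memorizes mean = 2.0
new = center([10.0, 20.0])       # reuses mean = 2.0
```

The new data are centered around the mean of the original data, which is what an out-of-sample prediction needs. Recomputing the mean from the new data would silently change the meaning of the fitted coefficients.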
- accept(visitor)[source]#
Accept method called by a visitor.
Visitors are those available in call_utils.py, and are used to work with call terms.
- eval_categoric(x, spans_intercept)[source]#
Finishes evaluation of categoric call.
First, it checks whether the intermediate evaluation returned is ordered. If not, it creates a category whose levels are those observed in the variable, sorted according to sorted() rules.
Then, it determines the reference level as well as all the other levels. If the variable is a response, the value returned is a dummy with 1s for the reference level and 0s elsewhere. If it is not a response variable, it determines the matrix of dummies according to the levels and the encoding passed.
- Parameters
x (np.ndarray or pd.Series) – The intermediate values of the variable.
spans_intercept (bool) – Indicates if the encoding of categorical variables spans the intercept or not. Omitted when the variable is numeric.
- eval_new_data(data_mask)[source]#
Evaluates the function call with new data.
This method evaluates the function call within a new data mask. If the transformation applied is a stateful transformation, it uses the proper object that remembers all parameters or settings that may have been set in a first pass.
- Parameters
data_mask (pd.DataFrame) – The data frame where variables are taken from
- Returns
result – The rules for the shape of this array are the rules for self.eval_numeric() and self.eval_categoric(). The first applies to numeric calls, the second to categoric ones.
- Return type
np.array
- eval_new_data_categoric(x)[source]#
Evaluates the call with new data when the result of the call is categoric.
This method also checks that the levels observed in the new data frame are included within the set of levels of the result of the original call. If not, an error is raised.
- Parameters
x (np.ndarray or pd.Series) – The intermediate values of the variable.
- Returns
result – Numeric numpy array of shape (n, p), where n is the number of observations and p the number of dummy variables used in the numeric representation of the categorical variable.
- Return type
np.array
- eval_numeric(x)[source]#
Finishes evaluation of a numeric call.
Converts the intermediate values of the call into a numpy array of shape (n, 1), where n is the number of observations. This method is used both in self.set_data and in self.eval_new_data.
- Parameters
x (np.ndarray or pd.Series) – The intermediate values resulting from the call.
- Returns
result – A dictionary with keys "value" and "kind". The first contains the result of the evaluation, and the latter is equal to "numeric".
- Return type
dict
- property labels#
Obtain labels of the columns in the design matrix associated with this Call
- set_data(spans_intercept=False)[source]#
Finishes the evaluation of the call according to its type.
It does not support multi-level categoric responses yet. If self.is_response is True and the variable is of a categoric type, this method returns a 1d array of 0-1 instead of a matrix.
In practice, it just completes the evaluation that started with self.set_type().
- Parameters
spans_intercept (bool) – Indicates if the encoding of categorical variables spans the intercept or not. Omitted when the variable is numeric.
- set_type(data_mask, env)[source]#
Evaluates function and determines the type of the result of the call.
Evaluates the function call and sets the .kind property to "numeric" or "categoric" depending on the type of the result. It also stores the intermediate result of the evaluation in ._intermediate_data to prevent us from computing the same thing more than once.
- Parameters
data_mask (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- property var_names#
Returns the names of the variables involved in the call, not including the callee.
This is used to determine which variables of the data set being used are actually used in the model. This allows us to subset the original data set and only raise errors regarding missing values when the missingness happens in variables used in the model.
Uses a visitor of class CallVarsExtractor that walks through the components of the call and returns a list with the names of the variables in the call.
- Returns
result – A list of strings with the names of the variables in the call, not including the name of the callee.
- Return type
list
- class formulae.terms.Term(*components)[source]#
Representation of a model term.
Terms are made of one or more components. Components are instances of Variable or Call. Terms with only one component are known as main effects and terms with more than one component are known as interaction effects. The order of the interaction is given by the number of components in the term.
- data#
The values associated with the term as they go into the design matrix.
- Type
np.ndarray
- kind#
Indicates the type of the term. Can be one of "numeric", "categoric", or "interaction".
- Type
string
- name#
The name of the term as it was originally written in the model formula.
- Type
string
- __add__(other)[source]#
Addition operator. Analogous to set union.
"x + x" is equal to just "x".
"x + y" is equal to a model with both x and y.
"x + (y + z)" adds x to a model already containing y and z.
- __matmul__(other)[source]#
Simple interaction operator.
This operator is actually invoked as : but internally passed as @ because there is no : operator in Python.
"x : x" equals "x".
"x : y" is the interaction between "x" and "y".
"x:(y:z)" equals "x:y:z".
"(x:y):u" equals "x:y:u".
"(x:y):(u + v)" equals "x:y:u + x:y:v".
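The rules listed for the simple interaction operator can be reproduced with a short plain-Python sketch. It is only an illustration under stated assumptions: a term is modeled as a tuple of component names and a model as a list of terms, and `interact`, `interact_models` and `show` are hypothetical helpers rather than formulae functions.

```python
def interact(term_a, term_b):
    """Join two terms, keeping each component once and in order."""
    return term_a + tuple(c for c in term_b if c not in term_a)


def interact_models(lhs, rhs):
    """Interact every term on the left with every term on the right."""
    result = []
    for a in lhs:
        for b in rhs:
            term = interact(a, b)
            if term not in result:  # behaves like a set union
                result.append(term)
    return result


def show(model):
    return " + ".join(":".join(term) for term in model)


same = show(interact_models([("x",)], [("x",)]))
nested = show(interact_models([("x",)], [("y", "z")]))
distributed = show(interact_models([("x", "y")], [("u",), ("v",)]))
```

Removing repeated components is what makes "x : x" collapse to "x", and looping over both sides is what distributes the interaction over a sum.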
- __mul__(other)[source]#
Full interaction operator.
This operator includes the interaction as well as the main effects involved in the interaction. It is a shortcut for x + y + x:y.
"x * x" equals "x".
"x * y" equals "x + y + x:y".
"x:y * u" equals "x:y + u + x:y:u".
"x:y * u:v" equals "x:y + u:v + x:y:u:v".
"x:y * (u + v)" equals "x:y + u + v + x:y:u + x:y:v".
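The expansion of the full interaction operator into main effects plus interactions can be sketched in plain Python. This is a sketch under assumptions, not the library's code: terms are tuples of component names, models are lists of terms, and `full_interaction` and `show` are hypothetical helpers.

```python
def full_interaction(lhs, rhs):
    """Expand "lhs * rhs" into main effects followed by interactions."""
    crossed = [a + tuple(c for c in b if c not in a) for a in lhs for b in rhs]
    result = []
    for term in lhs + rhs + crossed:
        if term not in result:  # duplicates collapse, as in a set union
            result.append(term)
    return result


def show(model):
    return " + ".join(":".join(term) for term in model)


simple = show(full_interaction([("x",)], [("y",)]))
repeated = show(full_interaction([("x",)], [("x",)]))
with_sum = show(full_interaction([("x", "y")], [("u",), ("v",)]))
```

The three results match the expansions listed for "x * y", "x * x" and "x:y * (u + v)".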
- __or__(other)[source]#
Group-specific operator. Creates a group-specific term.
Intercepts are implicitly added.
"x|g" equals "(1|g) + (x|g)".
Distributive over the right-hand side.
"(x|g + h)" equals "(1|g) + (1|h) + (x|g) + (x|h)".
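The two rules above, implicit intercepts and distribution over the right-hand side, can be sketched with strings in plain Python. `group_specific` is a hypothetical helper written for illustration. The handling of "0" as an explicit intercept removal is an assumption based on the "(0 + x | g)" notation.

```python
def group_specific(expr_terms, factors):
    """Expand "(expr | f1 + f2 + ...)" into individual group-specific terms."""
    terms = list(expr_terms)
    if "0" in terms:
        # Assumption: "0" removes the group-specific intercept explicitly.
        terms = [t for t in terms if t != "0"]
    elif "1" not in terms:
        # The intercept is added implicitly.
        terms.insert(0, "1")
    # Distribute the expression over every grouping factor.
    return " + ".join(f"({t}|{f})" for t in terms for f in factors)


slope = group_specific(["x"], ["g"])
two_factors = group_specific(["x"], ["g", "h"])
no_intercept = group_specific(["0", "x"], ["g"])
```

The first two results reproduce the expansions listed above. The third shows how the implicit intercept could be suppressed.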
- __pow__(other)[source]#
Power operator.
It leaves the term as it is. For a power in the math sense do I(x ** n) or {x ** n}.
- __sub__(other)[source]#
Subtraction operator. Analogous to set difference.
"x - x" returns an empty model.
"x - y" returns the term "x".
"x - (y + z)" returns the term "x".
- __truediv__(other)[source]#
Division interaction operator.
"x / x" equals just "x".
"x / y" equals "x + x:y".
"x / z:y" equals "x + x:z:y".
"x / (z + y)" equals "x + x:z + x:y".
"x:y / u:v" equals "x:y + x:y:u:v".
"x:y / (u + v)" equals "x:y + x:y:u + x:y:v".
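The division rules can be reproduced by joining all left-hand components into a single interaction and crossing it with each right-hand term. The plain-Python sketch below assumes terms are tuples of component names and models are lists of terms. `divide` and `show` are hypothetical helpers, not part of formulae.

```python
def divide(lhs, rhs):
    """Expand "lhs / rhs": keep lhs, then nest each rhs term inside it."""
    joined = ()
    for term in lhs:
        joined += tuple(c for c in term if c not in joined)
    result = list(lhs)
    for term in rhs:
        nested = joined + tuple(c for c in term if c not in joined)
        if nested not in result:
            result.append(nested)
    return result


def show(model):
    return " + ".join(":".join(term) for term in model)


basic = show(divide([("x",)], [("y",)]))
over_sum = show(divide([("x",)], [("z",), ("y",)]))
model_lhs = show(divide([("x",), ("y",)], [("u",), ("v",)]))
```

`model_lhs` corresponds to "(x + y) / (u + v)", where the left-hand side contributes its own terms plus the joined interaction x:y crossed with each right-hand term.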
- eval_new_data(data)[source]#
Evaluates the term with new data.
Calls the .eval_new_data() method on each component in the term and combines the results appropriately.
- Parameters
data (pd.DataFrame) – The data frame where variables are taken from
- Returns
result – The values resulting from evaluating this term using the new data.
- Return type
np.array
- get_component(name)[source]#
Returns a component by name.
- Parameters
name (string) – The name of the component to return.
- Returns
component – The component with name name.
- Return type
Variable or Call
- property labels#
Obtain labels of the columns in the design matrix associated with this Term
- property levels#
Obtain levels of the columns in the design matrix associated with this Term.
It is like .labels, but without the name of the terms.
- set_data(spans_intercept)[source]#
Obtains and stores the final data object related to this term.
Calls the .set_data() method on each component in the term. Then, it uses the .data attribute on each of them to build self.data and self.metadata.
- Parameters
encoding (dict or bool) – Indicates if it uses full or reduced encoding when the type of the variable is categoric.
- set_type(data, env)[source]#
Set type of the components in the term.
Calls the .set_type() method on each component in the term. For components of class Variable it only passes the data mask. For Call objects it also passes the evaluation environment.
- Parameters
data (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- property spans_intercept#
Does this term span the intercept?
True if all the components span the intercept.
- property var_names#
Returns the name of the variables in the term as a set.
Loops through each component and updates the set with the .var_names of each component.
- Returns
var_names – The names of the variables involved in the term.
- Return type
set
- class formulae.terms.GroupSpecificTerm(expr, factor)[source]#
Representation of a group specific term.
Group specific terms are of the form (expr | factor). The expression expr is evaluated as a model formula with only common effects and produces a model matrix following the rules for common terms. factor is inspired by factors in R, but here it is evaluated as an ordered pandas.CategoricalDtype object.
The operator | works as in the R package lme4. As its authors say: “One way to think about the vertical bar operator is as a special kind of interaction between the model matrix and the grouping factor. This interaction ensures that the columns of the model matrix have different effects for each level of the grouping factor”
- Parameters
- data#
The values associated with the term as they go into the design matrix.
- Type
np.ndarray
- metadata#
Metadata associated with the term. If "numeric" or "categoric" it holds additional information in the component .data attribute. If "interaction", the keys are the name of the components and the values are dictionaries holding the metadata.
- Type
dict
- kind#
Indicates the type of the term. Can be one of "numeric", "categoric", or "interaction".
- Type
string
- eval_new_data(data)[source]#
Evaluates the term with new data.
Converts the variable in factor to the type remembered from the first evaluation and produces the design matrix for this grouping, calls .eval_new_data() on self.expr to obtain the design matrix for the expr side, then computes the design matrix corresponding to the group specific effect.
- Parameters
data (pd.DataFrame) – The data frame where variables are taken from.
- Returns
Zi
- Return type
np.ndarray
- property name#
Obtain string representation of the name of the term.
- Returns
name – The name of the term, such as 1|g or var|g.
- Return type
str
- property var_names#
Returns the name of the variables in the term as a set.
Obtains the variables in expr as well as the variables in factor.
- Returns
var_names – The names of the variables involved in the term.
- Return type
set
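The description of the vertical bar as an interaction between the model matrix of expr and the grouping factor can be made concrete with a small sketch. It uses plain lists instead of numpy arrays, and the column ordering (all levels for the first expr column, then all levels for the next) is an assumption made for illustration. `group_design_matrix` is a hypothetical helper.

```python
def group_design_matrix(expr_rows, factor):
    """Build a group-specific matrix: one column per (expr column, level)."""
    levels = sorted(set(factor))
    Z = []
    for row, level in zip(expr_rows, factor):
        z_row = []
        for value in row:
            # Each expr column is copied into the block of its own level only.
            z_row.extend(value if level == candidate else 0.0 for candidate in levels)
        Z.append(z_row)
    return Z, levels


# "(x|g)" slope part: one expr column (x) and a factor with two levels.
x_rows = [[1.5], [2.0], [0.5]]
g = ["a", "b", "a"]
Z, levels = group_design_matrix(x_rows, g)
```

Each observation contributes its x value only to the column of its own group, which is why the columns of the model matrix end up with a different effect for each level of the grouping factor.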
- class formulae.terms.Intercept[source]#
Internal representation of a model intercept.
- __add__(other)[source]#
Addition operator.
Generally this operator is used to explicitly add an intercept to a model. There may be cases where the result is not a Model, or does not contain an intercept.
"1 + 0" and "1 + (-1)" return an empty model.
"1 + 1" returns a single intercept.
"1 + x" and "1 + (x|g)" return a model with both the term and the intercept.
"1 + (x + y)" adds an intercept to the model given by x and y.
- __or__(other)[source]#
Group-specific interaction-like operator. Creates a group-specific intercept.
This operation is usually surrounded by parentheses. They are not strictly required, but they are always used because | has lower precedence than any of the other operators except ~.
This operator is distributed over the right-hand side, which means (1|g + h) is equivalent to (1|g) + (1|h).
- __sub__(other)[source]#
Subtraction operator.
This operator removes an intercept from a model if the given model has an intercept.
"1 - 1" returns an empty model.
"1 - 0" and "1 - (-1)" return an intercept.
"1 - (x + y)" returns the model given by x and y unchanged.
"1 - (1 + x + y)" returns the model given by x and y, removing the intercept.
- eval_new_data(data)[source]#
Returns data for a new intercept.
The length of the new intercept is given by the number of rows in
data
.
- set_data(encoding)[source]#
Creates data for the intercept.
It sets self.data equal to a numpy array of ones of shape (self.len, 1).
- property var_names#
Returns empty set, no variables are used in the intercept.
- class formulae.terms.NegatedIntercept[source]#
Internal representation of the opposite of a model intercept.
This object is created whenever we use "0" or "-1" in a model formula. It is not expected to appear in a final model. It’s here to help us make operations using the Intercept and decide when to keep it and when to drop it.
- __add__(other)[source]#
Addition operator.
Generally this operator is used to explicitly remove an intercept from a model.
"0 + 1" returns an empty model.
"0 + 0" returns a negated intercept.
"0 + x" returns a model that includes the negated intercept.
"0 + (x + y)" adds the negated intercept to the model given by x and y.
Regardless of whether the result contains the negated intercept, for example if we do something like "y ~ 0 + x + y + 0", the Model that is obtained removes any negated intercepts that may have been left. They just don’t make sense in a model.
- class formulae.terms.Response(term)[source]#
Representation of a response term.
It is mostly a wrapper around Term.
- Parameters
term (Term) – The term we want to take as response in the model. Must contain only one component.
- __add__(other)[source]#
Modelled as operator.
The operator is ~, but since ~ is not a binary operator in Python, we internally replace it with +. It means the LHS is taken as the response, and the RHS as the predictor.
- property var_names#
Returns the name of the variables in the response as a set.
- class formulae.terms.Model(*terms, response=None)[source]#
Representation of a model.
- Parameters
- __add__(other)[source]#
Addition operator. Analogous to set union.
Adds terms to the model and returns the model.
- Returns
self – The same model object with the added term(s).
- Return type
Model
- __matmul__(other)[source]#
Simple interaction operator.
"(x + y) : (u + v)" equals "x:u + x:v + y:u + y:v".
"(x + y) : u" equals "x:u + y:u".
"(x + y) : f(u)" equals "x:f(u) + y:f(u)".
- Returns
model – A new instance of the model with all the interaction terms computed.
- Return type
Model
- __mul__(other)[source]#
Full interaction operator.
"(x + y) * (u + v)" equals "x + y + u + v + x:u + x:v + y:u + y:v".
"(x + y) * u" equals "x + y + u + x:u + y:u".
- Returns
model – A new instance of the model with all the interaction terms computed.
- Return type
Model
- __or__(other)[source]#
Group specific term operator.
Only models of the form "0 + x" arrive here.
"(0 + x | g)" equals "(x|g)".
"(0 + x | g + y)" equals "(x|g) + (x|y)".
There are several edge cases to handle here. See in-line comments.
- Returns
model – A new instance of the model with all the terms computed.
- Return type
Model
- __pow__(other)[source]#
Power of a set made of Term objects.
Computes all interactions up to order n between the terms in the set.
"(x + y + z) ** 2" equals "x + y + z + x:y + x:z + y:z".
- Returns
model – A new instance of the model with all the terms computed.
- Return type
Model
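The expansion performed by the power operator, all interactions up to order n, maps naturally onto `itertools.combinations`. The sketch below works with term names as strings. `power` is a hypothetical helper for illustration and not the method's implementation.

```python
from itertools import combinations


def power(terms, n):
    """Return all interactions of `terms` up to order `n` as a formula string."""
    result = []
    for order in range(1, n + 1):
        # Order 1 gives the main effects, order 2 the pairwise interactions, ...
        for combo in combinations(terms, order):
            result.append(":".join(combo))
    return " + ".join(result)


second_order = power(["x", "y", "z"], 2)
third_order = power(["x", "y", "z"], 3)
```

`second_order` reproduces the expansion of "(x + y + z) ** 2" given above. `third_order` additionally contains the single three-way interaction.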
- __sub__(other)[source]#
Subtraction operator. Analogous to set difference.
"(x + y) - (x + u)" equals "y".
"(x + y) - x" equals "y".
"(x + y + (1 | g)) - (1 | g)" equals "x + y".
- Returns
self – The same model object with the removed term(s).
- Return type
Model
- __truediv__(other)[source]#
Division interaction operator.
"(x + y) / z" equals "x + y + x:y:z".
"(x + y) / (u + v)" equals "x + y + x:y:u + x:y:v".
- Returns
model – A new instance of the model with all the terms computed.
- Return type
Model
- add_response(term)[source]#
Add response term to model description.
This method is called when something like "y ~ x + z" appears in a model formula.
This method is called via special methods such as Response.__add__().
- Returns
self – The same model object but now with a response term.
- Return type
Model
- add_term(term)[source]#
Add term to model description.
The term added can be of class Intercept, Term, or GroupSpecificTerm. It appends the new term object to the list of common terms or group specific terms as appropriate.
This method is called via special methods such as __add__().
- Returns
self – The same model object but now containing the new term.
- Return type
Model
- property common_components#
Components in common terms in the model.
- Returns
components – A list containing all components from common terms in the model.
- Return type
list
- eval(data, env)[source]#
Evaluates terms in the model.
- Parameters
data (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- set_types(data, env)[source]#
Set the type of the terms in the model.
Calls the .set_type() method on each term in the model.
- Parameters
data (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- property terms#
Terms in the model.
- Returns
terms – A list containing both common and group specific terms.
- Return type
list
- property var_names#
Get the name of the variables in the model.
- Returns
var_names – The names of all variables in the model.
- Return type
set
call_resolver
#
- class formulae.terms.call_resolver.LazyValue(value, lexeme)[source]#
Lazy representation of a value in Python.
This object holds a value (a string or a number). It returns its value only when it is evaluated via .eval().
- Parameters
value (string or numeric) – The value it holds.
lexeme (string) – The string that generated the value it holds
- class formulae.terms.call_resolver.LazyVariable(name)[source]#
Lazy variable name.
The variable represented in this object does not hold any value until it is explicitly evaluated within a data mask and an evaluation environment.
- Parameters
name (str) – The name of the variable it represents.
- eval(data_mask, env)[source]#
Evaluates variable.
First it looks for the variable in data_mask. If not found there, it looks in env. Then it just returns the value the variable represents in either the data mask or the evaluation environment.
- Parameters
data_mask (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- Returns
The value represented by this name in either the data mask or the environment.
- Return type
result
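The lookup order described above (data mask first, environment second) can be sketched with dictionaries standing in for the data frame and the environment. `LazyVariableSketch` is a hypothetical class written for illustration. The real class works with a pandas data frame and an Environment object.

```python
class LazyVariableSketch:
    """Holds only a name until it is evaluated."""

    def __init__(self, name):
        self.name = name

    def eval(self, data_mask, env):
        # The data mask takes precedence over the environment.
        if self.name in data_mask:
            return data_mask[self.name]
        return env[self.name]


data_mask = {"x": [1, 2, 3]}
env = {"x": "shadowed by the data mask", "k": 10}

from_data = LazyVariableSketch("x").eval(data_mask, env)
from_env = LazyVariableSketch("k").eval(data_mask, env)
```

A name present in both places resolves to the data column. A name found only in the environment, such as a constant defined by the caller, is still available to the formula.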
- class formulae.terms.call_resolver.LazyOperator(op, *args)[source]#
Unary and binary lazy operators.
Operations like a + b are converted into a LazyOperator that is resolved when it is explicitly evaluated.
- Parameters
op (builtin_function_or_method) – An operator in the operator module of the standard library. It can be one of add, pos, sub, neg, pow, mul, and truediv.
args – One or two lazy instances.
- eval(data_mask, env)[source]#
Evaluates the operation.
Evaluates the arguments involved in the operation, calls the Python operator, and returns the result.
- Parameters
data_mask (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- Returns
The value obtained from the operator call.
- Return type
result
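Deferred evaluation of an operator can be sketched with the standard `operator` module. The classes below are hypothetical, simplified stand-ins for the lazy objects, and dictionaries play the role of the data mask and the environment.

```python
import operator


class Value:
    """A literal that is returned unchanged when evaluated."""

    def __init__(self, value):
        self.value = value

    def eval(self, data_mask, env):
        return self.value


class Name:
    """A variable name, looked up in the data mask and then the environment."""

    def __init__(self, name):
        self.name = name

    def eval(self, data_mask, env):
        return data_mask[self.name] if self.name in data_mask else env[self.name]


class Operator:
    """Stores the operator and its lazy arguments until eval() is called."""

    def __init__(self, op, *args):
        self.op = op
        self.args = args

    def eval(self, data_mask, env):
        # Evaluate the arguments first, then apply the Python operator.
        return self.op(*[arg.eval(data_mask, env) for arg in self.args])


# Represents the expression a + (-2) without computing anything yet.
expr = Operator(operator.add, Name("a"), Operator(operator.neg, Value(2)))
result = expr.eval({"a": 5}, {})
```

Building `expr` performs no arithmetic. The value is only produced once a data mask is supplied, so the same expression object can be evaluated again with different data.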
- class formulae.terms.call_resolver.LazyCall(callee, args, kwargs)[source]#
Lazy representation of a function call.
This class represents a function that can be a stateful transform (a function with memory) whose arguments can also be stateful transforms.
To evaluate these functions we don’t create a string representing Python code and let eval() run it. We take care of all the steps of the evaluation to make sure all the possibly nested stateful transformations are handled correctly.
- Parameters
callee (string) – The name of the function
args (list) – A list of lazy objects that are evaluated when calling the function this object represents.
kwargs (dict) – A dictionary of named arguments that are evaluated when calling the function this object represents.
- eval(data_mask, env)[source]#
Evaluate the call.
This method first evaluates all its arguments, which are themselves lazy objects, and then proceeds to evaluate the call it represents.
- Parameters
data_mask (pd.DataFrame) – The data frame where variables are taken from
env (Environment) – The environment where values and functions are taken from.
- Returns
The result of the call evaluation.
- Return type
result
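Evaluating a call by first evaluating its lazy arguments, including nested calls, can be sketched as follows. The `Name` and `Call` classes and the functions `double` and `shift` are hypothetical and exist only for this example. Looking the callee up in the environment by name is an assumption based on the callee parameter being a string.

```python
class Name:
    """A variable name, looked up in the data mask and then the environment."""

    def __init__(self, name):
        self.name = name

    def eval(self, data_mask, env):
        return data_mask[self.name] if self.name in data_mask else env[self.name]


class Call:
    """A function call whose arguments are themselves lazy objects."""

    def __init__(self, callee, args, kwargs):
        self.callee = callee
        self.args = args
        self.kwargs = kwargs

    def eval(self, data_mask, env):
        # Arguments are evaluated first, which resolves nested calls.
        args = [arg.eval(data_mask, env) for arg in self.args]
        kwargs = {key: arg.eval(data_mask, env) for key, arg in self.kwargs.items()}
        # Assumption: the function is found by name in the environment.
        return env[self.callee](*args, **kwargs)


env = {
    "double": lambda x: [2 * v for v in x],
    "shift": lambda x, by: [v + by for v in x],
    "offset": 10,
}

# Represents shift(double(x), by=offset).
call = Call("shift", [Call("double", [Name("x")], {})], {"by": Name("offset")})
result = call.eval({"x": [1, 2, 3]}, env)
```

Handling every step explicitly, instead of building a string and running it through `eval()`, gives each nested call a place where a stateful transform could store and reuse its parameters.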