Reproducing Kernel Hilbert Space

NPIV

This module provides implementations of RKHS Instrumental Variable (IV) estimators.

Classes:

  • _BaseRKHSIV: Base class for RKHS IV methods.
  • RKHSIV: RKHS IV estimator.
  • RKHSIVCV: RKHS IV estimator with cross-validation.
  • RKHSIVL2: RKHS IV estimator with L2 regularization.
  • RKHSIVL2CV: RKHS IV estimator with L2 regularization and cross-validation.
  • ApproxRKHSIV: Approximate RKHS IV estimator using kernel approximations.
  • ApproxRKHSIVCV: Approximate RKHS IV estimator with cross-validation using kernel approximations.

class rkhsiv.ApproxRKHSIV(kernel_approx='nystrom', n_components=10, kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scale='auto')[source]

Bases: _BaseRKHSIV

Approximate RKHS IV estimator using kernel approximations.

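This class fits the RKHS IV estimator on an approximate kernel feature map with n_components components, built by the method selected with kernel_approx, instead of working with the exact kernel.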

Parameters
  • kernel_approx (str) – Kernel approximation method (‘nystrom’ or ‘rbfsampler’).

  • n_components (int) – Number of approximation components.

  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • alpha_scale (str or float) – Scale of the regularization parameter.

  • kernel_params (dict) – Additional parameters for the kernel.
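
As a rough illustration of how gamma, degree, and coef0 enter a kernel, the sketch below writes out the RBF and polynomial kernels in plain Python. It assumes the string identifiers follow the scikit-learn pairwise-kernel conventions, which this reference does not state explicitly. The helper names rbf_kernel_sketch and poly_kernel_sketch are hypothetical and are not part of this module.

```python
import math


def rbf_kernel_sketch(x, y, gamma=2.0):
    """RBF kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def poly_kernel_sketch(x, y, gamma=2.0, degree=3, coef0=1.0):
    """Polynomial kernel (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


x, y = [0.0, 1.0], [1.0, 1.0]
k_rbf = rbf_kernel_sketch(x, y)    # squared distance is 1, so exp(-2)
k_poly = poly_kernel_sketch(x, y)  # inner product is 1, so (2 + 1) ** 3
```

Under this convention gamma rescales distances (RBF) or inner products (polynomial), while degree and coef0 only affect the polynomial kernel.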

_get_new_approx_instance()[source]

Create a new kernel approximation instance.

Returns

Kernel approximation instance.

Return type

object
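
The ‘rbfsampler’ option presumably refers to random Fourier features, as in scikit-learn's RBFSampler; that reading is an assumption. The sketch below shows the idea in plain Python: a random feature map whose inner products approximate the RBF kernel exp(-gamma * ||x - y||^2). The function make_rff_map is a hypothetical name and not part of this module.

```python
import math
import random


def make_rff_map(dim, n_components, gamma, seed=0):
    """Return a map z(x) with z(x) . z(y) close to exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 2 * gamma * I), offsets from U(0, 2 * pi).
    scale = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
               for _ in range(n_components)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    norm = math.sqrt(2.0 / n_components)

    def feature_map(x):
        return [norm * math.cos(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


feature_map = make_rff_map(dim=2, n_components=2000, gamma=0.5)
x, y = [0.2, -0.1], [0.5, 0.3]
approx = sum(a * b for a, b in zip(feature_map(x), feature_map(y)))
exact = math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))
```

More components give a closer approximation at a higher cost, which is the trade-off that n_components controls.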

fit(Z, T, Y)[source]

Fit the approximate RKHS IV estimator.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self
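
To make the roles of Z, T, and Y concrete, the sketch below simulates confounded data and compares a naive regression slope with a simple linear IV (Wald) estimate. This is not the RKHS estimator documented here; it is a deliberately simpler stand-in, and every name in it is hypothetical.

```python
import random

rng = random.Random(0)
n = 5000
true_effect = 2.0

Z, T, Y = [], [], []
for _ in range(n):
    z = rng.gauss(0.0, 1.0)  # instrument: shifts T, independent of the confounder
    u = rng.gauss(0.0, 1.0)  # unobserved confounder affecting both T and Y
    t = z + u + 0.1 * rng.gauss(0.0, 1.0)
    y = true_effect * t + 2.0 * u + rng.gauss(0.0, 1.0)
    Z.append(z)
    T.append(t)
    Y.append(y)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)


naive_slope = cov(T, Y) / cov(T, T)  # biased upward by the confounder
iv_slope = cov(Z, Y) / cov(Z, T)     # linear IV estimate of the effect of T on Y
```

The naive slope absorbs the confounder, while the IV estimate uses only the variation in T that is driven by Z.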

predict(T)[source]

Predict outcomes for new treatments.

Parameters

T (array-like) – New treatments.

Returns

Predicted outcomes.

Return type

array-like

score(Z, T, Y, delta='auto')[source]

Compute the score of the fitted estimator.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

  • delta (str or float) – Critical radius.

Returns

Score.

Return type

float

class rkhsiv.ApproxRKHSIVCV(kernel_approx='nystrom', n_components=10, kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]

Bases: ApproxRKHSIV

Approximate RKHS IV estimator with cross-validation using kernel approximations.

This class implements an approximate RKHS IV estimator with cross-validation using kernel approximations.

Parameters
  • kernel_approx (str) – Kernel approximation method (‘nystrom’ or ‘rbfsampler’).

  • n_components (int) – Number of approximation components.

  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • alpha_scales (str or array-like) – Scales of the regularization parameter to try during cross-validation, or ‘auto’.

  • n_alphas (int) – Number of alpha scales to try.

  • cv (int) – Number of folds for cross-validation.

  • kernel_params (dict) – Additional parameters for the kernel.
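
One plausible way to turn alpha_scales and n_alphas into a concrete candidate list is a log-spaced grid. The sketch below assumes that ‘auto’ produces such a grid; the endpoints 0.1 and 1e4 and both function names are hypothetical choices for illustration.

```python
import math


def geometric_grid(low, high, n_points):
    """Return n_points values spaced evenly on a log scale from low to high."""
    if n_points == 1:
        return [low]
    step = (math.log(high) - math.log(low)) / (n_points - 1)
    return [math.exp(math.log(low) + i * step) for i in range(n_points)]


def resolve_alpha_scales(alpha_scales="auto", n_alphas=30):
    """Use a log-spaced default grid for 'auto', otherwise the given values."""
    if isinstance(alpha_scales, str) and alpha_scales == "auto":
        return geometric_grid(0.1, 1e4, n_alphas)  # hypothetical endpoints
    return list(alpha_scales)


grid = resolve_alpha_scales()
```

A log-spaced grid covers several orders of magnitude with few candidates, which is useful when the right regularization strength is unknown.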

fit(Z, T, Y)[source]

Fit the approximate RKHS IV estimator with cross-validation.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

class rkhsiv.RKHSIV(kernel='rbf', gamma=2, degree=3, coef0=1, delta_scale='auto', delta_exp='auto', alpha_scale='auto', kernel_params=None)[source]

Bases: _BaseRKHSIV

RKHS IV estimator.

This class implements an RKHS IV estimator.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • alpha_scale (str or float) – Scale of the regularization parameter.

  • kernel_params (dict) – Additional parameters for the kernel.

fit(Z, T, Y)[source]

Fit the RKHS IV estimator.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

predict(T_test)[source]

Predict outcomes for new treatments.

Parameters

T_test (array-like) – New treatments.

Returns

Predicted outcomes.

Return type

array-like

score(Z, T, Y, delta='auto')[source]

Compute the score of the fitted estimator.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

  • delta (str or float) – Critical radius.

Returns

Score.

Return type

float

class rkhsiv.RKHSIVCV(kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]

Bases: RKHSIV

RKHS IV estimator with cross-validation.

This class implements an RKHS IV estimator with cross-validation.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • alpha_scales (str or array-like) – Scales of the regularization parameter to try during cross-validation, or ‘auto’.

  • n_alphas (int) – Number of alpha scales to try.

  • cv (int) – Number of folds for cross-validation.

  • kernel_params (dict) – Additional parameters for the kernel.

fit(Z, T, Y)[source]

Fit the RKHS IV estimator with cross-validation.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

class rkhsiv.RKHSIVL2(kernel='rbf', gamma=2, degree=3, coef0=1, delta_scale='auto', delta_exp='auto', kernel_params=None)[source]

Bases: _BaseRKHSIV

RKHS IV estimator with L2 regularization.

This class implements an RKHS IV estimator with L2 regularization.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • kernel_params (dict) – Additional parameters for the kernel.

fit(Z, T, Y)[source]

Fit the RKHS IV estimator with L2 regularization.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

predict(T_test)[source]

Predict outcomes for new treatments.

Parameters

T_test (array-like) – New treatments.

Returns

Predicted outcomes.

Return type

array-like

class rkhsiv.RKHSIVL2CV(kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]

Bases: RKHSIVL2

RKHS IV estimator with L2 regularization and cross-validation.

This class implements an RKHS IV estimator with L2 regularization and cross-validation.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • alpha_scales (str or array-like) – Scales of the regularization parameter to try during cross-validation, or ‘auto’.

  • n_alphas (int) – Number of alpha scales to try.

  • cv (int) – Number of folds for cross-validation.

  • kernel_params (dict) – Additional parameters for the kernel.

fit(Z, T, Y)[source]

Fit the RKHS IV estimator with L2 regularization and cross-validation.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

class rkhsiv._BaseRKHSIV(*args, **kwargs)[source]

Bases: object

Base class for RKHS IV methods.

This class provides common functionality for RKHS IV estimators.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • alpha_scale (str or float) – Scale of the regularization parameter.

  • kernel_params (dict) – Additional parameters for the kernel.

_get_alpha(delta, alpha_scale)[source]
_get_alpha_scale()[source]
_get_alpha_scales()[source]
_get_delta(n)[source]

Compute the critical radius.

Parameters

n (int) – Number of samples.

Returns

Critical radius.

Return type

float
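
A critical radius governed by a scale and an exponent is commonly of the form delta = delta_scale / n ** delta_exp. The sketch below assumes that form. The ‘auto’ fallbacks of 5 and 0.4 are illustrative assumptions, not documented defaults, and critical_radius is a hypothetical name.

```python
def critical_radius(n, delta_scale="auto", delta_exp="auto"):
    """Return delta = delta_scale / n ** delta_exp."""
    # The 'auto' fallbacks below are illustrative assumptions.
    scale = 5.0 if delta_scale == "auto" else float(delta_scale)
    exponent = 0.4 if delta_exp == "auto" else float(delta_exp)
    return scale / (n ** exponent)


deltas = [critical_radius(n) for n in (100, 1000, 10000)]
```

With a positive exponent the radius shrinks as the sample size grows, so larger samples receive less regularization.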

_get_kernel(X, Y=None)[source]
rkhsiv._check_auto(param)[source]

Nested NPIV

This module provides implementations of nested NPIV estimators for RKHS function classes.

Classes:

  • _BaseRKHS2IV: Base class for nested RKHS IV methods.
  • RKHS2IV: Nested RKHS IV estimator.
  • RKHS2IVCV: Nested RKHS IV estimator with cross-validation.
  • RKHS2IVL2: Nested RKHS IV estimator with L2 regularization.
  • RKHS2IVL2CV: Nested RKHS IV estimator with L2 regularization and cross-validation.

class rkhs2iv.RKHS2IV(kernel='rbf', gamma=2, degree=3, coef0=1, delta_scale='auto', delta_exp='auto', kernel_params=None)[source]

Bases: _BaseRKHS2IV

Nested RKHS IV estimator.

This class implements a nested RKHS IV estimator.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • kernel_params (dict) – Additional parameters for the kernel.

fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fit the nested RKHS IV estimator.

Parameters
  • A (array-like) – Instrumental variables for the first stage.

  • B (array-like) – Treatments for the first stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Treatments for the second stage.

  • Y (array-like) – Outcomes.

  • W (array-like, optional) – Weights. Defaults to None.

  • subsetted (bool, optional) – Whether to use subsets. Defaults to False.

  • subset_ind1 (array-like, optional) – Indices for the first subset. Required if subsetted is True.

  • subset_ind2 (array-like, optional) – Indices for the second subset. Optional.

Returns

Fitted estimator.

Return type

self
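
The subset arguments interact: subset_ind1 is required once subsetted is True, while subset_ind2 stays optional. The sketch below shows one way such a check could look. It assumes the indices are integer positions into the sample, which this reference does not confirm, and validate_subset_args is a hypothetical helper.

```python
def validate_subset_args(n_samples, subsetted=False,
                         subset_ind1=None, subset_ind2=None):
    """Check the subset arguments and return them as lists (or None)."""
    if not subsetted:
        return None, None
    if subset_ind1 is None:
        raise ValueError("subset_ind1 is required when subsetted is True")
    ind1 = list(subset_ind1)
    ind2 = None if subset_ind2 is None else list(subset_ind2)
    for ind in (ind1, ind2 or []):
        if any(i < 0 or i >= n_samples for i in ind):
            raise ValueError("subset indices must lie in [0, n_samples)")
    return ind1, ind2


first_only = validate_subset_args(5, subsetted=True, subset_ind1=[0, 2])
```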

predict(B_test, *args)[source]

Predict outcomes for new treatments.

Parameters
  • B_test (array-like) – New treatments for the second stage.

  • *args – Additional arguments, expected to be A_test (new treatments for the first stage).

Returns

Predicted outcomes.

Return type

array-like

class rkhs2iv.RKHS2IVCV(kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]

Bases: RKHS2IV

Nested RKHS IV estimator with cross-validation.

This class implements a nested RKHS IV estimator with cross-validation.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • alpha_scales (str or array-like) – Scales of the regularization parameter to try during cross-validation, or ‘auto’.

  • n_alphas (int) – Number of alpha scales to try.

  • cv (int) – Number of folds for cross-validation.

  • kernel_params (dict) – Additional parameters for the kernel.

fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fit the nested RKHS IV estimator with cross-validation.

Parameters
  • A (array-like) – Instrumental variables for the first stage.

  • B (array-like) – Treatments for the first stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Treatments for the second stage.

  • Y (array-like) – Outcomes.

  • W (array-like, optional) – Weights. Defaults to None.

  • subsetted (bool, optional) – Whether to use subsets. Defaults to False.

  • subset_ind1 (array-like, optional) – Indices for the first subset. Required if subsetted is True.

  • subset_ind2 (array-like, optional) – Indices for the second subset. Optional.

Returns

Fitted estimator.

Return type

self

class rkhs2iv.RKHS2IVL2(kernel='rbf', gamma=2, degree=3, coef0=1, delta_scale='auto', delta_exp='auto', kernel_params=None)[source]

Bases: _BaseRKHS2IV

Nested RKHS IV estimator with L2 regularization.

This class implements a nested RKHS IV estimator with L2 regularization.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • kernel_params (dict) – Additional parameters for the kernel.

fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fit the nested RKHS IV estimator with L2 regularization.

Parameters
  • A (array-like) – Instrumental variables for the first stage.

  • B (array-like) – Treatments for the first stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Treatments for the second stage.

  • Y (array-like) – Outcomes.

  • W (array-like, optional) – Weights. Defaults to None.

  • subsetted (bool, optional) – Whether to use subsets. Defaults to False.

  • subset_ind1 (array-like, optional) – Indices for the first subset. Required if subsetted is True.

  • subset_ind2 (array-like, optional) – Indices for the second subset. Optional.

Returns

Fitted estimator.

Return type

self

predict(B_test, *args)[source]

Predict outcomes for new treatments.

Parameters
  • B_test (array-like) – New treatments for the second stage.

  • *args – Additional arguments, expected to be A_test (new treatments for the first stage).

Returns

Predicted outcomes.

Return type

array-like

class rkhs2iv.RKHS2IVL2CV(kernel='rbf', gamma=2, degree=3, coef0=1, kernel_params=None, delta_scale='auto', delta_exp='auto', alpha_scales='auto', n_alphas=30, cv=6)[source]

Bases: RKHS2IVL2

Nested RKHS IV estimator with L2 regularization and cross-validation.

This class implements a nested RKHS IV estimator with L2 regularization and cross-validation.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • alpha_scales (str or array-like) – Scales of the regularization parameter to try during cross-validation, or ‘auto’.

  • n_alphas (int) – Number of alpha scales to try.

  • cv (int) – Number of folds for cross-validation.

  • kernel_params (dict) – Additional parameters for the kernel.

fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fit the nested RKHS IV estimator with L2 regularization and cross-validation.

Parameters
  • A (array-like) – Instrumental variables for the first stage.

  • B (array-like) – Treatments for the first stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Treatments for the second stage.

  • Y (array-like) – Outcomes.

  • W (array-like, optional) – Weights. Defaults to None.

  • subsetted (bool, optional) – Whether to use subsets. Defaults to False.

  • subset_ind1 (array-like, optional) – Indices for the first subset. Required if subsetted is True.

  • subset_ind2 (array-like, optional) – Indices for the second subset. Optional.

Returns

Fitted estimator.

Return type

self

class rkhs2iv._BaseRKHS2IV(*args, **kwargs)[source]

Bases: object

Base class for nested RKHS IV methods.

This class provides common functionality for nested RKHS IV estimators.

Parameters
  • kernel (str or callable) – Kernel function or string identifier.

  • gamma (float) – Gamma parameter for the kernel.

  • degree (int) – Degree for polynomial kernels.

  • coef0 (float) – Zero coefficient for polynomial kernels.

  • delta_scale (str or float) – Scale of the critical radius.

  • delta_exp (str or float) – Exponent of the critical radius.

  • alpha_scale (str or float) – Scale of the regularization parameter.

  • kernel_params (dict) – Additional parameters for the kernel.

_get_alpha(delta, alpha_scale)[source]
_get_alpha_scale()[source]
_get_alpha_scales()[source]
_get_delta(n)[source]

Compute the critical radius.

Parameters

n (int) – Number of samples.

Returns

Critical radius.

Return type

float

_get_kernel(X, Y=None)[source]
rkhs2iv._check_auto(param)[source]

Random Forest

NPIV

This module provides implementations of ensemble instrumental variable (IV) estimators using RandomForest models.

Classes:

  • EnsembleIV: Implements an ensemble learning IV method with adversarial and learner components.
  • EnsembleIVStar: Similar to EnsembleIV but with a different method for updating the test predictions.
  • EnsembleIVL2: An extension of EnsembleIV with L2 regularization and optional cross-validation for regularization parameter selection.

Functions:

_mysign: A helper function that returns 1 if the input is non-negative and -1 otherwise.

class ensemble.EnsembleIV(adversary='auto', learner='auto', max_abs_value=4, n_iter=100)[source]

Bases: object

Implements an ensemble learning IV method with adversarial and learner components.

Parameters
  • adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.

  • learner (str or estimator) – Learner model. If ‘auto’, a default RandomForestClassifier is used.

  • max_abs_value (float) – Maximum absolute value for the predictions.

  • n_iter (int) – Number of iterations for the ensemble.

_check_input(Z, T, Y)[source]
_get_new_adversary()[source]
_get_new_learner()[source]
fit(Z, T, Y)[source]

Fits the ensemble IV model to the provided data.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatment variables.

  • Y (array-like) – Outcome variables.

Returns

Fitted ensemble IV model.

Return type

self

predict(T)[source]

Predicts outcomes for new data using the fitted ensemble IV model.

Parameters

T (array-like) – Treatment variables.

Returns

Predicted outcomes.

Return type

array

class ensemble.EnsembleIVL2(adversary='auto', learner='auto', n_iter=100, delta_scale='auto', delta_exp='auto', CV=False, alpha_scales='auto', n_alphas=30, n_folds=5)[source]

Bases: object

An extension of EnsembleIV with L2 regularization and optional cross-validation to select the best regularization parameter.

Parameters
  • adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.

  • learner (str or estimator) – Learner model. If ‘auto’, a default RandomForestRegressor is used.

  • n_iter (int) – Number of iterations for the ensemble.

  • delta_scale (str or float) – Scale factor for the critical radius delta. Default is ‘auto’.

  • delta_exp (str or float) – Exponent for the critical radius delta. Default is ‘auto’.

  • CV (bool) – Whether to perform cross-validation to select the best alpha value.

  • alpha_scales (str or list) – Scales for alpha in cross-validation. Default is ‘auto’.

  • n_alphas (int) – Number of alpha values to test in cross-validation.

  • n_folds (int) – Number of folds for cross-validation.

_check_input(Z, T, Y)[source]
_cross_validate_alpha(Z, T, Y)[source]

Performs cross-validation to select the best alpha value.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatment variables.

  • Y (array-like) – Outcome variables.

Returns

Best alpha value.

Return type

float
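
The selection loop behind a method like this can be sketched as a k-fold search over candidate alpha values. The example below uses a one-dimensional ridge regression as a stand-in model, not the ensemble IV estimator, and all function names are hypothetical.

```python
def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous (train, validation) pairs."""
    fold_sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
                  for i in range(n_folds)]
    splits, start = [], 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        splits.append((train, val))
        start += size
    return splits


def ridge_slope(x, y, alpha):
    """Stand-in model: 1-D ridge regression through the origin."""
    return sum(a * b for a, b in zip(x, y)) / (sum(a * a for a in x) + alpha)


def cross_validate_alpha(x, y, alphas, n_folds=5):
    """Return the alpha with the lowest mean validation squared error."""
    best_alpha, best_loss = None, float("inf")
    for alpha in alphas:
        fold_losses = []
        for train, val in k_fold_indices(len(x), n_folds):
            slope = ridge_slope([x[i] for i in train], [y[i] for i in train], alpha)
            fold_losses.append(
                sum((y[i] - slope * x[i]) ** 2 for i in val) / len(val))
        mean_loss = sum(fold_losses) / n_folds
        if mean_loss < best_loss:
            best_alpha, best_loss = alpha, mean_loss
    return best_alpha


x = [float(i) for i in range(1, 21)]
y = [3.0 * v for v in x]  # noiseless data, so less shrinkage fits better
best = cross_validate_alpha(x, y, alphas=[0.01, 1.0, 100.0])
```

On noiseless data the smallest candidate wins; with noisy data the selected value typically moves toward stronger regularization.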

_get_alpha_scales()[source]
_get_delta(n)[source]

Computes the critical radius delta based on the sample size.

Parameters

n (int) – Sample size.

Returns

Critical radius delta.

Return type

float

_get_new_adversary()[source]
_get_new_learner()[source]
fit(Z, T, Y, alpha=1.0, cross_validating=False)[source]

Fits the ensemble IV model with L2 regularization to the provided data.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatment variables.

  • Y (array-like) – Outcome variables.

  • alpha (float) – Regularization parameter.

  • cross_validating (bool) – Whether the function is called during cross-validation.

Returns

Fitted ensemble IV model.

Return type

self

predict(T)[source]

Predicts outcomes for new data using the fitted ensemble IV model with L2 regularization.

Parameters

T (array-like) – Treatment variables.

Returns

Predicted outcomes.

Return type

array

class ensemble.EnsembleIVStar(adversary='auto', learner='auto', max_abs_value=4, n_iter=100)[source]

Bases: object

Similar to EnsembleIV but with a different method for updating the test predictions using a linear combination approach.

Parameters
  • adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.

  • learner (str or estimator) – Learner model. If ‘auto’, a default RandomForestClassifier is used.

  • max_abs_value (float) – Maximum absolute value for the predictions.

  • n_iter (int) – Number of iterations for the ensemble.

_check_input(Z, T, Y)[source]
_get_new_adversary()[source]
_get_new_learner()[source]
_update_test(Z, Y, pred_old, adv)[source]
fit(Z, T, Y)[source]

Fits the ensemble IV model to the provided data.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatment variables.

  • Y (array-like) – Outcome variables.

Returns

Fitted ensemble IV model.

Return type

self

predict(T)[source]

Predicts outcomes for new data using the fitted ensemble IV model.

Parameters

T (array-like) – Treatment variables.

Returns

Predicted outcomes.

Return type

array

ensemble._mysign(x)[source]
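
Assuming this helper is a sign function with values in {-1, 1}, a minimal sketch is shown below. The name mysign_sketch is hypothetical; the point of interest is that zero maps to 1 rather than 0, unlike the usual mathematical sign.

```python
def mysign_sketch(x):
    """Map non-negative inputs to 1 and negative inputs to -1."""
    return 2 * (x >= 0) - 1


labels = [mysign_sketch(v) for v in (-2.0, 0.0, 3.5)]
```

Written this way the expression also works elementwise on array types that support comparison and arithmetic.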

Nested NPIV

This module provides implementations of nested nonparametric instrumental variable (NPIV) estimators using ensemble RandomForest models.

Classes:

  • Ensemble2IV: Implements a nested ensemble learning IV method with two adversaries and two learners.
  • Ensemble2IVL2: An extension of Ensemble2IV with L2 regularization and optional cross-validation for regularization parameter selection.

Functions:

_mysign: A helper function that returns 1 if the input is non-negative and -1 otherwise.

class ensemble2.Ensemble2IV(adversary='auto', learnerg='auto', learnerh='auto', max_abs_value=4, n_iter=100, n_burn_in=10)[source]

Bases: object

Implements a nested ensemble learning IV method with two adversaries and two learners.

Parameters
  • adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.

  • learnerg (str or estimator) – Learner model for g. If ‘auto’, a default RandomForestClassifier is used.

  • learnerh (str or estimator) – Learner model for h. If ‘auto’, a default RandomForestClassifier is used.

  • max_abs_value (float) – Maximum absolute value for the predictions.

  • n_iter (int) – Number of iterations for the ensemble.

  • n_burn_in (int) – Number of burn-in iterations.

_check_input(A, B, C, D, Y, W)[source]
_get_new_adversary()[source]
_get_new_learnerg()[source]
_get_new_learnerh()[source]
fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fits the nested ensemble IV model to the provided data.

Parameters
  • A (array-like) – Instrumental variables for the first stage.

  • B (array-like) – Instrumental variables for the second stage.

  • C (array-like) – Treatment variables for the first stage.

  • D (array-like) – Treatment variables for the second stage.

  • Y (array-like) – Outcome variables.

  • W (array-like, optional) – Weights for the observations.

  • subsetted (bool) – If True, use subsets of data as indicated by subset_ind1 and subset_ind2.

  • subset_ind1 (array-like) – Indices for the first subset.

  • subset_ind2 (array-like) – Indices for the second subset.

Returns

Fitted nested ensemble IV model.

Return type

self

predict(B, *args)[source]

Predicts outcomes for new data using the fitted nested ensemble IV model.

Parameters
  • B (array-like) – Instrumental variables for the second stage.

  • args (tuple) – Optional second argument for instrumental variables of the first stage.

Returns

Predicted outcomes for the second stage. If a second argument is provided, returns a tuple with predictions for both stages.

Return type

array
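
The optional second positional argument changes the return shape. The sketch below illustrates that dispatch pattern with stand-in functions in place of fitted models. The order of the returned tuple (second-stage predictions first) is an assumption, and all names are hypothetical.

```python
def fitted_h(b):
    """Stand-in for the fitted second-stage function."""
    return 2.0 * b


def fitted_g(a):
    """Stand-in for the fitted first-stage function."""
    return a + 1.0


def predict_sketch(model_h, model_g, B, *args):
    """Return h(B), or the pair (h(B), g(A)) when a second array A is passed."""
    pred_h = [model_h(b) for b in B]
    if len(args) == 0:
        return pred_h
    if len(args) > 1:
        raise TypeError("expected at most one extra positional argument")
    A = args[0]
    return pred_h, [model_g(a) for a in A]


only_h = predict_sketch(fitted_h, fitted_g, [1.0, 2.0])
both = predict_sketch(fitted_h, fitted_g, [1.0, 2.0], [0.0, 3.0])
```

Callers therefore need to unpack the result only when they pass the second array.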

class ensemble2.Ensemble2IVL2(adversary='auto', learnerg='auto', learnerh='auto', n_iter=100, n_burn_in=10, delta_scale='auto', delta_exp='auto', CV=False, alpha_scales='auto', n_alphas=30, n_folds=5)[source]

Bases: object

An extension of Ensemble2IV with L2 regularization and optional cross-validation to select the best regularization parameter.

Parameters
  • adversary (str or estimator) – Adversary model. If ‘auto’, a default RandomForestRegressor is used.

  • learnerg (str or estimator) – Learner model for g. If ‘auto’, a default RandomForestRegressor is used.

  • learnerh (str or estimator) – Learner model for h. If ‘auto’, a default RandomForestRegressor is used.

  • n_iter (int) – Number of iterations for the ensemble.

  • n_burn_in (int) – Number of burn-in iterations.

  • delta_scale (str or float) – Scale factor for the critical radius delta. Default is ‘auto’.

  • delta_exp (str or float) – Exponent for the critical radius delta. Default is ‘auto’.

  • CV (bool) – Whether to perform cross-validation to select the best alpha value.

  • alpha_scales (str or list) – Scales for alpha in cross-validation. Default is ‘auto’.

  • n_alphas (int) – Number of alpha values to test in cross-validation.

  • n_folds (int) – Number of folds for cross-validation.

_check_input(A, B, C, D, Y, W)[source]
_cross_validate_alpha(A, B, C, D, Y, W)[source]

Performs cross-validation to select the best alpha value.

Parameters
  • A (array-like) – Instrumental variables for the first stage.

  • B (array-like) – Instrumental variables for the second stage.

  • C (array-like) – Treatment variables for the first stage.

  • D (array-like) – Treatment variables for the second stage.

  • Y (array-like) – Outcome variables.

  • W (array-like) – Weights for the observations.

Returns

Best alpha value.

Return type

float

_get_alpha_scales()[source]
_get_delta(n)[source]

Computes the critical radius delta based on the sample size.

Parameters

n (int) – Sample size.

Returns

Critical radius delta.

Return type

float

_get_new_adversary()[source]
_get_new_learnerg()[source]
_get_new_learnerh()[source]
fit(A, B, C, D, Y, W=None, alpha=1.0, cross_validating=False, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fits the nested ensemble IV model with L2 regularization to the provided data.

Parameters
  • A (array-like) – Instrumental variables for the first stage.

  • B (array-like) – Instrumental variables for the second stage.

  • C (array-like) – Treatment variables for the first stage.

  • D (array-like) – Treatment variables for the second stage.

  • Y (array-like) – Outcome variables.

  • W (array-like, optional) – Weights for the observations.

  • alpha (float) – Regularization parameter.

  • cross_validating (bool) – Whether the function is called during cross-validation.

  • subsetted (bool) – If True, use subsets of data as indicated by subset_ind1 and subset_ind2.

  • subset_ind1 (array-like) – Indices for the first subset.

  • subset_ind2 (array-like) – Indices for the second subset.

Returns

Fitted nested ensemble IV model.

Return type

self

predict(B, *args)[source]

Predicts outcomes for new data using the fitted nested ensemble IV model with L2 regularization.

Parameters
  • B (array-like) – Instrumental variables for the second stage.

  • args (tuple) – Optional second argument for instrumental variables of the first stage.

Returns

Predicted outcomes for the second stage. If a second argument is provided, returns a tuple with predictions for both stages.

Return type

array

ensemble2._mysign(x)[source]

Neural Networks

NPIV

This module provides implementations of adversarial generalized method of moments (AGMM) estimators using neural networks.

Classes:

  • _BaseAGMM: Base class for AGMM models.
  • _BaseSupLossAGMM: Base class for AGMM models with supervised loss.
  • AGMM: Adversarial Generalized Method of Moments estimator.
  • KernelLayerMMDGMM: AGMM with kernel layer using Maximum Mean Discrepancy.
  • CentroidMMDGMM: AGMM with centroid-based Maximum Mean Discrepancy.
  • KernelLossAGMM: AGMM with kernel loss.
  • MMDGMM: AGMM with Maximum Mean Discrepancy.

class agmm.AGMM(learner, adversary)[source]

Bases: _BaseSupLossAGMM

Adversarial Generalized Method of Moments estimator.

Parameters
  • learner – a PyTorch neural net module for the learner.

  • adversary – a PyTorch neural net module for the adversary.

class agmm.CentroidMMDGMM(learner, adversary_g, kernel, centers, sigma)[source]

Bases: _BaseSupLossAGMM

AGMM with centroid-based Maximum Mean Discrepancy.

Parameters
  • learner – a PyTorch neural net module for the learner.

  • adversary_g – a PyTorch neural net module for the g function of the adversary.

  • kernel – the kernel function.

  • centers – numpy array containing the initial value of the centers in the Z space.

  • sigma – float corresponding to the precision of the kernel.

class agmm.KernelLayerMMDGMM(learner, adversary_g, g_features, n_centers, kernel, centers=None, sigmas=None, trainable=True)[source]

Bases: _BaseSupLossAGMM

AGMM with kernel layer using Maximum Mean Discrepancy.

Parameters
  • learner – a PyTorch neural net module for the learner.

  • adversary_g – a PyTorch neural net module for the g function of the adversary.

  • g_features – the number of output features of g.

  • n_centers – the number of centers to use in the kernel layer.

  • kernel – the kernel function.

  • centers – numpy array containing the initial value of the centers in the g(Z) space.

  • sigmas – numpy array containing the initial value of the sigma for each center.

  • trainable – whether to train the centers and the sigmas.

class agmm.KernelLossAGMM(learner, adversary_g, kernel, sigma)[source]

Bases: _BaseAGMM

AGMM with kernel loss.

Parameters
  • learner – a PyTorch neural net module for the learner.

  • adversary_g – a PyTorch neural net module for the g function of the adversary.

  • kernel – the kernel function.

  • sigma – float corresponding to the precision of the kernel.

fit(Z, T, Y, learner_l2=0.001, adversary_l2=0.0001, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs=100, train_learner_every=1, train_adversary_every=1, ols_weight=0.0, warm_start=False, logger=None, model_dir='.', device=None, verbose=0)[source]
Parameters
  • Z – instruments.

  • T – treatments.

  • Y – outcome.

  • learner_l2 – L2 regularization of the learner's parameters.

  • adversary_l2 – L2 regularization of the adversary's parameters.

  • learner_lr – learning rate of the Adam optimizer for the learner.

  • adversary_lr – learning rate of the Adam optimizer for the adversary.

  • n_epochs – number of passes over the data.

  • bs – batch size.

  • train_learner_every – number of adversary training iterations between learner updates.

  • ols_weight – weight on the OLS (square loss) objective.

  • warm_start – if False, the network parameters are initialized at the beginning; otherwise training starts from their current weights.

  • logger – a function that takes (learner, adversary, epoch, writer) as input and is called after every epoch; intended for logging the state of the learning.

  • model_dir – folder in which to store the learned models after every epoch.

class agmm.MMDGMM(learner, adversary_g, n_samples, kernel, sigma)[source]

Bases: _BaseAGMM

AGMM with Maximum Mean Discrepancy.

Parameters
  • learner – a pytorch neural net module for the learner.

  • adversary_g – a pytorch neural net module for the g function of the adversary.

  • n_samples – number of samples.

  • kernel – the kernel function.

  • sigma – float corresponding to the precision of the kernel.

fit(Z, T, Y, learner_l2=0.001, adversary_l2=0.0001, adversary_norm_reg=0.001, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs1=100, bs2=100, bs3=100, train_learner_every=1, train_adversary_every=1, ols_weight=0.0, warm_start=False, logger=None, model_dir='.', device=None, verbose=0)[source]
Parameters
  • Z – instruments.

  • T – treatments.

  • Y – outcome.

  • learner_l2 – L2 regularization of the learner's parameters.

  • adversary_l2 – L2 regularization of the adversary's parameters.

  • learner_lr – learning rate of the Adam optimizer for the learner.

  • adversary_lr – learning rate of the Adam optimizer for the adversary.

  • n_epochs – number of passes over the data.

  • bs1, bs2, bs3 – batch sizes.

  • train_learner_every – number of adversary training iterations between learner updates.

  • ols_weight – weight on the OLS (square loss) objective.

  • warm_start – if False, the network parameters are initialized at the beginning; otherwise training starts from their current weights.

  • logger – a function that takes (learner, adversary, epoch, writer) as input and is called after every epoch; intended for logging the state of the learning.

  • model_dir – folder in which to store the learned models after every epoch.

class agmm._BaseAGMM[source]

Bases: object

Base class for AGMM models.

_pretrain(Z, T, Y, learner_l2, adversary_l2, adversary_norm_reg, learner_lr, adversary_lr, n_epochs, bs, train_learner_every, train_adversary_every, warm_start, logger, model_dir, device, verbose, add_sample_inds=False)[source]

Prepares the variables required to begin training.

predict(T, model='avg', burn_in=0, alpha=None)[source]

Predicts outcomes using the fitted AGMM model.

Parameters
  • T – treatments.

  • model – one of ('avg', 'final'): whether to use an average of the models or the final model.

  • burn_in – number of initial epochs to discard when averaging.

  • alpha – if a float rather than None, also returns the alpha/2 and 1 - alpha/2 percentiles of the predictions across epochs (a proxy for a confidence interval).
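
The interaction of model, burn_in and alpha can be sketched with plain NumPy. The function below is a hypothetical stand-in, not the library's predict: it assumes the predictions of every epoch's model are already stacked in an array of shape (n_epochs, n_samples).

```python
import numpy as np

def summarize_epoch_predictions(preds, model="avg", burn_in=0, alpha=None):
    """Hypothetical helper: combine per-epoch predictions (one row per epoch)."""
    preds = np.asarray(preds, dtype=float)
    if model == "final":
        # Use only the predictions of the last epoch.
        return preds[-1]
    if model != "avg":
        raise ValueError("model must be 'avg' or 'final'")
    # Discard the first burn_in epochs, then average the rest.
    kept = preds[burn_in:]
    point = kept.mean(axis=0)
    if alpha is None:
        return point
    # Percentiles across epochs serve as a rough proxy for a confidence band.
    lower = np.percentile(kept, 100 * alpha / 2, axis=0)
    upper = np.percentile(kept, 100 * (1 - alpha / 2), axis=0)
    return point, lower, upper

# Five epochs, two samples.
preds = np.arange(10, dtype=float).reshape(5, 2)
avg = summarize_epoch_predictions(preds, model="avg", burn_in=2)
final = summarize_epoch_predictions(preds, model="final")
point, lower, upper = summarize_epoch_predictions(preds, burn_in=2, alpha=0.5)
```

In this sketch the band reflects variation across training epochs only, which is why the parameter description calls it a proxy rather than a confidence interval.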

class agmm._BaseSupLossAGMM[source]

Bases: _BaseAGMM

Base class for AGMM models with supervised loss.

fit(Z, T, Y, learner_l2=0.001, adversary_l2=0.0001, adversary_norm_reg=0.001, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs=100, train_learner_every=1, train_adversary_every=1, ols_weight=0.0, warm_start=False, logger=None, model_dir='.', device=None, verbose=0)[source]

Fits the AGMM model with supervised loss to the provided data.

Parameters
  • Z – instruments.

  • T – treatments.

  • Y – outcome.

  • learner_l2 – L2 regularization of the learner's parameters.

  • adversary_l2 – L2 regularization of the adversary's parameters.

  • adversary_norm_reg – adversary norm regularization weight.

  • learner_lr – learning rate of the Adam optimizer for the learner.

  • adversary_lr – learning rate of the Adam optimizer for the adversary.

  • n_epochs – number of passes over the data.

  • bs – batch size.

  • train_learner_every – number of adversary training iterations between learner updates.

  • ols_weight – weight on the OLS (square loss) objective.

  • warm_start – if False, the network parameters are initialized at the beginning; otherwise training starts from their current weights.

  • logger – a function that takes (learner, adversary, epoch, writer) as input and is called after every epoch; intended for logging the state of the learning.

  • model_dir – folder in which to store the learned models after every epoch.
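
To make the roles of n_epochs, bs, the two learning rates, train_learner_every, train_adversary_every and adversary_norm_reg concrete, here is a minimal NumPy sketch of alternating learner and adversary updates on a toy problem. It is not the library's training loop: the learner h(t) = theta * t and the adversary f(z) = w * z are scalar linear models, plain gradient steps replace Adam, and the update schedule (each player updated when the iteration counter is divisible by its train_*_every value) is an assumption. The function name toy_adversarial_gmm is hypothetical.

```python
import numpy as np

def toy_adversarial_gmm(Z, T, Y, adversary_norm_reg=1.0, learner_lr=0.05,
                        adversary_lr=0.05, n_epochs=50, bs=100,
                        train_learner_every=1, train_adversary_every=1, seed=0):
    """Hypothetical toy: min over theta, max over w of
    mean((Y - theta*T) * w*Z) - adversary_norm_reg * w**2 * mean(Z**2)."""
    rng = np.random.default_rng(seed)
    n = len(Y)
    theta, w = 0.0, 0.0
    it = 0
    for _ in range(n_epochs):
        perm = rng.permutation(n)  # one pass over the data per epoch
        for start in range(0, n, bs):
            idx = perm[start:start + bs]
            z, t, y = Z[idx], T[idx], Y[idx]
            resid = y - theta * t
            if it % train_adversary_every == 0:
                # Gradient ascent step for the adversary.
                grad_w = np.mean(resid * z) - 2.0 * adversary_norm_reg * w * np.mean(z ** 2)
                w += adversary_lr * grad_w
            if it % train_learner_every == 0:
                # Gradient descent step for the learner.
                grad_theta = -w * np.mean(t * z)
                theta -= learner_lr * grad_theta
            it += 1
    return theta

# Simulated data with an unobserved confounder U; the true effect of T on Y is 2.
rng = np.random.default_rng(0)
n = 2000
Z = rng.normal(size=n)
U = rng.normal(size=n)
T = Z + U + 0.5 * rng.normal(size=n)
Y = 2.0 * T + U + 0.1 * rng.normal(size=n)

theta_hat = toy_adversarial_gmm(Z, T, Y)
ols_slope = np.sum(T * Y) / np.sum(T * T)  # biased by the confounder
```

At the equilibrium of this toy game the adversary's weight is zero and the moment mean((Y - theta*T) * Z) vanishes, so theta_hat approaches the instrumental-variable estimate while the least-squares slope stays biased.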

agmm._kernel(x, y, basis_func, sigma)[source]

agmm.add_weight_decay(net, l2_value, skip_list=())[source]

Nested NPIV

This module provides implementations of joint estimation for nested nonparametric instrumental variables (NPIV) using neural networks.

Classes:

_BaseAGMM2: Base class for joint estimation of nested NPIV models.
_BaseSupLossAGMM2: Base class for joint estimation of nested NPIV models with supervised loss.
AGMM2: Adversarial Generalized Method of Moments estimator for nested NPIV.
_BaseSupLossAGMM2L2: Base class for joint estimation of nested NPIV models with L2 regularization.
AGMM2L2: Adversarial Generalized Method of Moments estimator for nested NPIV with L2 regularization.

class agmm2.AGMM2(learnerh, learnerg, adversary1, adversary2)[source]

Bases: _BaseSupLossAGMM2

Adversarial Generalized Method of Moments estimator for nested NPIV.

Parameters
  • learnerh – a pytorch neural net module for the second stage learner.

  • learnerg – a pytorch neural net module for the first stage learner.

  • adversary1 – a pytorch neural net module for the first stage adversary.

  • adversary2 – a pytorch neural net module for the second stage adversary.

class agmm2.AGMM2L2(learnerh, learnerg, adversary1, adversary2)[source]

Bases: _BaseSupLossAGMM2L2

Adversarial Generalized Method of Moments estimator for nested NPIV with L2 regularization.

Parameters
  • learnerh – a pytorch neural net module for the second stage learner.

  • learnerg – a pytorch neural net module for the first stage learner.

  • adversary1 – a pytorch neural net module for the first stage adversary.

  • adversary2 – a pytorch neural net module for the second stage adversary.

class agmm2._BaseAGMM2[source]

Bases: object

Base class for joint estimation of nested NPIV models.

_pretrain(A, B, C, D, Y, W, learner_l2, adversary_l2, adversary_norm_reg, learner_norm_reg, learner_lr, adversary_lr, n_epochs, bs, train_learner_every, train_adversary_every, warm_start, model_dir, device, verbose, add_sample_inds=False, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Prepares the variables required to begin training.

predict(B, A, model='avg', burn_in=0, alpha=None)[source]

Predicts outcomes using the fitted AGMM model.

Parameters
  • B – endogenous variables for the second stage.

  • A – endogenous variables for the first stage.

  • model – one of ('avg', 'final'): whether to use an average of the models or the final model.

  • burn_in – number of initial epochs to discard when averaging.

  • alpha – if a float rather than None, also returns the alpha/2 and 1 - alpha/2 percentiles of the predictions across epochs (a proxy for a confidence interval).

class agmm2._BaseSupLossAGMM2[source]

Bases: _BaseAGMM2

Base class for joint estimation of nested NPIV models with supervised loss.

fit(A, B, C, D, Y, W=None, learner_l2=0.001, adversary_l2=0.0001, adversary_norm_reg=0.001, learner_norm_reg=0.001, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs=100, train_learner_every=1, train_adversary_every=1, warm_start=False, model_dir='.', device=None, verbose=0, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fits the AGMM model with supervised loss to the provided data.

Parameters
  • A – endogenous variables for the first stage.

  • B – endogenous variables for the second stage.

  • C – instrument variables for the second stage.

  • D – instrument variables for the first stage.

  • Y – outcome.

  • W – weights for the second stage.

  • learner_l2 – L2 regularization of the learner's parameters.

  • adversary_l2 – L2 regularization of the adversary's parameters.

  • adversary_norm_reg – adversary norm regularization weight.

  • learner_norm_reg – learner norm regularization weight.

  • learner_lr – learning rate of the Adam optimizer for the learner.

  • adversary_lr – learning rate of the Adam optimizer for the adversary.

  • n_epochs – number of passes over the data.

  • bs – batch size.

  • train_learner_every – number of adversary training iterations between learner updates.

  • warm_start – if False, the network parameters are initialized at the beginning; otherwise training starts from their current weights.

  • model_dir – folder in which to store the learned models after every epoch.

class agmm2._BaseSupLossAGMM2L2[source]

Bases: _BaseAGMM2

Base class for joint estimation of nested NPIV models with L2 regularization.

fit(A, B, C, D, Y, W=None, learner_l2=0.001, adversary_l2=0.0001, adversary_norm_reg=0.001, learner_norm_reg=0.001, learner_lr=0.001, adversary_lr=0.001, n_epochs=100, bs=100, train_learner_every=1, train_adversary_every=1, warm_start=False, model_dir='.', device=None, verbose=0, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fits the AGMM model with L2 regularization to the provided data.

Parameters
  • A – endogenous variables for the first stage.

  • B – endogenous variables for the second stage.

  • C – instrument variables for the second stage.

  • D – instrument variables for the first stage.

  • Y – outcome.

  • W – weights for the second stage.

  • learner_l2 – L2 regularization of the learner's parameters.

  • adversary_l2 – L2 regularization of the adversary's parameters.

  • adversary_norm_reg – adversary norm regularization weight.

  • learner_norm_reg – learner norm regularization weight.

  • learner_lr – learning rate of the Adam optimizer for the learner.

  • adversary_lr – learning rate of the Adam optimizer for the adversary.

  • n_epochs – number of passes over the data.

  • bs – batch size.

  • train_learner_every – number of adversary training iterations between learner updates.

  • warm_start – if False, the network parameters are initialized at the beginning; otherwise training starts from their current weights.

  • model_dir – folder in which to store the learned models after every epoch.

agmm2.add_weight_decay(net, l2_value, skip_list=())[source]

Sparse Linear Function Spaces

NPIV

This module provides implementations of sparse linear NPIV estimators.

Classes:

_SparseLinearAdversarialGMM: Base class for sparse linear adversarial GMM.
sparse_l1vsl1: Sparse Linear NPIV estimator using $\ell_1$-$\ell_1$ optimization.
sparse_ridge_l1vsl1: Sparse Ridge NPIV estimator using $\ell_1$-$\ell_1$ optimization.

class sparse_l1_l1._SparseLinearAdversarialGMM(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: object

Base class for sparse linear adversarial GMM.

This class implements common functionality for sparse linear models using adversarial GMM.

Parameters
  • lambda_theta (float) – Regularization parameter.

  • B (int) – Budget parameter.

  • eta_theta (str or float) – Learning rate for theta.

  • eta_w (str or float) – Learning rate for w.

  • n_iter (int) – Number of iterations.

  • tol (float) – Tolerance for duality gap.

  • sparsity (int or None) – Sparsity level for the model.

  • fit_intercept (bool) – Whether to fit an intercept.

_check_input(Z, X, Y)[source]

property coef

property intercept

predict(X)[source]

class sparse_l1_l1.sparse_l1vsl1(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: _SparseLinearAdversarialGMM

Sparse Linear NPIV estimator using $\ell_1$-$\ell_1$ optimization.

This class solves the high-dimensional sparse linear problem using $\ell_1$ relaxations for the minimax optimization problem.

Parameters

Same as _SparseLinearAdversarialGMM.

_check_duality_gap(Z, X, Y)[source]

Check the duality gap to monitor convergence.

The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.

Parameters
  • Z (array-like) – Instrumental variables.

  • X (array-like) – Covariates.

  • Y (array-like) – Outcomes.

Returns

True if the duality gap is less than the tolerance, otherwise False.

Return type

bool
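
The idea of using a duality gap as a convergence certificate can be sketched on a generic bilinear game, min over theta and max over w of w^T M theta, with both players restricted to the probability simplex. The payoff matrix M, the simplex constraint and the function names below are assumptions chosen for illustration; the objective actually used by this class is not spelled out in this reference.

```python
import numpy as np

def duality_gap(M, theta_bar, w_bar):
    """Hypothetical helper: gap of averaged iterates in min_theta max_w w^T M theta."""
    M = np.asarray(M, dtype=float)
    # Best the adversary can achieve against the averaged learner iterate.
    adversary_best = np.max(M @ theta_bar)
    # Best the learner can achieve against the averaged adversary iterate.
    learner_best = np.min(M.T @ w_bar)
    # The gap is non-negative and equals zero exactly at an equilibrium.
    return adversary_best - learner_best

def has_converged(M, theta_bar, w_bar, tol=0.01):
    return bool(duality_gap(M, theta_bar, w_bar) < tol)

M = np.array([[1.0, -1.0], [-1.0, 1.0]])
uniform = np.array([0.5, 0.5])
corner = np.array([1.0, 0.0])

gap_at_equilibrium = duality_gap(M, uniform, uniform)
gap_at_corner = duality_gap(M, corner, corner)
```

Because a linear function over the simplex is maximized at a vertex, each best response reduces to a max or min over coordinates, which keeps the check cheap to run during training.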

_post_process(Z, X, Y)[source]

fit(Z, X, Y)[source]

Fit the model.

Parameters
  • Z (array-like) – Instrumental variables.

  • X (array-like) – Covariates.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

class sparse_l1_l1.sparse_ridge_l1vsl1(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: _SparseLinearAdversarialGMM

Sparse Ridge NPIV estimator using $\ell_1$-$\ell_1$ optimization.

This class solves the high-dimensional sparse ridge problem using $\ell_1$ relaxations for the minimax optimization problem.

Parameters

Same as _SparseLinearAdversarialGMM.

_check_duality_gap(Z, X, Y)[source]

Check the duality gap to monitor convergence.

The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.

Parameters
  • Z (array-like) – Instrumental variables.

  • X (array-like) – Covariates.

  • Y (array-like) – Outcomes.

Returns

True if the duality gap is less than the tolerance, otherwise False.

Return type

bool

_post_process(Z, X, Y)[source]

fit(Z, X, Y)[source]

Fit the model.

Parameters
  • Z (array-like) – Instrumental variables.

  • X (array-like) – Covariates.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

Nested NPIV

This module provides implementations of sparse linear NPIV estimators with L1 norm regularization for nested NPIV.

Classes:

_SparseLinear2AdversarialGMM: Base class for sparse linear adversarial GMM for nested NPIV.
sparse2_l1vsl1: Sparse Linear NPIV estimator using $\ell_1$-$\ell_1$ optimization for nested NPIV.
sparse2_ridge_l1vsl1: Sparse Ridge NPIV estimator using $\ell_1$-$\ell_1$ optimization for nested NPIV.

class sparse2_l1_l1._SparseLinear2AdversarialGMM(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: object

Base class for sparse linear adversarial GMM for nested NPIV.

This class implements common functionality for sparse linear models using adversarial GMM in a nested NPIV setting.

Parameters
  • mu (float) – Regularization parameter.

  • V1 (int) – Budget parameter for the first stage.

  • V2 (int) – Budget parameter for the second stage.

  • eta_alpha (str or float) – Learning rate for alpha.

  • eta_w1 (str or float) – Learning rate for w1.

  • eta_beta (str or float) – Learning rate for beta.

  • eta_w2 (str or float) – Learning rate for w2.

  • n_iter (int) – Number of iterations.

  • tol (float) – Tolerance for duality gap.

  • sparsity (int or None) – Sparsity level for the model.

  • fit_intercept (bool) – Whether to fit an intercept.

_check_input(A, B, C, D, Y, W)[source]

Check and preprocess input arrays.

Parameters
  • A (array-like) – Covariates for the first stage.

  • B (array-like) – Covariates for the second stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Instrumental variables for the first stage.

  • Y (array-like) – Outcomes.

  • W (array-like) – Weights.

Returns

Processed A, B, C, D, Y, W.

Return type

tuple

property coef

property intercept

predict(B, *args)[source]

Predict using the fitted model.

Parameters
  • B (array-like) – Covariates for the second stage.

  • args (array-like) – Optional covariates for the first stage.

Returns

Predicted values for the second stage. If args are provided, also returns predicted values for the first stage.

Return type

array

weighted_mean(arr, weights, axis=0)[source]

Compute the weighted mean of an array.

Parameters
  • arr (array-like) – Input array.

  • weights (array-like) – Weights for computing the mean.

  • axis (int, optional) – Axis along which the mean is computed.

Returns

Weighted mean.

Return type

array
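
A weighted mean along an axis can be written with numpy.average. The sketch below assumes one-dimensional weights whose length matches arr along the chosen axis; whether the method here expects that shape is not stated in this reference, and the function name weighted_mean_sketch is hypothetical.

```python
import numpy as np

def weighted_mean_sketch(arr, weights, axis=0):
    """Hypothetical helper: sum(weights * arr) / sum(weights) along axis."""
    arr = np.asarray(arr, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.average(arr, weights=weights, axis=axis)

arr = np.array([[1.0, 2.0], [3.0, 4.0]])
weights = np.array([1.0, 3.0])
result = weighted_mean_sketch(arr, weights, axis=0)
```

With equal weights the result coincides with the ordinary mean along the same axis.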

class sparse2_l1_l1.sparse2_l1vsl1(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: _SparseLinear2AdversarialGMM

Sparse Linear NPIV estimator using $\ell_1$-$\ell_1$ optimization for nested NPIV.

This class solves the high-dimensional sparse linear problem using $\ell_1$ relaxations for the minimax optimization problem in a nested NPIV setting.

Parameters

Same as _SparseLinear2AdversarialGMM.

_check_duality_gap(A, B, C, D, Y, W)[source]

Calculate the duality gap to certify convergence of the algorithm.

The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.

Parameters
  • A (array-like) – Covariates for the first stage.

  • B (array-like) – Covariates for the second stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Instrumental variables for the first stage.

  • Y (array-like) – Outcomes.

  • W (array-like) – Weights.

Returns

True if the duality gap is below the tolerance level, indicating convergence.

Return type

bool

_post_process(A, B, C, D, Y, W)[source]

fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fit the model.

Parameters
  • A (array-like) – Covariates for the first stage.

  • B (array-like) – Covariates for the second stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Instrumental variables for the first stage.

  • Y (array-like) – Outcomes.

  • W (array-like, optional) – Weights. Defaults to None.

  • subsetted (bool, optional) – Whether to use subsets. Defaults to False.

  • subset_ind1 (array-like, optional) – Subset indices for the first stage. Required if subsetted is True.

  • subset_ind2 (array-like, optional) – Subset indices for the second stage. Defaults to None.

Returns

Fitted estimator.

Return type

self

class sparse2_l1_l1.sparse2_ridge_l1vsl1(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: _SparseLinear2AdversarialGMM

Sparse Ridge NPIV estimator using $\ell_1$-$\ell_1$ optimization for nested NPIV.

This class solves the high-dimensional sparse ridge problem using $\ell_1$ relaxations for the minimax optimization problem in a nested NPIV setting.

Parameters

Same as _SparseLinear2AdversarialGMM.

_check_duality_gap(A, B, C, D, Y, W)[source]

Calculate the duality gap to certify convergence of the algorithm.

The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.

Parameters
  • A (array-like) – Covariates for the first stage.

  • B (array-like) – Covariates for the second stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Instrumental variables for the first stage.

  • Y (array-like) – Outcomes.

  • W (array-like) – Weights.

Returns

True if the duality gap is below the tolerance level, indicating convergence.

Return type

bool

_post_process(A, B, C, D, Y, W)[source]

fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fit the model.

Parameters
  • A (array-like) – Covariates for the first stage.

  • B (array-like) – Covariates for the second stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Instrumental variables for the first stage.

  • Y (array-like) – Outcomes.

  • W (array-like, optional) – Weights. Defaults to None.

  • subsetted (bool, optional) – Whether to use subsets. Defaults to False.

  • subset_ind1 (array-like, optional) – Subset indices for the first stage. Required if subsetted is True.

  • subset_ind2 (array-like, optional) – Subset indices for the second stage. Defaults to None.

Returns

Fitted estimator.

Return type

self

Regularized Linear Function Spaces

NPIV

This module provides implementations of sparse linear NPIV estimators with L2 norm regularization.

Classes:

_SparseLinearAdversarialGMM: Base class for sparse linear adversarial GMM.
sparse_l2vsl2: Sparse Linear NPIV estimator using $\ell_2$-$\ell_2$ optimization.
sparse_ridge_l2vsl2: Sparse Ridge NPIV estimator using $\ell_2$-$\ell_2$ optimization.

class sparse_l2_l2._SparseLinearAdversarialGMM(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: object

Base class for sparse linear adversarial GMM.

This class implements common functionality for sparse linear models using adversarial GMM.

Parameters
  • lambda_theta (float) – Regularization parameter.

  • B (int) – Budget parameter.

  • eta_theta (str or float) – Learning rate for theta.

  • eta_w (str or float) – Learning rate for w.

  • n_iter (int) – Number of iterations.

  • tol (float) – Tolerance for duality gap.

  • sparsity (int or None) – Sparsity level for the model.

  • fit_intercept (bool) – Whether to fit an intercept.

fit(Z, X, Y)

Fit the model.

_check_input(Z, X, Y)[source]

property coef

property intercept

predict(X)[source]

Predict using the fitted model.

Parameters

X (array-like) – Covariates.

Returns

Predicted values.

Return type

array

class sparse_l2_l2.sparse_l2vsl2(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: _SparseLinearAdversarialGMM

Sparse Linear NPIV estimator using $\ell_2$-$\ell_2$ optimization.

This class solves the high-dimensional sparse linear problem using $\ell_2$ relaxations for the minimax optimization problem.

Parameters

Same as _SparseLinearAdversarialGMM.

_check_duality_gap(Z, X, Y)[source]

Check the duality gap to monitor convergence.

The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.

Parameters
  • Z (array-like) – Instrumental variables.

  • X (array-like) – Covariates.

  • Y (array-like) – Outcomes.

Returns

True if the duality gap is less than the tolerance, otherwise False.

Return type

bool

_post_process(Z, X, Y)[source]

fit(Z, X, Y)[source]

Fit the model.

Parameters
  • Z (array-like) – Instrumental variables.

  • X (array-like) – Covariates.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

class sparse_l2_l2.sparse_ridge_l2vsl2(lambda_theta=0.01, B=100, eta_theta='auto', eta_w='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: _SparseLinearAdversarialGMM

Sparse Ridge NPIV estimator using $\ell_2$-$\ell_2$ optimization.

This class solves the high-dimensional sparse ridge problem using $\ell_2$ relaxations for the minimax optimization problem.

Parameters

Same as _SparseLinearAdversarialGMM.

_check_duality_gap(Z, X, Y)[source]

Check the duality gap to monitor convergence.

The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.

Parameters
  • Z (array-like) – Instrumental variables.

  • X (array-like) – Covariates.

  • Y (array-like) – Outcomes.

Returns

True if the duality gap is less than the tolerance, otherwise False.

Return type

bool

_post_process(Z, X, Y)[source]

fit(Z, X, Y)[source]

Fit the model.

Parameters
  • Z (array-like) – Instrumental variables.

  • X (array-like) – Covariates.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

Nested NPIV

This module provides implementations of sparse linear NPIV estimators using $\ell_2$-$\ell_2$ optimization for nested NPIV.

class sparse2_l2_l2._SparseLinear2AdversarialGMM(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: object

Base class for sparse linear adversarial GMM for nested NPIV.

This class implements common functionality for sparse linear models using adversarial GMM in a nested NPIV setting.

Parameters
  • mu (float) – Regularization parameter.

  • V1 (int) – Budget parameter for the first stage.

  • V2 (int) – Budget parameter for the second stage.

  • eta_alpha (str or float) – Learning rate for alpha.

  • eta_w1 (str or float) – Learning rate for w1.

  • eta_beta (str or float) – Learning rate for beta.

  • eta_w2 (str or float) – Learning rate for w2.

  • n_iter (int) – Number of iterations.

  • tol (float) – Tolerance for duality gap.

  • sparsity (int or None) – Sparsity level for the model.

  • fit_intercept (bool) – Whether to fit an intercept.

_check_input(A, B, C, D, Y, W)[source]

property coef

property intercept

predict(B, *args)[source]

Predict using the fitted model.

Parameters
  • B (array-like) – Covariates for the second stage.

  • *args – Optional. If provided, the first argument is treated as the covariates for the first stage.

Returns

Predicted values for the second stage. If first-stage covariates are also passed through *args, returns a tuple of predictions for both stages.

Return type

array or tuple
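
The optional-argument behaviour of predict can be sketched as follows. The class TwoStageLinearSketch and its attributes coef_second and coef_first are hypothetical, intercepts are omitted, and the ordering of the returned tuple (second stage first) is an assumption.

```python
import numpy as np

class TwoStageLinearSketch:
    """Hypothetical stand-in holding one linear coefficient vector per stage."""

    def __init__(self, coef_second, coef_first):
        self.coef_second = np.asarray(coef_second, dtype=float)
        self.coef_first = np.asarray(coef_first, dtype=float)

    def predict(self, B, *args):
        # Second-stage prediction is always returned.
        second = np.asarray(B, dtype=float) @ self.coef_second
        if len(args) == 0:
            return second
        # The first extra positional argument is treated as first-stage covariates.
        A = np.asarray(args[0], dtype=float)
        return second, A @ self.coef_first

model = TwoStageLinearSketch(coef_second=[1.0, 2.0], coef_first=[3.0])
B = np.array([[1.0, 1.0], [0.0, 2.0]])
A = np.array([[1.0], [2.0]])

only_second = model.predict(B)
second, first = model.predict(B, A)
```

Callers that pass only B receive a single array, so the return value should be unpacked only when first-stage covariates are supplied.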

weighted_mean(arr, weights, axis=0)[source]

Compute the weighted mean of an array along the specified axis.

Parameters
  • arr (array-like) – Input array.

  • weights (array-like) – Weights for the mean computation.

  • axis (int, optional) – Axis along which to compute the mean. Defaults to 0.

Returns

Weighted mean.

Return type

array

class sparse2_l2_l2.sparse2_l2vsl2(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: _SparseLinear2AdversarialGMM

Sparse Linear NPIV estimator using $\ell_2$-$\ell_2$ optimization for nested NPIV.

This class solves the high-dimensional sparse linear problem using $\ell_2$ relaxations for the minimax optimization problem in a nested NPIV setting.

Parameters

Same as _SparseLinear2AdversarialGMM.

_check_duality_gap(A, B, C, D, Y, W)[source]

Calculate the duality gap to certify convergence of the algorithm.

The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.

Parameters
  • A (array-like) – Covariates for the first stage.

  • B (array-like) – Covariates for the second stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Instrumental variables for the first stage.

  • Y (array-like) – Outcomes.

  • W (array-like) – Weights.

Returns

True if the duality gap is below the tolerance level, indicating convergence.

Return type

bool

_post_process(A, B, C, D, Y, W)[source]

fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fit the model.

Parameters
  • A (array-like) – Covariates for the first stage.

  • B (array-like) – Covariates for the second stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Instrumental variables for the first stage.

  • Y (array-like) – Outcomes.

  • W (array-like, optional) – Weights. Defaults to None.

  • subsetted (bool, optional) – Whether to use subsets. Defaults to False.

  • subset_ind1 (array-like, optional) – Subset indices for the first stage. Required if subsetted is True.

  • subset_ind2 (array-like, optional) – Subset indices for the second stage. Defaults to None.

Returns

Fitted estimator.

Return type

self

class sparse2_l2_l2.sparse2_ridge_l2vsl2(mu=0.01, V1=100, V2=100, eta_alpha='auto', eta_w1='auto', eta_beta='auto', eta_w2='auto', n_iter=2000, tol=0.01, sparsity=None, fit_intercept=True)[source]

Bases: _SparseLinear2AdversarialGMM

Sparse Ridge NPIV estimator using $\ell_2$-$\ell_2$ optimization for nested NPIV.

This class solves the high-dimensional sparse ridge problem using $\ell_2$ relaxations for the minimax optimization problem in a nested NPIV setting.

Parameters

Same as _SparseLinear2AdversarialGMM.

_check_duality_gap(A, B, C, D, Y, W)[source]

Calculate the duality gap to certify convergence of the algorithm.

The ensembles can be thought of as primal and dual solutions, and the duality gap can be used as a certificate for convergence of the algorithm.

Parameters
  • A (array-like) – Covariates for the first stage.

  • B (array-like) – Covariates for the second stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Instrumental variables for the first stage.

  • Y (array-like) – Outcomes.

  • W (array-like) – Weights.

Returns

True if the duality gap is below the tolerance level, indicating convergence.

Return type

bool

_post_process(A, B, C, D, Y, W)[source]

fit(A, B, C, D, Y, W=None, subsetted=False, subset_ind1=None, subset_ind2=None)[source]

Fit the model.

Parameters
  • A (array-like) – Covariates for the first stage.

  • B (array-like) – Covariates for the second stage.

  • C (array-like) – Instrumental variables for the second stage.

  • D (array-like) – Instrumental variables for the first stage.

  • Y (array-like) – Outcomes.

  • W (array-like, optional) – Weights. Defaults to None.

  • subsetted (bool, optional) – Whether to use subsets. Defaults to False.

  • subset_ind1 (array-like, optional) – Subset indices for the first stage. Required if subsetted is True.

  • subset_ind2 (array-like, optional) – Subset indices for the second stage. Defaults to None.

Returns

Fitted estimator.

Return type

self

Linear Class

This module provides implementations of two-stage least squares (TSLS) and regularized TSLS using linear and elastic net regression.

Classes:

tsls: Two-stage least squares estimator.
regtsls: Regularized two-stage least squares estimator using Elastic Net.

class tsls.regtsls[source]

Bases: object

Regularized two-stage least squares estimator using Elastic Net.

This class implements the regularized TSLS estimator using Elastic Net regression.

fit(Z, T, Y)[source]

Fit the regularized TSLS estimator.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

predict(T)[source]

Predict outcomes based on the fitted model.

Parameters

T (array-like) – Treatments.

Returns

Predicted outcomes.

Return type

array-like

class tsls.tsls[source]

Bases: object

Two-stage least squares estimator.

This class implements the TSLS estimator.

fit(Z, T, Y)[source]

Fit the TSLS estimator.

Parameters
  • Z (array-like) – Instrumental variables.

  • T (array-like) – Treatments.

  • Y (array-like) – Outcomes.

Returns

Fitted estimator.

Return type

self

predict(T)[source]

Predict outcomes based on the fitted model.

Parameters

T (array-like) – Treatments.

Returns

Predicted outcomes.

Return type

array-like
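
For orientation, the two stages of TSLS can be sketched with ordinary least squares in NumPy. This is a generic textbook sketch under the assumption that both stages include an intercept; the function names tsls_fit and tsls_predict are hypothetical and the code is not taken from the tsls class.

```python
import numpy as np

def _with_intercept(X):
    # Reshape to (n, d) and prepend a column of ones.
    X = np.asarray(X, dtype=float)
    X = X.reshape(len(X), -1)
    return np.column_stack([np.ones(len(X)), X])

def tsls_fit(Z, T, Y):
    """Hypothetical helper: returns [intercept, slopes...] of the second stage."""
    Z1 = _with_intercept(Z)
    T = np.asarray(T, dtype=float).reshape(len(T), -1)
    Y = np.asarray(Y, dtype=float).ravel()
    # Stage 1: regress the treatments on the instruments and keep the fitted values.
    gamma, *_ = np.linalg.lstsq(Z1, T, rcond=None)
    T_hat = Z1 @ gamma
    # Stage 2: regress the outcomes on the fitted treatments.
    beta, *_ = np.linalg.lstsq(_with_intercept(T_hat), Y, rcond=None)
    return beta

def tsls_predict(beta, T):
    return _with_intercept(T) @ beta

# Simulated data with an unobserved confounder U; the true model is Y = 1 + 2*T + noise.
rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=n)
U = rng.normal(size=n)
T = Z + U + 0.5 * rng.normal(size=n)
Y = 1.0 + 2.0 * T + U + 0.1 * rng.normal(size=n)

beta = tsls_fit(Z, T, Y)
preds = tsls_predict(beta, T[:5])
```

Because the fitted treatments depend on the instruments only, the second-stage slope is not contaminated by the confounder in this simulation, unlike a direct regression of Y on T.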