
Computes leaf membership internally as a sparse matrix and then calculates a dense kernel from that sparse matrix, entirely in C++.

Public fields

forest_kernel_ptr

External pointer to a C++ StochTree::ForestKernel class

Methods


Method new()

Create a new ForestKernel object.

Usage

ForestKernel$new()

Returns

A new ForestKernel object.


Method compute_leaf_indices()

Compute the leaf indices of each tree in the ensemble for every observation in a dataset. The result is stored internally and can be extracted from the object via a call to get_leaf_indices.

Usage

ForestKernel$compute_leaf_indices(
  covariates_train,
  covariates_test = NULL,
  forest_container,
  forest_num
)

Arguments

covariates_train

Matrix of training set covariates at which to compute leaf indices

covariates_test

(Optional) Matrix of test set covariates at which to compute leaf indices

forest_container

Object of type ForestSamples

forest_num

Index of the forest in forest_container to be assessed

Returns

List of vectors. If covariates_test = NULL, the list has one element (train set leaf indices); otherwise, it has two elements (train and test set leaf indices).
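
To make the idea of leaf indices concrete, the base R sketch below builds a toy set of leaf assignments by hand and converts it into a one-hot leaf membership matrix. It does not call stochtree. The names `leaf_ids` and `leaf_membership`, and the layout of one column per tree, are assumptions made for illustration and may not match the layout of the vectors this method returns.

```r
# Toy leaf assignments: 4 observations (rows) x 2 trees (columns).
# Entry [i, t] is the 1-based leaf of tree t that observation i falls into.
# This layout is an assumption for illustration only.
leaf_ids <- cbind(tree1 = c(1, 1, 2, 2),
                  tree2 = c(1, 2, 2, 3))

# Hypothetical helper: one-hot encode the leaves of each tree and bind the
# blocks column-wise, giving one column per leaf in the whole ensemble.
leaf_membership <- function(leaf_ids, leaves_per_tree) {
  blocks <- lapply(seq_len(ncol(leaf_ids)), function(t) {
    block <- matrix(0, nrow = nrow(leaf_ids), ncol = leaves_per_tree[t])
    # Matrix indexing: set entry (i, leaf of observation i) to 1
    block[cbind(seq_len(nrow(leaf_ids)), leaf_ids[, t])] <- 1
    block
  })
  do.call(cbind, blocks)
}

# 4 observations x (2 + 3) leaves
W <- leaf_membership(leaf_ids, leaves_per_tree = c(2, 3))
```

Each row of `W` contains exactly one 1 per tree, so every row sums to the number of trees.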


Method compute_kernel()

Compute the kernel implied by a tree ensemble. This method calls compute_leaf_indices internally, so it is not necessary to call both. compute_leaf_indices is exposed at the class level so that the vector of leaf indices for an ensemble can be extracted directly in R.

Usage

ForestKernel$compute_kernel(
  covariates_train,
  covariates_test = NULL,
  forest_container,
  forest_num
)

Arguments

covariates_train

Matrix of training set covariates at which to assess ensemble kernel

covariates_test

(Optional) Matrix of test set covariates at which to assess ensemble kernel

forest_container

Object of type ForestSamples

forest_num

Index of the forest in forest_container to be assessed

Returns

List of matrices. If covariates_test = NULL, the list contains one n_train x n_train matrix, where n_train = nrow(covariates_train). This matrix is the kernel defined by W_train %*% t(W_train), where W_train is a matrix with n_train rows and one column for each leaf in the ensemble. If covariates_test is not NULL, the list contains two additional matrices, defined by W_test %*% t(W_train) and W_test %*% t(W_test).
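
The three matrix products above can be illustrated in base R with small, hand-written leaf membership matrices. This is a minimal sketch that does not call stochtree: the values of `W_train` and `W_test` are invented, dense matrices stand in for the sparse representation used internally, and the names `K_train`, `K_test_train`, and `K_test` are hypothetical.

```r
# Toy leaf membership matrices for a 2-tree ensemble with 2 + 2 leaves.
# Rows are observations, columns are leaves; values are made up.
W_train <- rbind(c(1, 0, 1, 0),
                 c(1, 0, 0, 1),
                 c(0, 1, 0, 1))
W_test  <- rbind(c(1, 0, 1, 0),
                 c(0, 1, 1, 0))

K_train      <- W_train %*% t(W_train)  # n_train x n_train
K_test_train <- W_test  %*% t(W_train)  # n_test  x n_train
K_test       <- W_test  %*% t(W_test)   # n_test  x n_test
```

With one-hot membership matrices like these, entry (i, j) of each product counts the number of trees in which observations i and j land in the same leaf, so the diagonal of `K_train` equals the number of trees.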