Compute a kernel from a tree ensemble, defined by the fraction of trees in the ensemble in which two observations fall into the same leaf.

Usage

computeForestKernels(bart_model, X_train, X_test = NULL, forest_num = NULL)

Arguments

bart_model: Object of type bartmodel corresponding to a BART model with at least one sample
X_train: "Training" dataframe. In a traditional Gaussian process kriging context, this corresponds to the observations for which outcomes are observed.
X_test: (Optional) "Test" dataframe. In a traditional Gaussian process kriging context, this corresponds to the observations for which outcomes are unobserved and must be estimated based on the kernels k(X_test,X_test), k(X_test,X_train), and k(X_train,X_train). If not provided, this function will only compute k(X_train, X_train).
forest_num: (Optional) Index of the forest sample to use for kernel computation. If not provided, this function will use the last forest.

Value

List of kernel matrices. If X_test = NULL, the list contains one n_train x n_train matrix, where n_train = nrow(X_train). This matrix is the kernel defined by W_train %*% t(W_train), where W_train is a matrix with n_train rows and one column for each leaf in the ensemble. If X_test is not NULL, the list contains two more matrices: W_test %*% t(W_train), which is n_test x n_train, and W_test %*% t(W_test), which is n_test x n_test, where n_test = nrow(X_test).
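
Examples

The following base R sketch illustrates the matrix products described under Value on a toy ensemble of two trees. It does not call computeForestKernels and is not the package's internal implementation. The leaf assignments, the helper leaf_indicator, and all variable names are hypothetical, and the sketch assumes that W is a 0/1 indicator matrix with one column per leaf. Under that assumption, W %*% t(W) counts the trees in which two observations share a leaf, and dividing by the number of trees gives the fraction mentioned in the description. Whether the function itself applies that normalization is not stated here.

```r
# Hypothetical leaf assignments: rows are observations, columns are trees.
# Entry [i, j] is the leaf of tree j that observation i falls into.
leaves_train <- matrix(c(1, 1,
                         1, 2,
                         2, 2), ncol = 2, byrow = TRUE)
leaves_test <- matrix(c(1, 2,
                        2, 1), ncol = 2, byrow = TRUE)
leaves_per_tree <- c(2, 2)  # number of leaves in each tree (assumed known)

# Build a 0/1 indicator matrix with one column per leaf across all trees.
leaf_indicator <- function(leaves, leaves_per_tree) {
  # Column offset of each tree's first leaf within the full leaf set
  offsets <- cumsum(c(0, head(leaves_per_tree, -1)))
  W <- matrix(0, nrow = nrow(leaves), ncol = sum(leaves_per_tree))
  for (j in seq_len(ncol(leaves))) {
    W[cbind(seq_len(nrow(leaves)), offsets[j] + leaves[, j])] <- 1
  }
  W
}

W_train <- leaf_indicator(leaves_train, leaves_per_tree)
W_test <- leaf_indicator(leaves_test, leaves_per_tree)

# The three kernels described under Value (shared-leaf counts)
K_train <- W_train %*% t(W_train)      # n_train x n_train
K_test_train <- W_test %*% t(W_train)  # n_test x n_train
K_test <- W_test %*% t(W_test)         # n_test x n_test

# Shared-leaf counts divided by the number of trees give fractions in [0, 1]
K_train_fraction <- K_train / length(leaves_per_tree)
```

In this toy case, training observations 1 and 2 share a leaf in the first tree only, so K_train[1, 2] is 1 and K_train_fraction[1, 2] is 0.5.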