The KFC procedure is a three-step machine learning method for constructing predictions in both classification and regression problems. It is published in the Journal of Statistical Computation and Simulation and is available at the following link: https://dx.doi.org/10.1080/00949655.2021.1891539.
This procedure consists of three steps:
- $K$-step ($K$-means step): the $K$-means algorithm is run with each of $M$ Bregman divergences $\mathcal{B}=\{B_1,\dots,B_M\}$ (see, for example, Banerjee (2005)). By the properties of Bregman divergences, we expect to break the input data down into $M$ different partition structures, each made of $K$ clusters, namely $P_1=(P_{11},\dots,P_{1K}),\dots,P_M=(P_{M1},\dots,P_{MK})$.
- F-step (Fitting step): for each Bregman divergence $B_j$, we fit simple local models (linear models, for example) $M_{j1},\dots,M_{jK}$ on the $K$ clusters obtained in the previous step. The collection of these $K$ local models is called a candidate model, namely $M_j=(M_{j1},\dots,M_{jK})$. At the end of this step, we have $M$ candidate models corresponding to the $M$ choices of Bregman divergence. To predict a new data point $x$ using a candidate model $M_j$, we first assign the point to one of the $K$ clusters using the corresponding Bregman divergence $B_j$; the prediction of $x$ is then given by the local model on that cluster: $M_j(x)=M_{jk^{\star}}(x)$, where $M_{jk^{\star}}$ is the local model built on the cluster $k^{\star}$ containing $x$.
- C-step (Combining step): all the candidate models constructed in the F-step are combined using consensual aggregation methods. The aggregation methods used in this step, available in my AggregationMethods repository, are:
  - GradientCOBRARegressor: a kernel-based consensual regression aggregation method (see Has (2023)).
  - MixCobraRegressor: aggregation using input-output trade-off (see Fischer and Mougeot (2019)).
  - KernelAggClassifier: a kernel-based combined classification rule (see Mojirsheibani (2000)).
  - MixCobraClassifier: aggregation using input-output trade-off (see Fischer and Mougeot (2019)).
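
For reference, the standard definition of a Bregman divergence, which underlies the $K$-step, can be written as follows. The symbols $\phi$ (a strictly convex, differentiable function) and $c_{jk}$ (the centroid of cluster $k$ under divergence $B_j$) are notation introduced here for illustration and do not appear in the text above.

$$B_\phi(x, y) = \phi(x) - \phi(y) - \langle x - y, \nabla\phi(y)\rangle$$

Two common choices are $\phi(x)=\|x\|_2^2$, which gives the squared Euclidean distance, and $\phi(x)=\sum_i x_i\log x_i$ (for positive $x$), which gives the generalized Kullback-Leibler divergence:

$$B_\phi(x, y) = \|x - y\|_2^2 \qquad\text{and}\qquad B_\phi(x, y) = \sum_i \left( x_i \log\frac{x_i}{y_i} - x_i + y_i \right)$$

With this notation, the cluster used to predict a new point $x$ under the candidate model $M_j$ would be

$$k^{\star} = \arg\min_{1\le k\le K} B_j(x, c_{jk}).$$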
The procedure provides the predictions of all the candidate models (step F) as well as those of the combined procedure (step C).
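
The three steps can be illustrated with a minimal base R sketch for regression. This is not the code of this repository: all function names (`assign_cluster`, `bregman_kmeans`, `fit_candidate`, `predict_candidate`, `combine_kernel`) are hypothetical, only two Bregman divergences are used, and the combining step is a simplified Gaussian-kernel weighting in the spirit of consensual aggregation rather than the actual `GradientCOBRARegressor`. For brevity, the sketch also reuses the same data for fitting and combining.

```r
# Two Bregman divergences between a point x and a centroid ctr (numeric vectors).
bregman <- list(
  euclidean = function(x, ctr) sum((x - ctr)^2),
  gkl       = function(x, ctr) sum(x * log(x / ctr) - x + ctr)  # needs positive entries
)

# Assign each row of X to the centroid with the smallest divergence.
assign_cluster <- function(X, centers, div) {
  apply(X, 1, function(x) which.min(apply(centers, 1, function(ctr) div(x, ctr))))
}

# K-step: Lloyd-style K-means under a Bregman divergence. The cluster mean is
# the optimal centroid for any Bregman divergence (see Banerjee (2005)).
bregman_kmeans <- function(X, K, div, n_iter = 20) {
  centers <- X[sample(nrow(X), K), , drop = FALSE]
  for (it in seq_len(n_iter)) {
    labels <- assign_cluster(X, centers, div)
    for (k in seq_len(K)) {
      if (any(labels == k)) centers[k, ] <- colMeans(X[labels == k, , drop = FALSE])
    }
  }
  list(centers = centers, labels = assign_cluster(X, centers, div))
}

# F-step: one linear model per cluster gives one candidate model.
fit_candidate <- function(X, y, K, div) {
  km <- bregman_kmeans(X, K, div)
  models <- lapply(seq_len(K), function(k) {
    idx <- km$labels == k
    if (sum(idx) < 3) idx[] <- TRUE  # sketch-only fallback: tiny cluster, use all data
    lm(y ~ ., data = data.frame(X[idx, , drop = FALSE], y = y[idx]))
  })
  list(centers = km$centers, models = models, div = div)
}

# Prediction of a candidate: route each point to its cluster, use the local model.
predict_candidate <- function(cand, Xnew) {
  labels <- assign_cluster(Xnew, cand$centers, cand$div)
  pred <- numeric(nrow(Xnew))
  for (k in unique(labels)) {
    idx <- labels == k
    pred[idx] <- predict(cand$models[[k]],
                         newdata = data.frame(Xnew[idx, , drop = FALSE]))
  }
  pred
}

# C-step (simplified): weight training responses by how close the candidates'
# predictions at the training points are to their predictions at the query point.
combine_kernel <- function(P_train, y_train, P_new, h = 1) {
  apply(P_new, 1, function(p) {
    d2 <- rowSums(sweep(P_train, 2, p)^2)
    w <- exp(-d2 / (2 * h^2))
    sum(w * y_train) / sum(w)
  })
}

# Toy data with two regimes (positive inputs so that the KL divergence is defined).
set.seed(1)
n <- 200
X <- cbind(x1 = runif(n, 1, 5), x2 = runif(n, 1, 5))
y <- ifelse(X[, "x1"] > 3, 2 * X[, "x1"], -X[, "x2"]) + rnorm(n, sd = 0.1)

candidates <- lapply(bregman, function(div) fit_candidate(X, y, K = 2, div = div))
P <- sapply(candidates, predict_candidate, Xnew = X)  # n x M matrix of step-F predictions
y_hat <- combine_kernel(P, y, P, h = 0.5)             # step-C predictions
```

Here `P` holds one column of predictions per candidate model and `y_hat` holds the combined predictions. In practice the bandwidth `h` would need tuning, and separate data would normally be held out for the combining step.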
To run the code, you can clone the repository directly or simply load the R script source files from this repository using the devtools package in RStudio as follows:
- Install the devtools package using the command: `install.packages("devtools")`
- Load the source code from the GitHub repository using the `source_url` function: `devtools::source_url("https://raw.githubusercontent.com/hassothea/KFC-Procedure/master/file.R")`, where `file.R` is the name of the file in this repository that you want to import into RStudio.
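
The two steps above can be wrapped in a small helper, sketched below. The helper names `kfc_raw_url` and `load_kfc_file` are hypothetical and not part of this repository, and `file.R` remains a placeholder for an actual file name.

```r
# Base URL of the raw files, taken from the instructions above.
kfc_base <- "https://raw.githubusercontent.com/hassothea/KFC-Procedure/master"

# Build the raw URL of a file in the repository (hypothetical helper).
kfc_raw_url <- function(file_name) paste(kfc_base, file_name, sep = "/")

# Install devtools only if it is missing, then source the requested file.
load_kfc_file <- function(file_name) {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    install.packages("devtools")
  }
  devtools::source_url(kfc_raw_url(file_name))
}

# Example (requires internet access; replace "file.R" with a real file name):
# load_kfc_file("file.R")
```

Checking for devtools with `requireNamespace` avoids reinstalling the package on every run.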
The documentation and explanation of the methods are available on my webpage as listed below:
- KFCRegressor: see the KFCRegressor documentation.