drr
Implements Dimensionality Reduction via Regression using
Kernel Ridge Regression.
Arguments
- X
input data, a matrix.
- ndim
the number of output dimensions and regression functions to be estimated; see Details for the consequences for inversion.
- lambda
the penalty term for the Kernel Ridge Regression.
- kernel
a kernel function or string, see
kernel-class
for details.
- kernel.pars
a list with parameters for the kernel. Each parameter can be a vector; cross-validation will choose the best combination.
- pca
logical, whether to preprocess the data using PCA.
- pca.center
logical, center the data before applying PCA.
- pca.scale
logical, scale the data before applying PCA.
- fastcv
logical, if TRUE use fastCV instead of normal cross-validation.
- cv.folds
if using normal cross-validation, the number of folds to be used.
- fastcv.test
an optional separate test data set to be used for
fastCV
, handed over as option test to
fastCV
.
- fastkrr.nblocks
the number of blocks used for fast KRR; higher numbers are faster to compute but may introduce numerical inaccuracies, see
constructFastKRRLearner
for details.
- verbose
logical, should the cross-validation report its progress.
Value
A list with the following items:
"fitted.data" The data in reduced dimensions.
"pca.means" The means used to center the original data.
"pca.scale" The standard deviations used to scale the original data.
"pca.rotation" The rotation matrix of the PCA.
"models" A list of models used to estimate each dimension.
"apply" A function to fit new data to the estimated model.
"inverse" A function to untransform data.
Details
Parameter combinations will be formed and cross-validation used to
select the best combination. Cross-validation uses
CV
or fastCV
.
Pre-treatment of the data is done by a PCA with optional centering and scaling, \(\alpha = Vx\). The representation in reduced dimensions is
$$y_i = \alpha_i - f_i(\alpha_1, \ldots, \alpha_{i-1})$$
then the final DRR representation is:
$$r = (\alpha_1, y_2, y_3, \ldots, y_d)$$
DRR is invertible by
$$\alpha_i = y_i + f_i(\alpha_1, \alpha_2, \ldots, \alpha_{i-1})$$
If fewer dimensions are estimated, there will be fewer inverse functions and calculating the inverse will be inaccurate.
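The mechanics of this construction can be sketched in plain R, using lm as a stand-in for the Kernel Ridge Regressions that drr actually fits; this is an illustration only, and the objects f2 and f3 are hypothetical:
set.seed(1)
x <- matrix(rnorm(300), ncol = 3)                   # toy data, 100 x 3
d <- as.data.frame(prcomp(x)$x)                     # alpha = Vx (centered PCA scores PC1..PC3)
f2 <- lm(PC2 ~ PC1, data = d)                       # stand-in for KRR model f_2
f3 <- lm(PC3 ~ PC1 + PC2, data = d)                 # stand-in for KRR model f_3
r <- cbind(d$PC1,                                   # r = (alpha_1, y_2, y_3)
           d$PC2 - fitted(f2),
           d$PC3 - fitted(f3))
# Inversion recovers the alphas sequentially: alpha_i = y_i + f_i(alpha_1, ..., alpha_{i-1})
a1 <- r[, 1]
a2 <- r[, 2] + predict(f2, data.frame(PC1 = a1))
a3 <- r[, 3] + predict(f3, data.frame(PC1 = a1, PC2 = a2))
all.equal(unname(cbind(a1, a2, a3)), unname(as.matrix(d)))  # TRUE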
References
Laparra, V., Malo, J., Camps-Valls, G., 2015. Dimensionality Reduction via Regression in Hyperspectral Imagery. IEEE Journal of Selected Topics in Signal Processing 9, 1026-1036. doi:10.1109/JSTSP.2015.2417833
Examples
tt <- seq(0, 4 * pi, length.out = 200)
helix <- cbind(
  x = 3 * cos(tt) + rnorm(length(tt), sd = seq(0.1, 1.4, length.out = length(tt))),
  y = 3 * sin(tt) + rnorm(length(tt), sd = seq(0.1, 1.4, length.out = length(tt))),
  z = 2 * tt + rnorm(length(tt), sd = seq(0.1, 1.4, length.out = length(tt)))
)
helix <- helix[sample(nrow(helix)), ] # shuffling data is important!!
system.time(
  drr.fit <- drr(helix, ndim = 3, cv.folds = 4,
                 lambda = 10^(-2:1),
                 kernel.pars = list(sigma = 10^(0:3)),
                 fastkrr.nblocks = 2, verbose = TRUE,
                 fastcv = FALSE)
)
#> 2023-03-21 13:05:46: Constructing Axis 1/3
#> predictors: PC1 PC2 dependent: PC3
#> sigma=1 kernel=rbfdot lambda=0.01 nblocks=2 ( 2.596022 )
#> sigma=10 kernel=rbfdot lambda=0.01 nblocks=2 ( 3.519644 )
#> sigma=100 kernel=rbfdot lambda=0.01 nblocks=2 ( 4.354858 )
#> sigma=1000 kernel=rbfdot lambda=0.01 nblocks=2 ( 4.540877 )
#> sigma=1 kernel=rbfdot lambda=0.1 nblocks=2 ( 2.261119 )
#> sigma=10 kernel=rbfdot lambda=0.1 nblocks=2 ( 3.550488 )
#> sigma=100 kernel=rbfdot lambda=0.1 nblocks=2 ( 4.370996 )
#> sigma=1000 kernel=rbfdot lambda=0.1 nblocks=2 ( 4.541056 )
#> sigma=1 kernel=rbfdot lambda=1 nblocks=2 ( 2.466275 )
#> sigma=10 kernel=rbfdot lambda=1 nblocks=2 ( 3.821673 )
#> sigma=100 kernel=rbfdot lambda=1 nblocks=2 ( 4.440197 )
#> sigma=1000 kernel=rbfdot lambda=1 nblocks=2 ( 4.541966 )
#> sigma=1 kernel=rbfdot lambda=10 nblocks=2 ( 3.756388 )
#> sigma=10 kernel=rbfdot lambda=10 nblocks=2 ( 4.378636 )
#> sigma=100 kernel=rbfdot lambda=10 nblocks=2 ( 4.523233 )
#> sigma=1000 kernel=rbfdot lambda=10 nblocks=2 ( 4.542888 )
#> 2023-03-21 13:05:46: Constructing Axis 2/3
#> predictors: PC1 dependent: PC2
#> sigma=1 kernel=rbfdot lambda=0.01 nblocks=2 ( 2.158888 )
#> sigma=10 kernel=rbfdot lambda=0.01 nblocks=2 ( 2.237031 )
#> sigma=100 kernel=rbfdot lambda=0.01 nblocks=2 ( 3.570143 )
#> sigma=1000 kernel=rbfdot lambda=0.01 nblocks=2 ( 4.128292 )
#> sigma=1 kernel=rbfdot lambda=0.1 nblocks=2 ( 1.599839 )
#> sigma=10 kernel=rbfdot lambda=0.1 nblocks=2 ( 2.180394 )
#> sigma=100 kernel=rbfdot lambda=0.1 nblocks=2 ( 3.470345 )
#> sigma=1000 kernel=rbfdot lambda=0.1 nblocks=2 ( 4.22449 )
#> sigma=1 kernel=rbfdot lambda=1 nblocks=2 ( 1.684258 )
#> sigma=10 kernel=rbfdot lambda=1 nblocks=2 ( 2.68033 )
#> sigma=100 kernel=rbfdot lambda=1 nblocks=2 ( 3.970656 )
#> sigma=1000 kernel=rbfdot lambda=1 nblocks=2 ( 4.54119 )
#> sigma=1 kernel=rbfdot lambda=10 nblocks=2 ( 3.292726 )
#> sigma=10 kernel=rbfdot lambda=10 nblocks=2 ( 4.366873 )
#> sigma=100 kernel=rbfdot lambda=10 nblocks=2 ( 4.880853 )
#> sigma=1000 kernel=rbfdot lambda=10 nblocks=2 ( 5.049019 )
#> 2023-03-21 13:05:47: Constructing Axis 3/3
#> user system elapsed
#> 2.444 0.701 1.660
if (FALSE) {
  library(rgl)

  # Reconstruct the helix from the first one and the first two DRR dimensions.
  plot3d(helix)
  points3d(drr.fit$inverse(drr.fit$fitted.data[, 1, drop = FALSE]), col = 'blue')
  points3d(drr.fit$inverse(drr.fit$fitted.data[, 1:2]), col = 'red')

  # The data in DRR coordinates.
  plot3d(drr.fit$fitted.data)

  # Draw a grid in DRR space, back-transformed into the original space.
  pad <- -3
  fd <- drr.fit$fitted.data
  xx <- seq(min(fd[, 1]), max(fd[, 1]), length.out = 25)
  yy <- seq(min(fd[, 2]) - pad, max(fd[, 2]) + pad, length.out = 5)
  zz <- seq(min(fd[, 3]) - pad, max(fd[, 3]) + pad, length.out = 5)
  dd <- as.matrix(expand.grid(xx, yy, zz))

  plot3d(helix)
  for (y in yy) for (x in xx)
    rgl.linestrips(drr.fit$inverse(cbind(x, y, zz)), col = 'blue')
  for (y in yy) for (z in zz)
    rgl.linestrips(drr.fit$inverse(cbind(xx, y, z)), col = 'blue')
  for (x in xx) for (z in zz)
    rgl.linestrips(drr.fit$inverse(cbind(x, yy, z)), col = 'blue')
}
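A rough numeric check of the inversion-accuracy point made in Details, reusing drr.fit and helix from above (and assuming, as in the rgl code, that inverse accepts matrices with fewer columns):
fd  <- drr.fit$fitted.data
mse <- function(a, b) mean((a - b)^2)
mse(helix, drr.fit$inverse(fd))                      # all three dimensions: near zero
mse(helix, drr.fit$inverse(fd[, 1:2]))               # two dimensions: small error
mse(helix, drr.fit$inverse(fd[, 1, drop = FALSE]))   # one dimension: largest error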