Fits a PLSR model with the wide kernel algorithm.
Usage
widekernelpls.fit(
X,
Y,
ncomp,
center = TRUE,
stripped = FALSE,
tol = .Machine$double.eps^0.5,
maxit = 100,
...
)
Arguments
- X
a matrix of observations.
NAs and Infs are not allowed.
- Y
a vector or matrix of responses.
NAs and Infs are not allowed.
- ncomp
the number of components to be used in the modelling.
- center
logical, determines if the \(X\) and \(Y\) matrices are mean centered or not. Default is to perform mean centering.
- stripped
logical. If TRUE, the calculations are stripped as much as possible for speed; this is meant for use with cross-validation or simulations when only the coefficients are needed. Defaults to FALSE.
- tol
numeric. The tolerance used for determining convergence in the algorithm.
- maxit
positive integer. The maximum number of iterations used in the internal eigenvector calculation.
- ...
other arguments. Currently ignored.
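To illustrate what tol and maxit control, here is a minimal base R sketch of a power iteration on the kernel matrix XX'YY', in the spirit of Rännar et al. (1994). The helper power_iter, its convergence test, and the simulated data are assumptions made for illustration; this is not the package's internal code.

```r
# Hypothetical helper (not part of pls): power iteration for the dominant
# eigenvector of a square matrix A, stopping when successive iterates differ
# by less than tol or after maxit iterations.
power_iter <- function(A, start, tol = .Machine$double.eps^0.5, maxit = 100) {
  v_old <- start / sqrt(sum(start^2))
  for (nit in seq_len(maxit)) {
    v <- drop(A %*% v_old)
    v <- v / sqrt(sum(v^2))            # renormalise to unit length
    if (sum(abs(v - v_old)) < tol) {   # simplified convergence test
      return(list(vector = v, iterations = nit, converged = TRUE))
    }
    v_old <- v
  }
  warning("No convergence in ", maxit, " iterations")
  list(vector = v, iterations = maxit, converged = FALSE)
}

set.seed(42)
# Simulated wide data: 10 observations, 200 predictors, 2 responses (centered)
X <- scale(matrix(rnorm(10 * 200), nrow = 10), scale = FALSE)
Y <- scale(matrix(rnorm(10 * 2), nrow = 10), scale = FALSE)

# n x n kernel; its size depends on the number of observations only
K <- tcrossprod(X) %*% tcrossprod(Y)

res <- power_iter(K, start = Y[, 1], maxit = 500)
res$iterations
res$converged
```

A smaller tol or a kernel whose leading eigenvalues are close together needs more iterations, which is when maxit comes into play.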
Value
A list containing the following components is returned:
- coefficients
an array of regression coefficients for 1, ..., ncomp components. The dimensions of coefficients are c(nvar, npred, ncomp) with nvar the number of X variables and npred the number of variables to be predicted in Y.
- scores
a matrix of scores.
- loadings
a matrix of loadings.
- loading.weights
a matrix of loading weights.
- Yscores
a matrix of Y-scores.
- Yloadings
a matrix of Y-loadings.
- projection
the projection matrix used to convert X to scores.
- Xmeans
a vector of means of the X variables.
- Ymeans
a vector of means of the Y variables.
- fitted.values
an array of fitted values. The dimensions of fitted.values are c(nobj, npred, ncomp) with nobj the number of samples and npred the number of Y variables.
- residuals
an array of regression residuals. It has the same dimensions as fitted.values.
- Xvar
a vector with the amount of X-variance explained by each component.
- Xtotvar
the total variance in X.
If stripped is TRUE, only the components coefficients,
Xmeans and Ymeans are returned.
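The shapes of the returned components can be checked on a small fit. The following is a sketch on simulated data, assuming the pls package is installed; the variable names and sizes are arbitrary choices for illustration.

```r
library(pls)

set.seed(1)
nobj <- 12; nvar <- 500; npred <- 2; ncomp <- 3
X <- matrix(rnorm(nobj * nvar), nrow = nobj)    # simulated predictors
Y <- matrix(rnorm(nobj * npred), nrow = nobj)   # simulated responses

fit <- plsr(Y ~ X, ncomp = ncomp, method = "widekernelpls")

dim(fit$coefficients)    # nvar x npred x ncomp
dim(fit$fitted.values)   # nobj x npred x ncomp
dim(fit$residuals)       # same dimensions as fitted.values
length(fit$Xmeans)       # one mean per X variable
length(fit$Ymeans)       # one mean per response

# Fitted values plus residuals should reproduce the original responses
all.equal(fit$fitted.values[, , ncomp] + fit$residuals[, , ncomp], Y,
          check.attributes = FALSE)
```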
Details
This function should not be called directly, but through the generic
functions plsr or mvr with the argument
method="widekernelpls". The wide kernel PLS algorithm is efficient
when the number of variables is (much) larger than the number of
observations. For very wide X, for instance 12x18000, it can be
twice as fast as kernelpls.fit and simpls.fit.
For other matrices, however, it can be much slower. The results are equal
to the results of the NIPALS algorithm.
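As a sketch of the recommended calling pattern, the algorithm can be selected through the method argument of plsr and timed against kernelpls on a very wide matrix. The data are simulated, the pls package is assumed to be installed, and the timings will vary with hardware and the linked BLAS, so no particular speed-up should be expected.

```r
library(pls)

set.seed(2)
X <- matrix(rnorm(12 * 18000), nrow = 12)   # very wide: 12 x 18000
y <- rnorm(12)

# Select the algorithm through the method argument of plsr (or mvr)
t_wide <- system.time(
  fit_wide <- plsr(y ~ X, ncomp = 5, method = "widekernelpls")
)
t_kernel <- system.time(
  fit_kernel <- plsr(y ~ X, ncomp = 5, method = "kernelpls")
)

fit_wide$method   # records which fit algorithm was used

# Elapsed time in seconds for each algorithm
c(widekernelpls = t_wide[["elapsed"]], kernelpls = t_kernel[["elapsed"]])
```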
Note
The current implementation has not undergone extensive testing yet,
and should perhaps be regarded as experimental. Specifically, the internal
eigenvector calculation does not always converge in extreme cases where the
eigenvalue is close to zero. However, when it does converge, it always
converges to the same results as kernelpls.fit, up to
numerical inaccuracies.
The algorithm also has a bit of overhead, so when the number of observations
is moderately high, kernelpls.fit can be faster even if the
number of predictors is much higher. The relative speed of the algorithms
can also depend greatly on which BLAS and/or LAPACK library R is linked
against.
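One way to check the agreement with kernelpls on a given data set is to fit both methods and compare the coefficient arrays. This is a sketch on simulated single-response data, assuming the pls package is installed; the size of the difference on real data depends on how well the internal iteration converged.

```r
library(pls)

set.seed(3)
X <- matrix(rnorm(15 * 300), nrow = 15)   # simulated predictors
y <- rnorm(15)                            # simulated single response

fit_wide   <- plsr(y ~ X, ncomp = 4, method = "widekernelpls")
fit_kernel <- plsr(y ~ X, ncomp = 4, method = "kernelpls")

# Largest absolute difference between the two coefficient arrays
max_diff <- max(abs(fit_wide$coefficients - fit_kernel$coefficients))
max_diff
```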
References
Rännar, S., Lindgren, F., Geladi, P. and Wold, S. (1994) A PLS Kernel Algorithm for Data Sets with Many Variables and Fewer Objects. Part 1: Theory and Algorithm. Journal of Chemometrics, 8, 111–125.