Formula-based interface to the ROSA algorithm, following the style of the pls
package.
Usage
rosa(
formula,
ncomp,
Y.add,
common.comp = 1,
data,
subset,
na.action,
scale = FALSE,
weights = NULL,
validation = c("none", "CV", "LOO"),
internal.validation = FALSE,
fixed.block = NULL,
design.block = NULL,
canonical = TRUE,
...
)
Arguments
- formula
Model formula accepting a single response (block) and predictor block names separated by + signs.
- ncomp
The maximum number of ROSA components.
- Y.add
Optional response(s) available in the data set.
- common.comp
Automatically create all combinations of common components up to length common.comp (default = 1).
- data
The data set to analyse.
- subset
Expression for subsetting the data before modelling.
- na.action
How to handle NAs (no action implemented).
- scale
Optionally scale predictor variables by their individual standard deviations.
- weights
Optional object weights.
- validation
Optional cross-validation strategy, "CV" or "LOO" (default = "none").
- internal.validation
Optional cross-validation for block selection process, "LOO", "CV3", "CV5", "CV10" (CV-number of segments), or vector of integers (default = FALSE).
- fixed.block
Integer vector with block numbers for each component (0 = not fixed), or a list of length <= ncomp (element length 0 = not fixed).
- design.block
Integer vector containing block numbers of design blocks.
- canonical
Logical indicating if canonical correlation should be used when calculating loading weights (default), enabling B/W maximization, common components, etc. Alternatively (FALSE), a PLS2 strategy, e.g. for spectra response, is used.
- ...
Additional arguments for cvseg or rosa.fit.
Value
An object of classes rosa and mvr having several associated printing (rosa_results) and plotting methods (rosa_plots).
Details
ROSA is an opportunistic method that sequentially selects components from whichever block explains the response most effectively. It can be formulated as a PLS model on the concatenated input blocks with block selection per component. This implementation adds several options that are not described in the literature. Most importantly, it allows internal validation in the block selection process, making the selection more robust. In addition, it handles design blocks explicitly, enables classification and secondary responses (CPLS), and supports the definition of common components.
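The per-component block selection can be illustrated with a small base R sketch on simulated data. This is not the package's implementation: the function rosa_sketch and the blocks X1 and X2 are hypothetical, "explains the response most effectively" is assumed here to mean the smallest residual sum of squares, and the orthogonalisation of loading weights, the regression coefficients, and all of the options listed above are omitted.

```r
# Illustrative sketch of ROSA-style block selection (not the package code).
set.seed(1)
n  <- 40
# Two hypothetical, column-centred predictor blocks; only X2 is related to y.
X1 <- scale(matrix(rnorm(n * 5), n, 5), scale = FALSE)
X2 <- scale(matrix(rnorm(n * 8), n, 8), scale = FALSE)
y  <- X2[, 1] - 0.5 * X2[, 2] + rnorm(n, sd = 0.1)
y  <- y - mean(y)

rosa_sketch <- function(blocks, y, ncomp) {
  scores  <- matrix(0, length(y), ncomp)  # orthonormal winning scores
  winners <- integer(ncomp)               # winning block per component
  res     <- y                            # current response residual
  for (a in seq_len(ncomp)) {
    best_rss <- Inf
    for (b in seq_along(blocks)) {
      # PLS-type candidate: loading weights from the residual, then scores.
      w  <- crossprod(blocks[[b]], res)
      tb <- drop(blocks[[b]] %*% w)
      if (a > 1) {
        # Orthogonalise the candidate scores on the earlier winning scores.
        prev <- scores[, seq_len(a - 1), drop = FALSE]
        tb   <- tb - drop(prev %*% crossprod(prev, tb))
      }
      tb   <- tb / sqrt(sum(tb^2))
      cand <- res - tb * sum(tb * res)    # residual after this candidate
      rss  <- sum(cand^2)
      if (rss < best_rss) {               # assumed criterion: smallest RSS
        best_rss <- rss
        best_t   <- tb
        best_res <- cand
        best_b   <- b
      }
    }
    scores[, a] <- best_t
    winners[a]  <- best_b
    res         <- best_res
  }
  list(block.order = winners, scores = scores, residual = res)
}

fit <- rosa_sketch(list(X1, X2), y, ncomp = 3)
fit$block.order   # index of the block chosen for each component
```

Because every candidate is orthogonalised on the scores already chosen, the winning scores form an orthonormal basis, which is what allows the result to be read as a single PLS-type model on the concatenated blocks.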
References
Liland, K.H., Næs, T., and Indahl, U.G. (2016). ROSA - a fast extension of partial least squares regression for multiblock data analysis. Journal of Chemometrics, 30, 651–662, doi:10.1002/cem.2824.
See also
Overviews of available methods, multiblock, and methods organised by main structure: basic, unsupervised, asca, supervised and complex.
Common functions for computation and extraction of results and plotting are found in rosa_results and rosa_plots, respectively.
Examples
data(potato)
mod <- rosa(Sensory[,1] ~ ., data = potato, ncomp = 10, validation = "CV", segments = 5)
summary(mod)
#> Data: X dimension: 26 3946
#> Y dimension: 26 1
#> Fit method:
#> Number of components considered: 10
#>
#> VALIDATION: RMSEP
#> Cross-validated using 5 random segments.
#> (Intercept) 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps
#> CV 1.778 1.0003 0.869 0.5666 0.5506 0.5227 0.5370
#> adjCV 1.778 0.9597 0.822 0.5459 0.5224 0.4912 0.5014
#> 7 comps 8 comps 9 comps 10 comps
#> CV 0.5607 0.6254 0.6578 0.7528
#> adjCV 0.5186 0.5747 0.6001 0.6875
#>
#> TRAINING: % variance explained
#> 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps 7 comps
#> X 27.82 40.72 47.95 49.48 53.96 55.28 57.79
#> Sensory[, 1] 76.54 86.77 94.07 96.47 97.03 97.49 97.83
#> 8 comps 9 comps 10 comps
#> X 65.49 67.14 68.16
#> Sensory[, 1] 98.03 98.28 98.51
# For examples of ROSA results and plotting see
# ?rosa_results and ?rosa_plots.
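The internal.validation argument can be tried on the same data. The following is a sketch only, assuming that rosa and the potato data come from an installed multiblock package; the choice of ncomp = 5 and the object name mod_iv are arbitrary, and the call is guarded so that nothing runs if the package is missing.

```r
# Sketch: 5-segment internal cross-validation in the block selection ("CV5").
# Assumes the multiblock package is installed; ncomp = 5 is an arbitrary choice.
if (requireNamespace("multiblock", quietly = TRUE)) {
  library(multiblock)
  data(potato)
  mod_iv <- rosa(Sensory[,1] ~ ., data = potato, ncomp = 5,
                 internal.validation = "CV5")
  print(class(mod_iv))  # documented above to include "rosa" and "mvr"
}
```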