Formula-based interface to the ROSA algorithm, following the style of the pls package.

Usage

rosa(
  formula,
  ncomp,
  Y.add,
  common.comp = 1,
  data,
  subset,
  na.action,
  scale = FALSE,
  weights = NULL,
  validation = c("none", "CV", "LOO"),
  internal.validation = FALSE,
  fixed.block = NULL,
  design.block = NULL,
  canonical = TRUE,
  ...
)

Arguments

formula

Model formula accepting a single response (block) and predictor block names separated by + signs.

ncomp

The maximum number of ROSA components.

Y.add

Optional additional response(s) available in the data set (secondary responses, used with CPLS).

common.comp

Automatically create all combinations of common components up to length common.comp (default = 1).

data

The data set to analyse.

subset

Expression for subsetting the data before modelling.

na.action

How to handle NAs (no action implemented).

scale

Optionally scale predictor variables by their individual standard deviations.

weights

Optional observation weights.

validation

Optional cross-validation strategy "CV" or "LOO".

internal.validation

Optional cross-validation for block selection process, "LOO", "CV3", "CV5", "CV10" (CV-number of segments), or vector of integers (default = FALSE).

fixed.block

Integer vector of block numbers for each component (0 = not fixed), or a list of length <= ncomp (an element of length 0 = not fixed).

design.block

Integer vector containing the block numbers of design blocks.

canonical

Logical indicating if canonical correlation should be used when calculating loading weights (default), enabling between/within (B/W) maximization, common components, etc. If FALSE, a PLS2 strategy is used instead, e.g. for spectral responses.

...

Additional arguments for cvseg or rosa.fit.
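
A minimal sketch of how the block-related arguments combine, assuming the Chemical and Compression blocks of the bundled potato data: fixed.block forces the first component to be taken from block 1 and leaves the rest free, while common.comp controls the automatically generated common-component combinations.

``` r
library(multiblock)
data(potato)

# Two named predictor blocks; component 1 fixed to block 1 (Chemical),
# components 2-3 chosen freely by the ROSA block selection (0 = not fixed).
mod.fixed <- rosa(Sensory[,1] ~ Chemical + Compression, data = potato,
                  ncomp = 3, fixed.block = c(1, 0, 0),
                  common.comp = 1)
```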

Value

An object of classes rosa and mvr with associated printing methods (rosa_results) and plotting methods (rosa_plots).

Details

ROSA is an opportunistic method that sequentially selects components from whichever block explains the response most effectively. It can be formulated as a PLS model on the concatenated input blocks with a block selection per component. This implementation adds several options that are not described in the literature. Most importantly, it allows internal validation in the block selection process, making the selection more robust. In addition, it handles design blocks explicitly and enables classification, secondary responses (CPLS), and the definition of common components.
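
The internal validation described above can be sketched as follows (a hedged example on the bundled potato data): with internal.validation set, the block chosen for each component is based on cross-validated rather than fitted error.

``` r
library(multiblock)
data(potato)

# Leave-one-out internal validation in the block selection process;
# "CV3"/"CV5"/"CV10" or a vector of segment indices are alternatives.
mod.iv <- rosa(Sensory[,1] ~ Chemical + Compression, data = potato,
               ncomp = 3, internal.validation = "LOO")
```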

References

Liland, K.H., Næs, T., and Indahl, U.G. (2016). ROSA - a fast extension of partial least squares regression for multiblock data analysis. Journal of Chemometrics, 30, 651–662, doi:10.1002/cem.2824.

See also

Overviews of available methods, multiblock, and methods organised by main structure: basic, unsupervised, asca, supervised and complex. Common functions for computation and extraction of results and plotting are found in rosa_results and rosa_plots, respectively.

Examples

data(potato)
mod <- rosa(Sensory[,1] ~ ., data = potato, ncomp = 10, validation = "CV", segments = 5)
summary(mod)
#> Data: 	X dimension: 26 3946 
#> 	Y dimension: 26 1
#> Fit method:
#> Number of components considered: 10
#> 
#> VALIDATION: RMSEP
#> Cross-validated using 5 random segments.
#>        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
#> CV           1.778   1.0003    0.869   0.5666   0.5506   0.5227   0.5370
#> adjCV        1.778   0.9597    0.822   0.5459   0.5224   0.4912   0.5014
#>        7 comps  8 comps  9 comps  10 comps
#> CV      0.5607   0.6254   0.6578    0.7528
#> adjCV   0.5186   0.5747   0.6001    0.6875
#> 
#> TRAINING: % variance explained
#>               1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps
#> X               27.82    40.72    47.95    49.48    53.96    55.28    57.79
#> Sensory[, 1]    76.54    86.77    94.07    96.47    97.03    97.49    97.83
#>               8 comps  9 comps  10 comps
#> X               65.49    67.14     68.16
#> Sensory[, 1]    98.03    98.28     98.51

# For examples of ROSA results and plotting see 
# ?rosa_results and ?rosa_plots.
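
Continuing from the fitted model above, a hedged sketch of result extraction: since the object inherits from mvr, the pls-style generics are assumed to apply.

``` r
# Scores and loadings via the pls-style generics for mvr objects.
sc <- scores(mod)
lo <- loadings(mod)

# Predict (here simply on the training data) with a chosen number
# of components.
pred <- predict(mod, newdata = potato, ncomp = 5)
```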