Weighted classical multidimensional scaling, also known as weighted principal coordinates analysis.

wcmdscale(d, k, eig = FALSE, add = FALSE, x.ret = FALSE, w)
# S3 method for wcmdscale
plot(x, choices = c(1, 2), type = "t", ...)
# S3 method for wcmdscale
scores(x, choices = NA, ...)

Arguments

d

a distance structure such as that returned by dist or a full symmetric matrix containing the dissimilarities.

k

the dimension of the space which the data are to be represented in; must be in \(\{1,2,\ldots,n-1\}\). If missing, all dimensions with above zero eigenvalue.

eig

indicates whether eigenvalues should be returned.

add

an additive constant \(c\) is added to the non-diagonal dissimilarities such that all \(n-1\) eigenvalues are non-negative. Alternatives are "lingoes" (default, also used with TRUE) and "cailliez" (which is the only alternative in cmdscale). See Legendre & Anderson (1999).

x.ret

indicates whether the doubly centred symmetric distance matrix should be returned.

w

Weights of points.

x

The wcmdscale result object when the function was called with options eig = TRUE or x.ret = TRUE (See Details).

choices

Axes to be returned; NA returns all real axes.

type

Type of graph which may be "t"ext, "p"oints or "n"one.

...

Other arguments passed to graphical functions.

Details

Function wcmdscale is based on function cmdscale (package stats of base R), but it uses point weights. Points with high weights will have a stronger influence on the result than those with low weights. Setting equal weights w = 1 will give ordinary multidimensional scaling.

With default options, the function returns only a matrix of scores scaled by eigenvalues for all real axes. If the function is called with eig = TRUE or x.ret = TRUE, the function returns an object of class "wcmdscale" with print, plot, scores, eigenvals and stressplot methods, and described in section Value.

The method is Euclidean, and with non-Euclidean dissimilarities some eigenvalues can be negative. If this disturbs you, this can be avoided by adding a constant to non-diagonal dissimilarities making all eigenvalues non-negative. The function implements methods discussed by Legendre & Anderson (1999): The method of Lingoes (add="lingoes") adds the constant \(c\) to squared dissimilarities \(d\) using \(\sqrt{d^2 + 2 c}\) and the method of Cailliez (add="cailliez") to dissimilarities using \(d + c\). Legendre & Anderson (1999) recommend the method of Lingoes, and base R function cmdscale implements the method of Cailliez.

Value

If eig = FALSE and x.ret = FALSE (default), a matrix with k columns whose rows give the coordinates of points corresponding to positive eigenvalues. Otherwise, an object of class wcmdscale containing the components that are mostly similar as in cmdscale:

points

a matrix with k columns whose rows give the coordinates of the points chosen to represent the dissimilarities.

eig

the \(n-1\) eigenvalues computed during the scaling process if eig is true.

x

the doubly centred and weighted distance matrix if x.ret is true.

ac, add

additive constant and adjustment method used to avoid negative eigenvalues. These are NA and FALSE if no adjustment was done.

GOF

Goodness of fit statistics for k axes. The first value is based on the sum of absolute values of all eigenvalues, and the second value is based on the sum of positive eigenvalues

weights

Weights.

negaxes

A matrix of scores for axes with negative eigenvalues scaled by the absolute eigenvalues similarly as points. This is NULL if there are no negative eigenvalues or k was specified, and would not include negative eigenvalues.

call

Function call.

References

Gower, J. C. (1966) Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53, 325--328.

Legendre, P. & Anderson, M. J. (1999). Distance-based redundancy analysis: testing multispecies responses in multifactorial ecological experiments. Ecology 69, 1--24.

Mardia, K. V., Kent, J. T. and Bibby, J. M. (1979). Chapter 14 of Multivariate Analysis, London: Academic Press.

See also

The function is modelled after cmdscale, but adds weights (hence name) and handles negative eigenvalues differently. eigenvals.wcmdscale and stressplot.wcmdscale are some specific methods. Other multidimensional scaling methods are monoMDS, and isoMDS and sammon in package MASS.

Examples

## Correspondence analysis as a weighted principal coordinates ## analysis of Euclidean distances of Chi-square transformed data data(dune) rs <- rowSums(dune)/sum(dune) d <- dist(decostand(dune, "chi")) ord <- wcmdscale(d, w = rs, eig = TRUE) ## Ordinary CA ca <- cca(dune) ## Eigevalues are numerically similar ca$CA$eig - ord$eig
#> CA1 CA2 CA3 CA4 CA5 #> 2.220446e-16 -1.387779e-15 -1.054712e-15 2.498002e-16 1.110223e-16 #> CA6 CA7 CA8 CA9 CA10 #> 1.665335e-16 -1.387779e-17 -1.110223e-16 2.220446e-16 4.163336e-17 #> CA11 CA12 CA13 CA14 CA15 #> 1.179612e-16 -5.551115e-17 -1.387779e-17 3.469447e-18 -6.938894e-18 #> CA16 CA17 CA18 CA19 #> -5.204170e-18 1.734723e-18 -4.683753e-17 3.989864e-17
## Configurations are similar when site scores are scaled by ## eigenvalues in CA procrustes(ord, ca, choices=1:19, scaling = "sites")
#> #> Call: #> procrustes(X = ord, Y = ca, choices = 1:19, scaling = "sites") #> #> Procrustes sum of squares: #> -5.684e-14 #>
plot(procrustes(ord, ca, choices=1:2, scaling="sites"))
## Reconstruction of non-Euclidean distances with negative eigenvalues d <- vegdist(dune) ord <- wcmdscale(d, eig = TRUE) ## Only positive eigenvalues: cor(d, dist(ord$points))
#> [1] 0.9975185
## Correction with negative eigenvalues: cor(d, sqrt(dist(ord$points)^2 - dist(ord$negaxes)^2))
#> [1] 1