Multivariate data analysis and graphical display on Macintosh.
(Thioulouse 1989, 1990, 1992)
1-MacMul:
---------
MacMul performs three basic multivariate analyses: principal
component analysis (PCA, for quantitative variables), correspondence
analysis (CA, for count tables), and multiple correspondence analysis
(MCA, for qualitative variables). Usage of the three methods has been
completely standardized, both for input files and program outputs.
MacMul includes a complete, original and unified set of numerical aids
to interpretation: inertia analysis (absolute and relative contributions)
for rows and columns (Lebart et al. 1984), additional elements (rows and
columns), data reconstitution, percentage tables, among others.
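The absolute and relative contributions mentioned above can be illustrated with a short sketch. It is not MacMul's code: the function name, the toy scores and the uniform weights are assumptions, and the relative contributions assume that the scores on all axes are supplied, so that their squared sum is the squared distance of the row to the origin.

```python
def inertia_contributions(scores, weights):
    """Return (eigenvalues, absolute, relative) for row scores.

    scores  -- one list of factor coordinates per row (hypothetical input)
    weights -- row weights, assumed to sum to 1
    """
    n_axes = len(scores[0])
    # Inertia carried by each axis: weighted sum of squared coordinates.
    eigenvalues = [
        sum(w * row[k] ** 2 for w, row in zip(weights, scores))
        for k in range(n_axes)
    ]
    # Absolute contribution: share of an axis's inertia due to one row.
    absolute = [
        [
            w * row[k] ** 2 / eigenvalues[k] if eigenvalues[k] > 0 else 0.0
            for k in range(n_axes)
        ]
        for w, row in zip(weights, scores)
    ]
    # Relative contribution: share of a row's squared distance to the
    # origin that is accounted for by one axis.
    relative = []
    for row in scores:
        d2 = sum(x ** 2 for x in row)
        relative.append([x ** 2 / d2 if d2 > 0 else 0.0 for x in row])
    return eigenvalues, absolute, relative


# Toy example: three rows, two axes, uniform weights.
scores = [[2.0, 0.0], [-1.0, 1.0], [-1.0, -1.0]]
weights = [1 / 3, 1 / 3, 1 / 3]
eigenvalues, absolute, relative = inertia_contributions(scores, weights)
```

Absolute contributions sum to 1 down each axis, and relative contributions sum to 1 across the axes of each row, which is a convenient check when reading such tables.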
Computations are based on the duality diagram (Escoufier 1987, 1990)
which allows a complete unification of the theoretical background of
multivariate methods, and considerably improves both computing
efficiency and ease of programming. The PCA part offers several
options: centered, standardized, non-centered, or general PCA (i.e. with
any diagonal metric). Data tables may have fewer rows than columns, and
computations are always performed in the lowest-dimensional space. The
maximum data table size is given by the following inequalities:
R + 4*C + M*M < 50,000 (800Kb version),
or by:
R + 4*C + M*M < 500,000 (3Mb version),
R being the number of rows in the table, C the number of columns, and
M the minimum of (R,C). Correspondence analysis of a 200-row by
105-column table takes less than 3 minutes on a Macintosh SE/30. For
a 300x300 data matrix, the computation time increases to 55 minutes
on a Macintosh SE/30.
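The size limits above are easy to check before preparing a data file. The following sketch simply evaluates the quantity R + 4*C + M*M; the function and constant names are illustrative and are not part of MacMul.

```python
LIMIT_800KB = 50_000   # bound quoted above for the 800Kb version
LIMIT_3MB = 500_000    # bound quoted above for the 3Mb version


def table_size_index(rows, cols):
    """Evaluate R + 4*C + M*M, with M = min(R, C)."""
    m = min(rows, cols)
    return rows + 4 * cols + m * m


def fits(rows, cols, limit):
    """True if a rows x cols table satisfies the strict inequality."""
    return table_size_index(rows, cols) < limit


print(table_size_index(200, 105))    # 11645
print(fits(200, 105, LIMIT_800KB))   # True
print(table_size_index(300, 300))    # 91500
print(fits(300, 300, LIMIT_800KB))   # False
print(fits(300, 300, LIMIT_3MB))     # True
```

Under these inequalities the 200x105 table fits the 800Kb version, while the 300x300 table requires the 3Mb version.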
For CA, we have added a "Double discrimination" option, which
automatically computes the factor scores of each cell of the table
and the conditional means, variances and covariances.
New features in MacMul (v. 3.12) include:
- matrix computation commands (addition, subtraction, multiplication,
generalized inverse, and extraction of eigenvalues and eigenvectors).
- projection of the subspace spanned by one group of variables onto
the subspace spanned by another group. The projection may be performed
orthogonally given any diagonal metric.
- centering and standardization (given any diagonal metric).
- analysis of spatial neighbouring relationships.
- analysis of the smoothing operator corresponding to a spatial
neighbouring relationship.
- coupling with one discrete variable (i.e. discriminant and between/
within analyses: PCA, CA and MCA).
- three-way table analysis (the French STATIS method).
- random number generation.
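As an illustration of the projection feature listed above, here is a minimal sketch of an orthogonal projection under a diagonal metric. It is not MacMul's implementation: the function names and the toy data are assumptions, and the projection is obtained by Gram-Schmidt orthonormalization under the inner product <u,v> = sum of d_i * u_i * v_i, which is only one of several ways to compute it.

```python
def dot(u, v, metric):
    """Inner product defined by a diagonal metric (list of positive weights)."""
    return sum(d * a * b for d, a, b in zip(metric, u, v))


def orthonormal_basis(columns, metric, tol=1e-12):
    """Gram-Schmidt under the diagonal metric; dependent columns are skipped."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            c = dot(v, q, metric)
            v = [a - c * b for a, b in zip(v, q)]
        norm2 = dot(v, v, metric)
        if norm2 > tol:
            norm = norm2 ** 0.5
            basis.append([a / norm for a in v])
    return basis


def project(y, columns, metric):
    """Project the variable y onto the subspace spanned by `columns`."""
    out = [0.0] * len(y)
    for q in orthonormal_basis(columns, metric):
        c = dot(y, q, metric)
        out = [o + c * b for o, b in zip(out, q)]
    return out


def project_group(group, columns, metric):
    """Project every variable of one group onto the span of another group."""
    return [project(y, columns, metric) for y in group]


# Toy data: three rows with non-uniform weights, variables stored as columns.
metric = [0.5, 0.25, 0.25]
X = [[1.0, 1.0, 1.0], [1.0, 0.0, -1.0]]
y = [3.0, 1.0, 2.0]
y_hat = project(y, X, metric)
residual = [a - b for a, b in zip(y, y_hat)]
```

The residual is orthogonal, in the sense of the chosen metric, to every variable of the second group, and projecting a variable that already lies in the subspace leaves it unchanged.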
2-GraphMu:
----------
GraphMu is designed to draw the graphical outputs of data analysis methods
(principal axes planes), as well as several types of graphics useful
for the analysis of multivariate data (scattergrams, line charts, bar
charts, histograms, stepped curves, maps with circles and squares,
ellipses, Gaussian curves). It is possible to superimpose graphics over
digitized background maps. The main feature of GraphMu is the possibility
to draw automatically *collections* of graphics. Each elementary graphic
may correspond to one column of the data table (comparison of variables)
and/or to one group of rows (comparison of sets of individuals). Drawings
may be saved in files of type "PICT", and are compatible with commercial
drawing software of the Macintosh (e.g. MacDraw). Copy/paste operations
on pictures are supported, making superimpositions easy.
New features in GraphMu (v. 4.11) include:
- Gaussian kernel density estimation
- convex hulls of a cloud of points (may be used for example as
superimpositions over factor maps of multivariate analyses, to represent the
clusters obtained with MacDendro)
- drawing of dendrograms: the dendrograms are computed with MacDendro
and then displayed with GraphMu. Dendrograms can be drawn vertically
or horizontally with labels or numbers. Drawings are saved into PICT files.
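To make the convex hull feature concrete, the sketch below computes the hull of a cloud of points in the plane with Andrew's monotone chain algorithm. The algorithm actually used by GraphMu is not stated here, so this choice, the function names and the sample points are assumptions made for illustration.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order.

    points -- iterable of (x, y) tuples, e.g. factor scores on two axes.
    Points lying on an edge of the hull are not kept as vertices.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it makes a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


cloud = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
hull = convex_hull(cloud)
print(hull)  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

Computing one hull per cluster and drawing the resulting polygons over a factor map gives the kind of superimposition described above.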
3-MacDendro:
------------
MacDendro is a cluster analysis program. The main features are:
- computation of distances: Euclidean, Chi-2 and Jaccard's index
- clustering algorithms: four agglomerative algorithms (single link,
average link (UPGMA), complete link and second order moment), one
divisive algorithm (based on the second order moment criterion), and
one partitioning method (with the possibility to generate random initial
partitions).
- "inertia analysis" of hierarchies and partitions to help understand
the contribution of variables to the formation of the nodes of the
hierarchy, or to the formation of the clusters.
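The agglomerative strategies listed above differ only in how the distance between two clusters is defined. The following naive sketch, written for clarity rather than speed, covers single, average and complete link on Euclidean distances. It is not taken from MacDendro or from Roux (1985); the function name, the merge record format and the sample points are assumptions, and the second order moment criterion is not covered.

```python
from math import dist


def agglomerate(points, linkage="single"):
    """Return the list of merges as (cluster_a, cluster_b, distance).

    Leaves are numbered 0..n-1; each merge creates a new cluster id,
    starting at n. `linkage` is "single", "average" or "complete".
    """
    if linkage not in ("single", "average", "complete"):
        raise ValueError("unknown linkage: " + linkage)
    clusters = {i: [i] for i in range(len(points))}
    merges = []
    next_id = len(points)

    def cluster_distance(a, b):
        ds = [dist(points[i], points[j])
              for i in clusters[a] for j in clusters[b]]
        if linkage == "single":
            return min(ds)          # closest pair of members
        if linkage == "complete":
            return max(ds)          # farthest pair of members
        return sum(ds) / len(ds)    # average over all pairs (UPGMA)

    while len(clusters) > 1:
        ids = sorted(clusters)
        # Find the closest pair of clusters under the chosen linkage.
        d, a, b = min(
            (cluster_distance(a, b), a, b)
            for i, a in enumerate(ids)
            for b in ids[i + 1:]
        )
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d))
        next_id += 1
    return merges


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5)]
single = agglomerate(points, "single")
complete = agglomerate(points, "complete")
```

With these four points both strategies first join points 0 and 1, then points 2 and 3; only the height of the final node differs, which is what a dendrogram makes visible.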
GraphMu must be used to draw the resulting dendrograms and/or convex hulls.
The algorithms used in MacDendro come from M. Roux (1985) "Algorithmes
de classification", 150p., Masson, Paris. See also M. Roux (1991) "Basic
procedures in hierarchical cluster analysis", in Devillers and Karcher (eds),
Applied multivariate analysis in SAR and environmental studies, Kluwer
Academic Publishers, Dordrecht, The Netherlands.
Ref.:
-----
Thioulouse, J. 1989. Statistical analysis and graphical display of
multivariate data on the Macintosh. Computer Applications in the
Biosciences 5, 4, 287-292.
Thioulouse, J. 1990. MacMul and GraphMu: two Macintosh programs
for the display and analysis of multivariate data. Computers and
Geosciences 16, 8, 1235-1240.
Thioulouse, J. and Chessel, D. 1992. A method for reciprocal scaling
of species tolerance and sample diversity. Ecology 73, 2 (in press).