Multivariate analysis of RS was used to examine spectral wavenumbers beyond the validated peak ratios that are usually reported. Principal component analysis (PCA) was selected because it is unsupervised, computing fundamental uncorrelated directions of variance via eigenvector decomposition; it was performed with a package offering options tailored for spectroscopy (Eigenvector Research Inc., package for MATLAB 7). Prior to PCA, the data were “auto scaled” (equivalent to “z-scoring,” or running PCA on the correlation matrix) so that each variable, or wavenumber, had zero mean and unit variance. This step is essential in PCA of Raman spectra, where certain peaks (such as phosphate in bone) are far more intense than others and could otherwise skew the ability of PCA to predict the mechanical properties of bone. Because the Raman signal of bone likely contains far more information than that which relates to mechanics, PCs were selected for analysis by screening for those that significantly separated genotype or class. PCs were first screened with an F-test of variance and the Lilliefors test for normality. Because normality failed in all cases, nonparametric Mann–Whitney U tests were used to test significance at (). For PCs that significantly separated the data classes, sparse multinomial logistic regression (SMLR; Duke University) was used to test for the best classification. SMLR is an iterative multivariate weighting technique that allows for sparsity, that is, the exclusion of features (here, PCs) that do not help discriminate class. Note that a statistically significant difference in a Raman property between genotypes does not necessarily imply that the property can classify genotype. Briefly, SMLR was run with a Laplacian prior, a direct kernel, no bias, no normalization, component-wise updates, and leave-one-sample-out cross-validation.
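The autoscaling and PC-screening pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the spectra, genotype labels, and number of PCs screened are all hypothetical, and PCA is computed by SVD of the autoscaled matrix (mathematically equivalent to the correlation-matrix eigendecomposition).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical data: 20 samples x 100 wavenumbers, two genotypes of 10 each.
X = rng.normal(size=(20, 100))
genotype = np.repeat([0, 1], 10)

# Autoscale ("z-score"): each wavenumber set to zero mean and unit variance,
# which is the same as running PCA on the correlation matrix.
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# PCA via SVD of the autoscaled data; columns of `scores` are PC scores.
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)
scores = U * S

# Screen the first few PCs with a nonparametric Mann-Whitney U test
# between genotypes (the Lilliefors normality test lives in statsmodels
# and is omitted from this sketch).
for pc in range(3):
    a = scores[genotype == 0, pc]
    b = scores[genotype == 1, pc]
    u_stat, p = stats.mannwhitneyu(a, b, alternative="two-sided")
```

PCs whose test falls below the chosen significance level would then be carried forward to the SMLR classification step.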
In leave-one-out cross-validation, the final classification accuracy is based on the cumulative classification of the validation set (the sample left out); if the held-out sample is misclassified at every iteration, the accuracy can reach 0%. The algorithm was run for various weights of the sparsity index (, 1, 10, 50) to ensure optimal classification. Because SMLR is an iterative, kernel-based technique, it does not reduce to a univariate logistic regression in all cases; therefore, for an appropriate comparison, classification using peak ratios was also evaluated with SMLR. In all cases, single principal components yielded better classification than multiple principal components within the same SMLR computation; therefore, Spearman’s correlations were run on single PCs to test how well they explained bending strength and toughness.
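The leave-one-sample-out accuracy tally and the single-PC Spearman correlation can be sketched as below. SMLR itself is not a standard library routine, so a nearest-class-mean rule stands in for the classifier here; the PC scores, labels, and toughness values are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical single-PC scores and class labels for 20 samples.
score = np.concatenate([rng.normal(-1.0, 1.0, 10), rng.normal(1.0, 1.0, 10)])
label = np.repeat([0, 1], 10)
n = len(score)

# Leave-one-sample-out cross-validation: each sample is held out in turn,
# classified by a rule fit to the remaining samples, and the final accuracy
# is the cumulative fraction of held-out samples classified correctly.
correct = 0
for i in range(n):
    mask = np.arange(n) != i
    m0 = score[mask & (label == 0)].mean()  # class-0 mean without sample i
    m1 = score[mask & (label == 1)].mean()  # class-1 mean without sample i
    pred = 0 if abs(score[i] - m0) < abs(score[i] - m1) else 1
    correct += int(pred == label[i])
accuracy = correct / n

# Spearman's rank correlation of a single PC against a mechanical property
# (hypothetical toughness values here).
toughness = rng.normal(size=n)
rho, p = stats.spearmanr(score, toughness)
```

Under the worst case noted above, in which the held-out sample is misclassified on every iteration, `correct` stays at 0 and `accuracy` is 0%.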