diff --git a/StudentNotebooks/Assignment01/zhaot4-f24-assignment1.Rmd b/StudentNotebooks/Assignment01/zhaot4-f24-assignment1.Rmd
index 274fea4..97b62d6 100644
--- a/StudentNotebooks/Assignment01/zhaot4-f24-assignment1.Rmd
+++ b/StudentNotebooks/Assignment01/zhaot4-f24-assignment1.Rmd
@@ -354,7 +354,21 @@ What do the clustering and PCA results tell us about the data detected by the
 The clustering and PCA results from the M20 PIXL experiment data reveal a heterogeneous composition in the Martian surface samples, indicating diverse geological processes and materials. The three distinct clusters point to different rock or soil types, suggesting a complex geological history involving various environmental conditions and processes.
 
 ```{r}
-
+library(ggplot2)
+library(factoextra)
+library(dplyr)
+summary(pixl_trim.mat.pca)
+loadings <- pixl_trim.mat.pca$rotation
+print("PCA Loadings:")
+print(loadings)
+data <- as.factor(km$cluster)
+fviz_pca_biplot(pixl_trim.mat.pca,
+geom.ind = "point",
+col.ind = data, # Color by clusters
+palette = "jco",
+addEllipses = FALSE,
+label = "var",
+title = "PCA Biplot of Martian Geochemical Data")
 ```
 
 ## SAVE, COMMIT and PUSH YOUR CHANGES!
diff --git a/StudentNotebooks/Assignment01/zhaot4-f24-assignment1.html b/StudentNotebooks/Assignment01/zhaot4-f24-assignment1.html
index a39bb51..09d77ae 100644
--- a/StudentNotebooks/Assignment01/zhaot4-f24-assignment1.html
+++ b/StudentNotebooks/Assignment01/zhaot4-f24-assignment1.html
@@ -11,9 +11,9 @@
-
+
-
This notebook is broken into four main parts:
+Part 1: Preparing your local repo for +DAR Assignment 2
Part 2: Loading and some analysis of the Mars +2020 (M20) Datasets
+Part 3: Individual analysis of your team’s +dataset
Part 4: Preparation of Team +Presentation
NOTE: The RPI GitHub repository for all the code and
+data required for this notebook may be found at:
+ +In this assignment you’ll start by making a copy of the Assignment 2 +template notebook, then you’ll add to your copy with your original work. +The instructions which follow explain how to accomplish this.
+NOTE: You already cloned the
+DAR-Mars-F24
repository for Assignment 1; you do
+not need to make another clone of the repo, but you must begin
+by updating your copy as instructed below:
+DAR-Mars-F24 repository
+cd (change directory) to enter your home directory using: cd ~
+pwd to confirm where you are
+cd to DAR-Mars-F24
+git pull origin main to pull any updates
+cd into Assignment02
+ls -al to list the current contents
+git branch to verify your current working branch
+dar-yourrcs, type git checkout dar-yourrcs (where yourrcs is your RCS id)
+git branch to confirm
+DAR-Mars-F24/StudentNotebooks/Assignment02 directory via the Files panel (lower right panel)
+You’re now ready to start coding Assignment 2!
+dar-f24-assignment2-template.Rmd file using a new,
+original, descriptive filename that includes your RCS ID!
+dar-f24-assignment2-template.Rmd
+erickj4:
+erickj4-assignment2-f24.Rmd
+template in your new filename!
+title: and subtitle: headers (at the top of the file)
+author:
+date:; it should update automagically…
+Knit command to create a PDF file; repeat as necessary
+Knit and select Knit to PDF
+git add to add each new file you want to add to the repository
+git add yourfilename.Rmd
+git add yourfilename.pdf (created when you knitted)
+git commit -m "some comment" where “some comment” is a useful comment describing your changes
+git push origin dar-yourrcs (where dar-yourrcs
 is the branch you’ve been working in)
+In this assignment there are four datasets from separate instruments
+on the Mars Perseverance rover available for analysis:
+Each dataset provides data about the mineralogy of the surface of
+Mars. Based on the purpose and nature of the instrument, the data is
+collected at different intervals along the path of Perseverance as it
+makes its way across the Jezero crater. Some of the data (esp. LIBS) is
+collected almost every Martian day, or sol. Some of the data
+(PIXL and SHERLOC) is only collected at certain sample locations of
+interest.
+Your objective is to perform an analysis of your team’s assigned
+dataset in order to learn all you can about these Mars samples.
+NOTES:
+/academics/MATP-4910-F24/DAR-Mars-F24/Data
+samples.Rds dataset
+that includes useful details about the sample locations, including
+Martian latitude and longitude and the sol that individual samples were
+collected.
+rover.waypoints.Rds that provides
+detailed location information (lat/lon) for the Perseverance rover
+throughout its journey, up to the present. This can be updated when
+necessary using the included roverStatus-f24.R script.
+The first five features of the dataset describe twenty-four (24)
+rover sample locations.
+The remaining features provide a simple binary (1 or 0) summary of
+presence or absence of 35 minerals at the 24
+rover sample locations.
Only the first sixteen (16) samples are retained, as the remaining
+samples are missing the mineral descriptors.
+The following code “cleans” the dataset to prepare for analysis. It +first creates a dataframe with metadata and measurements for samples, +and then creates a matrix containing only numeric measurements for later +analysis.
+# Load the saved lithology data with locations added
+lithology.df<- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/mineral_data_static.Rds")
+
+# Cast samples as numbers
+lithology.df$sample <- as.numeric(lithology.df$sample)
+
+# Convert rest into factors
+lithology.df[sapply(lithology.df, is.character)] <-
+ lapply(lithology.df[sapply(lithology.df, is.character)],
+ as.factor)
+
+# Keep only first 16 samples because the data for the rest of the samples is not available yet
+lithology.df<-lithology.df[1:16,]
+
+# Look at summary of cleaned data frame
+summary(lithology.df)
+## sample name SampleType campaign
+## Min. : 1.00 Atsah : 1 atmospheric: 1 Crater Floor:9
+## 1st Qu.: 4.75 Bearwallow: 1 regolith : 0 Delta Front :7
+## Median : 8.50 Coulettes : 1 rock core :15 Margin Unit :0
+## Mean : 8.50 Hahonih : 1
+## 3rd Qu.:12.25 Hazeltop : 1
+## Max. :16.00 Kukaklek : 1
+## (Other) :10
+## abrasion feldspar plagioclase pyroxene olivine quartz apatite
+## Alfalfa :2 0:14 0:13 0: 5 0: 6 0:14 0:13
+## Bellegarde :2 1: 2 1: 3 1:11 1:10 1: 2 1: 3
+## Berry Hollow:2
+## Dourbes :2
+## Novarupta :2
+## Quartier :2
+## (Other) :4
+## FeTi_Oxides Iron_Oxide Sulfate Perchlorates Phosphate Ca_Sulfate Carbonate
+## 0:13 0:9 0: 4 0:15 0:11 0:10 0: 1
+## 1: 3 1:7 1:12 1: 1 1: 5 1: 6 1:15
+##
+##
+##
+##
+##
+## Fe_Mg_clay Fe_Mg_carbonate Mg_sulfate Phyllosilicates Chlorite Halite
+## 0:13 0:14 0:13 0:12 0:14 0:13
+## 1: 3 1: 2 1: 3 1: 4 1: 2 1: 3
+##
+##
+##
+##
+##
+## Organic_matter Hydrated_Ca_Sulfate Hydrated_Sulfates Hydrated_Mg_Fe_Sulfate
+## 0: 5 0:14 0:14 0:13
+## 1:11 1: 2 1: 2 1: 3
+##
+##
+##
+##
+##
+## Na_Perchlorate Amorphous_Silicate Hydrated_Carbonates Disordered_Silicates
+## 0:15 0:9 0:16 0:14
+## 1: 1 1:7 1: 2
+##
+##
+##
+##
+##
+## Hydrated_Iron_Oxide Sulfate+Organic_Matter Other_hydrated_phases Kaolinite
+## 0:15 0:11 0:8 0:13
+## 1: 1 1: 5 1:8 1: 3
+##
+##
+##
+##
+##
+## Chromite Ilmenite Zircon/Baddeleyite Spinels
+## 0:14 0:14 0:14 0:14
+## 1: 2 1: 2 1: 2 1: 2
+##
+##
+##
+##
+##
+# Create a matrix containing only the numeric measurements. The remaining features are metadata about the sample.
+lithology.matrix <- sapply(lithology.df[,6:40],as.numeric)-1
+
+# Review the structure of our matrix
+str(lithology.matrix)
+## num [1:16, 1:35] 0 0 0 0 0 0 0 1 1 0 ...
+## - attr(*, "dimnames")=List of 2
+## ..$ : NULL
+## ..$ : chr [1:35] "feldspar" "plagioclase" "pyroxene" "olivine" ...
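The `sapply(..., as.numeric) - 1` step above works because `as.numeric()` on a factor returns the internal level codes (1, 2, ...) rather than the labels. The following is a minimal sketch on made-up data; the column names and values in `toy.df` are hypothetical and are not taken from the lithology file. It compares the code-based conversion with a label-based one, which may be the safer choice if a column could ever contain only the level "1".

```r
# Toy data frame with two-level factors coded "0"/"1" (hypothetical columns)
toy.df <- data.frame(
  olivine = factor(c("0", "1", "1", "0"), levels = c("0", "1")),
  quartz  = factor(c("0", "0", "1", "0"), levels = c("0", "1"))
)

# as.numeric() on a factor returns the level codes (1, 2),
# so subtracting 1 recovers the original 0/1 values
toy.codes <- sapply(toy.df, as.numeric) - 1

# More explicit alternative: convert through the level labels
toy.labels <- sapply(toy.df, function(x) as.numeric(as.character(x)))

# Both approaches agree when the levels are exactly "0" and "1"
all(toy.codes == toy.labels)

# Caveat: a factor whose only level is "1" has level code 1,
# so the code-based conversion would report 0 instead of 1
only.ones <- factor(c("1", "1"))
as.numeric(only.ones) - 1
as.numeric(as.character(only.ones))
```

For the 16 samples shown above every mineral column includes the level "0", so the two conversions should agree here; the label-based form is only a precaution.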
+The PIXL data provides summaries of the mineral compositions measured
+at selected sample sites by the PIXL instrument. Note that here we scale
+pixl.matrix so features have mean 0 and standard deviation 1, so results will
+differ from those in Assignment 1.
+# Load the saved PIXL data with locations added
+pixl.df <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/samples_pixl_wide.Rds")
+
+# Convert to factors
+pixl.df[sapply(pixl.df, is.character)] <- lapply(pixl.df[sapply(pixl.df, is.character)],
+ as.factor)
+
+# Review our dataframe
+summary(pixl.df)
+## sample Na20 Mgo Al203
+## Min. : 1.00 Min. :1.000 Min. : 0.730 Min. : 1.700
+## 1st Qu.: 4.75 1st Qu.:1.853 1st Qu.: 2.533 1st Qu.: 2.220
+## Median : 8.50 Median :1.900 Median :12.800 Median : 3.710
+## Mean : 8.50 Mean :2.672 Mean :11.682 Mean : 5.072
+## 3rd Qu.:12.25 3rd Qu.:4.500 3rd Qu.:19.100 3rd Qu.: 7.117
+## Max. :16.00 Max. :5.550 Max. :22.700 Max. :11.600
+##
+## Si02 P205 S03 Cl
+## Min. :22.60 Min. :0.1000 Min. : 0.780 Min. :0.400
+## 1st Qu.:31.22 1st Qu.:0.2350 1st Qu.: 1.495 1st Qu.:0.940
+## Median :38.85 Median :0.5250 Median : 2.600 Median :1.740
+## Mean :38.55 Mean :0.6512 Mean : 5.562 Mean :1.846
+## 3rd Qu.:41.17 3rd Qu.:0.8400 3rd Qu.: 3.800 3rd Qu.:2.080
+## Max. :57.10 Max. :2.7600 Max. :21.530 Max. :4.500
+##
+## K20 Cao Ti02 Cr203
+## Min. :0.0000 Min. :1.500 Min. :0.2000 Min. :0.000
+## 1st Qu.:0.1600 1st Qu.:2.655 1st Qu.:0.5900 1st Qu.:0.025
+## Median :0.2000 Median :3.120 Median :0.7000 Median :0.155
+## Mean :0.5800 Mean :3.688 Mean :0.8194 Mean :0.355
+## 3rd Qu.:0.8275 3rd Qu.:4.310 3rd Qu.:0.9900 3rd Qu.:0.290
+## Max. :1.9000 Max. :7.770 Max. :2.4900 Max. :1.900
+##
+## Mno FeO-T name type
+## Min. :0.1000 Min. :13.24 Atsah : 1 Igneous :8
+## 1st Qu.:0.2800 1st Qu.:16.71 Bearwallow: 1 N/A :1
+## Median :0.4000 Median :23.86 Coulettes : 1 Sedimentary:7
+## Mean :0.3812 Mean :21.45 Hahonih : 1
+## 3rd Qu.:0.4900 3rd Qu.:25.70 Hazeltop : 1
+## Max. :0.6900 Max. :30.05 Kukaklek : 1
+## (Other) :10
+## campaign location abrasion
+## Crater Floor:9 01 : 1 Alfalfa :2
+## Delta Front :7 02 : 1 Bellegrade :2
+## 03 : 1 Berry Hollow:2
+## 04 : 1 Dourbes :2
+## 05 : 1 Novarupta :2
+## 06 : 1 Quartier :2
+## (Other):10 (Other) :4
+# Make the matrix of just mineral percentage measurements
+pixl.matrix <- pixl.df[,2:14] %>% scale()
+
+# Review the structure
+str(pixl.matrix)
+## num [1:16, 1:13] 1.928 1.338 -0.498 -0.538 1.225 ...
+## - attr(*, "dimnames")=List of 2
+## ..$ : NULL
+## ..$ : chr [1:13] "Na20" "Mgo" "Al203" "Si02" ...
+## - attr(*, "scaled:center")= Named num [1:13] 2.672 11.682 5.072 38.554 0.651 ...
+## ..- attr(*, "names")= chr [1:13] "Na20" "Mgo" "Al203" "Si02" ...
+## - attr(*, "scaled:scale")= Named num [1:13] 1.492 7.957 3.75 11.026 0.694 ...
+## ..- attr(*, "names")= chr [1:13] "Na20" "Mgo" "Al203" "Si02" ...
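The effect of `scale()` on `pixl.matrix` can be checked on a small example. This is a sketch on made-up numbers; `toy.mat` and its column names are hypothetical. It shows that each column ends up with mean 0 and standard deviation 1, and that the original column means and standard deviations are kept as the `scaled:center` and `scaled:scale` attributes that `str()` reports above.

```r
# Small hypothetical matrix of oxide percentages (values are made up)
toy.mat <- matrix(c(1, 2, 3, 4,
                    10, 20, 30, 40),
                  ncol = 2,
                  dimnames = list(NULL, c("oxideA", "oxideB")))

# Center each column to mean 0 and rescale to standard deviation 1
toy.scaled <- scale(toy.mat)

# Column means are (numerically) zero and standard deviations are one
round(colMeans(toy.scaled), 10)
apply(toy.scaled, 2, sd)

# The values used for centering and scaling are stored as attributes
attr(toy.scaled, "scaled:center")
attr(toy.scaled, "scaled:scale")
```

Because the attributes are retained, scaled values can be mapped back to the original units by multiplying by `scaled:scale` and adding `scaled:center`.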
+The LIBS data provides summaries of the mineral compositions measured +at selected sample sites by the LIBS instrument, part of the +Perseverance SuperCam.
+# Load the saved LIBS data with locations added
+libs.df <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/supercam_libs_moc_loc.Rds")
+
+# Drop features that are not used in the analysis for this notebook
+libs.df <- libs.df %>%
+ select(!(c(distance_mm,Tot.Em.,SiO2_stdev,TiO2_stdev,Al2O3_stdev,FeOT_stdev,
+ MgO_stdev,Na2O_stdev,CaO_stdev,K2O_stdev,Total)))
+
+# Convert the points to numeric
+libs.df$point <- as.numeric(libs.df$point)
+
+# Review what we have
+summary(libs.df)
+## sol lat lon target
+## Min. : 15.0 Min. :18.43 Min. :77.34 Length:1932
+## 1st Qu.: 281.0 1st Qu.:18.44 1st Qu.:77.36 Class :character
+## Median : 557.0 Median :18.46 Median :77.40 Mode :character
+## Mean : 565.1 Mean :18.46 Mean :77.40
+## 3rd Qu.: 872.0 3rd Qu.:18.48 3rd Qu.:77.44
+## Max. :1019.0 Max. :18.50 Max. :77.45
+## point SiO2 TiO2 Al2O3
+## Min. : 1.000 Min. : 0.00 Min. :0.0000 Min. : 0.000
+## 1st Qu.: 3.000 1st Qu.:42.04 1st Qu.:0.0300 1st Qu.: 3.080
+## Median : 5.000 Median :45.80 Median :0.3200 Median : 4.925
+## Mean : 5.776 Mean :43.47 Mean :0.3719 Mean : 6.246
+## 3rd Qu.: 8.000 3rd Qu.:49.23 3rd Qu.:0.6400 3rd Qu.: 8.533
+## Max. :28.000 Max. :76.12 Max. :2.4000 Max. :38.350
+## FeOT MgO CaO Na2O
+## Min. : 0.29 Min. : 0.29 Min. : 0.080 Min. :0.0000
+## 1st Qu.:13.27 1st Qu.: 5.72 1st Qu.: 1.830 1st Qu.:0.9775
+## Median :20.21 Median :12.78 Median : 3.625 Median :1.5200
+## Mean :20.07 Mean :16.47 Mean : 4.726 Mean :1.7600
+## 3rd Qu.:25.45 3rd Qu.:27.83 3rd Qu.: 4.622 3rd Qu.:2.4000
+## Max. :82.68 Max. :45.21 Max. :52.130 Max. :7.5200
+## K2O
+## Min. : 0.0000
+## 1st Qu.: 0.0000
+## Median : 0.3000
+## Mean : 0.5909
+## 3rd Qu.: 0.7800
+## Max. :34.8700
+# Make a matrix containing only the LIBS measurements for each mineral
+libs.matrix <- as.matrix(libs.df[,6:13])
+
+# Check to see scaling
+str(libs.matrix)
+## num [1:1932, 1:8] 49.7 55.8 61.2 51 48 ...
+## - attr(*, "dimnames")=List of 2
+## ..$ : NULL
+## ..$ : chr [1:8] "SiO2" "TiO2" "Al2O3" "FeOT" ...
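The column-dropping step in the LIBS chunk above names every `_stdev` column explicitly. As a sketch only, the same negated selection can be written with a tidyselect helper; `toy.libs` below is a hypothetical miniature with made-up values, not the real LIBS table.

```r
library(dplyr)

# Hypothetical miniature of the LIBS table (values are made up)
toy.libs <- data.frame(
  sol        = c(15, 15, 20),
  SiO2       = c(49.7, 55.8, 61.2),
  SiO2_stdev = c(5.1, 5.3, 5.0),
  MgO        = c(5.7, 12.8, 27.8),
  MgO_stdev  = c(1.1, 1.2, 1.0),
  Total      = c(95.2, 97.1, 99.0)
)

# Negated selection by explicit column name, as in the chunk above
by.name <- toy.libs %>% select(!c(SiO2_stdev, MgO_stdev, Total))

# Equivalent selection using ends_with() for the "_stdev" suffix
by.helper <- toy.libs %>% select(!c(ends_with("_stdev"), Total))

# Both keep the same columns in the same order
names(by.name)
names(by.helper)
```

The explicit list is easier to audit, while the helper keeps working if further `_stdev` columns are added; which is preferable depends on how stable the input file is.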
+The SHERLOC data you will be using for this lab is the result of +scientists’ interpretations of extensive spectral analysis of abrasion +samples provided by the SHERLOC instrument.
+NOTE: This dataset presents minerals as rows and +sample sites as columns. You’ll probably want to rotate the dataset for +easier analysis….
+# Read in data as provided.
+sherloc_abrasion_raw <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/abrasions_sherloc_samples.Rds")
+
+# Clean up data types
+sherloc_abrasion_raw$Mineral<-as.factor(sherloc_abrasion_raw$Mineral)
+sherloc_abrasion_raw[sapply(sherloc_abrasion_raw, is.character)] <- lapply(sherloc_abrasion_raw[sapply(sherloc_abrasion_raw, is.character)],
+ as.numeric)
+# Transform NA's to 0
+sherloc_abrasion_raw <- sherloc_abrasion_raw %>% replace(is.na(.), 0)
+
+# Reformat data so that rows are "abrasions" and columns list the presence of minerals.
+# Do this by "pivoting" to a long format, and then back to the desired wide format.
+
+sherloc_long <- sherloc_abrasion_raw %>%
+ pivot_longer(!Mineral, names_to = "Name", values_to = "Presence")
+
+# Make abrasion a factor
+sherloc_long$Name <- as.factor(sherloc_long$Name)
+
+# Pivot back to wide format: one row per abrasion, one column per mineral
+sherloc.matrix <- sherloc_long %>%
+ pivot_wider(names_from = Mineral, values_from = Presence)
+
+# Get sample information from PIXL and add to measurements -- assumes order is the same
+
+sherloc.df <- cbind(pixl.df[,c("sample","type","campaign","abrasion")],sherloc.matrix)
+
+# Review what we have
+summary(sherloc.df)
+## sample type campaign abrasion
+## Min. : 1.00 Igneous :8 Crater Floor:9 Alfalfa :2
+## 1st Qu.: 4.75 N/A :1 Delta Front :7 Bellegrade :2
+## Median : 8.50 Sedimentary:7 Berry Hollow:2
+## Mean : 8.50 Dourbes :2
+## 3rd Qu.:12.25 Novarupta :2
+## Max. :16.00 Quartier :2
+## (Other) :4
+## Name Plagioclase Sulfate Ca-sulfate
+## Atsah : 1 Min. :0.0000 Min. :0.0000 Min. :0.0000
+## Bearwallow: 1 1st Qu.:0.0000 1st Qu.:0.1875 1st Qu.:0.0000
+## Coulettes : 1 Median :0.0000 Median :1.0000 Median :0.0000
+## Hahonih : 1 Mean :0.1875 Mean :0.6562 Mean :0.3438
+## Hazeltop : 1 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
+## Kukaklek : 1 Max. :1.0000 Max. :1.0000 Max. :1.0000
+## (Other) :10
+## Hydrated Ca-sulfate Mg-sulfate Hydrated Sulfates Hydrated Mg-Fe sulfate
+## Min. :0.000 Min. :0.0000 Min. :0.000 Min. :0.0000
+## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.0000
+## Median :0.000 Median :0.0000 Median :0.000 Median :0.0000
+## Mean :0.125 Mean :0.1875 Mean :0.125 Mean :0.1875
+## 3rd Qu.:0.000 3rd Qu.:0.0000 3rd Qu.:0.000 3rd Qu.:0.0000
+## Max. :1.000 Max. :1.0000 Max. :1.000 Max. :1.0000
+##
+## Perchlorates Na-perchlorate Amorphous Silicate Phosphate
+## Min. :0.0000 Min. :0.00000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.0000 Median :0.00000 Median :0.0000 Median :0.0000
+## Mean :0.0625 Mean :0.03125 Mean :0.1406 Mean :0.2031
+## 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.2500 3rd Qu.:0.3125
+## Max. :1.0000 Max. :0.50000 Max. :0.5000 Max. :1.0000
+##
+## Pyroxene Olivine Carbonate Fe-Mg carbonate
+## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.4375 1st Qu.:0.000
+## Median :1.0000 Median :0.6250 Median :1.0000 Median :0.000
+## Mean :0.6875 Mean :0.5312 Mean :0.7344 Mean :0.125
+## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.000
+## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.000
+##
+## Hydrated Carbonates Disordered Silicates Feldspar Quartz
+## Min. :0 Min. :0.000 Min. :0.000 Min. :0.00000
+## 1st Qu.:0 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.00000
+## Median :0 Median :0.000 Median :0.000 Median :0.00000
+## Mean :0 Mean :0.125 Mean :0.125 Mean :0.03125
+## 3rd Qu.:0 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:0.00000
+## Max. :0 Max. :1.000 Max. :1.000 Max. :0.25000
+##
+## Apatite FeTi oxides Halite Iron oxide
+## Min. :0.0000 Min. :0.0000 Min. :0.00000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
+## Median :0.0000 Median :0.0000 Median :0.00000 Median :0.0000
+## Mean :0.1406 Mean :0.1406 Mean :0.04688 Mean :0.2812
+## 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.5000
+## Max. :1.0000 Max. :1.0000 Max. :0.25000 Max. :1.0000
+##
+## Hydrated Iron oxide Organic matter Sulfate+Organic matter
+## Min. :0.00000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.00000 Median :1.0000 Median :0.0000
+## Mean :0.01562 Mean :0.5938 Mean :0.2188
+## 3rd Qu.:0.00000 3rd Qu.:1.0000 3rd Qu.:0.2500
+## Max. :0.25000 Max. :1.0000 Max. :1.0000
+##
+## Other hydrated phases Phyllosilicates Chlorite
+## Min. :0.0000 Min. :0.00000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
+## Median :0.2500 Median :0.00000 Median :0.0000
+## Mean :0.4375 Mean :0.09375 Mean :0.0625
+## 3rd Qu.:1.0000 3rd Qu.:0.06250 3rd Qu.:0.0000
+## Max. :1.0000 Max. :0.50000 Max. :0.5000
+##
+## Kaolinite (hydrous Al-clay) Chromite Ilmenite Zircon/Baddeleyite
+## Min. :0.0000 Min. :0.000 Min. :0.000 Min. :0.000
+## 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.000
+## Median :0.0000 Median :0.000 Median :0.000 Median :0.000
+## Mean :0.1875 Mean :0.125 Mean :0.125 Mean :0.125
+## 3rd Qu.:0.0000 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:0.000
+## Max. :1.0000 Max. :1.000 Max. :1.000 Max. :1.000
+##
+## Fe-Mg-clay minerals Spinels
+## Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.0000 Median :0.0000
+## Mean :0.1875 Mean :0.0625
+## 3rd Qu.:0.0000 3rd Qu.:0.0000
+## Max. :1.0000 Max. :0.5000
+##
+# Measurements are everything except first column
+sherloc.matrix<-as.matrix(sherloc.matrix[,-1])
+
+# SHERLOC measurement matrix
+# Review the structure
+str(sherloc.matrix)
+## num [1:16, 1:35] 1 1 1 0 0 0 0 0 0 0 ...
+## - attr(*, "dimnames")=List of 2
+## ..$ : NULL
+## ..$ : chr [1:35] "Plagioclase" "Sulfate" "Ca-sulfate" "Hydrated Ca-sulfate" ...
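The `pivot_longer()` and `pivot_wider()` round trip used above to rotate the SHERLOC table can be followed on a miniature example. This is a sketch under stated assumptions: `toy.raw` is hypothetical, with made-up values, and only mirrors the layout described in the note (minerals as rows, abrasions as columns).

```r
library(dplyr)
library(tidyr)

# Hypothetical miniature: minerals as rows, abrasions as columns
toy.raw <- data.frame(
  Mineral = c("Olivine", "Sulfate"),
  Alfalfa = c(1, 0),
  Dourbes = c(0.5, 1)
)

# Long format: one row per (mineral, abrasion) pair
toy.long <- toy.raw %>%
  pivot_longer(!Mineral, names_to = "Name", values_to = "Presence")

# Wide format again, now with abrasions as rows and minerals as columns
toy.wide <- toy.long %>%
  pivot_wider(names_from = Mineral, values_from = Presence)

toy.wide
```

The result has one row per abrasion and a `Name` column in first position, which is why the notebook drops the first column before calling `as.matrix()`.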
+# Combine PIXL and SHERLOC dataframes
+pixl_sherloc.df <- cbind(pixl.df,sherloc.df )
+
+# Review what we have
+summary(pixl_sherloc.df)
+## sample Na20 Mgo Al203
+## Min. : 1.00 Min. :1.000 Min. : 0.730 Min. : 1.700
+## 1st Qu.: 4.75 1st Qu.:1.853 1st Qu.: 2.533 1st Qu.: 2.220
+## Median : 8.50 Median :1.900 Median :12.800 Median : 3.710
+## Mean : 8.50 Mean :2.672 Mean :11.682 Mean : 5.072
+## 3rd Qu.:12.25 3rd Qu.:4.500 3rd Qu.:19.100 3rd Qu.: 7.117
+## Max. :16.00 Max. :5.550 Max. :22.700 Max. :11.600
+##
+## Si02 P205 S03 Cl
+## Min. :22.60 Min. :0.1000 Min. : 0.780 Min. :0.400
+## 1st Qu.:31.22 1st Qu.:0.2350 1st Qu.: 1.495 1st Qu.:0.940
+## Median :38.85 Median :0.5250 Median : 2.600 Median :1.740
+## Mean :38.55 Mean :0.6512 Mean : 5.562 Mean :1.846
+## 3rd Qu.:41.17 3rd Qu.:0.8400 3rd Qu.: 3.800 3rd Qu.:2.080
+## Max. :57.10 Max. :2.7600 Max. :21.530 Max. :4.500
+##
+## K20 Cao Ti02 Cr203
+## Min. :0.0000 Min. :1.500 Min. :0.2000 Min. :0.000
+## 1st Qu.:0.1600 1st Qu.:2.655 1st Qu.:0.5900 1st Qu.:0.025
+## Median :0.2000 Median :3.120 Median :0.7000 Median :0.155
+## Mean :0.5800 Mean :3.688 Mean :0.8194 Mean :0.355
+## 3rd Qu.:0.8275 3rd Qu.:4.310 3rd Qu.:0.9900 3rd Qu.:0.290
+## Max. :1.9000 Max. :7.770 Max. :2.4900 Max. :1.900
+##
+## Mno FeO-T name type
+## Min. :0.1000 Min. :13.24 Atsah : 1 Igneous :8
+## 1st Qu.:0.2800 1st Qu.:16.71 Bearwallow: 1 N/A :1
+## Median :0.4000 Median :23.86 Coulettes : 1 Sedimentary:7
+## Mean :0.3812 Mean :21.45 Hahonih : 1
+## 3rd Qu.:0.4900 3rd Qu.:25.70 Hazeltop : 1
+## Max. :0.6900 Max. :30.05 Kukaklek : 1
+## (Other) :10
+## campaign location abrasion sample type
+## Crater Floor:9 01 : 1 Alfalfa :2 Min. : 1.00 Igneous :8
+## Delta Front :7 02 : 1 Bellegrade :2 1st Qu.: 4.75 N/A :1
+## 03 : 1 Berry Hollow:2 Median : 8.50 Sedimentary:7
+## 04 : 1 Dourbes :2 Mean : 8.50
+## 05 : 1 Novarupta :2 3rd Qu.:12.25
+## 06 : 1 Quartier :2 Max. :16.00
+## (Other):10 (Other) :4
+## campaign abrasion Name Plagioclase
+## Crater Floor:9 Alfalfa :2 Atsah : 1 Min. :0.0000
+## Delta Front :7 Bellegrade :2 Bearwallow: 1 1st Qu.:0.0000
+## Berry Hollow:2 Coulettes : 1 Median :0.0000
+## Dourbes :2 Hahonih : 1 Mean :0.1875
+## Novarupta :2 Hazeltop : 1 3rd Qu.:0.0000
+## Quartier :2 Kukaklek : 1 Max. :1.0000
+## (Other) :4 (Other) :10
+## Sulfate Ca-sulfate Hydrated Ca-sulfate Mg-sulfate
+## Min. :0.0000 Min. :0.0000 Min. :0.000 Min. :0.0000
+## 1st Qu.:0.1875 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.0000
+## Median :1.0000 Median :0.0000 Median :0.000 Median :0.0000
+## Mean :0.6562 Mean :0.3438 Mean :0.125 Mean :0.1875
+## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.000 3rd Qu.:0.0000
+## Max. :1.0000 Max. :1.0000 Max. :1.000 Max. :1.0000
+##
+## Hydrated Sulfates Hydrated Mg-Fe sulfate Perchlorates Na-perchlorate
+## Min. :0.000 Min. :0.0000 Min. :0.0000 Min. :0.00000
+## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000
+## Median :0.000 Median :0.0000 Median :0.0000 Median :0.00000
+## Mean :0.125 Mean :0.1875 Mean :0.0625 Mean :0.03125
+## 3rd Qu.:0.000 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:0.00000
+## Max. :1.000 Max. :1.0000 Max. :1.0000 Max. :0.50000
+##
+## Amorphous Silicate Phosphate Pyroxene Olivine
+## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.0000 Median :0.0000 Median :1.0000 Median :0.6250
+## Mean :0.1406 Mean :0.2031 Mean :0.6875 Mean :0.5312
+## 3rd Qu.:0.2500 3rd Qu.:0.3125 3rd Qu.:1.0000 3rd Qu.:1.0000
+## Max. :0.5000 Max. :1.0000 Max. :1.0000 Max. :1.0000
+##
+## Carbonate Fe-Mg carbonate Hydrated Carbonates Disordered Silicates
+## Min. :0.0000 Min. :0.000 Min. :0 Min. :0.000
+## 1st Qu.:0.4375 1st Qu.:0.000 1st Qu.:0 1st Qu.:0.000
+## Median :1.0000 Median :0.000 Median :0 Median :0.000
+## Mean :0.7344 Mean :0.125 Mean :0 Mean :0.125
+## 3rd Qu.:1.0000 3rd Qu.:0.000 3rd Qu.:0 3rd Qu.:0.000
+## Max. :1.0000 Max. :1.000 Max. :0 Max. :1.000
+##
+## Feldspar Quartz Apatite FeTi oxides
+## Min. :0.000 Min. :0.00000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.000 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.000 Median :0.00000 Median :0.0000 Median :0.0000
+## Mean :0.125 Mean :0.03125 Mean :0.1406 Mean :0.1406
+## 3rd Qu.:0.000 3rd Qu.:0.00000 3rd Qu.:0.0000 3rd Qu.:0.0000
+## Max. :1.000 Max. :0.25000 Max. :1.0000 Max. :1.0000
+##
+## Halite Iron oxide Hydrated Iron oxide Organic matter
+## Min. :0.00000 Min. :0.0000 Min. :0.00000 Min. :0.0000
+## 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
+## Median :0.00000 Median :0.0000 Median :0.00000 Median :1.0000
+## Mean :0.04688 Mean :0.2812 Mean :0.01562 Mean :0.5938
+## 3rd Qu.:0.00000 3rd Qu.:0.5000 3rd Qu.:0.00000 3rd Qu.:1.0000
+## Max. :0.25000 Max. :1.0000 Max. :0.25000 Max. :1.0000
+##
+## Sulfate+Organic matter Other hydrated phases Phyllosilicates
+## Min. :0.0000 Min. :0.0000 Min. :0.00000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000
+## Median :0.0000 Median :0.2500 Median :0.00000
+## Mean :0.2188 Mean :0.4375 Mean :0.09375
+## 3rd Qu.:0.2500 3rd Qu.:1.0000 3rd Qu.:0.06250
+## Max. :1.0000 Max. :1.0000 Max. :0.50000
+##
+## Chlorite Kaolinite (hydrous Al-clay) Chromite Ilmenite
+## Min. :0.0000 Min. :0.0000 Min. :0.000 Min. :0.000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.000
+## Median :0.0000 Median :0.0000 Median :0.000 Median :0.000
+## Mean :0.0625 Mean :0.1875 Mean :0.125 Mean :0.125
+## 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:0.000 3rd Qu.:0.000
+## Max. :0.5000 Max. :1.0000 Max. :1.000 Max. :1.000
+##
+## Zircon/Baddeleyite Fe-Mg-clay minerals Spinels
+## Min. :0.000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.000 Median :0.0000 Median :0.0000
+## Mean :0.125 Mean :0.1875 Mean :0.0625
+## 3rd Qu.:0.000 3rd Qu.:0.0000 3rd Qu.:0.0000
+## Max. :1.000 Max. :1.0000 Max. :0.5000
+##
+# Combine PIXL and SHERLOC matrices
+pixl_sherloc.matrix<-cbind(pixl.matrix,sherloc.matrix)
+
+# Review the structure of our matrix
+str(pixl_sherloc.matrix)
+## num [1:16, 1:48] 1.928 1.338 -0.498 -0.538 1.225 ...
+## - attr(*, "dimnames")=List of 2
+## ..$ : NULL
+## ..$ : chr [1:48] "Na20" "Mgo" "Al203" "Si02" ...
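The `cbind()` calls above pair rows by position, which is why the notebook notes that the order is assumed to be the same, and why `sample`, `type`, `campaign` and `abrasion` appear twice in the combined summary. The following sketch shows a key-based alternative with base R `merge()`; the two toy tables and their values are hypothetical, and this is not the method the notebook itself uses.

```r
# Hypothetical miniatures of two per-sample tables (values are made up)
toy.pixl <- data.frame(
  sample   = c(1, 2, 3),
  campaign = c("Crater Floor", "Crater Floor", "Delta Front"),
  MgO      = c(2.5, 12.8, 19.1)
)

# Same samples, deliberately stored in a different row order
toy.sherloc <- data.frame(
  sample   = c(3, 1, 2),
  campaign = c("Delta Front", "Crater Floor", "Crater Floor"),
  Olivine  = c(1, 0, 0.5)
)

# cbind() pairs rows by position: rows are misaligned here
# and the metadata columns are duplicated
toy.cbind <- cbind(toy.pixl, toy.sherloc)
names(toy.cbind)

# merge() pairs rows by the shared key columns instead
toy.merged <- merge(toy.pixl, toy.sherloc, by = c("sample", "campaign"))
toy.merged
```

If the row order of the real tables is known to match, `cbind()` gives the same pairing; the key-based join mainly guards against silent misalignment.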
+Create data and matrix from prior datasets
+# Combine our PIXL and Lithology dataframes
+pixl_lithology.df <- cbind(pixl.df,lithology.df )
+
+# Review what we have
+summary(pixl_lithology.df)
+## sample Na20 Mgo Al203
+## Min. : 1.00 Min. :1.000 Min. : 0.730 Min. : 1.700
+## 1st Qu.: 4.75 1st Qu.:1.853 1st Qu.: 2.533 1st Qu.: 2.220
+## Median : 8.50 Median :1.900 Median :12.800 Median : 3.710
+## Mean : 8.50 Mean :2.672 Mean :11.682 Mean : 5.072
+## 3rd Qu.:12.25 3rd Qu.:4.500 3rd Qu.:19.100 3rd Qu.: 7.117
+## Max. :16.00 Max. :5.550 Max. :22.700 Max. :11.600
+##
+## Si02 P205 S03 Cl
+## Min. :22.60 Min. :0.1000 Min. : 0.780 Min. :0.400
+## 1st Qu.:31.22 1st Qu.:0.2350 1st Qu.: 1.495 1st Qu.:0.940
+## Median :38.85 Median :0.5250 Median : 2.600 Median :1.740
+## Mean :38.55 Mean :0.6512 Mean : 5.562 Mean :1.846
+## 3rd Qu.:41.17 3rd Qu.:0.8400 3rd Qu.: 3.800 3rd Qu.:2.080
+## Max. :57.10 Max. :2.7600 Max. :21.530 Max. :4.500
+##
+## K20 Cao Ti02 Cr203
+## Min. :0.0000 Min. :1.500 Min. :0.2000 Min. :0.000
+## 1st Qu.:0.1600 1st Qu.:2.655 1st Qu.:0.5900 1st Qu.:0.025
+## Median :0.2000 Median :3.120 Median :0.7000 Median :0.155
+## Mean :0.5800 Mean :3.688 Mean :0.8194 Mean :0.355
+## 3rd Qu.:0.8275 3rd Qu.:4.310 3rd Qu.:0.9900 3rd Qu.:0.290
+## Max. :1.9000 Max. :7.770 Max. :2.4900 Max. :1.900
+##
+## Mno FeO-T name type
+## Min. :0.1000 Min. :13.24 Atsah : 1 Igneous :8
+## 1st Qu.:0.2800 1st Qu.:16.71 Bearwallow: 1 N/A :1
+## Median :0.4000 Median :23.86 Coulettes : 1 Sedimentary:7
+## Mean :0.3812 Mean :21.45 Hahonih : 1
+## 3rd Qu.:0.4900 3rd Qu.:25.70 Hazeltop : 1
+## Max. :0.6900 Max. :30.05 Kukaklek : 1
+## (Other) :10
+## campaign location abrasion sample name
+## Crater Floor:9 01 : 1 Alfalfa :2 Min. : 1.00 Atsah : 1
+## Delta Front :7 02 : 1 Bellegrade :2 1st Qu.: 4.75 Bearwallow: 1
+## 03 : 1 Berry Hollow:2 Median : 8.50 Coulettes : 1
+## 04 : 1 Dourbes :2 Mean : 8.50 Hahonih : 1
+## 05 : 1 Novarupta :2 3rd Qu.:12.25 Hazeltop : 1
+## 06 : 1 Quartier :2 Max. :16.00 Kukaklek : 1
+## (Other):10 (Other) :4 (Other) :10
+## SampleType campaign abrasion feldspar plagioclase
+## atmospheric: 1 Crater Floor:9 Alfalfa :2 0:14 0:13
+## regolith : 0 Delta Front :7 Bellegarde :2 1: 2 1: 3
+## rock core :15 Margin Unit :0 Berry Hollow:2
+## Dourbes :2
+## Novarupta :2
+## Quartier :2
+## (Other) :4
+## pyroxene olivine quartz apatite FeTi_Oxides Iron_Oxide Sulfate Perchlorates
+## 0: 5 0: 6 0:14 0:13 0:13 0:9 0: 4 0:15
+## 1:11 1:10 1: 2 1: 3 1: 3 1:7 1:12 1: 1
+##
+##
+##
+##
+##
+## Phosphate Ca_Sulfate Carbonate Fe_Mg_clay Fe_Mg_carbonate Mg_sulfate
+## 0:11 0:10 0: 1 0:13 0:14 0:13
+## 1: 5 1: 6 1:15 1: 3 1: 2 1: 3
+##
+##
+##
+##
+##
+## Phyllosilicates Chlorite Halite Organic_matter Hydrated_Ca_Sulfate
+## 0:12 0:14 0:13 0: 5 0:14
+## 1: 4 1: 2 1: 3 1:11 1: 2
+##
+##
+##
+##
+##
+## Hydrated_Sulfates Hydrated_Mg_Fe_Sulfate Na_Perchlorate Amorphous_Silicate
+## 0:14 0:13 0:15 0:9
+## 1: 2 1: 3 1: 1 1:7
+##
+##
+##
+##
+##
+## Hydrated_Carbonates Disordered_Silicates Hydrated_Iron_Oxide
+## 0:16 0:14 0:15
+## 1: 2 1: 1
+##
+##
+##
+##
+##
+## Sulfate+Organic_Matter Other_hydrated_phases Kaolinite Chromite Ilmenite
+## 0:11 0:8 0:13 0:14 0:14
+## 1: 5 1:8 1: 3 1: 2 1: 2
+##
+##
+##
+##
+##
+## Zircon/Baddeleyite Spinels
+## 0:14 0:14
+## 1: 2 1: 2
+##
+##
+##
+##
+##
+# Combine PIXL and Lithology matrices
+pixl_lithology.matrix<-cbind(pixl.matrix,lithology.matrix)
+
+# Review the structure
+str(pixl_lithology.matrix)
+## num [1:16, 1:48] 1.928 1.338 -0.498 -0.538 1.225 ...
+## - attr(*, "dimnames")=List of 2
+## ..$ : NULL
+## ..$ : chr [1:48] "Na20" "Mgo" "Al203" "Si02" ...
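One thing to watch when running PCA on a combined matrix: the lithology summary above shows `Hydrated_Carbonates` as 0 for all 16 samples, and `prcomp()` with `scale. = TRUE` cannot rescale a constant column. The sketch below uses a hypothetical `toy.mat` with made-up values to show one way of dropping zero-variance columns first; it is an illustration, not a step taken in the notebook.

```r
# Hypothetical matrix mixing a scaled measurement, a binary indicator,
# and a constant column (values are made up)
set.seed(1)
toy.mat <- cbind(
  oxide    = as.numeric(scale(rnorm(8))),
  mineralA = c(0, 1, 0, 1, 1, 0, 0, 1),
  mineralB = rep(0, 8)
)

# Flag columns with zero variance
is.constant <- apply(toy.mat, 2, var) == 0

# Keep only the columns that vary across samples
toy.keep <- toy.mat[, !is.constant, drop = FALSE]
colnames(toy.keep)

# PCA on the remaining columns
toy.pca <- prcomp(toy.keep, scale. = TRUE)
summary(toy.pca)
```

Whether to rescale the binary indicators at all is a separate modeling choice; the sketch only addresses the constant-column error.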
+Create a dataframe and matrix from the prior datasets by taking appropriate
+combinations.
+# Combine the Lithology and SHERLOC dataframes
+sherloc_lithology.df <- cbind(sherloc.df,lithology.df )
+
+# Review what we have
+summary(sherloc_lithology.df)
+## sample type campaign abrasion
+## Min. : 1.00 Igneous :8 Crater Floor:9 Alfalfa :2
+## 1st Qu.: 4.75 N/A :1 Delta Front :7 Bellegrade :2
+## Median : 8.50 Sedimentary:7 Berry Hollow:2
+## Mean : 8.50 Dourbes :2
+## 3rd Qu.:12.25 Novarupta :2
+## Max. :16.00 Quartier :2
+## (Other) :4
+## Name Plagioclase Sulfate Ca-sulfate
+## Atsah : 1 Min. :0.0000 Min. :0.0000 Min. :0.0000
+## Bearwallow: 1 1st Qu.:0.0000 1st Qu.:0.1875 1st Qu.:0.0000
+## Coulettes : 1 Median :0.0000 Median :1.0000 Median :0.0000
+## Hahonih : 1 Mean :0.1875 Mean :0.6562 Mean :0.3438
+## Hazeltop : 1 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
+## Kukaklek : 1 Max. :1.0000 Max. :1.0000 Max. :1.0000
+## (Other) :10
+## Hydrated Ca-sulfate Mg-sulfate Hydrated Sulfates Hydrated Mg-Fe sulfate
+## Min. :0.000 Min. :0.0000 Min. :0.000 Min. :0.0000
+## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.0000
+## Median :0.000 Median :0.0000 Median :0.000 Median :0.0000
+## Mean :0.125 Mean :0.1875 Mean :0.125 Mean :0.1875
+## 3rd Qu.:0.000 3rd Qu.:0.0000 3rd Qu.:0.000 3rd Qu.:0.0000
+## Max. :1.000 Max. :1.0000 Max. :1.000 Max. :1.0000
+##
+## Perchlorates Na-perchlorate Amorphous Silicate Phosphate
+## Min. :0.0000 Min. :0.00000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.0000 Median :0.00000 Median :0.0000 Median :0.0000
+## Mean :0.0625 Mean :0.03125 Mean :0.1406 Mean :0.2031
+## 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.2500 3rd Qu.:0.3125
+## Max. :1.0000 Max. :0.50000 Max. :0.5000 Max. :1.0000
+##
+## Pyroxene Olivine Carbonate Fe-Mg carbonate
+## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.4375 1st Qu.:0.000
+## Median :1.0000 Median :0.6250 Median :1.0000 Median :0.000
+## Mean :0.6875 Mean :0.5312 Mean :0.7344 Mean :0.125
+## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.000
+## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.000
+##
+## Hydrated Carbonates Disordered Silicates Feldspar Quartz
+## Min. :0 Min. :0.000 Min. :0.000 Min. :0.00000
+## 1st Qu.:0 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.00000
+## Median :0 Median :0.000 Median :0.000 Median :0.00000
+## Mean :0 Mean :0.125 Mean :0.125 Mean :0.03125
+## 3rd Qu.:0 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:0.00000
+## Max. :0 Max. :1.000 Max. :1.000 Max. :0.25000
+##
+## Apatite FeTi oxides Halite Iron oxide
+## Min. :0.0000 Min. :0.0000 Min. :0.00000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
+## Median :0.0000 Median :0.0000 Median :0.00000 Median :0.0000
+## Mean :0.1406 Mean :0.1406 Mean :0.04688 Mean :0.2812
+## 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.5000
+## Max. :1.0000 Max. :1.0000 Max. :0.25000 Max. :1.0000
+##
+## Hydrated Iron oxide Organic matter Sulfate+Organic matter
+## Min. :0.00000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.00000 Median :1.0000 Median :0.0000
+## Mean :0.01562 Mean :0.5938 Mean :0.2188
+## 3rd Qu.:0.00000 3rd Qu.:1.0000 3rd Qu.:0.2500
+## Max. :0.25000 Max. :1.0000 Max. :1.0000
+##
+## Other hydrated phases Phyllosilicates Chlorite
+## Min. :0.0000 Min. :0.00000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
+## Median :0.2500 Median :0.00000 Median :0.0000
+## Mean :0.4375 Mean :0.09375 Mean :0.0625
+## 3rd Qu.:1.0000 3rd Qu.:0.06250 3rd Qu.:0.0000
+## Max. :1.0000 Max. :0.50000 Max. :0.5000
+##
+## Kaolinite (hydrous Al-clay) Chromite Ilmenite Zircon/Baddeleyite
+## Min. :0.0000 Min. :0.000 Min. :0.000 Min. :0.000
+## 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.000
+## Median :0.0000 Median :0.000 Median :0.000 Median :0.000
+## Mean :0.1875 Mean :0.125 Mean :0.125 Mean :0.125
+## 3rd Qu.:0.0000 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:0.000
+## Max. :1.0000 Max. :1.000 Max. :1.000 Max. :1.000
+##
+## Fe-Mg-clay minerals Spinels sample name
+## Min. :0.0000 Min. :0.0000 Min. : 1.00 Atsah : 1
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.: 4.75 Bearwallow: 1
+## Median :0.0000 Median :0.0000 Median : 8.50 Coulettes : 1
+## Mean :0.1875 Mean :0.0625 Mean : 8.50 Hahonih : 1
+## 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:12.25 Hazeltop : 1
+## Max. :1.0000 Max. :0.5000 Max. :16.00 Kukaklek : 1
+## (Other) :10
+## SampleType campaign abrasion feldspar plagioclase
+## atmospheric: 1 Crater Floor:9 Alfalfa :2 0:14 0:13
+## regolith : 0 Delta Front :7 Bellegarde :2 1: 2 1: 3
+## rock core :15 Margin Unit :0 Berry Hollow:2
+## Dourbes :2
+## Novarupta :2
+## Quartier :2
+## (Other) :4
+## pyroxene olivine quartz apatite FeTi_Oxides Iron_Oxide Sulfate Perchlorates
+## 0: 5 0: 6 0:14 0:13 0:13 0:9 0: 4 0:15
+## 1:11 1:10 1: 2 1: 3 1: 3 1:7 1:12 1: 1
+##
+##
+##
+##
+##
+## Phosphate Ca_Sulfate Carbonate Fe_Mg_clay Fe_Mg_carbonate Mg_sulfate
+## 0:11 0:10 0: 1 0:13 0:14 0:13
+## 1: 5 1: 6 1:15 1: 3 1: 2 1: 3
+##
+##
+##
+##
+##
+## Phyllosilicates Chlorite Halite Organic_matter Hydrated_Ca_Sulfate
+## 0:12 0:14 0:13 0: 5 0:14
+## 1: 4 1: 2 1: 3 1:11 1: 2
+##
+##
+##
+##
+##
+## Hydrated_Sulfates Hydrated_Mg_Fe_Sulfate Na_Perchlorate Amorphous_Silicate
+## 0:14 0:13 0:15 0:9
+## 1: 2 1: 3 1: 1 1:7
+##
+##
+##
+##
+##
+## Hydrated_Carbonates Disordered_Silicates Hydrated_Iron_Oxide
+## 0:16 0:14 0:15
+## 1: 2 1: 1
+##
+##
+##
+##
+##
+## Sulfate+Organic_Matter Other_hydrated_phases Kaolinite Chromite Ilmenite
+## 0:11 0:8 0:13 0:14 0:14
+## 1: 5 1:8 1: 3 1: 2 1: 2
+##
+##
+##
+##
+##
+## Zircon/Baddeleyite Spinels
+## 0:14 0:14
+## 1: 2 1: 2
+##
+##
+##
+##
+##
+# Combine the Lithology and SHERLOC matrices
+sherloc_lithology.matrix<-cbind(sherloc.matrix,lithology.matrix)
+
+# Review the resulting matrix
+str(sherloc_lithology.matrix)
+## num [1:16, 1:70] 1 1 1 0 0 0 0 0 0 0 ...
+## - attr(*, "dimnames")=List of 2
+## ..$ : NULL
+## ..$ : chr [1:70] "Plagioclase" "Sulfate" "Ca-sulfate" "Hydrated Ca-sulfate" ...
+Create a data frame and matrix from the prior datasets by making appropriate combinations.
+# Combine the SHERLOC, Lithology and PIXL dataframes
+sherloc_lithology_pixl.df <- cbind(sherloc.df, lithology.df, pixl.df)
+
+# Review what we have
+summary(sherloc_lithology_pixl.df)
+## sample type campaign abrasion
+## Min. : 1.00 Igneous :8 Crater Floor:9 Alfalfa :2
+## 1st Qu.: 4.75 N/A :1 Delta Front :7 Bellegrade :2
+## Median : 8.50 Sedimentary:7 Berry Hollow:2
+## Mean : 8.50 Dourbes :2
+## 3rd Qu.:12.25 Novarupta :2
+## Max. :16.00 Quartier :2
+## (Other) :4
+## Name Plagioclase Sulfate Ca-sulfate
+## Atsah : 1 Min. :0.0000 Min. :0.0000 Min. :0.0000
+## Bearwallow: 1 1st Qu.:0.0000 1st Qu.:0.1875 1st Qu.:0.0000
+## Coulettes : 1 Median :0.0000 Median :1.0000 Median :0.0000
+## Hahonih : 1 Mean :0.1875 Mean :0.6562 Mean :0.3438
+## Hazeltop : 1 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
+## Kukaklek : 1 Max. :1.0000 Max. :1.0000 Max. :1.0000
+## (Other) :10
+## Hydrated Ca-sulfate Mg-sulfate Hydrated Sulfates Hydrated Mg-Fe sulfate
+## Min. :0.000 Min. :0.0000 Min. :0.000 Min. :0.0000
+## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.0000
+## Median :0.000 Median :0.0000 Median :0.000 Median :0.0000
+## Mean :0.125 Mean :0.1875 Mean :0.125 Mean :0.1875
+## 3rd Qu.:0.000 3rd Qu.:0.0000 3rd Qu.:0.000 3rd Qu.:0.0000
+## Max. :1.000 Max. :1.0000 Max. :1.000 Max. :1.0000
+##
+## Perchlorates Na-perchlorate Amorphous Silicate Phosphate
+## Min. :0.0000 Min. :0.00000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.0000 Median :0.00000 Median :0.0000 Median :0.0000
+## Mean :0.0625 Mean :0.03125 Mean :0.1406 Mean :0.2031
+## 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.2500 3rd Qu.:0.3125
+## Max. :1.0000 Max. :0.50000 Max. :0.5000 Max. :1.0000
+##
+## Pyroxene Olivine Carbonate Fe-Mg carbonate
+## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.4375 1st Qu.:0.000
+## Median :1.0000 Median :0.6250 Median :1.0000 Median :0.000
+## Mean :0.6875 Mean :0.5312 Mean :0.7344 Mean :0.125
+## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.000
+## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.000
+##
+## Hydrated Carbonates Disordered Silicates Feldspar Quartz
+## Min. :0 Min. :0.000 Min. :0.000 Min. :0.00000
+## 1st Qu.:0 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.00000
+## Median :0 Median :0.000 Median :0.000 Median :0.00000
+## Mean :0 Mean :0.125 Mean :0.125 Mean :0.03125
+## 3rd Qu.:0 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:0.00000
+## Max. :0 Max. :1.000 Max. :1.000 Max. :0.25000
+##
+## Apatite FeTi oxides Halite Iron oxide
+## Min. :0.0000 Min. :0.0000 Min. :0.00000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
+## Median :0.0000 Median :0.0000 Median :0.00000 Median :0.0000
+## Mean :0.1406 Mean :0.1406 Mean :0.04688 Mean :0.2812
+## 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.5000
+## Max. :1.0000 Max. :1.0000 Max. :0.25000 Max. :1.0000
+##
+## Hydrated Iron oxide Organic matter Sulfate+Organic matter
+## Min. :0.00000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.00000 Median :1.0000 Median :0.0000
+## Mean :0.01562 Mean :0.5938 Mean :0.2188
+## 3rd Qu.:0.00000 3rd Qu.:1.0000 3rd Qu.:0.2500
+## Max. :0.25000 Max. :1.0000 Max. :1.0000
+##
+## Other hydrated phases Phyllosilicates Chlorite
+## Min. :0.0000 Min. :0.00000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
+## Median :0.2500 Median :0.00000 Median :0.0000
+## Mean :0.4375 Mean :0.09375 Mean :0.0625
+## 3rd Qu.:1.0000 3rd Qu.:0.06250 3rd Qu.:0.0000
+## Max. :1.0000 Max. :0.50000 Max. :0.5000
+##
+## Kaolinite (hydrous Al-clay) Chromite Ilmenite Zircon/Baddeleyite
+## Min. :0.0000 Min. :0.000 Min. :0.000 Min. :0.000
+## 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.000
+## Median :0.0000 Median :0.000 Median :0.000 Median :0.000
+## Mean :0.1875 Mean :0.125 Mean :0.125 Mean :0.125
+## 3rd Qu.:0.0000 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:0.000
+## Max. :1.0000 Max. :1.000 Max. :1.000 Max. :1.000
+##
+## Fe-Mg-clay minerals Spinels sample name
+## Min. :0.0000 Min. :0.0000 Min. : 1.00 Atsah : 1
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.: 4.75 Bearwallow: 1
+## Median :0.0000 Median :0.0000 Median : 8.50 Coulettes : 1
+## Mean :0.1875 Mean :0.0625 Mean : 8.50 Hahonih : 1
+## 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:12.25 Hazeltop : 1
+## Max. :1.0000 Max. :0.5000 Max. :16.00 Kukaklek : 1
+## (Other) :10
+## SampleType campaign abrasion feldspar plagioclase
+## atmospheric: 1 Crater Floor:9 Alfalfa :2 0:14 0:13
+## regolith : 0 Delta Front :7 Bellegarde :2 1: 2 1: 3
+## rock core :15 Margin Unit :0 Berry Hollow:2
+## Dourbes :2
+## Novarupta :2
+## Quartier :2
+## (Other) :4
+## pyroxene olivine quartz apatite FeTi_Oxides Iron_Oxide Sulfate Perchlorates
+## 0: 5 0: 6 0:14 0:13 0:13 0:9 0: 4 0:15
+## 1:11 1:10 1: 2 1: 3 1: 3 1:7 1:12 1: 1
+##
+##
+##
+##
+##
+## Phosphate Ca_Sulfate Carbonate Fe_Mg_clay Fe_Mg_carbonate Mg_sulfate
+## 0:11 0:10 0: 1 0:13 0:14 0:13
+## 1: 5 1: 6 1:15 1: 3 1: 2 1: 3
+##
+##
+##
+##
+##
+## Phyllosilicates Chlorite Halite Organic_matter Hydrated_Ca_Sulfate
+## 0:12 0:14 0:13 0: 5 0:14
+## 1: 4 1: 2 1: 3 1:11 1: 2
+##
+##
+##
+##
+##
+## Hydrated_Sulfates Hydrated_Mg_Fe_Sulfate Na_Perchlorate Amorphous_Silicate
+## 0:14 0:13 0:15 0:9
+## 1: 2 1: 3 1: 1 1:7
+##
+##
+##
+##
+##
+## Hydrated_Carbonates Disordered_Silicates Hydrated_Iron_Oxide
+## 0:16 0:14 0:15
+## 1: 2 1: 1
+##
+##
+##
+##
+##
+## Sulfate+Organic_Matter Other_hydrated_phases Kaolinite Chromite Ilmenite
+## 0:11 0:8 0:13 0:14 0:14
+## 1: 5 1:8 1: 3 1: 2 1: 2
+##
+##
+##
+##
+##
+## Zircon/Baddeleyite Spinels sample Na20 Mgo
+## 0:14 0:14 Min. : 1.00 Min. :1.000 Min. : 0.730
+## 1: 2 1: 2 1st Qu.: 4.75 1st Qu.:1.853 1st Qu.: 2.533
+## Median : 8.50 Median :1.900 Median :12.800
+## Mean : 8.50 Mean :2.672 Mean :11.682
+## 3rd Qu.:12.25 3rd Qu.:4.500 3rd Qu.:19.100
+## Max. :16.00 Max. :5.550 Max. :22.700
+##
+## Al203 Si02 P205 S03
+## Min. : 1.700 Min. :22.60 Min. :0.1000 Min. : 0.780
+## 1st Qu.: 2.220 1st Qu.:31.22 1st Qu.:0.2350 1st Qu.: 1.495
+## Median : 3.710 Median :38.85 Median :0.5250 Median : 2.600
+## Mean : 5.072 Mean :38.55 Mean :0.6512 Mean : 5.562
+## 3rd Qu.: 7.117 3rd Qu.:41.17 3rd Qu.:0.8400 3rd Qu.: 3.800
+## Max. :11.600 Max. :57.10 Max. :2.7600 Max. :21.530
+##
+## Cl K20 Cao Ti02
+## Min. :0.400 Min. :0.0000 Min. :1.500 Min. :0.2000
+## 1st Qu.:0.940 1st Qu.:0.1600 1st Qu.:2.655 1st Qu.:0.5900
+## Median :1.740 Median :0.2000 Median :3.120 Median :0.7000
+## Mean :1.846 Mean :0.5800 Mean :3.688 Mean :0.8194
+## 3rd Qu.:2.080 3rd Qu.:0.8275 3rd Qu.:4.310 3rd Qu.:0.9900
+## Max. :4.500 Max. :1.9000 Max. :7.770 Max. :2.4900
+##
+## Cr203 Mno FeO-T name
+## Min. :0.000 Min. :0.1000 Min. :13.24 Atsah : 1
+## 1st Qu.:0.025 1st Qu.:0.2800 1st Qu.:16.71 Bearwallow: 1
+## Median :0.155 Median :0.4000 Median :23.86 Coulettes : 1
+## Mean :0.355 Mean :0.3812 Mean :21.45 Hahonih : 1
+## 3rd Qu.:0.290 3rd Qu.:0.4900 3rd Qu.:25.70 Hazeltop : 1
+## Max. :1.900 Max. :0.6900 Max. :30.05 Kukaklek : 1
+## (Other) :10
+## type campaign location abrasion
+## Igneous :8 Crater Floor:9 01 : 1 Alfalfa :2
+## N/A :1 Delta Front :7 02 : 1 Bellegrade :2
+## Sedimentary:7 03 : 1 Berry Hollow:2
+## 04 : 1 Dourbes :2
+## 05 : 1 Novarupta :2
+## 06 : 1 Quartier :2
+## (Other):10 (Other) :4
+# Combine the Lithology, SHERLOC and PIXL matrices
+sherloc_lithology_pixl.matrix<-cbind(sherloc.matrix,lithology.matrix,pixl.matrix)
+
+# Review the resulting matrix
+str(sherloc_lithology_pixl.matrix)
+## num [1:16, 1:83] 1 1 1 0 0 0 0 0 0 0 ...
+## - attr(*, "dimnames")=List of 2
+## ..$ : NULL
+## ..$ : chr [1:83] "Plagioclase" "Sulfate" "Ca-sulfate" "Hydrated Ca-sulfate" ...
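Because the combined matrices draw columns from more than one source, the same mineral name can appear twice (the summaries above list `Sulfate` in both the SHERLOC and the lithology columns). The sketch below uses two small hypothetical matrices, `a.matrix` and `b.matrix`, to show one way to detect repeated names after `cbind()` and to tag columns by source. The `sherloc_` and `lith_` prefixes are assumptions chosen for illustration, not names used in the notebook.

```r
# Hypothetical stand-ins for two feature matrices that share a column name
a.matrix <- matrix(c(1, 0, 1, 0.5, 0, 1), nrow = 3,
                   dimnames = list(NULL, c("Sulfate", "Olivine")))
b.matrix <- matrix(c(1, 1, 0, 0, 1, 0), nrow = 3,
                   dimnames = list(NULL, c("Sulfate", "Halite")))

combined <- cbind(a.matrix, b.matrix)

# cbind() keeps both copies of a repeated column name
dup_names <- unique(colnames(combined)[duplicated(colnames(combined))])
print(dup_names)

# One option: tag each column with its source before combining
colnames(a.matrix) <- paste0("sherloc_", colnames(a.matrix))
colnames(b.matrix) <- paste0("lith_", colnames(b.matrix))
combined_tagged <- cbind(a.matrix, b.matrix)
print(colnames(combined_tagged))
```

Unique names make it easier to tell the two sources apart later, for example when reading PCA loadings.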
+Each team has been assigned one of the following datasets:
+Dataset B: PIXL: The PIXL team’s goal is to understand and +explain how scaling changes results from Assignment 1. The matrix +version was scaled above but not in Assignment 1.
Dataset C: LIBS (with appropriate scaling as necessary. Not +scaled yet.)
Dataset D: Sherloc (with appropriate scaling as necessary. Not +scaled yet.)
Dataset E: PIXL + Sherloc (with appropriate scaling as necessary. +Not scaled yet.)
Dataset F: PIXL + Lithology (with appropriate scaling as necessary. Not scaled yet.)
Dataset G: Sherloc + Lithology (with appropriate scaling as necessary. Not scaled yet.)
Dataset H: PIXL + Sherloc + Lithology (with appropriate scaling as necessary. Not scaled yet.)
For the data set assigned to your team, perform the following steps. Feel free to use the methods/code from Assignment 1 as desired. Communicate with your teammates. Make sure that you are doing different variations of the analysis below so that no two team members do the exact same analysis. If you want to use the same clustering for your team (which is okay, but then vary the rest), make sure you use the same random seeds.
+summary(sherloc_lithology.df)
+## sample type campaign abrasion
+## Min. : 1.00 Igneous :8 Crater Floor:9 Alfalfa :2
+## 1st Qu.: 4.75 N/A :1 Delta Front :7 Bellegrade :2
+## Median : 8.50 Sedimentary:7 Berry Hollow:2
+## Mean : 8.50 Dourbes :2
+## 3rd Qu.:12.25 Novarupta :2
+## Max. :16.00 Quartier :2
+## (Other) :4
+## Name Plagioclase Sulfate Ca-sulfate
+## Atsah : 1 Min. :0.0000 Min. :0.0000 Min. :0.0000
+## Bearwallow: 1 1st Qu.:0.0000 1st Qu.:0.1875 1st Qu.:0.0000
+## Coulettes : 1 Median :0.0000 Median :1.0000 Median :0.0000
+## Hahonih : 1 Mean :0.1875 Mean :0.6562 Mean :0.3438
+## Hazeltop : 1 3rd Qu.:0.0000 3rd Qu.:1.0000 3rd Qu.:1.0000
+## Kukaklek : 1 Max. :1.0000 Max. :1.0000 Max. :1.0000
+## (Other) :10
+## Hydrated Ca-sulfate Mg-sulfate Hydrated Sulfates Hydrated Mg-Fe sulfate
+## Min. :0.000 Min. :0.0000 Min. :0.000 Min. :0.0000
+## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.0000
+## Median :0.000 Median :0.0000 Median :0.000 Median :0.0000
+## Mean :0.125 Mean :0.1875 Mean :0.125 Mean :0.1875
+## 3rd Qu.:0.000 3rd Qu.:0.0000 3rd Qu.:0.000 3rd Qu.:0.0000
+## Max. :1.000 Max. :1.0000 Max. :1.000 Max. :1.0000
+##
+## Perchlorates Na-perchlorate Amorphous Silicate Phosphate
+## Min. :0.0000 Min. :0.00000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.0000 Median :0.00000 Median :0.0000 Median :0.0000
+## Mean :0.0625 Mean :0.03125 Mean :0.1406 Mean :0.2031
+## 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.2500 3rd Qu.:0.3125
+## Max. :1.0000 Max. :0.50000 Max. :0.5000 Max. :1.0000
+##
+## Pyroxene Olivine Carbonate Fe-Mg carbonate
+## Min. :0.0000 Min. :0.0000 Min. :0.0000 Min. :0.000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.4375 1st Qu.:0.000
+## Median :1.0000 Median :0.6250 Median :1.0000 Median :0.000
+## Mean :0.6875 Mean :0.5312 Mean :0.7344 Mean :0.125
+## 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:1.0000 3rd Qu.:0.000
+## Max. :1.0000 Max. :1.0000 Max. :1.0000 Max. :1.000
+##
+## Hydrated Carbonates Disordered Silicates Feldspar Quartz
+## Min. :0 Min. :0.000 Min. :0.000 Min. :0.00000
+## 1st Qu.:0 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.00000
+## Median :0 Median :0.000 Median :0.000 Median :0.00000
+## Mean :0 Mean :0.125 Mean :0.125 Mean :0.03125
+## 3rd Qu.:0 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:0.00000
+## Max. :0 Max. :1.000 Max. :1.000 Max. :0.25000
+##
+## Apatite FeTi oxides Halite Iron oxide
+## Min. :0.0000 Min. :0.0000 Min. :0.00000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
+## Median :0.0000 Median :0.0000 Median :0.00000 Median :0.0000
+## Mean :0.1406 Mean :0.1406 Mean :0.04688 Mean :0.2812
+## 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:0.00000 3rd Qu.:0.5000
+## Max. :1.0000 Max. :1.0000 Max. :0.25000 Max. :1.0000
+##
+## Hydrated Iron oxide Organic matter Sulfate+Organic matter
+## Min. :0.00000 Min. :0.0000 Min. :0.0000
+## 1st Qu.:0.00000 1st Qu.:0.0000 1st Qu.:0.0000
+## Median :0.00000 Median :1.0000 Median :0.0000
+## Mean :0.01562 Mean :0.5938 Mean :0.2188
+## 3rd Qu.:0.00000 3rd Qu.:1.0000 3rd Qu.:0.2500
+## Max. :0.25000 Max. :1.0000 Max. :1.0000
+##
+## Other hydrated phases Phyllosilicates Chlorite
+## Min. :0.0000 Min. :0.00000 Min. :0.0000
+## 1st Qu.:0.0000 1st Qu.:0.00000 1st Qu.:0.0000
+## Median :0.2500 Median :0.00000 Median :0.0000
+## Mean :0.4375 Mean :0.09375 Mean :0.0625
+## 3rd Qu.:1.0000 3rd Qu.:0.06250 3rd Qu.:0.0000
+## Max. :1.0000 Max. :0.50000 Max. :0.5000
+##
+## Kaolinite (hydrous Al-clay) Chromite Ilmenite Zircon/Baddeleyite
+## Min. :0.0000 Min. :0.000 Min. :0.000 Min. :0.000
+## 1st Qu.:0.0000 1st Qu.:0.000 1st Qu.:0.000 1st Qu.:0.000
+## Median :0.0000 Median :0.000 Median :0.000 Median :0.000
+## Mean :0.1875 Mean :0.125 Mean :0.125 Mean :0.125
+## 3rd Qu.:0.0000 3rd Qu.:0.000 3rd Qu.:0.000 3rd Qu.:0.000
+## Max. :1.0000 Max. :1.000 Max. :1.000 Max. :1.000
+##
+## Fe-Mg-clay minerals Spinels sample name
+## Min. :0.0000 Min. :0.0000 Min. : 1.00 Atsah : 1
+## 1st Qu.:0.0000 1st Qu.:0.0000 1st Qu.: 4.75 Bearwallow: 1
+## Median :0.0000 Median :0.0000 Median : 8.50 Coulettes : 1
+## Mean :0.1875 Mean :0.0625 Mean : 8.50 Hahonih : 1
+## 3rd Qu.:0.0000 3rd Qu.:0.0000 3rd Qu.:12.25 Hazeltop : 1
+## Max. :1.0000 Max. :0.5000 Max. :16.00 Kukaklek : 1
+## (Other) :10
+## SampleType campaign abrasion feldspar plagioclase
+## atmospheric: 1 Crater Floor:9 Alfalfa :2 0:14 0:13
+## regolith : 0 Delta Front :7 Bellegarde :2 1: 2 1: 3
+## rock core :15 Margin Unit :0 Berry Hollow:2
+## Dourbes :2
+## Novarupta :2
+## Quartier :2
+## (Other) :4
+## pyroxene olivine quartz apatite FeTi_Oxides Iron_Oxide Sulfate Perchlorates
+## 0: 5 0: 6 0:14 0:13 0:13 0:9 0: 4 0:15
+## 1:11 1:10 1: 2 1: 3 1: 3 1:7 1:12 1: 1
+##
+##
+##
+##
+##
+## Phosphate Ca_Sulfate Carbonate Fe_Mg_clay Fe_Mg_carbonate Mg_sulfate
+## 0:11 0:10 0: 1 0:13 0:14 0:13
+## 1: 5 1: 6 1:15 1: 3 1: 2 1: 3
+##
+##
+##
+##
+##
+## Phyllosilicates Chlorite Halite Organic_matter Hydrated_Ca_Sulfate
+## 0:12 0:14 0:13 0: 5 0:14
+## 1: 4 1: 2 1: 3 1:11 1: 2
+##
+##
+##
+##
+##
+## Hydrated_Sulfates Hydrated_Mg_Fe_Sulfate Na_Perchlorate Amorphous_Silicate
+## 0:14 0:13 0:15 0:9
+## 1: 2 1: 3 1: 1 1:7
+##
+##
+##
+##
+##
+## Hydrated_Carbonates Disordered_Silicates Hydrated_Iron_Oxide
+## 0:16 0:14 0:15
+## 1: 2 1: 1
+##
+##
+##
+##
+##
+## Sulfate+Organic_Matter Other_hydrated_phases Kaolinite Chromite Ilmenite
+## 0:11 0:8 0:13 0:14 0:14
+## 1: 5 1:8 1: 3 1: 2 1: 2
+##
+##
+##
+##
+##
+## Zircon/Baddeleyite Spinels
+## 0:14 0:14
+## 1: 2 1: 2
+##
+##
+##
+##
+##
+str(sherloc_lithology.matrix)
+## num [1:16, 1:70] 1 1 1 0 0 0 0 0 0 0 ...
+## - attr(*, "dimnames")=List of 2
+## ..$ : NULL
+## ..$ : chr [1:70] "Plagioclase" "Sulfate" "Ca-sulfate" "Hydrated Ca-sulfate" ...
+Scale this data appropriately (you can choose the scaling method or decide to not scale data): Explain why you chose a scaling method or to not scale. (3 pts) Since the dataset contains both numerical and categorical features, I decided not to scale the data, because scaling categorical variables can result in meaningless transformations.
Cluster the data using k-means or your favorite clustering method (like hierarchical clustering): Describe how you picked the best number of clusters. Indicate the number of points in each cluster. Coordinate with your team so you try different approaches. If you want to share results with your teammates, make sure to use the same random seeds. (6 pts) After drawing the WSS plot of the matrix, I used the elbow method to decide the best number of clusters. In this case, the elbow appears to form around 3 or 4 clusters. The sharp drop in the within-cluster sum of squares (WSS) from 2 clusters to 3 clusters suggests that adding a third cluster significantly improves the clustering quality. Therefore, I chose 3 clusters. The first cluster has 7 points, the second cluster has 2 points and the third cluster has 7 points.
Clustering
+# A user-defined function to examine clusters and plot the results
+wssplot <- function(data, nc=15, seed=10){
+ wss <- data.frame(cluster=1:nc, quality=c(0))
+ for (i in 1:nc){
+ set.seed(seed)
+ wss[i,2] <- kmeans(data, centers=i)$tot.withinss}
+ ggplot(data=wss,aes(x=cluster,y=quality)) +
+ geom_line() +
+ ggtitle("Quality of k-means by Cluster")
+}
+
+wssplot(sherloc_lithology.matrix, nc=8, seed=2)
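Reading an elbow off a plot is a judgment call, so it can help to print the numbers behind it. The sketch below is illustrative only: it runs on hypothetical toy data (`toy`) rather than the notebook's matrix, and the helper `wss_by_k` and the choice of `nstart = 10` are assumptions, not part of the assignment code. It tabulates the total within-cluster sum of squares for each k together with the improvement gained by adding one more cluster.

```r
# Hypothetical toy data: three well-separated groups in two dimensions
set.seed(2)
toy <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2)
)

# Total within-cluster sum of squares for k = 1..max_k
wss_by_k <- function(data, max_k = 8, seed = 2) {
  sapply(seq_len(max_k), function(k) {
    set.seed(seed)  # same seed for every k, as in wssplot()
    kmeans(data, centers = k, nstart = 10)$tot.withinss
  })
}

wss <- wss_by_k(toy, max_k = 6)
drops <- -diff(wss)  # improvement gained by adding one more cluster

elbow.df <- data.frame(k = 2:6, wss = wss[-1], drop = drops)
print(elbow.df)
```

A large drop followed by much smaller ones is the numeric counterpart of the visual elbow. On the real matrix the pattern may be less clear-cut than on this toy example.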
+Create final clusters
+# Use our chosen 'k' to perform k-means clustering
+set.seed(2)
+k <- 3
+km <- kmeans(sherloc_lithology.matrix,k)
+First, we keep the original scale (no scaling) and create a heat map of the cluster centers with rows and columns clustered.
+pheatmap(km$centers,scale="none")
+
+According to this heat map, the features on the left have higher means than
+the features on the right. Moreover, cluster 2 has higher means overall than
+cluster 1 and cluster 3.
Next, indicate how many samples are in each cluster.
+library(knitr)
+# cluster sizes are in the km object produced by kmeans
+clusters.df<-data.frame(cluster= 1:3, size=km$size)
+kable(clusters.df,caption="Samples per cluster")
+cluster | size
---|---
1 | 7
2 | 2
3 | 7
# Perform the PCA on the matrix we created earlier
+
+sherloc_lithology.pca <- prcomp(sherloc_lithology.matrix, scale=FALSE)
+
+# generate the Scree plot
+ggscreeplot(sherloc_lithology.pca)
+Creative analysis:
+summary(sherloc_lithology.pca)
+## Importance of components:
+## PC1 PC2 PC3 PC4 PC5 PC6 PC7
+## Standard deviation 1.6370 1.5095 1.3427 1.1118 0.88099 0.70792 0.52226
+## Proportion of Variance 0.2793 0.2375 0.1879 0.1289 0.08091 0.05224 0.02843
+## Cumulative Proportion 0.2793 0.5169 0.7048 0.8337 0.91458 0.96682 0.99526
+## PC8 PC9 PC10 PC11 PC12 PC13
+## Standard deviation 0.21330 2.894e-16 1.79e-16 1.175e-16 9.92e-17 6.478e-17
+## Proportion of Variance 0.00474 0.000e+00 0.00e+00 0.000e+00 0.00e+00 0.000e+00
+## Cumulative Proportion 1.00000 1.000e+00 1.00e+00 1.000e+00 1.00e+00 1.000e+00
+## PC14 PC15 PC16
+## Standard deviation 4.869e-17 1.972e-17 2.379e-18
+## Proportion of Variance 0.000e+00 0.000e+00 0.000e+00
+## Cumulative Proportion 1.000e+00 1.000e+00 1.000e+00
+# Plotting the variance explained by each principal component
+plot(sherloc_lithology.pca, type="l")
+
+Together, the first three components explain about 70.48% of the
+variance. This means that the first three components might provide
+a good approximation of the dataset, significantly reducing its
+dimensionality while retaining most of the important information.
+Elbow rule: the plot shows a clear “elbow” around the 3rd or 4th component.
+After this point, the variance explained by each additional component
+drops off significantly. Based on this scree plot, the first 3-4
+principal components should be considered for further analysis, as they
+explain the majority of the variance in the dataset (about 83% for the
+first four). After the 4th component, the additional variance explained
+becomes small.
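The proportions in the `summary()` table follow directly from the component standard deviations: each component's share is its variance divided by the total variance. As a small check, the sketch below copies the eight non-negligible standard deviations from the output above (PC9 onward are numerically zero) and recomputes the cumulative proportion. The 0.70 cutoff and the variable names are assumptions chosen for illustration.

```r
# Standard deviations of PC1..PC8, copied from the summary() output above
sdev <- c(1.6370, 1.5095, 1.3427, 1.1118, 0.88099, 0.70792, 0.52226, 0.21330)

# Share of total variance per component, and the running total
var_explained <- sdev^2 / sum(sdev^2)
cum_var <- cumsum(var_explained)

# Smallest number of components reaching an (assumed) 70% cutoff
n_needed <- which(cum_var >= 0.70)[1]

print(round(data.frame(PC = seq_along(sdev),
                       proportion = var_explained,
                       cumulative = cum_var), 4))
print(n_needed)
```

With a fitted `prcomp` object, the same calculation can be done on its `sdev` element instead of hand-copied values.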
loadings <- abs(sherloc_lithology.pca$rotation[, 1:3])
+print(loadings)
+## PC1 PC2 PC3
+## Plagioclase 0.064336916 0.253285711 0.034736225
+## Sulfate 0.175146396 0.057235074 0.031239837
+## Ca-sulfate 0.232159520 0.167134575 0.064154734
+## Hydrated Ca-sulfate 0.042131612 0.170024387 0.021970393
+## Mg-sulfate 0.200670557 0.107436782 0.114769312
+## Hydrated Sulfates 0.046640374 0.046533930 0.004709232
+## Hydrated Mg-Fe sulfate 0.200670557 0.107436782 0.114769312
+## Perchlorates 0.022205304 0.083261324 0.012765831
+## Na-perchlorate 0.011102652 0.041630662 0.006382916
+## Amorphous Silicate 0.017782440 0.039123645 0.003335076
+## Phosphate 0.050067834 0.156937069 0.139435041
+## Pyroxene 0.205440349 0.175242080 0.113777068
+## Olivine 0.199073864 0.131670868 0.156241523
+## Carbonate 0.128752482 0.006768303 0.124116776
+## Fe-Mg carbonate 0.090812262 0.012522980 0.038558702
+## Hydrated Carbonates 0.000000000 0.000000000 0.000000000
+## Disordered Silicates 0.087911478 0.018904080 0.101611247
+## Feldspar 0.087911478 0.018904080 0.101611247
+## Quartz 0.021977870 0.004726020 0.025402812
+## Apatite 0.010321118 0.046989967 0.231737838
+## FeTi oxides 0.047682938 0.190839718 0.025161851
+## Halite 0.017151739 0.023946076 0.006448218
+## Iron oxide 0.002321888 0.246964416 0.025709074
+## Hydrated Iron oxide 0.005551326 0.020815331 0.003191458
+## Organic matter 0.022898001 0.214954717 0.167587308
+## Sulfate+Organic matter 0.020247980 0.106056547 0.030089121
+## Other hydrated phases 0.225821526 0.068591591 0.097707669
+## Phyllosilicates 0.066658805 0.006321295 0.060445299
+## Chlorite 0.043955739 0.009452040 0.050805623
+## Kaolinite (hydrous Al-clay) 0.200670557 0.107436782 0.114769312
+## Chromite 0.004769792 0.067805298 0.228546380
+## Ilmenite 0.004769792 0.067805298 0.228546380
+## Zircon/Baddeleyite 0.004769792 0.067805298 0.228546380
+## Fe-Mg-clay minerals 0.200670557 0.107436782 0.114769312
+## Spinels 0.002384896 0.033902649 0.114273190
+## feldspar 0.087911478 0.018904080 0.101611247
+## plagioclase 0.064336916 0.253285711 0.034736225
+## pyroxene 0.205440349 0.175242080 0.113777068
+## olivine 0.265007473 0.145848928 0.080033088
+## quartz 0.087911478 0.018904080 0.101611247
+## apatite 0.026975096 0.015456026 0.241312212
+## FeTi_Oxides 0.064336916 0.253285711 0.034736225
+## Iron_Oxide 0.114386824 0.246904610 0.105433723
+## Sulfate 0.178723740 0.006381100 0.140169948
+## Perchlorates 0.022205304 0.083261324 0.012765831
+## Phosphate 0.069106708 0.185480413 0.263282605
+## Ca_Sulfate 0.265007473 0.145848928 0.080033088
+## Carbonate 0.022205304 0.083261324 0.012765831
+## Fe_Mg_clay 0.200670557 0.107436782 0.114769312
+## Fe_Mg_carbonate 0.090812262 0.012522980 0.038558702
+## Mg_sulfate 0.200670557 0.107436782 0.114769312
+## Phyllosilicates 0.178723740 0.006381100 0.140169948
+## Chlorite 0.087911478 0.018904080 0.101611247
+## Halite 0.068606958 0.095784304 0.025792870
+## Organic_matter 0.026475345 0.265808691 0.003822477
+## Hydrated_Ca_Sulfate 0.042131612 0.170024387 0.021970393
+## Hydrated_Sulfates 0.046640374 0.046533930 0.004709232
+## Hydrated_Mg_Fe_Sulfate 0.200670557 0.107436782 0.114769312
+## Na_Perchlorate 0.022205304 0.083261324 0.012765831
+## Amorphous_Silicate 0.026716609 0.181623180 0.026392880
+## Hydrated_Carbonates 0.000000000 0.000000000 0.000000000
+## Disordered_Silicates 0.087911478 0.018904080 0.101611247
+## Hydrated_Iron_Oxide 0.022205304 0.083261324 0.012765831
+## Sulfate+Organic_Matter 0.023825324 0.156910520 0.201498906
+## Other_hydrated_phases 0.269777265 0.078043631 0.148513293
+## Kaolinite 0.200670557 0.107436782 0.114769312
+## Chromite 0.004769792 0.067805298 0.228546380
+## Ilmenite 0.004769792 0.067805298 0.228546380
+## Zircon/Baddeleyite 0.004769792 0.067805298 0.228546380
+## Spinels 0.004769792 0.067805298 0.228546380
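+# Keep only variables whose absolute loading on PC1, PC2 or PC3 exceeds the threshold
+threshold <- 0.2
+important_vars <- apply(loadings, 1, function(x) any(x > threshold))
+# Copy the PCA object and drop the rotation rows of the other variables,
+# so the biplot draws arrows only for the retained variables
+sherloc_lithology_filtered <- sherloc_lithology.pca
+sherloc_lithology_filtered$rotation <- sherloc_lithology_filtered$rotation[important_vars, ]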
+The loadings show how much each variable contributes to the principal components. I filtered the variables to keep only those whose absolute loading on at least one of the first three components is greater than 0.2. The variables with the largest positive or negative loadings matter most to each component. Based on the result, SHERLOC measurements dominate PC1 and lithology measurements dominate PC2, so we could infer that PC1 captures the primary chemical or spectroscopic composition of the Martian surface, while PC2 relates to the geological context.
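To make the loading-based filter easier to follow, here is a minimal sketch on a stand-in dataset. It assumes the built-in `USArrests` data (which has only four variables) in place of the notebook's matrix, and scales it because its columns use different units. The names `abs_loadings`, `pc1_ranked` and `keep` are hypothetical. It ranks variables by the size of their PC1 loading and then applies the same "any of PC1-PC3 above the threshold" rule.

```r
# USArrests ships with base R; it stands in for the notebook's matrix here
pca <- prcomp(USArrests, scale. = TRUE)
abs_loadings <- abs(pca$rotation[, 1:3])

# Variables ordered by the size of their PC1 loading, largest first
pc1_ranked <- names(sort(abs_loadings[, "PC1"], decreasing = TRUE))

# Same rule as the notebook: keep a variable if any of PC1-PC3 exceeds the threshold
threshold <- 0.2
keep <- apply(abs_loadings, 1, function(x) any(x > threshold))

print(pc1_ranked)
print(names(keep)[keep])
```

Each column of the rotation matrix has unit length, so with p variables an evenly spread loading is about 1/sqrt(p). A fixed cutoff such as 0.2 therefore means different things for 4 variables than for 70, which is worth keeping in mind when choosing it.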
+Create a PCA biplot to visualize the clusters
+sherloc_lithology.df$cluster <- factor(km$cluster,
+ levels = c(1, 2, 3),
+ labels = c("Cluster 1", "Cluster 2", "Cluster 3"))
+# For this lab we'll create a PCA biplot the easy way using ggbiplot!
+ggbiplot::ggbiplot(sherloc_lithology_filtered,
+ choices = c(1, 2),
+ labels = sherloc_lithology.df$type,
+ groups = sherloc_lithology.df$cluster) +
+ xlim(-2, 2) + ylim(-2, 2) +
+ ggtitle("Filtered Biplot: PC1 vs PC2")
+
+Iron Oxide, Amorphous Silicate, Pyroxene, Olivine, Hydrated Sulfates:
+These variables contribute the most to PC1, suggesting that PC1
+continues to represent a contrast between igneous materials (e.g.,
+pyroxene, olivine) and sedimentary or hydrated minerals (e.g., hydrated
+sulfates). Moreover, these variables group closely together, which
+indicates a common geological origin. This likely corresponds to igneous
+regions on Mars.
+Phyllosilicates and Iron Oxide: These variables are
+significant contributors to PC2, meaning PC2 is likely to represent
+variations within sedimentary environments that separate different
+types of hydrated minerals and phyllosilicates. This could reflect
+differences in the extent or type of chemical alteration due to
+water.
Make another PCA biplot to compare PC1 and PC3.
+ggbiplot::ggbiplot(sherloc_lithology_filtered,
+ choices = c(1, 3),
+ labels = sherloc_lithology.df$type,
+ groups = sherloc_lithology.df$cluster) +
+ xlim(-2, 2) + ylim(-2, 2) +
+ ggtitle("Filtered Biplot: PC1 vs PC3")
+## Warning: Removed 2 rows containing missing values or values outside the scale range
+## (`geom_text()`).
+
+Organic Matter and Phosphate: These variables are much more important in
+PC3 than PC2, suggesting that PC3 may capture some trend related to the
+presence of organic compounds or chemical processes that differ from
+those driving the first two PCs, distinguishing areas where organic
+compounds or specific chemical interactions are more prevalent.
Prepare a presentation of your team's results to present in class on September 11 starting at 9am in AE217 (20 pts). The presentation should include the following elements:
+0. Your team's name and members
1. A description of the data set that you analyzed, including how many observations and how many features (<= 1.5 mins)
2. Each team member gets three minutes to explain their analysis:
   * what analysis they performed
   * the results of that analysis
   * a brief discussion of their interpretation of these results
   * <= 18 mins total!
3. A conclusion slide indicating major findings of the team (<= 1.5 mins)
4. Thoughts on potential next steps for the MARS team (<= 1.5 mins)
+A template for your team presentation is included here: https://bit.ly/dar-template-f24
The rubric for the presentation is here:
https://docs.google.com/document/d/1-4o1O4h2r8aMjAplmE-ItblQnyDAKZwNs5XCnmwacjs/pub
+When you are satisfied with your edits and your notebook knits +successfully, remember to push your changes to the repo using the +following steps:
+git branch
+git add <your changed files>
git commit -m "Some useful comments"
git push origin <your branch name>
The IDEA Cluster provides seven compute nodes (4x 48 cores, 3x 80 +cores, 1x storage server)
+erickj4@rpi.edu