diff --git a/AppRelated/CampaignQuestionFolder/CampaignQuestionNotebookAshton.Rmd b/AppRelated/CampaignQuestionFolder/CampaignQuestionNotebookAshton.Rmd index 379a6e0..2ff1620 100644 --- a/AppRelated/CampaignQuestionFolder/CampaignQuestionNotebookAshton.Rmd +++ b/AppRelated/CampaignQuestionFolder/CampaignQuestionNotebookAshton.Rmd @@ -3,38 +3,59 @@ title: "DAR F24 How do Campaigns Compare?" author: "Ashton Compton" date: "`r Sys.Date()`" output: - pdf_document: - toc: yes html_document: toc: yes + pdf_document: + toc: yes subtitle: "DAR Project Name: Mars" --- Note - This notebook is meant as a draft for a page in the Mars Mission Minder 2D app addressing the topic of campaigns. Question - "What is the difference between samples from the two campaigns Delta Front and Crater Floor?" -Every sample from the Perserverance Rover is assigned a campaign. In this analysis, the two campaigns were Delta Front and Crater Floor. +Every sample from the Perseverance Rover is assigned a campaign. For this analysis, the two accessible campaigns were Delta Front and Crater Floor. Crater Floor is where Perseverance landed, and is a flat open plain. Next the rover traveled to a Delta Fan structure and began climbing it. Samples taken at the beginning of this journey belong to Delta Front. + +Title: Map of Samples Annotating the Two Campaigns + +![campaigns Map](~/DAR-Mars-F24/AppRelated/CampaignQuestionFolder/Pics/marsmap.png) -Title: Samples on map highlighted by campaign -![Mars Map](~/DAR-Mars-F24/AppRelated/CampaignQuestionFolder/Pics/marsSamples.png) -Description: A map of the Martian surface showing each sample site colored by campaign. Samples colored orange are apart of Crater Floor campaign, while samples colored blue are apart of Delta Front campaign. +Description: Map showing the first 16 Perserverance Samples with the Crater Floor and Delta Front regions annotated. 
-Title: Feature Count in Sherloc seperated by Campaign -![Sherloc Count](~/DAR-Mars-F24/AppRelated/CampaignQuestionFolder/Pics/sherlocCount.png) -Description: Sherloc was grouped by campaign and the total value of each feature across all samples was summed up and displayed in the above plot. If a feature had a value of zero for every sample, it was not included in the plot. -Title: Sherloc Feature Distribution by Campaign -![Sherloc Boxplot](~/DAR-Mars-F24/AppRelated/CampaignQuestionFolder/Pics/sherlocBox.png) -Description: Sherloc Features distribution shown with a series a boxplots, seperated by campaign. +Title: Counting Mineral Occurrence across Samples separated by Campaign + +![Lithology Count](~/DAR-Mars-F24/AppRelated/CampaignQuestionFolder/Pics/sherlocCount.png) +Description: Sherloc was grouped by campaign and the total value of each feature across all samples was summed up and displayed in the above plot. If a feature had a value of zero for every sample, it was not included in the plot. Title: Pixl Feature Distribution by Campaign -![pixl distribution](~/DAR-Mars-F24/AppRelated/CampaignQuestionFolder/Pics/pixlDistributionbyCampaign.png) -Description: Box plots showing the distribution of each feature in Pixl, seperated by Campaign. + +![pixl distribution](~/DAR-Mars-F24/AppRelated/CampaignQuestionFolder/Pics/pixlDis.png) +Description: Box plots showing the distribution of each feature in Pixl, seperated by Campaign. Note: Sort low to high Title: Libs clustering with Pixl data seperated by campaign on Ternairy Diagram -![LibsandPixl](~/DAR-Mars-F24/AppRelated/CampaignQuestionFolder/Pics/LibsandPixlTern.png) -Description: Libs data was clustered and ternairy plotted along with Pixl data separated by campaign. The Pixl data points are labelled by name as well. +![LibsandPixl](~/DAR-Mars-F24/AppRelated/CampaignQuestionFolder/Pics/tern.png) +Description: Libs data was clustered and ternairy plotted along with Pixl data separated by campaign. 
The Pixl data points are labelled by name as well. Note-Use the common clusters and use the non-aggregate plot + +Conclusions: + +In Sherloc, certain minerals are abundant in both campaigns, especially Crater Floor. + -Carbonate is common in both campaigns + -Organic Matter is also common in both campaigns + -Sulfate and Olivine are also common in both + +High in Crater Floor: + -Pyroxene and amorphous silicate are abundant in Crater Floor but sparse in Delta Front + +Fe_Mg_Clay, Hydrated_Mg_Fe_sulfate, Kaolinite, and Mg_sulfates are in 3 samples in Delta Front, but +not at all in Crater Floor. + +There are 20 minerals that are exclusively in either Crater Floor or Delta Front. + +4 minerals have a count of zero, meaning they weren’t detected in any campaign (Perchlorates, Na_Perchlorate, Hydrated_Carbonates, & Hydrated_Iron_Oxide). These minerals are present in the atmospheric sample, which is absent in this analysis. + +The pixl graph reveals some big differences between Crater Floor and Delta Front. 
Namely in Al2O3, CaO, +Cr2O3, MgO, P2O5, SO3, & SiO2 CODE: Load libaries @@ -109,15 +130,17 @@ Prepare data #Load in data ### # Load the saved lithology data with locations added -lithology.df<- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/mineral_data_static.Rds") +#lithology.df<- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/mineral_data_static.Rds") +meta.df<- readRDS("~/DAR-Mars-F24/StudentData/v1_sample_meta.Rds") -#lithology.df<- readRDS("~/DAR-Mars-F24/StudentData/v1_lithology.Rds") -pixl_pos.df<- readRDS("~/DAR-Mars-F24/StudentData/pixl_sol_coordinates.Rds") -#Remove atmospheric sample -pixl_pos.df <- pixl_pos.df[2:16,] +#Select 2:16 +meta.df <- meta.df[2:16,] + +#Read in v1 lithology +lithology.df<- readRDS("~/DAR-Mars-F24/StudentData/v1_lithology.Rds") # Cast samples as numbers -lithology.df$sample <- as.numeric(lithology.df$sample) +lithology.df$Sample <- as.numeric(lithology.df$Sample) # Convert rest into factors lithology.df[sapply(lithology.df, is.character)] <- @@ -127,12 +150,14 @@ lithology.df[sapply(lithology.df, is.character)] <- # Keep only first 16 samples because the data for the rest of the samples is not available yet #Also i'm getting rid of the atmospheric sample for now lithology.df<-lithology.df[2:16,] -# Create a matrix containing only the numeric measurements. The remaining features are metadata about the sample. 
-lithology.matrix <- sapply(lithology.df[,6:40],as.numeric)-1 +lithology.df$campaign <- meta.df$Campaign ### +#Used for map +pixl_pos.df<- meta.df %>% select(Lat, Lon) + # Load the saved PIXL data with locations added -pixl.df <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/samples_pixl_wide.Rds") +pixl.df <- readRDS("~/DAR-Mars-F24/StudentData/v1_pixl.Rds") # Convert to factors pixl.df[sapply(pixl.df, is.character)] <- lapply(pixl.df[sapply(pixl.df, is.character)], @@ -141,23 +166,10 @@ pixl.df[sapply(pixl.df, is.character)] <- lapply(pixl.df[sapply(pixl.df, is.char #Get rid of atmospheric sample pixl.df <- pixl.df[2:16,] -# Make the matrix of just mineral percentage measurements -pixl.matrix <- pixl.df[,2:14] - -### -# Load the saved LIBS data with locations added -libs.df <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/supercam_libs_moc_loc.Rds") - -#Drop features that are not to be used in the analysis for this notebook -libs.df <- libs.df %>% - select(!(c(distance_mm,Tot.Em.,SiO2_stdev,TiO2_stdev,Al2O3_stdev,FeOT_stdev, - MgO_stdev,Na2O_stdev,CaO_stdev,K2O_stdev,Total))) - -# Convert the points to numeric -libs.df$point <- as.numeric(libs.df$point) - -# Make the a matrix contain only the libs measurements for each mineral -libs.matrix <- as.matrix(libs.df[,6:13]) +#Add campaign +pixl.df$campaign <- meta.df$Campaign +# # Make the matrix of just mineral percentage measurements +# pixl.matrix <- pixl.df[,2:14] ### # Read in data as provided. 
@@ -188,7 +200,7 @@ sherloc.matrix <- sherloc.matrix[2:16,] # Get sample information from PIXL and add to measurements -- assumes order is the same -sherloc.df <- cbind(pixl.df[,c("sample","type","campaign","abrasion")],sherloc.matrix) +sherloc.df <- cbind(pixl.df[,c("Sample")],sherloc.matrix) # Measurements are everything except first column sherloc.matrix<-as.matrix(sherloc.matrix[,-1]) @@ -211,20 +223,20 @@ set.seed Used as base for map picture ```{r} #Produce sample map with campaign differences -pixl_pos.df$campaign <- pixl.df$campaign -ggplot(pixl_pos.df, aes(x= Lat, y= Long, color=campaign, label=sample, size=1)) + - geom_point() + - theme_classic() +# pixl_pos.df$Campaign <- meta.df$Campaign +# ggplot(pixl_pos.df, aes(x= Lat, y= Lon, color=Campaign, size=1)) + +# geom_point() + +# theme_classic() ``` Make interactive plotly plot for Lithology ```{r, result02_data} # Include all data processing code (if necessary), clearly commented #Start with lithology #Group by campaign & remove metadata -lithology.df.sorted <- lithology.df %>% group_by(campaign) %>% select(-c(sample,name,SampleType,abrasion)) +lithology.df.sorted <- lithology.df %>% group_by(campaign) %>% select(-c(Sample)) #Turn into long form and only keep positive cases -lithology.df.sorted <- lithology.df.sorted %>% pivot_longer(2:ncol(lithology.df.sorted),names_to = "Feature", values_to="Factor") %>% filter(Factor == 1) +lithology.df.sorted <- lithology.df.sorted %>% pivot_longer(2:ncol(lithology.df.sorted)-1,names_to = "Feature", values_to="Factor") %>% filter(Factor == 1) #Count # of identical cases lithology.df.sorted <- lithology.df.sorted %>% count(Feature) @@ -238,71 +250,53 @@ p <- ggplot(lithology.df.sorted, aes(x=factor(Feature, levels = (Feature %>% uni geom_col(position=position_dodge(preserve="total"), width=0.6) + theme(panel.grid.major.x=element_blank(), axis.text.x = element_text(angle = 60, vjust = 1.0, hjust=1, size = 12)) + labs(x="", y="Count") + - ggtitle("Lithology Features 
Count by Campaign") + - scale_fill_paletteer_d(palette = "fishualize::Cephalopholis_argus") + ggtitle("Sherloc Dataset, Total Mineral Occurance across Samples") + + scale_fill_manual(values=c('#d6001c','#54585a')) ggplotly(p, tooltip = c("campaign",'x', "n")) #Commented out to knit to pdf, picture at top of report ``` Create similar plot for Sherloc ```{r} -#Repeat for sherloc -#Group by campaign & remove metadata -sherloc.df.sorted <- sherloc.df %>% group_by(campaign) %>% select(-c(sample,Name,type,abrasion)) - -#Turn into long form and only keep positive cases -sherloc.df.sorted <- sherloc.df.sorted %>% pivot_longer(2:ncol(sherloc.df.sorted),names_to = "Feature", values_to="Factor") %>% filter(Factor == 1) - -#Count # of identical cases -sherloc.df.sorted <- sherloc.df.sorted %>% count(Feature) - -#Sort, Crater Floor is High to low & Delta Front is added back in low to high -sherloc.df.sorted <- sherloc.df.sorted %>% filter(campaign == "Crater Floor") %>% arrange(desc(n)) %>% ungroup() %>% add_row(sherloc.df.sorted %>% filter(campaign == "Delta Front") %>% arrange(n)) - -p <- ggplot(sherloc.df.sorted, aes(x=factor(Feature, levels = (Feature %>% unique())), y = n, fill = campaign)) + - geom_col(position=position_dodge(preserve="total"), width=0.6) + - theme(panel.grid.major.x=element_blank(), axis.text.x = element_text(angle = 60, vjust = 1.0, hjust=1, size = 12)) + - labs(x="", y="Count") + - ggtitle("Sherloc Features Count by Campaign") + - scale_fill_paletteer_d(palette = "fishualize::Cephalopholis_argus") - -ggplotly(p, tooltip = c("campaign",'x', "n")) -#Commented out to knit to pdf, picture at top of report +# #Repeat for sherloc +# #Group by campaign & remove metadata +# sherloc.df.sorted <- sherloc.df %>% group_by(Campaign) %>% select(-c(Sample)) +# +# #Turn into long form and only keep positive cases +# sherloc.df.sorted <- sherloc.df.sorted %>% pivot_longer(2:ncol(sherloc.df.sorted),names_to = "Feature", values_to="Factor") %>% filter(Factor == 1) +# +# 
#Count # of identical cases +# sherloc.df.sorted <- sherloc.df.sorted %>% count(Feature) +# +# #Sort, Crater Floor is High to low & Delta Front is added back in low to high +# sherloc.df.sorted <- sherloc.df.sorted %>% filter(campaign == "Crater Floor") %>% arrange(desc(n)) %>% ungroup() %>% add_row(sherloc.df.sorted %>% filter(campaign == "Delta Front") %>% arrange(n)) +# +# p <- ggplot(sherloc.df.sorted, aes(x=factor(Feature, levels = (Feature %>% unique())), y = n, fill = campaign)) + +# geom_col(position=position_dodge(preserve="total"), width=0.6) + +# theme(panel.grid.major.x=element_blank(), axis.text.x = element_text(angle = 60, vjust = 1.0, hjust=1, size = 12)) + +# labs(x="", y="Count") + +# ggtitle("Sherloc Features Count by Campaign") + +# scale_fill_paletteer_d(palette = "fishualize::Cephalopholis_argus") +# +# ggplotly(p, tooltip = c("campaign",'x', "n")) +# #Commented out to knit to pdf, picture at top of report ``` Make box plots for pixl and sherloc ```{R} #Make box plots -pixl.lf <- pixl.df %>% select(-c(sample, name, type, location, abrasion)) %>% pivot_longer(1:13) +pixl.lf <- pixl.df %>% select(-c(Sample)) %>% pivot_longer(1:13) colnames(pixl.lf)<- c("campaign", "feature", "value") -ggplot(data = pixl.lf, aes(x=feature, y=value, color = campaign)) + +ggplot(data = pixl.lf, aes(x=factor(feature, levels = (feature %>% unique())), y=value, color = campaign)) + geom_boxplot() + scale_y_log10() + - ggtitle("pixl distribution by campaign") + - labs(x="", y="log10 scale from percent composition") - -#Repeat for sherloc -sherloc.lf <- sherloc.df %>% select(-c(sample,Name,type,abrasion)) %>% pivot_longer(2:36) -colnames(sherloc.lf)<- c("campaign", "feature", "value") -ggplot(data = sherloc.lf, aes(x=feature, y=value, color = campaign)) + - geom_boxplot() + - ggtitle("sherloc distribution by campaign") + + ggtitle("pixl, Compound Distribution by Campaign") + labs(x="", y="log10 scale from percent composition") + - 
theme(panel.grid.major.x=element_blank(), axis.text.x = element_text(angle = 60, vjust = 1.0, hjust=1, size = 10)) + scale_colour_manual(values=c('#d6001c','#54585a')) ``` Code for ternairy plot ```{r} -pixl.df <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/samples_pixl_wide.Rds") -pixl.df[sapply(pixl.df, is.character)] <- lapply(pixl.df[sapply(pixl.df, is.character)], - as.factor) -pixl.df <- pixl.df[2:16,] #Excluding first, atmospheric sample - -#PIXL data, with identically reflected compositions -new_pixl_trim <- pixl.df %>% - dplyr::select(c("Na20","Mgo","Al203","Si02", "K20","Cao","FeO-T", campaign, type)) %>% - rename("Na2O"="Na20","MgO"="Mgo","Al2O3"="Al203","SiO2"="Si02","K2O"="K20", - "CaO"="Cao","FeOT"="FeO-T") #take the sums of the specific elements -pixl_ternary <- new_pixl_trim %>% +pixl_ternary <- pixl.df %>% mutate(x=(SiO2+Al2O3)/100,y=(FeOT+MgO)/100,z=(CaO+Na2O+K2O)/100) %>% select(-c(SiO2,Al2O3,FeOT,MgO,CaO,Na2O,K2O)) %>% drop_na() @@ -313,18 +307,18 @@ pixl_ternary <- cbind(pixl_ternary, Sample_display= "10,11","","12,13","","14,15","","16")) # Load the saved LIBS data with locations added -libs.df <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/supercam_libs_moc_loc.Rds") +libs.df <- readRDS("~/DAR-Mars-F24/StudentData/v1_libs.Rds") -libs.df$point <- as.numeric(libs.df$point) +libs.df$Point <- as.numeric(libs.df$Point) #suppressing warnings here because target and point do not get a mean calculated, #but thats is fine as we have the target anyways and point is no longer relevant suppressWarnings( libs.uniquetar <- - aggregate(libs.df, list(Target = libs.df$target), mean)) + aggregate(libs.df, list(Target = libs.df$Target), mean)) #drop target and point from the data frame -libs.uniquetar <- libs.uniquetar %>% select(!c(target,point)) +libs.uniquetar <- libs.uniquetar %>% select(!c(Target,Point)) #Sum the elements we are looking at libs.df <- libs.df %>% @@ -349,23 +343,26 @@ tern.km2 <- kmeans(as.matrix(libs_ternplot2), centers=4) 
libs_ternplot2 <- cbind(libs_ternplot2, cluster=as.factor(tern.km2$cluster)) -#plot for aggregate data -ggtern(libs_ternplot2, ggtern::aes(x=x, y=y, z=z,cluster=cluster)) + - geom_point(aes(color=cluster), alpha = 0.5) + - theme_rgbw() + - labs(title="Mars 2020 LIBS Ternary Plot Aggregated by Target with PIXL Data", - x="Si+Al2", - y="Fe+Mg", - z="Ca+Na2+K2") + - suppressWarnings(geom_point( - data=pixl_ternary, ggtern::aes(x=x, y=y, z=z, cluster=type, shape=campaign), - size = 2.5)) + - suppressWarnings(geom_text(data=pixl_ternary, - ggtern::aes(x=x, y=y, z=z, label=Sample_display, cluster=type, - hjust = ifelse(x > 0.43, 1, -0.1), # Horizontal adjust to avoid overlap - vjust = ifelse(x == 0.3668, 1.3, - ifelse(x == 0.375, 1, ifelse(x > 0.43, 1.5, -0.3))), - fontface="bold"), - size=3)) +#ternary plot for LIBS data +ggtern() + + #color by cluster + geom_point(data=libs_ternplot, aes(x=x,y=y,z=z, colour = cluster), alpha = 0.5) + + scale_colour_manual(values=c('#d6001c','#54585a','#9ea2a2','#000')) + + labs(title="Sample Cation Compositions", + subtitle="LIBS data Clustered by Cation Group with PIXL samples by Campaign", + x="Si+Al2", + y="Fe+Mg", + z="Ca+Na2+K2") + + #Add pixl + geom_point(data=pixl_ternary, aes(x=x,y=y,z=z, shape=campaign), colour='green', size=3) + + # #Add labels to PIXL data corresponding to sample number + geom_text(data=pixl_ternary, + ggtern::aes(x=x, y=y, z=z, label=Sample_display, + hjust = ifelse(x > 0.43, 1, -0.1), # Horizontal adjust to avoid overlap + vjust = ifelse(x == 0.3668, 1.3, + ifelse(x == 0.375, 1, ifelse(x > 0.43, 1.5, -0.3))), + fontface="bold"), + size=3, colour='green') + + theme_bw() ``` diff --git a/AppRelated/CampaignQuestionFolder/CampaignQuestionNotebookAshton.html b/AppRelated/CampaignQuestionFolder/CampaignQuestionNotebookAshton.html new file mode 100644 index 0000000..7b21d70 --- /dev/null +++ b/AppRelated/CampaignQuestionFolder/CampaignQuestionNotebookAshton.html @@ -0,0 +1,2692 @@ + + + + +
+Note - This notebook is meant as a draft for a page in the Mars
+Mission Minder 2D app addressing the topic of campaigns.
+Question - “What is the difference between samples from the two
+campaigns Delta Front and Crater Floor?”
+Every sample from the Perseverance Rover is assigned a campaign. For
+this analysis, the two accessible campaigns were Delta Front and Crater
+Floor. Crater Floor is where Perseverance landed, and is a flat open
+plain. Next the rover traveled to a Delta Fan structure and began
+climbing it. Samples taken at the beginning of this journey belong to
+Delta Front.
+Title: Map of Samples Annotating the Two Campaigns
+Description: Map showing the first 16 Perseverance Samples with the
+Crater Floor and Delta Front regions annotated.
+Title: Counting Mineral Occurrence across Samples separated by
+Campaign
+Description: The Sherloc data was grouped by campaign,
+and the value of each feature was summed across all samples and
+displayed in the above plot. If a feature had a value of zero for every
+sample, it was not included in the plot.
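The grouping and summing described above can be sketched in base R. This is only an illustration under stated assumptions: the data frame `toy` and its three mineral columns are hypothetical stand-ins, not the notebook's real Sherloc table, and `aggregate()` is swapped in for the dplyr pipeline the notebook uses.

```r
# Hypothetical toy data: 1 = mineral detected in a sample, 0 = not detected.
toy <- data.frame(
  campaign  = c("Crater Floor", "Crater Floor", "Delta Front", "Delta Front"),
  Carbonate = c(1, 1, 1, 0),
  Pyroxene  = c(1, 0, 0, 0),
  Kaolinite = c(0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Sum every mineral column within each campaign.
counts <- aggregate(. ~ campaign, data = toy, FUN = sum)

# Drop minerals that were never detected in any sample.
mineral_cols <- setdiff(names(counts), "campaign")
keep <- mineral_cols[colSums(counts[mineral_cols]) > 0]
counts <- counts[c("campaign", keep)]

print(counts)
```

In this toy case `Kaolinite` is dropped because its total across both campaigns is zero, which mirrors the exclusion rule stated in the description.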
+Title: Pixl Feature Distribution by Campaign
+Description: Box plots showing the
+distribution of each feature in Pixl, separated by Campaign. Note: Sort
+low to high
+Title: Libs clustering with Pixl data separated by campaign on
+Ternary Diagram
+Description: Libs data was clustered and plotted on a ternary diagram
+along with Pixl data separated by campaign. The Pixl data points
+are labelled by name as well. Note-Use the common clusters and use the
+non-aggregate plot
+Conclusions:
+In Sherloc, certain minerals are abundant in both campaigns,
+especially Crater Floor. -Carbonate is common in both campaigns -Organic
+Matter is also common in both campaigns -Sulfate and Olivine are also
+common in both
+High in Crater Floor: -Pyroxene and amorphous silicate are abundant
+in Crater Floor but sparse in Delta Front
+Fe_Mg_Clay, Hydrated_Mg_Fe_sulfate, Kaolinite, and Mg_sulfates are in
+3 samples in Delta Front, but not at all in Crater Floor.
+There are 20 minerals that are exclusively in either Crater Floor or
+Delta Front.
+4 minerals have a count of zero, meaning they weren’t detected in any
+campaign (Perchlorates, Na_Perchlorate, Hydrated_Carbonates, &
+Hydrated_Iron_Oxide). These minerals are present in the atmospheric
+sample, which is absent in this analysis.
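One way to check statements like "exclusive to one campaign" or "count of zero" is to compare per-campaign counts directly. The sketch below assumes a hypothetical table `counts` with made-up values; the column names `crater_floor` and `delta_front` are illustrative and do not come from the notebook.

```r
# Hypothetical per-campaign detection counts (not the notebook's real values).
counts <- data.frame(
  Feature      = c("Carbonate", "Pyroxene", "Kaolinite", "Perchlorates"),
  crater_floor = c(5, 4, 0, 0),
  delta_front  = c(6, 0, 3, 0),
  stringsAsFactors = FALSE
)

# Minerals detected in exactly one campaign.
only_crater <- counts$Feature[counts$crater_floor > 0 & counts$delta_front == 0]
only_delta  <- counts$Feature[counts$crater_floor == 0 & counts$delta_front > 0]

# Minerals never detected in either campaign.
never_seen  <- counts$Feature[counts$crater_floor == 0 & counts$delta_front == 0]

print(list(only_crater = only_crater, only_delta = only_delta, never_seen = never_seen))
```

Running the same three filters on the real count table would give the lists behind the exclusive and zero-count statements, which makes those numbers easy to re-verify if the data changes.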
+The pixl graph reveals some big differences between Crater Floor and
+Delta Front, namely in Al2O3, CaO, Cr2O3, MgO, P2O5, SO3, and SiO2.
+CODE: Load libraries Set up dataframes/matrices
+Prepare data
+#Load in data
+###
+# Load the saved lithology data with locations added
+#lithology.df<- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/mineral_data_static.Rds")
+meta.df<- readRDS("~/DAR-Mars-F24/StudentData/v1_sample_meta.Rds")
+
+#Select 2:16
+meta.df <- meta.df[2:16,]
+
+#Read in v1 lithology
+lithology.df<- readRDS("~/DAR-Mars-F24/StudentData/v1_lithology.Rds")
+
+# Cast samples as numbers
+lithology.df$Sample <- as.numeric(lithology.df$Sample)
+
+# Convert rest into factors
+lithology.df[sapply(lithology.df, is.character)] <-
+ lapply(lithology.df[sapply(lithology.df, is.character)],
+ as.factor)
+
+# Keep only first 16 samples because the data for the rest of the samples is not available yet
+# Also drop the atmospheric sample (row 1) for now
+lithology.df<-lithology.df[2:16,]
+
+lithology.df$campaign <- meta.df$Campaign
+###
+#Used for map
+pixl_pos.df<- meta.df %>% select(Lat, Lon)
+
+# Load the saved PIXL data with locations added
+pixl.df <- readRDS("~/DAR-Mars-F24/StudentData/v1_pixl.Rds")
+
+# Convert to factors
+pixl.df[sapply(pixl.df, is.character)] <- lapply(pixl.df[sapply(pixl.df, is.character)],
+ as.factor)
+
+#Get rid of atmospheric sample
+pixl.df <- pixl.df[2:16,]
+
+#Add campaign
+pixl.df$campaign <- meta.df$Campaign
+# # Make the matrix of just mineral percentage measurements
+# pixl.matrix <- pixl.df[,2:14]
+
+###
+# Read in data as provided.
+sherloc_abrasion_raw <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/abrasions_sherloc_samples.Rds")
+
+# Clean up data types
+sherloc_abrasion_raw$Mineral<-as.factor(sherloc_abrasion_raw$Mineral)
+sherloc_abrasion_raw[sapply(sherloc_abrasion_raw, is.character)] <- lapply(sherloc_abrasion_raw[sapply(sherloc_abrasion_raw, is.character)],
+ as.numeric)
+# Transform NA's to 0
+sherloc_abrasion_raw <- sherloc_abrasion_raw %>% replace(is.na(.), 0)
+
+# Reformat data so that rows are "abrasions" and columns list the presence of minerals.
+# Do this by "pivoting" to a long format, and then back to the desired wide format.
+
+sherloc_long <- sherloc_abrasion_raw %>%
+ pivot_longer(!Mineral, names_to = "Name", values_to = "Presence")
+
+# Make abrasion a factor
+sherloc_long$Name <- as.factor(sherloc_long$Name)
+
+# Make it a matrix
+sherloc.matrix <- sherloc_long %>%
+ pivot_wider(names_from = Mineral, values_from = Presence)
+
+#Remove atmospheric sample
+sherloc.matrix <- sherloc.matrix[2:16,]
+
+# Get sample information from PIXL and add to measurements -- assumes order is the same
+
+sherloc.df <- cbind(Sample = pixl.df$Sample, sherloc.matrix)
+
+# Measurements are everything except first column
+sherloc.matrix<-as.matrix(sherloc.matrix[,-1])
+
+###
+#Add in wss plot for elbow method clustering
+wssplot <- function(data, nc = 15, seed =10, title="Quality of k-means by Cluster") {
+ wss <- data.frame(cluster=1:nc, quality=c(0))
+ for (i in 1:nc){
+ set.seed(seed)
+ wss[i,2] <- kmeans(data, centers=i)$tot.withinss}
+ ggplot(data=wss,aes(x=cluster,y=quality)) +
+ geom_line() +
+ ggtitle(title)
+}
+
+seed <- 14
+set.seed(seed)
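The elbow method that `wssplot` supports can be illustrated on synthetic data. This is a minimal sketch, assuming two hypothetical, well-separated 2-D clusters; the seed value 14 mirrors the chunk above, and `nstart = 10` is an added choice that the notebook's function does not use.

```r
set.seed(14)  # calling set.seed() with an argument actually seeds the RNG

# Hypothetical data: two well-separated blobs of 20 points each.
toy <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.3), ncol = 2)
)

# Total within-cluster sum of squares for k = 1..6.
ks  <- 1:6
wss <- sapply(ks, function(k) kmeans(toy, centers = k, nstart = 10)$tot.withinss)

elbow <- data.frame(cluster = ks, quality = wss)
print(elbow)
```

For data like this the within-cluster sum of squares drops sharply from k = 1 to k = 2 and then flattens; that bend is the "elbow" one looks for in the line plot.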
+Used as base for map picture
+#Produce sample map with campaign differences
+# pixl_pos.df$Campaign <- meta.df$Campaign
+# ggplot(pixl_pos.df, aes(x= Lat, y= Lon, color=Campaign, size=1)) +
+# geom_point() +
+# theme_classic()
+Make interactive plotly plot for Lithology
+# Include all data processing code (if necessary), clearly commented
+#Start with lithology
+#Group by campaign & remove metadata
+lithology.df.sorted <- lithology.df %>% group_by(campaign) %>% select(-c(Sample))
+
+#Turn into long form and only keep positive cases
+lithology.df.sorted <- lithology.df.sorted %>% pivot_longer(1:(ncol(lithology.df.sorted)-1),names_to = "Feature", values_to="Factor") %>% filter(Factor == 1)
+
+#Count # of identical cases
+lithology.df.sorted <- lithology.df.sorted %>% count(Feature)
+
+#Sort, Crater Floor is High to low & Delta Front is added back in low to high
+lithology.df.sorted <- lithology.df.sorted %>% filter(campaign == "Crater Floor") %>% arrange(desc(n)) %>% ungroup() %>% add_row(lithology.df.sorted %>% filter(campaign == "Delta Front") %>% arrange(n))
+
+
+#Make interactive plot for lithology
+p <- ggplot(lithology.df.sorted, aes(x=factor(Feature, levels = (Feature %>% unique())), y = n, fill = campaign)) +
+ geom_col(position=position_dodge(preserve="total"), width=0.6) +
+ theme(panel.grid.major.x=element_blank(), axis.text.x = element_text(angle = 60, vjust = 1.0, hjust=1, size = 12)) +
+ labs(x="", y="Count") +
+ ggtitle("Sherloc Dataset, Total Mineral Occurrence across Samples") +
+ scale_fill_manual(values=c('#d6001c','#54585a'))
+
+ggplotly(p, tooltip = c("campaign",'x', "n"))
+
+
+#Commented out to knit to pdf, picture at top of report
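Column ranges built from `:` and arithmetic, like the one passed to `pivot_longer()` above, depend on R's operator precedence: `:` binds more tightly than binary `-`. The sketch below uses a hypothetical column count `n` to show how the two readings differ.

```r
n <- 6  # hypothetical number of columns

a <- 2:n - 1     # parsed as (2:n) - 1, giving 1 2 3 4 5
b <- 2:(n - 1)   # gives 2 3 4 5
d <- 1:(n - 1)   # same values as `a`, with the intent made explicit

print(a)
print(b)
print(d)
```

Writing the parentheses explicitly keeps the selected columns the same while making it obvious that only the last column (here, the campaign label) is excluded.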
+Create similar plot for Sherloc
+# #Repeat for sherloc
+# #Group by campaign & remove metadata
+# sherloc.df.sorted <- sherloc.df %>% group_by(Campaign) %>% select(-c(Sample))
+#
+# #Turn into long form and only keep positive cases
+# sherloc.df.sorted <- sherloc.df.sorted %>% pivot_longer(2:ncol(sherloc.df.sorted),names_to = "Feature", values_to="Factor") %>% filter(Factor == 1)
+#
+# #Count # of identical cases
+# sherloc.df.sorted <- sherloc.df.sorted %>% count(Feature)
+#
+# #Sort, Crater Floor is High to low & Delta Front is added back in low to high
+# sherloc.df.sorted <- sherloc.df.sorted %>% filter(campaign == "Crater Floor") %>% arrange(desc(n)) %>% ungroup() %>% add_row(sherloc.df.sorted %>% filter(campaign == "Delta Front") %>% arrange(n))
+#
+# p <- ggplot(sherloc.df.sorted, aes(x=factor(Feature, levels = (Feature %>% unique())), y = n, fill = campaign)) +
+# geom_col(position=position_dodge(preserve="total"), width=0.6) +
+# theme(panel.grid.major.x=element_blank(), axis.text.x = element_text(angle = 60, vjust = 1.0, hjust=1, size = 12)) +
+# labs(x="", y="Count") +
+# ggtitle("Sherloc Features Count by Campaign") +
+# scale_fill_paletteer_d(palette = "fishualize::Cephalopholis_argus")
+#
+# ggplotly(p, tooltip = c("campaign",'x', "n"))
+# #Commented out to knit to pdf, picture at top of report
+Make box plots for pixl
+#Make box plots
+pixl.lf <- pixl.df %>% select(-c(Sample)) %>% pivot_longer(1:13)
+colnames(pixl.lf)<- c("campaign", "feature", "value")
+ggplot(data = pixl.lf, aes(x=factor(feature, levels = (feature %>% unique())), y=value, color = campaign)) +
+ geom_boxplot() +
+ scale_y_log10() +
+ ggtitle("pixl, Compound Distribution by Campaign") +
+ labs(x="", y="log10 scale from percent composition") +
+ scale_colour_manual(values=c('#d6001c','#54585a'))
+## Warning in scale_y_log10(): log-10 transformation introduced infinite values.
+## Warning: Removed 5 rows containing non-finite outside the scale range
+## (`stat_boxplot()`).
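The two warnings above are what a log scale produces when some values are zero, since `log10(0)` is `-Inf`. A small sketch with hypothetical percent compositions shows how to count such values before plotting; the vector `value` is made up for illustration.

```r
# Hypothetical percent compositions, including one zero.
value <- c(0, 0.05, 1.2, 30)

logged <- log10(value)
print(logged)  # the first element is -Inf

# Number of values a log10 axis cannot display.
n_dropped <- sum(!is.finite(logged))

# One possible handling: keep only strictly positive values before plotting.
positive <- value[value > 0]
```

Whether to drop zeros, as ggplot2 does here, or to handle them another way is a judgment call; counting them first at least makes the size of the omission explicit.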
+Code for ternary plot
+#take the sums of the specific elements
+pixl_ternary <- pixl.df %>%
+ mutate(x=(SiO2+Al2O3)/100,y=(FeOT+MgO)/100,z=(CaO+Na2O+K2O)/100) %>%
+ select(-c(SiO2,Al2O3,FeOT,MgO,CaO,Na2O,K2O)) %>%
+ drop_na()
+
+#This is for the labels on the Ternary Plot below
+pixl_ternary <- cbind(pixl_ternary, Sample_display=
+ c("2","3","4,6,7","5,8,9","","","","",
+ "10,11","","12,13","","14,15","","16"))
+
+# Load the saved LIBS data with locations added
+libs.df <- readRDS("~/DAR-Mars-F24/StudentData/v1_libs.Rds")
+
+libs.df$Point <- as.numeric(libs.df$Point)
+
+#suppressing warnings here because target and point do not get a mean calculated,
+#but that is fine as we have the target anyway and point is no longer relevant
+suppressWarnings(
+ libs.uniquetar <-
+ aggregate(libs.df, list(Target = libs.df$Target), mean))
+
+#drop target and point from the data frame
+libs.uniquetar <- libs.uniquetar %>% select(!c(Target,Point))
+
+#Sum the elements we are looking at
+libs.df <- libs.df %>%
+ mutate(y = (FeOT + MgO) / 100, z = (CaO+Na2O+K2O) / 100, x = (SiO2 + Al2O3) / 100)
+
+#Same thing but aggregate
+libs.uniquetar <- libs.uniquetar %>%
+ mutate(y = (FeOT + MgO) / 100, z = (CaO+Na2O+K2O) / 100, x = (SiO2 + Al2O3) / 100)
+
+libs_ternplot <- libs.df %>% select(c(x,y,z))
+libs_ternplot2 <- libs.uniquetar %>% select(c(x,y,z))
+
+set.seed(1234)
+
+#kmeans on the original data
+tern.km <- kmeans(libs_ternplot, centers=4)
+
+libs_ternplot <- cbind(libs_ternplot, cluster=as.factor(tern.km$cluster))
+
+#kmeans on the aggregate data
+tern.km2 <- kmeans(as.matrix(libs_ternplot2), centers=4)
+
+libs_ternplot2 <- cbind(libs_ternplot2, cluster=as.factor(tern.km2$cluster))
+
+#ternary plot for LIBS data
+ggtern() +
+ #color by cluster
+ geom_point(data=libs_ternplot, aes(x=x,y=y,z=z, colour = cluster), alpha = 0.5) +
+ scale_colour_manual(values=c('#d6001c','#54585a','#9ea2a2','#000')) +
+ labs(title="Sample Cation Compositions",
+ subtitle="LIBS data Clustered by Cation Group with PIXL samples by Campaign",
+ x="Si+Al2",
+ y="Fe+Mg",
+ z="Ca+Na2+K2") +
+ #Add pixl
+ geom_point(data=pixl_ternary, aes(x=x,y=y,z=z, shape=campaign), colour='green', size=3) +
+ # #Add labels to PIXL data corresponding to sample number
+ geom_text(data=pixl_ternary,
+ ggtern::aes(x=x, y=y, z=z, label=Sample_display,
+ hjust = ifelse(x > 0.43, 1, -0.1), # Horizontal adjust to avoid overlap
+ vjust = ifelse(x == 0.3668, 1.3,
+ ifelse(x == 0.375, 1, ifelse(x > 0.43, 1.5, -0.3))),
+ fontface="bold"),
+ size=3, colour='green') +
+ theme_bw()
+## Warning in geom_point(data = libs_ternplot, aes(x = x, y = y, z = z, colour =
+## cluster), : Ignoring unknown aesthetics: z
+## Warning in geom_point(data = pixl_ternary, aes(x = x, y = y, z = z, shape =
+## campaign), : Ignoring unknown aesthetics: z
+## Warning in geom_text(data = pixl_ternary, ggtern::aes(x = x, y = y, z = z, :
+## Ignoring unknown aesthetics: z
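A ternary diagram positions each point by the relative shares of its three components, so the grouped sums used above do not need to add up to exactly 1 beforehand. The sketch below assumes hypothetical oxide percentages for two samples and normalizes the three groups explicitly; my understanding is that ternary plotting libraries do this rescaling internally, but treat that as an assumption rather than a documented guarantee.

```r
# Hypothetical oxide percentages for two samples (not real PIXL or LIBS values).
oxides <- data.frame(
  SiO2 = c(45, 50), Al2O3 = c(10, 5),
  FeOT = c(20, 15), MgO   = c(10, 20),
  CaO  = c(8, 5),   Na2O  = c(3, 2),  K2O = c(1, 1)
)

# Same three groupings as the notebook: Si+Al, Fe+Mg, Ca+Na+K.
tern <- with(oxides, data.frame(
  x = (SiO2 + Al2O3) / 100,
  y = (FeOT + MgO) / 100,
  z = (CaO + Na2O + K2O) / 100
))

# Rescale each row so the three shares sum to 1.
tern_norm <- tern / rowSums(tern)
print(tern_norm)
```

Normalizing before clustering is a design choice worth considering: k-means on the unnormalized `x`, `y`, `z` also reflects how much of each sample falls outside the seven oxides, not just the relative proportions.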