---
title: "DAR F24 Assignment 5 Notebook"
author: "Xuanting Wang RIN :662016667"
date: "`r Sys.Date()`"
output:
  pdf_document:
    toc: true
    latex_engine: xelatex
  word_document:
    toc: true
  html_document:
    toc: true
subtitle: 'DAR Project Name: Mars'

---


## PIXL Data Analysis

1. **Classification by Cation Groups**:
   - **Cation Composition Counts**: In the PIXL dataset, most samples were classified as "Si-Al rich" (11 samples), followed by "Fe-Mg rich" (5 samples), indicating that Si and Al dominate the composition of most samples.

2. **ANOVA Results for Cation Groups by Campaign**:
   - **Si_Al**: The ANOVA test for `Si_Al` showed a significant difference across campaigns (\(p = 0.0014\)), indicating that the `Si_Al` composition varies meaningfully between campaigns.
   - **Fe_Mg**: The `Fe_Mg` composition did not show significant variation across campaigns (\(p = 0.0791\)), suggesting similar levels of Fe and Mg in the different campaigns.
   - **Ca_Na_K**: For `Ca_Na_K`, a significant difference was found across campaigns (\(p = 0.0136\)), indicating some compositional variance based on campaign location.

3. **Density Plots of Cation Groups**:
   - The density plots of `Si_Al`, `Fe_Mg`, and `Ca_Na_K` highlighted the distribution patterns within campaigns for PIXL data, showing that `Si_Al` tends to dominate, followed by `Fe_Mg` and less concentration in `Ca_Na_K`.

4. **Dunn's Post-Hoc Test**:
   - **Si_Al**: A significant difference was found between Crater Floor and Delta Front (\(p = 0.0004\)), consistent with the variation in `Si_Al` composition across campaigns.
   - **Fe_Mg**: The difference between Crater Floor and Delta Front was marginally significant (\(p = 0.0491\)), suggesting a moderate difference.
   - **Ca_Na_K**: The difference was clearer (\(p = 0.0017\)), indicating a stronger compositional change across these locations.

5. **Single-Sample t-Test**:
   - **Si_Al**: The mean concentration of `Si_Al` was significantly greater than the hypothetical value of 10 (\(p < 0.001\)), with a mean around 43.63.
   - **Fe_Mg**: Similarly, `Fe_Mg` was significantly higher than the benchmark of 10 (\(p < 0.001\)), averaging 33.13.
   - **Ca_Na_K**: The mean `Ca_Na_K` concentration was significantly lower than 10 (\(p = 0.0049\)), at an average of 6.94.

6. **Logistic Regression**:
   - Logistic regression with `Si_Al`, `Fe_Mg`, and `Ca_Na_K` as predictors did not yield statistically significant results, indicating limited predictive power in differentiating campaigns based on these cation groups in PIXL data.

---

## LIBS Data Analysis

1. **Classification by Cation Groups**:
   - **Cation Composition Counts**: For LIBS, `Si-Al rich` samples were predominant (1257 samples), followed by `Fe-Mg rich` (645 samples) and a smaller number of `Ca-Na-K rich` samples (30), demonstrating a general trend toward higher Si and Al compositions.

2. **ANOVA Results for Cation Groups by Campaign**:
   - **Si_Al**: The ANOVA for `Si_Al` by campaign was highly significant (\(p < 0.0001\)), indicating substantial variation between campaigns, especially between Campaign 3 and the others.
   - **Fe_Mg**: This group also showed significant differences (\(p < 0.0001\)), suggesting that Fe and Mg levels vary notably by campaign.
   - **Ca_Na_K**: Similar to the other cation groups, `Ca_Na_K` showed significant differences across campaigns (\(p < 0.0001\)), with Campaign 3 distinctively lower than Campaign 1 and 2.

3. **Density Plots of Cation Groups**:
   - Density plots of `Si_Al`, `Fe_Mg`, and `Ca_Na_K` for LIBS data revealed high densities for `Si_Al` and moderate densities for `Fe_Mg`, with lower densities in `Ca_Na_K` across campaigns, aligning with the observed classification trends.

4. **Dunn's Post-Hoc Test**:
   - **Si_Al**: Dunn’s test indicated significant differences between Campaign 3 and both Campaigns 1 and 2 (\(p < 0.001\)), corroborating the ANOVA results.
   - **Fe_Mg**: The test also highlighted significant differences between all campaigns for `Fe_Mg` (\(p < 0.001\)), supporting variability across locations.
   - **Ca_Na_K**: Similar patterns were observed with significant differences across all campaign comparisons (\(p < 0.001\)), indicating that `Ca_Na_K` concentrations are not consistent across campaigns in the LIBS data.

5. **Single-Sample t-Test**:
   - **Si_Al**: LIBS data for `Si_Al` showed a mean significantly above 10 (\(p < 0.001\)), with an average of 49.71.
   - **Fe_Mg**: The mean concentration was also significantly greater than 10 (\(p < 0.001\)), at 36.55.
   - **Ca_Na_K**: The mean `Ca_Na_K` concentration was significantly lower than 10 (\(p < 0.001\)), averaging around 7.08, similar to PIXL data.

6. **Logistic Regression**:
   - Logistic regression for campaign prediction using `Si_Al`, `Fe_Mg`, and `Ca_Na_K` showed that `Fe_Mg` was a significant predictor in the binary model (\(p = 0.0134\)), whereas `Si_Al` and `Ca_Na_K` were not. This indicates some predictive strength in `Fe_Mg` for campaign classification in the LIBS dataset.

---

## Conclusion

Both PIXL and LIBS data reveal distinct composition patterns across campaigns, with high `Si_Al` concentrations dominating in both datasets. ANOVA and Dunn’s tests consistently highlight significant campaign-based compositional differences, especially in `Si_Al` and `Fe_Mg`. However, logistic regression showed limited power to differentiate campaigns from cation compositions alone, though `Fe_Mg` in the LIBS data showed some promise as a predictor. The single-sample t-tests confirm that mean `Si_Al` and `Fe_Mg` concentrations in both datasets are above the hypothetical reference value of 10, while mean `Ca_Na_K` is below it. Overall, PIXL and LIBS show broadly consistent cation group trends across Martian campaigns, with some variability in individual cation groups.


**Calculating Cation Group Sums**:
   - New columns were created to represent grouped sums:
     - `Si_Al`: sum of SiO₂ and Al₂O₃.
     - `Fe_Mg`: sum of FeO-T and MgO.
     - `Ca_Na_K`: sum of CaO, Na₂O, and K₂O.

**Initial Classification**: Each sample was assigned a class label based on these group sums:
   - If `Si_Al` was the largest, the sample is classified as "Si-Al rich".
   - If `Fe_Mg` was the largest, the sample is classified as "Fe-Mg rich".
   - Otherwise, the sample is classified as "Ca-Na-K rich".

**Ranking with Quantiles**: Each of `Si_Al`, `Fe_Mg`, and `Ca_Na_K` was divided into three quantile ranks (terciles), and samples were reclassified into the same three categories based on which group has the highest rank. Because the comparisons use `>=`, ties go to Si-Al first and then to Fe-Mg.
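
As a minimal sketch of the two classification rules, the toy values below (invented for illustration, not taken from PIXL or LIBS) show that the sum-based and rank-based labels can disagree. `tercile()` is a hypothetical base-R stand-in for `dplyr::ntile(x, 3)`; the two may place tied values differently.

```r
# Toy oxide-group sums (wt%); values are made up for illustration only.
toy <- data.frame(
  Si_Al   = c(60, 45, 30, 50, 35, 40),
  Fe_Mg   = c(20, 35, 50, 25, 45, 30),
  Ca_Na_K = c(10,  8,  5, 15,  6, 12)
)
groups <- c("Si-Al rich", "Fe-Mg rich", "Ca-Na-K rich")

# Rule 1: label each sample by its largest group sum.
toy$class_sum <- groups[max.col(as.matrix(toy[, 1:3]), ties.method = "first")]

# Rule 2: rank each group across samples into three equal-sized bins,
# then label each sample by the group with the highest bin.
tercile <- function(x) ceiling(3 * rank(x, ties.method = "first") / length(x))
r <- lapply(toy[, c("Si_Al", "Fe_Mg", "Ca_Na_K")], tercile)
toy$class_rank <- ifelse(r$Si_Al >= r$Fe_Mg & r$Si_Al >= r$Ca_Na_K,
                         "Si-Al rich",
                         ifelse(r$Fe_Mg >= r$Ca_Na_K,
                                "Fe-Mg rich",
                                "Ca-Na-K rich"))

# Cross-tabulate the two labelings to see where they disagree.
table(sum_based = toy$class_sum, rank_based = toy$class_rank)
```

The rank rule compares each sample with the rest of the data set rather than comparing the three groups within a sample, so a sample can be labelled "Ca-Na-K rich" even when Si + Al is its largest sum.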

## PIXL Data Analysis

```{r}
libs_data<- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/supercam_libs_moc_loc.Rds")

pixl_data<- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/samples_pixl_wide.Rds")


# Include 'campaign' column in the subset
cation_data <- pixl_data[, c("Si02", "Al203", "FeO-T", "Mgo", "Cao", "Na20", "K20", "campaign")]

# Calculate the cation group sums using backticks for column names with special characters
cation_data$Si_Al <- cation_data$Si02 + cation_data$Al203
cation_data$Fe_Mg <- cation_data$`FeO-T` + cation_data$Mgo
cation_data$Ca_Na_K <- cation_data$Cao + cation_data$Na20 + cation_data$K20

# Classify each sample by its largest cation group sum
cation_data$class <- ifelse(cation_data$Si_Al > cation_data$Fe_Mg &
                              cation_data$Si_Al > cation_data$Ca_Na_K,
                            "Si-Al rich",
                            ifelse(cation_data$Fe_Mg > cation_data$Ca_Na_K,
                                   "Fe-Mg rich",
                                   "Ca-Na-K rich"))

# Check the classifications
table(cation_data$class)
library(dplyr)

# Create quantiles for each group (Si_Al, Fe_Mg, Ca_Na_K)
cation_data$Si_Al_rank <- ntile(cation_data$Si_Al, 3)  # Divides Si_Al into 3 quantiles
cation_data$Fe_Mg_rank <- ntile(cation_data$Fe_Mg, 3)  # Divides Fe_Mg into 3 quantiles
cation_data$Ca_Na_K_rank <- ntile(cation_data$Ca_Na_K, 3)  # Divides Ca_Na_K into 3 quantiles

# Reclassify by highest tercile rank (this overwrites the sum-based class above)
cation_data$class <- ifelse(cation_data$Si_Al_rank >=
                              cation_data$Fe_Mg_rank &
                              cation_data$Si_Al_rank >= cation_data$Ca_Na_K_rank,
                            "Si-Al rich",
                            ifelse(cation_data$Fe_Mg_rank >= cation_data$Ca_Na_K_rank,
                                   "Fe-Mg rich",
                                   "Ca-Na-K rich"))

# Load plotting libraries for the ternary plot
library(ggplot2)

# install.packages("ggtern")
library(ggtern)


# Prepare the data for ternary plot
# Make sure the three components are in proportions or standardized
cation_data$total <- cation_data$Si_Al + cation_data$Fe_Mg + cation_data$Ca_Na_K
cation_data$Si_Al_prop <- cation_data$Si_Al / cation_data$total
cation_data$Fe_Mg_prop <- cation_data$Fe_Mg / cation_data$total
cation_data$Ca_Na_K_prop <- cation_data$Ca_Na_K / cation_data$total
```


```{r}
# Create the ternary plot
ggtern(data = cation_data, aes(x = Si_Al_prop, y = Fe_Mg_prop, z = Ca_Na_K_prop, color = class)) +
  geom_point(size = 3, alpha = 0.7) +
  scale_color_manual(values =
                       c("Si-Al rich" = "blue", "Fe-Mg rich" = "green", "Ca-Na-K rich" = "orange")) +
  labs(title = "Ternary Plot of Si+Al, Fe+Mg, and Ca+Na+K for PIXL Data",
       x = "Si + Al",
       y = "Fe + Mg",
       z = "Ca + Na + K",
       color = "Rock Type") +
  theme_minimal()

```


```{r}
# Load necessary libraries
library(dplyr)

# Calculate the count of each campaign and class combination
campaign_composition_summary <- cation_data %>%
  group_by(campaign, class) %>%
  summarize(count = n(), .groups = "drop")

# Calculate the proportion within each campaign
campaign_composition_summary <- campaign_composition_summary %>%
  group_by(campaign) %>%
  mutate(proportion = count / sum(count)) %>%
  ungroup()

# Display the summary table
print(campaign_composition_summary)

```


```{r}
# Convert 'campaign' to a factor for ANOVA
cation_data$campaign <- as.factor(cation_data$campaign)

# Perform ANOVA for each cation group across campaigns
anova_Si_Al <- aov(Si_Al ~ campaign, data = cation_data)
anova_Fe_Mg <- aov(Fe_Mg ~ campaign, data = cation_data)
anova_Ca_Na_K <- aov(Ca_Na_K ~ campaign, data = cation_data)

# Summary of ANOVA results
summary(anova_Si_Al)
summary(anova_Fe_Mg)
summary(anova_Ca_Na_K)

# If significant, run a post-hoc Tukey test to determine where the differences lie
TukeyHSD(anova_Si_Al)
TukeyHSD(anova_Fe_Mg)
TukeyHSD(anova_Ca_Na_K)

```

### ANOVA Results

1. **Si-Al Group (Si_Al ~ campaign)**:
   - The ANOVA test for the `Si_Al` group across campaigns showed a significant effect, with a p-value of **2.47e-15** (p < 0.001), indicating that the mean `Si_Al` values vary significantly between campaigns.

2. **Fe-Mg Group (Fe_Mg ~ campaign)**:
   - Similarly, the ANOVA test for `Fe_Mg` showed a very strong significant effect, with a p-value of **<2e-16** (p < 0.001). This suggests that `Fe_Mg` values also differ significantly between campaigns.

3. **Ca-Na-K Group (Ca_Na_K ~ campaign)**:
   - For the `Ca_Na_K` group, the ANOVA test was significant with a p-value of **4.03e-11** (p < 0.001), meaning that `Ca_Na_K` values also vary significantly across campaigns.

Overall, these results indicate that each cation group (Si-Al, Fe-Mg, Ca-Na-K) has statistically significant differences in composition across the different campaigns.

### Tukey Post-hoc Tests

1. **Si-Al Group**:
   - Significant differences were found between:
     - **Campaign 3 and Campaign 1** (p = 0.0000191): Campaign 3 has lower `Si_Al` values than Campaign 1.
     - **Campaign 3 and Campaign 2** (p < 0.0001): Campaign 3 has lower `Si_Al` values than Campaign 2.
   - No significant difference was found between Campaigns 1 and 2.

2. **Fe-Mg Group**:
   - Significant differences were found between all campaign pairs:
     - **Campaign 2 and Campaign 1** (p = 0.0017): Campaign 2 has higher `Fe_Mg` values than Campaign 1.
     - **Campaign 3 and Campaign 1** (p < 0.0001): Campaign 3 has much higher `Fe_Mg` values than Campaign 1.
     - **Campaign 3 and Campaign 2** (p < 0.0001): Campaign 3 also has higher `Fe_Mg` values than Campaign 2.

3. **Ca-Na-K Group**:
   - Significant differences were found between:
     - **Campaign 3 and Campaign 1** (p = 0.0001): Campaign 3 has lower `Ca_Na_K` values than Campaign 1.
     - **Campaign 3 and Campaign 2** (p < 0.0001): Campaign 3 has lower `Ca_Na_K` values than Campaign 2.
   - No significant difference was found between Campaigns 1 and 2.
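
The p-values quoted above can be pulled out of the fitted objects programmatically rather than copied from printed output. The sketch below uses simulated data; the `sim` data frame and its group means are assumptions for illustration and do not reproduce the results in this notebook.

```r
# Simulated stand-in data: three campaigns, 30 samples each.
set.seed(1)
sim <- data.frame(
  campaign = factor(rep(c("Campaign 1", "Campaign 2", "Campaign 3"), each = 30)),
  Si_Al    = c(rnorm(30, 50, 5), rnorm(30, 50, 5), rnorm(30, 42, 5))
)

fit <- aov(Si_Al ~ campaign, data = sim)

# Overall F-test p-value: first row of the ANOVA table.
p_overall <- summary(fit)[[1]][["Pr(>F)"]][1]

# Pairwise Tukey comparisons: a matrix with columns diff, lwr, upr, p adj.
tukey <- TukeyHSD(fit)$campaign
tukey[, "p adj"]
```

The sign of the `diff` column gives the direction of each pairwise difference (second campaign in the row name minus the first is not the convention; each row is labelled `B-A` and reports the mean of B minus the mean of A).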

```{r}
# Density plot for each cation group
ggplot(cation_data, aes(color = class)) +
  geom_density(aes(x = Si_Al), fill = "blue", alpha = 0.3) +
  geom_density(aes(x = Fe_Mg), fill = "green", alpha = 0.3) +
  geom_density(aes(x = Ca_Na_K), fill = "orange", alpha = 0.3) +
  labs(title = "Density Plot of Cation Groups for PIXL Data",
       x = "Cation Group Concentrations",
       color = "Composition Class") +
  theme_minimal()


# Load necessary libraries
library(ggplot2)
library(gridExtra)

# Box plot for Si_Al by campaign
plot_Si_Al <- ggplot(cation_data, aes(x = campaign, y = Si_Al, fill = campaign)) +
  geom_boxplot() +
  labs(title = "Si_Al Distribution Across Campaigns",
       x = "Campaign",
       y = "Si + Al Concentration") +
  theme_minimal() +
  theme(legend.position = "none")

# Box plot for Fe_Mg by campaign
plot_Fe_Mg <- ggplot(cation_data, aes(x = campaign, y = Fe_Mg, fill = campaign)) +
  geom_boxplot() +
  labs(title = "Fe_Mg Distribution Across Campaigns",
       x = "Campaign",
       y = "Fe + Mg Concentration") +
  theme_minimal() +
  theme(legend.position = "none")

# Box plot for Ca_Na_K by campaign
plot_Ca_Na_K <- ggplot(cation_data, aes(x = campaign, y = Ca_Na_K, fill = campaign)) +
  geom_boxplot() +
  labs(title = "Ca_Na_K Distribution Across Campaigns",
       x = "Campaign",
       y = "Ca + Na + K Concentration") +
  theme_minimal() +
  theme(legend.position = "none")

# Arrange the plots in a single layout
grid.arrange(plot_Si_Al, plot_Fe_Mg, plot_Ca_Na_K, nrow = 1)

```

1. **Density Plot of Cation Groups**:
   - A density plot was created to visualize the distribution of concentrations for each cation group (Si-Al, Fe-Mg, Ca-Na-K).
   - Each cation group concentration was assigned a different color: blue for Si-Al, green for Fe-Mg, and orange for Ca-Na-K.
   - The densities were overlaid with transparency (alpha = 0.3) to allow for easy comparison across groups.

2. **Box Plots for Cation Groups Across Campaigns**:
   - Three separate box plots were created to show the distribution of each cation group (Si-Al, Fe-Mg, Ca-Na-K) across different campaigns.
   - Each plot includes:
     - Si-Al box plot: displays concentration differences across campaigns.
     - Fe-Mg box plot: displays Fe and Mg concentration across campaigns.
     - Ca-Na-K box plot: displays Ca, Na, and K concentration across campaigns.
   - The plots were arranged in a single row using `grid.arrange` for easy comparison.

### Analysis

1. **Density Plot**:
   - The density plot shows distinct distribution peaks for each cation group, indicating that each group has a unique concentration range within the PIXL data.
   - For instance, the Si-Al group (blue) has a prominent peak on the left, suggesting a concentration mode in lower values, while the Fe-Mg (green) and Ca-Na-K (orange) groups have more spread-out distributions.
   - Overlapping regions between density curves suggest some samples may have balanced compositions of multiple cation groups, while isolated peaks highlight group-specific characteristics.

2. **Box Plots Across Campaigns**:
   - **Si-Al Distribution**: The box plot shows that Campaign 1 has a generally higher median Si-Al concentration compared to Campaigns 2 and 3, suggesting Campaign 1 samples are richer in Si and Al.
   - **Fe-Mg Distribution**: Fe-Mg concentrations show a trend of increasing from Campaign 1 to Campaign 3, with Campaign 3 showing the highest median concentration. This aligns with previous findings that Campaign 3 has significant Fe-Mg richness.
   - **Ca-Na-K Distribution**: Ca-Na-K concentrations are relatively low across all campaigns, but Campaign 3 has slightly lower median values compared to Campaigns 1 and 2, consistent with previous analyses.


```{r}
# Filter data for two specific campaigns and remove NA values
# NOTE: "A" and "B" are placeholder labels; replace them with actual levels of `campaign`
campaign_a_data <- na.omit(subset(cation_data, campaign == "A")$Si_Al)
campaign_b_data <- na.omit(subset(cation_data, campaign == "B")$Si_Al)

# Check if both campaigns have enough data points (&& for a scalar condition)
if (length(campaign_a_data) > 1 && length(campaign_b_data) > 1) {
  # Perform Mann-Whitney test
  mann_whitney_test <- wilcox.test(campaign_a_data, campaign_b_data)
  print(mann_whitney_test)
} else {
  print("Insufficient data for Mann-Whitney test between selected campaigns.")
}

```

```{r}
# install.packages("dunn.test")
library(dunn.test)

# Perform Dunn's test for each cation group
# Example for Si_Al across campaigns
dunn_test_Si_Al <- dunn.test(cation_data$Si_Al, cation_data$campaign, method = "bonferroni")
print(dunn_test_Si_Al)

# Repeat for other cation groups
dunn_test_Fe_Mg <- dunn.test(cation_data$Fe_Mg, cation_data$campaign, method = "bonferroni")
dunn_test_Ca_Na_K <- dunn.test(cation_data$Ca_Na_K, cation_data$campaign, method = "bonferroni")

# Print the results
print(dunn_test_Fe_Mg)
print(dunn_test_Ca_Na_K)

```

```{r}
# Hypothetical mean values for each cation group to test against
test_value_Si_Al <- 10
test_value_Fe_Mg <- 10
test_value_Ca_Na_K <- 10

# Single-sample t-test for Si_Al
t_test_Si_Al <- t.test(cation_data$Si_Al, mu = test_value_Si_Al)
print(t_test_Si_Al)

# Single-sample t-test for Fe_Mg
t_test_Fe_Mg <- t.test(cation_data$Fe_Mg, mu = test_value_Fe_Mg)
print(t_test_Fe_Mg)

# Single-sample t-test for Ca_Na_K
t_test_Ca_Na_K <- t.test(cation_data$Ca_Na_K, mu = test_value_Ca_Na_K)
print(t_test_Ca_Na_K)

```
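
`t.test()` is two-sided by default, so a directional claim such as "greater than 10" is tested more directly with the `alternative` argument. A minimal sketch on simulated values (the vector `x` is made up and does not reproduce the results reported here):

```r
# Simulated stand-in for a column such as Si_Al.
set.seed(42)
x <- rnorm(16, mean = 44, sd = 8)

two_sided <- t.test(x, mu = 10)                           # H1: mean != 10
one_sided <- t.test(x, mu = 10, alternative = "greater")  # H1: mean > 10

c(two_sided = two_sided$p.value, one_sided = one_sided$p.value)
```

When the sample mean lies on the hypothesized side, the one-sided p-value is half the two-sided one, so the choice rarely changes a conclusion as clear-cut as these but should still match the wording of the claim.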

### Kruskal-Wallis Test Results
The Kruskal-Wallis test was performed for each cation group (Si-Al, Fe-Mg, Ca-Na-K) across campaigns. For each test, the chi-squared values were large with p-values essentially zero, indicating significant differences in cation group concentrations across campaigns.
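
The chunk that produced these Kruskal-Wallis results does not appear in this notebook. A minimal sketch of such a call, using base R's `kruskal.test()` on simulated skewed data (the `sim` data frame is hypothetical), might look like:

```r
# Simulated, right-skewed stand-in data for three campaigns.
set.seed(7)
sim <- data.frame(
  campaign = factor(rep(c("Campaign 1", "Campaign 2", "Campaign 3"), each = 25)),
  Fe_Mg    = c(rexp(25, 1 / 30), rexp(25, 1 / 33), rexp(25, 1 / 45))
)

# Rank-based test of whether the groups share a common distribution.
kw <- kruskal.test(Fe_Mg ~ campaign, data = sim)

kw$statistic  # Kruskal-Wallis chi-squared
kw$parameter  # degrees of freedom = number of groups - 1
kw$p.value
```

Unlike ANOVA, this test works on ranks and does not assume normally distributed residuals, which is why it pairs naturally with Dunn's test as the post-hoc step.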

### Dunn’s Test (Post-hoc Analysis)
Since the Kruskal-Wallis test showed significant differences, Dunn’s test was applied to perform pairwise comparisons between campaigns for each cation group with Bonferroni correction:

1. **Si-Al Group**:
   - **Significant Differences**:
     - Campaign 3 vs. Campaign 1: \(p = 7.85 \times 10^{-5}\) (significant)
     - Campaign 3 vs. Campaign 2: \(p = 1.28 \times 10^{-13}\) (significant)
   - **Non-significant Difference**:
     - Campaign 1 vs. Campaign 2: \(p = 0.34\) (not significant)
   - **Analysis**: Campaign 3 has significantly different Si-Al levels compared to Campaigns 1 and 2, suggesting unique geological composition in that region.

2. **Fe-Mg Group**:
   - **Significant Differences**:
     - Campaign 2 vs. Campaign 1: \(p = 0.0004\)
     - Campaign 3 vs. Campaign 1: \(p < 0.0001\)
     - Campaign 3 vs. Campaign 2: \(p < 0.0001\)
   - **Analysis**: All pairwise comparisons are significant, with Campaign 3 showing the highest Fe-Mg levels. This points to distinct Fe-Mg enrichment in Campaign 3 samples.

3. **Ca-Na-K Group**:
   - **Significant Differences**:
     - Campaign 1 vs. Campaign 3: \(p = 0.0054\)
     - Campaign 2 vs. Campaign 3: \(p < 0.0001\)
   - **Non-significant Difference**:
     - Campaign 1 vs. Campaign 2: \(p = 0.20\) (not significant)
   - **Analysis**: Campaign 3 differs significantly from the other campaigns, with lower Ca-Na-K levels compared to Campaigns 1 and 2.
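
To make the Bonferroni correction concrete: with three campaigns there are three pairwise comparisons, so each raw p-value is multiplied by 3 (and capped at 1). The raw p-values below are hypothetical, chosen only to show the arithmetic with base R's `p.adjust()`.

```r
# Hypothetical raw p-values for the three pairwise comparisons.
raw_p <- c("1 vs 2" = 0.12, "1 vs 3" = 0.0009, "2 vs 3" = 0.00001)

# Bonferroni: multiply by the number of comparisons, cap at 1.
adj_p <- p.adjust(raw_p, method = "bonferroni")

# Which comparisons remain significant at the 0.05 level?
significant <- adj_p < 0.05
data.frame(raw_p, adj_p, significant)
```

One caveat worth checking against the printed output: as far as the `dunn.test` documentation describes it, the package reports one-sided p-values by default and rejects when p <= alpha/2, unless `altp = TRUE` is set. A borderline adjusted value just under 0.05 may therefore not count as significant under the default rule.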


**Regression**

1. **Convert Campaign to Factor**:
   - `campaign` was converted to a factor so that it is treated as a categorical variable, which is necessary for logistic regression.

2. **Binary Logistic Regression**:
   - A binary logistic regression model was run assuming `campaign` had two levels, using `Si_Al`, `Fe_Mg`, and `Ca_Na_K` as predictors.
   - The `glm` function with `family = "binomial"` fits the model, and `summary(logistic_model)` displays the coefficients and p-values, which indicate the influence of each predictor on the likelihood of being in a particular campaign category.

3. **Multinomial Logistic Regression**:
   - The `multinom` function from the `nnet` package was used to perform multinomial logistic regression, which is suitable for cases where `campaign` has more than two levels.
   - `summary(multinom_model)` shows the estimated coefficients for each predictor, indicating how `Si_Al`, `Fe_Mg`, and `Ca_Na_K` concentrations influence the probability of each campaign classification.

4. **Predict Campaigns and Probabilities**:
   - `predict(multinom_model, type = "class")` returns the most likely campaign class for each observation.
   - `predict(multinom_model, type = "probs")` returns the predicted probabilities for each campaign, showing the likelihood of each sample belonging to each campaign.
   - `head(predicted_campaigns)` and `head(predicted_probabilities)` display the first few rows of these predictions.


```{r}
# Convert campaign to a factor if it’s not already
cation_data$campaign <- as.factor(cation_data$campaign)

# Run logistic regression (binary outcome assumed; with more than two
# campaign levels, glm() models the first level against all the others)
logistic_model <- glm(campaign ~ Si_Al + Fe_Mg + Ca_Na_K, data = cation_data, family = "binomial")
summary(logistic_model)

# Install the nnet package if not already installed
# install.packages("nnet")
library(nnet)

# Run multinomial logistic regression
multinom_model <- multinom(campaign ~ Si_Al + Fe_Mg + Ca_Na_K, data = cation_data)
summary(multinom_model)


# Predict probabilities for each campaign
predicted_campaigns <- predict(multinom_model, type = "class")
predicted_probabilities <- predict(multinom_model, type = "probs")

# View the predictions
head(predicted_campaigns)
head(predicted_probabilities)

```

### Binary Logistic Regression

The binary logistic regression was run to see how `Si_Al`, `Fe_Mg`, and `Ca_Na_K` influence campaign classification (assuming a binary outcome):

- **Intercept**: Significant with a p-value of 0.0106, suggesting a baseline effect when all predictors are zero.
- **Si_Al**: Not significant (p = 0.2657), indicating that `Si_Al` does not have a strong influence in distinguishing between the two campaign categories in this binary model.
- **Fe_Mg**: Significant (p = 0.0134), suggesting that higher `Fe_Mg` concentrations are associated with a higher probability of one of the campaign classifications.
- **Ca_Na_K**: Not significant (p = 0.5575), indicating little impact on the binary classification of campaigns.

**Interpretation**: In this binary logistic model, only `Fe_Mg` shows a significant effect, which suggests it may be a key differentiator between the two assumed campaign levels.
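
With a factor response, `glm(..., family = binomial)` treats the first factor level as "failure" and every other level as "success". When `campaign` has three levels, the binary model is therefore effectively "first campaign versus the rest". The sketch below illustrates this on R's built-in `iris` data, used purely as a stand-in with a three-level factor; `not_setosa` is a helper column introduced for the illustration.

```r
# iris$Species has three levels: setosa, versicolor, virginica.
fit_factor <- glm(Species ~ Sepal.Length, data = iris, family = binomial)

# Equivalent explicit coding: first level = 0, every other level = 1.
iris2 <- transform(iris, not_setosa = as.integer(Species != "setosa"))
fit_binary <- glm(not_setosa ~ Sepal.Length, data = iris2, family = binomial)

# The two fits have the same coefficients.
same_fit <- all.equal(unname(coef(fit_factor)), unname(coef(fit_binary)))
same_fit
```

Recoding the response explicitly, as in `fit_binary`, makes it clear which contrast the coefficients and p-values describe.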

### Multinomial Logistic Regression

The multinomial logistic regression was conducted to model campaign classification as a multi-level factor:

- **Campaign 2 (vs. Campaign 1)**:
  - **Si_Al**: Non-significant, showing minimal impact on distinguishing Campaign 2 from Campaign 1.
  - **Fe_Mg**: Positive coefficient (0.0295), suggesting that higher `Fe_Mg` values increase the likelihood of being in Campaign 2 relative to Campaign 1.
  - **Ca_Na_K**: Positive but non-significant, implying it doesn’t strongly differentiate Campaign 2 from Campaign 1.

- **Campaign 3 (vs. Campaign 1)**:
  - **Si_Al**: Negative coefficient (-0.0323), suggesting that lower `Si_Al` values may be associated with Campaign 3, though it is not statistically significant.
  - **Fe_Mg**: Positive coefficient (0.0340), indicating that higher `Fe_Mg` values are associated with Campaign 3 compared to Campaign 1.
  - **Ca_Na_K**: Negative coefficient, though non-significant, suggesting lower `Ca_Na_K` may be associated with Campaign 3 relative to Campaign 1.
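
`summary()` of a `multinom` fit reports coefficients and standard errors but no p-values, so statements about significance need an extra step. One common approach, sketched here on the built-in `iris` data as a stand-in (not the campaign model), is a Wald z-test computed from those two matrices.

```r
library(nnet)

# Stand-in three-class model; trace = FALSE silences the fitting log.
fit <- multinom(Species ~ Sepal.Length, data = iris, trace = FALSE)
s <- summary(fit)

# Wald z-statistics and two-sided p-values, one row per non-reference class.
z <- s$coefficients / s$standard.errors
p <- 2 * pnorm(abs(z), lower.tail = FALSE)
round(p, 4)
```

Each row compares one class with the reference level (the first factor level), matching the "Campaign 2 vs. Campaign 1" and "Campaign 3 vs. Campaign 1" layout above.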

### Predicted Probabilities

The predicted probabilities show the likelihood of each sample belonging to each campaign based on `Si_Al`, `Fe_Mg`, and `Ca_Na_K`. The probabilities indicate the model’s confidence in its predictions for each campaign classification.
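
Predicted classes become easier to judge when tabulated against the observed campaigns. The sketch below uses short hypothetical label vectors (`actual` and `predicted` are invented, not model output) to build a confusion matrix and compare accuracy with a majority-class baseline.

```r
# Hypothetical observed and predicted campaign labels.
actual    <- factor(c("C1", "C1", "C2", "C2", "C2", "C3", "C3", "C3", "C3", "C3"))
predicted <- factor(c("C1", "C2", "C2", "C2", "C3", "C3", "C3", "C3", "C2", "C3"),
                    levels = levels(actual))

# Rows are predictions, columns are observed campaigns.
conf_mat <- table(predicted = predicted, actual = actual)

# Share of samples on the diagonal (correctly classified).
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Baseline: always predict the most common campaign.
baseline <- max(table(actual)) / length(actual)

conf_mat
c(accuracy = accuracy, baseline = baseline)
```

With the real model, `predicted` would be `predicted_campaigns` and `actual` would be `cation_data$campaign`; with unbalanced campaign sizes, the baseline comparison matters because a model can score well simply by favoring the largest campaign.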


**LIBS Data**

1. **Load and Prepare LIBS Data**:
   - Loaded the `supercam_libs_moc_loc.Rds` file and converted it into a data frame.
   - Ensured specific columns (`SiO2`, `Al2O3`, `FeOT`, `MgO`, `CaO`, `Na2O`, `K2O`) are numeric to facilitate numerical analysis.

2. **Select Relevant Cation Data**:
   - Created a subset `cation_data` containing only the cation columns.

3. **Calculate Cation Group Sums**:
   - Calculated the sums of certain cation groups:
     - `Si_Al` (SiO₂ + Al₂O₃)
     - `Fe_Mg` (FeO-T + MgO)
     - `Ca_Na_K` (CaO + Na₂O + K₂O)

4. **Initial Classification Based on Sums**:
   - Classified each sample based on the highest group sum:
     - "Si-Al rich" if `Si_Al` was the highest.
     - "Fe-Mg rich" if `Fe_Mg` was the highest.
     - "Ca-Na-K rich" if `Ca_Na_K` was the highest.
   - Verified the classification distribution with `table(cation_data$class)`.
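
One caveat for the numeric conversion step: if a column happens to be stored as a factor (whether any LIBS column is stored that way is not stated here, so this is an assumption), `as.numeric()` returns the internal level codes rather than the values. A small sketch of the difference:

```r
# A factor whose labels look like numbers.
f <- factor(c("10.5", "3.2", "10.5"))

codes  <- as.numeric(f)                # level codes, not the measurements
values <- as.numeric(as.character(f))  # the intended numeric values

codes
values
```

Character columns do not have this problem; `as.numeric()` parses them directly and returns `NA` with a warning for entries that are not numbers.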


```{r}
# Load the LIBS data
libs_data <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/supercam_libs_moc_loc.Rds")


# Ensure the loaded object is a plain data frame
libs_data <- as.data.frame(libs_data)

# Ensure all relevant columns are numeric by converting them
cols_to_convert <- c("SiO2", "Al2O3", "FeOT", "MgO", "CaO", "Na2O", "K2O")
libs_data[cols_to_convert] <- lapply(libs_data[cols_to_convert], as.numeric)

# Review the structure to ensure columns are now numeric
str(libs_data)

# Select valid columns for cation analysis
cation_data <- libs_data[, cols_to_convert]

# Inspect the first few rows to ensure data is selected correctly
head(cation_data)

# Calculate the cation group sums
cation_data$Si_Al <- cation_data$SiO2 + cation_data$Al2O3
cation_data$Fe_Mg <- cation_data$FeOT + cation_data$MgO
cation_data$Ca_Na_K <- cation_data$CaO + cation_data$Na2O + cation_data$K2O

# Classify each sample by its largest cation group sum
cation_data$class <- ifelse(cation_data$Si_Al > cation_data$Fe_Mg &
                              cation_data$Si_Al > cation_data$Ca_Na_K,
                            "Si-Al rich",
                            ifelse(cation_data$Fe_Mg > cation_data$Ca_Na_K,
                                   "Fe-Mg rich",
                                   "Ca-Na-K rich"))

# Check the classifications
table(cation_data$class)

# Create quantiles for each group (Si_Al, Fe_Mg, Ca_Na_K)
library(dplyr)
cation_data$Si_Al_rank <- ntile(cation_data$Si_Al, 3)  # Divides Si_Al into 3 quantiles
cation_data$Fe_Mg_rank <- ntile(cation_data$Fe_Mg, 3)  # Divides Fe_Mg into 3 quantiles
cation_data$Ca_Na_K_rank <- ntile(cation_data$Ca_Na_K, 3)  # Divides Ca_Na_K into 3 quantiles

# Reclassify by highest tercile rank (this overwrites the sum-based class above)
cation_data$class <- ifelse(cation_data$Si_Al_rank >=
                              cation_data$Fe_Mg_rank &
                              cation_data$Si_Al_rank >= cation_data$Ca_Na_K_rank,
                            "Si-Al rich",
                            ifelse(cation_data$Fe_Mg_rank >= cation_data$Ca_Na_K_rank,
                                   "Fe-Mg rich",
                                   "Ca-Na-K rich"))

# Check the updated classification distribution
table(cation_data$class)

# Prepare the data for ternary plot
# Ensure the three components are in proportions or standardized
cation_data$total <- cation_data$Si_Al + cation_data$Fe_Mg + cation_data$Ca_Na_K
cation_data$Si_Al_prop <- cation_data$Si_Al / cation_data$total
cation_data$Fe_Mg_prop <- cation_data$Fe_Mg / cation_data$total
cation_data$Ca_Na_K_prop <- cation_data$Ca_Na_K / cation_data$total

# Check the structure of the final data
str(cation_data)


# Create the ternary plot
ggtern(data = cation_data, aes(x = Si_Al_prop, y = Fe_Mg_prop, z = Ca_Na_K_prop, color = class)) +
  geom_point(size = 3, alpha = 0.7) +
  scale_color_manual(values =
                       c("Si-Al rich" = "blue", "Fe-Mg rich" = "green", "Ca-Na-K rich" = "orange")) +
  labs(title = "Ternary Plot of Si+Al, Fe+Mg, and Ca+Na+K for LIBS Data",
       x = "Si + Al",
       y = "Fe + Mg",
       z = "Ca + Na + K",
       color = "Rock Type") +
  theme_minimal()
```


```{r}
# Adjust the campaign ranges based on the actual sol values
cation_data$campaign <- ifelse(libs_data$sol < 100, "Campaign 1",
                               ifelse(libs_data$sol < 500, "Campaign 2", "Campaign 3"))

# Convert the 'campaign' column to a factor
cation_data$campaign <- as.factor(cation_data$campaign)

# Check the distribution of the campaign column
table(cation_data$campaign)

# Perform ANOVA for each cation group across campaigns
anova_Si_Al <- aov(Si_Al ~ campaign, data = cation_data)
anova_Fe_Mg <- aov(Fe_Mg ~ campaign, data = cation_data)
anova_Ca_Na_K <- aov(Ca_Na_K ~ campaign, data = cation_data)

# Summary of ANOVA results
summary(anova_Si_Al)
summary(anova_Fe_Mg)
summary(anova_Ca_Na_K)

# If significant, run a post-hoc Tukey test to determine where the differences lie
TukeyHSD(anova_Si_Al)
TukeyHSD(anova_Fe_Mg)
TukeyHSD(anova_Ca_Na_K)

```

### Campaign Distribution
- The data has been divided into three campaigns based on the sol values:
  - **Campaign 1**: Sol < 100
  - **Campaign 2**: 100 <= Sol < 500
  - **Campaign 3**: Sol >= 500
- Distribution of samples by campaign:
  - Campaign 1: 70 samples
  - Campaign 2: 804 samples
  - Campaign 3: 1058 samples
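
The same sol-based binning can be written with `cut()`, which states the interval boundaries in one place. This is a sketch on a few made-up sol values, using the thresholds listed above; `right = FALSE` makes each interval closed on the left, so sol 100 falls in Campaign 2 and sol 500 in Campaign 3.

```r
# Made-up sol values chosen to sit on and around the boundaries.
sol <- c(50, 99, 100, 499, 500, 900)

campaign <- cut(sol,
                breaks = c(-Inf, 100, 500, Inf),
                right  = FALSE,
                labels = c("Campaign 1", "Campaign 2", "Campaign 3"))

# Same result as the nested ifelse() used in the notebook.
nested <- ifelse(sol < 100, "Campaign 1",
                 ifelse(sol < 500, "Campaign 2", "Campaign 3"))

table(campaign)
```

`cut()` returns a factor directly, so the separate `as.factor()` step is not needed with this form.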

### ANOVA Results
ANOVA tests were conducted to determine if there are significant differences in `Si_Al`, `Fe_Mg`, and `Ca_Na_K` concentrations across the three campaigns.

1. **Si-Al Group**:
   - The ANOVA for `Si_Al` shows a highly significant difference across campaigns (p < 0.001).
   - **Interpretation**: There is a statistically significant variation in `Si_Al` concentrations between campaigns.

2. **Fe-Mg Group**:
   - The ANOVA for `Fe_Mg` is also highly significant (p < 0.001).
   - **Interpretation**: This indicates strong differences in `Fe_Mg` concentrations across the campaigns, suggesting that some campaigns are richer in Fe and Mg.

3. **Ca-Na-K Group**:
   - The ANOVA for `Ca_Na_K` is significant as well (p < 0.001).
   - **Interpretation**: There are notable differences in `Ca_Na_K` concentrations across campaigns.

### Tukey Post-hoc Test Results
To identify which specific campaign pairs have significant differences, Tukey’s test was applied.

1. **Si-Al Group**:
   - **Campaign 3 vs. Campaign 1**: Significant difference (p = 0.0000191), with Campaign 3 having lower `Si_Al` concentrations.
   - **Campaign 3 vs. Campaign 2**: Highly significant (p < 0.0001), with Campaign 3 showing lower `Si_Al` than Campaign 2.
   - **Campaign 2 vs. Campaign 1**: No significant difference.
   - **Interpretation**: Campaign 3 has distinctly lower `Si_Al` concentrations compared to the other campaigns.

2. **Fe-Mg Group**:
   - **Campaign 2 vs. Campaign 1**: Significant (p = 0.0017), with Campaign 2 having higher `Fe_Mg` concentrations.
   - **Campaign 3 vs. Campaign 1**: Highly significant (p < 0.0001), with Campaign 3 showing much higher `Fe_Mg` than Campaign 1.
   - **Campaign 3 vs. Campaign 2**: Highly significant (p < 0.0001), with Campaign 3 also having higher `Fe_Mg` than Campaign 2.
   - **Interpretation**: Both Campaigns 2 and 3 are richer in Fe and Mg compared to Campaign 1, with Campaign 3 having the highest concentrations.

3. **Ca-Na-K Group**:
   - **Campaign 3 vs. Campaign 1**: Significant (p = 0.0001), with Campaign 3 having lower `Ca_Na_K` concentrations.
   - **Campaign 3 vs. Campaign 2**: Highly significant (p < 0.0001), with Campaign 3 showing lower `Ca_Na_K` than Campaign 2.
   - **Campaign 2 vs. Campaign 1**: No significant difference.
   - **Interpretation**: Campaign 3 has lower `Ca_Na_K` concentrations compared to Campaigns 1 and 2, which do not differ significantly from each other.


```{r}
# Load necessary library
library(dplyr)

# Calculate the count and proportion for each campaign and composition class
campaign_composition_summary <- cation_data %>%
  group_by(campaign, class) %>%
  summarize(count = n(), .groups = 'drop') %>%
  group_by(campaign) %>%
  mutate(proportion = count / sum(count)) %>%
  ungroup()

# Display the results
print(campaign_composition_summary)


```


```{r}

# ggplot2 is needed for the plots below (harmless if it is already attached)
library(ggplot2)

# Combined density plot with facets for each cation group
cation_data_long <- cation_data %>%
  tidyr::pivot_longer(cols = c(Si_Al, Fe_Mg, Ca_Na_K),
                      names_to = "cation_group",
                      values_to = "concentration")

ggplot(cation_data_long, aes(x = concentration, fill = campaign)) +
  geom_density(alpha = 0.4) +
  facet_wrap(~ cation_group, scales = "free") +
  labs(title = "Density Plots of Cation Groups Across Campaigns",
       x = "Concentration",
       y = "Density",
       fill = "Campaign") +
  theme_minimal()


# Box plot for Si_Al by campaign
ggplot(cation_data, aes(x = campaign, y = Si_Al, fill = campaign)) +
  geom_boxplot() +
  labs(title = "Box Plot of Si_Al Across Campaigns",
       x = "Campaign",
       y = "Si + Al Concentration") +
  theme_minimal()

# Box plot for Fe_Mg by campaign
ggplot(cation_data, aes(x = campaign, y = Fe_Mg, fill = campaign)) +
  geom_boxplot() +
  labs(title = "Box Plot of Fe_Mg Across Campaigns",
       x = "Campaign",
       y = "Fe + Mg Concentration") +
  theme_minimal()

# Box plot for Ca_Na_K by campaign
ggplot(cation_data, aes(x = campaign, y = Ca_Na_K, fill = campaign)) +
  geom_boxplot() +
  labs(title = "Box Plot of Ca_Na_K Across Campaigns",
       x = "Campaign",
       y = "Ca + Na + K Concentration") +
  theme_minimal()

```

### Density Plots for Cation Groups Across Campaigns

1. **Si_Al**:
   - Campaign 1 (red) shows a distinct peak at a slightly higher concentration compared to Campaigns 2 and 3, indicating higher `Si_Al` concentrations.
   - Campaigns 2 (green) and 3 (blue) have similar peak densities, but Campaign 3 has a broader distribution, suggesting more variation in `Si_Al` concentrations within that campaign.

2. **Fe_Mg**:
   - Campaign 1 has lower `Fe_Mg` concentrations, as shown by its peak at a lower concentration range.
   - Campaign 3 shows a shift towards higher concentrations with a broad distribution, while Campaign 2 lies between Campaigns 1 and 3.
   - This aligns with earlier findings that Campaign 3 is richer in `Fe_Mg`.

3. **Ca_Na_K**:
   - Campaigns 1 and 2 have similar distributions for `Ca_Na_K`, both peaking at low concentration values.
   - Campaign 3's peak sits slightly below those of Campaigns 1 and 2, indicating lower `Ca_Na_K` concentrations in Campaign 3.

### Box Plots for Each Cation Group Across Campaigns
1. **Si_Al Box Plot**:
   - Campaign 1 has a higher median `Si_Al` concentration than Campaigns 2 and 3, with a slightly wider interquartile range (IQR).
   - Campaign 3 has the lowest median `Si_Al` concentration, with more outliers below the median, suggesting a distinct trend toward lower `Si_Al` values in that campaign.

2. **Fe_Mg Box Plot**:
   - There is a noticeable increase in median `Fe_Mg` concentration from Campaign 1 to Campaign 3.
   - Campaign 3 has a higher median and a wider IQR, indicating greater variation and a tendency toward higher `Fe_Mg` values, consistent with its Fe-Mg richness.

3. **Ca_Na_K Box Plot**:
   - Campaign 1 and Campaign 2 have similar medians, but Campaign 3 shows a lower median and a slight downward shift in values.
   - Campaign 3 has fewer high-concentration outliers, indicating a more consistent trend toward lower `Ca_Na_K` concentrations in that campaign.
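
The medians and interquartile ranges read off the box plots can also be computed directly, which makes the comparisons above easier to verify. This is a minimal base-R sketch under the assumption that `cation_data` has the columns used in the plots; `summarise_by_campaign` is a hypothetical helper name.

```r
# Hypothetical helper (an assumption): median and IQR of each cation group
# within each campaign, returned in long format.
summarise_by_campaign <- function(data, columns, group = "campaign") {
  rows <- lapply(columns, function(col) {
    # Split the column's values into one vector per campaign
    by_group <- split(data[[col]], data[[group]])
    data.frame(
      cation_group = col,
      campaign = names(by_group),
      median = unname(vapply(by_group, median, numeric(1), na.rm = TRUE)),
      iqr = unname(vapply(by_group, IQR, numeric(1), na.rm = TRUE)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Assumes `cation_data` from earlier chunks is in the workspace
if (exists("cation_data")) {
  print(summarise_by_campaign(cation_data, c("Si_Al", "Fe_Mg", "Ca_Na_K")))
}
```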


```{r}
# Run a two-sided Mann-Whitney (Wilcoxon rank-sum) test comparing one column
# between two campaigns, and return the key results as a list
perform_mann_whitney <- function(campaign1, campaign2, data, column) {
  data1 <- subset(data, campaign == campaign1)[[column]]
  data2 <- subset(data, campaign == campaign2)[[column]]
  test_result <- wilcox.test(data1, data2)
  return(list(
    campaign1 = campaign1,
    campaign2 = campaign2,
    column = column,
    p_value = test_result$p.value,
    # unname() drops the "W" label so the combined data frame gets clean row names
    statistic = unname(test_result$statistic)
  ))
}

# Define campaigns and columns
campaigns <- unique(cation_data$campaign)
columns <- c("Si_Al", "Fe_Mg", "Ca_Na_K")

# Initialize list to store results
results <- list()

# Loop through each combination of campaigns and each column
for (col in columns) {
  for (i in 1:(length(campaigns) - 1)) {
    for (j in (i + 1):length(campaigns)) {
      result <- perform_mann_whitney(campaigns[i], campaigns[j], cation_data, col)
      results <- append(results, list(result))
    }
  }
}

# Convert results to a data frame for easy viewing
results_df <- do.call(rbind, lapply(results, as.data.frame))

# Display the results
print(results_df)


```

### Si_Al Comparison
- **Campaign 1 vs. Campaign 2**: p-value = 0.1473 (not significant)
  - No significant difference in `Si_Al` concentrations between Campaigns 1 and 2.
- **Campaign 1 vs. Campaign 3**: p-value = 0.0001526 (significant)
  - Significant difference, indicating that `Si_Al` concentrations differ between Campaigns 1 and 3.
- **Campaign 2 vs. Campaign 3**: p-value = 6.1659e-14 (highly significant)
  - Strongly significant difference, suggesting that `Si_Al` levels are distinct between Campaigns 2 and 3.

### Fe_Mg Comparison
- **Campaign 1 vs. Campaign 2**: p-value = 4.9397e-05 (significant)
  - Significant difference, with Campaign 2 having different `Fe_Mg` concentrations compared to Campaign 1.
- **Campaign 1 vs. Campaign 3**: p-value = 7.5487e-13 (highly significant)
  - Strongly significant difference, indicating substantial differences in `Fe_Mg` concentrations between Campaigns 1 and 3.
- **Campaign 2 vs. Campaign 3**: p-value = 3.9051e-24 (highly significant)
  - Very strong significance, suggesting that `Fe_Mg` concentrations differ considerably between Campaigns 2 and 3.

### Ca_Na_K Comparison
- **Campaign 1 vs. Campaign 2**: p-value = 0.0037212 (significant)
  - Significant difference, showing that `Ca_Na_K` concentrations between Campaigns 1 and 2 are different.
- **Campaign 1 vs. Campaign 3**: p-value = 3.3217e-13 (highly significant)
  - Strong significance, indicating distinct `Ca_Na_K` levels between Campaigns 1 and 3.
- **Campaign 2 vs. Campaign 3**: p-value = 6.3642e-29 (highly significant)
  - Very strong significance, indicating that `Ca_Na_K` concentrations differ between Campaigns 2 and 3.
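
The p-values above come from separate pairwise tests and are not adjusted for multiple comparisons. As a hedged sketch, base R's `pairwise.wilcox.test()` can run the same pairwise Mann-Whitney tests with an adjustment applied. The wrapper name `adjusted_pairwise_wilcox` and the choice of the Holm method are assumptions made here for illustration, and the adjustment is applied within each cation group, not across all nine tests.

```r
# Hypothetical wrapper (an assumption): pairwise Mann-Whitney tests between
# campaigns for one column, with p-values adjusted for multiple comparisons.
adjusted_pairwise_wilcox <- function(data, column, group = "campaign",
                                     method = "holm") {
  res <- pairwise.wilcox.test(data[[column]], data[[group]],
                              p.adjust.method = method,
                              exact = FALSE)  # normal approximation, tolerates ties
  # Lower-triangular matrix of adjusted p-values (rows vs. columns are campaigns)
  res$p.value
}

# Assumes `cation_data` from earlier chunks is in the workspace
if (exists("cation_data")) {
  for (col in c("Si_Al", "Fe_Mg", "Ca_Na_K")) {
    cat("\nAdjusted p-values for", col, "\n")
    print(adjusted_pairwise_wilcox(cation_data, col))
  }
}
```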


```{r}
# Install dunn.test package if not already installed
# install.packages("dunn.test")
library(dunn.test)

# Perform Dunn's test for Si_Al across campaigns
dunn_test_Si_Al <- dunn.test(cation_data$Si_Al, cation_data$campaign, method = "bonferroni")
print(dunn_test_Si_Al)

# Perform Dunn's test for Fe_Mg across campaigns
dunn_test_Fe_Mg <- dunn.test(cation_data$Fe_Mg, cation_data$campaign, method = "bonferroni")
print(dunn_test_Fe_Mg)

# Perform Dunn's test for Ca_Na_K across campaigns
dunn_test_Ca_Na_K <- dunn.test(cation_data$Ca_Na_K, cation_data$campaign, method = "bonferroni")
print(dunn_test_Ca_Na_K)

```

### Dunn's Test Results

#### 1. **Si_Al Group**
   - **Campaign 1 vs. Campaign 2**: Not significant (p-adjusted = 0.3425).
   - **Campaign 1 vs. Campaign 3**: Significant (p-adjusted = 0.0000785).
   - **Campaign 2 vs. Campaign 3**: Highly significant (p-adjusted = 0.000000128).
   - **Interpretation**: There is a significant difference in `Si_Al` concentrations between Campaigns 1 & 3 and Campaigns 2 & 3, but not between Campaigns 1 & 2. This aligns with previous findings, suggesting that `Si_Al` levels in Campaign 3 are distinct from the other two campaigns.

#### 2. **Fe_Mg Group**
   - **Campaign 1 vs. Campaign 2**: Significant (p-adjusted = 0.0004).
   - **Campaign 1 vs. Campaign 3**: Highly significant (p-adjusted < 0.0001).
   - **Campaign 2 vs. Campaign 3**: Highly significant (p-adjusted < 0.0001).
   - **Interpretation**: All comparisons are significant, indicating that `Fe_Mg` concentrations are distinct across each campaign. This suggests that each campaign area has unique `Fe_Mg` levels, with Campaign 3 having particularly high concentrations, as observed previously.

#### 3. **Ca_Na_K Group**
   - **Campaign 1 vs. Campaign 2**: Significant (p-adjusted = 0.0054).
   - **Campaign 1 vs. Campaign 3**: Highly significant (p-adjusted < 0.0001).
   - **Campaign 2 vs. Campaign 3**: Highly significant (p-adjusted < 0.0001).
   - **Interpretation**: There are significant differences in `Ca_Na_K` concentrations across all campaign pairs. Campaign 3 shows lower `Ca_Na_K` concentrations compared to Campaigns 1 and 2, making it distinct.
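
Dunn's test is usually read as a follow-up to a Kruskal-Wallis omnibus test, which asks whether any campaign differs before looking at individual pairs. The sketch below shows one way to run that omnibus test for all three cation groups with base R's `kruskal.test()`. The helper name `kruskal_by_column` is hypothetical, and the notebook's earlier chunks may already cover this step.

```r
# Hypothetical helper (an assumption): Kruskal-Wallis test of each column
# against the campaign grouping, collected into one data frame.
kruskal_by_column <- function(data, columns, group = "campaign") {
  g <- as.factor(data[[group]])
  tests <- lapply(columns, function(col) kruskal.test(data[[col]], g))
  data.frame(
    cation_group = columns,
    chi_squared = vapply(tests, function(t) unname(t$statistic), numeric(1)),
    df = vapply(tests, function(t) unname(t$parameter), numeric(1)),
    p_value = vapply(tests, function(t) t$p.value, numeric(1)),
    stringsAsFactors = FALSE
  )
}

# Assumes `cation_data` from earlier chunks is in the workspace
if (exists("cation_data")) {
  print(kruskal_by_column(cation_data, c("Si_Al", "Fe_Mg", "Ca_Na_K")))
}
```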


```{r}
# Specify the hypothetical mean for comparison
test_value <- 10

# Single-sample t-test for Si_Al
t_test_Si_Al <- t.test(cation_data$Si_Al, mu = test_value)
print(t_test_Si_Al)

# Single-sample t-test for Fe_Mg
t_test_Fe_Mg <- t.test(cation_data$Fe_Mg, mu = test_value)
print(t_test_Fe_Mg)

# Single-sample t-test for Ca_Na_K
t_test_Ca_Na_K <- t.test(cation_data$Ca_Na_K, mu = test_value)
print(t_test_Ca_Na_K)


```

### One-Sample t-Test Results

1. **Si_Al**
   - **t-value**: 128.08
   - **Degrees of Freedom (df)**: 1931
   - **p-value**: < 2.2e-16 (highly significant)
   - **95% Confidence Interval**: [49.10, 50.32]
   - **Mean of `Si_Al`**: 49.71
   - **Interpretation**: The mean `Si_Al` concentration (49.71) is far above the hypothetical mean of 10, and the difference is highly significant (p < 2.2e-16).

2. **Fe_Mg**
   - **t-value**: 64.92
   - **Degrees of Freedom (df)**: 1931
   - **p-value**: < 2.2e-16 (highly significant)
   - **95% Confidence Interval**: [35.74, 37.35]
   - **Mean of `Fe_Mg`**: 36.55
   - **Interpretation**: The mean `Fe_Mg` concentration (36.55) is also well above the hypothetical mean of 10, and the difference is highly significant (p < 2.2e-16).

3. **Ca_Na_K**
   - **t-value**: -20.63
   - **Degrees of Freedom (df)**: 1931
   - **p-value**: < 2.2e-16 (highly significant)
   - **95% Confidence Interval**: [6.80, 7.35]
   - **Mean of `Ca_Na_K`**: 7.08
   - **Interpretation**: The mean `Ca_Na_K` concentration (7.08) is significantly lower than the hypothetical mean of 10. The negative t-value shows that the sample mean lies below the test value, and the p-value (< 2.2e-16) shows that the difference is highly significant.
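
The quantities reported above follow from the standard one-sample formula: the t-value is the difference between the sample mean and the test value, divided by the standard error `sd(x) / sqrt(n)`, with `n - 1` degrees of freedom. The sketch below reproduces that calculation by hand so it can be compared with the `t.test()` output; `one_sample_t` is a hypothetical helper name.

```r
# Hypothetical helper (an assumption): one-sample, two-sided t-test computed
# from the textbook formula, with a 95% confidence interval for the mean.
one_sample_t <- function(x, mu) {
  x <- x[!is.na(x)]
  n <- length(x)
  se <- sd(x) / sqrt(n)             # standard error of the mean
  t_stat <- (mean(x) - mu) / se     # distance from mu in standard errors
  df <- n - 1
  list(
    t = t_stat,
    df = df,
    p_value = 2 * pt(-abs(t_stat), df),
    conf_int = mean(x) + c(-1, 1) * qt(0.975, df) * se
  )
}

# Assumes `cation_data` from earlier chunks is in the workspace
if (exists("cation_data")) {
  str(one_sample_t(cation_data$Ca_Na_K, mu = 10))
}
```

Because the standard error shrinks with `sqrt(n)`, a sample with 1931 degrees of freedom produces very large t-values even for moderate differences from the test value.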


```{r}
# Check unique values in the campaign variable
unique(cation_data$campaign)

# Convert campaign to a factor if not already
cation_data$campaign <- as.factor(cation_data$campaign)

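# Run binary logistic regression.
# Note: when the response is a factor with more than two levels, the binomial
# family treats the first level as "failure" and all other levels as "success",
# so this fits the first campaign level against all other campaigns combined.
logistic_model <- glm(campaign ~ Si_Al + Fe_Mg + Ca_Na_K, data = cation_data, family = "binomial")
summary(logistic_model)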

# install.packages("nnet")
library(nnet)


# Multinomial logistic regression
multinom_model <- multinom(campaign ~ Si_Al + Fe_Mg + Ca_Na_K, data = cation_data)
summary(multinom_model)


# Predict campaign for the existing data (useful for evaluating the model)
predicted_campaigns <- predict(multinom_model, type = "class")
head(predicted_campaigns)

# If you want probabilities for each campaign
predicted_probabilities <- predict(multinom_model, type = "probs")
head(predicted_probabilities)


# Calculate in-sample accuracy (predictions are made on the data used for fitting)
mean(predicted_campaigns == cation_data$campaign)


```

### Binary Logistic Regression (glm)

Since the `campaign` variable has three levels (Campaign 1, Campaign 2, Campaign 3), binary logistic regression is not well suited to this outcome. When `glm()` with `family = "binomial"` is given a factor response with more than two levels, R treats the first level as failure and all remaining levels as success, so this model effectively compares the first factor level (presumably Campaign 1) with the other two campaigns combined. With that caveat, the model can be read as follows:

- **Intercept**: The intercept is positive and significant (3.24192, p = 0.0106). It sets the baseline log-odds when all three predictors are zero.
- **Si_Al**: The coefficient for `Si_Al` is negative (-0.01566) but not statistically significant (p = 0.2657), suggesting that `Si_Al` concentration doesn’t strongly predict the campaign in a binary logistic context.
- **Fe_Mg**: The coefficient for `Fe_Mg` is positive (0.03231) and statistically significant (p = 0.0134), indicating that higher `Fe_Mg` concentrations are associated with higher odds of belonging to a campaign other than the first level.
- **Ca_Na_K**: The coefficient for `Ca_Na_K` is negative (-0.01654) and not significant (p = 0.5575), indicating that it may not strongly predict the campaign in a binary setup.

The model’s AIC (576.76) and residual deviance (568.76) indicate the model’s fit but might not be fully informative given the limitations of using binary logistic regression for a three-level outcome.
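
One way to make a binary comparison explicit, instead of relying on how `glm()` collapses a three-level factor, is to build the two-level outcome by hand. The sketch below assumes the `cation_data` columns used above; `fit_one_vs_rest` and `is_target` are hypothetical names introduced here for illustration, and the level label `"1"` in the usage line is a guess at how Campaign 1 is coded.

```r
# Hypothetical helper (an assumption): logistic regression for one campaign
# versus all other campaigns, using an explicit 0/1 outcome.
fit_one_vs_rest <- function(data, level,
                            predictors = c("Si_Al", "Fe_Mg", "Ca_Na_K"),
                            group = "campaign") {
  # 1 if the sample belongs to `level`, 0 otherwise
  data$is_target <- as.integer(data[[group]] == level)
  glm(reformulate(predictors, response = "is_target"),
      data = data, family = binomial)
}

# Assumes `cation_data` from earlier chunks and that "1" is a campaign label
if (exists("cation_data") && "1" %in% as.character(cation_data$campaign)) {
  print(summary(fit_one_vs_rest(cation_data, "1")))
}
```

Fitting the first factor level this way models the probability of that level, so its coefficients are the negatives of those from a binomial `glm()` on the factor itself, which models the probability of any other level.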

### Multinomial Logistic Regression (nnet::multinom)

The multinomial logistic regression is more appropriate for this dataset, as it allows for multiple outcome levels (Campaign 1, Campaign 2, Campaign 3).

#### Model Interpretation

- **Intercepts**:
  - Campaign 2’s intercept (1.491249) and Campaign 3’s intercept (3.646404) are both positive. They give the baseline log-odds of each campaign relative to Campaign 1 (the reference level) when all three predictors are zero.
- **Si_Al**:
  - The coefficient for `Si_Al` is essentially zero for Campaign 2 (0.00024) and negative for Campaign 3 (-0.03232), both with small standard errors. Higher `Si_Al` therefore does little to separate Campaign 2 from Campaign 1 but lowers the odds of Campaign 3 relative to Campaign 1, consistent with the lower `Si_Al` values seen in Campaign 3.
- **Fe_Mg**:
  - The coefficient for `Fe_Mg` is positive for both Campaign 2 (0.02947) and Campaign 3 (0.03396), indicating that higher `Fe_Mg` values increase the odds of a sample belonging to Campaign 2 or Campaign 3 rather than Campaign 1.
- **Ca_Na_K**:
  - The coefficient for `Ca_Na_K` is positive for Campaign 2 (0.01098) but negative for Campaign 3 (-0.04825). Relative to Campaign 1, higher `Ca_Na_K` values slightly raise the odds of Campaign 2 and lower the odds of Campaign 3.

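The model’s AIC (3000.732) provides a measure of fit, but it is only meaningful relative to other models fitted to the same three-level response. It cannot be compared with the AIC of the binary model above (576.76), because that model was fitted to a different, two-level outcome.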
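
Multinomial coefficients are on the log-odds scale, and `summary()` of a `multinom` fit reports standard errors but no p-values. As a hedged sketch, the coefficients can be exponentiated into odds ratios and paired with approximate Wald z-tests. `wald_table` is a hypothetical helper name, and the Wald approximation is an assumption that may be rough for this model.

```r
# Hypothetical helper (an assumption): odds ratios and two-sided Wald
# p-values from matrices of coefficients and standard errors.
wald_table <- function(coefs, ses) {
  z <- coefs / ses
  list(
    odds_ratio = exp(coefs),          # multiplicative change in odds per unit
    z = z,
    p_value = 2 * pnorm(-abs(z))      # normal approximation
  )
}

# Assumes `multinom_model` from the chunk above is in the workspace
if (exists("multinom_model")) {
  s <- summary(multinom_model)
  print(wald_table(s$coefficients, s$standard.errors))
}
```

Each row of the result corresponds to one non-reference campaign, so every odds ratio is read relative to Campaign 1.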

### Predicted Campaigns and Accuracy

- **Predicted Campaigns**: The `predicted_campaigns` variable shows the campaign classifications based on the multinomial logistic model.
- **Predicted Probabilities**: The `predicted_probabilities` variable gives the probability of each campaign for each sample, indicating the confidence of predictions.
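- **Accuracy**: The calculated accuracy of 60.97% is measured on the same data used to fit the model, so it is likely an optimistic estimate. It suggests moderate predictive power at best, and it should be judged against the share of the largest campaign, which is the accuracy obtained by always predicting that campaign. The model’s predictors (`Si_Al`, `Fe_Mg`, and `Ca_Na_K`) therefore only partially explain the differences between campaigns, and there may be other influencing factors or non-linear relationships.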
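
A single accuracy figure hides which campaigns are confused with each other. The sketch below builds a confusion matrix and compares accuracy with a majority-class baseline. `classification_summary` is a hypothetical helper name, and the usage lines assume the `predicted_campaigns` and `cation_data` objects created above.

```r
# Hypothetical helper (an assumption): confusion matrix, accuracy, and the
# accuracy of always predicting the most common class.
classification_summary <- function(predicted, actual) {
  actual <- as.factor(actual)
  # Use the same levels for both so the table is square
  predicted <- factor(predicted, levels = levels(actual))
  confusion <- table(predicted = predicted, actual = actual)
  list(
    confusion = confusion,
    accuracy = sum(diag(confusion)) / sum(confusion),
    baseline = max(prop.table(table(actual)))
  )
}

# Assumes objects from the chunk above are in the workspace
if (exists("predicted_campaigns") && exists("cation_data")) {
  print(classification_summary(predicted_campaigns, cation_data$campaign))
}
```

If the accuracy is close to the baseline, the model adds little beyond predicting the largest campaign for every sample.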