finished final notebook #205

Merged
merged 1 commit into from Dec 16, 2024
976 changes: 976 additions & 0 deletions StudentNotebooks/Assignment05/wangx53-assignment5.Rmd

Large diffs are not rendered by default.

Binary file not shown.
@@ -0,0 +1,159 @@
---
title: "Data Analytics Research Individual Final Project Report"
author: "Evangeline Wang"
date: "Fall 2024"
output:
  pdf_document:
    toc: yes
    toc_depth: '3'
  html_notebook: default
  html_document:
    toc: yes
    toc_depth: 3
    toc_float: yes
    number_sections: yes
    theme: united
---

# DAR Project and Group Members

* Project name: MARS
* Project team members:
    - Xuanting Wang (Primary Contributor)

# 0.0 Preliminaries

This report presents an analysis of Martian sample composition patterns using PIXL and LIBS data from the Perseverance rover. Required R packages include:

* `ggplot2`
* `tidyverse`
* `ggtern`
* Additional packages are installed and loaded as necessary.

```{r, include=FALSE}
# Install required packages if not already installed
packages <- c("ggplot2", "tidyverse", "dplyr", "ggtern")
for (pkg in packages) {
  # require() returns FALSE when the package is missing, which triggers the install
  if (!require(pkg, character.only = TRUE)) {
    install.packages(pkg, dependencies = TRUE)
    library(pkg, character.only = TRUE)
  }
}
```

# 1.0 Project Introduction

This project investigates the chemical composition of Martian samples across campaigns using PIXL and LIBS datasets. Key objectives include:

- Identifying patterns in cation group compositions (Si-Al, Fe-Mg, Ca-Na-K).
- Assessing variations across campaigns using statistical analysis.
- Comparing insights derived from PIXL and LIBS data.

The analysis used ANOVA with post-hoc tests to compare campaigns, and logistic regression to classify campaigns from cation group compositions.

# 2.0 Organization of Report

This report is organized as follows:

- **Section 3.0:** PIXL Data Analysis – Findings and visualizations.
- **Section 4.0:** LIBS Data Analysis – Results and comparisons.
- **Section 5.0:** Conclusions, limitations, and future directions.
- **Section 6.0:** Appendix – Supplementary materials.

# 3.0 PIXL Data Analysis

## 3.1 Data and Methods

PIXL datasets were processed to calculate the cation group sums:

- **Si-Al:** Sum of $\mathrm{SiO_2}$ and $\mathrm{Al_2O_3}$.
- **Fe-Mg:** Sum of FeO-T (total iron expressed as FeO) and $\mathrm{MgO}$.
- **Ca-Na-K:** Sum of $\mathrm{CaO}$, $\mathrm{Na_2O}$, and $\mathrm{K_2O}$.
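
As a minimal sketch of these group sums, together with the largest-proportion labeling used for classification in this section, the base-R snippet below works on three made-up samples. The data frame `oxides`, its column names (for example `FeO_T`), and all values are assumptions for illustration; they are not taken from the PIXL files.

```r
# Hypothetical oxide weight percentages for three samples (not mission data).
# Column names are assumptions; the real PIXL export may label them differently.
oxides <- data.frame(
  sample = c("S1", "S2", "S3"),
  SiO2   = c(45.0, 30.0, 20.0),
  Al2O3  = c(10.0,  5.0,  4.0),
  FeO_T  = c(15.0, 35.0, 10.0),
  MgO    = c( 8.0, 20.0,  6.0),
  CaO    = c( 7.0,  4.0, 30.0),
  Na2O   = c( 3.0,  1.0,  8.0),
  K2O    = c( 1.0,  0.5,  4.0)
)

# Cation group sums as defined in the list above
groups <- data.frame(
  sample  = oxides$sample,
  Si_Al   = oxides$SiO2 + oxides$Al2O3,
  Fe_Mg   = oxides$FeO_T + oxides$MgO,
  Ca_Na_K = oxides$CaO + oxides$Na2O + oxides$K2O
)

# Normalize the three sums so each sample's proportions add up to 1
group_cols <- c("Si_Al", "Fe_Mg", "Ca_Na_K")
sums  <- as.matrix(groups[, group_cols])
props <- sums / rowSums(sums)

# Label each sample by its largest cation group
groups$class <- group_cols[max.col(props, ties.method = "first")]
print(groups)
```

Normalizing before taking the maximum does not change which group wins, but it yields the proportions that a ternary plot needs.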

Samples were classified based on the largest cation group proportion. Statistical methods included:

- ANOVA for campaign-based differences.
- Dunn’s post-hoc tests for pairwise comparisons.
- Logistic regression for campaign classification.
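
The campaign comparison could be sketched as follows, using synthetic data and base R only. The campaign label "Campaign C", the group means, and the sample sizes are invented for illustration. Dunn's test is conventionally run after a Kruskal-Wallis test and requires an add-on package (such as `dunn.test` or `FSA`), so this sketch substitutes `pairwise.wilcox.test` as a base-R stand-in for the pairwise step; it is not the same procedure.

```r
set.seed(42)  # reproducible synthetic data, not mission measurements

# Hypothetical Si-Al sums for three campaigns ("Campaign C" is a placeholder)
campaigns <- c("Crater Floor", "Delta Front", "Campaign C")
df <- data.frame(
  campaign = factor(rep(campaigns, each = 8)),
  Si_Al    = c(rnorm(8, 55, 4), rnorm(8, 45, 4), rnorm(8, 50, 4))
)

# One-way ANOVA: does mean Si-Al differ across campaigns?
fit     <- aov(Si_Al ~ campaign, data = df)
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]

# Kruskal-Wallis: the rank-based omnibus test that Dunn's test normally follows
kw <- kruskal.test(Si_Al ~ campaign, data = df)

# Stand-in for the pairwise step: Wilcoxon rank-sum tests with Holm adjustment
pairwise <- pairwise.wilcox.test(df$Si_Al, df$campaign,
                                 p.adjust.method = "holm")

print(anova_p)
print(kw$p.value)
print(pairwise$p.value)
```

The same three calls would be repeated for the Fe-Mg and Ca-Na-K sums.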

## 3.2 Findings

1. **Classification Results:**
    - **Si-Al rich:** Majority of samples (11).
    - **Fe-Mg rich:** Fewer samples (5).
    - **Ca-Na-K rich:** Minimal samples.

2. **Statistical Results:**
    - ANOVA indicated significant differences in Si-Al (p = 0.0014) and Ca-Na-K (p = 0.0136) across campaigns.
    - Fe-Mg fell short of the conventional 0.05 threshold and was only marginally significant (p = 0.0791).

3. **Post-hoc Test Results:**
    - Significant differences in Si-Al and Ca-Na-K between Crater Floor and Delta Front.

4. **Logistic Regression:**
    - Limited predictive power for campaign classification using cation compositions.
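
To illustrate how predictive power of this kind might be checked, the sketch below fits a two-campaign logistic regression on synthetic data and reports in-sample accuracy. Restricting the problem to two campaigns, the group means, and the 0.5 cutoff are all simplifying assumptions; the report's actual model specification is not shown here, and a model covering more than two campaigns would need a multinomial fit (for example `nnet::multinom`).

```r
set.seed(1)  # synthetic data for illustration only

n <- 40
df <- data.frame(
  campaign = factor(rep(c("Crater Floor", "Delta Front"), each = n / 2)),
  Si_Al    = c(rnorm(n / 2, 52, 6), rnorm(n / 2, 48, 6)),
  Fe_Mg    = c(rnorm(n / 2, 28, 6), rnorm(n / 2, 31, 6))
)

# Binomial logistic regression. With a factor response, glm() treats the
# first level ("Crater Floor") as 0 and the second ("Delta Front") as 1.
fit <- glm(campaign ~ Si_Al + Fe_Mg, data = df, family = binomial)

# Predicted probability of "Delta Front", classified at a 0.5 cutoff
prob <- predict(fit, type = "response")
pred <- ifelse(prob > 0.5, "Delta Front", "Crater Floor")

# In-sample accuracy (optimistic; held-out data gives a fairer estimate)
accuracy <- mean(pred == df$campaign)
print(accuracy)
```

An accuracy close to the share of the largest class would indicate that the cation sums add little beyond guessing the most common campaign.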

## 3.3 Visualizations

- **Ternary Plot:** Proportional distribution of cation groups.
- **Density Plots:** Distribution patterns of Si-Al, Fe-Mg, and Ca-Na-K.
- **Box Plots:** Campaign-specific variations in cation concentrations.
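
For readers unfamiliar with ternary plots, the projection behind them can be sketched in base R: each sample's three proportions are mapped to a point inside an equilateral triangle. The function name `ternary_xy` and the vertex placement (Si-Al at the top, Fe-Mg bottom left, Ca-Na-K bottom right) are assumptions for illustration; packages such as `ggtern` handle this projection internally.

```r
# Map three-part compositions onto 2D coordinates inside a unit triangle.
# Vertices: Si-Al at (0.5, sqrt(3)/2), Fe-Mg at (0, 0), Ca-Na-K at (1, 0).
ternary_xy <- function(si_al, fe_mg, ca_na_k) {
  total   <- si_al + fe_mg + ca_na_k
  p_top   <- si_al / total    # share pulling toward the top vertex
  p_right <- ca_na_k / total  # share pulling toward the bottom-right vertex
  # The Fe-Mg share is implied (1 - p_top - p_right) and sits at the origin
  data.frame(x = p_right + p_top / 2, y = p_top * sqrt(3) / 2)
}

# Pure end members land on the vertices; an even mix lands on the centroid
xy <- ternary_xy(
  si_al   = c(1, 0, 0, 1),
  fe_mg   = c(0, 1, 0, 1),
  ca_na_k = c(0, 0, 1, 1)
)
print(xy)
```

Because the inputs are divided by their total, raw group sums can be passed directly without normalizing first.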

# 4.0 LIBS Data Analysis

## 4.1 Data and Methods

LIBS data followed the same processing pipeline as PIXL. The analysis included:

- Campaign-based classification.
- Statistical tests (ANOVA and Dunn’s test).
- Comparisons between LIBS and PIXL results.

## 4.2 Findings

1. **Classification Results:**
    - The split among Si-Al rich (majority), Fe-Mg rich, and Ca-Na-K rich samples mirrored the PIXL trends.

2. **Statistical Results:**
    - Significant variations were observed in all cation groups (p < 0.0001) across campaigns.

3. **Post-hoc Test Results:**
    - Clear differences between Campaign 3 and the other campaigns.

4. **Logistic Regression:**
    - Fe-Mg showed some predictive strength in distinguishing campaigns.

## 4.3 Visualizations

- **Ternary Plot:** Similar trends as PIXL.
- **Box Plots:** Campaign-specific distributions.

# 5.0 Conclusions, Limitations, and Future Work

## 5.1 Conclusions

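- Both datasets were dominated by Si-Al rich samples. In LIBS, Campaign 3 stood apart from the other campaigns; in PIXL, Crater Floor and Delta Front differed in Si-Al and Ca-Na-K.
- Si-Al and Ca-Na-K differed significantly across campaigns in both datasets. Fe-Mg differed significantly in LIBS (p < 0.0001) but only marginally in PIXL (p = 0.0791).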

## 5.2 Limitations

- Limited predictive power in logistic regression models.
- Variability in sample sizes may affect statistical robustness.

## 5.3 Recommendations

- Incorporate additional datasets (e.g., SHERLOC) for broader insights.
- Explore machine learning models for improved classification accuracy.

# 6.0 Appendix

## Supplementary Figures

- Extended ternary plots, density plots, and box plots.
- Statistical tables summarizing ANOVA and post-hoc results.


Binary file not shown.