---
title: "Mars 2020 Mission Data Notebook:"
subtitle: "DAR Assignment 2 (Fall 2024)"
author: "Soumeek Mishra"
date: "`r format(Sys.time(), '%d %B %Y')`"
output:
  pdf_document: default
  html_document:
    toc: true
    number_sections: true
    df_print: paged
---
```{r setup, include=FALSE}
# Required R package installation; RUN THIS BLOCK BEFORE ATTEMPTING TO KNIT THIS NOTEBOOK!!!
# This section installs packages if they are not already installed.
# This block will not be shown in the knit file.
knitr::opts_chunk$set(echo = TRUE)

# Set the default CRAN repository
local({r <- getOption("repos")
       r["CRAN"] <- "http://cran.r-project.org"
       options(repos = r)
})

if (!require("pandoc")) {
  install.packages("pandoc")
  library(pandoc)
}

# Required packages for M20 LIBS analysis
if (!require("rmarkdown")) {
  install.packages("rmarkdown")
  library(rmarkdown)
}
if (!require("tidyverse")) {
  install.packages("tidyverse")
  library(tidyverse)
}
if (!require("stringr")) {
  install.packages("stringr")
  library(stringr)
}
if (!require("ggbiplot")) {
  install.packages("ggbiplot")
  library(ggbiplot)
}
if (!require("pheatmap")) {
  install.packages("pheatmap")
  library(pheatmap)
}
```
# DAR ASSIGNMENT 2 (Introduction): Introductory DAR Notebook

This notebook is broken into four main parts:

* **Part 1:** Preparing your local repo for **DAR Assignment 2**
* **Part 2:** Loading and some analysis of the Mars 2020 (M20) datasets:
    * Lithology: _Summarizes the mineral characteristics of samples collected at certain sample locations._
    * PIXL: Planetary Instrument for X-ray Lithochemistry. _Measures the elemental chemistry of samples at sub-millimeter scales._
    * SHERLOC: Scanning Habitable Environments with Raman and Luminescence for Organics and Chemicals. _Uses cameras, a spectrometer, and a laser to search samples for organic compounds and minerals that have been altered in watery environments and may be signs of past microbial life._
    * LIBS: Laser-induced breakdown spectroscopy. _Uses a laser beam to help identify minerals in samples and in other areas that are beyond the reach of the rover's robotic arm or too steep for the rover to travel._
* **Part 3:** Individual analysis of your team's dataset
* **Part 4:** Preparation of the team presentation

**NOTE:** The RPI github repository for all the code and data required for this notebook may be found at:

* https://github.rpi.edu/DataINCITE/DAR-Mars-F24
# DAR ASSIGNMENT 2 (Part 1): Preparing your local repo for Assignment 2

In this assignment you'll start by making a copy of the Assignment 2 template notebook, then you'll add to your copy with your original work. The instructions which follow explain how to accomplish this.

**NOTE:** You already cloned the `DAR-Mars-F24` repository for Assignment 1; you **do not** need to make another clone of the repo, but you must begin by updating your copy as instructed below:

## Updating your local clone of the `DAR-Mars-F24` repository

* Access RStudio Server on the IDEA Cluster at http://lp01.idea.rpi.edu/rstudio-ose/
    * REMINDER: You must be on the RPI VPN!!
* Access the Linux shell on the IDEA Cluster by clicking the **Terminal** tab of RStudio Server (lower left panel).
    * You now see the Linux shell on the IDEA Cluster
    * `cd` (change directory) to enter your home directory using: `cd ~`
    * Type `pwd` to confirm where you are
* In the Linux shell, `cd` to `DAR-Mars-F24`
    * Type `git pull origin main` to pull any updates
    * Always do this when you begin work; we might have added or changed something!
* In the Linux shell, `cd` into `Assignment02`
    * Type `ls -al` to list the current contents
    * Don't be surprised if you see many files!
* In the Linux shell, type `git branch` to verify your current working branch
    * If it is not `dar-yourrcs`, type `git checkout dar-yourrcs` (where `yourrcs` is your RCS id)
    * Re-type `git branch` to confirm
* Now in the RStudio Server UI, navigate to the `DAR-Mars-F24/StudentNotebooks/Assignment02` directory via the **Files** panel (lower right panel)
    * Under the **More** menu, set this to be your R working directory
    * Setting the correct working directory is essential for interactive R use!

You're now ready to start coding Assignment 2!
## Creating your copy of the Assignment 2 notebook

1. In RStudio, make a **copy** of the `dar-f24-assignment2-template.Rmd` file using a *new, original, descriptive* filename that **includes your RCS ID!**
    * Open `dar-f24-assignment2-template.Rmd`
    * **Save As...** using a new filename that includes your RCS ID
    * Example filename for user `erickj4`: `erickj4-assignment2-f24.Rmd`
    * POINTS OFF IF:
        * You don't create a new filename!
        * You don't include your RCS ID!
        * You include `template` in your new filename!
2. Edit your new notebook using RStudio and save
    * Change the `title:` and `subtitle:` headers (at the top of the file)
    * Change the `author:`
    * Don't bother changing the `date:`; it should update automagically...
    * **Save** your changes
3. Use the RStudio `Knit` command to create an HTML file; repeat as necessary
    * Use the down arrow next to the word `Knit` and select **Knit to HTML**
    * You may also knit to PDF...
4. In the Linux terminal, use `git add` to add each new file you want to add to the repository
    * Type: `git add yourfilename.Rmd`
    * Type: `git add yourfilename.html` (created when you knitted)
    * Add your PDF if you also created one...
5. When you're ready, in Linux commit your changes:
    * Type: `git commit -m "some comment"` where "some comment" is a useful comment describing your changes
    * This commits your changes to your local repo, and sets the stage for your next operation.
6. Finally, push your commits to the RPI github repo
    * Type: `git push origin dar-yourrcs` (where `dar-yourrcs` is the branch you've been working in)
    * Your changes are now safely on the RPI github.
7. **REQUIRED:** On the RPI github, **submit a pull request.**
    * In a web browser, navigate to https://github.rpi.edu/DataINCITE/DAR-Mars-F24
    * In the branch selector drop-down (by default says **master**), select your branch
    * **Submit a pull request for your branch**
    * One of the DAR instructors will merge your branch, and your new files will be added to the master branch of the repo. _Do not merge your branch yourself!_
# DAR ASSIGNMENT 2 (Part 2): Loading the Mars 2020 (M20) Datasets

In this assignment there are four datasets from separate instruments on the Mars Perseverance rover available for analysis:

* **Lithology:** Summarizes the mineral characteristics of samples collected at certain sample locations
* **PIXL:** Planetary Instrument for X-ray Lithochemistry of collected samples
* **SHERLOC:** Scanning Habitable Environments with Raman and Luminescence for Organics and Chemicals for collected samples
* **LIBS:** Laser-induced breakdown spectroscopy, which is measured in many areas (not just at sample sites)

Each dataset provides data about the mineralogy of the surface of Mars. Based on the purpose and nature of the instrument, the data is collected at different intervals along the path of Perseverance as it makes its way across the Jezero crater. Some of the data (esp. LIBS) is collected almost every Martian day, or _sol_. Some of the data (PIXL and SHERLOC) is only collected at certain sample locations of interest.

Your objective is to perform an analysis of your team's dataset in order to learn all you can about these Mars samples.

NOTES:

* All of these datasets can be found in `/academics/MATP-4910-F24/DAR-Mars-F24/Data`
* We have included a comprehensive `samples.Rds` dataset that includes useful details about the sample locations, including Martian latitude and longitude and the sol on which individual samples were collected.
* Also included is `rover.waypoints.Rds`, which provides detailed location information (lat/lon) for the Perseverance rover throughout its journey, up to the present. This can be updated when necessary using the included `roverStatus-f24.R` script.
* A general guide to the available Mars 2020 data is available here: https://pds-geosciences.wustl.edu/missions/DAR-Mars2020/
## Data Set A: Load the Lithology Data

The first five features of the dataset describe twenty-four (24) rover sample locations. The remaining features provide a simple binary (`1` or `0`) summary of the presence or absence of 35 minerals at the 24 rover sample locations. Only the first sixteen (16) samples are retained, as the remaining samples are missing the mineral descriptors.

The following code "cleans" the dataset to prepare for analysis. It first creates a dataframe with metadata and measurements for samples, and then creates a matrix containing only numeric measurements for later analysis.
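A quick aside on the `- 1` in the cleaning code below: in R, `as.numeric()` applied to a factor returns the underlying level indices (1, 2, ...), not the labels, so a `0`/`1` factor becomes `1`/`2` and must be shifted back down. A minimal toy illustration (not the mission data):

```{r}
# Toy illustration: as.numeric() on a factor yields level indices, not labels
f <- factor(c("0", "1", "1", "0"))
as.numeric(f)       # 1 2 2 1 -- the level indices
as.numeric(f) - 1   # 0 1 1 0 -- the intended presence/absence coding
```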
```{r}
# Load the saved lithology data with locations added
lithology.df <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/mineral_data_static.Rds")

# Cast samples as numbers
lithology.df$sample <- as.numeric(lithology.df$sample)

# Convert the rest into factors
lithology.df[sapply(lithology.df, is.character)] <-
  lapply(lithology.df[sapply(lithology.df, is.character)], as.factor)

# Keep only the first 16 samples because the data for the rest of the samples is not available yet
lithology.df <- lithology.df[1:16,]

# Look at a summary of the cleaned data frame
summary(lithology.df)

# Create a matrix containing only the numeric measurements. The remaining features are metadata about the sample.
lithology.matrix <- sapply(lithology.df[,6:40], as.numeric) - 1

# Review the structure of our matrix
str(lithology.matrix)
```
## Data Set B: Load the PIXL Data

The PIXL data provides summaries of the mineral compositions measured at selected sample sites by the PIXL instrument.

```{r}
# Load the saved PIXL data with locations added
pixl.df <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/samples_pixl_wide.Rds")

# Convert character columns to factors
pixl.df[sapply(pixl.df, is.character)] <-
  lapply(pixl.df[sapply(pixl.df, is.character)], as.factor)

# Review our dataframe
summary(pixl.df)

# Make the matrix of just the mineral percentage measurements
pixl.matrix <- pixl.df[,2:14]

# Review the structure
str(pixl.matrix)
```
## Data Set C: Load the LIBS Data

The LIBS data provides summaries of the mineral compositions measured at selected sample sites by the LIBS instrument, part of the Perseverance SuperCam.

```{r}
# Load the saved LIBS data with locations added
libs.df <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/supercam_libs_moc_loc.Rds")

# Drop features that are not to be used in the analysis for this notebook
libs.df <- libs.df %>%
  select(!(c(distance_mm,Tot.Em.,SiO2_stdev,TiO2_stdev,Al2O3_stdev,FeOT_stdev,
             MgO_stdev,Na2O_stdev,CaO_stdev,K2O_stdev,Total)))

# Convert the points to numeric
libs.df$point <- as.numeric(libs.df$point)

# Review what we have
summary(libs.df)

# Make a matrix containing only the LIBS measurements for each mineral
libs.matrix <- as.matrix(libs.df[,6:13])

# Review the structure
str(libs.matrix)
```
## Dataset D: Load the SHERLOC Data

The SHERLOC data you will be using for this lab is the result of scientists' interpretations of extensive spectral analysis of abrasion samples provided by the SHERLOC instrument.

**NOTE:** This dataset presents minerals as rows and sample sites as columns. You'll probably want to rotate the dataset for easier analysis....

```{r}
# Read in the data as provided.
sherloc_abrasion_raw <- readRDS("/academics/MATP-4910-F24/DAR-Mars-F24/Data/abrasions_sherloc_samples.Rds")

# Clean up data types
sherloc_abrasion_raw$Mineral <- as.factor(sherloc_abrasion_raw$Mineral)
sherloc_abrasion_raw[sapply(sherloc_abrasion_raw, is.character)] <-
  lapply(sherloc_abrasion_raw[sapply(sherloc_abrasion_raw, is.character)], as.numeric)

# Transform NA's to 0
sherloc_abrasion_raw <- sherloc_abrasion_raw %>% replace(is.na(.), 0)

# Reformat the data so that rows are "abrasions" and columns list the presence of minerals.
# Do this by "pivoting" to a long format, and then back to the desired wide format.
sherloc_long <- sherloc_abrasion_raw %>%
  pivot_longer(!Mineral, names_to = "Name", values_to = "Presence")

# Make abrasion a factor
sherloc_long$Name <- as.factor(sherloc_long$Name)

# Pivot back to the wide format
sherloc.matrix <- sherloc_long %>%
  pivot_wider(names_from = Mineral, values_from = Presence)

# Get sample information from PIXL and add it to the measurements -- assumes the order is the same
sherloc.df <- cbind(pixl.df[,c("sample","type","campaign","abrasion")], sherloc.matrix)

# Review what we have
summary(sherloc.df)

# The SHERLOC measurement matrix is everything except the first (name) column
sherloc.matrix <- sherloc.matrix[,-1]

# Review the structure
str(sherloc.matrix)
```
## Data Set E: PIXL + Sherloc

```{r}
# Combine the PIXL and SHERLOC dataframes
pixl_sherloc.df <- cbind(pixl.df, sherloc.df)

# Review what we have
summary(pixl_sherloc.df)

# Combine the PIXL and SHERLOC matrices
pixl_sherloc.matrix <- cbind(pixl.matrix, sherloc.matrix)

# Review the structure of our matrix
str(pixl_sherloc.matrix)
```
## Data Set F: PIXL + Lithology

Create the dataframe and matrix from the prior datasets. This is the one I am working on.

```{r}
# We combine our PIXL and Lithology dataframes
pixl_lithology.df <- cbind(pixl.df, lithology.df)
pixl_lithology.df

# We review what we have
summary(pixl_lithology.df)

# We combine the PIXL and Lithology matrices
pixl_lithology.matrix <- cbind(pixl.matrix, lithology.matrix)
pixl_lithology.matrix

# We review the structure
str(pixl_lithology.matrix)
```
Description of the dataset contained in the dataframe:

```{r}
# The psych package provides describe() for descriptive statistics
library(psych)

# Structure of the dataframe
str(pixl_lithology.df)

# Summary statistics of the dataframe
summary(pixl_lithology.df)

# Descriptive statistics of the dataframe
describe(pixl_lithology.df)

# Column names of the dataframe
colnames(pixl_lithology.df)

# Datatypes of each column of the dataframe
sapply(pixl_lithology.df, class)
```
Description of the dataset in the form of a matrix:

```{r}
class(pixl_lithology.matrix)
```
```{r}
# We define a function to calculate multiple statistics for one column
calculate_stats <- function(x) {
  if (is.numeric(x)) {
    stats <- c(
      Mean = mean(x, na.rm = TRUE),
      Median = median(x, na.rm = TRUE),
      StdDev = sd(x, na.rm = TRUE),
      Min = min(x, na.rm = TRUE),
      Max = max(x, na.rm = TRUE),
      Range = diff(range(x, na.rm = TRUE)),
      Variance = var(x, na.rm = TRUE)
    )
    return(stats)
  } else {
    return(rep(NA, 7)) # NA is returned for non-numeric columns
  }
}

# We apply the function to each column
descriptive_stats <- sapply(pixl_lithology.matrix, calculate_stats)

# Transposing makes it look better
descriptive_stats <- t(descriptive_stats)

# Finally, convert it to a dataframe
descriptive_stats_df <- as.data.frame(descriptive_stats)
print(descriptive_stats_df)
```
Now we find the number of rows and columns of the matrix; the metadata and measurement features are identified next.

```{r}
# Number of rows and columns
num_rows <- nrow(pixl_lithology.matrix)
num_rows
num_cols <- ncol(pixl_lithology.matrix)
num_cols
cat("The dataset contains", num_rows, "samples (rows) and", num_cols, "features (columns).\n")
```
Metadata and measurement features:

```{r}
# We make a list of the metadata features (these describe the samples rather than measure them)
metadata_features <- c("sample","name","type","campaign","location","abrasion","SampleType")
metadata_features

# For the measurement features we list everything in the matrix that is not a metadata feature
measurement_features <- setdiff(colnames(pixl_lithology.matrix), metadata_features)
measurement_features

cat("\nMetadata Features:\n", metadata_features, "\n")
cat("Measurement Features:\n", measurement_features, "\n")
```
Next we perform z-score scaling (normalization). This centers each feature at zero with unit variance, so features measured on very different scales contribute comparably to distance-based methods such as k-means.

```{r}
pixl_lithology.matrix_scaled <- scale(pixl_lithology.matrix)
pixl_lithology.matrix_scaled
```
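As a sanity check on what `scale()` does, here is a self-contained toy example (not the mission data) showing that each z-scored column ends up with mean 0 and standard deviation 1:

```{r}
# Toy example: scale() centers each column to mean 0 and rescales to sd 1
toy <- matrix(c(1, 2, 3, 4, 10, 20, 30, 40), ncol = 2)
toy_scaled <- scale(toy)
colMeans(toy_scaled)       # both columns ~0
apply(toy_scaled, 2, sd)   # both columns 1
```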
Next, we perform k-means clustering on the scaled matrix, including an elbow plot which can give a fair idea for choosing the number of clusters.

```{r}
# We handle NA, NaN, or Inf values before clustering, because kmeans errors if they exist
# Check for NA values
sum(is.na(pixl_lithology.matrix_scaled))

# Check for NaN values
sum(is.nan(pixl_lithology.matrix_scaled))

# Check for Inf values
sum(is.infinite(pixl_lithology.matrix_scaled))

# Replace NA, NaN, or Inf with 0 (which is the column mean after z-score scaling)
pixl_lithology.matrix_scaled[is.na(pixl_lithology.matrix_scaled)] <- 0
pixl_lithology.matrix_scaled[is.nan(pixl_lithology.matrix_scaled)] <- 0
pixl_lithology.matrix_scaled[is.infinite(pixl_lithology.matrix_scaled)] <- 0
pixl_lithology.matrix_scaled

# A user-defined function to examine clusters and plot the results
wssplot <- function(data, nc = 15, seed = 10){
  wss <- data.frame(cluster = 1:nc, quality = c(0))
  for (i in 1:nc){
    set.seed(seed)
    wss[i,2] <- kmeans(data, centers = i)$tot.withinss
  }
  ggplot(data = wss, aes(x = cluster, y = quality)) +
    geom_line() +
    ggtitle("Quality of k-means by Cluster")
}

# Apply `wssplot()` to our PIXL + Lithology data
wssplot(pixl_lithology.matrix_scaled, nc = 8, seed = 2)
```
From the elbow plot we can see that k=7 is a reasonable choice, so we will use k=7 for k-means clustering.

```{r}
# Use our chosen 'k' to perform k-means clustering
set.seed(2)
k <- 7
km <- kmeans(pixl_lithology.matrix_scaled, k)
km
```
We now examine the cluster means. Below is a heatmap of the cluster centers, with rows and columns clustered.

```{r}
pheatmap(km$centers)
```
Perform PCA on the PIXL + Lithology data. We have already scaled the data, so we keep `scale = FALSE`.

```{r}
# One or more columns in the matrix may be constant (i.e., all values of that column are the
# same, or the column contains only zeros). Constant columns carry no variance and can cause
# problems for PCA, so we deal with those first.

# Identify constant columns
constant_columns <- apply(pixl_lithology.matrix_scaled, 2, function(x) length(unique(x)) == 1)

# Identify all-zero columns
zero_columns <- apply(pixl_lithology.matrix_scaled, 2, function(x) all(x == 0))

# Combine both
problem_columns <- which(constant_columns | zero_columns)

# Remove the problem columns, guarding against the empty case:
# `m[, -integer(0)]` would drop ALL columns, not none
if (length(problem_columns) > 0) {
  pixl_lithology.matrix_scaled_cleaned <- pixl_lithology.matrix_scaled[, -problem_columns]
} else {
  pixl_lithology.matrix_scaled_cleaned <- pixl_lithology.matrix_scaled
}

# Print the problem columns
print(problem_columns)

pixl_lithology.matrix_scaled.pca <- prcomp(pixl_lithology.matrix_scaled_cleaned, scale = FALSE)

# Generate the scree plot
ggscreeplot(pixl_lithology.matrix_scaled.pca)
```
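The scree plot can also be read off numerically: `prcomp()` stores the component standard deviations in `sdev`, from which the proportion of variance explained by each PC follows directly. A small self-contained sketch on random toy data (not the mission data):

```{r}
# Toy example: per-PC and cumulative proportion of variance explained
set.seed(2)
toy <- matrix(rnorm(40), ncol = 4)
toy.pca <- prcomp(toy, scale = TRUE)
var_explained <- toy.pca$sdev^2 / sum(toy.pca$sdev^2)
round(var_explained, 3)          # proportion per component
round(cumsum(var_explained), 3)  # cumulative proportion (ends at 1)
```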
We make a table indicating how many samples are in each cluster.

```{r}
library(knitr)
cluster1.df <- data.frame(cluster = 1:7, size = km$size)
kable(cluster1.df, caption = "Samples per cluster")
```
Finally, we create a PCA biplot with `ggbiplot`, coloring the data by cluster and labeling by rock type.

```{r}
ggbiplot::ggbiplot(pixl_lithology.matrix_scaled.pca,
                   labels = pixl_lithology.df$type,
                   groups = as.factor(km$cluster)) +
  xlim(-2,2) + ylim(-2,2)
```
We now perform hierarchical clustering and plot the dendrogram.

```{r}
# Compute the distance matrix
dist_matrix <- dist(pixl_lithology.matrix_scaled_cleaned, method = "euclidean")

# Perform hierarchical clustering
hclust_model <- hclust(dist_matrix, method = "ward.D2")

# Plot the dendrogram
plot(hclust_model, main = "Hierarchical Clustering Dendrogram",
     xlab = "Sample Index", ylab = "Distance", cex = 0.9)

# Mark a 7-cluster partition with red boxes
rect.hclust(hclust_model, k = 7, border = "red")
```
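If hard cluster labels are needed from the dendrogram (for example, to compare against the k-means assignments), `cutree()` cuts the tree at a chosen number of groups. A self-contained toy sketch (not the mission data):

```{r}
# Toy example: cutree() turns an hclust tree into one label per row
set.seed(2)
toy <- matrix(rnorm(20), ncol = 2)
toy_hc <- hclust(dist(toy), method = "ward.D2")
toy_labels <- cutree(toy_hc, k = 3)
table(toy_labels)   # cluster sizes
```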
## Data Set G: Sherloc + Lithology

Create the dataframe and matrix by combining the appropriate prior datasets.

```{r}
# Combine the Lithology and SHERLOC dataframes
sherloc_lithology.df <- cbind(sherloc.df, lithology.df)

# Review what we have
summary(sherloc_lithology.df)

# Combine the Lithology and SHERLOC matrices
sherloc_lithology.matrix <- cbind(sherloc.matrix, lithology.matrix)

# Review the resulting matrix
str(sherloc_lithology.matrix)
```
# Analysis of Data (Part 3)

Each team has been assigned one of six datasets:

1. Dataset B: PIXL: The PIXL team's goal is to understand and explain how scaling improves results from Assignment 1
2. Dataset C: LIBS (with appropriate scaling as necessary)
3. Dataset D: Sherloc (with appropriate scaling as necessary)
4. Dataset E: PIXL + Sherloc (with appropriate scaling as necessary)
5. Dataset F: PIXL + Lithology (with appropriate scaling as necessary)
6. Dataset G: Sherloc + Lithology (with appropriate scaling as necessary)

**For each data set perform the following steps.** Feel free to use the methods/code from Assignment 1 as desired. Communicate with your teammates. Make sure that you are doing different variations of the analysis below so that no team member does the exact same analysis. If you want to share clustering (which is okay, but then vary the rest), make sure you use the same random seeds.

1. _Describe the data set contained in the data frame and matrix:_ How many rows does it have and how many features? Which features are measurements and which features are metadata about the samples? (3 pts)
2. _Scale this data appropriately (you can choose the scaling method):_ Explain why you chose that scaling method. (3 pts)
3. _Cluster the data using k-means or your favorite clustering method (like hierarchical clustering):_ Describe how you picked the best number of clusters. Indicate the number of points in each cluster. Coordinate with your team so you try different approaches. If you want to share results with your teammates, make sure to use the same random seeds. (6 pts)
4. _Perform a **creative analysis** that provides insights into what one or more of the clusters are and what they tell you about the Mars data._
# Preparation of Team Presentation (Part 4)

Prepare a presentation of your team's results to present in class on **September 11** starting at 9am in AE217 (20 pts).

The presentation should include the following elements:

1. A **Description** of the data set that you analyzed, including how many observations and how many features. (<= 1.5 mins)
2. Each team member gets **three minutes** to explain their analysis:
    * what analysis they performed
    * the results of that analysis
    * a brief discussion of their interpretation of these results
    * <= 18 mins _total!_
3. A **Conclusion** slide indicating the team's major findings (<= 1.5 mins)
4. Thoughts on **potential next steps** for the Mars team (<= 1.5 mins)

* A template for your team presentation is included here: https://bit.ly/dar-template-f24
* The rubric for the presentation is here: https://docs.google.com/document/d/1-4o1O4h2r8aMjAplmE-ItblQnyDAKZwNs5XCnmwacjs/pub
# When you're done: SAVE, COMMIT and PUSH YOUR CHANGES!

When you are satisfied with your edits and your notebook knits successfully, remember to push your changes to the repo using the following steps:

* `git branch`
    * To double-check that you are in your working branch
* `git add <your changed files>`
* `git commit -m "Some useful comments"`
* `git push origin <your branch name>`

# Prepare group presentation

Prepare an (at most) _three-slide_ presentation of your classification results and creative analysis. Create a joint presentation with your teammates using the Google Slides template available here: https://bit.ly/45twtUP (copy the template and customize it with your content). Prepare a conclusion slide that summarizes all your results.

Be prepared to present your results on xx Sep 2024 in class!
# APPENDIX: Accessing RStudio Server on the IDEA Cluster

The IDEA Cluster provides seven compute nodes (4x 48 cores, 3x 80 cores, 1x storage server)

* The Cluster requires RCS credentials, enabled via registration in class
    * Email John Erickson with any problems: `erickj4@rpi.edu`
* RStudio, Jupyter, MATLAB, GPUs (on two nodes); lots of storage and compute
* Access via the RPI physical network or VPN only

# More info about RStudio on our Cluster

## RStudio GUI Access:

* Use:
    * http://lp01.idea.rpi.edu/rstudio-ose/
    * http://lp01.idea.rpi.edu/rstudio-ose-3/
    * http://lp01.idea.rpi.edu/rstudio-ose-6/
    * http://lp01.idea.rpi.edu/rstudio-ose-7/
* Linux terminal accessible from within RStudio "Terminal" or via ssh (below)