White, Rachael edited this page Jul 22, 2022 · 25 revisions

MAP-T Minder

LIVE APP: https://inciteprojects.idea.rpi.edu/apps/AlzApp/

MAP-T Minder (MTM) is a web-based, dashboard-style data browsing tool for interactive exploration of single-cell RNA sequencing data. The application was motivated by a large dataset of RNA transcripts from brain organoid models of frontotemporal dementia (and associated isogenic controls), first presented and characterized in the 2021 paper by Bowles et al. with the Neural Stem Cell Institute. The Bowles study performed transcriptomic and physiological characterization of over 6,000 cerebral organoids derived from three tau-V337M (MAPT gene) mutation carrier cell lines and their respective isogenic CRISPR-corrected lines. For this large and largely untapped dataset, MAP-T Minder lets the user characterize transcriptional expression across neuronal cell types, profile expression differences between FTD-Tau mutant cells and controls, and map expression trajectories over developmental time.

More fundamentally, MAP-T Minder was built to make large-scale NGS data more accessible to both biology researchers and general users seeking biological insights. Its data-browsing features let the user quickly explore the hosted transcript data from multiple perspectives and tailor the view to a specific research focus. A central and ongoing goal of this project is to give the neural stem cell and tauopathy research communities a general tool for automated, user-customized analysis of newly generated transcriptomic datasets.

Feature Highlights

  • Gene Explorer View

    • Research questions addressed:
      • How is my gene of interest expressed in FTD-Tau-affected organoid cells compared to isogenic controls? How this question is addressed.
      • What is the expression profile of my gene of interest in each represented neuronal cell type? How this question is addressed.
      • For each of the above, what are the associated expression level distributions? How this question is addressed.
      • How do these patterns vary or persist over the time course of neuronal development? How this question is addressed.
  • CellType Explorer View

    • Research questions addressed:
      • What "marker genes" characterize each neuronal celltype represented in the dataset, based on significance testing of gene expression profiles in this celltype versus the rest? How this question is addressed.
      • To what extent (what percentage of cells) are those marker genes represented? How this question is addressed.
      • Which biological functions and pathways are associated with the genes enriched in my neuronal cell/tissue type of interest? How do those pathways rank against each other, in terms of enrichment significance? How this question is addressed.
      • How do the different neuronal cell types represented communicate with each other? What key signaling pathways are enriched in diseased cells, and how do the communication networks change over time?
  • Differential Expression Data Browser

    • Explore the disease profile in the data by browsing a summary of the factors found to differ significantly between the mutant and control conditions. The data view can be filtered by individual gene, cell type category, or timepoint.
  • User-focused Design

    • Tabset-oriented organization of the app lends itself to intuitive feature navigation. Application features are categorized semantically into top-level views to aid in straightforward, user-driven querying and data browsability.

The Data

Primary (default) dataset: Single-cell RNA sequencing data obtained using the 10X platform and derived from human induced pluripotent stem cell (iPSC) cerebral organoids expressing tau-V337M and isogenic corrected controls at 2, 4 and 6 months of neurocortical development.

  • Data description:

    • Single-cell RNA-seq data originally generated in the experimental suite presented in Bowles et al. (2021). The data comprise gene expression levels and cell attributes for approximately 370,000 individual cells: mutant cells carrying a mutation in the MAPT gene that is implicated in frontotemporal dementia (FTD), and the equivalent CRISPR-corrected control cells.
  • Associated data files:

    • FinalMergedData-downsampled.rds - a downsampled version of the full dataset presented in Bowles et al., stored as a Seurat object and containing 100 cells per identity category (cell type). Used in select visualizations to represent the dataset while keeping plot rendering efficient.
    • overall_celltype_props_data.rds - Dataframe (dim) of overall proportions of each celltype represented at each sequencing timepoint
    • FetchDataOutput-Allcells.rds - Full dataframe (dim) of cell-level metadata for the complete dataset of ~370,000 cells. Used for plotting dimensional reduction "feature" plots colored by cell type category.
    • DEgenes_MtvsWt_alltimepts_allcelltypes.csv - Dataframe (dim) of genes identified by differential expression testing with Seurat as significantly differentially expressed in V337M (FTD-tau mutant) cells versus controls, calculated separately for each timepoint and each celltype, with associated significance scores and log fold change values. Used to populate the Differential Expression Statistics panel.
    • average_expression_allcelltypes_timepts.rds - Dataframe (dim) of gene expression averages for the top 3000 most variable genes, at all unique combinations of 3 timepoints x 14 celltypes x 2 variants (V337M/V337V) represented in the full dataset. Used to populate gene trajectory line plots under Gene Explorer panel.
    • REVIGO_simMatrix_x.rds and REVIGO_simMatrix_x.rds - Plotting data for the Revigo visualizations of each celltype. Used in the CellType Explorer: CellType Marker Genes panel.
    • celltype_marker_genes_x.rds - Dataframe (dim) of genes found to be significantly enriched in each celltype (without respect to timepoint and/or variant), with associated significance scores and log fold change values.
    • CellChatDB.csv
    • cellchat.rds
    • net_pathway.csv
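
As a rough illustration of how the differential expression table might be queried outside the app, the base R sketch below filters a small stand-in table by gene, celltype, and timepoint, then enumerates the 3 x 14 x 2 grid of conditions described for the averages file. The column names (gene, celltype, timepoint, avg_log2FC, p_val_adj), the example rows, the placeholder labels, and the helper filter_de() are all assumptions made for illustration, not taken from the app; check the header of DEgenes_MtvsWt_alltimepts_allcelltypes.csv for the actual names before adapting it.

```r
# Toy stand-in for the DE table; values and column names are made up.
# With the real file, one would start from something like:
#   de <- read.csv("DEgenes_MtvsWt_alltimepts_allcelltypes.csv")
de <- data.frame(
  gene       = c("MAPT", "MAPT", "GENE_A", "GENE_B"),
  celltype   = c("celltype_1", "celltype_2", "celltype_1", "celltype_1"),
  timepoint  = c("2mo", "4mo", "2mo", "6mo"),
  avg_log2FC = c(0.8, -0.3, 1.2, -0.9),
  p_val_adj  = c(0.001, 0.2, 0.0004, 0.03),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep significant rows matching whichever filters
# are supplied. A NULL filter means "do not filter on this column".
filter_de <- function(de, genes = NULL, celltypes = NULL,
                      timepoints = NULL, alpha = 0.05) {
  keep <- de$p_val_adj < alpha
  if (!is.null(genes))      keep <- keep & de$gene %in% genes
  if (!is.null(celltypes))  keep <- keep & de$celltype %in% celltypes
  if (!is.null(timepoints)) keep <- keep & de$timepoint %in% timepoints
  de[keep, , drop = FALSE]
}

# Significant hits in one celltype at one timepoint
hits <- filter_de(de, celltypes = "celltype_1", timepoints = "2mo")
print(hits)

# Every timepoint x celltype x variant combination (placeholder labels)
grid <- expand.grid(
  timepoint = c("2mo", "4mo", "6mo"),
  celltype  = paste0("celltype_", 1:14),
  variant   = c("V337M", "V337V"),
  stringsAsFactors = FALSE
)
n_conditions <- nrow(grid)  # 3 * 14 * 2 = 84
```

Leaving a filter as NULL skips it, so the same helper covers the gene-only, celltype-only, and timepoint-only views as well as any combination of them.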

Implementation and Deployment

MAP-T Minder is an open source R project, freely available with full documentation via this GitHub repository. The application is implemented in R, following best practices and using well-known packages whenever possible. R was chosen for its powerful environment for statistical computing and graphics, and for its ease of integration with interactive front-end user interfaces through the R Shiny platform.
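
For readers unfamiliar with Shiny, the tabset-oriented layout described under Feature Highlights can be sketched roughly as follows. This is not the app's actual code (that lives in the repository); the tab labels are borrowed from this page, while the input and output IDs and the placeholder text are invented for illustration.

```r
library(shiny)

# Top-level tabs named after the views described on this page.
ui <- fluidPage(
  titlePanel("MAP-T Minder (layout sketch)"),
  tabsetPanel(
    id = "main_tabs",
    tabPanel("Gene Explorer",
             textInput("gene", "Gene of interest", value = "MAPT"),
             textOutput("gene_echo")),
    tabPanel("CellType Explorer",
             p("Marker genes and enrichment results would go here.")),
    tabPanel("Differential Expression",
             p("A filterable differential expression table would go here."))
  )
)

server <- function(input, output, session) {
  # Placeholder output; a real view would render plots from the data files.
  output$gene_echo <- renderText(paste("Selected gene:", input$gene))
}

# Build the app object; runApp(app) would launch it in a browser.
app <- shinyApp(ui, server)
```

Keeping each view in its own tabPanel means a new view can be added without touching the others.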

Data Accessibility

All visualizations and analytics in MAP-T Minder can be adapted into other applications or analysis pipelines using the provided code and data. Generated visualizations and results may be downloaded from within the app; if you obtain and redistribute them, please consider citing this resource or Bowles et al. 2021. Our source data preparation and pre-processing are documented on our GitHub Wiki (coming soon). Data Loader scripts enable new data sources and preparations to be easily incorporated (coming soon).

Note on our mode of deployment.

Key Resources Employed

MAP-T Minder leverages the following analysis toolkits for its data-browsing functionalities:

  • Seurat: R Toolkit for Single-Cell Genomics
    • Hao and Hao et al. Integrated analysis of multimodal single-cell data. Cell (2021) [Seurat V4]
  • R Tidyverse
    • Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686.
  • g:Profiler: R tool for Gene Ontology Functional Enrichment Analysis
    • Uku Raudvere, Liis Kolberg, Ivan Kuzmin, Tambet Arak, Priit Adler, Hedi Peterson, Jaak Vilo: g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update) Nucleic Acids Research 2019; doi:10.1093/nar/gkz369.
  • Revigo: Gene Ontology Semantic Summarization
    • Supek F, Bošnjak M, Škunca N, Šmuc T (2011) REVIGO Summarizes and Visualizes Long Lists of Gene Ontology Terms. PLOS ONE 6(7): e21800. https://doi.org/10.1371/journal.pone.0021800
  • MAST: Model-based Analysis of Single-Cell Transcriptomics
    • Finak, G., McDavid, A., Yajima, M. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol 16, 278 (2015). https://doi.org/10.1186/s13059-015-0844-5.
  • CellChatDB

Acknowledgements

MAP-T Minder was created by undergraduate and graduate students in the Data INCITE Lab at Rensselaer Polytechnic Institute, with generous support from the NIH and the Rensselaer Institute for Data Exploration and Applications (IDEA). This project was directed by Kristin P. Bennett, John S. Erickson, Keith Fraser, Sally Temple, Nathan Bowles, and Thomas Kiehl, with primary implementation by Rachael C. White and Haowen He.

The MAP-T Minder Team would like to thank all members of the Tau Consortium who provided initial feedback and ongoing momentum for this project.

How to Cite this App

For now, please cite our GitHub link https://github.rpi.edu/DataINCITE/AlzApp/wiki/_new; we anticipate a formal publication release in the near future.