Data in Detail

White, Rachael edited this page Jul 22, 2022 · 1 revision

Primary (default) dataset: Single-cell RNA sequencing (scRNA-seq) data generated on the 10x Genomics platform from human induced pluripotent stem cell (iPSC)-derived cerebral organoids expressing tau-V337M, together with isogenic corrected controls, at 2, 4 and 6 months of neurocortical development.

  • Data description:

    • Single-cell RNA-seq data originally generated in the experiments presented in Bowles et al. (2021). The data comprise gene expression levels and cell attributes for approximately 370,000 individual cells: cells carrying the V337M mutation in the MAPT gene, which is implicated in frontotemporal dementia (FTD), and the equivalent CRISPR-corrected control cells.
  • Associated data files:

    • FinalMergedData-downsampled.rds - a downsampled version of the full dataset presented in Bowles et al. (2021), stored as a Seurat object and containing 100 cells per identity category (cell type). Used for plotting in select visualizations, so that plots stay representative of the dataset while still rendering efficiently.
    • overall_celltype_props_data.rds - Dataframe (dim) of the overall proportion of each celltype at each sequencing timepoint.
    • FetchDataOutput-Allcells.rds - Full dataframe (dim) of cell-level metadata for the complete dataset of ~370,000 cells. Used for plotting dimensional reduction "feature" plots colored by cell type category.
    • DEgenes_MtvsWt_alltimepts_allcelltypes.csv - Dataframe (dim) of genes identified by differential expression testing with Seurat as being significantly differentially expressed in V337M (FTD-tau mutant) cells versus controls, calculated separately for each timepoint and each celltype, with associated significance scores and log fold change values. Used to populate the Differential Expression Statistics panel.
    • average_expression_allcelltypes_timepts.rds - Dataframe (dim) of gene expression averages for the top 3000 most variable genes, at all unique combinations of 3 timepoints x 14 celltypes x 2 variants (V337M/V337V) represented in the full dataset. Used to populate gene trajectory line plots under Gene Explorer panel.
    • REVIGO_simMatrix_x.rds and REVIGO_simMatrix_x.rds - Plotting data for the REVIGO visualizations of each celltype. Used in the CellType Explorer: CellType Marker Genes panel.
    • celltype_marker_genes_x.rds - Dataframe (dim) of genes found to be significantly enriched in each celltype (without respect to timepoint and/or variant), with associated significance scores and log fold change values.
    • CellChatDB.csv
    • cellchat.rds
    • net_pathway.csv
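
The differential expression table described above is organised per timepoint and per celltype, so a typical lookup is "significant genes for one celltype at one timepoint". The sketch below illustrates that lookup using only Python's standard library, since CSV is language-neutral. It is a minimal illustration under stated assumptions, not the app's own code: the column names (gene, celltype, timepoint, avg_log2FC, p_val_adj), the thresholds, and every row in the toy table are hypothetical placeholders, so check them against the real header of DEgenes_MtvsWt_alltimepts_allcelltypes.csv before reuse.

```python
import csv
import io

# Toy stand-in for DEgenes_MtvsWt_alltimepts_allcelltypes.csv.
# ASSUMPTION: the column names and all values below are invented for
# illustration; they are not taken from the real file.
SAMPLE_CSV = """gene,celltype,timepoint,avg_log2FC,p_val_adj
GENE_A,ExN,2mo,1.20,0.001
GENE_B,ExN,2mo,-0.15,0.400
GENE_C,ExN,4mo,-0.90,0.020
GENE_D,Ast,2mo,0.75,0.030
"""


def significant_genes(handle, celltype, timepoint, alpha=0.05, min_abs_lfc=0.25):
    """Return (gene, log fold change) pairs for one celltype/timepoint slice.

    alpha and min_abs_lfc are hypothetical cut-offs chosen for this sketch.
    """
    hits = []
    for row in csv.DictReader(handle):
        # Keep only rows for the requested celltype and timepoint.
        if row["celltype"] != celltype or row["timepoint"] != timepoint:
            continue
        lfc = float(row["avg_log2FC"])
        # Keep rows that pass both the significance and effect-size cut-offs.
        if float(row["p_val_adj"]) < alpha and abs(lfc) >= min_abs_lfc:
            hits.append((row["gene"], lfc))
    # Largest absolute effect first.
    return sorted(hits, key=lambda pair: abs(pair[1]), reverse=True)


result = significant_genes(io.StringIO(SAMPLE_CSV), "ExN", "2mo")
print(result)
```

To run this against the real file, replace the in-memory handle with open("DEgenes_MtvsWt_alltimepts_allcelltypes.csv", newline="") and adjust the column names to match its header.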