What is this tutorial for?

This tutorial describes how to perform the downstream analysis on single-cell nano CUT&Tag data (2 biological replicates, 3 modalities).

This tutorial assumes that the data have already been preprocessed, including demultiplexing of the different modalities, running cellranger and custom cell calling (see Set Up and Preprocessing).

Please note how in this tutorial we are describing how to perform the downstream analysis by using peaks, not bins. For tutorial on bins, see here.

In addition, note how the biological data here analysed are 2 biological replicates of the same sample, thus it is safe to merge the two samples without any kind of integration. Often experiments are designed in order to have multiple samples and/or biological conditions requiring data integration, differential analysis, etc. These are not part of this vignette, but will be implemented in other vignettes in the future.

The downstream analysis implemented in this vignette is composed of 6 steps:

        1. Data loading (peaks, metadata and fragments)
        2. Quality controls (QC)
        3. Merge Seurat objects
        4. Normalisation and dimensional reduction
        5. Non-linear dimension reduction and clustering
        6. Annotation

Where/how can I get the input data?

1. Data loading (peaks, metadata and fragments)

In this section we are going to load the input files for each modality and sample, create a common set of peaks for each sample, create a fragment object for each experiment, generate a peaks x cell matrix for each experiment and finally generate one Seurat object for experiment.

First, load all the required libraries.

library(Signac)
library(Seurat)
library(GenomicRanges)
library(future)
library(stringr)
library(ggplot2)
library(gghalves)
library(ggpubr)
library(EnsDb.Mmusculus.v79)
library(ComplexUpset)
library(regioneR)
library(scales)
library(ggVennDiagram)


1.1 Setting some initial variables

First, define some variables that will be used across the entire vignette. Modify them according to your data.

Extremely important:

  1. set as repodir the directory where the nanoscope Github repo is. This is required in order to upload all the custom functions needed to run this vignette
# directory where the nanoscope repo was cloned
repodir <- "~/Documents/GitHub/nanoscope/" # change accordingly
source(paste0(repodir,"scripts/functions_scCT.R"))
  1. set as wd the output root directory.
  • If you are following our tutorial, please place the nanoscope output directories sample_P23209 and sample_P24004 in a directory called result and set it as working directory wd <- "~/NatProt/nanoscope/results/"
  • If you have skipped the tutorial and directly downloaded the toy data from Zenodo, your wd will be /path/to/nanoscope_toy_data/results/
  • If you are running this vignette on your personal data please ensure that your wd is formatted as follow:
            - fragments.tsv.gz is expected to be in a folder with this path: wd/{sample_name}/{modality_barcode}/cellranger/outs/
            - metadata.csv is expected to be in a folder with this path: wd/{sample_name}/{modality_barcode}/cell_picking/
            -{modality}_peaks.broadPeak is expected to be in a folder with this path: wd/{sample_name}/{modality_barcode}/peaks/macs_broad/
            -Note that this vignette assumes the same modalities to have the same barcodes across the different samples

Please note that nothing but the directories containing the input data for each samples must be stored in wd

# directory where the nanoscope output files are stored
wd <- "~/NatProt/nanoscope/results" # change accordingly
setwd(wd)
  1. set other variables (genome version, assay type, etc)
# Set the used genome version
genome <- "mm10"
# Genome annotations
genome_ann <- EnsDb.Mmusculus.v79
# Assay (peaks or bins)
assay <- "peaks"
# Samples
samples <- dir()
# Modalities
modalities <- dir(samples[1]) # this assumes sample1, sample2, sampleN share the same modalities and barcodes!
# Minimum number of cells a feature must be detected in to be included in the Seurat obj
min_cell <- 1
# Minium number of features a cell must have to be included in the Seurat obj
min_feat <- 1


1.2 Loading peaks from different modalities and samples

Here we are going to load the peaks data for each modality, for each sample.
For convenience, files will be stored in a list.

# Read-in peak coordinates for each modality iterating across samples and modalities
input.ls <- list()
for (smpl in samples) {
  cat("Loading peaks for sample",smpl,"\n")
  for (mod in modalities) {
    cat("\t",mod)

    # get mod name withouth barcode
    mod2 <- strsplit(mod,"_")[[1]][1]

    # Load peaks called by macs for H3K27ac and H3K27me3 and by cellranger for ATAC
    if (mod2 != "ATAC") {
      cat("\t(macs peaks)\n")
      input.ls[[paste0(mod,"_",smpl)]] <- read.table(paste0(smpl,"/",mod,"/peaks/macs_broad/",mod2,"_peaks.broadPeak"))[1:3]
      colnames(input.ls[[paste0(mod,"_",smpl)]] ) <- c("chr", "start", "end")
    } else {
      cat("\t(cellranger peaks)\n")
      input.ls[[paste0(mod,"_",smpl)]] <- read.table(paste0(smpl,"/",mod,"/cellranger/outs/peaks.bed"))[1:3]
      colnames(input.ls[[paste0(mod,"_",smpl)]] ) <- c("chr", "start", "end")
    }

  }
}
## Loading peaks for sample sample_P23209 
##   ATAC_TATAGCCT  (cellranger peaks)
##   H3K27ac_ATAGAGGC   (macs peaks)
##   H3K27me3_CCTATCCT  (macs peaks)
## Loading peaks for sample sample_P24004 
##   ATAC_TATAGCCT  (cellranger peaks)
##   H3K27ac_ATAGAGGC   (macs peaks)
##   H3K27me3_CCTATCCT  (macs peaks)


1.3 Creating a common set of peaks

Since the peaks were identified independently in each experiment, it is unlikely they will perfectly overlap. We thus need to merge into unique intervals the overlapping peaks from the different samples. Please note how the overlapping peaks among different samples will be joined into single intervals, whereas the peaks found in only one, or a subset of, sample will be left unaltered and will be part of the so-called “common set of peaks”.

This will be done by following the Signac vignette.

Having uploaded the peaks for all the modalities and samples, we will convert them into genomic ranges and create a common set of peaks in each sample by using GenomicRanges::reduce function. GenomicRanges::reduce will merge all the overlapping peaks still maintaining in the “common set of peaks” the peaks found in only one sample, as shown below:

First, convert peaks to genomic ranges

input.ls <- lapply(input.ls,function(x){ makeGRangesFromDataFrame(x) })

(We are going to frequently use the lapply function. lapply applies a given R function to all the items (x) of a given list)

Then, merge all the overlapping peaks, still maintaining in the “common set of peaks” the peaks found in only one sample. This way it is safe to merge objects from different samples into a single Seurat object (see below Merge Seurat objects)

combined.peaks.ls <- list()

for (mod in modalities) {
  combined.peaks.ls[[mod]] <- reduce(x = c(input.ls[[paste0(mod,"_",samples[1])]],
                                           input.ls[[paste0(mod,"_",samples[2])]]))
  
}
combined.peaks.ls
## $ATAC_TATAGCCT
## GRanges object with 116675 ranges and 0 metadata columns:
##              seqnames          ranges strand
##                 <Rle>       <IRanges>  <Rle>
##        [1]       chr1 3094778-3095663      *
##        [2]       chr1 3333330-3334204      *
##        [3]       chr1 3367922-3368816      *
##        [4]       chr1 3399705-3400612      *
##        [5]       chr1 3466125-3467002      *
##        ...        ...             ...    ...
##   [116671] JH584304.1     52767-54323      *
##   [116672] JH584304.1     55003-55931      *
##   [116673] JH584304.1     59270-60187      *
##   [116674] JH584304.1     68483-69442      *
##   [116675] JH584295.1       1466-1970      *
##   -------
##   seqinfo: 25 sequences from an unspecified genome; no seqlengths
## 
## $H3K27ac_ATAGAGGC
## GRanges object with 70327 ranges and 0 metadata columns:
##             seqnames          ranges strand
##                <Rle>       <IRanges>  <Rle>
##       [1]       chr1 3042535-3043231      *
##       [2]       chr1 3325312-3325622      *
##       [3]       chr1 3368224-3368416      *
##       [4]       chr1 3412976-3413298      *
##       [5]       chr1 3514757-3515057      *
##       ...        ...             ...    ...
##   [70323] JH584304.1       491-44360      *
##   [70324] JH584304.1     48370-61344      *
##   [70325] JH584304.1    68621-114400      *
##   [70326] GL456393.1     40014-41631      *
##   [70327] JH584292.1     13043-14058      *
##   -------
##   seqinfo: 37 sequences from an unspecified genome; no seqlengths
## 
## $H3K27me3_CCTATCCT
## GRanges object with 57932 ranges and 0 metadata columns:
##             seqnames          ranges strand
##                <Rle>       <IRanges>  <Rle>
##       [1]       chr1 3361152-3362524      *
##       [2]       chr1 3667086-3678017      *
##       [3]       chr1 4412597-4416942      *
##       [4]       chr1 4425744-4431988      *
##       [5]       chr1 4487340-4509495      *
##       ...        ...             ...    ...
##   [57928] JH584294.1   154717-155488      *
##   [57929] JH584304.1       707-17155      *
##   [57930] JH584304.1     23562-31962      *
##   [57931] JH584304.1     59012-61030      *
##   [57932] JH584304.1    68573-114394      *
##   -------
##   seqinfo: 35 sequences from an unspecified genome; no seqlengths

We might also be interested in knowing how many of the peaks are shared between the samples, and how many are, instead, “private” of each sample. Keep always in mind that all of these peaks, common and private, are part of the “common set of peaks”.

upset_plot <- getUpsetPeaks(modalities = modalities, 
                            samples = samples, 
                            combined_peaks_ls = combined.peaks.ls, 
                            input_ls = input.ls)
upset_plot


1.4 Creating fragment objects

To quantify our combined set of peaks in each modality and sample, we need to create a fragment object for each experiment (i.e., an object holding all the information related to a single fragment file). To this end we will use the Signac function CreateFragmentObject which requires, for each experiment, the fragments.tsv.gz and the metadata.csv files.

Let’s start by loading the metadata data!

Load metadata files

##### Metadata
# Load metadata for each experiment iterating across the sample list
metadata.ls <- list()
for (smpl in samples) {
  cat("Loading metadata for sample",smpl,"\n")
  for (mod in modalities) {
    cat("\t",mod,"\n")

    # Load metadata
    metadata.ls[[paste0(mod,"_",smpl)]] <- read.csv(paste0(smpl,"/",mod,"/cell_picking/metadata.csv"),
                                                  stringsAsFactors = FALSE)
    rownames(metadata.ls[[paste0(mod,"_",smpl)]]) <- metadata.ls[[paste0(mod,"_",smpl)]]$barcode

 }
}
## Loading metadata for sample sample_P23209 
##   ATAC_TATAGCCT 
##   H3K27ac_ATAGAGGC 
##   H3K27me3_CCTATCCT 
## Loading metadata for sample sample_P24004 
##   ATAC_TATAGCCT 
##   H3K27ac_ATAGAGGC 
##   H3K27me3_CCTATCCT

Filtering-out low-quality cells

Here we are going to exclude the cells which have not passed the internal filtering (flagged as passedMB). This internal filtering takes into consideration the log10UMI and the percentage of reads in peaks for each cell and clusterizes the cells based on these values (by building a Gaussian finite mixture model fitted via EM algorithm via Mclust). This will cluster the cells in n clusters, but only the cells belonging to the top1 cluster with the highest log10UMI will be classified as passedMB=TRUE.

Before doing this, however, we would like to know how many cells we are filtering out:

plotPassed(metadata.ls,xaxis_text = 9,angle_x=60)

And which cells we are filtering out:

plotPassedCells(metadata.ls,samples,modalities)

In this case, you can see how the cells can be easily clustered in three groups: empty droplets, weird droplets and good cells and how only the good cells are selected. However, if you notice something weird, please, re-do the cell selection by applying some hard cutoffs.

Once we have classified the good cells as passedMB=TRUE, we can proceed with the selection of the good cells only, discarding all the others from further analyses:

metadata.ls <- lapply(metadata.ls, function(x){x[x$passedMB,]})


Load fragments files and creation of fragment objects

Having loaded the metadta files, we can now create the fragment objects for each experiment.
As explained in the Signac vignette, the CreateFragmentObject function checks that the file is present on disk, it is compressed and indexed, computes the MD5 sum for the file and the tabix index so that we can tell if the file is modified at any point, and checks that the expected cells (metadata.csv) are present in the file.

##### Fragments
# Create one fragment object per each experiment iterating across the sample list
fragment.ls <- list()
for (smpl in samples) {
  cat("Loading peaks for sample",smpl,"\n")
  for (mod in modalities) {
    cat("\t",mod,"\n")

    # Load fragments
    fragment.ls[[paste0(mod,"_",smpl)]] <- CreateFragmentObject(
                                              path = paste0(smpl,"/",mod,"/cellranger/outs/fragments.tsv.gz"),
                                              cells = metadata.ls[[paste0(mod,"_",smpl)]]$barcode)

  }
}
## Loading peaks for sample sample_P23209 
##   ATAC_TATAGCCT 
##   H3K27ac_ATAGAGGC 
##   H3K27me3_CCTATCCT 
## Loading peaks for sample sample_P24004 
##   ATAC_TATAGCCT 
##   H3K27ac_ATAGAGGC 
##   H3K27me3_CCTATCCT
head(fragment.ls[[paste0("ATAC_TATAGCCT_",samples[1])]])
chrom start end barcode readCount
chr1 3003932 3004576 GCCTAGGCAGTTACAC-1 1
chr1 3003985 3004021 CTACTTAAGTTCAGGG-1 7
chr1 3005376 3005689 CCAGATACAGCGTCGT-1 3
chr1 3005764 3005913 CAAGCTAGTTTGCATG-1 1
chr1 3005861 3006012 TCTCTGGTCCAGAGAG-1 1
chr1 3006223 3006256 AACAGTCCATCTGCAA-1 3


1.5 Quantifying peaks in each experiment

Now that each experiment has been associated to a fragment object containing fragments and metadata, we can create a peaks x cell matrix for each experiment. This will be done by using the Signac function FeatureMatrix.
FeatureMatrix basically constructs a peaks x cell matrix by parsing: i) fragments, ii) peaks and iii) metadata. Please, note how the features here used to build the peaks x cell matrix are the common set of peaks for each modalities identified in the previous step (Creating a common set of peaks).

To speed-up the process, we launch FeatureMatrix with process_n = 20000. This runs OK on our 64GB MacBook Pro, but could go out-of-memory on other systems. If this happens, please, decrease the process_n value (2000 should be fine for every system).

# As usual, append the object to a list
counts.ls <- list()
# Iterate among the experiment names in order to retrieve fragments and cell barcodes from the same experiment
for (experim in names(fragment.ls)) {
  
  cat("Analysing experiment: ",experim,"\n")
  
  # modality name
  modal <- paste0(str_split_fixed(experim,"_",4)[,1],"_",str_split_fixed(experim,"_",4)[,2])
  cat("\tPeaks from modality: ",modal,"\n")

  # Create FeatureMatrix
  counts.ls[[experim]] <- FeatureMatrix(fragments = fragment.ls[[experim]],           # fragments
                                        features = combined.peaks.ls[[modal]],        # modality common set of peaks
                                        cells = metadata.ls[[experim]]$barcode,       # metadata
                                        process_n = 20000) # the higher, the faster, the more memory is used

}
## Analysing experiment:  ATAC_TATAGCCT_sample_P23209 
##  Peaks from modality:  ATAC_TATAGCCT 
## Analysing experiment:  H3K27ac_ATAGAGGC_sample_P23209 
##  Peaks from modality:  H3K27ac_ATAGAGGC 
## Analysing experiment:  H3K27me3_CCTATCCT_sample_P23209 
##  Peaks from modality:  H3K27me3_CCTATCCT 
## Analysing experiment:  ATAC_TATAGCCT_sample_P24004 
##  Peaks from modality:  ATAC_TATAGCCT 
## Analysing experiment:  H3K27ac_ATAGAGGC_sample_P24004 
##  Peaks from modality:  H3K27ac_ATAGAGGC 
## Analysing experiment:  H3K27me3_CCTATCCT_sample_P24004 
##  Peaks from modality:  H3K27me3_CCTATCCT
counts.ls[[paste0(modal,"_",samples[1])]][1:10,1:25]
## 10 x 25 sparse Matrix of class "dgCMatrix"
##                                                                         
## chr1-3361152-3362524 . . . . .  . . . . . . . . . . . . . . . .  . . . .
## chr1-3667086-3678017 2 1 . . .  . . . . . . . . . . 2 . 2 . . 2  . . . 3
## chr1-4412597-4416942 . . . . .  . . . . . . . . 1 . . . . . . .  . . . 1
## chr1-4425744-4431988 . . . . .  . . . . . . . . . . . . . . . .  . . . .
## chr1-4487340-4509495 8 . 2 . 5 10 1 . . . . . . 1 . 1 . . . . 4 13 2 2 7
## chr1-4514289-4515435 1 . . . .  . . . . . . . . 1 . . . . . . .  . . . 2
## chr1-4568747-4578581 1 . . . 1  . . . . . 1 . . . . . . . . . .  1 . . .
## chr1-4592282-4595271 . . . . .  1 . . . . . . . . . . . . . . .  . . . .
## chr1-4687932-4689799 . . . . .  . . . . . . . . . . . . . . . .  . . . .
## chr1-4785566-4794296 . . . . .  . . . . . . . . . . . . . . . .  . . . .


1.6 Create one Seurat object for each experiment

Now that we have generated a peaks x cell matrix for each experiment, we can store it in a Seurat object along with the fragment object and the metadata.

This is done for each experiment separately, objects will be merged in the next step. The reason for this is that we would like to first perform QC on each experiment individually and then merge all the experiments in a single Seurat object.

obj.ls <- list()
# Iterate among the experiment names in order to retrieve fragments and cell barcodes from the same experiment
for (experim in names(counts.ls)) {

  # Get sample and modality name from the experiment variable
  smpl <- str_split_fixed(experim,"_",3)[,3]
  modality <- str_split_fixed(experim,"_",2)[,1]

  # Create the chromatin assay
  chrom.assay <- CreateChromatinAssay(counts = counts.ls[[experim]],
                                      fragments = fragment.ls[[experim]],
                                      genome = genome,
                                      min.cells = min_cell,
                                      min.features = min_feat)

  # Create the object
  obj.ls[[experim]] <- CreateSeuratObject(counts = chrom.assay,
                                          assay = assay,
                                          meta.data=metadata.ls[[experim]],
                                          project = smpl)

  # Add some metadata of origin
  obj.ls[[experim]]$dataset <- experim
  obj.ls[[experim]]$modality <- modality
  obj.ls[[experim]]$sample <- smpl

}


2. Quality controls (QC) OPTIONAL

Before merging the objects, we are going to do some QC on the individual experiments. In particular, we are going to drop cells with abnormal number of UMIs and too low percentage of UMIs overlapping peaks. To define what is “abnormal” and what is not, we will use quantiles. Everything greater than the 95th or less than the 5th quantiles will be here considered as “abnormal” and therefore discarded.

Please, consider how applying these filters is optional and the thresholds used should be specifically set for each dataset as they usually are highly sample-specific. The thresholds we are applying here are quite harsh and, if applied to other datasets, might lead to removal of important biological information.

Please note how we are not here applying any filtering on the TSS enrichment score as while this might be relevant for ATAC it is probably not for H3K27ac and H3K27me3. In case, however, you would like to apply this filtering we recommend to follow Signac’s tutorial on this.

obj.ls.qc <- list()
quant_high <- 0.99
quant_low <- 0.01
for (experiment in names(obj.ls)) {

  # Calculate 5th and 95th quantiles for logUMI
  logUMI_cutoff_high <- quantile(obj.ls[[experiment]]$logUMI, quant_high)
  logUMI_cutoff_low  <- quantile(obj.ls[[experiment]]$logUMI, quant_low)
  # Calculate 5th quantile for logUMI for % fragments in peaks (here indicated by peak_ratio_MB)
  peak_ratio_MB_cutoff_low  <- quantile(obj.ls[[experiment]]$peak_ratio_MB, quant_low)

  # Filter out cells outside these thresholds
  obj.ls.qc[[experiment]] <- subset(obj.ls[[experiment]],
                                    logUMI > logUMI_cutoff_low &
                                      logUMI < logUMI_cutoff_high &
                                      peak_ratio_MB > peak_ratio_MB_cutoff_low)

  # print number of discarded cells
  old_n_cell <- nrow(obj.ls[[experiment]][[]])
  new_n_cell <- nrow(obj.ls.qc[[experiment]][[]])
  discarded <- old_n_cell - new_n_cell
  cat(experiment,"\n")
  cat("\tdiscarded",discarded,"cells (",round(discarded/old_n_cell*100,2),"%)\n")
}
## ATAC_TATAGCCT_sample_P23209 
##  discarded 81 cells ( 3.01 %)
## H3K27ac_ATAGAGGC_sample_P23209 
##  discarded 80 cells ( 3.01 %)
## H3K27me3_CCTATCCT_sample_P23209 
##  discarded 77 cells ( 2.86 %)
## ATAC_TATAGCCT_sample_P24004 
##  discarded 83 cells ( 2.95 %)
## H3K27ac_ATAGAGGC_sample_P24004 
##  discarded 85 cells ( 3 %)
## H3K27me3_CCTATCCT_sample_P24004 
##  discarded 85 cells ( 3.01 %)
plotCounts(obj = obj.ls, quantiles = c(quant_low,quant_high), feature = "logUMI")

plotCounts(obj = obj.ls, quantiles = c(quant_low), feature = "peak_ratio_MB",ylabel = "% UMI in peaks")

rm(obj.ls)


Number of cells for each sample and modality

Now we can plot, for each sample, the number of cells being simultaneously profiled by all the three modalities. Since we will annotate the cells based on their H3K27ac signal (see 6. Annotation), all the cells lacking the H3K27ac profiling will result as not annotated at the end of the processing of the data. So it is important that the majority of the cells have H3K27ac profiling and that, in general, there is a good overlap among the different modalities.

# Plot number of cells for each modality for each sample
venn1 <- commonCellHistonMarks(mod1 = obj.ls.qc[[paste0(modalities[1],"_",samples[1])]], 
                               name_mod1 = strsplit(modalities[1],"_")[[1]][1],
                               mod2 = obj.ls.qc[[paste0(modalities[2],"_",samples[1])]], 
                               name_mod2 = gsub("H3","",strsplit(modalities[2],"_")[[1]][1]),
                               mod3 = obj.ls.qc[[paste0(modalities[3],"_",samples[1])]], 
                               name_mod3 = gsub("H3","",strsplit(modalities[3],"_")[[1]][1]),
                               sample = samples[1] )
venn2 <- commonCellHistonMarks(mod1 = obj.ls.qc[[paste0(modalities[1],"_",samples[2])]], 
                               name_mod1 = strsplit(modalities[1],"_")[[1]][1],
                               mod2 = obj.ls.qc[[paste0(modalities[2],"_",samples[2])]], 
                               name_mod2 = gsub("H3","",strsplit(modalities[2],"_")[[1]][1]),
                               mod3 = obj.ls.qc[[paste0(modalities[3],"_",samples[2])]], 
                               name_mod3 = gsub("H3","",strsplit(modalities[3],"_")[[1]][1]),
                               sample = samples[2] )
ggarrange(venn1,venn2,ncol=2)

3. Merge Seurat objects

Now that we have one object for experiment containing only QC-passed cells, and that all the experiments share a common set of peaks, we can merge all the objects from different samples obtaining one Seurat object for modality.

Merge Seurat objects across different samples

combined.obj.ls <- list()
for (mod in modalities) {

  # merge objects among the different samples
  combined.obj.ls[[mod]] <- merge(obj.ls.qc[[paste0(mod,"_",samples[1])]],
                                  y = obj.ls.qc[[paste0(mod,"_",samples[2])]],
                                  add.cell.ids = c(samples[1], samples[2]) )

}
combined.obj.ls[["ATAC_TATAGCCT"]][["peaks"]]
## ChromatinAssay data with 116675 features for 5345 cells
## Variable features: 0 
## Genome: mm10 
## Annotation present: FALSE 
## Motifs present: FALSE 
## Fragment files: 2
combined.obj.ls[["H3K27ac_ATAGAGGC"]][["peaks"]]
## ChromatinAssay data with 69613 features for 5324 cells
## Variable features: 0 
## Genome: mm10 
## Annotation present: FALSE 
## Motifs present: FALSE 
## Fragment files: 2
combined.obj.ls[["H3K27me3_CCTATCCT"]][["peaks"]]
## ChromatinAssay data with 57176 features for 5358 cells
## Variable features: 0 
## Genome: mm10 
## Annotation present: FALSE 
## Motifs present: FALSE 
## Fragment files: 2


4. Normalisation and dimensional reduction

After having filter out low quality cells and having generated one Seurat object per each modality, we can proceed with normalisation and dimensionality reduction. Once again, we will follow Signac vignette.

# Normalisation
combined.obj.ls <- lapply(combined.obj.ls,RunTFIDF)
# a warning (Some features contain 0 total counts) might rise. This is due to the cell filtering performed in the QC step. Should be safe to proceed

# Variable features
combined.obj.ls <- lapply(combined.obj.ls,FindTopFeatures)

# Dim reduction
combined.obj.ls <- lapply(combined.obj.ls,RunSVD)

It usually happens that the first LSI (you can think it as a principal component, if you are more familiar with scrna-seq) is mostly associated to the sequencing depth, rather than biological variation. We will test this by running a modified version of the DepthCor function. The modifications simply consists in plotting the modality name as plot title.

plot.list <- lapply(combined.obj.ls,DepthCorMulMod)
ggarrange(plot.list[[1]],plot.list[[2]],plot.list[[3]],ncol=3)

Here, the correlation between the first LSI and sequencing depth is quite strong. We will discard it from further analyses.

Then, the dimensionality of the dataset should be determined. Although more sophisticated (and time consuming) methods have been implemented (like the JackStraw procedure), we think that the ‘Elbow plot’ could be informative enough. In the ‘Elbow plot’ the standard deviation explained by each LSI is plotted. When the the plateau is reached, it means that adding more LSI to our analysis does not actually increase the variability explained.

plot.list_elbow <- lapply(combined.obj.ls,ElbowPlot,reduction = "lsi",ndims = 50)
ggarrange(plot.list_elbow[[1]],plot.list_elbow[[2]],plot.list_elbow[[3]],ncol=3)

In this case, we believe that considering the first 40 LSI should be definitely more than enough for all the three modalities (probably even less than 40 should be fine).

5. Non-linear dimension reduction and clustering

Now that we have normalised and found variable features in the data, we can perform graph-based clustering and non-linear dimension reduction for visualization. This is done individually for each modality as different modalities might require different parameters (e.g., number of PCs, clustering resolution, ..).

To find the most appropriate parameter we have used, and suggest using, clustertree.

# Run dimension reduction and clustering on each modality individually

# ATAC
combined.obj.ls$ATAC_TATAGCCT <- RunUMAP(combined.obj.ls$ATAC_TATAGCCT,reduction = 'lsi', dims = 2:40)
combined.obj.ls$ATAC_TATAGCCT <- FindNeighbors(combined.obj.ls$ATAC_TATAGCCT,reduction = 'lsi', dims = 2:40)
combined.obj.ls$ATAC_TATAGCCT <- FindClusters(combined.obj.ls$ATAC_TATAGCCT, verbose = FALSE, algorithm = 3,resolution = .5)
# algorithm = 3 means SLM algorithm is used for modularity optimization

# H3K27ac
combined.obj.ls$H3K27ac_ATAGAGGC <- RunUMAP(combined.obj.ls$H3K27ac_ATAGAGGC,reduction = 'lsi', dims = 2:40)
combined.obj.ls$H3K27ac_ATAGAGGC <- FindNeighbors(combined.obj.ls$H3K27ac_ATAGAGGC,reduction = 'lsi', dims = 2:40)
combined.obj.ls$H3K27ac_ATAGAGGC <- FindClusters(combined.obj.ls$H3K27ac_ATAGAGGC, verbose = FALSE, algorithm = 3,resolution = .8)

# H3K27me3
combined.obj.ls$H3K27me3_CCTATCCT <- RunUMAP(combined.obj.ls$H3K27me3_CCTATCCT,reduction = 'lsi', dims = 2:40)
combined.obj.ls$H3K27me3_CCTATCCT <- FindNeighbors(combined.obj.ls$H3K27me3_CCTATCCT,reduction = 'lsi', dims = 2:40)
combined.obj.ls$H3K27me3_CCTATCCT <- FindClusters(combined.obj.ls$H3K27me3_CCTATCCT, verbose = FALSE, algorithm = 3,resolution = .8)
# Plot each modality separately
p1=DimPlot(combined.obj.ls$ATAC_TATAGCCT, label = TRUE) + NoLegend() +
  ggtitle("ATAC") +
  theme(plot.title = element_text(size=15,hjust=0.5,face='bold'))

p2=DimPlot(combined.obj.ls$H3K27ac_ATAGAGGC, label = TRUE) + NoLegend() +
  ggtitle("H3K27ac") +
  theme(plot.title = element_text(size=15,hjust=0.5,face='bold'))

p3=DimPlot(combined.obj.ls$H3K27me3_CCTATCCT, label = TRUE) + NoLegend() +
  ggtitle("H3K27me3") +
  theme(plot.title = element_text(size=15,hjust=0.5,face='bold'))

ggarrange(p1,p2,p3,ncol=3)


6. Annotation

Now that we have identified the different cell states, we need to annotate them, assigning to each of them a biological identity. To this end, we will estimate the activity of each gene by quantifying the number of fragments mapping to its gene body and 2kb upstream the TSS. This will be done by using the Signac function GeneActivity.

Of course, this has to be done on modalities marking transcriptionally active regions. In our case either ATAC or H3K27ac. We believe that H3K27ac is the active modality which might best predict gene activity, thus we will calculate the gene activity scores on nano CUT&Tag H3K27ac modality.

Once again, for most of these processes, we will here follow the Signac vignette.

6.1 Add gene annotations to the Seurat object

Before creating a gene activity matrix we first need to add gene annotations to the H3K27ac Seurat object.

# get gene annotations from Ensembl
annotations <- GetGRangesFromEnsDb(ensdb = genome_ann)

# change to UCSC style since the data was mapped to mm10
seqlevelsStyle(annotations) <- 'UCSC'

# add the gene information to the object
Annotation(combined.obj.ls$H3K27ac_ATAGAGGC) <- annotations


6.2 Create gene activity matrix

Having added the gene annotations, we can calculate the gene activity. This is done by the Signac GeneActivity function that: i) extracts gene coordinates and extends them to include the 2 kb upstream region and ii) counts the number of fragments for each cell that map to each of these regions. Finally, the gene activities are added to the Seurat object as ‘RNA’

# Calculate gene activities
gene.activities <- GeneActivity(combined.obj.ls$H3K27ac_ATAGAGGC)

# Add the gene activity matrix to the Seurat object as a new assay
combined.obj.ls$H3K27ac_ATAGAGGC[['RNA']] <- CreateAssayObject(counts = gene.activities)

# normalize RNA data
combined.obj.ls$H3K27ac_ATAGAGGC <- NormalizeData(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  assay = 'RNA',
  normalization.method = 'LogNormalize',
  scale.factor = median(combined.obj.ls$H3K27ac_ATAGAGGC$nCount_RNA)
)

# Set RNA as default assay
DefaultAssay(combined.obj.ls$H3K27ac_ATAGAGGC) <- 'RNA'


6.3 Visualisation of gene activity for some known marker genes

By plotting the activities of known canonical marker genes we can get some hints on the biological identities of most of the H3K27ac clusters. Although the signal might sometimes be a bit noisy, already by plotting the gene activity scores we can discriminate some cell states such as astrocytes (telencephalon and non-telencephalon), vascular cells (endothelial, smooth muscle, leptomeningeal cells), oligodendrocytes, neurons (excitatory and inhibitory), microglia, macrophages, Bergmann glia, olfactory ensheating cells and choroid plexus cells.

(These atlases are a great resource to manually annotate your favorite cell state!)

For conciseness, we here opted for a manual annotation, but the probably best practice would be to integrate one of the active nanoCUT&Tag modalities with scRNA-seq data. If interested in doing so, please follow this Signac tutorial.

Astrocytes telencephalon - AST-TE (clusters 0, 7)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Foxg1','Lhx2','Mfge8'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)


Vascular endothelial cells - VEC (cluster 1)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Gpr116','Cldn5','Emcn'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)


Astrocytes non-telencephalon - AST-NT (cluster 2)

FeaturePlot(object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Irx2','Agt','Slc6a9'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)


Excitatory neurons - EXC (clusters 3, 19, 21)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Neurod1','Neurod6'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 2,
  label = T)


Oligodendrocytes - OL (cluster 4)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Mag','Mbp'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)


Microglia - MGL (cluster 5)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('P2ry12','Ccr6','Selplg'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)


Inhibitory neurons - INH (clusters 6, 9, 16, 17, 18)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Gad1','Slc32a1'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 2,
  label = T)


Bergmann glia - BG (cluster 8)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('A2m','Gdf10','Hopx'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)


Vascular smooth muscle cells - VSMC (cluster 10)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Map3k7cl','Myh11','Acta2'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)


Oligodendrocyte precursor cells - OPC (cluster 11)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Pdgfra','Sox10','Olig1'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)


Macrophages - MAC (cluster 12)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('F13a1','Cd74'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 2,
  label = T)


Olfactory ensheating cells - OEC (cluster 13)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Frzb','Mgst1','Prss56'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)


Vascular leptomeningeal cells - VLMC (cluster 14)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Dcn','Slc6a13'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 2,
  label = T)


Choroid plexus - CHP (15)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Clic6','Folr1'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 2,
  label = T)


Ependymal cells - EPE (20)

FeaturePlot(
  object = combined.obj.ls$H3K27ac_ATAGAGGC,
  features = c('Foxj1','Pifo','Dynlrb2'),
  pt.size = 0.1,
  max.cutoff = 'q95',
  ncol = 3,
  label = T)

6.4 Final annotation

Now we are ready to annotate the clusters! We are going to generate here two layers of annotation, one more general and one more specific.

Layer 2 (cell-state-specific)

##### Annotation layer2
# create a named annotation vector
l2.ids <- c("0"="AST-TE1","1"="VEC","2"="AST-NTE","3"="EXC1","4"="OL",
            "5"="MGL","6"="INH1","7"="AST-TE2","8"="BG","9"="INH2",
            "10"="VSMC","11"="OPC","12"="MAC","13"="OEC","14"="VLMC",
            "15"="CHP","16"="INH3","17"="INH4","18"="INH5","19"="EXC2",
            "20"="EPE","21"="EXC3")
names(l2.ids) <- levels(combined.obj.ls$H3K27ac_ATAGAGGC)

# rename the cluster identities with the layer2 annotations
combined.obj.ls$H3K27ac_ATAGAGGC <- RenameIdents(combined.obj.ls$H3K27ac_ATAGAGGC,l2.ids)

# Add layer2 annotations to the object
combined.obj.ls$H3K27ac_ATAGAGGC <- AddMetaData(combined.obj.ls$H3K27ac_ATAGAGGC,
                                                Idents(combined.obj.ls$H3K27ac_ATAGAGGC),"layer2_annotation")

# UMAP plot
p1=DimPlot(combined.obj.ls$H3K27ac_ATAGAGGC, label = TRUE, repel = T) + NoLegend() +
  ggtitle("H3K27ac") +
  theme(plot.title = element_text(size=15,hjust=0.5,face='bold'))

Layer 1 (cell-type-specific)

##### General layer, L1
# create a named annotation vector
l1.ids <- c("AST-TE1"="Astroependymal","VEC"="Vascular","AST-NTE"="Astroependymal",
            "EXC1"="Neurons","OL"="Oligodendrocytes","MGL"="Immune","INH1"="Neurons",
            "AST-TE2"="Astroependymal","BG"="Astroependymal","INH2"="Neurons",
            "VSMC"="Vascular","OPC"="Oligodendrocytes","MAC"="Immune",
            "OEC"="OEC","VLMC"="Vascular","CHP"="Astroependymal","INH3"="Neurons",
            "INH4"="Neurons","INH5"="Neurons","EXC2"="Neurons","EPE"="Astroependymal","EXC3"="Neurons")



names(l1.ids) <- levels(combined.obj.ls$H3K27ac_ATAGAGGC)

# rename the cluster identities with the layer1 annotations
combined.obj.ls$H3K27ac_ATAGAGGC <- RenameIdents(combined.obj.ls$H3K27ac_ATAGAGGC,l1.ids)

# Add layer1 annotations to the object
combined.obj.ls$H3K27ac_ATAGAGGC <- AddMetaData(combined.obj.ls$H3K27ac_ATAGAGGC,
                                                Idents(combined.obj.ls$H3K27ac_ATAGAGGC),"layer1_annotation")

# UMAP plot
p2=DimPlot(combined.obj.ls$H3K27ac_ATAGAGGC, label = F) +
  ggtitle("H3K27ac") +
  theme(plot.title = element_text(size=15,hjust=0.5,face='bold'))


Finally, transfer the layer1 and layer2 annotations from the H3K27ac to the H3K27me3 and ATAC modalities.

# H3K27ac annotated cell states
idents_l1 <- combined.obj.ls$H3K27ac_ATAGAGGC$layer1_annotation
idents_l2 <- combined.obj.ls$H3K27ac_ATAGAGGC$layer2_annotation


# Transfer to ATAC and H3K27me3

# layer1
combined.obj.ls$ATAC_TATAGCCT <- AddMetaData(combined.obj.ls$ATAC_TATAGCCT,
                                             idents_l1,
                                             col.name='layer1_annotation')
combined.obj.ls$H3K27me3_CCTATCCT <- AddMetaData(combined.obj.ls$H3K27me3_CCTATCCT,
                                                 idents_l1,
                                                 col.name='layer1_annotation')

# layer2
combined.obj.ls$ATAC_TATAGCCT <- AddMetaData(combined.obj.ls$ATAC_TATAGCCT,
                                             idents_l2,
                                             col.name='layer2_annotation')
combined.obj.ls$H3K27me3_CCTATCCT <- AddMetaData(combined.obj.ls$H3K27me3_CCTATCCT,
                                                 idents_l2,
                                                 col.name='layer2_annotation')
# Plot each modality separately
p1=DimPlot(combined.obj.ls$ATAC_TATAGCCT, label = TRUE,repel = T, group.by = "layer2_annotation", label.size = 3) + 
  NoLegend() +
  ggtitle("ATAC") +
  theme(plot.title = element_text(size=15,hjust=0.5,face='bold'))

p2=DimPlot(combined.obj.ls$H3K27ac_ATAGAGGC, label = TRUE,repel = T, group.by = "layer2_annotation", label.size = 3) + 
  NoLegend() +
  ggtitle("H3K27ac") +
  theme(plot.title = element_text(size=15,hjust=0.5,face='bold'))

p3=DimPlot(combined.obj.ls$H3K27me3_CCTATCCT, label = TRUE,repel = T, group.by = "layer2_annotation", label.size = 3) + 
  NoLegend() +
  ggtitle("H3K27me3") +
  theme(plot.title = element_text(size=15,hjust=0.5,face='bold'))

ggarrange(p1,p2,p3,ncol=3)

plotConnectModal(seurat = combined.obj.ls, group = 'layer1_annotation')

saveRDS(combined.obj.ls, file = "../nanoscope_final_peaks.rds")


7. Session info

Session Info
sessionInfo()
## R version 4.2.2 (2022-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Monterey 12.5.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] patchwork_1.1.2            ggbio_1.46.0              
##  [3] ggVennDiagram_1.2.2        scales_1.2.1              
##  [5] regioneR_1.30.0            ComplexUpset_1.3.5        
##  [7] EnsDb.Mmusculus.v79_2.99.0 ensembldb_2.22.0          
##  [9] AnnotationFilter_1.22.0    GenomicFeatures_1.50.4    
## [11] AnnotationDbi_1.60.2       Biobase_2.58.0            
## [13] ggpubr_0.6.0               gghalves_0.1.4            
## [15] ggplot2_3.4.2              stringr_1.5.0             
## [17] future_1.32.0              GenomicRanges_1.50.2      
## [19] GenomeInfoDb_1.34.9        IRanges_2.32.0            
## [21] S4Vectors_0.36.2           BiocGenerics_0.44.0       
## [23] SeuratObject_4.1.3         Seurat_4.3.0              
## [25] Signac_1.9.0              
## 
## loaded via a namespace (and not attached):
##   [1] rappdirs_0.3.3              rtracklayer_1.58.0         
##   [3] scattermore_1.1             GGally_2.1.2               
##   [5] tidyr_1.3.0                 bit64_4.0.5                
##   [7] knitr_1.42                  irlba_2.3.5.1              
##   [9] DelayedArray_0.24.0         data.table_1.14.8          
##  [11] rpart_4.1.19                KEGGREST_1.38.0            
##  [13] RCurl_1.98-1.12             generics_0.1.3             
##  [15] cowplot_1.1.1               RSQLite_2.3.1              
##  [17] RANN_2.6.1                  proxy_0.4-27               
##  [19] bit_4.0.5                   spatstat.data_3.0-1        
##  [21] xml2_1.3.4                  httpuv_1.6.11              
##  [23] SummarizedExperiment_1.28.0 xfun_0.39                  
##  [25] hms_1.1.3                   jquerylib_0.1.4            
##  [27] evaluate_0.21               promises_1.2.0.1           
##  [29] fansi_1.0.4                 restfulr_0.0.15            
##  [31] progress_1.2.2              dbplyr_2.3.2               
##  [33] igraph_1.4.1                DBI_1.1.3                  
##  [35] htmlwidgets_1.6.2           reshape_0.8.9              
##  [37] spatstat.geom_3.2-1         purrr_1.0.1                
##  [39] ellipsis_0.3.2              dplyr_1.1.2                
##  [41] backports_1.4.1             biomaRt_2.54.1             
##  [43] deldir_1.0-9                MatrixGenerics_1.10.0      
##  [45] vctrs_0.6.2                 ROCR_1.0-11                
##  [47] abind_1.4-5                 cachem_1.0.8               
##  [49] withr_2.5.0                 RVenn_1.1.0                
##  [51] BSgenome_1.66.3             progressr_0.13.0           
##  [53] checkmate_2.2.0             sctransform_0.3.5          
##  [55] GenomicAlignments_1.34.1    prettyunits_1.1.1          
##  [57] goftest_1.2-3               cluster_2.1.4              
##  [59] lazyeval_0.2.2              crayon_1.5.2               
##  [61] spatstat.explore_3.2-1      units_0.8-2                
##  [63] pkgconfig_2.0.3             labeling_0.4.2             
##  [65] nlme_3.1-162                ProtGenerics_1.30.0        
##  [67] nnet_7.3-19                 rlang_1.1.1                
##  [69] globals_0.16.2              lifecycle_1.0.3            
##  [71] miniUI_0.1.1.1              filelock_1.0.2             
##  [73] BiocFileCache_2.6.1         dichromat_2.0-0.1          
##  [75] polyclip_1.10-4             matrixStats_0.63.0         
##  [77] lmtest_0.9-40               graph_1.76.0               
##  [79] Matrix_1.5-4.1              carData_3.0-5              
##  [81] zoo_1.8-12                  base64enc_0.1-3            
##  [83] ggridges_0.5.4              png_0.1-8                  
##  [85] viridisLite_0.4.2           rjson_0.2.21               
##  [87] bitops_1.0-7                KernSmooth_2.23-21         
##  [89] Biostrings_2.66.0           blob_1.2.4                 
##  [91] classInt_0.4-9              parallelly_1.35.0          
##  [93] spatstat.random_3.1-5       rstatix_0.7.2              
##  [95] ggsignif_0.6.4              memoise_2.0.1              
##  [97] magrittr_2.0.3              plyr_1.8.8                 
##  [99] ica_1.0-3                   zlibbioc_1.44.0            
## [101] compiler_4.2.2              BiocIO_1.8.0               
## [103] RColorBrewer_1.1-3          fitdistrplus_1.1-11        
## [105] Rsamtools_2.14.0            cli_3.6.1                  
## [107] XVector_0.38.0              listenv_0.9.0              
## [109] pbapply_1.7-0               htmlTable_2.4.1            
## [111] Formula_1.2-5               MASS_7.3-60                
## [113] tidyselect_1.2.0            stringi_1.7.12             
## [115] highr_0.10                  yaml_2.3.7                 
## [117] ggrepel_0.9.3               grid_4.2.2                 
## [119] sass_0.4.6                  VariantAnnotation_1.44.1   
## [121] fastmatch_1.1-3             tools_4.2.2                
## [123] future.apply_1.11.0         parallel_4.2.2             
## [125] rstudioapi_0.14             foreign_0.8-84             
## [127] gridExtra_2.3               farver_2.1.1               
## [129] Rtsne_0.16                  digest_0.6.31              
## [131] BiocManager_1.30.20         shiny_1.7.4                
## [133] Rcpp_1.0.10                 car_3.1-2                  
## [135] broom_1.0.4                 later_1.3.1                
## [137] RcppAnnoy_0.0.20            OrganismDbi_1.40.0         
## [139] httr_1.4.6                  biovizBase_1.46.0          
## [141] sf_1.0-12                   colorspace_2.1-0           
## [143] XML_3.99-0.14               tensor_1.5                 
## [145] reticulate_1.28             splines_4.2.2              
## [147] uwot_0.1.14                 RBGL_1.74.0                
## [149] RcppRoll_0.3.0              spatstat.utils_3.0-3       
## [151] sp_1.6-0                    plotly_4.10.1              
## [153] xtable_1.8-4                jsonlite_1.8.4             
## [155] R6_2.5.1                    Hmisc_5.1-0                
## [157] pillar_1.9.0                htmltools_0.5.5            
## [159] mime_0.12                   glue_1.6.2                 
## [161] fastmap_1.1.1               BiocParallel_1.32.6        
## [163] class_7.3-22                codetools_0.2-19           
## [165] utf8_1.2.3                  lattice_0.21-8             
## [167] bslib_0.4.2                 spatstat.sparse_3.0-1      
## [169] tibble_3.2.1                curl_5.0.0                 
## [171] leiden_0.4.3                survival_3.5-5             
## [173] rmarkdown_2.21              munsell_0.5.0              
## [175] e1071_1.7-13                GenomeInfoDbData_1.2.9     
## [177] reshape2_1.4.4              gtable_0.3.3