Seurat subset multiple conditions

6g and Extended Data Fig. ), Digitalization Initiative of the Zurich Higher Education Institutions Rapid-Action Call #2021.1_RAC_ID_34 (to C.C. The heterogeneity of Bm cells could be explained by several models38,39. | WhichCells(object = object, ident.remove = "ident.remove") | WhichCells(object = object, idents = "ident.remove", invert = TRUE) | In g, a two-sided Wilcoxon test was used with Holm multiple comparison correction. that a certain variable was either 1, 2 or 3. We did not assume a normal distribution for the flow cytometry data and used nonparametric tests such as Kruskal–Wallis to test for differences between continuous variables in more than two groups, and P values were adjusted for multiple testing using Dunn's method. 9e). ## locale: Germinal centre-driven maturation of B cell response to mRNA vaccination. 2a) of patient CoV-P1 pre-exposure to SARS-CoV-2, at days 33 and 152 post-symptom onset and at day 12 post-first dose of SARS-CoV-2 mRNA vaccination (that is, day 166 post-symptom onset). If so, would performing batch correction only on batches of the same diet, and then merging all the diets together without batch correction, be a valid method of retaining gene expression differences between diets but not between batches? An AUC value of 1 means that expression values for this gene alone can perfectly classify the two groupings (i.e. Reincke, M. E. et al. Asterisks indicate significantly different segment usage between S− and the respective S+ Bm cell subsets. Eight patients were vaccinated against SARS-CoV-2 (analyzed on average at day 144 after last vaccination), whereas the other eight patients were considered SARS-CoV-2-recovered based on a history of SARS-CoV-2 infection or positive anti-nucleocapsid (N) serum antibody measurement, with six of them additionally vaccinated against SARS-CoV-2 (assessed on average at day 118 post-last vaccination) (Extended Data Fig. USA 104, 9770–9775 (2007). I'm also interested in understanding better how to do this.
Bm cells specific for RBD, wild-type spike (SWT) or spike variants B.1.351 (Sbeta) and B.1.617.2 (Sdelta) were identified by SAV multimers carrying specific oligonucleotide barcodes. a, Cohort overview of SARS-CoV-2 Infection Cohort. Pseudotime-based trajectory analysis using Monocle 3 in our scRNA-seq dataset (Extended Data Fig. Sci. column name in object@meta.data, etc. VL segments were sorted by hierarchical clustering. 43, e47 (2015). For this, a count matrix was created with HC/LC segments as rows and samples as columns. Conversely, the frequency of S+ CD21−CD27− Bm cells rose quickly and remained stable over 150 days post-vaccination, accounting for about 20% of S+ Bm cells (Fig. SCT_integrated <- FindNeighbors(SCT_integrated, dims = 1:15) Bm cells are colored by cluster (f, left), tissue origin (f, right) or SWT binding (g). But I am not sure which assay should be used for FindVariableFeatures on the subset cells: RNA, SCT, or integrated? I would like some help with this thread as well. We used an adaptation of LIBRA-seq68 to identify antigen-specific cells in our sequencing data. Choose a subset of cells, then split by sample and re-run the integration steps (select integration features, find anchors and integrate data). CyTOF workflow: differential discovery in high-throughput high-dimensional cytometry datasets. 4a,b). Naturally enhanced neutralizing breadth against SARS-CoV-2 one year after infection. However, this brings the cost of flexibility. 25,26,27,28,29). h, Expression of selected genes (left) and surface protein markers (right) are shown in Bm cell clusters. Nature 604, 141–145 (2022).
high.threshold = Inf, Using this subsetted data, I tried 4 different approaches: Approach 1: Default reintegration > Re-cluster (following, Approach 2: SCT reintegration > Re-cluster (following, Approach 3: No re-integration > Re-scale > Re-cluster (following, Approach 4: No re-integration > SC transform > Re-cluster (following. Also, instead of changing the default assay to "RNA", finding the variable features, and changing the default assay back to "integrated", would it make more sense to just delete those lines of code and just change: Andreas E. Moor or Onur Boyman. Isn't the whole point of integration to remove batch effects? Could you please let me know if the steps below are the correct way to go about identifying clusters and markers? Human memory B cells show plasticity and adopt multiple fates upon recall response to SARS-CoV-2, https://doi.org/10.1038/s41590-023-01497-y. & Cancro, M. P. Age-associated B cells: key mediators of both protective and autoreactive humoral responses. Transcriptomes of individual cells were used as inputs for the gsva() function with default parameters. SARS-CoV-2 infection generates tissue-localized immunological memory in humans. Defining antigen-specific plasmablast and memory B cell subsets in human blood after viral infection or vaccination. The alternative would be to subset() the population of interest and run the complete preprocessing, including integration, only on those cells again. We performed scRNA-seq combined with feature barcoding, which allowed us to assess surface phenotype and to perform BCR-seq in sorted S+ Bm cells and S− B cells from paired blood and tonsil samples of four patients (two SARS-CoV-2-recovered and two SARS-CoV-2-vaccinated). In h, a two-sided Wilcoxon rank sum test was used, and P values were corrected by Bonferroni correction. Each of the cells in cells.1 exhibit a higher level than each of the cells in cells.2). 1g and Extended Data Fig.
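The AUC statement quoted above (a value of 1 means the gene alone perfectly separates cells.1 from cells.2) can be illustrated with a small base-R sketch. This is not Seurat's own implementation; it assumes the usual identity between the AUC and the Mann-Whitney U statistic, and the function name gene_auc and the expression values are hypothetical.

```r
# AUC of a single gene as a classifier between two groups of cells,
# computed from the Mann-Whitney U statistic: AUC = U / (n1 * n2).
# gene_auc is a hypothetical helper, not part of Seurat.
gene_auc <- function(x1, x2) {
  r <- rank(c(x1, x2))                          # average ranks handle ties
  n1 <- length(x1)
  n2 <- length(x2)
  u <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2  # U statistic for group 1
  u / (n1 * n2)
}

cells_1 <- c(5.1, 4.8, 6.0)       # hypothetical expression values
cells_2 <- c(1.2, 0.0, 2.3, 3.1)

auc_up   <- gene_auc(cells_1, cells_2)   # 1: every cells_1 value is higher
auc_down <- gene_auc(cells_2, cells_1)   # 0: every cells_2 value is lower
auc_none <- gene_auc(c(1, 2), c(1, 2))   # 0.5: no separation at all
print(c(auc_up, auc_down, auc_none))
```

Values near 1 or near 0 both indicate a gene that separates the groups well (in opposite directions), while values near 0.5 indicate no discriminating power.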
c, Stacked bar graphs show single patient contribution to the WNN clusters. Does anyone have an idea how I can automate the subset process? ## [19] ROCR_1.0-11 limma_3.54.1 globals_0.16.2 Thank you. Hugo. With Seurat, you can easily switch between different assays at the single cell level (such as ADT counts from CITE-seq, or integrated/batch-corrected data). @vertesy just came here to chime in after seeing your comment mate, so I tried what you are suggesting, and I see no marked difference. In fact, I don't have the data to show right now because I have a lot on my plate currently, but subset>integrate>re-cluster is more laborious and less useful than integrate>subset>re-cluster. I did integration with SCTransform. SCT_integrated <- IntegrateData(anchorset = SCT_Integrated.anchors, normalization.method = "SCT", features.to.integrate = rownames(SCT_Integrated)) The SWT+ Bm cells in the IgG+CD27hiCD45RBhi cluster (cluster 5) were mainly from blood, in the IgG+CD21hi cluster (cluster 2) predominantly tonsillar, while the IgG+CD27lo cluster (cluster 4) contained SWT+ Bm cells from both compartments. As far as heterogeneity goes, if you keep sub-sampling until you reach 2 cells you will find differences even between them. One way to look broadly at these changes is to plot the average expression of both the stimulated and control cells and look for genes that are visual outliers on a scatter plot. Density plots indicating count distributions across binding score ranges are shown on top and on the side. On the basis of our data, we suggest a linear–plastic model where the antigen stimulation and GC maturation of SARS-CoV-2-specific B cells resulted in the gradual adoption of a CD21+Ki-67lo resting Bm cell state at months 6–12 post-infection. (palm-face-impact) @MariaK where were you 3 months ago?!
a, WNN UMAP was derived from scRNA-seq dataset at months 6 and 12 post-infection (n=9) and colored by indicated Bm cell subsets (top) and S+ and S− separated by month 6 preVac, month 12 nonVac and month 12 postVac (bottom). control_subset <- RunPCA(control_subset, npcs = 30, verbose = FALSE) to Branch lengths represent mutation numbers per site between each node. I have a Seurat object that I have run through doubletFinder. (I assume if I just need to delete the 3 lines of code I just mentioned above and change 3j,k). To subset the Seurat object, the SubsetData() function can be used (in Seurat v3 and later, subset() replaces it). Semilog line was fitted to data (R² = 0.2695). Johnson, J. L. et al. Additionally, genes like CXCL10, which we saw were specific to the monocyte and B cell interferon response, show up as highly significant in this list as well. Hi All, They donated blood before vaccination, at days 8–13 (week 2) post-second dose, 6 months after the second dose and days 11–14 post-third dose. 2d). I am worried that the top variable features of the original Seurat object are not the same as the variable features of the new subset. | SetIdent(object = object, ident.use = "new.idents") | Idents(object = object) <- "new.idents" | To extend our analyses to SARS-CoV-2-specific Bm cells in the peripheral lymphoid organs, we analyzed paired tonsil and blood samples from a cohort of 16 patients (9 females and 7 males) undergoing tonsillectomy who were exposed to SARS-CoV-2 by infection, vaccination or both. Systemic and mucosal antibody responses specific to SARS-CoV-2 during mild versus severe COVID-19. I have increased the resolution in FindClusters to analyze the integrated object and get my clusters of interest subclustered enough for DEG analysis, but would simply like a new UMAP plot to visualize expression within that group of clusters.
Gene set enrichments for individual cells were summarized to patient pseudobulks by calculating mean enrichment values of cells belonging to the same patient. Our work also provides insight into the CD21−CD27− Bm cells, which made up a sizeable portion of Bm cells following acute viral infection and vaccination in humans. Does this batch correction overfit the data so much that legitimate biological differences in gene expression profiles of cells from different diets (HFD, LFD, Chow) are gone? Hi all, I'm also interested in this topic: what is the best way to subset and recluster data starting from an integrated dataset? If NULL ), Swiss Academy of Medical Sciences (SAMW) fellowships (#323530-191230 to Y.Z. 7 Phenotypic and functional characterization of circulating S, Extended Data Fig. All authors edited and approved the final paper. 9 scRNA-seq B cell receptor (BCR) repertoire and Monocle analysis. PLoS Comput. ), Innovation grant of University Hospital Zurich (to O.B. Patients with COVID-19 and healthy individuals were recruited at one of four hospitals in the Canton of Zurich, Switzerland. Comparison of V heavy and light chain usage within S+ Bm cell subsets in the scRNA-seq data from SARS-CoV-2-recovered individuals (months 6 and 12 post-infection) revealed very similar chain usage in S+ CD21+ resting (CD21+CD27+ and CD21+CD27− combined), CD21−CD27+CD71+ activated and CD21−CD27−FcRL5+ Bm cells (Extended Data Fig. Cell Rep. 37, 109823 (2021). VH and V light (VL) genes are indicated on top of dendrograms.
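The pseudobulk summary described above (mean enrichment value of the cells belonging to each patient) can be sketched in base R as follows. The data frame, its column names and its values are hypothetical placeholders, not data from the study.

```r
# Hypothetical per-cell gene set enrichment scores with a patient label per cell.
scores <- data.frame(
  patient    = c("P1", "P1", "P2", "P2", "P2"),
  enrichment = c(0.2, 0.4, -0.1, 0.3, 0.1),
  stringsAsFactors = FALSE
)

# Patient pseudobulk: mean enrichment over all cells of the same patient.
pseudobulk <- tapply(scores$enrichment, scores$patient, mean)
print(pseudobulk)
```

The result is a named vector with one value per patient, which can then be used as the unit of comparison between groups instead of individual cells.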
| FontSize | Set font sizes for various elements of a plot | Severe deficiency of switched memory B cells (CD27+IgM−IgD−) in subgroups of patients with common variable immunodeficiency: a new approach to classify a heterogeneous disease. | MergeSeurat(object1 = object1, object2 = object2) | merge(x = object1, y = object2) |. The scRNA-seq dataset identified a trend towards increased clonality of S+ Bm cells in the six patients vaccinated between month 6 and month 12 post-infection when comparing pre-vaccination with post-vaccination (Fig. 10, eaan8405 (2018). Conversely, CD21+CD27+ and CD21+CD27− Bm cells were prominent at months 6 and 12, amounting to 60.5% and 29.1% of S+ Bm cells at month 12, respectively (Fig. Ritchie, M. E. et al. P values are provided if significant (p<0.05) between the S− and S+ Bm cell subsets. ## [115] lmtest_0.9-40 jquerylib_0.1.4 RcppAnnoy_0.0.20 Choose a subset of cells, and use the integrated assay to run PCA, UMAP, FindNeighbors and FindClusters to do subclustering. ; #323530-177975 to S.A.; #323530-191220 to C.C. Note that plotting functions now return ggplot2 objects, so you can add themes, titles, and, "2,700 PBMCs clustered using Seurat and viewed\non a two-dimensional tSNE", # Plotting helper functions work with ggplot2-based scatter plots, such as DimPlot, FeaturePlot, CellScatter, and. Poon, M. M. L. et al. Sci. Immunity 52, 842–855.e6 (2020). O.B. Each set of modal data (eg. At month 6 post-infection (pre-vaccination), 80% of those 30 clones had a CD21+ resting Bm cell phenotype (Fig. Long-lived plasma cells can continuously secrete high-affinity antibodies that are protective against a homologous pathogen7, whereas Bm cells encode a broader repertoire which allows protection against variants of the initial pathogen after restimulation8.
Subsetting the pre-integration data to the cells of interest and then redoing the whole integration, followed by PCA, UMAP, FindNeighbors and FindClusters, seemed reasonable to me. We can explore these marker genes for each cluster and use them to annotate our clusters as specific cell types. J. Immunol. The method is named sctransform, and avoids some of the pitfalls of standard normalization workflows, including the addition of a pseudocount, and log-transformation. P.T. Longitudinal tracking of S+ Bm cell clones between month 6 and month 12 post-infection identified 30 persistent clones in individuals vaccinated during that period (Fig. For f and g, statistical analysis of the gene set enrichment and variation analyses was performed as outlined in Methods, and all adjusted P values are shown. object, Downstream analysis was conducted in R version 4.1.0 mainly with the package Seurat (v4.1.1) (ref. Zumaquero, E. et al. Extended Data Fig. *P<0.05, **P<0.01, ***P<0.001, ****P<0.0001. a, Sorting strategy for SARS-CoV-2 S+ Bm cells and S− B cells, gated on CD19+ non-PB, for scRNA-seq is provided. The clonality distance threshold was set to 0.20 for the longitudinal analysis of the SARS-CoV-2 Infection Cohort dataset and to 0.05 for the SARS-CoV-2 Tonsil Cohort dataset. Note that @timoast from the Seurat team recommended otherwise, although I have never seen an explanation of why this would not be the best way to go. i, SHM counts are provided for naive B cells (n=1,607), blood (n=170) and tonsillar SWT+ Bm cells (n=1,128). These observations in circulating Bm cells were paralleled by the appearance of resting Bm cells in tonsils, where they showed high expression of CD69 and CD21 and comparable SHM counts to circulating Bm cells. J. Clin. After sorting, cell suspensions were pelleted at 400g for 10 min at 4 °C, resuspended and loaded into the Chromium Chip following the manufacturer's instructions.
wrote the paper with contribution by J.M., K.W. ## [97] compiler_4.2.0 plotly_4.10.1 png_0.1-8 8b,c). Replies here and in some other GitHub issues have slightly different approaches but they all make general sense. So I have a couple of questions regarding my workflow: For downstream DE analysis, the scale.data slot in the SCT assay has disappeared after integration. I have a Seurat object with 10 samples (5 in duplicates). J. Immunol. | WhichCells(object = object, ident = "ident.keep") | WhichCells(object = object, idents = "ident.keep") | c, Stacked bar plots (mean + standard deviation) represent isotypes in blood and tonsillar S+ Bm cells from both SARS-CoV-2-vaccinated and SARS-CoV-2-recovered individuals (n=16; also applies to d and e). Transl. d, Violin plots of frequencies of Bm cell subsets of S+ Bm cells at the indicated time points. 2 Flow cytometry gating strategies and frequencies of SARS-CoV-2 spike-specific B, Extended Data Fig. 1b and Extended Data Fig. seurat_object <- subset(seurat_object, subset = DF.classifications_0.25_0.03_252 == 'Singlet') # this approach works I would like to automate this process, but the _0.25_0.03_252 suffix of DF.classifications_0.25_0.03_252 is based on values that are calculated and will not be known in advance. Is it necessary to run FindVariableFeatures on the RNA assay of the subset and get new variable features to use in PCA in order to properly cluster the subset? Which of course included re-calculating the variable genes (on the "RNA" Slot) and re-integration. control_subset <- RunUMAP(control_subset, dims = 1:15) and J.N. For scRNA-seq data, distribution was assumed to be normal, but this was not formally tested. Peer reviewer reports are available. Holla, P. et al.
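One possible way to automate the doubletFinder subset asked about above is to look up the classification column by its prefix instead of hard-coding the suffix. The sketch below uses a plain data frame as a stand-in for object@meta.data so that it runs without Seurat; the cell names and values are hypothetical, and the commented lines at the end assume an object named seurat_object.

```r
# Stand-in for a Seurat object's meta.data: one row per cell (hypothetical values).
meta <- data.frame(
  nCount_RNA = c(1200, 3400, 2100, 5000),
  pANN_0.25_0.03_252 = c(0.10, 0.62, 0.08, 0.71),
  DF.classifications_0.25_0.03_252 = c("Singlet", "Doublet", "Singlet", "Doublet"),
  row.names = c("cell_1", "cell_2", "cell_3", "cell_4"),
  stringsAsFactors = FALSE
)

# Find the classification column without knowing its numeric suffix.
df_col <- grep("^DF\\.classifications", colnames(meta), value = TRUE)
stopifnot(length(df_col) == 1)  # fail loudly if zero or several columns match

# Names of the cells called singlets.
singlet_cells <- rownames(meta)[meta[[df_col]] == "Singlet"]
print(singlet_cells)

# With a real Seurat object (assumed here to be named seurat_object), the same idea would read:
# md <- seurat_object@meta.data
# df_col <- grep("^DF\\.classifications", colnames(md), value = TRUE)
# seurat_object <- subset(seurat_object, cells = rownames(md)[md[[df_col]] == "Singlet"])
```

Passing cell names through the cells argument avoids having to build the column name into the subset expression. If doubletFinder was run more than once, several columns will match and the check above stops so that the right one can be chosen explicitly.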
Single-cell RNA sequencing (scRNA-seq) indicated that single Bm cell clones adopted different fates upon antigen reexposure. f, Violin plots of percentages of Ki-67+ S+ Bm cells are shown at indicated timepoints. How can I find help page about "%in%"? 11, 2664 (2020). The S+ CD21−CD27− Bm cells identified here were transcriptionally very similar to their atypical counterparts in SLE. Identified Bm cells (SARS-CoV-2 S− B cells, n=2258; SWT+ Bm cells, n=1298) were subsequently reclustered as indicated in the box. Yang, R. et al. e, Stacked bar graphs (mean + SD) display isotype distribution in S+ Bm cell subsets in samples of SARS-CoV-2-recovered individuals postVac at months 6 and 12 post-infection from flow cytometry dataset (n=37).
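On the `%in%` question and the title topic of subsetting on multiple conditions: the help page can be opened with ?"%in%" or help("%in%"), and several conditions can be combined with & (and) or | (or). The sketch below again uses a plain data frame in place of object@meta.data; the column names seurat_clusters and diet and all values are hypothetical.

```r
# Stand-in for object@meta.data (hypothetical clusters and diets).
meta <- data.frame(
  seurat_clusters = c(1, 2, 3, 4, 2, 3),
  diet = c("HFD", "LFD", "Chow", "HFD", "HFD", "LFD"),
  row.names = paste0("cell_", 1:6),
  stringsAsFactors = FALSE
)

# Condition 1: cluster is 1, 2 or 3. Condition 2: diet is HFD or LFD.
keep <- meta$seurat_clusters %in% c(1, 2, 3) & meta$diet %in% c("HFD", "LFD")
selected_cells <- rownames(meta)[keep]
print(selected_cells)

# On a Seurat object (assumed to be named object and to carry these metadata columns):
# object_sub <- subset(object, subset = seurat_clusters %in% c(1, 2, 3) & diet %in% c("HFD", "LFD"))
```

Using `%in%` rather than a chain of == comparisons keeps the expression short when a variable may take one of several values.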