Computational Biology Publications

October 29, 2024

BiP/GRP78 is a pro-viral factor for diverse dsDNA viruses that promotes the survival and proliferation of cells upon KSHV infection

Researchers discovered that the ER chaperone BiP (HSPA5) is upregulated during the lytic phase of Kaposi’s sarcoma-associated herpesvirus (KSHV) infection, independent of the unfolded protein response. Inhibiting BiP genetically or pharmacologically halts KSHV replication and reduces infected cell proliferation. This inhibition also limits the spread of other herpesviruses and poxviruses with minimal toxicity to normal cells, suggesting BiP as a potential target for broad-spectrum antiviral therapies and treatment of KSHV-related cancers.

October 29, 2024

Microbial dynamics and pulmonary immune responses in COVID-19 secondary bacterial pneumonia

Secondary bacterial pneumonia (2°BP) is associated with significant morbidity following respiratory viral infection, yet remains incompletely understood. In a prospective cohort of 112 critically ill adults intubated for COVID-19, we comparatively assess longitudinal airway microbiome dynamics and the pulmonary transcriptome of patients who developed 2°BP versus controls who did not. Taken together, our findings provide fresh insights into the microbial dynamics and host immune features of COVID-19-associated 2°BP, and suggest that suppressed immune signaling, potentially mediated by corticosteroid treatment, permits expansion of opportunistic bacterial pathogens.

October 18, 2024

Zebrahub-Multiome: Uncovering Gene Regulatory Network Dynamics During Zebrafish Embryogenesis

A sequel to Zebrahub, Zebrahub-Multiome adds epigenomics as an additional modality to examine more closely how gene regulatory networks are shaped during zebrafish development.

October 3, 2024

Impact of doxycycline post-exposure prophylaxis for sexually transmitted infections on the gut microbiome and antimicrobial resistome

Doxycycline post-exposure prophylaxis (doxy-PEP) reduces bacterial sexually transmitted infections among men who have sex with men and transgender women. Although poised for widespread clinical implementation, the impact of doxy-PEP on antimicrobial resistance remains a primary concern as its effects on the gut microbiome and resistome, or the antimicrobial resistance genes (ARGs) present in the gut microbiome, are unknown.

June 29, 2024

Challenges and Progress in RNA Velocity: Comparative Analysis Across Multiple Biological Contexts

A comparative analysis of the pros, cons, and caveats of different RNA velocity methods for predicting cell state changes from single-cell sequencing data.

June 26, 2024

protoSpaceJAM: an open-source, customizable and web-accessible design platform for CRISPR/Cas insertional knock-in

Here, we present protoSpaceJAM, an open-source algorithm to automate and optimize gRNA and HDR donor design for CRISPR/Cas insertional knock-in experiments at the genome-wide scale. protoSpaceJAM utilizes biological rules to rank gRNAs based on specificity, distance to insertion site, and position relative to regulatory regions. protoSpaceJAM can introduce ‘recoding’ mutations (silent mutations and mutations in non-coding sequences) in HDR donors to prevent re-cutting and increase knock-in efficiency.

April 22, 2024

Single-cell analysis reveals M. tuberculosis ESX-1-mediated accumulation of permissive macrophages in infected mouse lungs

Single-cell profiling identified Mycobacterium tuberculosis ESX-1-mediated recruitment of immunosuppressive lung macrophages as bacterial reservoirs. MTB induces an anti-inflammatory transcriptional signature in mononuclear phagocytes and in bone marrow-derived macrophages in an ESX-1-dependent manner. Spatial transcriptomics revealed an upregulation of anti-inflammatory signals within MTB lesions, where monocyte-derived macrophages concentrate near MTB-infected cells.

April 19, 2024

Dysregulation of CD4+ and CD8+ resident memory T, myeloid, and stromal cells in steroid-experienced, checkpoint inhibitor colitis

A single-cell RNA sequencing atlas of ulcerative colitis (scRNA-seq and CITE-seq) built to study the effect of checkpoint inhibitors, done in collaboration with UCSF.

April 18, 2024

Simultaneous detection of pathogens and antimicrobial resistance genes with the open source, cloud-based, CZ ID pipeline

Antimicrobial resistant (AMR) pathogens represent urgent threats to human health, and their surveillance is of paramount importance. To address this need, we developed the Chan Zuckerberg ID (CZ ID) AMR module, an open-access, cloud-based workflow designed to integrate detection of both microbes and AMR genes in mNGS and whole-genome sequencing (WGS) data. We highlight diverse applications of the AMR module through analysis of both publicly available and newly generated mNGS and WGS data from four clinical cohort studies and an environmental surveillance project.

February 19, 2024

Single-cell and spatial multi-omics highlight effects of anti-integrin therapy across cellular compartments in ulcerative colitis

A single-cell multi-omic sequencing atlas of ulcerative colitis (scRNA-seq, CITE-seq, spatial transcriptomics, and spatial proteomics) built to study the effect of vedolizumab, done in collaboration with UCSF.

January 30, 2024

PoMeLo: a systematic computational approach to predicting metabolic loss in pathogen genomes

Genome streamlining, the process by which genomes become smaller and encode fewer genes over time, is a common phenomenon among pathogenic bacteria. Characterizing genome streamlining, gene loss, and metabolic pathway degradation can be useful in assessing pathogen dependency on host metabolism and identifying potential targets for host-directed therapeutics. PoMeLo (Predictor of Metabolic Loss) is a novel evolutionary genomics-guided computational approach for identifying metabolic gaps in the genomes of pathogenic bacteria.

January 2, 2024

The antibiotic resistance reservoir of the lung microbiome expands with age in a population of critically ill patients

The lung microbiome can influence susceptibility to respiratory tract infections and represents an important reservoir for exchange of antimicrobial resistance genes. Using a multivariable logistic regression model, we find that detection of antimicrobial resistance gene expression was significantly higher in adults compared with children after adjusting for demographic and clinical characteristics. This association remained significant after additionally adjusting for lung bacterial microbiome characteristics, and when modeling age as a continuous variable.

December 18, 2023

Global organelle profiling reveals subcellular localization and remodeling at proteome scale

Here, we present a high-resolution strategy to map subcellular organization using organelle immuno-capture coupled to mass spectrometry. Applying this strategy to characterize the cellular landscape following hCoV-OC43 viral infection, we discover that many proteins are regulated by changes in their spatial distribution rather than by changes in their total abundance.

November 18, 2023

Building up a genomic surveillance platform for SARS-CoV-2 in the middle of a pandemic: a true North–South collaboration

Low- and middle-income countries face additional challenges in establishing, maintaining and expanding genomic surveillance. We present our experience of establishing a genomic surveillance system at the Aga Khan University, Karachi, Pakistan. Our experience offers lessons for the successful development of genomic surveillance infrastructure in resource-limited settings struck by a pandemic.

November 2, 2023

CZ CELL×GENE Discover: A single-cell data platform for scalable exploration, analysis and modeling of aggregated data

Here, we present CZ CELLxGENE Discover, a single-cell sequencing data platform that provides curated and interoperable data for over 50 million cells via a free-to-use online data portal. A suite of tools and features enables accessibility and reusability of the data via both computational and visual interfaces to allow researchers to rapidly explore individual datasets and perform cross-corpus analysis.

August 3, 2023

Evolutionary genomics identifies host-directed therapeutics to treat intracellular bacterial infections

Obligate intracellular bacteria shed essential biosynthetic pathways during their evolution towards host dependency, providing an opportunity for host-directed therapeutics. With Rickettsiaceae as a model, we employed a novel evolutionary genomics-guided approach to systematically compare this cytosolic family of bacteria to the related Anaplasmataceae that reside in the host vacuole. Testing inhibitors against 14 metabolic pathways missing from the pathogen resulted in reduced bacterial growth without host cell cytotoxicity, supporting the feasibility of our approach for host-directed drug development against obligate pathogens.

July 11, 2023

Tutorial: guidelines for manual cell type annotation of single-cell multi-omics datasets using interactive software

Guidelines for manual cell type annotation of single-cell multi-omic datasets using interactive software (CZ CELLxGENE).

September 20, 2022

AIRRscape: an interactive tool for exploring B-cell receptor repertoires and antibody responses

Technological advances in next-generation sequencing have allowed for broad experimental sampling of immune repertoires, providing insight into how our immune system responds to infection, vaccination, autoimmunity, and cancer. The scale of these “big data”, however, makes it difficult to bioinformatically extract the key sequence features that are shared across multiple repertoires. With AIRRscape, we enable large-scale immune repertoire visualization and analysis that requires no knowledge of the command line or advanced programming. By providing the community with an open-source, interactive, and user-friendly interface, we reduce the barriers to exploring immune repertoires at scale.

September 17, 2022

ortho_seqs: A Python tool for sequence analysis and higher order sequence–phenotype mapping (preprint)

An important goal in sequence analysis is to understand how parts of DNA, RNA, or protein sequences interact with each other and to predict how these interactions result in given phenotypes. Mapping phenotypes onto underlying sequence space at first- and higher order levels in order to independently quantify the impact of given nucleotides or residues along a sequence is critical to understanding sequence–phenotype relationships. We developed a Python software tool, ortho_seqs, that quantifies higher order sequence-phenotype interactions based on our previously published method of applying multivariate tensor-based orthogonal polynomials to biological sequences.

May 13, 2022

The Tabula Sapiens: A multiple-organ, single-cell transcriptomic atlas of humans

Molecular characterization of cell types using single-cell transcriptome sequencing is revolutionizing cell biology and enabling new insights into the physiology of human organs. We created a human reference atlas comprising nearly 500,000 cells from 24 different tissues and organs, many from the same donor. This atlas enabled molecular characterization of more than 400 cell types, their distribution across tissues, and tissue-specific variation in gene expression.

September 21, 2021

Leveraging the Cell Ontology to classify unseen cell types

Here, we present OnClass, an algorithm and accompanying software for automatically classifying cells into cell types that are part of the controlled vocabulary that forms the Cell Ontology.

December 11, 2020

Analyzing genomic data using tensor-based orthogonal polynomials with application to synthetic RNAs

An important goal in molecular biology is to quantify both the patterns across a genomic sequence and the relationship between phenotype and underlying sequence. We propose a multivariate tensor-based orthogonal polynomial approach to characterize nucleotides or amino acids in a given sequence and map corresponding phenotypes onto the sequence space.

October 19, 2020

MARS: discovering novel cell types across heterogeneous single-cell experiments

Although tremendous effort has been put into cell-type annotation, identification of previously uncharacterized cell types in heterogeneous single-cell RNA-seq data remains a challenge. Here we present MARS, a meta-learning approach for identifying and annotating known as well as new cell types.

July 15, 2020

A single-cell transcriptomic atlas characterizes ageing tissues in the mouse

Despite rapid advances over recent years, many of the molecular and cellular processes that underlie the progressive loss of healthy physiology are poorly understood. To gain a better insight into these processes, here we generate a single-cell transcriptomic atlas across the lifespan of Mus musculus that includes data from 23 tissues and organs.

October 18, 2018

Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris

Here we present a compendium of single-cell transcriptomic data from the model organism Mus musculus that comprises more than 100,000 cells from 20 organs and tissues.
