Yasin Şenbabaoğlu stands on the rooftop at CZ Biohub San Francisco, where a newly installed cooling system helps maintain the temperature of the Biohub’s high-performance computing cluster. (Credit: Dale Ramos)

Scientists are an intriguing breed, and Yasin Şenbabaoğlu sorts them into two distinct pools: those who delight in uncovering new insights, and those who thrive on designing the tools that make such discoveries possible.

Ironically, Şenbabaoğlu (pronounced “SHEN-buhbuh-o-loo”) himself has spent his scientific career refusing to confine his interests, finding joy in both discovery and tool building. Now, as Director of Computational Biology at the Chan Zuckerberg Biohub San Francisco, he’s embracing new opportunities to pursue his passions at the intersection of biology and computer science.

In his new role, he’s eager to deepen collaborations among Biohub researchers working to develop a range of novel data- and image-analysis tools that are capable of making meaningful predictions, such as how viral infections affect cell function, or the consequences of gene deactivation on observable traits.

“If you look at breakthroughs, there’s a pattern of two things coming together — curiosity-driven science and technological innovation,” Şenbabaoğlu says. “Without one or the other, progress is nearly impossible.”

Letting numbers guide

Şenbabaoğlu’s journey into computational biology began during his childhood in Istanbul, Turkey, when he reached out to professors at a local university for a high school science project. One scientist had a study underway investigating how the body’s internal clock, which synchronizes physiological processes in circadian rhythms, influences how we respond to medication.

This work in chronobiology, a field that studies time-related phenomena in living organisms, was the perfect blend of mathematics — Şenbabaoğlu’s favorite subject — and biology. However, there was one problem: the university’s ethics committee did not permit high school students to conduct experiments with mice. Undeterred, Şenbabaoğlu convinced the researcher to let him study similar mechanisms in plants instead.

After high school, his drive to merge biology and mathematics led him to upstate New York, where he pursued an undergraduate degree in biology, with concentrations in computational biology and statistical genomics, at Cornell University. This desire to solve biological problems using quantitative methods continued to evolve, leading to a master’s degree in statistics and a Ph.D. in bioinformatics.

As a postdoctoral fellow at Memorial Sloan Kettering Cancer Center, Şenbabaoğlu worked alongside cell biologists and cancer specialists to explore questions in immunotherapy, a treatment approach that rallies the immune system to attack cancer cells. He focused on understanding why immune cells sometimes fail to activate or, in other instances, promote malignant growth within the complex ecosystem, or “microenvironment,” that surrounds a tumor.

“The tumor microenvironment is pretty messy,” Şenbabaoğlu explains. “We worked hard to untangle it to show the different types of cells present and their association with patient health outcomes.”

At that time, Şenbabaoğlu and his team were studying clear cell renal cell carcinoma (ccRCC), a type of kidney cancer characterized by high levels of immune cells in the tumor microenvironment. By examining gene activity within ccRCC cells, and using computational methods to link gene expression profiles with different types of immune cells, they developed a method to pinpoint the composition and quantities of immune cells within a tumor microenvironment.

“The findings were powerful — just by identifying those cells, we could predict patients’ response to cancer immunotherapy,” Şenbabaoğlu adds.

In 2016, he transitioned to Genentech, where he continued his work in immunotherapy, focusing on so-called immune-desert tumors, which see few, if any, immune cells entering the tumor microenvironment, and thus tend to be resistant to immunotherapy.

At Genentech, he also spearheaded the development of AI tools aimed at identifying the spatial arrangement of cells within tumors to help predict patients’ responses to immunotherapy. Along the way, he mastered a key lesson in computational biology: adapting to the unexpected. When he first started investigating immune-desert tumors, his work centered on identifying shortcomings within immune cells. However, upon analyzing and modeling the data, a different picture emerged.

It became apparent that the immune cells weren’t themselves deficient. Instead, it was the cancer cells that were orchestrating a hostile environment and impeding the function of these immune cells. This shift in perspective redirected the scientific inquiry away from immune cells and towards tumor cells, with Şenbabaoğlu’s colleagues now looking to understand the mechanisms tumor cells use to create and sustain this hostile environment.

“Computational biology offers a unique perspective in this regard,” Şenbabaoğlu says. “As you gather data, you often find yourself in unforeseen territories. So, if you cling to your assumptions and disregard what the data may reveal, it becomes incredibly challenging to align with the real world.”

The next wave of innovation

With decades of experience utilizing numerical analyses and machine learning to understand biology, Şenbabaoğlu is now poised to support CZ Biohub San Francisco researchers’ vision: comprehending the roles of cells and cell systems across a wide spectrum, from the intricate mechanisms within cells to health and disease patterns across human populations.

“Biohub researchers are generating unique, large datasets that aren’t being prioritized by industry or academia,” he explains. “This is computational biologists’ heaven — we can apply machine learning to these datasets to conduct experiments virtually, rather than one by one, and drive breakthroughs.”

As he heads the Computational Biology and Data Science Platform of CZ Biohub SF, Şenbabaoğlu is excited for the unexpected twists and turns that lie ahead. Alongside advancing ongoing work like collaborating with the Biohub’s zebrafish groups to gain insights into embryonic development and viral infection, he’s also leading efforts to pioneer AI models that can translate microscopy images of cells into the language of genes and proteins. These digital models would simulate the complex functions and behaviors of real cells, helping researchers predict cellular responses to diseases and accelerating drug development.

For these AI models to be precise, it’s crucial that the scientific community not only create extensive datasets but also adopt an ethos of open sharing of data, both important CZ Biohub SF objectives. Despite the challenges involved — like standardizing datasets for universal accessibility and providing comprehensive metadata describing the data’s origin, nature, and lineage — the advantages of open science are immense, Şenbabaoğlu says.

“This will not only bring the next wave of innovation in healthcare but also help create an environment where computational biologists can thrive and emerge as the next generation of scientific leaders,” he says.