The BCRO Webinar

The Bioinformatics CRO Webinar Series

January 21, 2026: James Opzoomer – Biophysics-Informed Spatial Transcriptomics Approaches to Identify Cytokines Causally Driving Downstream Gene Programs

James Opzoomer is a Senior Scientist in the Innovation Lab at Relation, where he develops single-cell and spatial genomics platforms to accelerate drug discovery. His projects span high-throughput multimodal single-cell sequencing and spatial transcriptomics technology development, generating ML-ready datasets that power novel therapeutic insights.

In this live webinar, he discussed BISTR (biophysics-informed spatial transcriptomics regression) as a computational toolbox for building biologically plausible predictive models from spatial transcriptomics by combining RNA dynamics as a readout of changing gene programs, and paracrine cytokine diffusion as a physically constrained model of cell–cell communication. By linking inferred cytokine secretion, a spatial propagation diffusion model, and receptor-associated changes in mRNA maturation, BISTR aims to suggest cell-type-specific, testable causal relationships between extracellular signals and downstream transcriptional responses.

Transcript of The Bioinformatics CRO Webinar Series – Biophysics-Informed Spatial Transcriptomics Approaches to Identify Cytokines Causally Driving Downstream Gene Programs

Disclaimer: Transcripts may contain errors.

Grant Belgard: Welcome to the next talk in The Bioinformatics CRO webinar miniseries. At The Bioinformatics CRO, we help life science teams turn complex data into clear, decision-ready insights, providing flexible expert bioinformatics support from study design through analysis and reporting. As part of that mission, our webinar series features practitioner-focused talks with concrete takeaways you can put to work right away. Today’s talk is by James Opzoomer. James is a Senior Scientist in the Innovation Lab at Relation, where he develops single-cell and spatial genomics platforms to accelerate drug discovery. His projects span high-throughput multimodal single-cell sequencing and spatial transcriptomics technology development, generating ML-ready datasets that power novel therapeutic insights. Today James will be presenting on biophysics-informed spatial transcriptomics approaches to identify cytokines causally driving downstream gene programs. After the talk, we’ll host a live Q&A session. This is streaming both to YouTube and LinkedIn, and on either platform you can put your questions in the chat or the comments at any point during the talk and we’ll bring them into our discussion afterwards. James, over to you.

James Opzoomer: Thank you and hello. So I’m delighted to be speaking today at this BCRO webinar and I’d like to thank Grant and the BCRO team for inviting me to speak with you today about Relation and some of our spatial transcriptomics work within the Innovation team. So I’m going to start today by giving you an overview of Relation and our approach to data generation, and then I’ll dive into a novel spatial transcriptomics analysis method that we’re developing called BISTR and provide a worked example at the end. So, first I’d like to start with a question. What are some of the main challenges with the current model of drug development? And why is now a uniquely good moment to deploy large-scale patient genomics to solve this problem?

James Opzoomer: So shown here are four major trends that define the future of drug development and healthcare. And on the left we have sort of two negative trends. First, that the cost of drugs is ever increasing. We spend more money on healthcare but we don’t see commensurate increases in life expectancy. And this is also demonstrated by the ratio of healthcare spend to life expectancy on the left. Now the two good trends on the right are that the cost of sequencing is drastically decreasing. You can now do whole genome sequencing for about $100, and the cost of compute, driven by titans like Nvidia, has made it more accessible than ever before. So really the problem that Relation is trying to deal with is the first one, decreasing the cost of drugs. And what we want to ask is: can we use these two trends on the right to solve those on the left?

James Opzoomer: Now this slide really represents a simplified overview of the drug development funnel, which I’m sure you’re all well aware of. On the left we start with maybe 20 programs, 20 ideas for new medicines, and we invest on the order of 1 to 3 billion across this funnel, and after all that work we typically end up with just one marketed drug. So most of the attrition here is because we were wrong about the biology. So although every stage in the funnel is important, the decisions we make right at the beginning in target discovery echo all the way through this funnel to the clinic, where failure is acutely expensive. And so that’s why we believe that target discovery is really the most important problem in drug discovery.

James Opzoomer: So at Relation our ambition is to transform target discovery into an engineering discipline. And now this means building systematic repeatable processes powered by large-scale patient data and ML models.

James Opzoomer: So the funnel that I previously showed you is another representation of this statistic on the top left, that over 90% of drugs that enter clinical trials ultimately fail. So how do we transform R&D so that this number looks very different in the future? Now over the last few years several large analyses have given us an important clue. So on the right there are two examples of these. The first is a recent Nature paper, along with the papers from Matt Nelson’s group, that looked across many clinical programs. They show that when a drug target is supported by human genetic evidence, the probability of success in the clinic is increased compared to targets without that evidence. In other words, genetics gives us causal anchors in human biology. The second work shows that single-cell RNA sequencing of human tissue sharpens that picture. So by knowing which cells in which tissues express a genetically supported target, we can better predict efficacy.

James Opzoomer: So how do these approaches fall into historical data collection strategies? So on the left we have large-N, low-value, high-dimensional observational data. These are things like the Human Cell Atlas and large biobank cohorts. There’s a lot of it, but it’s noisy, confounded and often only weakly connected to clear interventions that we want to make in drug discovery. And on the right we have small-N, high-value but low-dimensional interventional data: mechanistic experiments in model systems, but in small numbers and with low-dimensional readouts, a few readouts and few perturbations. Now what we actually need for AI-driven target discovery is bespoke multimodal perturbation data that links interventions to rich molecular and cellular readouts across diverse biological systems that are related to primary patient material. Now that missing data layer is what enables us to train models that actually learn the consequences of perturbing a target in a specific cell type and tissue.

James Opzoomer: And you know overall we believe that current models and data in the public domain are nowhere near sufficient to deliver meaningful impact in target discovery. So we therefore have to build the right data and the right models applied to where they most make sense.

James Opzoomer: So now that I’ve talked about why we care so much about genetics and single-cell data, I wanted to give a quick overview of how Relation is actually set up to do this in practice. And this slide represents a high-level map of our platform. On the left you see human tissue profiling. This is where we generate deep multimodal data directly from patient samples: whole genome sequencing, single-cell spatial transcriptomics, single-cell transcriptomics and proteomics. Now all of this is connected to the cellular modeling teams, who run perturbation experiments on patient-derived primary cell systems to generate bespoke data for the models, and this connects to translation pharmacology, who take the prioritized drug targets and turn them into drug discovery programs. Now this is all connected to both data science and our three main machine learning platforms: ROSALIND, which identifies genetically validated drug targets, ADA, which focuses on reversibility, and TURING, which provides drug discovery context of our targets. And I’m not going to go into these platforms in detail today because I really want to focus on the spatial genomics data that we generate in human tissue profiling and some of the new analysis methods that we’re developing to better use our spatial transcriptomics data in drug discovery. So as an example of the type of primary patient data that we collect, I just wanted to show a case study of osteomics. This is our flagship observational clinical study focused on osteoporosis and bone disease. So in this study we partner with orthopedic surgeons across London to collect human bone waste from key surgeries. So these are total joint replacements, elective surgeries associated with osteoarthritis, and hemiarthroplasty. So these are non-elective surgeries resulting from osteoporotic fracture, really the end stage of osteoporosis.

James Opzoomer: So from each patient we build a genuinely multimodal data set. So that’s whole genome sequencing to identify variants and genes that causally influence bone density, fracture risk and response to therapy. And this feeds into our genetic discovery platform at ROSALIND. We also generate single-nucleus RNA-seq of bone and joint tissue to map those genetically supported targets into specific bone stromal and immune cell types and states within the tissue, and this sharpens our view of where these targets are expressed within the tissue. We also collect blood-based proteomics to find circulating biomarkers that report on pathway activity and can later be used for patient stratification. And in addition to this, also rich clinical metadata including BMD, or bone mineral density, to anchor everything back to quantitative phenotypes. And now this lets our models learn how genetics and cell state translate into real clinical outcomes.

James Opzoomer: So in addition to the single-cell RNA-seq we generate, we generate spatial transcriptomics data with the Xenium and VisiumHD platforms on human bone and other tissues associated with the other disease programs we’re working on. And this is really important because single-cell data tells us what cell types and states are present within the tissue, but really we lose where they sit in the tissue and how they interact and communicate with other cells within this spatial context.

James Opzoomer: So together these genomics and single-cell data sets give us a dense patient-centric view of disease biology, and in particular we in the Innovation Lab are interested in how we can utilize this spatial transcriptomics data to disentangle the causal microenvironmental signals. So the cell communication pathways that drive cell state and cellular response to the microenvironment. And this has led us to develop a new analysis method called BISTR, or biophysics-informed spatial transcriptomics regression, that I’d like to share with you today.

James Opzoomer: So spatial technologies are key for preserving the in situ cellular context present in tissues, providing a contextual perturbation system of sorts to understand some of the microenvironmental signaling factors that may be driving a particular cell state within a tissue or within a particular disease. So we’re often attempting to model our disease states in less complex in vitro systems like some of the ones shown here: 2D cell models and 3D organoids or organ-on-chip models. And the motivating feature of this BISTR package is to answer some of these questions. Can we identify cytokines responsible for cell identity and behavior in primary patient tissue, and could we then stimulate cell models to mimic some of these disease-relevant or patient-relevant microenvironmental niches? And we hope that this can add value to the drug discovery process and to our efforts in in vitro cellular modeling by using this knowledge to build experimental systems with greater disease relevance in vitro.

James Opzoomer: So a lot of this work is enabled by the advancements in the resolution of spatial genomics technologies, which is really rapidly changing. And we recently published a review in Cell Genomics tracking these technology trends called SC trends. And this kind of summarizes the historical development in spatial omics technologies as well as some of the analysis packages available. And we also comment on these developing spatial technologies in real time, since it’s such a fast-moving field, at our blog sctrends.org. So I encourage you to check it out if you’re able to.

James Opzoomer: So the work that I’m going to show you today is really focused on the 10x Genomics VisiumHD platform, and this is one of these sequencing-based spatial transcriptomics technologies where the increased resolution in this generation of platform, now two micrometers, has really enabled subcellular resolution, allowing us to track several biophysical processes that are shown out here on the right. So RNA abundance, RNA localization and also RNA splicing at the subcellular level. And we can use these two-micron pixels, together with image segmentation tools in the imaging modality, to reassemble approximately single-cell data.

James Opzoomer: So this slide sort of positions BISTR among other spatial modeling approaches. On the left are sort of simple heuristic-based approaches, like using a radius around a specific cell or a k-nearest neighborhood and computing some summary statistics. They’re fast. But the spatial scale is often somewhat arbitrary and the tissue is treated more like a discrete bin than the continuous space that it is. On the right, we’ve got deep-learning-based approaches. Now, these can be powerful, especially when they leverage analysis pipelines from the image space or are paired with single-cell data, but they’re typically more data hungry and sometimes less mechanistically interpretable. So BISTR sits in the biophysical model space in between. So we encode this process of intercellular signaling via ligand diffusion as a diffusion-decay problem with boundary exchange to generate interpretable per-cell exposure features without choosing an ad hoc neighborhood. It runs with more modest compute and also sets up a clean entry point for ML once the inverse problem is well posed.
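
Editor's note: the talk does not state the governing equations, so the following is only one plausible way to write a "diffusion-decay problem with boundary exchange", given here as a reading aid. It assumes a steady state and a Robin-type exchange condition at each cell membrane. Every symbol (D, \lambda, \kappa, s_i, \Omega, \Gamma_i) is an assumption introduced for illustration and is not taken from BISTR.

```latex
\begin{aligned}
D\,\nabla^{2} c(\mathbf{x}) - \lambda\, c(\mathbf{x}) &= 0,
  && \mathbf{x} \in \Omega \quad \text{(extracellular space)} \\
D\,\frac{\partial c}{\partial n} &= \kappa\,\bigl(s_i - c(\mathbf{x})\bigr),
  && \mathbf{x} \in \Gamma_i \quad \text{(membrane of cell } i\text{)}
\end{aligned}
```

Here c is the extracellular ligand concentration, D a diffusion coefficient, \lambda a first-order decay rate, \kappa a membrane exchange coefficient, s_i a per-cell source level derived from ligand expression, and n the outward unit normal of \Omega. Under this sign convention ligand enters the extracellular space where s_i > c (the cell acts as a source) and leaves it where s_i < c (the cell acts as a sink).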

James Opzoomer: So this is sort of a schematic representation of the BISTR computational pipeline. You have your underlying biological system and you generate subcellular spatial transcriptomics data, say 10x VisiumHD data. We then use an image segmentation model, a vision transformer for instance, to identify nuclei and cell boundaries and infer subcellular compartments. You then quantify the transcripts on the nuclear and cellular level, and then we construct the extracellular domain, the space between the cells, as a finite element triangulation mesh, and we model paracrine signaling fields per ligand across this mesh using finite element methods. This allows us to extract the per-cell signaling features, which we identify with receptor gating. So understanding the concentration of the ligand at a cell boundary and whether the cell expresses the cognate receptor to this ligand, and from that we can characterize which ligands predict certain gene expression via a GLM-based model.

James Opzoomer: So this is another schematic that represents the data flow within the BISTR Python package. You have your VisiumHD data. You identify nuclei with a vision transformer and you perform a morphological expansion of cells to create a cell cytoplasm boundary, giving you approximately single-cell data. You then build the FEM triangulation network. You use public databases to look up ligand and receptor genes that are expressed within your cell types of interest, and you solve the FEM network across all of your ligands within the intercellular space. Now this gives you the FE solution at the cell boundary. And we also look at ligand flux, which is the relationship between the expression of a ligand within the cell and the FE solution at the cell boundary, effectively identifying whether a cell is a source of a particular intercellular communication ligand or a sink, just receiving this signal. And then we use a GLM to identify which ligands are most predictive of certain gene expression programs downstream.

James Opzoomer: So now I want to show you a worked example on a publicly available VisiumHD data set. So this is the BISTR package applied to this 10x Genomics colorectal cancer data set. This is a 10x VisiumHD FFPE data set that was published as part of the preprint that was released along with the VisiumHD product launch in 2024. So here you can see a high-level view of the image of the tissue that has been assayed, and zooming in onto a smaller subsection of the tissue, you can see the individual cells. We use a vision transformer model to perform nuclei segmentation and then morphological nuclei expansion. So we follow this expansion to assemble the two-micron spots into approximately single-cell data, which we annotate with its various cell types, giving us a tissue representation of single-cell data that looks like this. Here they’re colored by their cell type annotation.
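
Editor's note: the step of assembling two-micron spots into approximately single-cell data can be pictured as a group-and-sum over bins. The sketch below is a minimal illustration, not BISTR code. The function name, the dictionary layout and the idea that a segmentation step has already assigned each bin a cell ID are all assumptions made for the example.

```python
from collections import defaultdict


def aggregate_bins_to_cells(bin_counts, bin_to_cell):
    """Sum per-bin gene counts into per-cell gene counts.

    bin_counts:  {(row, col): {gene: count}} for each 2 um bin (hypothetical layout)
    bin_to_cell: {(row, col): cell_id} from a segmentation mask; bins that fall
                 outside every expanded nucleus are simply absent
    """
    cell_counts = defaultdict(lambda: defaultdict(int))
    for bin_xy, genes in bin_counts.items():
        cell_id = bin_to_cell.get(bin_xy)
        if cell_id is None:
            continue  # bin lies in extracellular space or background
        for gene, count in genes.items():
            cell_counts[cell_id][gene] += count
    return {cell: dict(genes) for cell, genes in cell_counts.items()}


# Toy input: three bins belong to two cells, one bin is unassigned.
toy_bins = {
    (0, 0): {"VEGFA": 2, "EPCAM": 1},
    (0, 1): {"VEGFA": 1},
    (1, 0): {"KDR": 3},
    (5, 5): {"EPCAM": 4},
}
toy_mask = {(0, 0): "cell_1", (0, 1): "cell_1", (1, 0): "cell_2"}
toy_cells = aggregate_bins_to_cells(toy_bins, toy_mask)
```

In practice the bin-to-cell assignment would come from the segmentation and expansion step described in the talk; only the summation is shown here.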

James Opzoomer: So on the left here you can see we construct the extracellular domain and mesh. So we triangulate the extracellular space between the cells whilst using a tissue mask to limit the extracellular triangulation to the space that’s only underneath tissue. And starting with a ligand expression per cell, we formulate an FEM problem with diffusion and decay parameters plus [] membrane coupling that allows us to solve a sparse linear system per ligand and get the FE solution across the tissue space. And here you can see the cells themselves are colored by the expression of the ligand VEGF-A. And you can see the FE solution in the intercellular space colored in this sort of white-to-red heat, showing that cells expressing high VEGF-A are predicted to secrete VEGF-A into the intercellular tissue space. And we model this diffusion with decay throughout the tissue. And this ultimately gives us an FE solution across each of the communicating cells within the tissue, which we gate, basically binarizing them based on whether they express the receptor to a particular ligand or not. If they do express the receptor to the ligand, then we calculate the FE solution across the cell membrane of each cell and also the flux. So this is the average membrane exchange signal. So effectively this is the proportion of the ligand expression within the cell and at the boundary of the cell from the extracellular space. Is this cell a source or a sink of this intercellular communication signal?
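
Editor's note: to make the "diffusion with decay, solved as a sparse linear system" idea concrete, here is a deliberately simplified sketch. It is not BISTR's implementation. It swaps the 2D triangulated tissue mesh for a 1D line of linear finite elements, replaces membrane coupling with a single point source, and uses zero-flux ends. The function name and all parameter values are assumptions chosen for illustration. The assembled system is tridiagonal, so it is solved with the Thomas algorithm using only the standard library.

```python
def solve_diffusion_decay_1d(n_elements, length, diffusivity, decay,
                             source_node, source_rate):
    """Steady-state -D c'' + k c = source on [0, length] with linear elements."""
    h = length / n_elements
    n_nodes = n_elements + 1
    diag = [0.0] * n_nodes
    off = [0.0] * (n_nodes - 1)  # symmetric matrix: one array for both off-diagonals

    # Element matrix = stiffness (D/h)[[1,-1],[-1,1]] + mass (k*h/6)[[2,1],[1,2]]
    elem_diag = diffusivity / h + decay * h / 3.0
    elem_off = -diffusivity / h + decay * h / 6.0
    for e in range(n_elements):
        diag[e] += elem_diag
        diag[e + 1] += elem_diag
        off[e] += elem_off

    rhs = [0.0] * n_nodes
    rhs[source_node] = source_rate  # point source standing in for a secreting cell

    # Thomas algorithm (safe here: the matrix is symmetric positive definite)
    c_prime = [0.0] * (n_nodes - 1)
    d_prime = [0.0] * n_nodes
    c_prime[0] = off[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n_nodes):
        denom = diag[i] - off[i - 1] * c_prime[i - 1]
        if i < n_nodes - 1:
            c_prime[i] = off[i] / denom
        d_prime[i] = (rhs[i] - off[i - 1] * d_prime[i - 1]) / denom

    conc = [0.0] * n_nodes
    conc[-1] = d_prime[-1]
    for i in range(n_nodes - 2, -1, -1):
        conc[i] = d_prime[i] - c_prime[i] * conc[i + 1]
    return conc


# Toy run: 200 um line, 1 um elements, source at the midpoint.
# The decay length is sqrt(D / k) = 10 um for these made-up values.
field = solve_diffusion_decay_1d(
    n_elements=200, length=200.0, diffusivity=10.0, decay=0.1,
    source_node=100, source_rate=1.0,
)
```

The resulting field peaks at the source and falls off over roughly the decay length, which is the qualitative behaviour described for the ligand heat map. A 2D version would assemble the same stiffness and mass terms over triangles and hand the system to a sparse solver.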

James Opzoomer: So in order to understand what ligands might be affecting certain cell types, we found that the coefficient of variation, and in a related sense looking at the mean ligand flux versus the standard deviation of the ligand flux, is informative to understand the most variable intercellular communication ligands across a tissue and cell type. So in this particular example we’re looking at VEGF-A here in tumor cells, which has a relatively high mean flux across this tissue section.
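
Editor's note: the mean, standard deviation and coefficient of variation summary mentioned above can be computed in a few lines. This is a minimal sketch with made-up numbers. The ligand names, the flux values and the choice of the sample standard deviation are assumptions for illustration, and dividing by the absolute mean is one way to handle ligands whose mean flux is negative.

```python
import statistics


def flux_variability(flux_by_ligand):
    """Per-ligand mean, sample SD and coefficient of variation of per-cell flux."""
    summary = {}
    for ligand, values in flux_by_ligand.items():
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)
        # CV is undefined at zero mean; report infinity rather than dividing by zero
        cv = sd / abs(mean) if mean != 0 else float("inf")
        summary[ligand] = {"mean": mean, "sd": sd, "cv": cv}
    return summary


def rank_by_cv(summary):
    """Ligands ordered from most to least variable."""
    return sorted(summary, key=lambda ligand: summary[ligand]["cv"], reverse=True)


# Hypothetical per-cell flux values within one cell type.
toy_flux = {
    "LIGAND_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "LIGAND_B": [3.0, 3.0, 3.0, 3.0],
}
toy_summary = flux_variability(toy_flux)
toy_ranking = rank_by_cv(toy_summary)
```

Plotting mean against SD per ligand, as described in the talk, uses the same two numbers.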

James Opzoomer: So we then use a negative binomial GLM fit to the per-cell gene counts, which has predictors such as receptor-gated ligand exposure. So the coefficients of this model quantify how exposure shifts expected expression, and we can see which exposures to which ligands are related to specific genes and then gene programs. Here we can see that our model captures the directionality of many genes known to be associated with VEGF-A exposure in tumor cells. And this indicates that we’re capturing known biological processes associated with this ligand-receptor interaction in this tissue.
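
Editor's note: the sketch below illustrates what a negative binomial GLM with a receptor-gated exposure predictor could look like in the simplest case. It is not the BISTR model. It assumes a log link, a single predictor plus intercept, a fixed dispersion `alpha`, and a fit by iteratively reweighted least squares. The function names and the toy data are hypothetical, and the toy responses are noise-free expected values rather than real counts so that the fit can be checked against known coefficients.

```python
import math


def receptor_gated_exposure(boundary_conc, receptor_counts):
    """Keep the boundary ligand level only for cells expressing the receptor."""
    return [c if r > 0 else 0.0 for c, r in zip(boundary_conc, receptor_counts)]


def fit_nb_glm(x, y, alpha=0.5, max_iter=100, tol=1e-10):
    """Fit log(mu) = b0 + b1 * x under Var(y) = mu + alpha * mu**2 using IRLS."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(max_iter):
        sw = swx = swxx = swz = swxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            mu = math.exp(eta)
            w = mu / (1.0 + alpha * mu)  # IRLS weight for the NB family, log link
            z = eta + (yi - mu) / mu     # working response
            sw += w
            swx += w * xi
            swxx += w * xi * xi
            swz += w * z
            swxz += w * xi * z
        det = sw * swxx - swx * swx      # 2x2 weighted normal equations
        new_b1 = (sw * swxz - swx * swz) / det
        new_b0 = (swz - new_b1 * swx) / sw
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1


# Toy data: ligand level at each cell boundary and receptor counts per cell.
toy_conc = [0.5, 1.0, 1.5, 2.0, 0.8, 1.2, 0.3, 1.7]
toy_receptor = [1, 2, 1, 4, 0, 1, 0, 3]
toy_exposure = receptor_gated_exposure(toy_conc, toy_receptor)
toy_expected = [math.exp(0.5 + 0.8 * e) for e in toy_exposure]
intercept, slope = fit_nb_glm(toy_exposure, toy_expected)
```

A positive slope means higher gated exposure goes with higher expected expression of the response gene, which is the sense in which the coefficients "quantify how exposure shifts expected expression". A real analysis would estimate the dispersion, include covariates such as cell size or library depth, and use an established statistics library.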

James Opzoomer: So in closing remarks, I think we often find that sequencing-based spatial transcriptomics technologies have a lower UMI coverage that’s somewhat sparser than single-cell RNA-seq. This has motivated us to develop novel tools to understand the relationship between intercellular ligand-receptor signaling and downstream gene expression. So this tool that we developed, BISTR, converts spatial transcriptomic counts, in coordination with segmentation, into physically constrained extracellular ligand fields and then into per-cell exposure for downstream modeling of the effect of ligand exposure on gene expression. And we believe that modeling ligand-receptor interactions like this with biophysical constraints gives more interpretability into the intercellular signaling process. And we’ve designed this BISTR method as a flexible toolbox that is deployed as a Python package, which we hope to make publicly available sometime soon. The goal of this approach really is to generate more tissue-contextual, experimentally testable hypotheses, especially where simple in vitro systems miss microenvironmental signaling contexts, so we can better understand the intercellular signaling processes that drive cell states in patient tissue and in particular to better understand disease. So we will be publishing this approach, hopefully as a preprint, soon, and so I encourage you to keep your eyes out for it at that time. So yeah, thank you for listening today.

Grant Belgard: James, thank you very much. So does the BISTR package work with spatial transcriptomics technologies other than VisiumHD?

James Opzoomer: Yeah. So the core methods are designed with a spatial transcriptomics method-agnostic approach. I really hope that I highlighted that what you need is subcellular-resolution spatial transcriptomics data, and from there you can reassemble approximately single-cell data, and whatever compartment you can segment with your image layer, into that form of data. So VisiumHD is great for that, but we’re excited to get our hands on, hopefully, the new Illumina spatial transcriptomics technology that’s coming out, which appears to be at around one-micron resolution. But yeah, it should work across different spatial transcriptomics technologies, although we have only tested it with VisiumHD, but we hope to expand that outwards soon. Thanks.

Grant Belgard: Now what makes the BISTR package biophysics-informed rather than just a spatial regression?

James Opzoomer: So that’s a good question. So the BISTR approach explicitly models paracrine signaling as a spatial field within the extracellular space using this diffusion-with-decay FEM approach, solved over the finite element mesh that we build from the native tissue geometry, from the imaging data and the spatial transcriptomics data itself. So we believe this builds a more representative intercellular communication space than just representing cells as nodes on a graph without understanding the distance but also some of the tissue-specific features that might exist within it. For instance, a future direction that we hope to go in is to use image segmentation within tissues to create different tissue zones, which you can identify from H&E and other types of immunofluorescent staining, where ligands might have difficulty passing through. And thinking in particular, we work a lot on bone, as I touched on, but using that to create more representative data.

Grant Belgard: Great. Well, thank you, James, and thanks to everyone for joining us. Join us for our next webinar on February 18th at 11:00 a.m. Eastern. Phil Ewels from Seqera will discuss reproducible bioinformatics at scale, nf-core, and Nextflow. Thanks.

James Opzoomer: Thank you.