The Bioinformatics CRO Podcast

Episode 80 with Diane Shao

Dr. Diane Shao, an attending neurologist at Boston Children’s Hospital and instructor of neurology at Harvard Medical School, discusses her work as a physician scientist focusing on genetic causes of childhood neurodevelopmental conditions.

On The Bioinformatics CRO Podcast, we sit down with scientists to discuss interesting topics across biomedical research and to explore what made them who they are today.

You can listen on Spotify, Apple Podcasts, Amazon, YouTube, Pandora, and wherever you get your podcasts.

Diane Shao

Dr. Diane Shao is an attending neurologist at Boston Children’s Hospital, an instructor of neurology at Harvard Medical School, and an investor with Legacy Venture Capital.

Transcript of Episode 80: Diane Shao

Disclaimer: Transcripts are automated and may contain errors.

Grant Belgard: Welcome to The Bioinformatics CRO Podcast. I’m Grant Belgard, and today I’m joined by Dr. Diane Shao, an attending neurologist at Boston Children’s Hospital and instructor of neurology at Harvard Medical School. Dr. Shao is a physician scientist whose work focuses on understanding the genetic causes of childhood neurodevelopmental conditions, including how newer single-cell approaches can help answer questions we couldn’t address before. She’s also an investor. Diane, welcome to the show.

Diane Shao: Thank you so much, Grant. And it’s such an honor to be here and also reconnect with you after our long-time friendship from college.

Grant Belgard: Indeed. So what’s been most energizing for you lately in your work?

Diane Shao: Well, something I really like to think about in my work is how to span across disciplines. So, you know, based on your introduction, I think the listeners can understand I do some very fundamental research in genomics. I also really think about how that research applies to patient translation. I actually see the patients and have to involve sometimes not yet solid data in terms of making a then firm clinical decision.

And then as an investor, thinking about how to assess that landscape. And so all of these, I would say, require a different vision and goal in mind. And so I think a lot about for any given application, what is my vision of translation or patient care or understanding fundamentals, et cetera, and how to generate the data, work with the data, apply that to really further that vision, kind of like big picture goals.

Grant Belgard: What’s a question you’re hearing more often now than you were a few years ago?

Diane Shao: The question I’m hearing more, you know, more and more people want to translate, from the time a PhD student starts working in the lab to post-docs graduating thinking, or post-docs trying to think about what’s next in their careers, wondering about industry versus academia. There is such a strong focus on translation, making that impact on humans versus I think, 10 years ago, as I was still going through training, it was more, you were doing a PhD, a lot more students were thinking about an academic path, fundamental biology.

And, you know, I don’t know if this shift is good or bad, but it certainly brings new questions to the table. And then, you know, in the way far past, academia had this idea that, you know, going to industry is, you know, maybe a sellout, you know, you’re not asking as interesting questions, but I think there is a growing realization that those questions are also extremely interesting, extremely impactful and need really, really smart people to be involved.

Grant Belgard: What does a good week look like for you? What kinds of activities make you feel like you made progress?

Diane Shao: Yeah, I do think stepping back and seeing where have things gone from, maybe you get some data that is really uncertain and murky. And then this week it’s, hey, we can draw one small conclusion from that. Or it’s thinking about, hey, this data, which is applied to fundamental biology, may be able to be reanalyzed in some small way to give us an insight on an actionable clinical impact. Or thinking about like, could this have implications for which companies we think could be really strong in their market spaces? And so even one small 1% insight, I think is a good success because it’s building on that 1% in all different directions that ultimately leads to where we’re going.

Grant Belgard: So since you have this kind of dual role of physician scientist, how do you think about differentiating between this is something interesting and this is something actionable, right? Because for your patients in the real world, you have to make decisions now, you can’t wait five, 10 years for something to maybe be firmed up. So how do you approach that?

Diane Shao: Yeah, those are really, I think, questions that the field of let’s say genetics is always constantly grappling with. So I’ll just give you an example from my clinic from this week just to make it a little more pertinent.

So this week, I saw a case in my neurogenetics clinic at Boston Children’s Hospital, a two-year-old who has a condition called lissencephaly, which is when the brain is smooth, and they had clinical sequencing that came back with a rare variant that’s homozygous, meaning it’s in both the mother-inherited and father-inherited alleles of a particular gene, and the lab records the variant as a variant of uncertain significance, which means on a clinical lab basis, they cannot provide a diagnosis. And if you search the literature for this gene, there are exactly four patients reported worldwide that have other variants, not this exact variant in this new gene, but their features are all extremely similar to the given patient.

And so there is a practical matter of how many people need to exist in the world for you to have confidence that the fifth variant in the maternally and paternally inherited alleles of a gene is now diagnostic. And so that’s a clinical lab question. And on their end, they said, hey, we can’t call this disease causing, we have to give it this variant of uncertain significance label. And then there’s a whole other decision to be made on a clinical level. So when the patient comes to see me and I’m like, okay, the lab is not gonna be able to change your classification, but I can tell you your brain looks almost exactly like the four other brains that are out there. Your kid is manifesting all the symptoms that those other four are. We have a very good sense that we should be worried about the things that those other patients have.

So if they have, for example, those other patients had problems with their eyes and so I’m saying, hey, we gotta check their eyes, we gotta check their hearing. These are the things I can worry about now. And those are really practical clinical decision-making matters. And then there’s a whole interesting aspect of, well, what do we do in this gray zone? And those are kind of boundary pushing now research questions. So I then spoke with one of the residents who was very excited about this potentially new gene and this new presentation that we’re seeing. And so they’re saying, hey, can we write this up? I’m like, yeah, it would be great.

And it would be great if we found 10 other people so that we now have the statistical informatic confidence to provide this diagnosis. We can then go back to the clinical lab to change their classification, and they would therefore change the classification of other patients that come in with a similar presentation and new variants in that gene, which otherwise wouldn’t be diagnostic. And so we can then go back to the research realm and really make a difference. And so I don’t know if my description makes clear the different elements: clinical decision-making, gray zones in what is known in a diagnostic laboratory, and then what can be brought back into the research side.

Grant Belgard: Oh, that’s great. What kinds of uncertainty are you most comfortable with and which kinds do you work hardest to reduce?

Diane Shao: Yeah, so we can talk about a couple of different settings here. So maybe in the clinical setting, continuing on with this example, the kinds of decisions that we can make are interventional. Does this child need therapy? That’s a pretty certain yes. And I can probably give a good sense of how much therapy they need. Do I need certain screening in certain organ systems based on what I know? The answer is yes. And the risk that I’m going to be wrong, or even if they don’t totally need it, that screening will be negative. Okay, those are tolerable risks.

But other risks are not so tolerable. For example, if I am wrong about the variant interpretation and the family is doing prenatal genetic testing for embryo selection for their next child, at that level, I may stop short of a confident assessment that this is absolutely the disease-causing gene until I have more of the research statistical evidence that I’m going to gather with my resident, let’s say. And so those are various arenas where I may or may not be able to make a solid decision. And then stepping back into the research space, the confidence in a research diagnosis is a little more clear.

Because on a research basis, you don’t need to have an assessment of any patient with any variant in that gene that comes in. You just need to have a sense of, is that particular variant causing a functional change? And on a research basis, there’s a lot of other modalities that can give us confidence. You can do, look at the RNA changes. You can see how that variant affects gene function. Structurally, you can do other types of statistical testing if you enroll patient cohorts within your cohort to do, for example, linkage analysis or other types of confidence-building metrics. And so in different settings, there are different ways to really increase confidence in different types of interpretations.

Grant Belgard: How do you think about measuring success when outcomes can take years to show up?

Diane Shao: Yeah, yeah, that’s a great question. You’re kind of thinking about as we, let’s say, push forward our research agenda on a given genetic condition, what the success is.

Grant Belgard: Well, or I guess, or in the case of investment, right?

Diane Shao: Yeah, okay, yeah, no, I think that’s a great question. So why don’t we jump to the investment for just a moment? So right now, for example, not all rare disease genes are good even current targets for investment. To even embark on starting a company, there are only certain, you’ll see this mentality where people will invest in only a handful, let’s say, of rare diseases that are broadly of interest. And partly it’s because those are the diseases that we know the most about.

There is a lot more research dollars, there’s a lot more research interest. Maybe the patient advocacy groups have been really promoting or focused on getting a therapeutic out and there’s enough support interests that finally there’s enough data and understanding so that initial startup can even be conceptualized for investors to be interested. And so not every, let’s say, genetic condition in this moment in time is ready for research translation.

And so pushing that long-term envelope from the fundamental discovery of a gene to when is it ready to be even considered a therapeutic target to actually pushing out the company to then now assessing that market landscape and seeing whether or not it’s worth funding, et cetera, is a really, really long pipeline as you’re suggesting. And so in any given moment, there are many different people from investigators pushing their visions and agendas to the NIH pushing their research vision agenda to the business development people pushing their agendas and the investors pushing their agendas.

And I kind of really see that the progress for each individual needs to be unique. As an investor, I am really interested in pushing the investments that we make into rare diseases more broadly, but that doesn’t mean every rare disease that’s presented to me with a potential therapeutic target is a good investment to make. And so progress on an investment front means having grasped, let’s say, further the landscape of a particular genetic condition, grasped maybe the market space, what are the FDA regulations?

Those things are progress in an investment space, versus in a research setting, I may be a little more agnostic to which disease I’m looking at and promoting. And that research space may be promoting, let’s say, like progressing new techniques for gene discovery. It may be figuring out how can I collaborate better with others. And so for people in any given phase of all of these different intersecting sectors, I think progress at the end of the day is very, very individual. And I hope that collectively across everyone, this will really push the boundaries of treatment for any disorders, you know, rare or common.

Grant Belgard: So across all the domains in which you operate, what is your expectation for the impacts AI will have in the near term, right? Looking out over the next one to two years?

Diane Shao: Yeah, after this conversation, I’d love to hear your thoughts on that too, Grant. But for me, I feel like AI has touched every aspect of both what I do and also how I assess both the research spaces I want to go into as well as investment spaces I am considering. At a high level right now, I would describe my usage of AI as increases in efficiency. So increases in data sourcing, let’s say to help me find relevant papers and subject matter and people and spaces, et cetera. I also feel it as efficiency in terms of helping me integrate across different, let’s say, perspectives.

Right now I have all this data that describes this biology. Now I want to understand how to change this into a clinical risk assessment model, et cetera. These are kind of, I would still consider efficiency spaces. That being said, I know that the AI field really wants to do new discovery, pushing the envelope, idea creation from AI. I don’t feel that it’s there right now and I’m not really engaged in tool development to know how close are we to that. I think that pushing efficiency and data interpretation, management, et cetera, is already a really, really large task.

It takes off so much from my plate to be able to outsource a lot of those tasks to AI for it to also hold that information for me across, let’s say, these are the grants I’m writing and this is all the data that I need you to store for me. And as I re-synthesize into a new grant with a slightly different focus, how can you help me shape that? And then so it lessens my work a lot. And I’ve found it tremendously beneficial. And to me, that’s really important because it means I can leave my mind space, let’s say, open to big vision problems. I can be the one leading idea generation and then using AI to kind of curate these spaces. So I would say that’s my perspective. I think AI is transformative, but I don’t feel like AI is transformative in the way of taking the place of human creativity and pushing the boundaries of, let’s say, the unknown unknowns in the world.

Grant Belgard: When you’re designing a study, what decisions early on have you found most impact downstream data quality and interpretability?

Diane Shao: Yeah, so the types of study that I design most are in the realm of human genetics. So I do some human gene discovery research for which I would say the pipelines for that are probably pretty well described. And then I also do single cell technology development for the purpose of understanding how mutations arise and in particular, understanding the variation in the DNA within or between individual cells of an individual, what we call somatic mosaicism.

I’m part of a pretty large consortium from the NIH called the NIH Somatic Mosaicism Across Human Tissues Network, where analogous to other large consortium efforts, one of the most notable ones being the Human Genome Sequencing Project back in the 2000s, the idea is that by characterizing the full intra-individual variability in genetics, that tool can be extremely useful across many, many different areas of biology and life sciences. And so for single cell technology development, that experimental design really affects the downstream.

So for example, I have been working on understanding human brain development and the single cell copy number landscape. Copy number changes are structural changes in the DNA where whole areas of regions of chromosomes either get amplified or deleted or lost. And so to detect structural copy number changes uses fundamentally different techniques than detecting other types of variations, such as single nucleotide variation, where you’re just changing, let’s say C to a G or a single nucleotide, and also uses totally different techniques than identifying, let’s say, repeat expansions in single cells, which are also highly mosaic across an individual.

And so the study design choice of tool becomes really critical to say, is it even possible to analyze my genetic change of interest? And that decision comes down to a matter of, what is the goal of the project? And also has some practical considerations of cost, and also has some technical considerations of do I have the informatic support to analyze the type of variation I’m interested in?

Grant Belgard: How do you communicate uncertainty to different audiences, scientists, clinicians, families, leadership?

Diane Shao: That’s a great question. Depending on the audience, I try to do things differently. Most people do a lot better with the things that are certain than the things that are uncertain. I think that for me, always trying to portray first what we do clearly know can be really, really helpful to then give a framework to all the things that we still don’t know or are still exploring.

So just to give a concrete example of that, in my work in mosaicism, I really think that there will be totally new possibilities for genomic biomarkers or different possibilities for precision medicine that are related to the genetic landscape when we look across all the cells in the body. But of course, we’re still in early days of that on a research basis, so I don’t know if that’s true. But what I do know is true is that we have, for example, in our brain, in our neurons, hundreds of single nucleotide variations per neuron, times six billion neurons in our brain by the time we are born.

So that is a biological fact. And so I can hang on that certainty and share with people that certainty and then describe what I think we can do with that level of genomic data. And just so the audience kind of understands where I’m going a little bit with this thought, think about, for example, the difference between when the Human Genome Project first came to light and they sequenced one human versus what we can do with the genetics now that we’re sequencing hundreds of thousands of humans across different countries and across different disease modalities, et cetera. That type of data, while we don’t know yet what will be revealed about ourselves and tissues and how they all work together from a DNA perspective, I think is also inevitable to shift how we think about disease and how we think about diagnostic possibilities.

Grant Belgard: What is the current state of the evidence on the impact of mosaicism on clinically relevant phenotypes and the prevalence?

Diane Shao: Yeah, that’s a great question, Grant. So in certain diseases, it is actually fairly commonplace now to think about mosaicism. There are a number of disorders where it’s pretty common to now look for mosaic genetic causes. So for example, epilepsy, there is a subset of patients with epilepsy who will get surgical removal of the epileptic lesion and often somatic mutations are found in those lesions. They follow particular biological pathway principles and so those are pretty clear.

Another realm which is pretty common to think about now is vascular disorders. So localized cavernous malformations, there’s a pretty common precedent. Vascular disorders like Sturge-Weber syndrome, which is a capillary malformation over just one part of the body, now are pretty commonplace. So there are certain disorders where it is common to now think about somatic mutations as the primary cause. There are other disorders where it is coming to light that, even while they can have causes both in the germline and at a mosaic level, many of those individuals actually are mosaics.

And just because think about generating a human, how many cell divisions that you went through to generate this entire person from the time they were an egg and a sperm meeting each other to the huge five to seven foot human being, there’s just a lot of mosaicism to be had and that causes disease and sometimes they look like germline presentations even if the person themselves are mixed genetically. And then there’s a whole realm of things that we don’t know which is maybe a subject of research but things like there are diseases where certain cell types are lost.

So for example, in Hirschsprung’s disease, a very particular neuronal cell is lost from the gut intestine. And so to me, that’s a high likelihood place that there is likely a somatic localized cell-specific component but when it’s lost, how do we use genetics to actually determine what it was that was lost to begin with? So a lot of questions but I hope that answers your question on the areas that we do currently know which is I would say a tip of the iceberg.

Grant Belgard: So how do you think about future development of precision medicine and so on in a mosaic condition?

Diane Shao: Yeah, so I’m really excited about a couple different areas. One area is simply leveraging the power of essentially what I would describe as let’s say a saturating mutagenesis experiment within an individual. So thinking about what we’ve learned from human populations. So when we sequence hundreds of thousands of people from human populations, we can see, hey, these genes never have a mutation and the other genes have mutations that are just scattered all across the genome.

And those genes that never have a mutation, they’re actually important to humans in some way. The reason why we never have a mutation usually is because either they were embryonic lethal or they affected reproductive fitness in some way. And so that’s actually a huge part currently of gene discovery, to compare to population databases and say, hey, those areas are constrained, this may be an important disease gene. And so similarly, you can imagine that there are lots and lots of disorders which don’t have a strong reproductive fitness component.

Think about cancer, for example, in old age, it’s not necessarily gonna be selected against at the population level. Think about eye conditions like strabismus where you’re not really gonna have a strong reproductive fitness signal or autism even, nowadays many people are getting diagnosed when they’ve already lived full lives. And so while some forms of autism will have reproductive fitness constraints, others will not.

And so then the question to me starts to be, well, if we can now get information on genomic constraints, so which areas of the genome are really, really important just in particular cell types, like in neurons or in the lung cells or in something like that, is that now new information on what genes are really critical for biology and does that tell us something about disease? So that’s one area I’m really excited about.

You can also think about that the same way in terms of modulation, how do individual genetics within a cell either drive a phenotype or are still selected against that phenotype. So for example, let’s say a person with a neurodegenerative disorder where some of their neurons will die with age. Well, it’s not that these neurons die uniformly, some will die earlier, others will die later and there’s genetic variation between that.

So can we leverage that to somehow understand what is it genetically about those individual cells that are surviving longer? And I think in the past, the view is just, oh, it’s stochastic. Some are just gonna die sooner, some are gonna die later. And yes, probably it is stochastic, but stochastic doesn’t necessarily mean random. Stochastic is a distribution that is related to some underlying biology. And so these are open questions as to the genetics that drives these stochastic processes. And so these are some of the areas I’m interested in and I think they have really strong translational potential and also the therapeutic potential. Yeah.

Grant Belgard: When did you first realize you wanted a career at the intersection of medicine and research?

Diane Shao: This is a great question, Grant. I think it actually goes back to our college days. When we were at Rice University, I started working for a PI at Rice who’s now left that university, but he was my first significant research experience and I realized he was kind of a remarkable person in that he was a trained astrophysicist who then became an HHMI investigator, which is a very prestigious award funded investigator who studied slime molds, Dictyostelium.

And then when I was in the lab, he was going into human immunology and had created a compound to treat fibrosis, which is what I was working on in the laboratory, as, like, he had one postdoc and, I guess, me the undergraduate working on this at the time. And then he turned that into a company that was sold for $1.4 billion, ultimately for trials in fibrosis. And so that mindset, that the fundamentals of science can be leveraged across astrophysics to slime molds, to human immunology, to translational medicine, I think already came to me maybe through this experience by osmosis maybe as an undergraduate in this space.

And I think that mindset really resonates with me as in at the core, science is science and those principles apply no matter what realms you’re looking into. And also as scientists or people engaging with life sciences in any way that many people do, you also don’t have to be limited to the one dimension that you’re trained in, that all of these realms are possible and so, for me, that is what also made me think doing an MD/PhD career path would be one for me because it was one where I got to see both the research perspective, the translational perspective, the clinical perspective and then in my early 20s, in my past couple years have added this investment and market space perspective as well.

And while some people feel that they’re really disparate and indeed they’re really tackling very different problems at the core, if you go to core principles, there’s a commonality.

Grant Belgard: What’s something you intentionally didn’t do or stopped doing that made your path a little easier?

Diane Shao: Oh, that’s interesting. Yeah, it does sound like I’m just accruing things, but to be honest, I drop things constantly. I actually think that’s a critical part to maybe going back to your question on what drives progress. Progress does mean constantly cutting out everything that is not leading to your vision of progress.

So even in, let’s say my work on single cell technology, I was developing some technology, applying it to a number of different settings and when I found one that seemed like it’s particularly interesting in terms of its understanding of the biology and that we could really gain traction with the tool and stuff, it meant I just dropped everything else and I don’t have any intention of picking them back up unless they further my vision in a given direction. And so I think that there is always this fear, like the sunk cost fear of like, oh, I’ve invested all this time, I gotta finish it, it’s gotta be a thing, but I don’t buy into that at all.

So I actually need to constantly drop things along the way and to me, that’s a huge driver of success because it means we focus our energy on the things that go toward a vision.

Grant Belgard: If you could go back and give your earlier self one piece of advice, what would it be?

Diane Shao: Oh, probably don’t stress so much. You know, I think that especially as a trainee or a student, it was so easy to worry about how things would unfold and try to control them, but honestly, simply because we didn’t know, for example, writing a first paper, you don’t actually know what it takes to even write a paper or, you know, what are all the steps, what are all the pitfalls, what is everything you would even need to think about?

And so I think I spent a lot of time stressing and let’s say strategizing and things like that, but the reality is you just gotta do it and then you’ll learn from it, as in nothing needs to be perfect that first time. And to allow for that, allow for the learning process, you’re gonna get more out of that than trying to make it go a particular way each time.

Grant Belgard: I guess on that note, how do you avoid burnout?

Diane Shao: Drop everything else that you don’t feel like doing. But in some ways, I really believe that burnout is a combination of what we’re holding, all the different aspects that we’re holding, and also how we feel about it. As in, if we’re aligned, like if I’m holding a lot of things that I’m doing and those are the things that get me up in the morning, I’m so excited about them, I can’t wait to discuss them with people and share them with the world, that’s not burnout.

That’s just me holding a lot of things that I like to do. But burnout is having a particular interest, but also feeling like I’m obligated to do all these things I don’t wanna do, I’m supposed to be finishing XYZ thing in this other realm that I’ve sunk all this time and effort in. And so to me, preventing burnout is pretty continual, like every few weeks renewal of what is my actual vision, what is actually driving me, and am I doing the things aligned with that? Because if it’s not, and you’re doing that and having that conflict internally long-term, that’s what burnout is. So yeah, so if you are aligned, then that will feel good. Everything will feel like fun and flow.

Grant Belgard: What’s an effective way to build competence across disciplines?

Diane Shao: Competence or confidence?

Grant Belgard: Competence.

Diane Shao: Competence. Oh, that’s a great question, Grant. The biggest thing is to not be afraid and to not be afraid to not know. There’s no reason you would know. And I find that what people really orient around is a strong vision. So for example, maybe with my own interest in mosaicism and thinking about how that can push our boundaries in precision medicine, I work in child neurology, I work in pediatrics, brain development, et cetera. I’m very interested in maternal influences on childhood brain development, but that’s not a space I know at all.

I don’t know a single OB, I don’t know nearly anything about obstetrics or what happened, all the actual biological principles of pregnancy, et cetera. And so, as I delve into that space, that is a totally new space for me. But what I do orient around is how important I think understanding this phenomenon is. And if I can share with people my vision, what I know in a very clear way, others are going to wanna help me and that will build my competence. As in, I don’t go in pretending I know anything about these other spaces where I don’t.

And that’s actually where true collaboration lives. It’s not, we both know everything about the other’s field, it’s knowing exactly what do I know that’s valuable between us and exactly what do you know that’s valuable between us and then putting those together. And competence is not always getting to know everything in a different space. Competence is sometimes being able to know where the gaps are and know how to ask questions and get help.

Grant Belgard: What mistakes do you see smart people make when they try to do interdisciplinary work?

Diane Shao: Oh, that’s a really good question, Grant. You’re full of good questions. So one thing I do think is really important to recognize is that there’s always a difference in culture, no matter what. Research culture, medical culture, even as I’m talking about neurology research versus obstetrical research, there’s a difference in culture. And if you are not recognizing that and respecting those cultures, it’s just not going to work out. So for example, in the biological space, samples are really critical. I work with post-mortem tissues, those are really important. And PhD scientists are also really interested in studying human tissues.

But why do PhD scientists have a lot of trouble integrating with MDs? It’s because they kind of speak different languages, right? The way they’re talking about the samples is different. The PhDs are talking about the samples as a biological utility. The MDs are talking about them like the boy they took care of for 10 years who then passed away for some reason. And so to understand that culture is critical. If you go to the MD and say, hey, I’m looking for samples for X, they might say like, oh, okay, I have some. And then you’re going to say something like, okay, well, I want to study proteins A and B and how they interact and blah, blah, blah.

The MD is not going to connect with that, right? So thinking about, well, protein A could be a therapeutic if it interacts with protein B in this way, is a much more viable start. And then also, it’s easy to start thinking about, okay, the doctor is just the one who’s going to be retrieving the sample, et cetera. And the minute you start reducing some other person’s role to just a task-oriented sample retrieval role, you’ve totally lost the collaborative interdisciplinary engagement there. And so I think about these things a lot and I encounter them constantly.

For example, even in my example of, what do I do as a neurologist who wants to think about obstetrical tissue? Well, when I started, I’m used to paying $0 for my tissue because I get them from biobanks. I get them from patient groups that are really trying to get people to utilize the tissue for studies, et cetera. But obstetrical tissues are different. They pay a lot of people, healthy pregnant women, money to collect samples, to be part of studies, et cetera. And so even engaging on costs, what is value?

I was running the risk of devaluing all of their tissues simply because I’m used to paying $0 for my tissues. And so these are all cultural nuances between disciplines, the same way going to a different country, you really have to consider those cultural nuances. Understanding them is non-trivial. I do rely on saying things like, hey, I don’t know what the typical way things are done in your field is, this is what I’m used to. And having that humility upfront allows people to also share with you their culture and being open to that, whatever that culture is and not just judging it as unreasonable or too hard just because that’s not the culture you’re used to.

Grant Belgard: What’s a good habit you find most strongly compounds over time?

Diane Shao: Oh, good habits. I find that, I think this may be going to your burnout question, finding the things that are going to make you feel passionate and excited every day. And sometimes they’re not always scientific questions. Like for example, I find a good habit that I have is taking a break at 2:30 PM every day. Either that break could be taking my 2:30 meeting and asking the person if they’d rather take a walk around and have a discussion instead of sitting at a Zoom screen, or that break could be meditating for 10 minutes by myself in a quiet space.

And so I guess I mean that as in, not to say that everyone needs to take a break at 2:30, but just if that is something that you need and will make you feel good about your day, that’s something you need to do for yourself. Similarly, if there’s a particular question you need to answer to feel excited, engaged in science, you just need to go down that route regardless of if it’s exactly the right time or if you have 10 other things you need to finish first or whatever it is, because it’s doing those things for you that is really gonna make everything worthwhile.

Grant Belgard: And where can our listeners follow your various threads of work?

Diane Shao: Oh, that’s a wonderful question. I am in the middle of building my own lab website, but for now you can find me through the Boston Children’s Hospital. I have a research page there. I’ll provide the link for your notes. And then also my venture capital firm is at LegacyVentureCapital.com.

Grant Belgard: Well, Diane, thank you so much for joining us. It’s been lovely.

Diane Shao: Thank you so much, Grant, so lovely to be here.

The Bioinformatics CRO Podcast

Episode 79 with Yang Li

Yang Li, an Associate Professor at the University of Chicago, discusses applying computational genomics to the intersection of genetics, gene regulation, and disease, as well as the impact of new AI tools.

Yang Li

Yang Li is an Associate Professor at the University of Chicago, where his lab investigates the genetics and genomics of RNA splicing.

Transcript of Episode 79: Yang Li

Disclaimer: Transcripts are automated and may contain errors.

Intro: We are conducting our first listener survey. If you enjoy the podcast, please follow the link in the description to a 60-second multiple choice survey. This helps us understand what kind of guests you’re most interested in and keep the podcast sustainable. The survey is anonymous, but you can choose to provide your email to receive a summary of the aggregate results after the survey period is over. Go take the survey at bioinformaticscro.com/survey.

Grant Belgard: Welcome back to the Bioinformatics CRO podcast. I’m your host, Grant Belgard. Today, we’re joined by Professor Yang Li from the University of Chicago, a computational genomics researcher working at the intersection of genetics, gene regulation and disease. Yang, welcome.

Yang Li: Hi, Grant. Nice to see you.

Grant Belgard: Good to see you again. So what’s been energizing you most recently in your work, scientifically or operationally?

Yang Li: Well, since New Year’s, I’ve been playing a lot with Claude. I mean, everyone’s, I think, playing with Claude. And I think both in terms of the science that he can help me produce and also, you know, just managing my schedule, that has been a game changer. And I’m still exploring what he can do. But yeah, so I think that’s basically what I’ve been thinking about most of the time.

Grant Belgard: What have you put into practice so far? Like what’s kind of, quote unquote, in production?

Yang Li: Yeah, we’ve been writing the revisions for one of our papers. And I’ve been using that extensively both to help me write some of the response, making it a little bit friendlier, but also rewriting some of my old code and checking for bugs and things like that. And it’s amazing. The number of things that I can do in just an hour far exceeds what I can do within a day at this point. So things like producing a plot in a slightly different way. As you know, it’s very difficult to rerun your code, especially if it’s not the best practice in the sense of software engineering. I’ve been self trained in terms of programming, mostly, and so the comments are not necessarily the best. But with Claude, it helps me comment, it helps me name my variable, right?

Or at least improve the naming of my variables, and then produce plots very, very fast, right? And so as you know, a lot of the way we check that the code is doing its job is by visualizing the underlying data in many different ways. And so Claude helps me do that. You know, as soon as I have an idea, I can just ask it to do it. And then I would see the visualization and sometimes I would find an error. But more often than not, it gives me exactly what I expect.

Grant Belgard: When someone asks you what you do, what’s your favorite way to describe it without using jargon?

Yang Li: Well, lately, I’ve been trying to steer away from that because I’ve been doing things that are pretty technical. But in just a few sentences, I think I would just describe it as I’m trying to understand how proteins are expressed. And there are many different ways by which we can control the expression of these proteins, and I’m focusing on this regulatory mechanism called RNA splicing. And this is highly regulated. And I want to understand what its function is in different systems and how to modulate it using drugs.

Grant Belgard: What makes this the right time for that?

Yang Li: Well, I think the reason why I chose this and I stuck to this ever since I think I was in grad school, really, is because almost nobody talks about genes in terms of how many proteins each gene can be producing. And so it was clear when I was researching, even the things I was researching in grad school, which is, as you might remember, the cichlids, it was clear to me that every single gene produces many proteins or many isoforms. And to me, it felt like this had to do something. Right. And my perspective has changed slightly since then. But because of my earlier work and the fact that almost no one was really researching that, I became really interested in that topic.

Grant Belgard: So what is your current perspective on splicing?

Yang Li: Well, when you read the textbook, it basically tells us that every single gene, every single human gene can produce many different proteins and many different protein isoforms. So these are isoforms that are essentially the same, but with slight differences. So it could be one protein domain that is included in an isoform and in another isoform, this same protein domain is excluded. And so often textbook or in literature, it would be described as something intentional, as in the two version of the proteins have very different function. So one would be performing function A and the other would be performing function B. And both are very important for the survival or the proper function of a cell or the organism.

But what I think now is that the vast majority, and by a vast majority, I mean really over 90% of these different isoforms is not really to have a different function, but really as a regulatory sort of switch. So again, to fine tune, very similar to gene expression level, right? So when you regulate gene expression levels through enhancers and promoters, you’re not changing the final output or the function of the gene. You’re just changing the activity by a little bit. And I think splicing most of the time is doing that, is doing exactly the same thing. The regulatory input is a little bit different, but the outcome is very similar.

So it’s able to change the protein and have a different function, but those are really the minority of the cases rather than the majority of the cases, as is taught by literature or the textbook.

Grant Belgard: How do you decide if a problem is method worthy or just something you’ll apply existing tools and move quickly on?

Yang Li: So do you mean in terms of developing a tool or just using a tool to solve a problem? Right. So I think it takes me a long time to convince myself that I need to develop a method for something. And so in general, I try to use methods that exist already or previous methods that I or my lab have developed. In some very rare cases, I think, hey, we need to develop a method because there’s really something that hasn’t been done. And we really need to do that and also we can do it. So all of these checkboxes have to be checked in order for me to move on to method development.

And I should say that we’re not particularly, I don’t think my lab is particularly good at developing methods, but we’re pretty good at identifying, I think, problems that can be solved by an older method whose goal is not necessarily for, well, hasn’t been developed for the specific question.

Grant Belgard: What are the most common bottlenecks you run into today? Is it a matter of data, compute, annotations, study design, interpretation, something else?

Yang Li: Yeah, that’s a pretty good question. I would say for me, it’s my time and getting a sense of what to focus on when there’s just so many people that I think need my attention, so many projects that need my attention. I think one thing that I’ve heard a friend tell me was a good example that I often talk about. It’s this context switching time. I’ve heard through the grapevine that Terry Tao, the famous mathematician, is extremely good at context switching. So he basically could switch from one problem to the next within seconds. And for others like me, we need more time to context switch. And so our schedule, when I guess you become a faculty, is that it’s spread in blocks of one hour. And I find it pretty hard to switch context from one hour to the next.

So I try to block more time, but then there are fewer blocks of the longer period of time. And so I think that’s somewhat of a bottleneck for me, is to find a longer block of time so I can have the time to context switch and then do deep work instead of just trivial work in order to make progress. It feels a lot of time I’m trying to just keep afloat and that doesn’t give me enough time to do enough deep work, which is the thing that I think I’m good at and also the most happy in doing. Yeah.

Grant Belgard: Have you found using tools like Claude impacts that in any way?

Yang Li: Yeah, yeah. So I think previously it was very hard to, I had a lot of questions about a data set or some topic and it just never felt like I had the time to do it. And with Claude, all of a sudden you could do things that would take a few hours. It would just take you a few minutes because it had the context, it remembers the context in which and you would ask and then you would remember the context and then you would just do it. So for example, plotting a figure about a data set and then it remembers where the file, where the raw data was. It would have taken me maybe 10, 20 minutes if I came back to this specific project after a week. It would take me maybe 10, 20 minutes to even recall where was the file that I was using and what exactly I was doing essentially.

I can ask Claude to summarize what I was doing or just scroll up a little bit and then ask questions and then he would give me the answer within a few minutes and then that would get me back on track much more rapidly than it would take me by looking at my own code and browsing and recalling. So that has been extremely useful. I think also Claude might be able to help me manage better. I haven’t implemented this, but I’ve sort of joked around that I would have my trainee talk to an agent or Claude and then Claude would tell me, you know, summarize all of their things. And then I would only have to read through the summarized version.

Grant Belgard: So it could be like the nurse at a doctor’s appointment before you see the doctor, right?

Yang Li: Yeah, exactly. And then five minutes before I meet them, I would review that and I would think a little bit to just get into context. And then it would be, I think, a lot more productive, right? So, yeah, I often tell my students to prepare some slides or some notes before I meet with them so that it helps me switch, get into context, because oftentimes I hear about their problem on the spot when they come to me during the half an hour or the hour period. And then I have to think about it. And oftentimes when I think about it, I find it’s not really awkward, but it’s still some pressure to answer, right? I think if I thought in silence for five minutes, even two minutes, right? It feels a little bit long, right? Let alone 10, 15 minutes.

But oftentimes that’s the time that you need to bring you back into context, to recall all of these different information, right? To have a very effective conversation. But the reality is that it’s also hard on them to come up every time with a few bullet points. Or at least, you know, I don’t know if it’s hard for them, but they don’t do it essentially. And this would, I think, speed up tremendously our meeting or at least make it extremely productive because everyone’s on the same page.

Grant Belgard: What’s a recent result or direction that surprised you and how did you respond to the surprise?

Yang Li: I can’t say that there’s a recent direction that really surprised me. I think I plan my projects long in advance and I can see points of failures pretty early on. And oftentimes the project or the direction does indeed fail. But then I often have a backup plan. And so I don’t think there’s anything, any direction that surprised me, I would say. And unfortunately, there hasn’t been anything like a sudden discovery that changed everything, unfortunately. So I’m either a very good planner or just not super lucky in terms of unexpected findings.

Grant Belgard: How do you think about reproducibility in practice? What’s good enough versus gold standard?

Yang Li: Yeah, I think there’s a lot that can be improved in terms of reproducibility. Unfortunately, I think there is some amount of pressure to understand the system, the biology. I mean, there’s a speed component, right? You want to dig into the biology more rapidly. And oftentimes the solution to that is to do what you know best to do. And we’re not trained as software engineers. We don’t do these kinds of unit tests. And so reproducibility suffers and there are bugs, right? So I developed LeafCutter many years ago and I still find bugs there. So in that sense, these things can be improved drastically. On the flip side, I don’t think any of these bugs or these issues affect our results, our biological interpretation of things.

Very rarely there would be a very important result that is affected by these. It does happen that it affects a very minor result, right? Or the interpretation of a minor result. And to prevent these bugs or this lack of reproducible findings, we essentially try to poke holes at our major, what we call major discovery. So the things that would, for example, break a paper or the main finding that we think we made. We would look at many different data sets and we would design tests that would essentially break it in one way or the other. So we have very orthogonal ways of trying to confirm a result. That would include, for example, looking at a completely different data set or deriving some corollary. So based on these, if this were true, then this other thing must be true.

And so we would do more tests on whether this downstream result should be true, will be true. So we do a lot of these type of analysis. And then at some point, everything makes sense. And if something doesn’t make sense, then we have to explain this. Right. So I think this is a scientific process. And I’m not going to claim that this is foolproof, as in I will never have anything that is later falsified. But I think from my track record, I think this has worked so far.

Grant Belgard: How do you decide what to delegate and what you personally stay close to?

Yang Li: Right. So I try to delegate as much as possible. I try to delegate anything that I think a trainee or someone else or a collaborator can do to them. But I obviously weigh by importance. So the things that are the most important, even though I also try to delegate those, depending on whether I think they can do it, I would pay attention to the outcome. Yeah, for minor things, I would just trust them to do the correct thing. And sometimes, you know, we have to backtrack when later on we find a problem.

Grant Belgard: How do you help your trainees develop taste, knowing what to do and what’s not worth the effort?

Yang Li: Yeah, that’s a very good question. It’s a little bit like asking me, how do I teach creativity, someone to be creative? And yeah, I hate to have this fixed mindset view, but I think it’s something that’s very difficult to teach, right? I think we can encourage creativity, but it’s something that has to do a lot with personality. I think I’ve noticed some types of personality that are, I would say, not as creative or don’t have as much taste and rely more on other indicators. So, for example, sometimes I notice it in not just my trainees, but in general, right, that a paper that’s published in Nature, right, or in a high impact factor journal, they sort of rely on that as a measure of what’s exciting and what’s good.

And others, they don’t rely on this and they have an internal perception of what’s exciting and what’s not. I think the one way that you can help is to read a lot, right? I think it’s, I always tell my trainees to read a lot. I don’t know if you remember, but in grad school, I just tried to read at least one paper a day and I would go through my RSS feed with, you know, hundreds of abstract every day. I mean, now it’s getting even harder because, well, much harder because there’s just a lot more papers that’s been published. But at least back then I had all my journals that I generally read and an RSS feed and I would go through all of the abstracts, title and abstract. I would do this every morning and I would read at least one paper that interests me.

And so I think that helped a lot in terms of both creativity. I mean, creativity is not just, you know, whether you can come up with new things, right? You can come up with new things to you, but someone might have done it. So you also have to know about what’s out there, right? And taste, I think, is somewhat similar as well, right, to creativity. If it’s just, if you like a paper or if you like a project just because it sounds good to you and you don’t know that much, then maybe someone might not call that a good taste, right? So I think these are linked together.

So the more you know, the more you’re likely to have good taste and you have to have your own sense of what’s worthwhile and what’s valuable and not just use some kind of, you know, external, I mean, what someone tells you, right? Obviously, at some point you have to rely on someone, right? So someone that you respect, someone that you know have good taste, if they like it, then you can maybe up weight something a little bit. But then at the end of the day, you need to build your own, you know, scoring function.

Grant Belgard: So let’s talk about your own career track. In your own words, how did you get here?

Yang Li: That’s a very interesting question. I did mention that I like to plan things ahead in terms of my research project, but my trajectory, I think it’s been, yeah, I’m reminded of the quote by Bertrand Russell. I don’t remember the quote, but essentially it goes like, you know, my life has been like great waves, right? Or great winds, like it blows me here and there. And I really feel that way, that, you know, there are periods of my life where that changed me a lot, right? And that has depended a lot on luck or maybe we shouldn’t call it luck, just circumstances that I guess I viewed favourably and therefore I called these luck. But it could have also been misfortune, right, if it didn’t end up very well. And I think these periods are what, these few periods are what took me here.

And the first period was during the last few years after high school. So right before college, I grew up in Montreal, in Quebec, and there’s this period called CEGEP, which is two years before university, but after high school. And at that point, I met some very good friends who introduced me to coding, but also hacker culture, not in terms of, you know, black hat, but more just, you know, coding hacker. And I remember that I started to also become interested in philosophy a lot and asking, you know, bigger questions such as what is the meaning of life, obviously, what is consciousness and all that. But really asking the question of, you know, what am I doing here, right? And at that point, I started to code and to read essays.

I think many of us have been influenced by essays from Paul Graham about, you know, being a little bit more intentional about who your friends are, who you hang out with. And I think this really has started this trajectory, right, being very humble, always trying to look for people who are smarter, who are more, you know, who are more knowledgeable than me. And so without that perspective, I don’t think I would be where I am right now. I mean, grades don’t really matter, but my grades were very average. I wasn’t particularly interested in anything other than video games, obviously. But at that point, something switched in me, right? So being a lot more intentional about how I use my time, who I’m friends with, or who I hang out with. And I think that worked out, right?

Immediately during university, I just identified people who were really excited about their work, excited about their craft. It didn’t have to be anything particular. But at that point, I majored in mathematics and computer science. And I met very good friends, again, that were really excited about the work that they did. And they were really passionate about something, right? So you can be passionate about video games, which I was about, right? But it’s very easy to be passionate about video games. It’s a lot harder to be passionate about something that’s very difficult and that no one cares about. And so I was looking for these sort of phenotype of people who really cared about mathematics, right?
Something that I didn’t particularly care about, I enjoyed, but I didn’t particularly care about. But then, you know, I started to model things that I cared about and use their intensity on these, I guess, passion, right? And so just recognizing the fact that real usable things like mathematics or coding or these things that are hard and tedious can be something that you really enjoy, right? Was something that’s quite new to me. And then, as you know, I got really interested still through my philosophical angle about aging, the aging process. And so I followed these passions and essentially all through that experience, changing from mathematics and computer science to biology, my main goal was to follow what I was passionate about and to do things as rigorously as I could.

And so essentially that led me to where I am right now, which is not studying what I initially set myself up to do, which is aging. And there’s plenty of reasons for that. But essentially following a passion that not everyone cares about, right? But I care about and applying the same fundamental values to these problems.

Grant Belgard: What’s something you learned early on that still pays dividends today?

Yang Li: Early on, as in how early on? I think one thing that I mentioned a lot to people is doing a degree in mathematics. And I don’t think that you have to do a degree in mathematics to get that. But this is where I got it, is this sense of what you don’t understand is actually very important, right? I don’t know if it’s, it’s definitely not a muscle, right? But it does feel like a muscle that you can use to know gaps in logic. And I think even, I wouldn’t say extremely good scientists, but because I do think that very extremely good scientists, they have this muscle, but I would say maybe many, many trainees and many faculty, I would say, and leaders, they still struggle with some gaps in logic. It’s very easy to jump, to have this logical jump.

And that impacts a lot of things, impacts the science, but also the writing a lot. That’s what I observe. That I think is one aspect that I see the most apparent when I see someone’s writing and I observe that there’s a gap in logic. So you just assume that, well, first of all, you assume that everyone knows what you know, but you also assume that one sentence followed the next sentence, or rather the next sentence follows the previous sentence. And I often see that there’s a gap in reasoning that I think is pretty hard to fix, right? Essentially, you have to say, well, why does it follow? And then a student might say, well, it follows because it’s obvious, but it’s actually not obvious. But how do you know that something is not obvious?

How do you distinguish something that’s not obvious from something that’s obvious? And especially when you’re talking about biology, for something to be obvious, there’s often a stack of unstated assumptions. Exactly, exactly. And in mathematics, when you do a lot of proofs, you’re sort of trained to always question every single step. And so I think doing this has really taught me, or at least it made me extremely careful about these steps. And in biology especially, I thought it was extremely useful because, well, sometimes you just can’t overcome, right? You cannot prove every single thing, right? In fact, the first few years when I transitioned from mathematics to biology, it was extremely difficult, because I was just hung up with the simplest thing, right?

But then I found utility in this because you can stash it, right? You can stash this gap in logic. So you notice them, and then you have to convince yourself that, well, it’s true that I can’t prove it, but it’s probably right in this system, right? And then you can move forward. But also at the same time, you understand what is this gap, right? And by understanding this gap or this condition, right, that it works only in one system, I think you start to understand the system a little bit better, and you start to understand how this information that supposedly only applies to this specific system can also apply to another system. And so it helps me transfer some of my understanding from one paper, for example, to another paper, right?

So how might these be similar across papers or across cell types, right, or across disease transferred to another cell type or across another cell, another disease? So I think, I mean, this part helped me a lot, I think, in my thinking and in how I transferred knowledge across disease cell types or anything really. And I guess this was a little bit unexpected. I use very little mathematics right now, very, very little of the things I actually learned during undergrad. But this obviously has stuck with me.

Grant Belgard: What are some things you had to unlearn when transitioning between stages, you know, student to postdoc to faculty?

Yang Li: Right. I mean, I wouldn’t say that it’s unlearn, but change definitely very much so. All right. So when you’re a student and a postdoc, you’re very self-centered. You drive your project forward, and there’s some sense of the truth is the only thing that matters. The results are the only thing that matters. There’s less, I would say there are some, but much less personal touch, right? There’s some collaboration, obviously, but you’re really focused on your own project. At least that was my experience. And whatever I did was a lot focused on just obtaining the truth, obtaining, you know, understanding the way it works. What I had to, I would say, unlearn is maybe be less obsessed with the truth and how things should be done versus, you know, how it will be done by someone.

So it’s hard to force your way of doing things, even if you still believe that it’s correct, onto someone else who might not do it the same way as you. Right. And so, as you know, we’re not taught to manage as faculty. And this is something that you learn because you see either students struggle or you see other people struggle. And then you notice that, hey, this is not, I mean, this is not productive, right? You cannot tell someone to work the same way as you did, even though you think or you strongly believe that this is how you would do things. And even if you could prove that this is the more efficient or the better way. So I think this is something that I think about.

Everyone’s different and some personalities are more likely to accept some ways of doing things and some other personalities are unlikely to perform well if you tell them to do it in a certain way.

Grant Belgard: What’s something a great mentor did for you that you try to replicate for others?

Yang Li: Well, I think all my mentors have been extremely kind. I think that’s something that matters, and at no point did I feel that a mentor was just using me in some way to get a paper out, for example. And I’m mentioning this because I have witnessed that some mentors essentially treat trainees, even though maybe they think it’s justified, as a means to an end. And so I always think of the trainee as a person that is here to grow in terms of their ability, in terms of their knowledge. And so I think that’s something that I’m very careful about. I never try to have a student do something that is not beneficial to them.

Grant Belgard: Given the rapid changes in the field driven by AI, what advice do you typically give to early career bioinformaticians in navigating that?

Yang Li: Yeah, I think that’s a great question. And it really depends on your own personality. I think one aspect is to understand yourself, like what kind of personality you are. And I truly believe that personality matters a lot, right? And it’s really, some might say, oh, well, you have to change your personality. But I find that extremely hard. There are some personality traits that I know that I should change or that if I change, I would be happier or even more productive. But it’s very difficult to change, right? And so the way that I try to guide my trainees, for example, is to get a very broad sense first of what type of personality that person is. There are some, I think it was Ray Dalio in his book, he used to be the CEO of Bridgewater, and he developed these tests.

I don’t exactly remember the specifics, but I think what he did was for every personnel, every member of the company, he has this test that will classify them into what they’re good at and what are their personality. And one thing that I keep on thinking about is doers and thinkers, right? So oftentimes you can characterize someone as a doer or a thinker. The thinkers are those who like to think and then they’d like less to do. And the doers, they’re more, you know, they have a higher affinity to just start to do things before even thinking, right? So one thing that’s helpful and it’s only, you know, potentially related to personality is to figure out if you’re more like a thinker or more like a doer. And maybe you’re both, right? And that’s great.

But then figuring out these sorts of traits for you will help you determine what you should focus on. And one piece of advice was, if you think that you’re a doer, maybe you should team up with a thinker, right? And vice versa. If you’re a thinker, maybe you should team up with a doer and be very, again, intentional about this, right? Don’t let chance decide. If you have two doers, odds are that you’re just going to build a lot of things and it might not be very useful or not very good, right? If you have two thinkers, nothing gets done. And personality as well, it’s the same, right? And sometimes I also think about this diversity. I mean, lots of people say, oh, diversity is good, is good, is good.

But when pressed about exactly how diversity is good, they would give, you know, the blanket statement like, oh, well, diversity, you know, you have different ways of thinking about things. I agree with that. I think you need a little bit more to really build a good diverse team, right? And this, for example, this diversity in thinkers and doers, and there are also other personality traits that I forgot to mention, right? So there are also, you know, personality traits about being very pessimistic, right? So I would classify myself as being a very pessimistic person. Careful, I’m trying to improve that, obviously. And then there are some people who are extremely optimistic. Like, you have an idea and they’re on board and then they’re like, OK, yeah, it could work because X, Y, Z.

I’m more of the, but it won’t work because X, Y, Z, right? But you need both, I think, on the team. If everyone is optimistic again, then this is going to be maybe an echo chamber of, well, yeah, it’s going to work and they’re just going to be hyped up. And then it’s great, right? Everyone feels great, but then it doesn’t, it’s not, you’re not going to have a good product, right? Because then you don’t consider what the negatives are. And if you’re all pessimistic, like me, if you have two rooms of me, then nothing’s going to work. So I think you need to figure out who you are and then team up with some diverse people, right, in that sense. And again, there’s lots of different, these are just two axes of variation.

There’s a lot more axes of variation that I think that you can optimize to build a very strong team.

Grant Belgard: What separates great collaboration partners from frustrating ones in CompBio projects?

Yang Li: As in me, is it CompBio or like two CompBio teams, or like one biological and one computational?

Grant Belgard: Yeah, probably a computational and a wet lab.

Yang Li: And a wet lab, I see. I think there needs to be respect, right? Respect for each other’s craft. If one is, again, using the other and without any amount of respect, I mean, that seems obvious, but it’s actually not. And it can go in both ways and often does, right? Yeah, yeah, exactly, exactly. So we, as computational people, cannot treat the experimentalist as just, you know, a pipette. Like, oh, you’re going to be replaced by robots soon, right? In the same way that the wet lab experimentalist cannot treat us as, you know, Claude Code, right? And in fact, I see it happen, right? Not every day, but I know who these people are, right? And so it’s a lot more prevalent than you might expect, I think.

And also a little bit of effort in understanding the other is, I think, at bare minimum, right? But obviously, accepting the fact that you’re not going to be as good as your experimental or dry lab counterpart. Another thing that is extremely important is that you have to enjoy working with them. Sometimes it could be tempting to work with someone who’s just very good, right? And you just need the resource. But personally, I just don’t think it’s worth it if you really don’t enjoy working with someone. Yeah, personally, I don’t think it’s worth it. The other thing is energy level. I think it’s very important to have the same amount of energy. If one of you is just a lot more excited, you end up being really annoyed that the other one is slacking off, right? And vice versa.

They’re probably going to be annoyed at you, or you’re going to find them pushy if you don’t have the same energy level. So I think these things are the main thing. And I’ve had, I would say, very good collaborators and pretty bad ones. And I think always these three aspects separate these perfectly.

Grant Belgard: What frameworks do you use when helping trainees decide on career paths?

Yang Li: Yeah, I think it also has to do with personality. Anyone who’s very curious and very open minded and maybe like more, you know, just very idealistic, I would try to push them towards academia. Anyone who is very practical and, you know, I don’t mean to say that one is better than the other. Anyone who is very practical and has a very good sense of what they want in life and doesn’t want to deviate too much, I would, you know, steer them towards industry. And I don’t tell them that, right? Everyone who goes through my lab, I tell them that I think that they could become good academics. But the fact of the matter is academia right now is not super welcoming in the sense that it’s just very difficult. It’s very difficult to have a tenure track position.

That being said, there are a lot of positions that are not tenure track. And if you’re OK with that, and I think you should totally be OK with that, there are a lot of possibilities. And I would also encourage that. But obviously, you know, if you’re a very creative and very, you know, idealistic kind of person and you really want to change the world or research something that you’re deeply passionate about that not many people might care about, then I still think that academia on the tenure track, having your own lab, at least, is the right place or the right path.

Grant Belgard: Final question. If you could give just one piece of advice to your earlier self, what would it be and why?

Yang Li: Other than buy Bitcoin? Yeah, I think communication is very important to focus on and be more open minded to things to improve. I think when I was young, I was really into doing hard things and technical things. I think you can call it hard skills versus soft skills. And I didn’t think at all about improving soft skills or maybe personal skills. So interpersonal skills. And I think I would give that advice to my past self, even though I strongly suspect that I wouldn’t listen to myself. Yeah, so personal skills is, I think, more and more important, especially with AI, which I think can replace a lot of the hard skills, to be honest.

And so the ones who I can see will succeed a lot more than I will are the ones who have the soft skills and know how to get AI to help them with the sort of hard skills.

Grant Belgard: Well, Yang, this has been fantastic. Thank you so much for joining us.

Yang Li: Great. Thanks for having me, Grant.

The Bioinformatics CRO Podcast

Episode 78 with Sun-Gou Ji

Dr. Sun-Gou Ji, statistical geneticist and VP of Computational Genomics at BridgeBio, discusses his career in genetics and genomics and BridgeBio’s approach to target validation and novel target discovery.

Sun-Gou Ji

Sun-Gou Ji is VP of Computational Genomics at BridgeBio, supporting target validation and novel target discovery for drug development. 

Transcript of Episode 78: Sun-Gou Ji

Disclaimer: Transcript is automated and may contain errors.

Grant Belgard: Welcome to The Bioinformatics CRO Podcast. I’m Grant Belgard, and joining us today is Sun-Gou Ji. Sun-Gou is a statistical geneticist at BridgeBio, where he drives scientific decision making based on human genetics. As VP of Computational Genomics, he leads a team of statistical geneticists and data engineers focused on target validation and novel target discovery. Previously, he was at Seven Bridges, where he collaborated with the Million Veteran Program to validate and uncover genetic factors influencing human traits in a highly diverse and admixed population. Welcome to the show.

Sun-Gou Ji: Thanks, Grant, for having me.

Grant Belgard: So how did you first become interested in genetics and drug development and what drew you into the field?

Sun-Gou Ji: Sure, sure. I’m sure everyone has this time where you think about what impact, you know, do you want to make in this world while living here? And the type of lasting impact that I was very struck with was that a drug that I could develop could help people even when I’m gone. It would stick and still help people in perpetuity. So once I thought about those things, I was actually lucky enough to then do a Ph.D. at the Sanger Institute at a time when human genetics was showing a pretty meaningful impact on the success of drug programs. And here I am now. I feel like I just, you know, happened to be at the right place at the right time and things aligned, and I’m really happy to be contributing to something that will outlive me.

Grant Belgard: So the Sanger Institute, of course, is an epicenter of human genetics. How did your Ph.D. work there shape the way you think about it?

Sun-Gou Ji: I would say it just basically shaped who I am now. I feel like if I had to choose one time in the past I could go back to, it would be doing a Ph.D. at the Sanger, which I think is pretty rare for people that have done Ph.D.s. And, you know, its history started with sequencing the human genome, and for the density of world-class human geneticists, there’s just no other comparison out there. So especially the scientific rigor and the collaborativeness I learned at Sanger are still the basis of how I operate today. And I would really strongly recommend it to anyone, you know, considering this field.

Sun-Gou Ji: And many of my friends remember the time at Sanger as the best time of our lives, not only for the scientific achievements, you know, people at Sanger do publish a lot and in pretty high impact journals, but also because the diverse culture and its inclusiveness, being part of Cambridge culture, is a very exceptional experience.

Grant Belgard: What did you take away from your time at Seven Bridges, especially working on the Million Veteran Program and the Graph Genome Project?

Sun-Gou Ji: Yeah, sure, sure. I joined Seven Bridges, it was like 2015, 2016. At that time, data science, big data was the hype, you know, before AI. And back then, it actually took ages to perform imputation on HPC clusters or run a GWAS using LMMs. And I’m sure you remember that time too, Grant. And being able to run a large compute and know some stats qualified me as a data scientist. And Seven Bridges had kind of occupied this niche, at a time when it was almost impossible to orchestrate complicated genomic workflows on AWS directly. And although everyone knew things would move from HPCs to the cloud, I think there was a time where people were scared of having their precious data in the cloud. And I was in the R&D team working on the Graph Genome Project and I met the smartest people I’ve ever met there.

Sun-Gou Ji: It was very different from the crew from Sanger in that it was a completely different group of folks with PhDs in quantum physics and mathematics, and engineers, software engineers, some with like 20 plus years of experience. And this focused team of like a dozen plus worked on a single project to create this graph genome ecosystem. And as you may know, the name Seven Bridges comes from the seven bridges of Königsberg, the problem which was solved by Euler and laid the foundation of graph theory. So you could understand what Seven Bridges was trying to do. They were trying to use graph genomes and actually revolutionize how we do genomic analysis. And my experience there really opened my eyes to the difference between academia and industry. Usually when you have this type of project, you have one PhD or postdoc working on it.

Sun-Gou Ji: Whereas here you had dozens of people with vast experience working on a single project to get one thing done. And I mainly focused on the structural variant aspect of the project, which led to a nice paper back in 2018 or so. And I believe it’s still part of the offering of Velsera, which absorbed Seven Bridges. And it’s really great to see that these graph genome and pangenome approaches are really picking up more recently. And actually I feel like this really shows how difficult it is to commercialize a completely novel bioinformatics tool, even though it could revolutionize the whole field. And as for the MVP work, I was also working with many others at the VA to QC the initial tranche of the genotyping data and imputation. And from all these experiences at Seven Bridges, I really learned a lot, especially being the only human geneticist in the group.

Sun-Gou Ji: It took me some time to understand that Sanger was sort of a bubble, right? Where like everyone understands human genetics. But here I quickly had to get myself comfortable basically defending the whole field of human genetics in front of mathematicians and physicists and engineers who would listen to you about how, you know, variant calling is done, alignment is done, association testing is done. And they would say, oh, this is irrational. This is inefficient. These are very old statistical tools. There are these novel things you could use. Why are you using this? But actually, having to defend the field in front of these really smart people helped me explain concepts of human genetics from first principles.

Sun-Gou Ji: Why do we do this? There are reasons that the human genetics field uses these kind of old statistical techniques rather than these very complicated non-linear models a lot of the time. And this kind of explaining the reasons for how human genetics is done from first principles turned out to be very useful at BridgeBio.

Grant Belgard: What do you consider to be the most impactful outcomes of the Million Veteran Program?

Sun-Gou Ji: Well, the data itself. You know, with the Million Veteran Program, it’s very amazing that the veterans are actually contributing their health information and genomic information for research to advance veteran care. And this type of data, reaching for a million in a single hospital system, still has no comparison. And actually the Million Veteran Program data is really special in the way the ancestry proportions are distributed within the data: there’s a much higher proportion of African-Americans as well as Hispanic Americans compared to the other databases that have larger European ancestry. So the type of analysis and knowledge that’s coming out of the MVP data is very orthogonal to what we get from other databases or biobanks.

Grant Belgard: So what led you to then join BridgeBio?

Sun-Gou Ji: Yeah, so honestly there was, of course, a lot of serendipity. And once I was working on these bioinformatics tools and QCing the data for others to use, the only thing I was sure that I wanted to do was move closer to patient impact through developing drugs. Like I said at the beginning, I felt I was sort of ready to move closer to actually making a drug that I feel like I made, or contributed significantly to making. And the choices back then were like big pharmas. You know, thanks to the Nelson et al. paper from GSK or the King et al. paper from AbbVie, many pharma companies were building huge genomics teams. And, you know, I think there were a lot of choices from a lot of these places, but looking back and trying to justify my choice to join BridgeBio instead, it was definitely the people I met during the interviews.

Sun-Gou Ji: I was really impressed by the team. They were super smart in very different ways. I think a lot of people, I would say, from Seven Bridges were really scientific smart, like very academic smart. Whereas the BridgeBio folks felt a bit more street smart and they would just get things done right somehow, without dwelling too much on the detail, but just enough to actually get things done in a very efficient way. And of course, the other part was, you know, the opportunity to be interviewed by world experts like Richard Scheller and people like that, as well as getting personal calls from the CEO. You know, you wouldn’t really get that if you were joining one of the big pharmas. And it felt like these people could really do something. And this hub-and-spoke model for rare disease really also resonated with me.

Grant Belgard: So speaking of the hub-and-spoke model, that’s pretty uncommon in biotech. Can you explain how it works and why it’s effective in rare disease drug development?

Sun-Gou Ji: Yeah, so I’ll start with the ‘effective’ because I don’t think a lot of people appreciate it. Like one metric I really like to highlight about BridgeBio is, we’ve been around for 10 years now. And within that time, we’ve delivered 19 INDs and three NDAs. We had two positive phase three trials that just read out in the last year. And we’re waiting for one more that will read out within this quarter. This efficiency is really rare. And this starts with actually picking the right programs and having a balanced view of the portfolio. So how do we choose? The majority of rare diseases happen to be genetic. And we know that targets with genetic support have a higher chance of success. And that’s why BridgeBio develops therapies that target the source of these genetic disorders or are very close to it. All of our targets technically have genetic support.

Sun-Gou Ji: But, you know, everyone knows there’s a twofold increased success rate if you have genetic support. But the chance of a single program succeeding is still very low. Each program has a low probability of success, slightly increased because of genetic support. And if you bundle enough of them together, it just becomes a mathematical problem of how many programs do you have to try to get a certain probability of the portfolio making it? So this is from a paper by Andrew Lo, one of our founders, who actually came up with this concept, with our CEO Neil Kumar kind of delivering, executing on it. And that becomes a very mathematical problem that actually a lot of investors and bankers get.

Sun-Gou Ji: And it’s very hard to raise funding for a single rare disease program that has a low success rate and where the outcome would not be that huge. So it’s actually very difficult to raise for a single program. But if you, because of the higher probability of success of individual rare disease programs, bundle them together, then your risk becomes really low. So there are investors that have the appetite for low risk investment under this model. So BridgeBio was able to raise from, you know, unconventional investors in biotech. And it’s not only about how we raise funding; it also allows funding towards the smaller indications with smaller upside, which would not be funded individually if this model was not there.
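The bundling arithmetic described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: programs are treated as fully independent, the 10% baseline success rate is a made-up number, and the function names are hypothetical. Only the "twofold" uplift from genetic support comes from the conversation.

```python
import math


def portfolio_success_probability(p_single: float, n_programs: int) -> float:
    """Probability that at least one of n independent programs succeeds."""
    return 1.0 - (1.0 - p_single) ** n_programs


def programs_needed(p_single: float, target: float) -> int:
    """Smallest n for which P(at least one success) reaches the target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_single))


# Hypothetical baseline: 10% success per program without genetic support.
baseline = 0.10
# Twofold uplift with genetic support, as mentioned in the conversation.
supported = 2 * baseline

for label, p in [("baseline", baseline), ("genetic support", supported)]:
    n = programs_needed(p, target=0.90)
    print(f"{label}: {n} programs for a 90% chance of at least one success")
```

Under these assumed numbers, doubling the per-program success rate roughly halves the number of programs the portfolio needs. Correlated failures between programs would weaken that result.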

Grant Belgard: So for a company, aside from a successful launch, the best outcome is failing as early as possible, not going as far as possible. What does that mean in practice to fail early in rare disease development? And how do you operationalize that mindset within BridgeBio where you have multiple shots on goal, you know, kind of in principle uncorrelated risk basket of programs?

Sun-Gou Ji: Yeah, that’s actually a very important aspect of our portfolio. We’re not trying to make every program a success; we try to optimize for the portfolio. And usually this is not possible because if you have one company working on one program, if this program fails, you’re done. Whereas at BridgeBio, if this program fails, there are always new programs that we are starting. So for people that are working on a certain program, even if that fails, it doesn’t mean that they’ll lose their job. They might, but they can actually be transferred over to other programs that are being newly created or that need support for other things because, you know, everything moves and all these programs that are uncorrelated are at different stages of development, with different problems.

Sun-Gou Ji: And that’s how you can at least incentivize people to make the right decision rather than the decision that makes the program live longer. And, you know, this kind of shutting down of programs happens in very different circumstances. Sometimes it happens because of external factors, right? Where the market’s shrinking. Then you have to figure out which programs you want to keep, which is similar to what all other biotechs and pharma companies go through. But we also do that very intentionally, where we review our programs, especially the early stage programs, and make sure when we start a program, we develop these clear decision points. Like if we hit a milestone, then it’s a go. But then we also very clearly lay out what a no-go would be for each milestone and try to make harsh decisions.

Sun-Gou Ji: But these are definitely one of the hardest decisions that we always have to make, but we always try to push ourselves to make those decisions before the market makes us make those decisions.

Grant Belgard: And how do you approach risk-adjusted net present value modeling in rare diseases? And why do you think that’s a better framework than focusing on peak sales?

Sun-Gou Ji: Yes. So we actually released a white paper on this last October called the feasibility of rare disease drug development. And this is all about risk-adjusted NPV. NPV is the net present value of a program, meaning what is the present value of a certain drug development program at this time, considering all the potential paths this program could take and aggregating across all the potential outcomes, from failure risk to success probability, and taking cost and time into account, which is the risk adjustment. Then you have a single number on whether this program is actually positive, meaning it’s worth investing because you’ll get something out of it, versus negative, which means it’s not economically viable, financially viable to actually make an investment into the program.

Sun-Gou Ji: And I’m sure people have heard of this herding in rare disease drug development, where everyone is working on a select few more common rare diseases. And most of the other rare diseases just get no interest. And that’s, I think, what happens if you focus on peak sales: there are just a few rare diseases that actually make sense if you just think about peak sales and where the biology of the disorder is understood. And if you focus on just the peak sales, I feel there’s just not much of a way to avoid herding on select rare diseases. Peak sales only considers the potential outcome and ignores the potential costs to get there. In contrast, for common diseases like IBD or, you know, autism, rNPV is not as relevant because whatever you spend on it would actually be negligible in the context of the large outcome, like a large fruit at the end.

Sun-Gou Ji: But for rare diseases, comparing the size of the fruit that it will bear with some probability against the expected cost, and whether that is positive or not, is critical. And a lot of our drugs would not have been interesting under the traditional way of just thinking about peak sales. But, you know, some of our teams are so lean and efficient and have pulled off one of the cheapest drug development programs that has ever been run to reach phase three. And all of that, if you only focus on peak sales, doesn’t really matter. So if anyone’s interested, I would really encourage people to check out our white paper. And there is actually a toy you could play with.

Sun-Gou Ji: You could kind of change how much you think this is going to cost, how long your trial is going to last, and other things, and try to figure out what you need to optimize in order to turn your program NPV positive.
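A minimal sketch of the risk-adjusted NPV idea discussed above, in Python. This is not BridgeBio's model or the tool from the white paper. The stage structure, all dollar figures, probabilities, and the discount rate are hypothetical, and the payoff is simplified to a single lump sum received at approval.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    cost: float            # spend at the start of the stage (hypothetical units)
    duration_years: float  # how long the stage takes
    p_success: float       # probability of passing the stage


def risk_adjusted_npv(stages, payoff, discount_rate):
    """Sum of probability-weighted, discounted costs and final payoff."""
    npv = 0.0
    p_reach = 1.0  # probability the program is still alive at this stage
    t = 0.0        # years elapsed since program start
    for stage in stages:
        # A stage's cost is only paid if all earlier stages succeeded.
        npv -= p_reach * stage.cost / (1.0 + discount_rate) ** t
        p_reach *= stage.p_success
        t += stage.duration_years
    # The payoff arrives only if every stage succeeded.
    npv += p_reach * payoff / (1.0 + discount_rate) ** t
    return npv


# Hypothetical rare disease program; every number here is invented.
program = [
    Stage("preclinical", cost=5.0, duration_years=2.0, p_success=0.6),
    Stage("phase 1/2", cost=20.0, duration_years=3.0, p_success=0.5),
    Stage("phase 3", cost=40.0, duration_years=3.0, p_success=0.7),
]
print(round(risk_adjusted_npv(program, payoff=400.0, discount_rate=0.10), 1))
```

Changing the costs, durations, or probabilities shows which lever flips the sign of the result, which is the kind of exploration described in the conversation. A peak-sales view would look only at `payoff` and ignore everything else in the calculation.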

Grant Belgard: In broad strokes, how would you define computational genetics for the work that you lead?

Sun-Gou Ji: In broad strokes, any analysis that cannot be done in an Excel spreadsheet and that is not directly related to clinical trials.

Grant Belgard: I like that definition. Yeah. I haven’t heard that before. That’s a good one. Where in the life cycle from target ID to validation, candidate selection, trial design, post-marketing, is your involvement the heaviest and why?

Sun-Gou Ji: It will be in the earlier stages. Once the target is selected and the drug program gets going, there’s not much in terms of computational genetics that can be done. It can help decision-making by generating different kinds of biological support for the pathway, the target and all that, and we do all of that. And actually we work across all of them, but where we put the heaviest effort is selecting the right target and actually validating it. This is the type of decision where, once you make it, there’s no turning back. You can only know after phase three, after spending a lot of money and a lot of time, a lot of resources that could have been spent on trying to help other rare disease patients. If you pick the wrong target, there’s no way you could change that.

Sun-Gou Ji: That’s where we put a lot of our efforts and that’s also where, you know, there is tried and tested proof that it does significantly improve your success when you incorporate a lot of genetics data in that stage.

Grant Belgard: What data sets are most actionable for your work right now and what makes them actionable?

Sun-Gou Ji: There are multiple databases. Of course, like everyone, we are working with the UK Biobank and All of Us. They’re very useful and somewhat actionable because of the general population representation that you can learn from, where you can think about, okay, if you go after certain rare disorders, what are the more common expressions of the rare disorder that could be observed in more common patients?

Sun-Gou Ji: And can we actually build like an allelic series around the target based on more common variants that are not directly causing the monogenic disorders? But because the UK Biobank and All of Us are usually devoid of a lot of severe rare monogenic disorders, you do have to complement those with other databases that have a higher enrichment of these more severe rare monogenic disorders. That would include databases like Genomics England, which we work closely with, and also a lot of these genetic testing providers like Invitae and GeneDx, where you would get tested because you have a certain concern about a genetic disorder. So those are the databases that would be enriched in the type of patients that we are trying to treat. So in the end, there’s not a single database because they all have different ascertainment bias.

Sun-Gou Ji: And if you just keep sampling from the general population, you would basically have to sample the whole of the US to actually get enough sample size to do anything for any of these rare disorders. So that would take too long. We’ll get there, but it’ll take too long. From the other end, because the genetic testing vendors are biased towards people that actually have a reason to be tested, you’re missing a lot of people in those databases. You’re missing a lot of people that are kind of missed, where they have slightly less severe forms of the disorders and would not get tested. So a lot of the insights you get from those databases will be biased towards more severe expression of the phenotype.

Sun-Gou Ji: So in the end, you have to merge those two together and make sure that what we get from one database can be replicated, or if it’s not replicated, that we can explain why you don’t see that in these other databases. And then of course, it doesn’t end with just using the genomics data. I think one of the best things about the UK Biobank is that now they provide all this proteomics data, and a lot of other multi-omics data sets are becoming more readily available, and layering that on top of the genetics is becoming more and more important. But again, a lot of these monogenic disorders don’t have a large enough sample size for these multi-omics. So how do you use the multi-omics from a general population or a general database to incorporate that layer of information to help de-risk our targets or de-risk our program moving forward? It’s always case by case.

Grant Belgard: So the calcium sensing receptor has been described as a system level node for calcium homeostasis. Can you explain why it’s an interesting target?

Sun-Gou Ji: Yeah, so the CASR gene, like you said, is the calcium sensing receptor. It senses the calcium level in your blood and tries to make sure that your calcium levels are kept in check. And one of our programs that read out last year was an inhibitor of this calcium sensing receptor that’s trying to treat autosomal dominant hypocalcemia, a monogenic disorder where the calcium sensing receptor is overactive and too sensitive to calcium. That’s why it thinks that our body has more calcium than needed and keeps the calcium level lower. So the hypocalcemia is the symptom of this monogenic disorder. And why CASR as a gene is super important and interesting is that it’s actually a genetic target with an allelic series.

Sun-Gou Ji: And what an allelic series is, to put it simply, is nature’s dose response curve, where the dosage of the gene correlates with disease outcome. That means if you have low dosage, meaning a loss of function in CASR, then you have hypercalcemia, where you have too much calcium. Then you have your wild type in the middle, where you’re kind of okay. And then you have your gain of function in CASR that actually causes the disorder that we’re trying to treat, which is autosomal dominant hypocalcemia. So you have this human phenotypic outcome that correlates with the dose. And the dose response curve is what you want to see in a clinical trial. That kind of proves that you’re actually hitting the target correctly.
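
Editor’s sketch: the “nature’s dose response curve” idea above can be illustrated with a small Python check. The ranks below are hypothetical ordinal values chosen only to mirror the ordering described in the conversation (loss of function with high calcium, wild type in the middle, gain of function with low calcium). They are not measurements from any study, and the class and function names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantClass:
    name: str
    receptor_activity: int  # hypothetical rank: lower means less CASR function
    serum_calcium: int      # hypothetical rank: higher means more blood calcium


# Made-up ordinal ranks that follow the ordering described in the transcript.
ALLELIC_SERIES = [
    VariantClass("strong loss of function", -2, 2),
    VariantClass("weak loss of function", -1, 1),
    VariantClass("wild type", 0, 0),
    VariantClass("weak gain of function", 1, -1),
    VariantClass("strong gain of function", 2, -2),
]


def is_monotonic_dose_response(series):
    """True if serum calcium strictly falls as receptor activity rises."""
    ordered = sorted(series, key=lambda v: v.receptor_activity)
    calcium = [v.serum_calcium for v in ordered]
    return all(a > b for a, b in zip(calcium, calcium[1:]))


def phenotype(variant):
    """Map the hypothetical calcium rank to the phenotype named in the text."""
    if variant.serum_calcium > 0:
        return "hypercalcemia"
    if variant.serum_calcium < 0:
        return "hypocalcemia"
    return "normal calcium"


if __name__ == "__main__":
    for v in ALLELIC_SERIES:
        print(f"{v.name}: {phenotype(v)}")
    print("monotonic:", is_monotonic_dose_response(ALLELIC_SERIES))
```

The point of the check is only that every step up in gene dosage moves the phenotype in the same direction, which is the property that makes an allelic series read like a dose response curve.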

Sun-Gou Ji: And having this allelic series of different types of mutation, where you have a very severe loss of function or a weak loss of function, a very strong gain of function and a weak gain of function, that correlates with a human phenotype, that’s the perfect genetic support for a target. And usually when you talk about the allelic series, everyone talks about PCSK9 for lipid metabolism. PCSK9 has been a beautiful story where you have gain of function and loss of function individuals, and the loss of function individuals are protected from high lipids and coronary artery disease. And PCSK9 inhibitors are not only used for monogenic hyperlipidemia. They’re used for the general population. And that’s the analogy that we could use for these CASR inhibitors: it’s not just for this autosomal dominant hypocalcemia type one monogenic disorder.

Sun-Gou Ji: It’s also for when you have this imbalance in calcium, which also leads to an imbalance in the parathyroid hormone. And usually when that happens, what you get prescribed is something like a calcium tablet, so that you get more calcium and increase your blood calcium. That normalizes your blood calcium, so it gets rid of a lot of the brain fog, other neurological effects, tingling, tetany or even seizures. But what it also does is increase the amount of calcium that has to go through your kidneys. And that would end up leading to kidney damage. So a lot of the ADH1 patients are actually struggling to balance controlling their serum calcium with calcium supplements against their kidneys kind of breaking. And that could actually happen to other people that may be using calcium supplements wrongly.

Sun-Gou Ji: And this allelic series that we see in CASR actually indicates that CASR inhibition as a therapeutic could be expanded from just the rare CASR and ADH1 disorders to more complex phenotypes associated with the calcium sensing receptor, especially anything influenced by calcium balance.

Grant Belgard: Many companies cluster around the same common rare diseases while ultra-rare conditions are left to non-profits. How do you decide which diseases to pursue, especially when patient populations or trial feasibility are unknown?

Sun-Gou Ji: That’s always a moving target, as you can expect. But one of the things that we really focus on is to let the science speak. Meaning, can we really get into the science of understanding the patient beyond the need and the biology of the disorder? And we call that connecting the dots from the genetic perturbation to the human phenotype. And where is the proposed treatment intervening in that whole pathway? So as I alluded to with the CASR example, for genetic support, the allelic series is the best. That’s the ultimate genetic support, a dose response curve, but it’s super rare. Interestingly, we either find things that are obvious and that everyone is working on, or stumble upon ones that no one is working on. If the rare monogenic disorder is too hard to make a drug for, it sometimes makes sense to go straight to the complex disorder. But usually that’s not for us.

Sun-Gou Ji: And we look for partners that are willing to take it on together for these larger indications that require significantly longer and more complicated trials.

Grant Belgard: So as we sequence more of the population, what are you seeing about prevalence, penetrance and variable expressivity of monogenic variants?

Sun-Gou Ji: Definitely a higher genetic prevalence, but lower penetrance and a wider phenotypic spectrum of expressivity. And this is definitely not new, right? Because pathogenic variants were observed in ExAC a long time ago, and these people were called super humans at some point. And that kind of led to the search for modifiers in these pathogenic monogenic variant carriers. And that still goes on today. And preceding our work on ADH1, you know, Hugh Markus’s work on monogenic stroke or Karen Wright’s work on neurodevelopmental disorders and many others consistently show that there are many people, a lot more than expected, that carry pathogenic variants, but the penetrance is much lower than we traditionally thought.

Grant Belgard: How do those findings complicate the way we define patients and measure unmet need in rare diseases?

Sun-Gou Ji: Yes, because of the much wider variance in expressivity that we’ve been talking about, it’s very important to capture all the phenotypes, not just the classical ones. And treatment starts from diagnosis, but diagnosis a lot of times is based on genetic testing. And there are just too many rare diseases out there. And if the symptoms observed in a patient don’t align with the classical symptoms of the genetic disease, genetic testing will not be recommended a lot of times and may only be considered when symptoms become too severe.

Sun-Gou Ji: So that’s why we’re learning that the unmet need of rare diseases today is actually harder to quantify properly. It again comes back to the ascertainment bias of the databases that we were talking about: a lot of these testing vendors would be severely biased towards more classical symptoms with severe phenotypes, whereas the general population will just not pick up enough of these rare, severe monogenic disorders to actually make sense of. So making sense out of those two is still going to be hard.

Sun-Gou Ji: And because of the variance in phenotypic expressivity, understanding the full spectrum of phenotypic expressivity is important. Meaning we should actually start from the genetics, get everyone that carries a pathogenic variant, and actually try to identify new phenotypes that are not classically associated with the traditional monogenic disorder. Expanding the phenotypic spectrum and defining it through a genetics-first approach would be important.

Grant Belgard: So how do you think this will change the definition of a monogenic patient and impact clinical trial inclusion and exclusion criteria for deciding who should be part of the trial and, later on, who should be treated?

Sun-Gou Ji: Well, it’s all going to be part of the continuum, right? You’ll have variants, and that’s a very difficult line to draw, right? Because it’s pretty clear when you think about, okay, do you carry a variant in a gene that has been pathogenic before? And there are a bunch of VUSs, so whether you are a pathogenic, likely pathogenic or VUS carrier may actually tell you that you have a mutation, but whether you have the disorder, that may be a very different thing. You may be a monogenic patient because you have the pathogenic variant, but do you have the monogenic disease? Maybe no, but then how do you say no? Like in the case of CASR, you have a monogenic variant in CASR that’s pathogenic. You have hypocalcemia, so then you are technically an ADH1 patient, but then when do you start treatment? It’s a different question too, right?

Sun-Gou Ji: Because then, when does it warrant treatment? It’ll be very different depending on the disorder and the safety profile of the drug. And that’s sort of the start of personalized medicine, right? That’s when you start understanding the genetics and then the phenotype that you’re seeing in that patient, and when you actually start treatment.

Grant Belgard: So you’ve talked about the importance of genetic support in drug development. What makes it such a powerful tool compared to other methods of validation?

Sun-Gou Ji: Yes, I would say, you know, genetic support is the only tool with predictive validity for clinical success. There is not anything else that I know of that has shown this reproducibly, a two to four times increase in success replicated across so many different groups. I wouldn’t really say it’s more powerful than any other tool, but it does provide an orthogonal point of validation of the therapeutic hypothesis that’s just basically not possible through models. Even the best models are just models, right? And although we have to be careful about the effect of a lifelong perturbation, which is a variant that you carry, or genetic support, versus a therapeutic intervention, which is a sudden change, it still provides a completely different validation for the target.

Sun-Gou Ji: But again, despite genetic support showing two times increased odds of success, whether genetic support alone provides any predictive validity is unclear. Because it’s genetic support, given the target had been tested in the clinic independent of any genetic support, that gives you this increased odds of success. So you always have this conditional, where a lot of these drugs were tested not knowing there was any genetic support. When you then look conditional on that set of genes that have been tested, you know, without knowing genetic support, you have this increased odds. But if you only have genetic support, does it actually give you any increase? We just don’t know, because there hasn’t been a drug that’s been tested just based on genetic support.

Sun-Gou Ji: And so it’s very powerful, and we are actively working on it, but it should not be a replacement for target prioritization and target validation.
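
Editor’s sketch: the conditional point above can be made concrete with a small odds-ratio calculation in Python. The counts are hypothetical, picked only so the ratio comes out at the “two times” figure mentioned in the conversation. They are not taken from any published analysis, and the function and variable names are invented for illustration.

```python
def odds(successes, failures):
    """Odds of success: successes divided by failures."""
    return successes / failures


def odds_ratio(sup_success, sup_fail, nosup_success, nosup_fail):
    """Odds of clinical success with genetic support relative to without it."""
    return odds(sup_success, sup_fail) / odds(nosup_success, nosup_fail)


# Hypothetical 2x2 table. Every target counted here already entered the clinic,
# usually for reasons unrelated to genetics, so the table is conditional on that.
CLINIC_TESTED = {
    "with_support": (20, 80),       # (successes, failures), made-up counts
    "without_support": (100, 800),  # (successes, failures), made-up counts
}

ratio = odds_ratio(*CLINIC_TESTED["with_support"],
                   *CLINIC_TESTED["without_support"])

if __name__ == "__main__":
    print(f"odds ratio among clinic-tested targets: {ratio:.1f}")
    # Targets with genetic support that were never taken into the clinic do not
    # appear in this table, so it cannot say what support alone would predict.
```

The design choice worth noticing is that both rows are drawn from targets that were already tested, which is exactly why the ratio says nothing about a program chosen on genetic support alone.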

Grant Belgard: And final question on the future of precision medicine. So in what way would routine newborn sequencing transform precision medicine?

Sun-Gou Ji: Yeah, this does come to quite a personal story too, because I have a one-year-old daughter who’s been recently diagnosed with a rare genetic disorder. And we were lucky enough to be living in Boston, you know, where our pediatrician knew to refer us to a specialist, who then quickly sent us to Boston Children’s, which diagnosed us within a couple of days and started treatment right away. You know, the nurses and doctors were so helpful, they were super supportive, full of empathy, and we’re so grateful for our care team. And this is what medical care in the US should be, right? It’s the best medical care. And of course, it’s best to not have a rare disorder, but we were lucky as it had been. But the one thing I regret, though, is that, you know, this is a genetic disorder.

Sun-Gou Ji: And I actually convinced myself that I didn’t want to get her sequenced when she was born. I sort of used the exact same logic that’s used against newborn sequencing to convince myself that I’d be overwhelmed with this information. You know, you’ll find these pathogenic variants and different VUSs, and am I going to be worried about them? But looking back, I feel like it was laziness on my end. If I had actually looked at her genome and had the information on the handful of genes with potentially bad variants, I would have reduced the search space for what I should prioritize. And is it possible that maybe I would have picked up her symptoms earlier, before it got this late? With the benefit of hindsight, I do feel like it would have been possible for me to catch this a bit earlier and get her treated before.

Sun-Gou Ji: Technically, this is very much possible now, right? The technology is all there, and the assays are as accurate as they can be. And the interpretation needs some improvement, but the only way to get better at interpretation is just by doing more. And there are various newborn sequencing efforts, of course, with the UK leading, and the GUARDIAN and BEACONS studies along with others in the US.

Grant Belgard: Well, what are your thoughts on whole genome sequencing versus whole exome versus targeted sequencing for newborns?

Sun-Gou Ji: I feel we should future-proof ourselves. And even for the UK Biobank, which released the whole genome set last year, they show an improvement over whole exomes in identifying these pathogenic and likely pathogenic variants, even within coding exons. And I just feel like there’s no reason to use these targeted approaches, especially for data generation. For interpretation, there could be a case to make, but we should just do whole genomes to future-proof ourselves and get the highest yield. And then the interpretation could help. And the data sets themselves could be very useful. It’s the first step. It will really help cases like my daughter’s early on by reducing or at least prioritizing the search space, because when you have a baby, you’re worried about everything. But if you know that she has something and you see signs of that, you would be a bit more careful.

Sun-Gou Ji: And I feel like just for that, it should be worth it. But going back to your question about whole genomes, whole exomes and targeted panels, in addition, I think the more exciting piece that I was thinking about traditionally as a scientist was the data generated, because it will be huge and so valuable for genetic research and drug discovery or development, because this is the true unbiased information of the population.

Sun-Gou Ji: I was talking to you about the ascertainment bias of the different biobanks and cohorts, but newborn sequencing will be the ultimate unbiased sampling of the population. That will open up the first door for precision medicine and would really help us understand the difference, not just in monogenic prevalence or in penetrance and expressivity, but also in common or complex disorders, and really expand how we think about human health with genetics and the start of precision medicine. And you would carry that information throughout your life, and whenever something happens, you have that background information, rather than waiting until something goes wrong and figuring it out.

Grant Belgard: Yeah, it’s interesting. You know, we’ve heard for years that this is coming and certainly at this point, it’s not a barrier of price, right? I mean, getting a whole genome sequence is a pretty negligible cost in the American healthcare system these days compared to everything else, but it’s still not routine. I wonder when that will finally flip.

Sun-Gou Ji: Yeah, it’s interesting. And also, I guess there are questions about privacy and who owns the data and who actually gets to analyze the data, and how do we make that equitable and maximize patient benefit over anything else?

Grant Belgard: Well, I guess that’s another challenge, particularly in the US healthcare system, right, is although there’s a ton of money spent, it is very fragmented from a data perspective, many different systems, et cetera, right? So that will be a challenge.

Sun-Gou Ji: This is like an operational problem rather than a technical or scientific problem now. And yeah, there are a lot of sensitivities and issues about it, but there are these pioneers who are trying to do these pilots across different institutes in different countries. And hopefully those will change the minds of governments.

Grant Belgard: Thank you so much for joining us. It’s been great.

Sun-Gou Ji: Thank you for having me.

The Bioinformatics CRO Podcast

Episode 77 with Ewelina Kurtys

Dr. Ewelina Kurtys, a neuroscientist at FinalSpark, discusses her experience bridging AI, neurotech, and business development in industry, and FinalSpark’s mission to build a remotely accessible platform using living neural networks as a biocomputing substrate.

Ewelina Kurtys

Ewelina Kurtys is a neuroscientist at the biocomputing startup FinalSpark, which is working to create a bioprocessor from human neural organoids.

Transcript of Episode 77: Ewelina Kurtys

Disclaimer: Transcripts may contain errors.

Coming Soon…

The Bioinformatics CRO Podcast

Episode 76 with Christopher Woelk

Christopher Woelk, an External Innovation Partner at Astellas, discusses his background in multi-omics and AI/ML and what he looks for in his current search & evaluation role embedded within therapeutic oncology research.

Christopher Woelk

Christopher Woelk is an External Innovation Partner at Astellas, which focuses on developing and supporting transformative disease therapies.

Transcript of Episode 76: Christopher Woelk

Disclaimer: Transcripts may contain errors.

Coming Soon…

The Bioinformatics CRO Podcast

Episode 75 with Chris Yohn

Chris Yohn, leader of CompBio Bridge, discusses his current experience with computational biology contracting and consulting, what companies are doing with computational biology right now, and how to most effectively bridge the gap between data science and the wet lab. 

Chris Yohn

Dr. Chris Yohn is a computational biologist who currently leads CompBio Bridge, which provides a fractional strategy and management practice to help biotech teams bridge data science with the wet lab.

Transcript of Episode 75: Chris Yohn

Disclaimer: Transcripts may contain errors.

Coming Soon…

The Bioinformatics CRO Podcast

Episode 74 with Phillip Meade

Dr. Phillip Meade, a leadership and culture advisor at Gallaher Edge, discusses his experience evaluating organizational culture and how to diagnose culture problems and build lasting habits for high-performance organizations.

Phillip Meade

Phillip Meade is a leadership and cultural advisor at Gallaher Edge, which provides executive coaching, leadership development, strategic guidance and culture management services for businesses and organizations.

Transcript of Episode 74: Phillip Meade

Disclaimer: Transcripts may contain errors.

Coming Soon…

The Bioinformatics CRO Podcast

Episode 73 with Nataraj Pagadala

Nataraj Pagadala, founder, president, and CEO of LigronBio, discusses his company’s goal of using molecular glues to target traditionally undruggable proteins as a route to new therapies for neurodegenerative diseases.

Nataraj Pagadala

Dr. Nataraj Pagadala is the founder, president, and CEO of LigronBio, which develops molecular glues to target traditionally undruggable proteins.

Transcript of Episode 73: Nataraj Pagadala

Disclaimer: Transcripts may contain errors.

Coming Soon…

The Bioinformatics CRO Podcast

Episode 72 with Sophia George

Sophia George, professor in the Division of Gynecological Oncology at the University of Miami Miller School of Medicine, discusses her research at the Sylvester Comprehensive Cancer Center investigating the genetics and biology of hereditary breast and ovarian cancer and working at the intersection of genomics, health equity, and cancer.

Sophia George

Sophia George is a professor in the Division of Gynecological Oncology at the University of Miami Miller School of Medicine and the principal investigator of the George Lab at the university’s Sylvester Comprehensive Cancer Center.

Transcript of Episode 72: Sophia George

Disclaimer: Transcripts may contain errors.

Coming Soon…

The Bioinformatics CRO Podcast

Episode 71 with Christiaan Engstrom

Christiaan Engstrom, founder and CEO of BLPN, discusses his experience building a space for authentic, non-transactional business networking in the life sciences.

Christiaan Engstrom

Christiaan Engstrom is founder and CEO of BLPN, an invite-only community for life science investors and senior executives to connect.

Transcript of Episode 71: Christiaan Engstrom

Disclaimer: Transcripts may contain errors.

Coming Soon…