Bioinformatics Done Right, Now

The Quiet Hero’s Guide to Shipping Diagnostics Faster


Why leaders who need bioinformatics for diagnostics work with The Bioinformatics CRO when timelines, budgets, and reviews all matter.

TL;DR for busy leaders

  • Move faster without adding headcount: shorten turnaround, keep backlogs under control.
  • Hourly, transparent pricing: estimates up front, timesheets throughout; rates typically not higher than fully loaded SF/Boston headcount.
  • Work like one team: we build in your repos, with your conventions—more like an extension of your bioinformatics function than a vendor.
  • Review‑ready habits: reproducible, version‑locked workflows; clear provenance and documentation aligned to your SOPs and change‑order process.
  • You own the work: 100% of the IP remains yours—code, results, and documentation.

The Reality You’re Managing

Most bioinformatics teams in clinical‑stage diagnostics are not “understaffed.” They’re undersized for the variance of the work: product launches, trial spikes, an unexpected rerun, a new assay, a new requirement, an urgent question from clinical operations.

And because you’re operating in a regulated environment, you can’t solve the backlog by moving faster in a sloppy way. You have to move faster while still being able to explain what ran, when, on which data, and why the output is trustworthy.

For outside help to be useful, it has to navigate that tension: speed without losing traceability.

Don’t think of it as outsourcing; think of it as capacity and craftsmanship that fit the way you already operate.

What Tends to Go Wrong with Typical CRO Help

You’ve probably seen some version of this:

  • Work arrives as a code dump or a set of plots without enough context.
  • The analysis is technically correct, but you still have to rewrite, harden, or document it to make it usable internally.
  • Timelines slip because the vendor isn’t embedded in your tooling, your conventions, or your change control.
  • The “handoff” creates a second project: making the work maintainable.

That’s not a talent issue; it’s a working model issue.

We’re built for the model where the work lands inside your organization cleanly.

How The Bioinformatics CRO Fits Into Your Team


1) Capacity That Behaves Like an Internal Team

We’re most useful when you need to increase throughput without committing to permanent hires.

Practically, that means:

  • We work in your repositories and follow your conventions.
  • We use your source of truth for requirements (tickets, specs, acceptance criteria).
  • We expect iteration. We don’t treat “v1 delivered” as the end; we treat it as the start of something you can actually run again.

The goal is that your team can pick up the work without reverse‑engineering it.

2) Regulated‑Aware Development, Guided by Your Process

We’ve supported regulated pipeline development as part of client teams. In that setup, your RA/QA function provides the regulatory guidance; we translate that into engineering choices and documentation habits.

In practice:

  • Versioned workflows with clear run instructions.
  • Traceability artifacts that travel with the code: provenance logs, parameter records, and validation notebooks where appropriate.
  • Comfort working with change orders, release notes, and controlled rollouts—because in your world, change is normal, but it has to be legible.
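The traceability artifacts above can be as lightweight as a structured record written alongside each run. As an illustrative sketch only (the function name, fields, and example values are hypothetical, not a description of any specific delivered tooling), a provenance entry might capture the pipeline version, the parameters used, and a checksum of the input data:

```python
import hashlib
import json
from datetime import datetime, timezone

def provenance_record(pipeline_version: str, params: dict, input_bytes: bytes) -> dict:
    """Build a minimal provenance entry: what ran, with which settings, on which data.

    Hypothetical sketch; real records would also capture container digests,
    reference genome versions, and operator identity per your SOPs.
    """
    return {
        "pipeline_version": pipeline_version,
        "parameters": params,
        # Checksum ties the record to the exact input that was processed.
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

# Example: record a (hypothetical) variant-calling run on one sample's reads.
record = provenance_record(
    "v2.3.1",
    {"min_depth": 30, "caller": "deepvariant"},
    b"sample_001 reads",
)
print(json.dumps(record, indent=2))
```

Appending one such record per run to a version-controlled log is often enough to answer "what ran, when, on which data" during a review, without any new infrastructure.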

We don’t pretend to be your regulatory authority. We do know how to build in a way that supports one.

3) Thoughtful Analysis That Doesn’t Stop at “Here Are the Plots”

Sometimes you need clearer interpretation to inform better decisions.

When it’s appropriate, we’ll flag things like:

  • QC thresholds that are quietly driving false calls
  • Feature drift or cohort effects that will matter later
  • Places where a pipeline can be simplified without losing rigor
  • Analysis choices that could improve assay performance

Not as a grand “strategy presentation,” but as practical notes tied to the data you’re already looking at.


How We Start (and what the first call is for)

The first conversation is intentionally simple. It’s not a “free consulting session.” It’s how we learn enough to be accurate.

You’ll meet a PhD scientist who will:

  • understand your assay context and what “done” means internally,
  • map the scope and constraints (timelines, inputs/outputs, tooling, documentation expectations), and
  • gather what’s needed to provide a clear estimate.

If a deeper technical dive is needed, a PhD bioinformatician typically joins the follow‑up call. The goal is to align on scope.

Working Model and Ownership

  • You own 100% of the IP: code, workflows, documentation, and results, both contractually and operationally.
  • Nothing is a black box: we deliver in your repos with READMEs, runbooks, and release notes suitable for your internal users.
  • Collaboration over handoff: you should be able to treat us like part of your bioinformatics department.

Pricing, Plainly

We work hourly.

You get:

  • Estimates up front with assumptions stated clearly.
  • Timesheets and checkpoints so spend doesn’t drift silently.
  • Rates not higher than fully loaded SF/Boston headcount.

Hourly pricing fits the reality that regulated work evolves: requirements sharpen, edge cases appear, change orders happen. We’d rather be transparent about that than force it into a fixed bid that breaks the moment the project becomes real.

Common Questions (without the sales framing)

“Will we lose control of the work?”

No. The work lives with you: your repos, your standards, your IP.

“How do you avoid becoming a bottleneck?”

By embedding into your existing workflow rather than creating a parallel one. The less translation required, the less time gets wasted.

“What happens when scope changes?”

We expect it. We work with change orders and clear checkpoints so you can decide early what’s worth doing now versus later.

“Will you understand our regulated context?”

We’ve supported regulated pipeline development and are comfortable operating under documented processes. You lead on regulatory guidance; we implement in a way that supports traceability and review.

Where This Tends to Work Well

  • Clinical‑stage diagnostics with continuous sample flow.
  • Teams with periodic spikes (launches, studies, new assays).
  • Programs where reproducibility and documentation are not optional.
  • Groups that want outside capacity without losing internal maintainability.

Next Step

If you’re considering outside support, the most efficient starting point is a short introductory call to map scope and fit, so we can give you an estimate you can trust.

The goal isn’t to replace your team or “outsource bioinformatics.” It’s to make sure the work moves through your org with less friction: faster turnaround, cleaner pipelines, fewer rewrites, clearer documentation, delivered in a way that feels like it came from inside your own department.

Call Now


When Data Won’t Sleep: Why Founders Choose The Bioinformatics CRO


For the biotech founder who wears too many hats and still wants to sleep at night.

There is a moment, often late in the evening, when the lab goes quiet but the data does not. Genomes, transcriptomes, single-cell atlases — each one a tide that keeps rolling in. The promise of your company lives inside those files. So does your next board update.

For the scientist‑founder discovering the next blockbuster drug, this is where anxiety begins. Hiring a full in‑house bioinformatics team would increase burn and shorten the runway. Chasing freelancers risks uneven quality and missed context. Buying another platform means lock‑in, not insight.

You don’t need a new tool. You need expert judgment on demand.

That is the work of The Bioinformatics CRO: providing expert, fast, and cost-effective bioinformatics services for biotechs.

The Problem Under the Microscope

You already know the science. Your wet‑lab team is strong. But bottlenecks creep in where code meets biology:

  • Backlogs swell after each sequencing run.
  • Decisions wait on “one more figure.”
  • Confidence slips when analyses are opaque or irreproducible.
  • Investor narratives lag behind the data.

The risk is not merely delay, but direction itself—the danger of missing weak signals or over‑reading noise. In a funding environment this competitive, errors and slow cycles cost twice: money now, credibility later.

Your Options, At a Glance

  • Hire a Head of Bioinformatics. Strong, but slow to recruit and costly to get wrong. You still need a team beneath them.
  • Solo consultants. Useful for narrow tasks; brittle for programs that change scope. Hard to maintain continuity.
  • Platforms. Great for routine pipelines; limited when the biology is messy or novel. Vendor lock‑in is a real problem.
  • The Bioinformatics CRO. Strategic leadership + flexible execution + niche breadth + cost control—together. That combination is rare, and it is what young biotechs actually need.

TL;DR (we know you’re busy)

  • What you get: Senior bioinformatics leadership, flexible hands‑on execution, reproducible pipelines, and investor‑ready narratives.
  • Why it matters: Faster decisions, lower risk, and a leaner core team.
  • Why us: Vetted experts, transparent pricing, 7+ years focused on biotech, and recognition as an Inc. 5000 honoree—with testimonials and publications to back it up.

Book a call now with our Director of Operations.


The Bioinformatics CRO: A Strategic Partner

We are a scalable bioinformatics partner powered by a vetted expert network—industry thought leaders working alongside experienced specialists. You get senior guidance when choices are hard, and flexible execution when timelines are tight. No bloat. No lock‑in. Clear deliverables you always own.

Why founders choose us:

  • Leadership on demand: Access senior strategists who have guided biotech programs across discovery and development. They help you frame the biological question, choose the right analysis, and avoid false trails early.
  • Elastic execution: Spin up the right mix of skills—single‑cell, spatial, long‑read, CRISPR screens, proteomics, metagenomics, image‑based transcriptomics, machine learning—then scale down when the push ends. You pay for results, not idle seats.
  • Breadth without compromise: One partner across all data types means fewer handoffs, faster cycles, and a coherent story from raw data to investor‑ready figures.
  • Quality you can audit: Reproducible pipelines, documented methods, and plain‑language readouts. Every analysis comes with a paper trail you can defend to investors, partners, and peer reviewers.
  • Transparent pricing: Clear rates and scoping. No hidden platform fees. No surprise renewals.
  • Speed to milestones: We turn complex data into simple narratives and publication‑grade visuals that support your next raise, partnership, or program decision.
  • Security and discretion: Strict NDAs, least‑privilege access, and industry‑standard data handling. Your IP stays yours.

What Changes For You

  • Backlog to clarity. Visible progress on your top analysis bottlenecks.
  • Figures that speak. Create the investor‑grade plots and narratives that make the science obvious, even to non‑specialists.
  • Decisions with confidence. Senior scientists review every analysis, flagging caveats and alternative interpretations, so your results are accurate, defensible, and ready for any audience.
  • A leaner core. Keep your internal team focused on the biology that makes your company unique. We have you covered.

How We Work Together

  • Scope the question. We align on decision‑drivers: What must be true to move the program forward or to satisfy the next funding gate?
  • Design the analysis. We select methods that fit your data and risk tolerance—choosing the simplest approach that answers the question.
  • Build for reuse. Pipelines are built or adapted for your environment (cloud or on‑prem) and delivered with documentation.
  • Run, review, refine. Results are reviewed by senior scientists and presented with transparent interpretation, empowering you to make confident, data-driven decisions.
  • Deliver and transfer. You own the code, the notebooks, the figures, and the knowledge. We can train your team or stay on-call.

Capabilities At a Glance

  • Next‑generation sequencing analysis: WGS/WES, RNA‑seq (bulk and single‑cell), isoform analysis, alternative splicing, eQTLs.
  • Spatial & imaging: Spatial transcriptomics, multiplexed imaging analysis, cell typing, neighborhood analyses.
  • Functional genomics analysis: CRISPR screens, combinatorial perturbations, hit calling, pathway mapping.
  • Multi‑omics integration: Genomic + transcriptomic + proteomic fusion, feature selection, patient stratification.
  • Machine learning: Predictive modeling, QC automation, biomarker discovery, model explainability.
  • Translational support: Cohort design, power analysis, figure preparation for manuscripts and decks.

Proof You Can Point To

  • Recognized execution: The Bioinformatics CRO is an Inc. 5000 honoree—providing external validation of sustained growth and delivery.
  • Focused experience: 7+ years contributing to biotech programs, from discovery to development.
  • Vetted network: A global bench of experts matched to your problem—senior leaders and strong mid‑level specialists.
  • Open record: Testimonials and publications available; ask for examples relevant to your modality and stage.

These are not just badges; they are risk reducers. Your board cares. So do we.

When to call us

  • A high‑stakes analysis is due for a board meeting or fundraising round.
  • New single‑cell or spatial data landed, and your team is at capacity.
  • You need to integrate datasets across modalities into one clear story.
  • You want reproducible pipelines and documentation your team can own.
  • You want an experienced voice at the table to challenge assumptions, not just take tickets.

Talk to The Bioinformatics CRO, and turn your backlog into forward motion.

Call Now

The Bioinformatics CRO Podcast

Episode 74 with Phillip Meade

Dr. Phillip Meade, a leadership and culture advisor at Gallaher Edge, draws on his experience evaluating organizational culture to discuss how to diagnose culture problems and build lasting habits for high-performance organizations.

On The Bioinformatics CRO Podcast, we sit down with scientists to discuss interesting topics across biomedical research and to explore what made them who they are today.

You can listen on Spotify, Apple Podcasts, Amazon, YouTube, Pandora, and wherever you get your podcasts.

Phillip Meade

Phillip Meade is a leadership and culture advisor at Gallaher Edge, which provides executive coaching, leadership development, strategic guidance, and culture management services for businesses and organizations.

Transcript of Episode 74: Phillip Meade

Disclaimer: Transcripts are automated and may contain errors.

Grant Belgard: Welcome back to the Bioinformatics CRO podcast. Today I’m talking with Dr. Phillip Meade, a leadership and culture advisor at Gallaher Edge, whose career has included extensive work inside NASA, particularly around organizational culture and return-to-flight moments after major setbacks. He’s collaborated across public and private sectors and co-authored a book on building high-performing cultures. Today we’ll translate those lessons for labs, universities, biotechs, and pharma: how to evaluate the strength of a culture, diagnose problems, and build habits that last, plus common pitfalls to avoid. Dr. Meade, thanks for joining us.

Phillip Meade: Good morning. Thank you for having me. I’m happy to be here.

Grant Belgard: So we’ll cover three arcs today, your current work and lens, how you got there, including time with NASA, and practical advice for leaders and teams in the life sciences. So to kick us off, in your current work at Gallaher Edge, what kinds of culture or leadership challenges are you most often being asked to help with right now?

Phillip Meade: The thing that we see most often is companies asking us to come in and help them because either they are in the process of growing and scaling or they want to grow and scale and they’ve hit a ceiling and they’re having trouble doing that. And so culture typically is one of those things that either is an enabler for scaling or it ends up being a roadblock that keeps them from being able to do the scaling that they’re wanting to do.

Grant Belgard: When you first meet an executive team, what signals, good or bad, do you look for the first hour?

Phillip Meade: There’s a few things that we typically see that demonstrates what we’re looking for in terms of a high-performing culture. Openness is one of them. Is every member of the executive team truly engaged and contributing or is there one or two key members that are really the ones that are doing everything and everybody else is sort of sitting there waiting and seeing what they do and hanging back? Another one is self-awareness. Are they really aware that when we’re talking about culture that they’re a part of it, that culture starts with them and so that this work is really about them and they’re a piece of it and they’re involved? Or are they talking about everybody else needs to change and this culture is about out there? And then another piece of it that’s very important is a willingness to be vulnerable.

Phillip Meade: Do they show that and demonstrate that willingness to actually let the guard down and take the armor off and be vulnerable as human beings? Or are they armored up and trying to present themselves that way?

Grant Belgard: How do you decide whether a client needs structural changes, leadership, behavioral changes, or both?

Phillip Meade: You know, it’s usually all of the above. It’s just a question of how much of each and how do we set those dials in there. When we talk about organizational culture and how is that created, people take cues for how they behave and what they believe about how they should behave. They take that from the leaders and what the leaders do and what the leaders pay attention to and what the leaders say and do and all of that, as well as from the structure. And so we really want to be intentional about all of that and be intentional about how do we design the behaviors that we want from the leaders and what are the leaders saying and doing, as well as how are we creating the structures and the experiences within the organization that people are seeing and responding to. And so it’s really a total design that we’re looking for from that perspective.

Grant Belgard: Many leaders feel they already talk about culture. What separates talk from traction?

Phillip Meade: I just touched on it a little bit in my previous answer, but first and foremost, it’s an intentional design. I think a lot of people think they’re doing culture just because they do things that are culture adjacent. Like they do things that are around, you know, employees being happy or feeling good in the workplace, but they haven’t done the work to intentionally design what is the culture that they want? How do they create that culture? What are the beliefs that they’re intentionally trying to create in their employees around that culture? And how are they creating those beliefs through the specific experiences that they’re creating? And what experiences are those? How are they doing those experiences? So if you haven’t intentionally designed that, then it is kind of just talk.

Phillip Meade: And so you want to have that level of intentionality to the design of what you’re doing so that you know, let’s just take the silly ping pong table in the break room. If you want to have a ping pong table in the break room, that’s great. Do you know why you have that ping pong table in the break room? You should know exactly why you have that ping pong table there, what that experience is designed to do. Is it what beliefs are you trying to create in your employees? And then what beliefs those are creating? What do those beliefs drive from a behavioral perspective from your employees? And how do those behaviors then help to create that culture and ultimately drive the strategy of your organization? So that’s the whole flow that you want to have from a design perspective. And if you don’t have that level of understanding, then you haven’t really designed your culture.

Phillip Meade: You’ve just bought a ping pong table and put it into your break room. And so it’s there’s nothing wrong with the ping pong table. It’s neither good nor bad, but you haven’t designed a culture around it.

Grant Belgard: What’s your go to way to align executive intent with middle management behaviors?

Phillip Meade: So you first want the senior leaders to demonstrate those behaviors, because if the senior leaders aren’t truly living it, it’s going to be very difficult to just look at the middle managers and say, you know, do what I say, not what I do. That never works. Secondly, you’re going to want to communicate those expectations clearly. It needs to be crystal clear so that they understand what is exactly expected of them. You’re going to want to align the systems and processes so that they have the ability to do what you’re asking them to do and that it fits into how they do their jobs and they’re rewarded for it. And then finally, if it’s skills based, you’re going to want to provide them with training.

Phillip Meade: And if it really is behavioral, you’re going to provide them with some behavioral change workshops that will support the behavioral change that you want from them.

Grant Belgard: If a team has strong technical results, but shows strain, missed handoffs, creeping burnout, how do you frame the problem without pathologizing people?

Phillip Meade: This is one of the things that we typically focus on with all of the organizations that we work with, because blame is actually one of the greatest drivers of organizational dysfunction. I mean, you see it in a lot of organizations, and it’s a huge waste of time and energy. We like to focus on contributions. And so in any time that there’s an issue that happens, there are many things that contribute to it. If you think about blame, blame is typically a game that we play where we try to figure out who was mostly responsible, and then we assign blame to them so that we can say it was their fault. And from an organizational standpoint, if you’re trying to think about how do we become most effective, that doesn’t make us most effective. We really want to figure out how do we diagnose how this happened? How do we correct that?

Phillip Meade: And how do we move forward and prevent this from happening in the future? So the way that we do that is we try to identify all the contributors to the situation, and then we figure out how do we prevent those contributions or shift those contributions so that this doesn’t happen in the future. And so we want to approach it from that standpoint so that people aren’t afraid that if I admit that I contributed to this, either through my action or inaction in some way, I’m not going to be in danger of becoming the person who is blamed as a result. And so we come together and we look. Everybody contributed in multiple ways through action and inaction. The system contributed to it. There were environmental contributors. We really look at exactly all the things that contributed to it, and then we say, okay, how can we shift those contributions in the future and get a different result?

Phillip Meade: And so that’s the way we want to start approaching things differently from now on.

Grant Belgard: How do you design for sustainability so the work outlives the initial consulting period?

Phillip Meade: You really want to embed it within the fabric of the organization. And that’s where, when we talk about true culture change is not a short-term project, this is why. Because oftentimes it can take a little while to really go through the whole process of getting it really embedded. But you want to build it into everything you’re doing.

Phillip Meade: Once you really understand the culture that you’re trying to create and what that looks like and have it well-defined, and you understand the behaviors that you’re looking for, and you understand the core values that you want, and what that really looks and feels like, and how to create this culture that you’re after, then you can build it into how you recruit, how you perform your interviews, how you onboard and introduce people into your organization so that they’re trained into your culture from the beginning. You can build it into your leadership development programs. You can build it into your executive development. You can build it into your performance management systems. You can build it into your succession management. You can build it into the language that you use in your organization and how you talk and speak and interact with each other.

Phillip Meade: And then, as I was talking earlier, you can build it into the experiences that you intentionally design into your organization that are part of the way that you do things as a company. And so, you know, as you’re doing that throughout the course of the year and the course of the life of the organization, you know these are the different experiences we have and why we’re doing it. And you can change those out and tweak those over time. But as you’re doing that, you know what you’re doing and why you’re doing it. And then, as you update it, you know how you’re updating it and why you’re doing that.

Grant Belgard: So, shifting gears to talk about your own career trajectory, what early experiences pointed you towards organizational performance and culture as your focus?

Phillip Meade: Well, you touched on it in the introduction. It was an abrupt change for me. It wasn’t a subtle shift. In 2003, the space shuttle Columbia disintegrated on re-entry, killing all seven astronauts on board. And in the wake of that accident, the Columbia Accident Investigation Board found that NASA’s culture had as much to do with the accident as the piece of foam that hit the wing. And I was asked to lead all of the cultural and organizational changes for return to flight because they grounded the entire space shuttle fleet until we could fix the culture. And so, that really set me off on sort of a life-altering path where I began looking into organizational culture and really how that impacts organizations and how important that is to how they perform.

Grant Belgard: When did you realize engineering, as of course you originally came up as an engineer, right?

Phillip Meade: Yeah.

Grant Belgard: Systems thinking could be applied to human systems.

Phillip Meade: Well, I mean, I will say it was a lifeline to some extent. I was trying to grasp for something to make sense of how do I figure this out? How do I solve for this problem of organizational culture? And I realized that an organization is a system. But the thing that I realized is that it’s not just any kind of system. It’s a complex adaptive system. And so, that’s where systems thinking came in. Because if you try to treat an organization like, you know, a car engine, you’re not going to get the right results. You have to treat it like the complex adaptive system it is. And so, when you shift your thinking and begin, you know, analyzing it and diagnosing it and working with it in that way, you get different results. So, a couple of pivotal mentors that I had, I worked with a couple of consultants very early on, Paul Gustafson and Shane Cragun.

Phillip Meade: They were very instrumental in helping me to learn a lot about organizational behavior. And, of course, I read a ton of books that helped me come up to speed on all of this. And I’ll say that one of the moments that helped shape my approach was really the fact that, you know, I thought that NASA had a great culture. And that’s really part of what freaked me out when I was asked to lead this culture change. Because I would have felt better if there were tons and tons of problems for me to solve. And I didn’t think that there were any. So, one of the moments that shaped my approach was that the results of a study was released right after I was asked to lead this. And it named NASA as the best place in the federal government to work. And it was like, okay, this just confirmed what I thought.

Phillip Meade: And so, it really shaped my approach because it confirmed that the way that we’re looking at culture might not be perfectly correct here. If culture caused this accident, and yet we’re the best place in the federal government to work, then what does culture really mean? And, you know, that’s where I came up with the fact that, you know, culture means more than just people are happy at work, right? It has to mean something more. And so, that really influenced my philosophy on organizational culture.

Grant Belgard: So, this might feed into the next question. What’s a belief you held earlier in your career that you’ve since updated?

Phillip Meade: So, beliefs that I held earlier in my career that I would have updated, I think I’ll go in a different direction on that one. I mean, I was very much an engineer in my early career. I was an electrical engineer. You know, they say you can’t spell geek without double E. And I had, I think one of the ones that is my favorite one to reminisce on is, I used to say, I can explain it to you, but I can’t understand it for you. And, you know, I had philosophies on communications that, you know, if I explained it, and I was technically accurate, and you didn’t get it, then that was your problem. And, you know, I grew a lot, you know, over my early career, realizing that being effective was more important than being right. And being effective meant learning how to work well with other people. And organizational culture, oddly enough, really is a lot about that.

Phillip Meade: Organizational culture is about how do you help human beings to work together effectively as a group. A lot of the psychology underpinnings that we use in the work that we do actually comes from work that was done with the Navy, because they were having challenges, trying to figure out how to put the most effective teams together in the control center of their ships. And their theory was, if we take the smartest, you know, best performer at each position and put them together on these teams, we should get the best performance. And they weren’t getting that. And they were confused. And you would think that that’s what you would get. But in reality, the best performance on a team comes from the teams that work best together, not from putting the best performers together. And so that’s what culture is all about.

Phillip Meade: Culture is about how do you get people and put them together that actually work well together. And in an organization, that’s what you need. You need people who feel good about themselves and have the ability that when you put them together with other people in that environment with other people, they all feel good working together. They feel good about themselves. They have the ability to adapt and interact with each other in ways that it makes the whole team perform better. Not just about each one of them trying to maximize how they work best individually, but the team suffers as a result of it. That’s not what you want as an organization. And so, you know, it’s ironic, but I was a part of that personally when I think back to how I performed individually as a young engineer.

Grant Belgard: So, diving a bit more into your learnings from your time at NASA, when people hear culture, they often picture perks, right? The ping pong table in the break room, as you mentioned. In mission-critical contexts, what does culture actually do?

Phillip Meade: Yeah, so this takes me back to the previous question, where I said that being named the best place to work in the federal government showed me that culture has to mean more than that, right? And so I define culture as being three things. First, it has to drive employee engagement, because you get so many benefits from that. There was a 2020 Gallup poll that said that disengaged employees have 37% higher absenteeism and 15% lower profitability. That drops down to the bottom line and translates into a cost of 34% of their salary. Engagement is huge; it’s a big deal. So having highly engaged employees is a big part of what culture does for you. And then, it also improves people’s lives.

Phillip Meade: And that’s a big part of what having an effective culture does. But the third thing that culture does is that it drives organizational performance and market success. And, you know, for a mission-critical organization like NASA, this means that it had to support mission success, which meant taking astronauts up to space and returning them back to Earth safely. I mean, safety was a huge part of that. And so, if it doesn’t do all three, it’s like, you know, three legs of a stool. If it doesn’t do all three, you don’t truly have an effective culture. I mean, I can think of examples of companies that have any two of those three, and I would argue it doesn’t have what I would call a truly effective culture. In some ways, it’s not doing good things. And so, when it has all three of those, and that’s what it takes to truly have an effective culture, and that’s what you want to be shooting for.

Grant Belgard: What did you learn about surfacing dissent and bad news in environments where schedule pressure and hero narratives play a big role?

Phillip Meade: Yeah. You know, I learned that human psychology is complex. NASA was an organization full of engineers, and we like to joke that they’re not really human beings. But they are human beings. And when you talk about organizational culture and what happens there, it all starts inside the human being, and it really is driven by human psychology. We don’t talk about it very often in our daily lives, but we’re all actively deceiving ourselves on a daily basis. It’s just part of what our human psychology does to protect us.

Phillip Meade: And so, you know, when we are afraid of something, when we’re afraid that something’s going to make us feel uncomfortable, when we’re afraid that we’re going to be unpopular, when we’re afraid that this isn’t going to align with the identity that I’ve created for myself, all kinds of funny things happen in our psyche, and we get behavior that you wouldn’t expect. And so, when you’ve got engineers that live in an environment where failure is not an option, and they don’t want to be the one that says that something’s impossible or something that can’t be done, and they’re tremendously committed to mission success, and they love their jobs, and they love doing what they do, and they’re working really, really hard and long hours to try and make something be successful.

Phillip Meade: They don’t want to be the one that holds their hand up and say, hey, I don’t think we can do this, or this isn’t possible, or we can’t get this done. There’s a lot of silent peer pressure to be successful, and to save the day, and to make things work, and to not do that. And it’s not overt, and nobody’s saying anything, and nobody would call them a bad name if they did that, but it’s all below the surface, and it’s all in the subconscious. And so, it makes it very, very hard to identify and see, which is why it’s so deadly. So, many organizations talk about psychological safety and practice what behaviors from senior leaders create or destroy it. It’s really about truly encouraging and rewarding the feedback and dissenting opinions, normalizing dissent and healthy conflict, and helping individuals to increase self-awareness.

Phillip Meade: You know, that self-deception that I was talking about that’s happening on a daily basis, educating people that that’s going on, helping people to know that that’s a piece of what’s happening, and helping us all to know and be aware of what we’re doing and what’s going on so that we can recognize it and combat it. Because noticing is the first step. Until we notice, there’s nothing we can do.

Grant Belgard: Could you share an example of aligning structure, for example, reporting lines or decision rights with the desired cultural behaviors?

Phillip Meade: Yeah. So, there’s two I’d like to talk about. One is a large-scale one, and the other is a sneakier one that I like to use as an example. The larger one was with the Columbia accident. One of the challenges identified after the accident was that, the way we were structured, the engineering and technical side, as well as budget, schedule, and safety, all rolled up to the program manager. A single point of accountability was managing all of that, and so the engineers felt they didn’t have their own voice. You had one human being trying to juggle responsibility for budget pressure and schedule pressure, as well as technical decisions and safety.

Phillip Meade: And so, afterwards, we split that out into separate technical authority and safety authority so that we did have the, again, we called it the three legs of the stool, but we had the three legs there where we had a program manager that was responsible for budget and schedule. And then we had a safety organization that was responsible for safety and a technical organization that was responsible for the engineering. And so, engineering, if they had a technical concern, they felt like they had a route that they could advocate all the way up and didn’t feel like they were having to go up to their boss who was more concerned about budget impacts than the technical concerns. And then the sneaky one that I want to talk about is an organization where they had quality assurance technicians that were responsible for safety and speaking up about safety concerns.

Phillip Meade: And they had to punch a time clock on a daily basis coming in to work. And the engineers that were working in this area didn’t have to punch a time clock. Nobody else had to punch a time clock. And for whatever reason, the quality assurance technicians, the story in their head as a result of punching the time clock was that management didn’t trust them to keep their time, that they distrusted them. And so, that’s the reason they had to punch a time clock. And so, they felt like because they weren’t trusted by management, then they created a similar distrust towards management, because trust is a reciprocal entity. So, if you don’t trust me, I’m naturally not going to trust you. That’s just the way that it works. And so, speaking up and raising safety concerns becomes harder. If I don’t trust management, it’s going to be harder for me to raise a safety concern.

Phillip Meade: And so, it was creating a challenge with raising safety concerns because there was a trust issue. And one of the root causes of this trust issue was this silly time clock that they were having to punch in and out of work. So, it’s just weird structural stuff. It’s all about the beliefs that are created in people through the environment that they live in and through the things that happen. And so, we create those unintentionally many times in ways that we never intended to do.

Grant Belgard: That’s interesting. Yeah. Because in the clinical trial arena, you do have this structural separation of the safety monitoring for the patients, but there’s typically not something like that in the earlier stages of drug development, before patients get involved. So, for leaders inheriting legacy systems and histories, where do you begin?

Phillip Meade: I always like to begin by learning as much as I can about why things are the way they are. I don’t like to change things until I understand the reasoning behind how they got there. Usually there are people, and there’s inertia, around the existing systems and processes. So honoring why it’s there, respecting that, taking the good for what it is, and then only changing the things that need to be changed, or building on what’s there, usually helps minimize some of the resistance from the people who are already involved. And you save time and energy too, because there probably are reasons why things are the way they are, so you’re not breaking things that don’t need to be broken or doing something that won’t work.

Grant Belgard: If you had a week inside a life sciences organization, how would you diagnose the culture quickly?

Phillip Meade: I would try to be as much of a fly on the wall as I could. I would just hang out, visit meetings, and listen. See how the meetings go, how much actual discussion happens: are people speaking up? Is there meaningful dialogue, and is there healthy conflict happening in those meetings? Follow people out into the hallway: is there more conversation after the meeting than there was in the meeting? Listen to the conversations happening in the executive meetings and what they’re asking to have happen. Then see what the managers at the middle level are telling their people. Are they telling their people the same things the managers at the upper level are telling them, or does the message get distorted by the time it reaches that level?

Phillip Meade: And do the employees, or do they understand the things that the leaders want them to know? Do they even know why they’re doing what they’re doing? Just that, that kind of a thing. You know, what is, what is the, what is the general vibe around the office feel like, you know, or do employees seem like they’re happy and enjoy being there? Or does it, does it feel like it’s a, it’s a drag hanging out at the office? You can learn a lot just by hanging around.

Grant Belgard: What questions would you ask at the bench level versus the executive level?

Phillip Meade: I probably would ask a lot of the same questions, honestly. I’d want to know if they understood what their strategy was. It might come out in different language, but I’d want to know: do you understand how you’re going to be successful as a company? What are the values here? How would you describe the culture? Do you know what it means to be an employee here? I’d probably also ask them how they liked working there.

Grant Belgard: How do you tease apart performance issues that stem from process, structure or relationships?

Phillip Meade: You really just have to dive in, start asking questions, and figure it out. A lot of it is trying to figure out, if there’s a challenge with the person doing the work, is it because they can’t do it, or because they won’t do it? Do they not have the ability because they don’t know how, or because something they need is missing? There are just so many different ways it can go. You just have to dig in, start asking questions, and figure things out.

Grant Belgard: For regulated environments, and of course drug development is fairly regulated, what cultural strengths and blind spots tend to show up?

Phillip Meade: Well, sometimes you’ll have a strength that comes from a feeling of sameness. There can be a sense of community or camaraderie that comes with being part of a particular community there. But a blind spot can come along with that: maybe there’s an over-reliance on standards or regulations to protect you from things. And that can be dangerous, because in all cases those are only as effective as the people who are following them. So you really have to depend on people to do what those regulations say.

Grant Belgard: When publication pressure or go/no-go gates loom, how do you maintain integrity of decision-making?

Phillip Meade: So first and foremost, I want to be honest: I haven’t dealt with this too much personally. But if I’m reading into the question correctly, I would say that as an organization, you want to make sure you’re structuring your incentives correctly. You don’t want to create a situation where you’re putting your employees into a no-win position, under undue pressure to do things in order to save their job, or whatever. So I think that’s what I would say there.

Grant Belgard: What are the telltale signs that a strong culture has drifted into groupthink?

Phillip Meade: I think, similar to what I said earlier about being a fly on the wall in a meeting, groupthink is obvious when everybody basically agrees to everything all the time. So I look for healthy conflict as a sign of a strong culture in many cases. I’d be looking for that type of healthy dissent: not arguing or fighting, but questioning and challenging, people with different ideas or different positions on things. That’s where you get the best decisions and the best ideas and the best innovation, and that’s what you want to see.

Grant Belgard: What’s your approach to decision rights clarity? Who decides who’s consulted, who’s informed?

Phillip Meade: I don’t think there’s a single answer to this one, because there are lots of different types of decisions. The idealistic answer is that you want the people who are affected to be involved in the decision. That’s not realistic in a lot of cases, but I would lean as far toward it as is practical, because the more you can involve the people who are impacted, the more buy-in you’re going to get. One thing people don’t often think about is that they misinterpret what it means to make a decision quickly. They think of the time to make a decision as the time it takes to actually decide. I would argue that the time you want to look at is the total time from when you start to when you finish implementation.

Phillip Meade: And so you may get from the beginning to making the decision quickly, but then your implementation may take three times as long if you don’t involve the right people. And sometimes it may take a little longer to get to the actual decision point, but then your implementation is, is a third of the time to actually implement it. So the total time is actually shorter when you involve more people. And, you know, you got to think through that. Obviously you can’t always involve all the people and you can’t, and sometimes it is too long. And the way I just described, it doesn’t work out. And that’s the reason I said, it depends and it’s not really super clear, but, you know, I would lean towards involving more people and trying to get, you know, implementation to go more smoothly and getting greater buy-in when, when you can’t, because it really does, it really does help.

Phillip Meade: And I think that right now, in many cases, people lean too far on trying to decrease the amount of individuals involved because it makes the deciding part go faster. But then I think they’re under, underweighting how much it increases the implementation portion of it.

Grant Belgard: That’s a good point. How do you cultivate leader self-awareness?

Phillip Meade: I mean, coaching is a great way to do that. We have some workshops that help increase leader self-awareness. Reading helps, once a leader decides they want to start improving their self-awareness, and then just starting to pay attention and notice things can be part of that process. But as with all self-improvement, it has to start with the desire from the individual themselves to improve.

Grant Belgard: So how do you adapt culture work as a company scales from 20 to 200 to 2,000, even 20,000? Life science organizations come in all shapes and sizes.

Phillip Meade: Yeah. I mean, you’re doing the same basic things; it’s just a matter of how you roll it out in tiers. We always like to start at the top and then roll it down: you start with the executive team, then move down to the layer below that, and then the layer below that. So you just have more tiers, and it takes a little more time. When you start to get up to 2,000 and above, you’ve got more mature, well-developed HR departments. So you begin to work with more well-developed HR systems and processes: learning management systems you’re now integrating with, really well-developed performance management systems and tools, and internal HR teams that you begin to integrate into and work with.

Phillip Meade: And so, you know, you’re, the work that we do begins to integrate with the people that they have and the work that they’re already doing. And so we begin to, to weave in, into that.

Grant Belgard: What’s the best small concrete habit a leader can start tomorrow?

Phillip Meade: You know, for me, I would say it’s to learn something new every day. One of the commitments I made a long time ago was that I was going to read every day, and so I try to read something new every day. But more generically, I would say just learn something new every day. I think that’s a great habit.

Grant Belgard: What are the top three mistakes leaders make that quietly erode culture over six to 18 months?

Phillip Meade: I think the top three are not communicating, not admitting mistakes and tolerating bad behavior.

Grant Belgard: Where have you seen well-intentioned values backfire?

Phillip Meade: I think there’s two ways that well-intentioned values backfire. The first is anytime the company, or its leaders, don’t actually live the values, or do something counter to them. That kills it right there. If people see that it’s basically a lie, it becomes immediately ignored or worthless to them. The other is when the values, as well intentioned as they may be, are over-general. Patrick Lencioni refers to these as permission-to-play values. I’m not opposed to them existing as permission-to-play values, but I would call them that and differentiate them from your true core values. These are things that almost every organization could claim to have, like integrity and respect and safety.

Phillip Meade: You know, it’s, it just feels so vanilla that a lot of times employees will look at those and they’re like, yeah, yeah. Okay. I don’t get it. You know, like it just, it just feels like it’s a platitude or, or something that is just being hung on the wall just to, just to do it because it doesn’t seem like there’s anything particularly special to it. Like, yeah, of course, you know, we don’t want employees to steal from us and, you know, everybody should have some basic respect from each other and you should expect not to die when you come to work. So that, you know, those things make sense. And so people just sort of blow it off that, you know, and they don’t pay attention to it. And so I think that those things are, are very well intentioned and there’s nothing bad to them, but it’s also very difficult to really get a lot of traction with them because they are so in most cases, vanilla.

Phillip Meade: And you know what, what Patrick Lencioni says is that, and unless you can truly argue that you have more integrity than 99% of the other companies in your industry, like it’s not really your core value, like it’s not what defines you. And so it’s, it’s hard to like, say this sets us apart. This is something that we’re going to hang our hat on and your employees see that. And it’s like, okay, like, yeah, we have integrity, but you know, it doesn’t really, it doesn’t really mean, you know, mean something special. And so it sort of just becomes this thing that we hang on the wall.

Grant Belgard: When culture change fails, what was the root cause of that failure most of the time?

Phillip Meade: Most of the time it comes down to a failure of leadership. Usually the leaders, the most senior leaders haven’t really truly bought into it and committed to it.

Grant Belgard: How do you prevent hero culture from undermining redundancy and documentation?

Phillip Meade: This goes back to what we were talking about a little earlier. This is a self-awareness issue. Hero culture is about me not truly having the self-awareness to realize that I’m trying to make myself feel better by becoming the hero. It’s a defensive mechanism kicking in, where I’m just trying to prevent myself from feeling bad; it’s a part of my identity that I’m trying to protect. And so we want to raise that self-awareness so that it doesn’t happen.

Grant Belgard: What’s the smallest viable step an individual contributor can take to strengthen culture?

Phillip Meade: The smallest viable step I would say is to increase your courage by 1%. If you increase your courage by 1%, then you’re going to increase your openness by 1%, which means that you’re going to increase the feedback that you give to others by 1%. And you’re going to increase the self-accountability that you have by 1%. And you’re going to increase the initiative that you take by 1%. You’re going to increase the contributions that you make by 1%. You’re going to increase your performance by 1%. I think if, if everybody in the organization were to do that, I think that you’d start to see visible changes in the culture.

Grant Belgard: What book, practice, or question has stayed useful across contexts?

Phillip Meade: I think the thing that has stayed useful across contexts, and I’ll go with a practice here, is getting curious. It’s something I’ve had to learn, and I’m not necessarily proud of it, but one of my tendencies, and probably a reason why I’m sitting here answering all these questions really quickly for you on a podcast, is that I like being an answer guy. People come to me and ask a question, and I’m really quick to have an answer. A practice I started developing as a leader was to not answer the question immediately, to get curious, to ask more questions and try to learn more: okay, what’s going on here?

Phillip Meade: Or when someone would say something and I thought that they were wrong or I didn’t, you know, I thought that I had the answer and they didn’t, they were, they didn’t understand, get curious and figure out, well, why do I think that they’re wrong and I’m right? That’s been very, very useful to me across a lot of contexts to just try to get more curious instead of assuming that I always know the answer, that I always had the, you know, the right answer and that everybody else is wrong is very, very useful.

Grant Belgard: So what options do our listeners have to get more engaged with you through your work at Gallaher Edge? I know you have a book, you offer courses, you have consulting, and so on.

Phillip Meade: Yeah, absolutely. You pretty much summarized it. We have a book they can get on Amazon; it’s called The Missing Links: Launching a High-Performing Company Culture. You can go to our website, Gallaheredge.com, and check us out. We offer individual workshops as well as consulting engagements. We have an on-demand leadership development course in a microlearning format, and it’s a great way to get introduced to us and see what we’re all about. We also do speaking, so if you’re looking for a speaker for an event, that’s another way we can come and help you out.

Grant Belgard: Great. Dr. Meade, thank you so much for joining us.

Phillip Meade: Thank you, Grant. I really appreciate it.

The Bioinformatics CRO Podcast

Episode 73 with Nataraj Pagadala

Nataraj Pagadala, founder, president, and CEO of LigronBio, discusses his company’s goal of using molecular glues to target traditionally undruggable proteins as a route to new therapies for neurodegenerative diseases.

On The Bioinformatics CRO Podcast, we sit down with scientists to discuss interesting topics across biomedical research and to explore what made them who they are today.

You can listen on Spotify, Apple Podcasts, Amazon, YouTube, Pandora, and wherever you get your podcasts.

Nataraj Pagadala

Dr. Nataraj Pagadala is the founder, president, and CEO of LigronBio, which develops molecular glues to target traditionally undruggable proteins.

Transcript of Episode 73: Nataraj Pagadala

Disclaimer: Transcripts are automated and may contain errors.

Grant Belgard: Welcome to The Bioinformatics CRO Podcast, where we talk to scientists, founders, and leaders at the intersection of computation and biology. I’m your host, Grant Belgard. I’m joined today by Dr. Nataraj Pagadala, founder, president, and CEO of LigronBio. LigronBio is a biotech company focused on molecular glue therapeutics, small molecules that co-opt the cell’s own protein degradation machinery to go after proteins that have traditionally been considered undruggable. The company is applying computational chemistry, bioinformatics, and AI-driven platforms like its TriMatrix Analyzer to design these glues and target neurodegenerative diseases and other serious conditions where new therapies are badly needed.

Grant Belgard: Nataraj has more than two decades of experience in computational drug discovery, spanning academia and industry from early work in biochemistry and bioinformatics through postdoctoral and research roles, modeling protein structures and aggregates, to senior positions in biotech and now founding his own company. Today, we’ll talk about what he’s working on now at LigronBio, how his career path led him into molecular glues and company building, and the advice he has for students, trainees, and scientists who are now thinking about careers in computational drug discovery, or even starting their own companies. Nataraj, thanks for joining us. Great to have you on the show.

Nataraj Pagadala: Thank you very much, Grant. Thanks a lot for giving me this great opportunity to speak to the molecular glue audience and the targeted protein degradation companies. This is Nataraj Pagadala, founder and CEO of LigronBio. LigronBio was incorporated in 2023 and works in the targeted protein degradation space, developing molecular glues for undruggable targets on the oncology side and in neurodegenerative diseases, mainly focused on Alzheimer’s, with later extension to Parkinson’s and ALS therapeutics. Primarily, we are developing a platform called the AI TriMatrix Analyzer to rationalize and discover molecular glues for specific undruggable targets in the Alzheimer’s space, and this is linked with a diagnostic kit called the L-tag assay.

Nataraj Pagadala: This particular L-tag assay will help in the functional studies of these molecular glues to take it further for preclinical studies and also for clinical trials. So, this is a powerful engine linked with generative AI that will help in discovery of these molecular glues within 36 months.

Grant Belgard: So, for members of the audience who have never heard of molecular glues, what are they?

Nataraj Pagadala: Molecular glues are small molecules whose medicinal chemistry properties are similar to those of traditional drug molecules. The difference is that molecular glues degrade proteins, whereas traditional drug molecules inhibit proteins in the biological system. For the undruggable targets, there are basically no binding pockets, even though these targets help drive the progression of disease, unlike the proteins that can be inhibited by traditional drug molecules. That is the reason why molecular glues are designed especially for the undruggable targets, for protein degradation.

Grant Belgard: When you explain your company’s mission to someone with biology background, what do you emphasize first, the disease areas, the modality, or the technology platform?

Nataraj Pagadala: So, basically, our mission is to design molecular glues for any disease-specific protein, mainly the undruggable ones. At the same time, our mission is to do targeted protein degradation for these diseases, helping to reduce those proteins in the biological system and slow disease progression. Our vision is very broad: to develop molecular glues for all the undruggable targets. And saving future generations from Alzheimer’s is a very big part of our mission.

Grant Belgard: Are there any currently approved molecular glues?

Nataraj Pagadala: Yes. There are a couple of approved molecular glues. One is pomalidomide, and another is lenalidomide, which is on the market as Revlimid for multiple myeloma. And it is a very big market for these molecular glues in multiple myeloma.

Grant Belgard: So, what convinced you that there is space for a new company in this area?

Nataraj Pagadala: So, basically, if you look at the last 10 to 15 years, many companies have been developing molecular glues in targeted protein degradation, but unfortunately, none of them were completely successful in developing molecular glues for a specific disease or a specific target, because discovery has depended on serendipity. This is the reason why LigronBio came into the picture. Rationalizing and discovering molecular glues is a very difficult task, so we are developing from scratch. This is the primary reason we are building the TriMatrix Analyzer platform, which rationalizes molecular glues for a specific target using generative AI to help in discovery, for one thing.

Nataraj Pagadala: And also, at the same time, this particular platform also, you know, finds out all the off-target interactions, you know, that way we can eliminate all the serendipity problems within the biological system to develop a molecular glue for a specific target without any off-target interactions. That is the reason why LigronBio is a novel compared to all the existing platforms worldwide in terms of, you know, data integration with the AI and also high selectivity and specificity.

Grant Belgard: Neurodegeneration is notoriously difficult. What aspects of those diseases make them feel particularly well-suited for a molecular glue approach?

Nataraj Pagadala: Basically, in neurodegenerative diseases like Alzheimer's, there are many undruggable targets in the biological system that drive the progression of the disease, not only in oncology but also in the neurological space. As long as these undruggable targets exist in the biological system, it is very difficult to inhibit the progression of Alzheimer's, Parkinson's, or ALS. Unfortunately, targeted protein degradation has not yet been introduced into the neurological space, and people have not been successful there so far. That is why we need to develop these molecular glues and eliminate these toxic, undruggable proteins from the biological system.

Nataraj Pagadala: That way, we can slow down disease progression, restore memory function, and reduce cognitive decline. That is where the importance of molecular glues comes in for neurodegenerative diseases.

Grant Belgard: How do you balance going deep on a few carefully chosen targets versus exploring widely across many possible targets with your platform?

Nataraj Pagadala: So, basically, this platform designs molecular glues for any specific target, even when there is no three-dimensional structure of the protein from crystallography or any other method. It designs the molecular glue just from the amino acid sequence. If you look at undruggable targets, there is a motif called a degron, typically six to seven amino acids, at most ten. Based on that, the platform designs the molecular glue. Even a layperson who doesn't know how to design molecular glues can simply input an amino acid or peptide sequence, and the platform will develop a molecular glue.

Nataraj Pagadala: That’s where this particular platform is completely different from all the existing platforms worldwide.

Grant Belgard: What kinds of collaborations or partnerships are most important for a company like yours at this stage?

Nataraj Pagadala: So, at this stage, because the experiments for targeted protein degradation are different from the traditional way, we need partners who are well-versed in the targeted protein degradation space. This is where we need partners like BMS, which is working on targeted protein degradation, or C4 Therapeutics or Kymera Therapeutics. These companies are working on protein degradation, but unfortunately not especially on molecular glues; they work on another modality called PROTACs. There are some companies working especially on molecular glues, but they have not been successful so far.

Nataraj Pagadala: So, we can help those kinds of companies. We can partner with them to design molecular glues with this platform and help them achieve targeted protein degradation with molecular glues. That is where we can partner with and help those companies.

Grant Belgard: When you think a few years ahead, what would success look like for LigronBio?

Nataraj Pagadala: A few years ahead, right? To be honest, funding used to be much more flexible than it is now; funding is quite hard at the moment because many companies have not been successful. Otherwise, by today LigronBio might have developed molecular glues for Alzheimer's therapy, and we might have at least reached clinical trials for Alzheimer's therapy and reached patients.

Grant Belgard: And is the vision to accomplish that through partnerships or are you planning on sponsoring trials as Ligron?

Nataraj Pagadala: Yeah, actually, we are trying to run our own clinical trials. At the same time, we are looking for big partners. Once we complete the initial phase of studies and file the IND, we want big partners to step in and run the clinical trials as a joint collaboration with LigronBio.

Grant Belgard: What do you see as the main advantages and disadvantages of molecular glues compared to more traditional small molecule approaches?

Nataraj Pagadala: The most important advantage of molecular glues is that degradation is an event-driven mechanism, so the therapy is more effective for any disease than inhibition. That is the major difference between molecular glues and traditional inhibitors, which work by an occupancy-driven mechanism: the effect on the disease state lasts only as long as you keep taking the drug. With molecular glues, even after the drug is eliminated from the biological system, the effect persists. That is why the efficacy is very high compared to traditional molecules; the effect can be 100 times more than a traditional drug molecule.

Nataraj Pagadala: And not only that: molecular glues can treat targets that are notoriously undruggable in the biological system and that drive disease progression. As I said earlier, as long as these proteins are not eliminated from the biological system, the disease keeps progressing; that is why we cannot stop cancer progression or neurodegeneration. Traditional methods cannot deal with those undruggable targets. Only molecular glues can help in that situation and inhibit disease progression.

Grant Belgard: What makes designing molecular glues hard, scientifically or computationally?

Nataraj Pagadala: Basically, as I said, molecular glues influence the target protein through a simple motif called a degron, which is at most about 10 amino acids. This is not a catalytic site. Traditional drug molecules work on a catalytic site, whereas a glue works through this solvent-exposed motif, so the formation of the ternary complex is very, very difficult with molecular glues. That is one difficulty. And because the degron is only about six to ten amino acids, there is a serendipity problem: most kinases contain this kind of degron of six to seven amino acids.

Nataraj Pagadala: That is why there is a high chance of off-target interactions with molecular glues, and why we need to eliminate those molecules. The TriMatrix Analyzer platform is the one that eliminates these off-target interactions and gives highly specific molecules that show targeted protein degradation.

Grant Belgard: How do you think about modeling ternary complexes and cooperativity when you’re working with molecular glues?

Nataraj Pagadala: So, for modeling, as I said, we are training on a very big database of ternary complexes from the literature and from our own in-house experimental studies. We are also mapping the proteome in the biological system for all the undruggable targets. Using generative AI and large language models, that helps us see how a molecular glue behaves, especially the off-target interactions. Once we eliminate those off-target interactions, it is easy to design a molecular glue for a specific target. That is how we are building the TriMatrix Analyzer platform.

Nataraj Pagadala: And also, because most targets don't have a three-dimensional structure, another advantage of this platform is that we can still develop a molecular glue for a particular target just from an amino acid sequence as input. For most companies, most targets have no three-dimensional structure, and without a three-dimensional structure there is no molecular glue. But the TriMatrix Analyzer platform can do this. At the same time, to find ternary complex formation, most companies use diagnostic kits based on fluorescence.

Nataraj Pagadala: Those kits only indicate whether the ternary complex has formed or not, and in most cases the kit's result is not replicated when it is taken into the experimental, biological setting. But we are developing a diagnostic kit, which we call the LTG assay, that gives information about how the ternary complex is formed, like an alternative to x-ray crystallography. With it, we can clearly see how the ternary complex forms. This is the difficulty all the big companies are facing right now, and that is what we want to make easier for them with our TriMatrix Analyzer platform and our diagnostic kit.

Grant Belgard: How do you decide which parts of the problem to treat with more traditional physics-based structural biology approaches versus more data-driven AI-ML approaches?

Nataraj Pagadala: So, basically, physics-based approaches are mostly for traditional therapy, for proteins that have a three-dimensional structure and a catalytic site; there it is easy to design drug molecules with physics-based approaches. Data-driven approaches come into the picture where we don't have proper structures of the protein. As I said, for all the undruggable targets we need lots and lots of data to develop one molecular glue for a specific target.

Nataraj Pagadala: This is where machine learning and artificial intelligence come into the picture. AI and ML are also useful for traditional therapy, but they are not strictly needed there: we can still develop a drug molecule for proteins that have three-dimensional structures and catalytic pockets without them. But without data-driven approaches and without AI and ML, it is very, very difficult to design a molecular glue for undruggable targets.

Grant Belgard: How important is experimental feedback for your models and what does that loop look like in practice?

Nataraj Pagadala: Basically, the experimental studies are very important because it is very, very rare to see effective targeted protein degradation by a molecular glue at the beginning. So the experimental side is very important. There are many factors we need to work out in targeted protein degradation, especially with molecular glues, because PROTAC development is completely different: it is easier to detect targeted protein degradation with PROTACs. But a molecular glue is a small molecule, and it influences the target protein through a small motif. Sometimes we don't know how the degradation is happening, or whether the ternary complex has formed. It is a very complex system with molecular glues.

Nataraj Pagadala: That is why the experimental data matter. If you check hundreds or thousands of molecular glues, sometimes you end up with no molecular glue showing targeted protein degradation. So one piece of experimental data, one demonstrated targeted protein degradation, gives clues for many, many stages of molecular glue development in the biological system.

Grant Belgard: Where do you see the biggest gaps right now in this space? If you could choose one particular type of data to just have a lot more of, or better data of, what would that look like?

Nataraj Pagadala: So, basically, the main gap, especially for molecular glues, is that we don't have ternary complex structures, so we cannot design the glues. From x-ray crystallography, we know how the ternary complex forms in only five or six cases. And when these undruggable targets form a ternary complex, it is a very big complex; it is often very difficult to solve its three-dimensional structure by x-ray crystallography because of its complexity. That is where the difficulty in the molecular glue area comes from.

Nataraj Pagadala: That is why we need to do computational studies at the beginning, generating enormous amounts of data and mapping all the ternary complexes. That gives us clues for the experimental studies. If the experiments replicate what the computational studies predicted, then we can say, yes, this is what is happening. And from that, we can generate more molecular glues for other targets in a more data-driven way through AI and ML.

Grant Belgard: So, to talk about your career, looking back, what were the big inflection points that shaped your career in computational drug discovery?

Nataraj Pagadala: Basically, I did my PhD in computational chemistry in 2007. After that, I did four years of postdoc at the University of Alberta and one year of postdoc at KU Leuven in Belgium. I have 25 years of experience, but all of my career I worked the traditional way, developing drug molecules for proteins that have binding pockets. I have a very strong track record in computational drug discovery over the last 25 years, with international publications, and I was rated as one of the eminent scientists in computational chemistry by Carnegie Mellon University. But unfortunately, I never worked on targeted protein degradation before I started my career at [biotherics].

Nataraj Pagadala: There, my journey in targeted protein degradation changed. After going into in-depth analysis, I realized this is not a simple thing. I wanted to show the world, with all my experience, how molecular glues can be designed, and not only molecular glues, but how targeted protein degradation can be done easily. That is why I started on this path. That is where the inflection point came in my career: to show the world how we can do this, and with it how we can reduce the progression of Alzheimer's, Parkinson's, ALS, and other devastating diseases.

Nataraj Pagadala: With this technology, we can definitely protect future generations. We know the COVID-19 pandemic created havoc across the entire world; half the world was affected. That is part of why I changed my career: I want to do something in disease therapy, to show how we can stop diseases or inhibit their progression and protect future generations from these devastating diseases.

Grant Belgard: What gave you the confidence to start your own company doing this?

Nataraj Pagadala: So, basically, my experience from the last 25 years. As I said, I have a strong track record in computational drug discovery, a PhD, five years of postdoc, publications, and the eminent scientist rating from Carnegie Mellon University. Based on my career and my track record, my way of doing drug discovery is a little different from other people's in terms of thinking and implementation. That gives me confidence that my approach will definitely help inhibit the progression of these diseases.

Nataraj Pagadala: Beyond my computational chemistry, my other source of confidence is my background: biochemistry, mainly, with a PhD from a genetics department, and I am well-versed in molecular biology and other aspects of biology. So I can easily connect my biochemistry experience with my computational chemistry, drug discovery, and biophysics experience, and think about how a drug molecule works in the biological system. With great expertise across all these subjects, it is easy for me to design molecular glues. That is where my confidence comes from. And not only that: I don't need big laboratories to develop these drug molecules.

Nataraj Pagadala: I can sit at home and design molecular glues on the computer with all my expertise. That is why I started this company: all my expertise, and the ability to discover these drug molecules without laboratory space.

Grant Belgard: Have there been any particularly helpful pieces of advice from other founders or mentors that have changed the way you run the company?

Nataraj Pagadala: Actually, very few people work on molecular glues. Apart from the very big companies, like [?], C4 Therapeutics, Kymera Therapeutics, and BMS, I personally feel I am the only one who started as a startup developing molecular glues and a platform. I have not seen any other founder developing molecular glues to date.

Grant Belgard: What’s something about the founder-CEO role that you didn’t appreciate until you were actually doing it?

Nataraj Pagadala: Yeah, actually, earlier, when I was working at different companies, my ideas were not taken into consideration. But as CEO of the company, when I was developing the TriMatrix Analyzer platform, designing molecular glues, and building the diagnostic kit, people started seeing me as a different kind of person. There are people with 10 to 15 years of experience who, even with all that experience, were unable to figure out how ternary complexes form and how targeted protein degradation happens in the biological system.

Nataraj Pagadala: But as CEO of LigronBio, within a short period of time, people came to see me as someone exceptional who can deal with these particular problems and help the community, society, and future generations affected by Alzheimer's and other devastating diseases.

Grant Belgard: From your perspective, what are the most underrated skills for computational scientists who want to work closely with wet lab teams?

Nataraj Pagadala: With wet lab teams, this is complex biology. I want to work with people who are well-versed in neuroscience and, especially, in targeted protein degradation with molecular glues. Without that, it is very difficult to understand and run the experiments in the laboratory. So I prefer people from that particular background.

Grant Belgard: Where do you think molecular glues will realistically be in 10 years? A niche modality or something more mainstream?

Nataraj Pagadala: Yeah, as of now, molecular glues are a high priority for different companies, including big companies like J&J, because they are small molecules: as I said, brain penetrant, gut penetrant, and membrane permeable. So far, about $24 billion has been deployed in molecular glue development by different companies and VCs. In the next 10 years, molecular glues are going to take the number one place compared to traditional drug molecules, because, as I said, the effect of a molecular glue can be 100 times more than a traditional drug molecule. So it is the number one priority for the next 10 years.

Nataraj Pagadala: And not only that, molecular glues are going to affect disease therapy, especially for Alzheimer's. In the next 10 years, there is a high chance that a molecular glue therapy for Alzheimer's will come into existence and inhibit its progression. That would be a stepping stone toward reversing Alzheimer's. If that happens in the next 10 years, trust me, molecular glue therapy will also help reverse Parkinson's, ALS, and other devastating diseases, even cancer progression. We can inhibit cancer progression by 30 to 40 percent. That increases the lifespan of patients and helps the families affected by these devastating diseases.

Grant Belgard: Is there a misconception about molecular glues that you wish you could correct for everyone listening?

Nataraj Pagadala: Actually, yes. Basically, people think molecular glues are very difficult to design, and that they have high serendipity and off-target toxicity. That is what people think about molecular glues. But if you design properly, right from scratch, we can design a molecular glue with high target specificity. Over the last 10 years, whatever the intended target, designers have ended up hitting the same targets repeatedly and showing degradation there, because there is a problem in how the molecular glues are being designed. If you do it properly from scratch, we can design molecular glues without off-target toxicity, quite easily.

Nataraj Pagadala: So, the misconception is that molecular glues cannot be designed easily. That is the misconception among different companies all over the world.

Grant Belgard: Finally, if listeners remember just one thing from this conversation, what would you want it to be?

Nataraj Pagadala: Yeah. At LigronBio, we are unlocking the undruggable targets for Alzheimer's and other neurodegenerative diseases with molecular glues. That is where we are pioneers in molecular glue discovery.

Grant Belgard: And how can listeners or potential investors connect with you to learn more?

Nataraj Pagadala: So, basically, through email and through my website, where all the information is given. Please contact me if you want any kind of collaboration or any help designing molecular glues with our TriMatrix Analyzer platform. I'm here to help in a very effective way. We can reduce the time and cost of your research, and we can design a molecular glue for sure within less than 36 months. All the details are given on the website. My email is npagadala@ligronbio.com, and my cell number is 412-863-3812. Please reach out through any of these channels. I'll be here to help as much as I can. Thank you.

Grant Belgard: Nataraj, thank you for joining us.

The Bioinformatics CRO Podcast

Episode 72 with Sophia George

Sophia George, professor in the Division of Gynecological Oncology at the University of Miami Miller School of Medicine, discusses her research at the Sylvester Comprehensive Cancer Center investigating the genetics and biology of hereditary breast and ovarian cancer and working at the intersection of genomics, health equity, and cancer.

On The Bioinformatics CRO Podcast, we sit down with scientists to discuss interesting topics across biomedical research and to explore what made them who they are today.

You can listen on Spotify, Apple Podcasts, Amazon, YouTube, Pandora, and wherever you get your podcasts.

Sophia George

Sophia George is a professor in the Division of Gynecological Oncology at the University of Miami Miller School of Medicine and the principal investigator of the George Lab at the university’s Sylvester Comprehensive Cancer Center.

Transcript of Episode 72: Sophia George

Disclaimer: Transcripts are automated and may contain errors.

Grant Belgard: Welcome to the Bioinformatics CRO Podcast. I’m your host, Grant Belgard. Today we’re joined by Dr. Sophia George, a full professor in the Division of Gynecologic Oncology at the University of Miami’s Miller School of Medicine and a member of the Sylvester Comprehensive Cancer Center. Her lab investigates the genetics and biology of hereditary breast and ovarian cancer and works to close gaps in cancer outcomes across the Caribbean, Africa, and the wider African diaspora. We’ll talk about what her team is doing now, how she got here, and what advice she has for scientists and clinicians working at the intersection of genomics, health equity, and cancer. Dr. George, welcome.

Sophia George: Good morning, hi.

Grant Belgard: Morning. So if you were explaining your lab’s mission to a first-year undergrad, how would you describe the problem you’re trying to solve right now?

Sophia George: Yes, "right now" is a great way to put it, because it has changed a little bit. What we are ultimately interested in is understanding drivers of cancer, particularly those drivers that lead to more aggressive disease and poor outcomes. And then we take into account the context surrounding those drivers. As a molecular geneticist, I used to think only about the DNA and sometimes the RNA. But now we know that the DNA is not in isolation, and neither is the RNA: it's within cells, within people who are also exposed to factors beyond the genome. And so that's what we do.

Grant Belgard: What questions are at the top of your list this year and why those?

Sophia George: So questions like: how can we distill spatial and temporal influences on the genome? Spatial meaning where people are, so geography. And temporal meaning how long they have been there, and I'm not thinking thousands of years, but more like tens of years. And how those exposures lead to the transcriptional signatures that we see in the tissues we're studying.

Grant Belgard: And what kinds of data are most central for you at the moment? Genomic, transcriptomic, imaging, clinical, something else?

Sophia George: Yes, everything, everything, which makes the work very interesting and the days long, long, long. We are looking at epigenetic data using assays that can tell us about DNA methylation. We're using epigenomic assays like CUT&RUN and CUT&Tag. We're using single-cell sequencing assays, transcriptomics specifically, and then spatial assays like CosMx, the Akoya system, and CODEX. I'm naming companies, but that's how we situate the type of assay and the technology, and of course 10x. So that's what we use day to day in the lab. And then outside the lab, in the community, we are also capturing epidemiologic data, survey data, the metadata linked to the individuals whose tissues we're studying.

Grant Belgard: What’s a recent result or a signal that genuinely surprised you?

Sophia George: One of the limitations of the work I do is that you have to access the tissue and, of course, the clinical data, which is part of the metadata. You're asking about something recent; I'd say it was a while ago, but it's recent in the sense that it has just been put into guidelines. One of the things we discovered is that different populations in the Caribbean have different prevalences of germline mutations in BRCA1 and BRCA2. In particular, the Bahamian population has founder mutations that are really common: one in four women with breast cancer or ovarian cancer will have a BRCA1 or BRCA2 mutation specific to that population. The other well-known group is the Ashkenazi Jewish population, where about one in 40 people carry a mutation in general, and 10% to 12% of those with breast cancer.

Sophia George: So you can hear the differences between these populations. That was a surprise. Going beyond the DNA you inherit, another thing we've noticed, at least from the perspective of our work, is that Black women in the Caribbean or of Caribbean ancestry, and people of specifically West African ancestry (I can't speak for the entire continent, only the places where I work), are diagnosed with these cancers at a younger age than other populations. Even people with a mutation in BRCA1 or BRCA2, not necessarily the identical mutation. So now we're collecting samples from all over the world.

Sophia George: We’re seeing that these ancestries with the mutation are a little bit surprising, but it’s good to see it because then we can actually attribute some at least biology, transcriptional biology, tissue biology to the prevalence and the incidence of early age at onset in these populations. So we’re seeing differences in transcriptional profiles that we’ve not yet published, but we’re doing single cell sequencing on hundreds and thousands of tissues from these populations. And so we are starting to see these signals come up, and I’m excited about what the data is going to tell us about the biology.

Grant Belgard: So in ancestry diverse cohorts, what strategies help you separate biology from environment, care access, and other social determinants?

Sophia George: Data, data, data. Really, it’s knowing what you have in the tube and who the people are, where the people are. So it’s putting things in context, and why we have to capture that epidemiologic data, the clinical data, to discern what we are looking at. I mean, everybody. So for example, I’m studying hereditary breast and ovarian cancer. A lot of my work is focused on the fallopian tubes of people with these BRCA mutations. They have an increased risk, from 27% to like 40%, to develop ovarian cancer if you have a BRCA mutation, and higher, up to 80%, for breast cancer. Maybe I’m like skewing the percentages. I think it’s 27% to 60% for ovarian, depending on the gene. OK. So there are other factors that we know are linked to cancer beyond the BRCA. They have an [imputations?] by how many ovulatory cycles or how long women have been ovulating. And that’s the same for breast.

Sophia George: If you have breastfed. If your BMI is high, increased risk; smoking, increased risk; alcohol consumption, increased risk. The data keeps telling us how many glasses or no glasses, but nonetheless, alcohol consumption increases your risk. And then a bunch of other things. So when you look at tissue and you isolate the DNA, isolate the RNA, and you’re looking at that signal, then you’re asking, well, for women in West Africa, what age on average do they start having kids? How many kids do they have? The fertility rates are different in the US as even compared to the Caribbean, compared to Africa. So that’s really important to be able to actually see people who are multiparous. How does a transcriptional profile look compared to people who have one child or no child and no pregnancy or one pregnancy each time that goes to term?

Sophia George: So that is giving us ideas about, one, just normal physiology of the tissue and then seeing, like, well, how can now? So that’s just like normal biology, right? And then we now have the complexity of genomic ancestry, which we know that people on the continent of Africa are the most genetically diverse folks. So we’re not even going down to the single nucleotide polymorphism yet, because we will need tens of thousands. But what we are doing is looking at essentially breaking it down by ethnic groups, self-identified, and also in [?] through the 1000 Genomes Project and others to be able to say, OK, well, people of West Africa, and I’m doing quotation marks, have this signature versus those who are European, or those who are admixed, like in the Caribbean, where we have a little bit of everything.

Sophia George: And one of my PhD students had come up with this logistic regression algorithm and approach to be able to kind of quantify the proportion of African and European ancestry, essentially like a sliding scale, and the signature that we see. And so that’s given us an opportunity to be able to disentangle both normal, healthy biology of the tissues that we study in the organs and then overlaying that with genomic ancestry. And of course, in the background, I’m determining whether these people have a mutation or not, because that’s also a driver of transcriptional difference.

Grant Belgard: So above and beyond all the biological and social sources of variability, what about the technical sources of variability? Do you think there are issues of collection, fixation, transport, storage, things that you think are currently underappreciated by many people for the impact they have on the downstream analysis?

Sophia George: 1,000 and 20, or maybe 1,200%. That is such a driver. So I should describe a project that we’re doing actively now. We have funding from the Chan Zuckerberg Initiative, where we were funded initially in 2021 to establish the African-Caribbean Single Cell Network. As a proof of concept, can we collect tissues, of course, at the time, snap frozen tissues, single cell tissues that we digest and get single cell suspensions from, in, I think at the time I started, it was like five or six countries in Africa and the Caribbean and, of course, in Miami. Just the idea of doing that and the premise and collaborating with my peers in those countries and saying, do not put things in formalin. And then learning about the process of when tissue gets collected from the OR and taken to pathology and how it gets transported. How long does it sit on the bench? Do we have dry ice? Do we have liquid nitrogen?

Sophia George: That in itself, creating SOPs and changing practice to adapt to collecting tissues that are to be fresh and not just stuck in formalin right in the OR, has been a process on its own that deserves its own one to two, maybe three hour conversation. And you have to do that in each country. And so there is a saturation of the number of samples, right? So instead of saying, well, initially, we’ll digest 10 and 20, now we are doing hundreds each country, 400, so that there will be some that fall, right? So you have the outliers. And this is the outlier due to somebody forgot [?] and picked it out. That happens. We can see those added marks. So it takes on training, continuous training of the teams and continuous conversation and monitoring, both for tissues and also PBMCs, peripheral blood mononuclear cells, where we started and then we were like, oh, everything is failing.

Sophia George: And it’s because of how long they get kept in the minus 80 or even on the bench, right? So we’ve had to do all of that. And those technical, you can imagine, then over time and in different spaces, you will see these batch effects. So to prevent that from happening, instead of sequencing everything serially on its own, we have to kind of wait and include samples from different countries in a batch, so that when it gets to the lab, whichever lab, we’re trying to decrease the scale of variability.

Grant Belgard: This all sounds very familiar. In my PhD and postdoc, we did a lot of postmortem brain work. And yeah, very, very similar challenges. You often don’t have a lot of information on how things were really processed, brain bank to brain bank. And in some cases, even within the same brain bank, it will have been processed in very different ways.

Sophia George: Exactly. At the University of Miami, we have several hospitals and clinics where people have to have surgery. So even within our institution, we had to optimize a protocol of transporting samples from the OR to the pathology to the lab, so that it would decrease variability within our own health system, because some of them you literally have to drive, like go in a car, because it’s so far away from the lab, right? It’s not walking distance. So we’ve had to do a lot of optimization.

Grant Belgard: And so if you had unlimited compute, but limited biospecimens, how would you allocate resources across discovery, validation, and mechanistic follow-up?

Sophia George: You’re asking really hard questions. Things that we think about. Okay, so unlimited compute, but limited resources, the tissues. Which is true, which is true, which is a reality. We can’t collect forever. I mean, it would be great to have a saturation of samples and genetic variability. So we would have to do like a test and a validation, right? One of the things that when we decided to scale this project from 15, 15, 15, so 15 fallopian tubes, 15 breasts, 15 prostate samples initially, to now 400, 400, 400, this was to give us room for the technical error, but also hopefully to get to somewhat of a saturation point with the genetic variability. Okay, I know Africa is like completely huge and so much genetic variability.

Sophia George: To test whether, if we see something happening in the Ghanaian population, we see differences or similarities in Sierra Leone and in Nigeria because of the geographic proximity. So it would be testing in one set, validating in another, and then to use, which is something I’m actively thinking about now, use some CRISPR in vitro approach to try to mimic what we’re seeing in the transcriptomics, at least from the single cell perspective. That we still have to go back to modeling. I mean, of course, and I know I’m like rambling, but there’s a lot of in silico things now that you can do to mimic, like the Perturb-seq and all this rich data that’s being generated, that we might not need to go into in vitro, but it is always going to be able to say, like, these genetic alterations with this condition are likely increasing risk to develop disease. Can we model this?

Sophia George: And then eventually intercept it somehow, right? Because we know what we think is causing the change. So I would use a lot of tools, artificial intelligence, and generating so much data. Yesterday we saw we had like 1.6 million cells from fallopian tubes, and that’s only like 85 samples, no, a hundred and something samples, right? So it’s not, and we’re planning on doing this for like 300 to 400 samples per tissue type. And so we’re going to have a lot of data to inform on what it is that’s happening.

Grant Belgard: Are there computational approaches that you’re excited to scale up or to apply on this really large data set? Right, because oftentimes there are things that in principle people would like to do, but with the data sets that were typical five years ago, they just didn’t have the sample size to do it. But with the sample sizes you’re now working with, or that you’ll be getting, it might open the doors.

Sophia George: Yeah, so I really am excited about working with informaticians who want to use or who are using, I mean, we can’t really avoid it now, different neural networks, LLMs, to be able to give us more information. Like I already know that my ability to ask questions about the data in front of me is limited, because I cannot infer the relationships just by looking at it, of cells amongst themselves and how the genome is interacting with the transcriptome beyond the exons, right? So I am excited, and I want the data to talk to me and to tell me what is happening. And so I look forward every day. I’m like, okay, what new packages are out there?

Sophia George: What new algorithm did somebody come up with to apply to the data that already exists, like in CELLxGENE and the Human Cell Atlas, for example? Talk to us, like, what is it telling us that I have the limitation of not even being able to ask? So I’m excited about that.

Grant Belgard: When you look at the literature on aggressive breast and gynecologic cancers, where do you see the biggest gaps that bioinformatics could realistically fill in the next five years?

Sophia George: I want more integration of the data. I want more of what is happening, and these samples are hard to find, right? But they exist. And as you asked me before, what is the least amount of data we can put in to be able to infer causality or even a relationship to disease progression? And then of course, on the other side is, well, how do we learn from all the data that we have? What do we learn from it in response to treatments? Knowing that, okay, this genetic signature from this genomic background, with the pharmacogenomics and the transcriptomics, will likely not respond to drug X because we have modeled this a thousand times. This we know for sure. These are the questions that I would like answered, with what we already have and all the data that’s being generated, like, exponentially every day.

Grant Belgard: So when thinking about prevention in hereditary cancers, what does precision prevention look like in practice?

Sophia George: It’s just the old-fashioned identify people at risk and then intervene with screening. And of course then there are cooler ways. So how do you identify? So you could ask how you identify the person in the first place, right? So how do we identify people who don’t even know that they are at risk, or are not aware? Yes, mom had breast cancer or ovarian cancer or pancreatic cancer, and you think, oh, you know, grandma had that cancer, and then you just kind of like, yeah, all people, as we age, we get cancer, because cancer is a disease of aging. Oh, it used to be so. So what tools, again, computational tools, can we use to identify these individuals, based on the data that they’re putting out there, who would benefit from screening, genetic screening? And so that’s the population side.

Sophia George: And then of course the molecular side is of all the data that I’m generating, what are the ways that we can use small molecules to prevent disease?

Grant Belgard: If you had to bet, what’s the most likely near-term translational payoff from your current line of work? You know, is it more risk stratification, earlier detection, therapy selection, something else?

Sophia George: Risk stratification. I’m excited about some things that I brought some folks together to think about, in terms of how we use the data that I’m generating in the real world, because they’re real people with real data. And so risk stratification is, you said one, but that’s one on that side. And then there’s a clinical trial that I’m co-principal investigator of, where we’re looking at targeted therapy in these populations in three countries, the United States, Nigeria, and the Bahamas, to be able to better identify individuals who will respond to these already FDA-approved drugs versus those who would not. I’m excited about that. That’s like long-term because the clinical trial just began this year, but that’s something that I’m excited about learning.

Grant Belgard: Do you know when that’ll be finished?

Sophia George: Well, it’s a five-year clinical trial. So it just started today, not today, this year, so in five years, but we will be obviously getting data as soon as we see recurrence or response. And of course, you can’t make a conclusion from one person, but it is the fact that we get to do this study and all the components of it, of course, multi-omics and all the fancy things, all the tools, using all the assays that are available to us now, and samples that we banked so that we can do things in the future to be able to really go deep in understanding what’s going on. So that’s like a ways away, but in the meantime, it’s risk stratification and, again, integration of all the things. That’s what we’re doing.

Grant Belgard: Something to look forward to, yeah.

Sophia George: Yeah, I’m excited. It was like we’ve done a lot of building and now we get to, again, ask really interesting questions and then hopefully have tools to help us resolve things that we’re not even aware of.

Grant Belgard: Yeah, it’s kind of the, you know, biology equivalent to some of these big particle physics experiments, right? It can take a very, very, very long time to get the infrastructure in place and then you run the experiments and get the answers.

Sophia George: Yes.

Grant Belgard: So pivoting now to your own career, what first pulled you towards gynecologic oncology and hereditary cancer research?

Sophia George: Quite honestly, it was, I did a job after my PhD. I did a PhD in molecular genetics, molecular medical genetics, and it was on engineering embryonic stem cells and differentiating embryonic stem cells on the cardiovascular system and looking at embryonic development and vasculogenesis, angiogenesis. I wanted to do something with humans and I had considered going to medical school. I had applied to go to medical school, I got in, and I had just had my son at the end of my PhD and I wanted to take a breather between making all these decisions. And so I applied for a job, and I applied for a job to work at a biobank, and the person, the director of the biobank at the time, she said, but you’re too qualified, you’re overqualified. What is wrong with you? And I was like, I just want a job for a minute just to, like, not do anything science-y.

Sophia George: And so she offered me the equivalent of a postdoc position in her lab, and she helped focus, she wanted me to establish cell lines from fallopian tube [epithelial?] cells from women who were undergoing [risk-reducing?] surgeries, because at the time it had just been published, and not yet published, that the fallopian tubes were a likely site of origin for high-grade ovarian cancer, because she’s a pathologist and her scholarship was in hereditary ovarian cancer before it was even a thing, like in the context of fallopian tubes. So that’s how I got started. And then the following year, I was always interested, I’m from the Caribbean, I should state, for all those listening wondering where is that accent from. I’m from a tiny island called Dominica in the Caribbean, not the Dominican Republic.

Grant Belgard: Dominica is always advertising the citizenship by investment on the planes, right?

Sophia George: Oh my goodness.

Grant Belgard: Every time you fly British Airways or something.

Sophia George: Okay, fine. So I’m from that island. We only have 70,000 people, so we can afford to have visitors come. Okay. And so I’ve always been interested in health of the population, mine, I guess, looking back. And so I got a scholarship to go to school in Canada, did my undergrad, did my PhD at U of T, and during my PhD, I got to go to Venezuela with the UN and, at the time, the Centre for Bioethics at the University of Toronto. And so I got exposed to thinking about doing genomics in the Caribbean and Latin America. And I had the opportunity to meet people from the Caribbean at the time, got invited to go to the Bahamas, and they said, oh, by the way, let’s think about genomics in the Caribbean. And I said, I’m working on a project on hereditary ovarian cancer. And they said, oh, we also have this in the Bahamas. And I was like, what do you mean you have BRCA in the Bahamas?

Sophia George: Like, it’s not a Bahamian thing. It’s a Jewish thing, because I was in Toronto and that’s who had the BRCAs. And that is how I got really fascinated about our population many years ago.

Grant Belgard: Just geographically, it seems being based in Miami makes a lot of sense. You’re, you know, a short flight or ferry ride away.

Sophia George: Exactly. And that is my mentor. So I was in Toronto at the time, and my mentor, the person who became my mentor, was leading the study. So I asked the people, when I was in the Bahamas where they were giving a talk, who is leading this research? And they were like, oh, someone at the University of Miami and someone in Toronto, Steven Narod, and Judith Hurley was at the University of Miami as a medical oncologist. And I got introduced to them. And she is a phenomenal woman who allowed me to ask questions and introduced me to everyone. And now I lead this work, right? But that’s how I got into studying hereditary ovarian cancer.

Grant Belgard: So speaking of mentors, how did you find mentors and what made those relationships work?

Sophia George: Oh, wow. So Judith was serendipitous, I guess, because, as I said, I was in the Bahamas and they said who it might be, so I reached out, not necessarily for her to be my mentor, but to see if I could learn more about this study. And she was magnanimous and generous. And I learned so much from her about how to engage, and who you need to engage with to have impact. There are always more people, but for sure the doctors treating the patients, they cannot be bystanders in the work, right? Because they’re the ones that are going to see the patients to implement the things that we will eventually find and discover. So her personality allowed her to develop into my mentor, helping me learn and navigate the space.

Sophia George: Pat Shaw, who was my postdoc mentor and lead of the biobank and a pathologist, she ended up being a mentor because she knew so much about the system that we were in and what I was trying to do, quite frankly, as a woman. And it happened that I’m a woman of color, and not that she was a woman of color, but being a woman in the space, in academia, and allowing me to meet her networks and be introduced to them. So I’ve since then identified people that helped me with specific needs, areas of growth. So I tell the folks that I mentor all the time that you can have multiple, and peer mentors are really important. How we can help each other, drive each other, but also, again, identifying folks for me who fill a gap and also have some redundancy.

Sophia George: So cheerleaders, supporters, folks who can help me plan and navigate, those have been factors in how I identify folks I might wanna spend time with and learn from.

Grant Belgard: What skills have you found hardest to learn on the job that you wish training programs taught more explicitly?

Sophia George: People management. We don’t train the trainees.

Grant Belgard: I think that that is the most common answer I get when asking academics this question, right? Cause you’re promoted cause you’re good at doing science, right? And I guess the assumption is just, you pick up people management on the way.

Sophia George: Yeah, like somehow, right? We know about the DNA, RNA, protein, whatever molecule that we’re studying, or trends if you’re a population scientist, but how do you manage people? I mean, I guess people who do business and other things, they get to learn that.

Grant Belgard: Oh yeah, there are explicit training programs, coaching programs, absolutely, yeah.

Sophia George: Really? No, you get to learn that, like, when you have a lab with people in it already and you’re like, wait a minute, I think I need to learn how to do this. So that, and then budgeting, finance. Although we have people that help us with the finance, it’s not the same way of conceptualizing how much this project is really going to cost. What are all the factors involved that would cost money? And how do we identify sources of funds, and actually being creative about whom you collaborate with and how you do the collaborations. Again, institutions have some of those things, but we don’t get to think about that before you come into it, and then you hope that you find mentors or honest brokers that can let you know that this is happening and that’s an option, beyond, like, federal funding and how you partner with industry, different types of industry, all those things.

Grant Belgard: Yeah, the budgeting and project management’s a good point. I recall my postdoc advisor had spent some time in management consulting before his MD PhD and he really would use that pretty regularly and it really gave him a leg up in thinking about exactly what you said, what’s the true all in cost of a project, right? Because it’s a lot more than just what you have in the grant and then the time and thinking about recruitment and all that.

Sophia George: I mean, the time, the time, the time, the time. We are on a 40-hour week or 60-hour week, whatever the week is, it’s never enough. And especially when you’re doing projects at scale where you are enabling people to lead, of course when you are at different sites, you have site PIs and they have expectations and so on. But if you’re driving some parts of the science, it takes a lot of time to get everybody on board, and continuous training, all those things are not budgeted for. You know, there’s no line. I’m really, is there a line? Some people are like, yes, I put a line, but that line is never the true line, right? But it’s well worth the effort of all the things. But yeah, it’s the budgeting, the project management.

Grant Belgard: How have collaborations across institutions or across countries changed the way you do science?

Sophia George: It has changed it significantly. So how? One, different systems, different cultures and practices, and how to engage, and expectations. Expectations vary independent of the cost. So even if you have a budget, some people want you to be fully involved. Some people want you to be not fully involved. Expectations, not talking about publications, but relationships, like how are these relationships built and sustained? They vary by country and they vary by partner, collaborating partner. And so for me, I have projects where, in one region, there are three different languages. So projects in, oh, four actually: the Dominican Republic, in Haiti, in Benin and Burkina Faso, and in English. So that’s four languages. And each system, each country is different. And even within a country, the institutions are different, different infrastructure.

Sophia George: So, and different questions that they want to ask, different priorities and how they want to ask the questions. So one disease might be more important than another, even within the same organ. And so making sure that the, I call them informed believers, are on board, you have to also acquiesce, which is why collaborations work, like the give and the take, or the give and the give, right? What is it that you are fundamentally interested in? Because even if I’m interested in, like, ovarian cancer, for a lot of my collaborators, ovarian cancer is relatively rare compared to other diseases in some parts of the world. So they want to focus on prostate. They want to focus on cervical cancer. They want to focus on some rare disease that is only impacting their population, where I’m interested in the other part of the tissue.

Sophia George: And so how do we ask a robust question scientifically and have everybody, according to Covey, win-win, right? Like always, it’s a win-win. So it’s a lot of interplay. And so the science that you see and the science that I’m thinking about is not like linear.

Grant Belgard: What non-technical skills do you find most accelerate progress in community-engaged genomics and in navigating multinational consortia?

Sophia George: Non-technical skills: one, communication. Communication has been a big, has been an important factor. Humility, I guess, is a behavior. I don’t know if it’s a skill, but it’s necessary. So communicating, being transparent, which facilitates the communication, and humility, those things have allowed me to be with my partners on the ground forever, have allowed me to be able to do what I’m doing.

Grant Belgard: If you were advising a PI on setting up a multi-site cohort from scratch, what would you emphasize in governance and quality control?

Sophia George: So governance, setting up a team of folks at individual sites who have been trained and understand the biology enough that they are representatives of you versus just managing. And then having a harmonized system to collect and track whatever is going on. Like if you’re collecting blood, whatever you’re doing. And of course, optimizing protocols locally. So whatever protocol you write here or wherever you are is not going to necessarily translate to a T in a different setting, and you do not want people to fill in gaps without your knowledge. So it’s like shopping the protocol, workshopping the protocol in each site versus disseminating one protocol and assuming that everybody’s doing the same thing.

Grant Belgard: I feel like that’s pretty universally applicable advice when you try to do anything across different sites in science, outside of science. How do you personally protect time to do deep work?

Sophia George: I block my calendar. So this year I’m interim associate director for the Center for Black Studies at the University of Miami. An interesting year to take up that role, but this is the year. And the center is on another campus. And when I go there, I can be quiet because sometimes nobody knows that I’m there, which is like the best thing. I have to be away from my home often, and/or my lab office and the lab, in a very quiet space. My best work is in the middle of the night, but it’s not sustainable because then I wake up late and I don’t get enough sleep, et cetera, et cetera. Or I wake up early, when it’s super quiet, and I don’t get enough sleep. So for me, it’s just blocking my calendar and finding peace, like somewhere quiet, so that I can think. I can read a paper from beginning to end and think.

Grant Belgard: And what advice would you give your first year PI self?

Sophia George: Oh, Lord, don’t be afraid to pursue the thing that you think is hard. Don’t be afraid. And be bold. Don’t be afraid. Because at one point I was considered very timid and shy and would sit in the back of the room. And I know that affected my ability to do more sooner.

Grant Belgard: So speaking of being bold, if you could place one big bet in your field and you had to wait 10 years for the readout, what would you fund?

Sophia George: In my field. My field is like, I mean, I’ve developed a few fields. We still, surprisingly, we still don’t have enough people sequenced. Surprisingly, we still don’t know enough between the transcriptome and the DNA, the genome. So, you know, these projects that I’m doing, we need more. We need to get to saturation. So 3,500 single-cell samples from different bodies is not enough. Even if that leads to, I don’t know, 35 trillion cells, I don’t know how many, 10,000, let’s say 10,000 cells, times 3,500, whatever that math is, 35 million. It’s not enough. No, so it’s gonna be three billion cells. It’s not enough. It’s not enough. It’s not even reflective of the number of people in the world. Right. So it’s not enough. It’s not enough. So I would do that. I’ll do more of that. And I would do like deep work, deep.

Sophia George: So the whole human kind of work where you’re not just capturing the single-cell, the RNA, but you’re capturing the epidemiologic data. You’re capturing that metadata that puts context with that piece of tissue or RNA, protein, metabolome, like the molecule, you know, that you, you know, whatever your measure is, that there is significant metadata to make it make sense, to contextualize it. So I would be doing that.

Grant Belgard: I don’t think any of our bioinformatics-interested listeners would disagree with, you know, more data and better metadata, right? Two things people always want.

Sophia George: I mean, it opens the doors to so many, you know, new additional methods and so on that can be used. King and queen. I said king. Metadata is king, but it’s also queen. Like it’s, it’s non-gender. It’s important.

Grant Belgard: So where can our listeners follow your work and your lab’s updates?

Sophia George: Oh boy. So I’m supposed to be updating my website. I post sometimes on Instagram, Sophia HLG, and publications. I, yeah, kind of, I know it sounds anticlimactic, right? But yeah, when we travel, we post, and of course publications here and seeing the work that we’re doing. Some of it, they’ll look now and be like, well, this is all like epi stuff, but while we’re building, and Grant knows and sees the different types of assays that are coming through, it takes time to get these types of rich data, and I don’t want to make fast and dirty conclusions. So the metadata and the clinical data is really important to put context with these populations and samples that we’re studying.

Grant Belgard: Thank you so much for joining us today. Really appreciate it.

Sophia George: It’s been fun.

Grant Belgard: Thank you.

Sophia George: Thank you for having me. Thank you.

The Bioinformatics CRO Podcast

Episode 71 with Christiaan Engstrom

Christiaan Engstrom, founder and CEO of BLPN, discusses his experience building a space for authentic, non-transactional business networking in the life sciences.

On The Bioinformatics CRO Podcast, we sit down with scientists to discuss interesting topics across biomedical research and to explore what made them who they are today.

You can listen on Spotify, Apple Podcasts, Amazon, YouTube, Pandora, and wherever you get your podcasts.

Christiaan Engstrom

Christiaan Engstrom is founder and CEO of BLPN, an invite-only community for life science investors and senior executives to connect.

Transcript of Episode 71: Christiaan Engstrom

Disclaimer: Transcripts may contain errors.

Coming Soon…

The Bioinformatics CRO Podcast

Episode 70 with Joanne Hackett

Dr. Joanne Hackett, VP of Health Systems Services at IQVIA and Chair of the Board at eLife, discusses her hopes for the future of healthcare.

Dr. Joanne Hackett

Dr. Joanne Hackett is VP of Health Systems Services at IQVIA and Chair of the Board at eLife.

Transcript of Episode 70: Joanne Hackett

Disclaimer: Transcripts may contain errors.

Coming Soon…

Bioinformatics Done Right, Now

Academics, Don’t Wait on the Queue: A Faster Path from Data to Publication


The email arrives: “Your sequencing data are ready.”

It’s the kind of sentence that makes a lab buzz. But after the first rush comes a familiar pause: Who will analyze this, and how long will it take? If you recognize yourself in that moment, The Bioinformatics CRO is for you.

The Shortest Distance Between Data and Figure

Our promise is simple: fast, publication-grade bioinformatics for academics at core‑competitive pricing—without the long waitlist or the learning curve.

  • Speed without shortcuts. In-house cores do important work, but they’re often backed up. We keep our queue short and our response times tight. Projects start quickly once scope is set.
  • Expert time, not training time. Our team is staffed by senior scientists who have shipped many analyses. That experience compresses timelines and reduces rework.
  • Pricing in the same neighborhood as cores. Hourly rates are similar, but our model is built to reduce idle time and cut the “waiting cost.”

In other words: you move faster, often with lower all‑in cost once you account for delays, rework, and the hours you spend shepherding a novice through their first pipeline.

Why Not Just Use a Trainee?

Postdocs and graduate students are talented. They are also busy. Courses, journal clubs, teaching, competing projects, and grant work carve away their hours. If your timeline is tight—or the analysis is non‑standard—asking a trainee to learn on the fly can turn weeks into months. By the time they’ve written code, defended choices, and redone figures for reviewers, the “cheap” path has quietly become expensive.

Working with us is different. We’ve already navigated the edge cases, the batch effects, the parameter cliffs, and the “looks great, but reviewers won’t accept it” traps. We deliver defensible results, clean methods text, and reproducible code.

A Diplomatic Word About Cores and Collaborators

Cores are steady partners, but queues are real, and revisions can be slow. Collaborator labs can be great, yet authorship and priorities get complicated. We’re designed to be your surge capacity and your clean handoff: fast starts, clear deliverables, and no unnecessary authorship entanglements.

How to Work With Us (and Save Money Doing It)

Either engagement style works well; choose the one that matches your project and bandwidth.

1) Clear Scope → Accurate Estimate & Fast Delivery

If you know what you need—say, bulk RNA‑seq differential expression with pathway analysis and four figure-ready plots—tell us up front.

What you get: a tight statement of work, a realistic budget window, and a start date you can put on your lab calendar.
Best for: projects with defined questions, revision letters, or datasets similar to your previous work.

2) Engage With Us as You Go → Exploration & Iteration 

Not every dataset announces its secrets on day one. If the plan is exploratory, we’ll move in measured steps—share early readouts, discuss directions, and refine.

What we need from you: real engagement. Quick feedback keeps momentum high and scope aligned.
Best for: new modalities, mixed cohorts, or “we’ll know it when we see it” figure discovery.

A Cost‑Effective Division of Labor

To keep your budget focused on analysis (not cosmetics or copy), split the work like this:

  • Your lab handles:
    • Data and metadata hygiene (sample sheets, consistent IDs, clear conditions).
    • Figure polish for final submission (fonts, colors, journal‑specific formatting).
    • Manuscript prose (introduction, discussion, and related literature).
  • We handle:
    • QC and rigorous analysis (e.g., DE, clustering/annotation, integration, modeling).
    • Reviewer‑proof choices and statistics.
    • Figure‑ready plots and tables.
    • Methods text and code so everything is reproducible.

This division keeps costs lean and lets trainees contribute meaningfully without spending their semester learning an entire toolchain from scratch.

What to Expect From Us

  • Fast kickoff once scope is set. We schedule starts promptly and keep you posted.
  • PhD‑level analysis you don’t have to babysit. We make choices transparent and document them.
  • Figure‑ready outputs and clean methods. Drop them into your manuscript with minimal edits.
  • Reproducible artifacts. Notebooks, parameter files, and pipeline manifests live with your results.
  • Plain-language updates. Short check‑ins, clear next steps, and no jargon walls.

When We’re the Obvious Choice

  • Data in hand; publication clock ticking. You need figures in weeks, not semesters. 
  • Major revision lands. A reviewer asks for extra analyses or different thresholds. We execute fast and clean. 
  • Grant support. You want a credible analysis plan and methods you can defend. We can provide a letter of support too.

Common Questions

  • “Isn’t a student cheaper?”
    On paper, yes. In practice, hidden costs pile up: learning time, your guidance time, reruns after critiques, and the risk of delays. Our rate is core‑range, but our experienced team and shorter queue often make the real cost—and the stress—lower.
  • “Will you take authorship?”
    Only if you want us to and if our intellectual contribution merits it. Otherwise, we provide clean acknowledgments and thorough methods so credit remains where you intend.
  • “What about compliance and reproducibility?”
    We assume de‑identified data by default and return a reproducible package: QC summaries, parameter files, methods text, and code that stands up to reviewer scrutiny. 

A Short Field Guide to Faster Projects

Before kickoff

  • Write one paragraph that states your central claim or question.
  • List your cohorts/conditions, sample counts, and any known pitfalls.
  • Clean your metadata: consistent sample names, tidy spreadsheets, no mystery columns.

During analysis

  • Respond quickly to interim results; momentum matters.
  • If the path forks, choose one clearly—or approve a bounded exploration.

Before submission

  • Have your trainee apply journal style to figures (fonts, colors, panel letters).
  • Paste our methods text and citations; adjust voice as needed.
  • Use our code and QC notes to pre‑empt reviewer concerns.

The Quiet Luxury: Time

The most expensive thing in your lab is not the hourly rate of an analyst. It’s time—time before a scoop, before a grant deadline, before a trainee defends, before the field moves on. The Bioinformatics CRO trades in time saved without rigor lost. That is the value we offer: publishable certainty, delivered quickly, at a price you already recognize.

If the next dataset is knocking, let’s make the waiting the shortest part of your story.

Ready to move? Send us a brief description of your data and desired figures, or tell us you’d like to work iteratively. We’ll match the approach to the moment—and get you from data to done.

The Bioinformatics CRO Podcast

Episode 69 with David Scieszka

David Scieszka, founder and CEO of Vertical Longevity Pharmaceuticals, tells us about VeLo’s pioneering senolytic vaccine approach to clearing senescent cells and his quest for longer, healthier lives for everyone.

David Scieszka

David Scieszka is founder and CEO of Vertical Longevity Pharma, which is currently pioneering a senolytic vaccine approach to targeting atherosclerosis and aging.

VeLo Pharma has just opened up a community investment round with no investor accreditation required: https://netcapital.com/companies/vertical-longevity

Transcript of Episode 69: David Scieszka

Disclaimer: Transcripts may contain errors.

Coming Soon…

The Bioinformatics CRO Webinar Series

February 18, 2026: Phil Ewels – Reproducible Bioinformatics at Scale: nf-core + Nextflow

Phil Ewels

​Phil Ewels is Product Manager for Open Source at Seqera. He holds a PhD in Molecular Biology from the University of Cambridge, UK. Phil joined Seqera in 2022, previously working at the National Genomics Infrastructure (NGI) at SciLifeLab in Stockholm, Sweden, where he became involved in the Nextflow project and co-founded the nf-core community. Phil’s career has spanned many disciplines from lab work and bioinformatics research in epigenetics, through to software development and community engagement. He is passionate about open-source software and has a soft spot for tools with a focus on user-friendliness. He is the author and maintainer of tools like MultiQC and SRA-Explorer, and helps lead the nf-core and Nextflow development teams.

In this live webinar, he gives an overview of Nextflow and an introduction to some of its new and exciting features for bioinformaticians looking to scale up their pipelines.

Transcript of The Bioinformatics CRO Webinar Series – Reproducible Bioinformatics at Scale: nf-core + Nextflow

Disclaimer: Transcripts may contain errors.

 

Grant Belgard: Welcome to the final talk in The Bioinformatics CRO webinar mini-series. At The Bioinformatics CRO, we help life science teams turn complex data into clear, decision-ready insights, providing flexible expert bioinformatics support from study design through to analysis and reporting. As part of that mission, this webinar series features practitioner-focused talks with concrete takeaways you can put to work right away. Today’s talk is by Phil Ewels. Phil is a senior product manager for open source software at Seqera, where he helps lead the nf-core and Nextflow development teams. Today Phil will be presenting on reproducible bioinformatics at scale: nf-core and Nextflow. After the talk, we’ll host a live Q&A session. This is streaming to both YouTube and LinkedIn, and on either platform you can put your questions in the chat or the comments at any point during the talk, and we’ll bring them into our discussion afterwards. Phil, over to you.

Phil Ewels: Thanks very much for the introduction, and thanks Grant for the invite to come and speak today. It’s a pleasure to be part of this webinar series, and it’s always nice to have the opportunity to talk a little bit about Nextflow, a topic close to my heart. I don’t know if my slides are ready to come up, but basically my talk today is in two parts. I’m going to give a bit of an introduction to what Nextflow is and what nf-core is, why I think they’re good and useful for you and why I think you should care, and then I’ll talk a little bit about some of the new features which have come out for Nextflow in the past year or six months. This is particularly good for anyone in the audience who maybe has dabbled in Nextflow, especially a little while ago, because things are changing quite a lot, and for the better. So I hope I convince you to really pick up Nextflow and see if it could help you in your work. My name is Phil, and I worked originally in the lab, then became a self-taught bioinformatician and slowly moved from research into core labs. I worked at the National Genomics Infrastructure in Sweden developing new lab techniques and analysis, and then started accidentally getting into software design. I started writing pipelines and had my own pipeline tool; it was all the rage 10 years ago. And I wrote software like MultiQC, which I imagine many people will be familiar with. I got into Nextflow probably about eight years ago or so, while I was in Sweden at the NGI. We were running huge numbers of samples. It was a real step up from my previous work in Cambridge: now we were running hundreds of projects, thousands of samples, and the software I’d used previously wasn’t really up to the task.
So I looked around and found Nextflow, and we started building lots of different pipelines. Because we were a team of about eight people building pipelines, we started to standardize, and nf-core was born out of that standardization of our pipelines. I’ll talk a little bit about what made that possible.

Phil Ewels: The background to the whole picture of why Nextflow exists is these classic statistics from a Nature paper, quite old now, 10 years ago. It was a simple study, and I think its statistics resonate with many of us working in bioinformatics: this reproducibility crisis, where it’s famously difficult to reproduce experiments that you find in the published materials. Even reproducing your own experiments, for what I refer to as one of your most important colleagues, which is future you, is notoriously difficult to do, and reproducibility is the foundation of the scientific method. So we were in a bit of a bad place 10 years ago: data was really starting to scale, NGS was taking off, we had more data than we knew what to do with, and we couldn’t really reproduce the analyses that we were doing, and certainly we couldn’t transfer those analyses to other people. And it’s not surprising, because it’s a really difficult problem. We’re running many different tools, each of which might have numerous complex dependencies. Everyone’s running on a different system, and often, even if it works on your machine, it might not work at a collaborator’s. Everyone is doing things in their own way, and there was very little in the way of provenance, of knowing where data came from when your supervisor sent you an Excel spreadsheet with some results in it. So Nextflow set out to basically try and provide an answer for this. It’s a workflow orchestration tool: it takes your analysis pipeline of multiple different steps and puts it together into a language. It’s quite a unique syntax; it’s flow-based programming, which kind of makes sense for what it’s doing. It’s got some real key features which make it very, very popular. Something that’s really important is that it’s got built-in support for software packaging. Docker was very new when Nextflow was first launched, and Nextflow supported it almost right away.
And so you can package individual tools at the level of single processes within your pipeline, so the software effectively comes built in with the pipeline. End users don’t have to worry about installing 20 or 50 different tools every time they run a new pipeline, and all those versions are pinned, so you know you’re always running the same version of the software when you run that version of the pipeline. It’s multi-platform: Nextflow supports lots of different compute environments, as it calls them, and it can submit jobs basically anywhere you can run compute. One of the most popular features is the ability to resume. Nextflow is quite clever with a cache of completed tasks, so if you lose power halfway through your run and it’s been running for three or four days, you haven’t lost everything. Nextflow is able to look back and understand which tasks already completed successfully and pick up where it left off, with the -resume flag. It’s massively scalable. I’ve put thousands here, but really up to millions of jobs: we’ve seen truly enormous workloads passed through Nextflow, and it’s able to scale to really massive volumes of data. In the last 10 years it’s grown an extremely active ecosystem and community, which is one of the most attractive things about the system, really: there are lots of other people building with it. Okay, so for those unfamiliar with Nextflow, how does it work? What does it do? There are basically a few different steps to building a pipeline and running it. Firstly, you define processes within your Nextflow code, which are the building blocks. A single process usually corresponds to a single tool. You say what the data inputs are, what the expected outputs are, and then you have a script, which could be a bash command.
It could be a Python script, an R script, anything really, and that’s resolved on the fly and submitted to your compute environment to be run as a single task. So you describe all the different processes in your pipeline, and then you link them all together with what Nextflow calls channels, which is the data flow aspect of Nextflow. Nextflow handles all the data flow automatically when you run the pipeline, and it handles all the dependencies and all the parallelization, so when you describe this flow, Nextflow automatically figures out how your pipeline should be run. Then, once you have the pipeline logic and the code written, there is a separate step, which is to write configuration. The configuration is, importantly, separate from the pipeline code, and this is where you describe your specific setup: your HPC, your cloud compute credentials, your laptop, whatever. Once it’s configured, you’re ready to go, and you can execute it wherever you want to, basically.
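As a rough sketch of the ideas described here (the tool choice, container tag, and parameter name are illustrative, not taken from the talk), a minimal DSL2 pipeline with one process and one channel might look like:

```nextflow
#!/usr/bin/env nextflow

// One process usually wraps one tool: declare inputs, outputs, and a script.
process FASTQC {
    // Software is pinned per process via a container, so it ships with the pipeline.
    container 'biocontainers/fastqc:v0.11.9_cv8'

    input:
    path reads

    output:
    path '*_fastqc.zip'

    script:
    """
    fastqc ${reads}
    """
}

// Channels carry data between processes; Nextflow runs one task per channel item.
workflow {
    reads_ch = Channel.fromPath(params.reads)  // e.g. --reads 'data/*.fastq.gz'
    FASTQC(reads_ch)
}
```

Running something like `nextflow run main.nf --reads 'data/*.fastq.gz'` would then submit one FastQC task per file to whatever compute environment the configuration points at.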

Phil Ewels: The really key points, if you remember nothing else, are that Nextflow is not just one thing; it’s several things. It’s a language: a code syntax which is designed for describing workflows, the steps and also the data flow within workflows. It has configuration separate from that syntax, so that you can separate the logic of the pipeline from how the pipeline should run. And it’s also an orchestrator: it’s the actual job that you run, which parses that code, understands it, and runs the pipeline for you. So it’s both a language and an executor. The two things that Nextflow brings are reproducibility and portability. Reproducibility: you can run the same workflow years apart, and as long as you run the same git-versioned pipeline code, which has pinned the exact same software for every step, and you’re using the same version of Nextflow, you’re almost guaranteed to get exactly the same results out, which is really fantastic. The other thing is this idea of it being portable. I can write one pipeline’s code and share it with different people in different places running on different systems, and they can write their own config files, but the pipeline code stays unmodified. So for the first time, really, when Nextflow came out, it was possible to write one pipeline and run it anywhere, which now seems kind of obvious 10 years in, but at the time these two facets of Nextflow were really revolutionary and groundbreaking.

Phil Ewels: And what this means touches on the concept of scalability, which was in my talk title. You can write a single Nextflow pipeline and test it out on your laptop with one small test sample, and once you’re happy that it’s working properly, you can scale that same pipeline up, without touching the pipeline code, to tens or thousands or millions of samples. You can also scale up the compute that it’s running on, from just your laptop to maybe a Slurm cluster somewhere, or basically any kind of cloud computing: AWS, Azure, Google, Oracle. Because of the way that Nextflow is structured and architected, it’s able to handle that scaling and basically grow with your needs.
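The separation described here can be sketched as a `nextflow.config` with site-specific profiles; the profile, queue, and bucket names below are hypothetical, and the pipeline code itself never changes:

```nextflow
// nextflow.config: execution settings live here, not in the pipeline code.
docker.enabled = true

profiles {
    laptop {
        process.executor = 'local'
        process.memory   = '4 GB'
    }
    hpc {
        process.executor = 'slurm'      // submit tasks to a Slurm cluster
        process.queue    = 'long'       // hypothetical queue name
    }
    cloud {
        process.executor = 'awsbatch'             // AWS Batch compute environment
        process.queue    = 'my-batch-queue'       // hypothetical queue
        workDir          = 's3://my-bucket/work'  // hypothetical bucket
    }
}
```

The same pipeline then scales from `nextflow run main.nf -profile laptop` to `-profile hpc` or `-profile cloud`, and adding `-resume` picks up from the task cache after an interruption.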

Phil Ewels: Nextflow has become massively popular because of this. The figure on the left is from a recent paper that we did for the nf-core community and shows the number of citations for different workflow managers. It’s a bit of a lagging metric, but you can see that Nextflow has become more and more popular over recent years. On the right we have the number of runs, and you can see there are hundreds of thousands of runs of Nextflow pipelines every day. This is probably undercounting it quite a lot as well. So Nextflow is arguably one of the most-run workflow managers, certainly in life sciences.

Phil Ewels: So that’s Nextflow: a quick introduction for those who were unfamiliar, covering how it works and why it was built the way it is. For the first time, Nextflow was able to give us workflows which were portable between different systems. Back in 2017-18 we had a kind of light-bulb moment. Up until then, everyone wrote their own RNA pipeline, wherever you were, in your core facilities, in your labs. You had to, because other people’s pipelines didn’t work on your system. They had hard-coded paths, or maybe the environment module system named the software differently. All these different things made it very difficult to collaborate. But Nextflow suddenly removed those blocks: we could now share code for running pipelines, and we didn’t all need to write our own pipeline. So back in around 2017-18 I started nf-core with some collaborators and friends. We took the standards that we built in Stockholm and opened them up to the wider world, and based on those principles we founded this Nextflow community called nf-core. nf-core has exploded in popularity alongside Nextflow; the two have formed a very symbiotic relationship, and now we have over 140 different pipelines, which is astonishing when you bear in mind that one of our key guidelines is that we only have one pipeline per data type or analysis type. We have only one RNA pipeline, so that’s 140 different types of data analysis that we have pipelines for. In recent years we’ve also grown to be more than just pipelines. I’ll touch on this in a second, but we also now have shared modules, which are basically individual processes within a pipeline. These are themselves shared and can be reused across different pipelines, including pipelines outside of nf-core. Every one of those is a different tool, and it comes bundled with its commands, its usage, its software containers, and everything.
There are now over 1,700 of those, and that number is growing really, really fast. Then we have a community Slack, where we have channels for every different pipeline for discussions. We have a core team and a maintainers team and some level of governance within that, and we have going on 14,000 community members in Slack now. So it’s an extremely active community, and of course it’s built on this concept of best practice. Nextflow is a programming language, and you can write your Nextflow pipeline in basically any way; there’s huge variability in how you do that. nf-core takes a very, very opinionated stance and says: if your pipeline is going to be part of nf-core, it has to be written exactly this way; you have to use our template; you have to do things our way. The reason we do that is that it makes it possible for components to be interchangeable and for folks to be able to collaborate. So: standardized tooling, best practices, and a lot of documentation.

Phil Ewels: One of the things that’s quite unique about nf-core versus other pipeline registries and software registries is that one of the requirements of adding a pipeline to nf-core is acknowledging that it’s not owned by you anymore; it’s community-owned. This is another figure from that recent paper, and I really love these plots. This is for the small RNA-seq pipeline, which we actually started in Sweden before the origin of nf-core, and you can see that the top green bar is SciLifeLab. We were the sole owners, maintainers, and contributors to start with in 2017-18, and then more and more different organizations joined in with maintaining and contributing to the pipeline. SciLifeLab actually stopped contributing to it around 2022, but the pipeline lives on, because the pipelines are community-owned. They don’t suffer from the problem of a PhD student finishing a PhD, moving on to a different position, and the software getting abandoned. Because it’s community-owned, we can build updates in based on community consensus and bring in volunteer work from groups across the world. That’s a real superpower for nf-core.

Phil Ewels: I also want to touch on the fact that nf-core is not just pipelines anymore. The modules library and the tooling that we build for nf-core are deliberately done in such a way that you can use them for any Nextflow pipeline. This is the nf-core CLI I’m showing on the right, and it has a TUI, a terminal interface, which you can use to create new pipelines. That very rapid example there is creating a new pipeline, outside of nf-core, using the nf-core template, and you can choose which of the features from the template you want. So you can make it very, very minimal, or you can have everything that nf-core comes with; it’s up to you. Once you’ve got your pipeline, you can go into it and use the tooling again to pull in these shared modules from a community repository. Here I’ve pulled in SAMtools sort and BWA mem, and it fetches those modules, that code with everything that comes with it, and pulls it into my pipeline. Really, all that’s left then is to connect those channels I mentioned. I’ve got the building blocks of my pipeline there, provided for me by the community, and I just need to put them together. So nf-core tooling really provides a fantastic starting place for anyone building their own Nextflow pipelines: you can mix and match, and you’re building on community best practices. You’ve got all the learnings of thousands of scientists using Nextflow over many years, and you can be sure the modules are well tested and being used by many other people. So you’re benefiting from a huge pool of community knowledge.
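To illustrate the "pull in modules, then connect the channels" step (paths and the module signature simplified for illustration; real nf-core modules may take extra inputs), wiring an installed module might look roughly like:

```nextflow
// After `nf-core modules install samtools/sort`, the shared module code sits
// under modules/nf-core/ and just needs including and wiring with channels.
include { SAMTOOLS_SORT } from './modules/nf-core/samtools/sort/main'

workflow {
    // nf-core modules conventionally take [ meta map, file ] tuples.
    bam_ch = Channel.fromPath(params.bams)
        .map { bam -> [ [id: bam.baseName], bam ] }

    SAMTOOLS_SORT(bam_ch)
}
```

The meta-map convention is what lets community modules interoperate: every module agrees on the shape of the data flowing through the channels.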

Phil Ewels: Okay, that’s it for my introduction. So next I’m going to touch on some of the new developments in Nextflow.

Phil Ewels: Nextflow itself is developed at Seqera, and we have a team of engineers working on Nextflow. Over the last year or so we’ve had some pretty major projects based on the community survey that we do; we try to do one of these almost every year. For as long as I can remember, people would always say that they love Nextflow but find it really difficult to work with: the error messages are unhelpful, the syntax is confusing, and there’s none of the nice stuff that people are used to when they use other programming languages. I myself write a lot of Python; MultiQC is written in Python, for example. So I absolutely sympathize with these requests. We really went back to the drawing board with Nextflow about a year and a half ago and said, okay, how do we solve these problems? Basically, we took on a really massive project: we completely rewrote how Nextflow understands Nextflow code. In the past, Nextflow was what’s called a Groovy DSL, a domain-specific language. The way it worked was that you wrote your Nextflow script, it was cross-compiled into Groovy code at runtime, and then the Nextflow engine would run that Groovy code. That’s still kind of the case, but now we have a new language parser which takes your Nextflow code and is able to natively understand the syntax that you’ve written. This really changes the game for us in terms of what we’re able to provide for developer tooling, for error messages, and things like this, and it means we’re moving away from the days of Nextflow being a Groovy DSL; Nextflow starts to become its own native language.

Phil Ewels: One of the first things that was possible with this was that we launched a language server, an LSP, which um, and we incorporated that into the VS Code extension, which is probably the best way to write Nextflow code. And so suddenly we were able to bring up this developer experience for writing Nextflow code to be in line with other languages that you might be used for. The simplest thing is error reporting. Just being able to see in real time as you write your code that something’s wrong rather than having to hit save, run the pipeline, and then try and figure out where the bug is when you’ve been writing code for half an hour. There’s things like quick navigation and auto formatting of your code. So, you don’t have to argue about whitespace and things like this. But, picture’s worth a thousand words. So, let’s have a quick couple of kind of examples of what I mean. This is one of the simplest things but probably the most impactful is just the little wiggly lines that you can now get when you’re writing Nextflow code. Here you can see that the red line is telling us that that variable is unknown and it’s not defined and the clue is just above it where we have defined a variable called locations with an s and we’re also getting a warning there that we’ve defined a variable and it’s not being used anywhere and these hints are being shown as you write your Nextflow code. So it’s a huge productivity boost to writing Nextflow. There’s features like this where we have tool tips over every Nextflow language item. So when you hover over in this case a channel factory, but it can be any part of the Nextflow syntax really, you get a short description about what that is and what it’s doing. And then there’s also a link underneath to read more and that takes you straight to the Nextflow docs. There’s things like this where there’s special little buttons that pop up in certain places in your workflow. 
So if you have valid syntax, your top level workflow now has this button saying preview DAG or D-A-G. And you click that and it will show you a mermaid diagram of your whole workflow in the sidebar right there in VS Code. And so this is a great way to get to grips with a new workflow which maybe you haven’t worked on before and you’re inheriting from someone else. I had about six of these slides in, but I thought they were a bit too much, so I pared it back down. But this is just a taste. There’s many different things like this now built into VS Code. So if you’re writing Nextflow Code now, it’s just vastly vastly better than it was a year ago or more. So if you’ve ever tried in the past, I recommend having another go now and seeing if the experience is better. Along with the language parser, other things that allows us to do is actually change and develop the syntax of Nextflow itself. Before we were kind of limited by what the Groovy Nextflow language parser could handle, but now because we have a totally separate step, we can develop the language however we want. And so we’re bringing out several improvements as a result of that. And something that’s been asked for for a long time is static types. So here we have some input parameters for a pipeline. On the left is the traditional way to do it. You save a name and you say a value, default value, then that’s that. But Nextflow just on the fly tries to typecast stuff based on the values it’s been given which sometimes leads to problems. For example, if you have a sample name as a string, but the sample name is given as something with leading zeros and it gets converted to a number. Now on the right hand side you can see we’re defining the types of each parameter whether it’s a path a boolean an integer a string so on and Nextflow will then strictly typecast those things on input and also validate that the values that’s been given are correct. 
So you'll get immediate validation and errors if you try to launch a Nextflow pipeline with the wrong kind of input. These kinds of things are small changes to the syntax, but they really make a huge difference to the reusability of Nextflow pipelines.
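[Editor's note: a minimal sketch of the two styles being contrasted here. The typed `params` block follows the new syntax as I understand it from the recent releases; parameter names and defaults are illustrative, so check the current Nextflow docs for the exact keywords.]

```nextflow
// Traditional style: params are untyped, so a sample ID like "007"
// may be silently coerced to the number 7 at launch.
params.input     = 'samples.csv'
params.min_len   = 50

// New typed style: declare the type of each parameter and Nextflow
// validates and casts inputs strictly when the pipeline is launched.
params {
    input: Path = 'samples.csv'    // must be a valid path
    sample_id: String              // leading zeros are preserved
    min_len: Integer = 50          // a non-integer value fails immediately
    save_intermediates: Boolean = false
}
```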

Phil Ewels: Another thing I mentioned was error messages. Here you can see one of the old-style error messages: because the code was compiled to Groovy before it ran, Groovy threw an error and it was really unhelpful. It was just pointing at a top-level squiggly bracket, and then you had to go through hundreds of lines of code to try to work out where the error actually was. Whereas now we throw the error at the language-parsing level, and here you can see it even indicates the exact character which is wrong, so you can go straight there. It's just, again, way better. And we have a lint command you can run, for example as part of your continuous integration tests, to find those validation errors before you even run the pipeline.

Phil Ewels: Okay, I need to speed up a bit. Other features we've been working on are kind of low-level features which you might not notice right away, but they're really foundational blocks for us being able to build a lot of cool stuff. Workflow outputs are a new way of defining where and how files are published at the end of pipelines. And data lineage gives a way of saving and storing all the information Nextflow has about the provenance of all of your data. When lineage is enabled, you can find out, for any given file, the entire analysis path it took through a pipeline and where it came from. And we can start to do some really nice things, such as passing inputs between pipelines.

Phil Ewels: Before I wrap up, I want to touch on a few things that we do at Seqera, which is the company that was formed around Nextflow. Nextflow is all open source, of course, and that's kind of been my focus professionally, but if you're running Nextflow, then Seqera has a lot of extra tooling that you can build on top of it. The key thing we have is something called Seqera Platform, which is basically a way to manage running Nextflow. Nextflow is a command-line tool, but when you're running a lot of Nextflow pipelines, it can be difficult to keep track of all those different runs: which ones threw errors, where they are, and where the data is. Seqera Platform provides an interface to launch and monitor different workflows. Importantly, it works with your compute: you connect it to your AWS account or your Slurm cluster, and all your pipelines are still running in the same places they were before; they're just being exposed through the Seqera Platform interface. This is an example of the kinds of things you can do once Seqera Platform is aware of the Nextflow pipeline and everything around it, that encapsulation of configuration, execution environment, and data. You can use it as a control plane which you can build on top of. One of my extra little projects in the past year is a plug-in for an open-source tool called Node-RED. There's a link here, but basically you can use this as a low-code platform for setting up automation. So here it might be that when a file is added to an S3 bucket, it automatically triggers a workflow; when that workflow finishes, it triggers a second workflow; and when that one is finished, it triggers the creation of an analysis studio which you can then go into and do your downstream analysis in, things like that. You can basically create any kind of automation, and this is all done via the APIs of Seqera Platform.
And so when you abstract away all the complexity of actually configuring and launching and maintaining all the infrastructure, you can start to build some really cool solutions.

Phil Ewels: We have a lot of tooling to make running your pipelines faster, cheaper, and better. A big one is Fusion, which handles all the file operations. Nextflow has traditionally been very well targeted towards working with huge data files, you know, your BAM files and your FASTQ files and everything, and Fusion is optimized specifically for Nextflow: it knows how Nextflow works, so it can really fine-tune for that use case. One of the latest things Fusion can do is snapshots. If you're running on cloud with spot instances, for example, AWS might tell you that you've got one minute before your instance is reclaimed, and snapshots will now freeze that running task so you can restart it without losing all the progress you'd made in a long-running task. And then this is like an everything-else slide, because I could give another two-hour talk about all these features. There's so much more. But if any of that sounds interesting, I'm happy to answer questions, or come back and talk about it more.

Phil Ewels: So, to wrap up: if you're interested in becoming more involved with Nextflow, writing your own pipelines, or getting involved in the community, we've got a smattering of links here. The top one, the Nextflow website, of course has all the documentation. We have a website called training.nextflow.io, which is all walk-through tutorials and training you can do yourself. We've just had a training week last week where we had over a thousand people registered in that one week. There are multiple different courses; the beginner one is called Hello Nextflow. I've done a set of video tutorials for each of those chapters, so you can follow along with me step by step as we work through all the worked examples, which is basically the best way to learn Nextflow. And that's all up to date now with all the latest Nextflow syntax. We have a very active community forum, so if you ever need any help, you can drop in there and ask a question and usually get a response very quickly. Another plug: I run a Nextflow podcast. At the moment I'm trying to do one every two weeks. We talk to all kinds of different people using Nextflow for different things, or about other tangentially related technical topics. It tends to be very technical deep dives, so that's kind of fun. We have a really good blog too. And then finally, we have a bunch of events coming up. In a few weeks' time we've got the nf-core hackathon, which is both online and at self-organized local sites all around the world. I think we have something like 20 or 30 local sites already, from Argentina to the UK to the US to Germany, all over the place. You're very welcome to join; it's a great way to get involved, and there are all different projects, so you can dive in and help people with their pipelines. And then we've got the two flagship Summit events.
One in Boston at the end of April, and then we'll have the main one, online with some in person, in Barcelona in October. And there are loads of Seqera Sessions and all kinds of other events if you click that link; there might well be something near you. I think there are Seqera Sessions coming up in London and a few other places soon.

Phil Ewels: With that, hopefully I'm about on time, and I'm happy to take any questions. I hope that was all clear and made sense and was useful.

Grant Belgard: Thanks, Phil. So, what's the easiest way for someone to get started with Nextflow?

Phil Ewels: So the training website, I think, is the best way to get started, really. The examples that we use with the Hello Nextflow training are domain-agnostic: we use cowpy to print a little cow to the terminal saying different messages and things, so you don't need to know anything specifically about RNA-seq or anything else. You can do most of that course in an afternoon, or a couple of afternoons, and it takes you from almost nothing all the way through to building your own pipeline, complete with Docker containers and everything. And it's all set up to work on GitHub Codespaces.
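[Editor's note: to give a flavor of what that first course builds towards, a complete Nextflow script really can be this small. This is a generic hello-world sketch, not taken verbatim from the course; process and channel names are illustrative.]

```nextflow
#!/usr/bin/env nextflow

// A process wraps one command; Nextflow handles staging and execution.
process SAY_HELLO {
    input:
    val greeting

    output:
    stdout

    script:
    """
    echo '$greeting'
    """
}

// The workflow block wires channels into processes.
workflow {
    Channel.of('Hello', 'Bonjour', 'Hola')
        | SAY_HELLO
        | view
}
```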

Grant Belgard: And for people who currently use Snakemake, how hard is it to migrate to Nextflow?

Phil Ewels: So yeah, I didn't really talk about any of the other workflow managers, but Nextflow is not alone in this field, and what I generally say to anyone is that using any workflow manager is better than using none. Snakemake especially, and Nextflow and WDL and others, share many of the same concepts: splitting up different tools, running them sequentially, and working out the DAG. Because of that, it's usually not too bad to convert from one to the other, especially with AI these days. We have our own Seqera AI, which is particularly good and well versed in the latest Nextflow syntax, and honestly, with many pipelines these days, you can just dump your Snakemake code in and say "convert this to Nextflow for me" and it will do a pretty good job almost on the first go. I'll admit that I'm kind of lazy and a bit of an AI advocate, so that's definitely what I would do in that situation.
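[Editor's note: to make the shared concepts concrete, here is roughly how a one-rule Snakemake step maps onto Nextflow. This is a hand-written sketch of the correspondence, not tool output; rule and process names are illustrative.]

```nextflow
/*
 * Snakemake original:
 *   rule gzip:
 *       input:  "{sample}.txt"
 *       output: "{sample}.txt.gz"
 *       shell:  "gzip -c {input} > {output}"
 */

// Nextflow equivalent: the wildcard becomes a channel of files,
// and the rule body becomes a process.
process GZIP {
    input:
    path txt

    output:
    path "${txt}.gz"

    script:
    """
    gzip -c $txt > ${txt}.gz
    """
}

workflow {
    Channel.fromPath('*.txt') | GZIP
}
```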

Grant Belgard: If someone has a pipeline that could be useful for nf-core, how do they go about adding it?

Phil Ewels: Yeah. So, nf-core is, like I say, kind of a unique community, because we don't just list any pipeline. It's specifically community-owned, and there's only one pipeline per data type. Because of this, it's not just a question of clicking a couple of buttons: you have to come forward and put in a proposal, and then we say yes or no, and there's a system for going through building your pipeline and adding it to nf-core. The short answer is: go to the nf-core website and click on the docs, and there's a guide on how to add your pipeline. Then there's an nf-core proposals website where you describe what it is you want to do and get a thumbs up.

Grant Belgard: Where do you see AI fitting into pipeline development in the next couple years?

Phil Ewels: A couple of years is difficult to say. I'm struggling to predict anything more than a month ahead at the moment, because things are changing so fast. But nothing in tech is going to be the same, and I'm sure pipelines will be included in that. We're starting to see it already, like I say, with converting between languages. We have our Seqera AI tool, and we're trying to take the rough edges off these tools, and it certainly lowers the barrier: Nextflow is known for not having the easiest learning curve, and AI makes getting started so much easier. Right now, I think the low-hanging fruit is that it's just much easier to write and debug your Nextflow pipelines using AI. As we go forward, I'm expecting more foundational changes in how we approach the whole concept of building up scientific analyses, to be honest.

Grant Belgard: Is Nextflow overkill if someone’s just running a few samples on their laptop?

Phil Ewels: It depends a bit on your background and how much Nextflow you've written. If you've never written Nextflow before, then is it worth learning the whole syntax and going through the whole process just so you can run a couple of samples? Maybe not. But once you've got familiar with Nextflow, I think it's a bit like wearing gloves when you're pipetting in the lab: you end up wanting to write Nextflow pipelines for everything, because it's self-documenting, it's automatically versioned, and you can rerun it any time in the future. When you try to remember what it was you did six months ago, you can just look at the Nextflow pipeline and it's there. So it ends up being quite a low lift. Of course, I'm a bit biased on this question, but I would say put everything in Nextflow pipelines; that's what I find myself doing.

Grant Belgard: Mhm. Is Seqera Containers free, and how does it compare to BioContainers or Docker Hub?

Phil Ewels: Yeah, so I didn't touch on this so much, but Seqera Containers is something we do on the Seqera side. Containers are key and fundamental to Nextflow and to the success of bioinformatics workflows: you can encapsulate the software in a clean environment on a per-process basis, so your versions of Python don't conflict, and so on. Almost every Nextflow pipeline you see now has these container declarations, and you might have 50 or 60 different steps in your pipeline and need to come up with a Docker container for every single one. The bioinformatics community has responded to this usage of containers in a few different ways. The BioContainers project has been wildly successful: basically every Conda package gets a Docker image for free, and we've been using BioContainers in nf-core for a long time. The limitation we found is that when you want a process in your pipeline that uses more than one tool, the whole procedure for generating one of those containers is quite convoluted. So we built a tool at Seqera called Wave, which is also open source, and which basically builds Docker containers on the fly. You add this into your Nextflow pipeline and say, "I want to run tool A and tool B in this process," and it will go off and request it. If Wave has seen it before, it will just give you the container straight away; if not, it will build it on the fly and then give it to you. Which is really cool, because it means you basically don't have to think about containers any more: they just magically happen. Seqera Containers is based on this technology, and it's exactly the same thing, but as a public repository.
So when you request your image, it gets built and then stored there for, we say, a minimum of five years, and then anyone can just fetch it and download it. We, for example, are now going to be using this in nf-core, where every single one of those 1,700 modules will have its own custom-built container: Docker and Singularity, x86 and ARM, all built automatically on the fly and then pinned for a long time for perfect reproducibility. And it's just free, and yeah, it works really well.
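[Editor's note: in a pipeline, this amounts to a couple of config lines plus a `conda` directive on the process, and Wave resolves the image. A hedged sketch: the `freeze` behavior and exact option names should be checked against the current Wave docs, and the process and tool versions are illustrative.]

```nextflow
// nextflow.config: enable Wave so containers are provisioned on the fly.
wave {
    enabled = true
    freeze  = true          // pin the built image for reproducibility
}
docker.enabled = true

// In the pipeline, declare the tools a process needs and let Wave
// build (or fetch) a container that includes both of them.
process MULTI_TOOL_STEP {
    conda 'bioconda::samtools=1.19 bioconda::bcftools=1.19'

    input:
    path bam

    script:
    """
    samtools index $bam
    """
}
```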

Grant Belgard: When can one start using static types in Nextflow?

Phil Ewels: So the syntax example I showed with those params, you can do that now. We do two Nextflow releases every year, one major release in April and one in October, and the 25.10 release came out with that syntax, so you can use it for parameters today. We're working on developing more syntax, which will come out in the next major release, 26.04, and which will have strong typing through pretty much all of your pipeline code. That will really carry the concept through, and then you'll have a lot more validation, because as you're connecting all your processes with all these different channels, if you say that this process has an output which goes into this one, it will tell you immediately: well, you can't do that, because those are different types. So we're going to have that in a few months, but already today you can do typing for the input parameters of the pipeline.

Grant Belgard: And lastly how do you go about deciding if it’s worth updating an ancient DSL1 pipeline?

Phil Ewels: Yeah. So, for those who don't know what DSL1 and DSL2 are: back when Nextflow started, it was this Groovy DSL, and that term got bandied around a lot. Then around 2020, I think, there was a major language update. We used to have these huge monolithic scripts of thousands of lines of code, and DSL2 changed a bunch of the syntax. One of the things it allowed us to do was break code out into different files, these modules which we now rely on, like I say, for that level of granularity and testing and community. But that change from DSL1 to DSL2 was quite painful; it was quite hard work doing a lot of the rewrites, which I should say we're taking great pains to avoid with the new syntax updates. We're doing it much more gently, and there's also a lot of tooling to automatically update code. So if you have an ancient pipeline in DSL1 and you want to leapfrog all this and bring it forward, what, six years in terms of syntax (it's surprisingly common to have this question), you basically have a couple of options. Probably the easiest is the same as converting from Snakemake: you chuck it into an AI tool and say, "rewrite this pipeline for me." Or you start from scratch, copy the logic over into the new syntax, and take the nf-core template or something. Or, if you really want to and you're a bit of a sadist, you can go through and try to update all the syntax line by line, which is doable, but it's like a software migration: you'll probably have to go DSL1 to DSL2, and then DSL2 to the new syntax. It's doable.
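[Editor's note: the headline DSL2 change mentioned here, breaking monolithic scripts into importable modules, looks like the sketch below. File and process names are illustrative.]

```nextflow
// modules/fastqc.nf -- one process per file, individually testable and shareable.
process FASTQC {
    input:
    path reads

    output:
    path 'fastqc_out'

    script:
    """
    mkdir fastqc_out
    fastqc -o fastqc_out $reads
    """
}

// main.nf -- DSL2 lets you import processes instead of pasting them inline.
include { FASTQC } from './modules/fastqc'

workflow {
    Channel.fromPath(params.reads) | FASTQC
}
```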

Grant Belgard: Well, Phil, thank you so much for joining us. Thanks to all our listeners.

Phil Ewels: It’s a pleasure. Thanks very much for inviting me.