October 18, 2021

AI for the Life Sciences: A Consultant Weighs In on Best Practices  

Life scientists at the recent Bio-IT World Conference & Expo were advised to lay the groundwork for AI with a good data infrastructure and a collaboration plan to support effective research. (Credit: Getty Images)

By Allison Proffitt, Editorial Director, AI Trends   

Surely one of the most anticipated sessions at the Bio-IT World Conference & Expo each year is the Trends from the Trenches session—a discussion led by consultants from BioTeam, a biotechnology consulting firm based in Middleton, Mass., highlighting the trends they have seen over the past year working with clients in the life sciences.    

Historically led by BioTeam co-founder and technical director for infrastructure Chris Dagdigian, the Trends talk has long been guarded about the role of artificial intelligence and machine learning in life sciences research. In 2018, Dagdigian warned that AI and machine learning (ML) were “in the hype phase,” but predicted eventual real-world practicality and value. In 2019, he highlighted how the technologies were driving innovation in storage, demanding very fast storage to keep up with the workflows. And in 2020, he again labeled ML and AI as among the most overhyped technologies, but blamed sales teams, not the technology itself. “Unlike blockchain,” he quipped, machine learning and AI are real, beneficial technologies driving transformation, but the terms are on marketing overdrive.

This year, he left the AI predictions to one of his colleagues: Fernanda Foertter, a former BioTeam consultant who has also worked at NVIDIA and is now director of applications at NextSilicon, a semiconductor startup targeting high-performance computing that has so far raised over $200 million in capital.

Fernanda Foertter, formerly of BioTeam, now director of applications at NextSilicon

AI/ML “is not ready for most of us,” Foertter opened—limiting her remarks to life sciences applications. “If any of you are under any impression that somehow you’re going to say, ‘I have all this data, and I’m going to implement AI at my organization, and we’re going to cure cancer,’ you are absolutely dead wrong.”    

But Foertter also warned the audience that AI could not be ignored. “I know I just said that AI is not ready, but dabbling in AI right now is going to improve the way you organize your data for when it becomes ready,” she said. “I’m here to reframe where you can put AI in your organization and how you can use it.”     

The reality, Foertter said, is that AI is primarily a research tool for now, and that’s how it should be used in life sciences organizations. She summarized the current state of the life sciences: organizations are swimming in data, all of it is disconnected, there is little continuity, and the person who understood any of it has left. Yet the marketing department has already published an announcement about how you’re using AI.

AI and ML require more than just having the data or even finding it. “Just because you search for it doesn’t mean you will find it. Just because you find it doesn’t make it usable,” Foertter said, reiterating a point she made in a workshop earlier at the event. FAIR data—findable, accessible, interoperable, and reusable—is hard to achieve, she emphasized, contending that it gets harder as you advance through the acronym.    

Best AI Practices for Life Science Companies Today 

But life sciences companies are making progress, and Foertter highlighted the top best practices.   

  • Hire data curators, she advised. Pick “failed scientists” who are tired of doing science but love and understand the data and the technology.   
  • Hire ethicists. “Do not begin doing AI—especially if you’re in healthcare—without an ethicist by your side. Just don’t,” she warned. “It’s just asking for trouble in the future.”   
  • Foertter encouraged tagging data as it comes off instruments and finding other ways of tagging data within existing processes.   
  • Start applying AI/ML to a new instrument or a new project, she encouraged. It’s too hard to address the whole legacy data lake. Start with the new.   
  • Begin with internal processes. “Make the lives of your internal scientists easier,” she said. “Do not think AI is going to be something you’re going to push through FDA approval. That’s hard!” Instead, use some AI to help your researchers decide which targets to push through the drug discovery process.   
  • Find ways to buy or share data. “You do not have enough data to do your own work. Period,” Foertter said. Thankfully, there is a lot of work being done on federated and safe data sharing. “In the future, leveraging other people’s data, and the secret sauce being the processes, not your data, is going to be the way you’re going to leverage AI.”
  • Reuse mature algorithms from Google and Facebook. Don’t reinvent the wheel, she implored. “This is not a research exercise,” she said to any company scientist seeking to create their own neural network model. “That’s a Ph.D. for somebody else. You’re in production; you’re not here to do research projects.”
  • Solve narrowly-focused, well-defined problems. Most of the AI/ML goals within life sciences companies are far, far too broad.   
  • And finally, Foertter recommended hiring consultants and having access to a variety of experiences, not just topical education.     
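
To make the advice about tagging data as it comes off instruments concrete, here is a minimal Python sketch of one way it could be done. It is an illustration only, not something presented in the talk: the function name `tag_instrument_file`, the metadata fields, and the `.meta.json` sidecar convention are all hypothetical choices, and a real lab would substitute its own vocabulary and storage system.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def tag_instrument_file(data_path, instrument_id, project, operator):
    """Write a JSON sidecar describing a raw instrument file; return the record.

    All field names here are illustrative assumptions, not a standard schema.
    """
    data_path = Path(data_path)
    payload = data_path.read_bytes()

    record = {
        "file_name": data_path.name,
        "size_bytes": len(payload),
        # A content hash lets later users confirm the file has not changed.
        "sha256": hashlib.sha256(payload).hexdigest(),
        "instrument_id": instrument_id,
        "project": project,
        "operator": operator,
        "tagged_at": datetime.now(timezone.utc).isoformat(),
    }

    # Store the tags next to the data, e.g. run_001.raw -> run_001.raw.meta.json
    sidecar = data_path.with_name(data_path.name + ".meta.json")
    sidecar.write_text(json.dumps(record, indent=2))
    return record
```

Calling `tag_instrument_file("run_001.raw", "seq-01", "pilot", "jdoe")` on an existing file would leave a small, searchable description beside it. The point of the sketch is that the tags are captured inside the existing acquisition step rather than reconstructed from a legacy data lake later.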

In all of these best practices, Foertter applauded companies laying the groundwork for AI success, even if a commercial-facing AI app is still in the future.

Her list of worst practices for companies was generally the opposite of the best ones: ignoring ethics and bias in the data, starting with too-big historical datasets, targeting AI for customer-facing apps, building custom models, trying to solve broad, poorly-defined problems, and believing the company’s own data is sufficient.

But she also flagged a few things she admitted were controversial. She cautioned against buying AI startups. Most AI startups are working with public data. “They are not going to do something extremely novel without better data, and there are not a lot of public datasets out there. Unless there are AI startups that are generating their own data in some way, and then doing the modeling themselves, chances are they’re not going to be any different than anybody else.” Great ideas can’t be well-tested on the same data everyone else has.     

Foertter also warned against outsourcing proprietary data lakes, where one vendor has promised to build the lake and tag your data for you. “If you can find a way to work within open source, you’ll be better off than staying within a vendor proprietary ecosystem. If there’s a vendor out there promising to tag your data for you, you’ll be stuck with that vendor for a very long time. Transferring out of that vendor will be a problem.”    

Foertter again warned against a “wait and see” approach. Now is the time, she said, to get a grasp on how hard it is to get the data you need and begin laying the groundwork for data collaborations for the future. “Make 2022 the year of building good infrastructure for your data.”

Learn more at NextSilicon. 
