Implementing AI to Streamline Chromatographic Peak Picking

by Holden Galusha | Sep 12, 2024

Peak picking can see an efficiency boost with AI, but what is required to get up and running?

Traditionally, scientists have manually reviewed chromatograms to identify and quantify compounds in a sample. The process involves establishing a baseline to represent the signal in the absence of analytes, setting a threshold to differentiate true peaks from noise, and inspecting the chromatogram for peaks that meet these criteria. Identified peaks are validated by comparing their retention times, shapes, and spectral characteristics against known standards or reference materials.
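The criteria described above (a baseline, a noise threshold, and inspection for local maxima) can be approximated in a few lines of code. This is a minimal illustrative sketch with invented signal values and parameters, not a production peak picker:

```python
def pick_peaks(signal, baseline=0.0, threshold=5.0):
    """Return (index, height) for local maxima that rise above
    baseline + threshold -- a crude stand-in for manual review."""
    peaks = []
    for i in range(1, len(signal) - 1):
        height = signal[i] - baseline
        if (height > threshold
                and signal[i] > signal[i - 1]
                and signal[i] >= signal[i + 1]):
            peaks.append((i, height))
    return peaks

# Synthetic chromatogram: flat baseline, some noise, and two clear peaks
trace = [0.2, 0.1, 0.3, 6.0, 12.5, 7.1, 0.4, 0.2, 8.3, 15.0, 9.2, 0.3]
print(pick_peaks(trace))  # [(4, 12.5), (9, 15.0)]
```

Real chromatography software layers far more onto this skeleton (baseline drift correction, peak deconvolution, spectral matching), which is exactly where static algorithms start to show their limits.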

As chromatographs shifted to record chromatograms digitally instead of printing them onto long rolls of paper, algorithms were introduced in the chromatography software to help identify peaks and minimize noise. The software performs a preliminary peak identification, and the user confirms its validity.

While these algorithms are helpful, their utility is limited—after all, algorithms are static and cannot be tailored to an organization’s unique data. This is where artificial intelligence (AI) can take the baton.

Comparing Manual and AI Peak Picking

It’s important to acknowledge that every chromatography lab is different, so some may benefit greatly from AI while others may be hindered by it. Here are two breakdowns illustrating the pros and cons of manual and AI-powered peak picking, respectively:

Manual peak picking

Pros:

- Quality assurance: Scientists personally ensure their data is accurate and precise.
- Full context: Humans can examine data in full context, weighing factors that may not necessarily be quantified, in a way that AI cannot.

Cons:

- Time-consuming: Peak picking is a meticulous task that demands time, focus, and experience.
- Cognitive limits: While humans are excellent at recognizing patterns, some chromatography data may contain subtle patterns that only AI can perceive.

AI peak picking

Pros:

- Automated: AI automates the peak picking process, cutting turnaround time and freeing scientists to focus on work that requires a human touch.
- Accuracy: As mentioned, some data may contain patterns imperceptible to humans. In these cases, AI can heighten accuracy by identifying those patterns and accounting for them in the final output.

Cons:

- No contextual understanding: AI models cannot grasp the inevitable nuances of a situation, which may result in output that seems appropriate on paper but does not reflect reality.
- Lack of transparency: You cannot step through an AI model's "reasoning"; you can only infer how it arrived at a particular output. While this is sufficient for many applications, it requires scientists to exercise caution with AI.

Let’s say you’ve reviewed the pros and cons of both approaches, built a business case, and received approval to introduce AI in your peak picking process. What are the next steps?

What Do You Need to Implement AI?

To roll out an AI solution, you’ll need more than just chromatograms. There are two key ingredients to implementing AI: annotated, high-quality data and a platform to access the solution.

High-quality data

Data is the lifeblood of any AI solution. Commercial AI solutions generally come pre-trained on the vendor’s data set. While this is useful and sufficient for many use cases, labs should consider investing the time into collecting and annotating their own data, which will enable them to launch an AI solution tailored for them.

Annotation is the process of defining and differentiating between elements of interest in a data set, which will enable the AI model to learn those patterns and draw the same distinctions in future analyses.
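In practice, an annotation can be as simple as a structured record pairing each region of a chromatogram with a reviewer's verdict. The schema below is hypothetical, with field names chosen only to illustrate the idea:

```python
import json

# Hypothetical annotation schema: each entry labels a region of the
# chromatogram as a true peak or as noise, with the reviewer recorded
# for traceability.
annotations = [
    {"start_min": 2.10, "end_min": 2.45, "label": "peak",
     "compound": "caffeine", "reviewer": "analyst_1"},
    {"start_min": 3.02, "end_min": 3.08, "label": "noise",
     "compound": None, "reviewer": "analyst_1"},
]

# Serialized annotations like these, paired with the raw traces,
# form the training set the model learns from.
print(json.dumps(annotations, indent=2))
```

The exact format matters less than consistency: the model can only learn distinctions that the annotations draw consistently.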

After annotation, you can then further train the AI model on your lab’s data in a process known as fine-tuning. “Fine-tuning a model on a lab’s own data is highly beneficial as it increases accuracy by tailoring the model to the specific characteristics and nuances of the lab’s data,” says Edison Cerda, product manager, informatics at Agilent Technologies. “It also improves relevance,” Cerda continues, “ensuring the model’s outputs are more actionable for your specific use cases.” Annotating your lab’s data and fine-tuning an AI model on it is essential to your AI strategy.
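Conceptually, fine-tuning means resuming training from a pre-trained model's parameters on your own annotated examples, typically with a small learning rate so the model adapts without discarding what it already knows. The toy sketch below uses a single-feature logistic classifier; real peak-picking models are far larger, and all the numbers here are invented:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune(w, b, xs, ys, lr=0.05, epochs=200):
    """Resume gradient descent from pre-trained (w, b) on new data."""
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            grad = p - y           # dLoss/dz for log loss
            w -= lr * grad * x
            b -= lr * grad
    return w, b

# "Pre-trained" parameters from the vendor's data set (invented numbers)
w0, b0 = 0.8, -2.0

# The lab's annotated examples: x = peak height, y = 1 if a true peak
heights = [0.5, 1.0, 4.0, 6.0]
labels  = [0,   0,   1,   1]

w, b = fine_tune(w0, b0, heights, labels)
print(sigmoid(w * 5.0 + b) > 0.5)  # a 5.0-unit signal classifies as a peak
```

The key point is that training starts from the vendor's weights rather than from scratch, which is why a modest amount of well-annotated lab data goes a long way.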

Data platforms

Of course, all the data in the world means nothing if you can’t use it. You need a centralized platform that can (1) receive input to process data and (2) make the output readily available to users. This is not a straightforward process. “Integrating AI solutions with existing lab infrastructure and workflows can be complex and time-consuming,” Cerda says. Architecting the data pipelines that will enable the AI solution to receive data and then generate output requires technical expertise, deep collaboration with your organization’s IT staff, and a clear vision of how you will integrate the solution with your lab’s workflow.

Questions to consider for establishing data pipelines and access

How are you hosting your AI model, and how will your chromatography data make it to the model? Some chromatograph vendors offer cloud-based AI peak picking solutions delivered as software-as-a-service, which scale easily with your lab's usage. The downside is that subscribing to these services can be expensive, and they may only work with chromatographs made by that particular vendor, effectively locking you into that vendor's walled garden.

In contrast, there are also on-premises solutions available, such as the open-source project PeakBot. On-premises solutions do away with recurring subscription expenses and offer greater flexibility: they can analyze data from any chromatograph, provided that data is cleaned and standardized. The trade-offs, however, lie in cost and convenience.

According to Cerda, high-performance servers are required to handle the data processing and model training of on-premises solutions. These servers represent a significant capital investment, though their recurring operating costs may be lower than the cost of subscribing to a cloud AI platform.

The other factor to consider is convenience. A cloud solution makes access easy: it's either integrated with the chromatograph's software or available via a web browser. Meanwhile, an on-premises server demands network bandwidth, security measures, and a workstation to access the platform. All these factors must be weighed against each other when deciding between a cloud and an on-premises solution.

After deciding on a solution, you must then set up the data pipeline to move data from the chromatograph to the AI model. Cloud-based solutions handle this out of the box, piping all data into one model via the internet. If hosting on-premises, the ideal setup will hinge on the number of chromatographs you have, your internal network infrastructure, physical proximity, and other factors. One simple, but manual and time-consuming, solution is to export data from each chromatograph to a USB drive, walk it to the workstation hosting the AI model, and carry on from there. Ideally, you will be able to automatically send data to the model via your lab's internal network without exposing it to the internet. Consider hiring a lab informatics consultant to identify the best solution according to your needs and budget.
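For an internal-network setup, the pipeline can be as simple as a script that watches the folder where the chromatography software exports its data and forwards anything new to the locally hosted model. The directory names and the scoring function below are hypothetical stand-ins; a real deployment would call your actual model where `score_trace` is defined:

```python
import csv
import json
from pathlib import Path

# Hypothetical on-premises pipeline: the chromatography software exports
# CSV traces to EXPORT_DIR (e.g., a network share), and this script
# forwards anything new to the locally hosted model.
EXPORT_DIR = Path("exports")
RESULTS_DIR = Path("results")

def score_trace(rows):
    """Stand-in for the AI model; a real deployment would invoke the
    locally hosted model here instead of this fixed threshold."""
    return [{"retention_time": r["time"], "is_peak": float(r["signal"]) > 5.0}
            for r in rows]

def process_new_exports():
    RESULTS_DIR.mkdir(exist_ok=True)
    for csv_path in EXPORT_DIR.glob("*.csv"):
        out_path = RESULTS_DIR / (csv_path.stem + ".json")
        if out_path.exists():          # skip files already processed
            continue
        with csv_path.open(newline="") as f:
            rows = list(csv.DictReader(f))
        out_path.write_text(json.dumps(score_trace(rows), indent=2))

process_new_exports()
```

Even a sketch like this exposes the design questions the article raises: where the export share lives, who can read it, and how results get back to analysts, all of which is worth settling with your IT staff before committing to a platform.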

Author bio:
Holden Galusha is the associate editor for Lab Manager. You can reach him at hgalusha@labmanager.com.
