Next-generation cell line selection methodology leveraging data lakes, natural language generation and advanced data analytics

Stephen Goldrick; Haneen Alosert; Clare Lovelady; Nicholas J. Bond; Tarik Senussi; Diane Hatton; John Klein; Matthew Cheeks; Richard Turner; James Savery; Suzanne S. Farid

Journal Article, Open Access


Frontiers in Bioengineering and Biotechnology (2023) 11

DOI: 10.3389/fbioe.2023.1160223


Abstract

Cell line development is an essential stage in biopharmaceutical development that often lies on the critical path. Failure to fully characterise the lead clone during initial screening can lead to lengthy project delays during scale-up, which can potentially compromise commercial manufacturing success. In this study, we propose a novel cell line development methodology, referred to as CLD4, which involves four steps enabling autonomous data-driven selection of the lead clone. The first step involves the digitalisation of the process and storage of all available information within a structured data lake. The second step calculates a new metric, referred to as the cell line manufacturability index (MICL), which quantifies the performance of each clone by considering the selection criteria relevant to productivity, growth and product quality. The third step implements machine learning (ML) to identify any potential risks associated with process operation and relevant critical quality attributes (CQAs). The final step of CLD4 takes into account the available metadata and summarises all relevant statistics generated in steps 1–3 in an automated report utilising a natural language generation (NLG) algorithm. The CLD4 methodology was implemented to select the lead clone of a recombinant Chinese hamster ovary (CHO) cell line producing high levels of an antibody-peptide fusion with a known product quality issue related to end-point trisulfide bond (TSB) concentration. CLD4 identified sub-optimal process conditions leading to increased levels of trisulfide bond that would not be identified through conventional cell line development methodologies. CLD4 embodies the core principles of Industry 4.0 and demonstrates the benefits of increased digitalisation, data lake integration, predictive analytics and autonomous report generation to enable more informed decision making.


Cite


Goldrick, S., Alosert, H., Lovelady, C., Bond, N. J., Senussi, T., Hatton, D., … Farid, S. S. (2023). Next-generation cell line selection methodology leveraging data lakes, natural language generation and advanced data analytics. Frontiers in Bioengineering and Biotechnology, 11. https://doi.org/10.3389/fbioe.2023.1160223
