AI-Based Pathway Visualisation

PhD Project
Supervisors
Igor Goryanin, goryanin@ed.ac.uk
Shay Cohen, scohen@inf.ed.ac.uk

Project Description
The explosion of high-throughput data-acquisition techniques in biology (the -omics), combined with the rapid growth of sophisticated statistical and ML analysis algorithms, makes interpretability a key element of any biology-related data analysis pipeline. Pathway diagrams are a well-established technique for visualising data and modelling results, and a key step in biological interpretation. The Kyoto Encyclopedia of Genes and Genomes (KEGG) [https://doi.org/10.1093/nar/gkac963] is the most widely used set of pathway diagrams for mapping data. This collection was drawn manually at the beginning of the millennium and has hardly been updated since. The static nature of KEGG diagrams and their relatively large size make them useful for data comparison and interpretation. However, since 2011 KEGG has been a commercial, closed-source resource. Other pathway collections, such as MetaCyc [https://doi.org/10.1093/nar/gkz862] and Subsystems [https://doi.org/10.1186/1471-2164-9-75], have not attracted as much attention, partly because MetaCyc pathways are too fine-grained and partly because Subsystems has no visualisations at all.

The long-standing approach to the lack of visualisations has been to develop sophisticated graph layout algorithms [http://biorxiv.org/lookup/doi/10.1101/2023.12.23.573191]; however, the results are still far from manually created diagrams. Another problem with automatically generated layouts is that they vary between invocations even on the same network, which makes such visualisations difficult to interpret and useless for comparative analysis. Recent advances in large generative AI models make them good candidates for generating aesthetically pleasing yet easily interpretable pathway diagrams from standard biochemical network representation formats such as SBML and BioPAX. We propose to develop an AI MPW system that uses the Subsystems hierarchy of pathways as the foundation for a (semi-)automatically generated collection of static pathway diagrams linked to the original network representation and to open-source annotations of metabolites, reactions and enzymes. In the future, the AI MPW collection could be commercialised similarly to KEGG and/or as a ChatGPT app. No manual drawing will be required.
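
The reproducibility problem with automatic layouts mentioned above can be illustrated with a minimal Python sketch. It implements a toy force-directed layout, not any of the published algorithms cited here, and applies it to a hypothetical six-metabolite fragment of upper glycolysis. The function name spring_layout, the PATHWAY edge list and all numeric parameters are assumptions made purely for illustration.

```python
import math
import random

# Hypothetical toy network: edges between metabolites (illustrative only).
PATHWAY = [
    ("glucose", "G6P"),
    ("G6P", "F6P"),
    ("F6P", "F1,6BP"),
    ("F1,6BP", "DHAP"),
    ("F1,6BP", "GAP"),
]


def spring_layout(edges, seed, iterations=50):
    """Toy force-directed layout (illustrative only); returns {node: (x, y)}."""
    rng = random.Random(seed)  # the seed fixes the random starting positions
    nodes = sorted({node for edge in edges for node in edge})
    pos = {node: [rng.random(), rng.random()] for node in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred distance between nodes

    for step in range(iterations):
        max_move = 0.1 * (1.0 - step / iterations)  # shrinks every iteration
        disp = {node: [0.0, 0.0] for node in nodes}

        # Every pair of nodes repels.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                push = k * k / dist
                disp[a][0] += dx / dist * push
                disp[a][1] += dy / dist * push
                disp[b][0] -= dx / dist * push
                disp[b][1] -= dy / dist * push

        # Connected nodes attract.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            pull = dist * dist / k
            disp[a][0] -= dx / dist * pull
            disp[a][1] -= dy / dist * pull
            disp[b][0] += dx / dist * pull
            disp[b][1] += dy / dist * pull

        # Move each node along its net force, capped at max_move.
        for node in nodes:
            length = max(math.hypot(*disp[node]), 1e-9)
            move = min(length, max_move)
            pos[node][0] += disp[node][0] / length * move
            pos[node][1] += disp[node][1] / length * move

    return {node: (x, y) for node, (x, y) in pos.items()}


run_1 = spring_layout(PATHWAY, seed=1)
run_2 = spring_layout(PATHWAY, seed=2)
rerun_1 = spring_layout(PATHWAY, seed=1)
print("same seed, same drawing:", run_1 == rerun_1)
print("different seed, same drawing:", run_1 == run_2)
```

Only the random starting positions differ between the two seeds, yet the final coordinates of every metabolite change, so data mapped onto the two drawings cannot be compared visually. A static diagram collection avoids this by storing one fixed set of coordinates per pathway.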