Graphing Biodiversity to Improve Drug Discovery

Basecamp Research is a company founded in 2019 by biologists Glen Gowers and Oliver Vince that aims to improve drug discovery by combining graph and AI technologies. The founders recognized that the proteins and enzymes existing in nature were insufficiently cataloged and set out to create a comprehensive knowledge graph, the BaseGraph, which serves as a digital twin of the natural world. The graph runs on the Neo4j graph database and consists of 5.5 billion biological relationships, more than any public database.

One key aspect of BaseGraph is its inclusion of environmental data, which is essential to understanding how proteins and enzymes behave in different conditions. Basecamp collaborates with scientists to collect data from remote locations, which helps ensure the integrity of its findings, and it has committed to contributing a portion of its earnings to conservation efforts in those areas.

The graph contains three types of data: environmental and geological data, microecology, and protein characteristics. It is growing rapidly, adding approximately 500 million new biological relationships every four weeks. Because the data is stored in a graph database, users can discover hidden relationships within it, which is particularly useful for studying previously uncharacterized microorganisms, referred to as “microbial dark matter.”
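
To make the idea of "hidden relationships" concrete, here is a minimal sketch in plain Python of how a protein can be linked back to the environmental conditions it was found in by following a chain of relationships. This is not BaseGraph's schema or Neo4j's API: the node names, relationship types, and the `conditions_for` helper are all hypothetical, chosen only to mirror the three data layers described above.

```python
from collections import defaultdict, deque

# Hypothetical toy data as (source, relationship, target) triples.
# None of these identifiers come from BaseGraph; they only illustrate
# the environmental, microecology, and protein layers.
EDGES = [
    ("sample:hot_spring_A", "HAS_CONDITION", "condition:temp_85C"),
    ("sample:hot_spring_A", "CONTAINS", "organism:unclassified_1"),
    ("sample:glacier_B", "HAS_CONDITION", "condition:temp_2C"),
    ("sample:glacier_B", "CONTAINS", "organism:unclassified_2"),
    ("organism:unclassified_1", "ENCODES", "protein:enzyme_X"),
    ("organism:unclassified_2", "ENCODES", "protein:enzyme_Y"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map so paths can be walked either way."""
    adjacency = defaultdict(set)
    for source, _relationship, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def conditions_for(protein, adjacency, max_hops=3):
    """Breadth-first search from a protein to condition nodes within max_hops."""
    seen = {protein}
    queue = deque([(protein, 0)])
    found = set()
    while queue:
        node, depth = queue.popleft()
        if node.startswith("condition:"):
            found.add(node)
            continue  # conditions are endpoints; do not walk through them
        if depth == max_hops:
            continue
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return sorted(found)


if __name__ == "__main__":
    graph = build_adjacency(EDGES)
    print(conditions_for("protein:enzyme_X", graph))
```

The protein and the temperature condition share no direct edge; the link only appears by walking protein to organism to sample to condition. In a graph database such as Neo4j, this kind of multi-hop traversal is typically expressed as a path pattern in a query rather than hand-written search code.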

The company is also leveraging AI technology, including large language models, to enhance drug discovery further. Early successes have included discovering new enzymes and optimizing chemical manufacturing processes, demonstrating the effectiveness of their approach in accelerating pharmaceutical research.