BigHat Biosciences publishes its first peer-reviewed article, "Effective Surrogate Models for Protein Design with Bayesian Optimization," at the prestigious International Conference on Machine Learning (ICML) Workshop on Computational Biology (WCB). The ICML WCB highlights how machine learning approaches can be tailored to make both translational and basic scientific discoveries with biological data. This year’s WCB showcases pioneers at the intersection of computation, machine learning, and biology who work with large, complex datasets and use new methods to interpret these collections of high-dimensional biological data to aid in drug discovery.
Specifically, BigHat researchers collaborated with advisor and Bayesian deep learning expert Andrew Gordon Wilson of NYU [Scholar, ICML 2020 Tutorial] to develop a framework for protein design that requires only a small amount of labeled data. The real-world utility of these methods was demonstrated by optimizing the Stokes shift of green fluorescent protein (GFP). This paper serves as the first public illustration of how BigHat leverages its rapid design/build/test platform for data-driven antibody discovery and engineering.