A lack of effective prediction tools has limited the development of high-efficiency glycoside hydrolases (GHs), which are in high demand for numerous industrial applications. This proof-of-concept study demonstrates the use of a deep neural network and molecular evolution (MECE) platform for predicting catalysis-enhancing mutations in GHs. The MECE platform integrates a deep learning model (DeepGH) trained on 119 GH family protein sequences from the CAZy database. MECE also includes a quantitative mutation design component that uses Gradient-weighted Class Activation Mapping (Grad-CAM) with homologous protein sequences to identify key features for mutation in the target GH. This component can be used on this page.
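As a rough illustration of how Grad-CAM turns a model's internal signals into per-position importance scores, the sketch below implements only the standard Grad-CAM combination step for a one-dimensional feature map. It is not the DeepGH implementation: the function name `grad_cam_1d` and the toy numbers are hypothetical, and in practice the activations and gradients would come from a convolutional layer of the trained model. The homologous-sequence weighting described above is not covered here.

```python
def grad_cam_1d(activations, gradients):
    """Combine feature-map activations and class-score gradients into
    one importance score per position (standard Grad-CAM, 1D case).

    activations: list of channels, each a list of per-position activations.
    gradients:   same shape; gradient of the class score with respect to
                 each activation.
    """
    if not activations or len(activations) != len(gradients):
        raise ValueError("activations and gradients must have the same, non-zero channel count")
    n_positions = len(activations[0])
    for channel_act, channel_grad in zip(activations, gradients):
        if len(channel_act) != n_positions or len(channel_grad) != n_positions:
            raise ValueError("every channel must cover the same number of positions")

    # Channel weight: the gradient averaged over all positions.
    weights = [sum(channel_grad) / n_positions for channel_grad in gradients]

    # Weighted sum over channels at each position, followed by ReLU so that
    # only positions supporting the predicted class keep a positive score.
    cam = []
    for i in range(n_positions):
        score = sum(w * channel_act[i] for w, channel_act in zip(weights, activations))
        cam.append(max(0.0, score))
    return cam


# Toy input: two channels, three positions (hypothetical values).
toy_activations = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]
toy_gradients = [[1.0, 1.0, 1.0], [-1.0, -1.0, -1.0]]
toy_cam = grad_cam_1d(toy_activations, toy_gradients)
```

Positions with higher scores contributed more to the predicted class. Mapping feature-map positions back to individual residues depends on the model architecture and is left out of this sketch.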
Note: The MECE strategy consists of two components: (1) predicting the GH class and interpreting the model for each protein with the Grad-CAM tool, and (2) an evolutionary computation analysis, which involves a weighted analysis of the family and requires substantial computational resources. The web server offers only the first component. All source code can be obtained from GitHub (https://github.com/BRITian/MECE) and run on a local machine. The web server uses a prediction model with a maximum input length of 735 amino acids. If the input sequence is longer than 735 amino acids, only the first 735 residues are used to determine the GH class and analyze the model. Therefore, to analyze a protein longer than 735 amino acids, we suggest using either the catalytic domain sequence of the enzyme, which can be identified with the Conserved Domain Database (CDD), or the model that accepts sequences of up to 2000 amino acids, available on GitHub (https://github.com/BRITian/MECE).
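To see in advance how the 735-residue limit would affect a submission, a sequence can be checked locally before it is pasted into the server. The following is a minimal sketch, not part of the MECE code base: the helper names `read_single_fasta` and `truncate_for_server` are hypothetical, and the only value taken from the note above is the limit of 735.

```python
MAX_LEN = 735  # residue limit stated for the web server model


def read_single_fasta(text):
    """Return (header, sequence) from a FASTA string holding one record."""
    header = ""
    parts = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header or parts:
                raise ValueError("expected a single FASTA record")
            header = line[1:]
        else:
            parts.append(line.upper())
    return header, "".join(parts)


def truncate_for_server(sequence, max_len=MAX_LEN):
    """Return the residues that would be used and whether any were dropped."""
    kept = sequence[:max_len]
    was_truncated = len(sequence) > max_len
    return kept, was_truncated


# Example with a made-up 800-residue sequence.
demo_header, demo_seq = read_single_fasta(">demo\n" + "A" * 800 + "\n")
demo_kept, demo_cut = truncate_for_server(demo_seq)
if demo_cut:
    print(f"{demo_header}: only the first {len(demo_kept)} of {len(demo_seq)} residues would be used")
```

If the check reports truncation, submitting the catalytic domain sequence or running the 2000-residue model locally, as suggested above, avoids losing the C-terminal part of the protein.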