ABSTRACT
The ability to predict a protein’s local structural features from its primary sequence is of paramount importance for unravelling its function when no solved structures of the protein or its homologs are available. Here we present NetSurfP-2.0 (http://services.bioinformatics.dtu.dk/service.php?NetSurfP-2.0), an updated and extended version of the tool that predicts the most important local structural features with unprecedented accuracy and speed. NetSurfP-2.0 is sequence-based and uses an architecture composed of convolutional and long short-term memory neural networks trained on solved protein structures. Using a single integrated model, NetSurfP-2.0 predicts solvent accessibility, secondary structure, structural disorder, interface residues and backbone dihedral angles for each residue of the input sequences.
We assessed the accuracy of NetSurfP-2.0 on several independent validation datasets and found that it consistently produces state-of-the-art predictions for each of its output features. In addition to improved prediction accuracy, the processing time has been optimized to allow prediction of more than 1,000 proteins in less than 2 hours, and of complete proteomes in less than 1 day.
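
The run-time figures quoted above can be turned into a rough per-protein rate. The sketch below is only back-of-the-envelope arithmetic, not part of NetSurfP-2.0: the function names are hypothetical, and it assumes that run time scales linearly with the number of proteins, which the abstract does not state.

```python
# Rough throughput implied by "more than 1,000 proteins in less than 2 hours".
# Assumption (not from the source): run time scales linearly with the number
# of proteins, ignoring sequence length and any fixed start-up cost.

SECONDS_PER_HOUR = 3600


def seconds_per_protein(n_proteins: int, hours: float) -> float:
    """Average wall-clock seconds per protein for a batch."""
    if n_proteins <= 0:
        raise ValueError("n_proteins must be positive")
    return hours * SECONDS_PER_HOUR / n_proteins


def proteins_per_day(sec_per_protein: float) -> float:
    """Proteins processed in 24 hours at a constant per-protein rate."""
    if sec_per_protein <= 0:
        raise ValueError("sec_per_protein must be positive")
    return 24 * SECONDS_PER_HOUR / sec_per_protein


# Both quoted numbers are bounds, so this is an upper bound on the
# per-protein time and a lower bound on the daily throughput.
upper_bound_seconds = seconds_per_protein(1000, 2.0)
lower_bound_per_day = proteins_per_day(upper_bound_seconds)

if __name__ == "__main__":
    print(f"at most about {upper_bound_seconds:.1f} s per protein")
    print(f"at least about {lower_bound_per_day:.0f} proteins per day")
```

Under the linear-scaling assumption, the quoted bound corresponds to at most about 7.2 seconds per protein, or at least about 12,000 proteins in 24 hours. The abstract does not say which proteome sizes the one-day figure refers to.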