Accurate de novo design of high-affinity protein-binding macrocycles using deep learning

Rettie, Stephen A.; Juergens, David; Adebomi, Victor; Bueso, Yensi Flores; Zhao, Qinqin; Leveille, Alexandria N.; Liu, Andi; Bera, Asim K.; Wilms, Joana A.; Üffing, Alina; Kang, Alex; Brackenbrough, Evans; Lamb, Mila; Gerben, Stacey R.; Murray, Analisa; Levine, Paul M.; Schneider, Maika; Vasireddy, Vibha; Ovchinnikov, Sergey; Weiergräber, Oliver H.; Willbold, Dieter; Kritzer, Joshua A.; Mougous, Joseph D.; Baker, David; DiMaio, Frank; Bhardwaj, Gaurav

doi:10.1038/s41589-025-01929-w

Article
Open access
Published: 20 June 2025

Nature Chemical Biology (2025)

Abstract

Developing macrocyclic binders to therapeutic proteins typically relies on large-scale screening methods that are resource intensive and provide little control over binding mode. Despite progress in protein design, there are currently no robust approaches for de novo design of protein-binding macrocycles. Here we introduce RFpeptides, a denoising diffusion-based pipeline for designing macrocyclic binders against protein targets of interest. We tested 20 or fewer designed macrocycles against each of four diverse proteins and obtained binders with medium to high affinity against all targets. For one of the targets, Rhombotarget A (RbtA), we designed a high-affinity binder (K_d < 10 nM) despite starting from the predicted target structure. X-ray structures for macrocycle-bound myeloid cell leukemia 1, γ-aminobutyric acid type A receptor-associated protein and RbtA complexes match closely with the computational models, with a Cα root-mean-square deviation < 1.5 Å to the design models. RFpeptides provides a framework for rapid and custom design of macrocyclic peptides for diagnostic and therapeutic applications.

Main

Macrocyclic peptides present a promising avenue for developing new therapeutics that bridge the gap between small-molecule drugs and large biologics^1,2. Biologics, while capable of binding diverse therapeutic targets with high affinity and selectivity, are usually unable to cross cell membranes because of their large size and high polarity, limiting them to extracellular targets. Conversely, small molecules can access intracellular targets but are not ideal for targeting proteins lacking deep hydrophobic pockets. In principle, macrocyclic peptides with sizes between small molecules and proteins can be developed to modulate molecular targets inaccessible to traditional therapeutic modalities³. The ability to develop custom protein-binding macrocycles for diverse protein targets would have many diagnostic and therapeutic applications. Traditionally, the development of peptide therapeutics has relied on natural product discovery or high-throughput screening of trillions of random peptides for target binding using display-based techniques^1,2. However, natural product discovery has several challenges, particularly synthetic difficulties, marginal stability and low mutational tolerance of identified hits⁴. While powerful, the high-throughput screening methods are time-intensive, cost-intensive and labor-intensive and only span a small fraction of the rich chemical and structural diversity accessible to macrocycles. Moreover, such approaches frequently fail to simultaneously optimize for multiple biophysical properties, such as target binding, selectivity and membrane permeability, because of the precise structural control required to achieve such functional properties⁵.

Structure-guided design methods offer a complementary approach to the library screening approaches, enabling rapid in silico exploration of a large chemical and structural diversity to design macrocycle binders for therapeutic targets. We previously developed physics-based methods for designing hyperstable constrained peptides, structured macrocycles and binders to protein targets by borrowing the motifs or interactions from previously described binding partners as anchors^6,7,8,9. However, despite the high accuracy observed in the design of monomeric macrocycles with these methods⁷, the design of protein-binding macrocycles has had limited success, achieving only modest binding affinities and, in many cases, with the experimentally determined structures not agreeing with the design models^7,8,10. The reliance on previously described binding partners for starting motifs also restricts such approaches to well-studied protein targets. In recent work, we described a pipeline for hallucinating and predicting the structures of macrocyclic peptide monomers by modifying AlphaFold2 (AF2) to include cyclic relative positional encoding (named ‘AfCycDesign’)¹¹. Other promising deep learning (DL) methods were described recently to predict the structures of macrocycles and macrocycle–target complexes^12,13 and to design peptide binders to protein targets^14,15,16. However, these methods have not been extensively structurally validated to date or shown to robustly perform atomically accurate de novo design of macrocyclic peptide structures in complexes with diverse protein targets. Computational methods that can accurately design high-affinity macrocycle binders de novo, using just the information of target structure or sequence, are required for wider therapeutic applications.

We reasoned that recent breakthroughs in generative DL methods could be leveraged to develop a robust pipeline for the accurate and efficient design of macrocycle binders. Diffusion models for protein design, such as RFdiffusion¹⁷, are trained to generate diverse protein structures from randomly initialized residues as starting points and have demonstrated remarkable success in designing protein monomers, binders and symmetric oligomers of medium-sized to large-sized proteins. However, despite considerable recent progress in DL-based protein design methods, these methods are not readily applicable to designing macrocyclic peptides. Developing analogous methods for peptide design from scratch has been challenging because of the limited availability of experimental data for training such models. To address these challenges, we set out to extend the RoseTTAFold2 (RF2)¹⁸ structure prediction network and the RFdiffusion¹⁷ protein backbone generation framework to incorporate cyclic relative positional encoding and enable the generation of the macrocyclic peptide backbones.

Extending RF2 and RFdiffusion for macrocycles

We began by examining the ability of the RF2 (ref. ¹⁸) structure prediction network to model known macrocyclic peptide structures. We implemented a modified (Methods) cyclic relative position encoding for RF2 (Fig. 1a) and observed robust prediction of natural cyclic peptide structures (Supplementary Fig. 1). Given this success, we reasoned that the same relative positional encoding should enable RFdiffusion¹⁷ to generate macrocyclic peptide structures because of its similar network architecture. We added the cyclic positional encoding scheme to RFdiffusion and observed robust generation of diverse macrocyclic peptides (Fig. 1b,c and Supplementary Fig. 2). Similar to the previously described work on designing monomeric cyclic peptides with physics-based methods⁷ and AfCycDesign¹¹, we observed 9,045 and 8,913 structurally unique 10-residue and 12-residue backbones, respectively, when 48,000 macrocycle backbones were generated for each size (Supplementary Fig. 2). The distribution of phi and psi values in these generated backbones is similar to the standard Ramachandran plot for protein structures (Supplementary Fig. 2), suggesting that generated backbones do not require extensive d-amino acids to stabilize the generated structures⁷. While we did not attempt to comprehensively enumerate the structural space of cyclic peptide monomers, RFpeptides can readily be scaled up to comprehensively cover the structural space accessible to macrocyclic peptides. Encouraged by the transferability of the cyclic positional encoding, we set out to use RFdiffusion for the de novo design of protein-binding macrocycles. We chose RFdiffusion for several reasons. Firstly, we expected the high experimental success rate of RFdiffusion^17,19 for protein binder design to carry over to macrocycle binder design. Secondly, de novo binder design with AfCycDesign as is would be far more computationally expensive and has not been successfully implemented or experimentally validated. 
Thirdly, the method can still take advantage of the current built-in conditional generation functionalities of RFdiffusion, such as epitope-specific targeting and ‘motif’ scaffolding. Lastly, the method should be directly transferable to other current and future RoseTTAFold-based design networks, such as RFdiffusion All-Atom²⁰, for incorporating nonpeptidic molecules (nucleic acids, ions, etc.) during design calculations.
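The idea behind a cyclic relative positional encoding can be illustrated with a minimal Python sketch. It assumes the simplest form of the idea, in which the sequence separation between two residues is measured along the shorter way around the ring, so that the first and last residues become neighbors. The modified encoding actually used in RF2 and RFdiffusion is described in Methods and may differ in detail; the function names here are hypothetical.

```python
def linear_offset(i: int, j: int) -> int:
    """Standard relative position: signed sequence separation from i to j."""
    return j - i


def cyclic_offset(i: int, j: int, n: int) -> int:
    """Signed separation from residue i to j on a ring of n residues.

    Illustrative assumption: the shorter path around the ring is used.
    For even n, the residue exactly opposite is reported as +n/2.
    """
    d = (j - i) % n   # steps going forward around the ring, 0..n-1
    if d > n // 2:    # the backward path is shorter
        d -= n
    return d


n = 10
print(linear_offset(0, n - 1))     # termini are far apart in a linear chain
print(cyclic_offset(0, n - 1, n))  # termini are adjacent in a macrocycle
```

With this kind of offset, a network that conditions on relative position sees the N-terminal and C-terminal residues as sequence neighbors, which is what allows a head-to-tail cyclized backbone to be generated or predicted.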

**Fig. 1: RFpeptides is a diffusion-based pipeline for the de novo design of protein-binding macrocycles.**

We modified the RFdiffusion protein binder design pipeline to use cyclic relative position encodings for the generated chain and standard positional encodings for the target and for the binder–target interchain indices (Fig. 1d). We then completed our design pipeline by using ProteinMPNN²¹ to design amino acid sequences compatible with the backbones generated by RFdiffusion (Fig. 1e). We chose ProteinMPNN for its improved performance in sequence design and ability to generate sequences with better solubility profiles over the sequences generated by traditional physics-based methods²². This pipeline readily generated macrocycles with diverse secondary structure content against target proteins (Fig. 1f) and the inclusion of standard RFdiffusion hotspot features clearly shifted the distribution of generated binders toward desired residues (Supplementary Fig. 3). We refer to this integrated pipeline as ‘RFpeptides’ throughout the remainder of the text.
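The mixed encoding described above can be sketched as a pairwise offset matrix for a macrocycle followed by a target chain: cyclic offsets within the macrocycle, ordinary linear offsets within the target and linear offsets across the two chains. This is an illustration under stated assumptions rather than the RFpeptides implementation; in particular, `chain_gap` (an index jump used here to mark the chain break) and both function names are hypothetical.

```python
def cyclic_offset(i: int, j: int, n: int) -> int:
    """Shorter signed separation from i to j on a ring of n residues."""
    d = (j - i) % n
    return d - n if d > n // 2 else d


def complex_offsets(n_pep: int, n_tgt: int, chain_gap: int = 200) -> list[list[int]]:
    """Pairwise offsets for residues ordered as [macrocycle | target].

    chain_gap is an assumed placeholder value, not taken from the paper.
    """
    total = n_pep + n_tgt
    m = [[0] * total for _ in range(total)]
    for a in range(total):
        for b in range(total):
            a_pep, b_pep = a < n_pep, b < n_pep
            if a_pep and b_pep:
                # both residues in the macrocycle: wrap around the ring
                m[a][b] = cyclic_offset(a, b, n_pep)
            elif not a_pep and not b_pep:
                # both residues in the target: standard linear offset
                m[a][b] = b - a
            else:
                # across chains: linear offset pushed apart by the gap
                m[a][b] = (b - a) + (chain_gap if b > a else -chain_gap)
    return m


offsets = complex_offsets(n_pep=4, n_tgt=3)
for row in offsets:
    print(row)
```

Only the macrocycle block of the matrix wraps around; the target keeps its usual linear indexing, so the target features seen by the network are unchanged relative to standard binder design.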

De novo design of macrocyclic binders to myeloid cell leukemia 1 and MDM2

We selected myeloid cell leukemia 1 (MCL1) as our first target protein, given the availability of multiple high-resolution X-ray crystal structures to initiate the design calculations. MCL1 is also a promising target for anticancer therapeutics because of its roles in autophagy, cell survival, DNA repair and cellular proliferation²³. For targeting MCL1, we used RFpeptides to generate 9,965 diverse cyclic peptide backbones, followed by four iterative rounds of ProteinMPNN and Rosetta Relax to design four amino acid sequences for each generated backbone. We expected the local changes to the generated backbone during the Rosetta Relax steps to allow for improved amino acid sequence diversity from the ProteinMPNN steps. While there are other ways to achieve increased sequence diversity, including generating multiple sequences per backbone from ProteinMPNN or adding noise during ProteinMPNN sequence generation, we did not explicitly try or compare them in this study. For downselecting the design candidates for experimental testing, we used AfCycDesign to repredict the designed macrocycle–target complexes from the macrocycle sequence and the target structure as a template. We selected the designs on the basis of the confidence metric (interface predicted aligned error (iPAE)) and the similarity between the original design model and the protein–macrocycle complex predicted by AfCycDesign (Supplementary Fig. 4). For further stringency in the design selection process, we also used RF2 to repredict the complex structures, reasoning that the design models predicted identically by two orthogonal structure prediction networks (AfCycDesign and RF2) should have a higher likelihood of binding to the target as designed. However, the 1,984 selected designs at this stage were still more than the number of designs we could reasonably synthesize and test experimentally. Therefore, we next used Rosetta²⁴ to calculate the ‘physics-based’ metrics of interface and macrocycle quality, such as calculated binding affinity (ddG), spatial aggregation propensity (SAP) of the designed macrocycle and the molecular surface area of the interface contacts (CMS) (Supplementary Fig. 4).

After strictly filtering the designed candidates on DL-based and physics-based metrics, we selected 27 designs for synthesis and for biochemical and biophysical characterization. Despite specifying no hotspots to guide the generation process to a specific patch on the MCL1 structure, all selected designs bound to the functionally relevant MCL1–BH3 interaction site (Supplementary Fig. 5). While all selected designs include an α-helical segment, they feature different sequences, macrocycle placement and target interactions (Supplementary Fig. 5 and Supplementary Table 1a). In addition to the common helical motifs, the loop regions of the selected macrocycles also contribute extensive side-chain-mediated and backbone-mediated interactions to the binding interface. During the chemical synthesis using Fmoc-based solid-phase synthesis (Methods), the yields for the correctly cyclized product for 13 designs were low and insufficient for further characterization. We tested the remaining 14 macrocycles for binding to biotinylated MCL1 using surface plasmon resonance (SPR) single-cycle kinetics experiments (Supplementary Fig. 6). Three macrocycles showed binding to MCL1, with the best binder, MCB_D2 (MCL1 binding design 2) (Fig. 2a), demonstrating a binding affinity of 2 µM (Fig. 2b). To confirm whether the designed macrocycle adopts the designed structure and engages MCL1 in the designed binding mode, we determined the X-ray crystal structure of MCB_D2 bound to MCL1 at 2.1 Å resolution. The crystal structure was nearly identical to the design model, with a root-mean-square deviation (r.m.s.d.) of 0.7 Å over all of the Cα atoms of the macrocycle with target chains aligned (Fig. 2c) and Cα r.m.s.d. of 0.4 Å within the macrocycles when aligned (Fig. 2d). The side-chain rotamers of the interacting residues in the crystal structure also closely matched the design model (Fig. 2d). The crystal structure also confirmed that the binding interactions are not restricted to the helix region of the designed macrocycle but are also contributed by the loop regions (Fig. 2e,f). While several hydrophobic interactions from the MCB_D2 helical segment are similar to those seen in natural MCL1 binders (for example, BH3 peptide) (Supplementary Fig. 7), the N-to-C orientation of the helix is flipped in the case of MCB_D2. The loop region of MCB_D2 makes additional hydrophobic contacts and a cation–π interaction with MCL1 (Fig. 2e and Supplementary Fig. 7d) that we did not observe in previously reported natural MCL1 binders and their analogs. All three hits with an observable binding signal at 100 µM featured this cation–π interaction.
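The two r.m.s.d. values reported for each complex differ only in which atoms are used for superposition: aligning on the target chain measures how well the binding mode is reproduced, whereas aligning on the macrocycle alone measures how well its conformation is reproduced. The sketch below shows only the final step, the Cα r.m.s.d. between two coordinate sets that are assumed to be already superimposed and listed in the same residue order; the superposition itself (for example, with the Kabsch algorithm) is not shown, and the function name and toy coordinates are hypothetical.

```python
import math

Coord = tuple[float, float, float]


def ca_rmsd(model: list[Coord], crystal: list[Coord]) -> float:
    """R.m.s.d. in Å between matched Cα atoms of pre-superimposed structures."""
    if not model or len(model) != len(crystal):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, crystal)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Toy coordinates for illustration only (not taken from any structure).
design = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
xray = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(ca_rmsd(design, xray))
```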

**Fig. 2: De novo design and characterization of macrocyclic binders to MCL1 and MDM2.**

Encouraged by the experimental validation of the MCL1 binding macrocycles, we next sought to design binders to MDM2, an E3 ligase that interacts with tumor suppressor protein p53 and has multiple critical roles in tumor growth and survival²⁵. We generated 10,000 macrocycle backbones spanning diverse lengths amenable to chemical synthesis (16–18 residues) and designed four amino acid sequences for each generated backbone using iterative rounds of ProteinMPNN and Rosetta Relax protocols (Methods). Design models were filtered on the basis of the confidence metrics and similarity of the AfCycDesign predictions to the designed complexes and the interface quality metrics calculated using Rosetta (Supplementary Fig. 4). AfCycDesign predicted 7,495 of the 40,000 design models to bind MDM2 with high confidence (normalized iPAE < 0.3) (Supplementary Fig. 4). In contrast to our approach for MCL1, we chose not to do any additional filtering with RF2 as the results between AfCycDesign and RF2 were fairly consistent. We also adjusted the filter thresholds for in silico filters as their overall distribution differed substantially from the distribution observed for MCL1 (Supplementary Fig. 4). After filtering on interface metrics (Methods), we identified 17 designs with iPAE < 0.3, ddG < −50 kcal mol^−1, CMS > 300 Å² and SAP < 35. We selected 11 top-ranked designs by ddG for biochemical and biophysical characterization. The 11 selected designs had diverse sizes, shapes and sequences (Supplementary Fig. 8 and Supplementary Table 2); however, they were all predicted to bind the same site as the p53 transactivation domain (Supplementary Fig. 8). Three of the selected designs had poor yields during the cyclization step of the chemical synthesis, preventing further experimental characterization with them. We tested the remaining eight peptides for binding to the biotinylated MDM2 by SPR and identified three binders with observable binding signals at 100 µM (Supplementary Fig. 9). The best design, MDB_D8 (Fig. 2g), demonstrated a binding affinity of 1.9 µM in the SPR single-cycle kinetics experiment (Fig. 2h). The computational model for this design makes several key contacts at the interface that are similar to interactions observed in native MDM2–p53 complex structures (Fig. 2i and Supplementary Fig. 10)²⁵. Despite different overall structures, all three hits from the SPR screen had a similar binding motif composed of phenylalanine, tryptophan and either leucine or methionine from the helical segment of the macrocycle. Together, these data highlight the promising accuracy of the RFpeptides pipeline to design diverse macrocyclic binders for selected targets of interest.
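The MDM2 downselection described above (threshold filters on iPAE, ddG, CMS and SAP, followed by ranking on ddG) can be sketched in a few lines of Python. The thresholds are the ones quoted in the text; the `Design` record, the function names and the example values are hypothetical and serve only to show the logic, not the scripts used in the study.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    ipae: float  # normalized AfCycDesign interface PAE (lower is better)
    ddg: float   # Rosetta calculated binding affinity, kcal/mol (lower is better)
    cms: float   # molecular surface area of interface contacts, Å^2
    sap: float   # spatial aggregation propensity of the macrocycle


def passes_mdm2_filters(d: Design) -> bool:
    """Thresholds quoted in the text for the MDM2 campaign."""
    return d.ipae < 0.3 and d.ddg < -50.0 and d.cms > 300.0 and d.sap < 35.0


def select_top_by_ddg(designs: list[Design], k: int = 11) -> list[Design]:
    """Keep designs passing all filters, then take the k most favorable by ddG."""
    kept = [d for d in designs if passes_mdm2_filters(d)]
    return sorted(kept, key=lambda d: d.ddg)[:k]


# Made-up placeholder values for illustration only, not data from the study.
toy = [
    Design("a", ipae=0.25, ddg=-62.0, cms=340.0, sap=30.0),
    Design("b", ipae=0.35, ddg=-70.0, cms=400.0, sap=20.0),  # fails iPAE
    Design("c", ipae=0.10, ddg=-55.0, cms=310.0, sap=34.0),
    Design("d", ipae=0.20, ddg=-80.0, cms=290.0, sap=10.0),  # fails CMS
]
print([d.name for d in select_top_by_ddg(toy, k=2)])
```

Because the metric distributions differed between targets, the thresholds would need to be adjusted per campaign, as was done here relative to MCL1.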

De novo design of macrocyclic binders to γ-aminobutyric acid type A receptor-associated protein

We next set out to design binders against a target with a binding site that is structurally different from MCL1 and MDM2, formed by a mix of α-helices and β-strands (in contrast to all α-helical pockets of MCL1 and MDM2). We selected γ-aminobutyric acid type A receptor-associated protein (GABARAP) as the target, a protein responsible for mediating autophagy through its role in autophagosome biogenesis and recruitment of cargo, resulting in lysosomal degradation of damaged or surplus proteins and organelles²⁶. Peptide modulators against GABARAP could have therapeutic applications in the treatment of late-stage cancers²⁷ or as chimeric peptides for autophagy-mediated targeted protein degradation²⁸. Our target binding site for GABARAP, which is also the binding site for the native LC3-interacting region or Atg8-interacting motif²⁹, is formed by a mix of β-strand and α-helix secondary structures (Fig. 3 and Supplementary Fig. 13). For designing macrocyclic binders against the human GABARAP, we used a similar pipeline as described above for MCL1 and MDM2 (Methods) but we doubled the number of generated designs and defined six hotspot residues (Lys46, Lys48, Tyr49, Leu50, Phe60 and Leu63) to guide the macrocycle backbone generation to a specific site on the target (Fig. 3a,d). We generated 20,000 macrocycle backbones and designed the amino acid sequences using ProteinMPNN and Rosetta Relax protocols. Of the resulting 80,000 design models, we selected 335 macrocyclic designs on the basis of AfCycDesign (iPAE < 0.13) and Rosetta (ddG < −30 kcal mol^−1, SAP < 35 and CMS > 300 Å²) interface metrics (Supplementary Fig. 4). Instead of trying to synthesize and characterize all 335 cyclic peptides (which would have required substantial time and experimental resources), we clustered the 335 designs into 80 different clusters on the basis of their three-dimensional structures and selected representative designs from diverse clusters for further biochemical characterization. We selected 13 diverse macrocycles of 12–17 residues for synthesis and experimental validation (Supplementary Table 3 and Supplementary Fig. 11). Unlike the design candidates described above for MCL1 and MDM2, several of the selected macrocycles for GABARAP showed cyclic β-sheet structures with several edge–strand interactions with the target (Supplementary Fig. 11).

**Fig. 3: De novo design of high-affinity macrocycle binders to GABARAP.**

We successfully synthesized six designs with high purity (>90%) and tested them for binding to GABARAP using SPR (Supplementary Fig. 12). Two designs, GAB_D8 and GAB_D23, showed binding affinities of 6 nM and 36 nM, respectively (Fig. 3b,e). To further characterize the binding of GAB_D8 and GAB_D23, we tested the ability of these designs to disrupt the interaction of GABARAP with linear peptide K1 (a previously described binder to this site³⁰) in AlphaScreen assays. GAB_D8 and GAB_D23 demonstrated a half-maximal inhibitory concentration (IC₅₀) of 0.7 nM and 2.5 nM in the AlphaScreen assay, respectively (Fig. 3i). To our knowledge, GAB_D8 is the most potent macrocyclic GABARAP binder to date.

In crystallization trials, we did not obtain crystals of sufficiently high quality for GAB_D8 bound to GABARAP. We instead crystallized GAB_D8 bound to GABARAPL1, a homolog of GABARAP with 86% overall sequence identity and 100% sequence identity for residues within 5 Å of GAB_D8 in the design model. The X-ray crystal structure for GAB_D8 bound to GABARAPL1 matched very closely with the design model, with a Cα r.m.s.d. of 1.2 Å over the macrocycle when aligned by the target protein to the closest of the four copies in the asymmetric unit (Fig. 3c and Supplementary Fig. 13) and a Cα r.m.s.d. of 0.47 Å when aligned by macrocycle alone (Fig. 3g). Notably, the X-ray structure of the GAB_D8–GABARAPL1 complex showed two different bound conformations of GAB_D8, one that closely matched the design model and a second one that partially deviated from the design model (Supplementary Fig. 14), with a register shift nucleated by Thr10 from the macrocycle forming main-chain-mediated and side-chain-mediated hydrogen bonds with Lys48 on the target. GAB_D23 crystallized readily with GABARAP and also closely matched the design model with a Cα r.m.s.d. of 1.7 Å when aligned by the target (Fig. 3f) and Cα r.m.s.d. of 0.74 Å across the macrocycle alone (Fig. 3g). The X-ray crystal structure confirmed the key designed interactions, such as Trp5 and Ile8, with the main difference between the design model and the X-ray structure being the switch from a type I β-turn from Leu1 to Gly4 in the design model to a less regular conformation in the crystal structure, with a tendency for a type I′ β-turn from Glu2 to Trp5. While our original design models were predicted with single sequences as inputs to AF2, we retrospectively predicted the GAB_D8–GABARAPL1 and GAB_D23–GABARAP complex structures with multiple-sequence alignment (MSA) inputs. These MSA-based predictions of the designs matched even more closely with the X-ray crystal structures, with a Cα r.m.s.d. of 0.5 Å and 0.9 Å for the GAB_D8–GABARAPL1 and GAB_D23–GABARAP complexes, respectively, when aligned by the target structure (Supplementary Fig. 15). Overall, these data demonstrate the ability of our de novo design pipeline to identify high-affinity binders against targets with diverse pocket shapes and surfaces without requiring library-scale screening.

Design of macrocyclic binders to predicted structures

Given the high accuracy and binding affinity of macrocycles designed against selected targets, we next set out to design macrocyclic binders against targets without any experimentally determined structure. We reasoned that the high accuracy of RFpeptides could mitigate the inherent risk of designing against a predicted target structure. We designed macrocycles against Rhombotarget A (RbtA), a recently identified cell surface protein from the ESKAPE pathogen, Acinetobacter baumannii. There are no experimentally determined structures available for this protein and sequence-based searches against the Protein Data Bank (PDB) did not return notable matches to other protein structures. We predicted the structure of the 617-aa full-length protein using AF2 and RF2; both methods predicted similar overall structures (Cα r.m.s.d. of 0.4 Å over 509 residues excluding the signal peptide and transmembrane domain) with high confidence (predicted local distance difference test (pLDDT) > 90) (Supplementary Fig. 16). AF2 and RF2 both predicted two distinct extracellular domains: an N-terminal β-helix domain and a C-terminal Ig-like domain (Supplementary Fig. 16). While there were some differences in the predicted structures from AF2 and RF2, we decided to focus our binder design calculations on regions that were predicted nearly identically and with high confidence by AF2 and RF2. On the basis of our preliminary design runs without hotspots to guide the diffusion, we identified a patch in the N-terminal domain to pursue in our large-scale design calculations against this target and defined hotspots Leu144, Phe202, Phe204, Tyr206, Val208, Leu231 and Ala269 for peptide backbone generation (Fig. 4a). In contrast to the concave pockets targeted for MDM2 and MCL1, this selected patch for RbtA is considerably flatter and difficult to target with conventional computational and experimental approaches (Supplementary Fig. 17). We generated 20,000 backbones for macrocycle binders and designed four amino acid sequences for each backbone using iterative rounds of ProteinMPNN and Rosetta Relax. Designs were filtered using AfCycDesign confidence metrics and Rosetta interface metrics, as described in earlier sections (Supplementary Fig. 4). On the basis of these in silico metrics, we selected 26 designs for biochemical and structural characterization with AfCycDesign iPAE < 0.4, ddG < −30 kcal mol^−1, r.m.s.d. between the design model and AfCycDesign prediction < 1.5 Å and CMS > 300 Å² (Supplementary Fig. 18). The selected designs covered diverse sizes (13–18 aa), sequences, shapes and secondary structures (Supplementary Fig. 18 and Supplementary Table 4). We expressed the Avi-tagged version of the RbtA N-terminal domain (residues 20–458) and used it for binding screens using SPR. Four of 11 designs that were synthesized in sufficient quantity and purity showed a binding signal at 100 µM in our screens (Supplementary Fig. 19). On the basis of further binding experiments with SPR, we determined the dissociation constant (K_d) of the best binder, RBB_D10, to be 9.4 nM (Fig. 4b). The design model for RBB_D10 showed extensive contacts to the target with several side-chain-mediated polar contacts and hydrophobic interactions (Fig. 4f–h).

**Fig. 4: Accurate de novo design of a high-affinity cyclic peptide binder against the predicted structure of RbtA from A. baumannii.**

To confirm the structures of RbtA and RBB_D10 and the binding mode between them, we determined high-resolution X-ray crystal structures of apo and macrocycle-bound RbtA at 2 Å and 2.6 Å resolution, respectively. The apo structure of the RbtA N-terminal domain, which is also the first experimentally determined structure from this class of bacterial proteins, matched our AF2 and RF2 predictions for this target very closely, with an overall Cα r.m.s.d. of 1.2 Å and 1.1 Å between the X-ray structure of the RbtA N-terminal domain and the AF2-predicted and RF2-predicted structures, respectively (Fig. 4c). The complex structure also confirmed the structure and binding mode of our designed macrocycle, RBB_D10, with the X-ray structure matching the design model with an r.m.s.d. of 1.4 Å (Fig. 4d). Notably, the conformation adopted by the macrocycle in the X-ray structure, including the side-chain rotamers involved in interactions with the target, was almost identical to the design model with an r.m.s.d. of 0.4 Å (Fig. 4e–h). Together, these data highlight the high accuracy and success rates provided by RFpeptides even while designing macrocycles against targets without deep pockets or targets with no known structures.

Overall, these data show that RFpeptides can sample extensive structural and chemical diversity of macrocycles during the backbone and sequence generation steps against selected targets and, finally, select the shapes and sequences ideally suited for binding the target surface or pockets. The highest-affinity binders against each target are also predicted to fold into the bound conformations even in the absence of the target (Supplementary Fig. 20), suggesting that macrocycles are designed to fold into binding-competent conformations. For all four design campaigns described here, selected designs demonstrated good solubility in aqueous buffers despite not imposing any particular sequence constraints related to solubility during the sequence design step using ProteinMPNN²¹. Notably, combining DL-based and physics-based in silico filters helps to select medium to high-affinity binders. However, we note that the distribution of such metrics varies substantially across the four selected targets and adjustments to filtering thresholds were required on the basis of the shape and chemical composition of the target pocket. While in silico metrics enrich well for binders, the relative ranking within the selected designs does not perfectly match the experimental binding affinities. The highest-affinity binders for MDM2 and RbtA had the best or second-best iPAE values among the designs chosen for those targets (Supplementary Tables 2b and 4b); however, the hit peptides against MCL1 and GABARAP were not among the top three ranked designs (Supplementary Tables 1b and 3b). Integration with high-throughput methods in the future should enable testing of more designs and inform absolute threshold values and filtering schemes for the single-shot design of peptide binders to any arbitrary target.
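As a quick arithmetic summary of the four campaigns, the short sketch below tabulates, for each target, the number of designs tested for binding and the number reported as binders in the text above. The counts are copied from the preceding sections; for GABARAP, the text reports affinities for two of the six synthesized designs and does not state the outcome for the other four, so that entry is an assumption about how to count. The variable and function names are hypothetical.

```python
# target -> (designs tested for binding, designs reported as binders)
screens = {
    "MCL1": (14, 3),
    "MDM2": (8, 3),
    "GABARAP": (6, 2),  # two designs with reported nM affinities
    "RbtA": (11, 4),
}


def hit_rate(tested: int, hits: int) -> float:
    """Fraction of tested designs that were reported as binders."""
    return hits / tested


rates = {target: hit_rate(*counts) for target, counts in screens.items()}
for target, rate in rates.items():
    tested, hits = screens[target]
    print(f"{target}: {hits}/{tested} = {rate:.1%}")
```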

Discussion

Here, we describe RFpeptides, a generative DL pipeline for precise de novo design of macrocycle binders against a wide range of protein targets. The power of the approach is highlighted by the high affinities (K_d < 10 nM) of the designed macrocyclic binders to GABARAP and RbtA and the nearly identical X-ray crystal structures and design models of the macrocycle-bound MCL1, GABARAP and RbtA (Cα r.m.s.d. of 0.7 Å, 1.2 Å and 1.4 Å, respectively). The RFpeptides approach offers several advantages over traditional methods. Firstly, the design approach should enable faster and more efficient discovery of macrocyclic binders. Despite testing fewer than 20 designed candidates per target (in contrast to trillions of peptides tested in traditional library-based approaches), we achieved high-affinity binders for two targets without requiring any further experimental optimization; to our knowledge, this is a considerably higher success rate than achieved with any previous method. Secondly, in contrast to the untargeted nature of the random library-based approaches, RFpeptides can be used for designing custom binders to specific patches and sites, as demonstrated for GABARAP and RbtA. Lastly, the atomically accurate nature of the design models enables structure-guided optimization for properties beyond target binding (as well as further increases in affinity), bypassing the bottleneck of complex structure determination, which has hindered the optimization of leads from library screening. Combined with the design principles for membrane traversal, RFpeptides could enable the design of peptides simultaneously optimized for target binding and cell permeability or oral bioavailability.

RFpeptides also has considerable advantages over previous computational peptide design methods. Information on known ligands and/or binding partners is not required to initiate design. RFpeptides can design macrocycles completely de novo from just the structure or sequence (as in the case of RbtA) of the target, enabling design against molecular targets intractable with previous methods. RFpeptides is not limited to generating macrocycles with particular motifs or topologies; the diffusion process generates macrocycles with diverse shapes and sizes and selects the topologies appropriate for the protein being targeted. Among the four targets tested here, binders for MCL1 and MDM2 have helical motifs, binders for GABARAP have a β-sheet topology and binders for RbtA sample looplike conformations that make extensive contacts with the flat surface of this target.

We anticipate that RFpeptides will enable the rapid design of custom macrocyclic binders against a wide range of molecular targets, accelerating efforts to develop peptides for diverse functional applications. With the rapid advances in DL methods and frameworks, including the recent development of all-atom diffusion models, we aim to extend the approach to generative design of macrocycles with noncanonical amino acids, crosslinkers and cyclization chemistries.

Methods

Computational methods for cyclic peptide binder design

Macrocyclic peptide monomers and binders were designed with RFpeptides using a three-stage pipeline: backbone generation using RFdiffusion with the cyclic offset applied to the peptide chains, followed by sequence design using ProteinMPNN and, finally, structure prediction of the designed peptide–target complexes using AfCycDesign and/or RoseTTAFold with the cyclic offset applied to the peptide. Designs were further filtered and downselected using Rosetta metrics and, in some cases, clustered on the basis of Cα r.m.s.d. Detailed computational methods, including example scripts, can be found in Supplementary Section 2.2.
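The idea behind a cyclic offset can be illustrated with a small sketch. It assumes the offset amounts to replacing the linear sequence separation j − i between two peptide residues with the shortest signed separation around a ring of N residues, so that the first and last residues are treated as neighbours. The function name and the tie-breaking choice for exactly opposite residues are illustrative assumptions and are not taken from the RFpeptides code.

```python
def cyclic_offset_matrix(n_residues: int) -> list[list[int]]:
    """Signed relative-position offsets for a head-to-tail cyclic chain.

    For a linear chain the offset between residues i and j is simply j - i.
    For a cycle of n residues that value is wrapped onto the shortest signed
    path around the ring, so the two termini end up one step apart.
    """
    if n_residues < 1:
        raise ValueError("n_residues must be positive")
    offsets = []
    for i in range(n_residues):
        row = []
        for j in range(n_residues):
            d = (j - i) % n_residues      # steps going forward around the ring
            if d > n_residues // 2:       # shorter to go backward instead
                d -= n_residues
            row.append(d)                 # exact ties (d == n/2) stay positive
        offsets.append(row)
    return offsets


if __name__ == "__main__":
    for row in cyclic_offset_matrix(6):
        print(row)
```

With a linear encoding, residues 1 and N of an N-residue peptide would be N − 1 positions apart; with the wrapped offsets they are adjacent, which is what allows a network trained on linear chains to treat the peptide as closed.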

Peptide synthesis

Macrocyclic peptides described here were either purchased from Wuxi AppTec at greater than 90% purity or synthesized in-house using Fmoc-based solid-phase peptide synthesis. Peptides were typically synthesized on preloaded CTC resin. The resin was swollen in DCM followed by iterative deprotection with 20% piperidine in DMF and coupling with either HBTU (Sigma) or PyAOP (Novabiochem) and DIEA (Sigma). The linear peptides were cleaved from the resin using either 2% TFA in DCM or 20% HFIP (Oakwood Chemical) in DCM. The solvent was removed by rotary evaporation and linear protected peptides were cyclized in DCM, DMF or a mixture of both, depending on the solubility of the peptide, using two equivalents of PyAOP and five equivalents of DIEA overnight. The protecting groups were removed using a 95:2.5:2.5 cocktail of TFA, water and TIPS for 2.5 h. The crude peptides were precipitated using cold diethyl ether. The precipitate containing the crude cyclization reaction was dissolved in a mixture of water and acetonitrile for purification using reverse-phase high-performance liquid chromatography (LC). Peptide identities were confirmed by mass spectrometry (MS). Purities for all synthesized and tested macrocyclic peptides are also summarized in Supplementary Tables 7–10. The mass spectrograms and analytical LC chromatograms for all purified peptides are shown in Supplementary Section 4.

Protein expression and purification

MDM2 and MCL1

The amino acid sequences of MCL1 (PDB 2PQK)³⁷ and MDM2 (PDB 4HFZ)³⁸ were retrieved from the PDB. The optimized genes were then cloned into a Novogen pRSF-DUET plasmid (Sigma, 71341-3), incorporating a 6xHis-tag at the N terminus, followed by an Avi-tag and a tobacco etch virus (TEV) protease cleavage site. The resulting constructs were codon-optimized for Escherichia coli expression and synthesized by Genscript. For propagation, the plasmids were transformed into E. coli NEBα cells (New England Biolabs, C2987); for protein expression, the plasmids were transformed into E. coli BL21(DE3) cells (New England Biolabs, C2527). A single sequence-verified colony was cultured in 50 ml of kanamycin (50 µg ml⁻¹) selective Luria Broth (LB) medium. This culture was incubated at 37 °C with shaking at 200 rpm for 16 h overnight. Subsequently, 50 units of optical density at 600 nm (OD₆₀₀) of the overnight culture were transferred to 1 L of fresh kanamycin (50 µg ml⁻¹) selective LB medium. The culture was grown at 37 °C with shaking at 200 rpm for 2 h (until it reached an OD₆₀₀ of 0.4–0.5), at which point the temperature was decreased to 20 °C. The culture was grown until an OD₆₀₀ of 0.7–0.8; protein expression was induced by adding 1 mM IPTG and the culture was left to grow overnight for 14 h.
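As a worked example of the inoculation step, the sketch below converts a number of OD₆₀₀ units into a transfer volume. It assumes the common convention that one OD unit corresponds to 1 ml of culture at an OD₆₀₀ of 1.0; the overnight density of 4.0 used in the example is a hypothetical value, not one reported here, and the function names are illustrative.

```python
def inoculum_volume_ml(od_units: float, overnight_od600: float) -> float:
    """Volume of overnight culture (ml) that contains the requested OD units.

    Assumes 1 OD unit = 1 ml of culture at OD600 = 1.0.
    """
    if od_units <= 0 or overnight_od600 <= 0:
        raise ValueError("od_units and overnight_od600 must be positive")
    return od_units / overnight_od600


def starting_od600(od_units: float, fresh_medium_ml: float, inoculum_ml: float) -> float:
    """Approximate OD600 of the expression culture right after inoculation."""
    return od_units / (fresh_medium_ml + inoculum_ml)


if __name__ == "__main__":
    volume = inoculum_volume_ml(od_units=50, overnight_od600=4.0)  # hypothetical OD
    print(f"transfer {volume:.1f} ml of overnight culture")
    print(f"starting OD600 ~ {starting_od600(50, 1000, volume):.3f}")
```

Under these assumptions, 50 OD units in 1 L gives a starting OD₆₀₀ of roughly 0.05 regardless of how dense the overnight culture was, which is the point of specifying the inoculum in OD units rather than in millilitres.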

Cells were harvested by centrifugation at 5,000g for 10 min at 4 °C, resulting in a cell pellet with a density of 5 g L⁻¹. The pellet was immediately flash-frozen and stored at −20 °C for later use. For lysis, the pellet was thawed on ice and resuspended in 5 ml of lysis buffer per gram of pellet. This lysis buffer contained 50 mM Tris-HCl, 300 mM NaCl and 10 mM imidazole and was supplemented with 1× BugBuster protein extraction reagent (Sigma-Aldrich, 70921), 200 µg ml⁻¹ lysozyme (Sigma-Aldrich, L6876), 25 U per ml benzonase nuclease (Sigma-Aldrich, E8263) and 1× cOmplete EDTA-free protease inhibitor cocktail (Sigma-Aldrich, 11836170001). The buffer was filter-sterilized using a 0.2 µm filter before the addition of benzonase, mixed by inversion and kept on ice until use. Cells were completely resuspended in the lysis buffer using a homogenizer at low speed and incubated for 30 min at room temperature (22–25 °C). Following incubation, the suspension was sonicated using a Q500 Sonicator equipped with a four-tip probe. Sonication was conducted for 2–3 min using pulses of 10–15 s on followed by 10–15 s off at 70% amplitude. The lysate was clarified by centrifugation at 16,000g for 20 min.

Ni-NTA agarose resin (Qiagen, 30210) was equilibrated with 20 column volumes (CV) of ultrapure water, followed by 20 CV of equilibration buffer (50 mM Tris-HCl, 300 mM NaCl and 10 mM imidazole). Then, 4 ml of 50% resin suspended in equilibration buffer was used to bind His-tagged proteins from 25 ml of clarified lysate. All immobilized metal affinity chromatography (IMAC) steps were conducted at 4 °C. The lysate–resin mixture was incubated for 60 min on a rotary shaker set to a slow speed. After incubation, the resin was transferred to a 20-ml gravity column and allowed to completely settle. The resin was first washed with 20 CV of wash buffer 1 (20 mM Tris-HCl, 250 mM NaCl, 10 mM imidazole and 5 mM β-mercaptoethanol), followed by another 20 CV of wash buffer 2 (20 mM Tris-HCl, 500 mM NaCl and 35 mM imidazole). The bound proteins were then eluted with 8 ml of elution buffer (20 mM Tris-HCl, 250 mM NaCl, 350 mM imidazole and 2 mM DTT). Aliquots of the eluate were collected and analyzed using SDS–PAGE gels.

The eluate was loaded onto a pre-equilibrated Superdex 75 10/300 GL column (25 mM Tris-HCl, 250 mM NaCl and 2 mM DTT) and run at a flow rate of 0.6 ml min⁻¹ using an ÄKTA pure system for size-exclusion chromatography (SEC). Then, 1 ml fractions were collected from the elution volume of 8–16 ml and those corresponding to peaks in the absorbance at 280 nm between an elution volume of 10 and 13 ml were assessed with SDS–PAGE gels. Fractions confirming the expected molecular weight were pooled and concentrated by centrifugation at 4,000g for 30 min at 4 °C using Amicon Ultra-4 concentrators with a 3 kDa cutoff (Millipore Sigma, UFC800308) to a final volume of 500 µl. The identity of the eluted proteins was confirmed by MS using an Agilent 6230 LC–MS time-of-flight system.

Verified protein samples were processed for further applications: biotinylation for SPR analysis or tag removal by TEV protease cleavage for crystallography. Biotinylation was performed using the BirA biotin–protein ligase standard reaction kit (Avidity, BirA-500) according to the manufacturer’s recommended conditions. The reaction was carried out at 4 °C overnight on a slowly shaking platform. For TEV protease cleavage, the proteins were treated with a 25:1 protein to TEVd enzyme ratio³⁹. Similarly, the mixture was incubated at 4 °C overnight on a slowly shaking platform. Following these treatments, samples underwent a cleanup step using 1 ml of Ni-NTA resin per 20 mg of protein. The resin was pre-equilibrated with 10 CV of ultrapure water and 10 CV of a buffer containing 25 mM Tris-HCl, 250 mM NaCl and 10 mM imidazole. The pre-equilibrated resin was added to the protein mixture and incubated for 30 min on a rolling platform at 4 °C. Subsequently, the mixtures were filtered through a 0.45 µm PVDF centrifugal filtering unit to remove the Ni-NTA-bound substrates. The eluate was collected and dialyzed in 2 L of 25 mM Tris-HCl, 250 mM NaCl and 2 mM DTT using Slide-A-Lyzer G3 dialysis cassettes with a 3.5 kDa molecular weight cutoff (Thermo Scientific, A52966) overnight for 18 h at 4 °C with stirring. The dialyzed protein was concentrated to 0.2–0.5 ml (as required for downstream assays) using the Amicon Ultra concentrators (as above), aliquoted and flash-frozen. Fractions were analyzed by MS for the efficacy of the biotinylation and TEV protease cleavage treatments, as previously described.

GABARAP for SPR

A synthetic complementary DNA was designed on the basis of the amino acid sequence of GABARAP (UniProt O95166) and optimized for expression in E. coli using Benchling software. The construct was devised to include an N-terminal Avi-tag and TEV protease cleavage site and was cloned into the Novogen pET-50b(+) plasmid. This plasmid configuration introduced a tandem arrangement of protein tags at the N terminus: a 6xHis-tag, followed by a NusA solubility tag, another 6xHis-tag and a human rhinovirus (HRV) 3C protease cleavage site. Therefore, the final construct sequence was as follows: 6xHis–NusA–6xHis–HRV 3C–Avi–TEV–GABARAP. NusA was specifically chosen as a solubility tag because of its known effectiveness in enhancing protein solubility in E. coli⁴⁰,⁴¹. The construct was synthesized and cloned by Genscript.

As described above for MCL1 and MDM2 protein expression, the plasmids were introduced into E. coli NEBα cells and BL21(DE3) cells. A single sequence-verified colony was cultured in 50 ml of kanamycin (50 µg ml⁻¹) selective LB medium for 16 h at 37 °C, shaking at 200 rpm. Then, 50 OD₆₀₀ units of this culture were transferred to 1 L of fresh kanamycin (100 µg ml⁻¹) selective autoinduction medium (TBM-5052: 1.2% (w/v) tryptone, 2.4% (w/v) yeast extract, 0.5% (v/v) glycerol, 0.05% (w/v) d-glucose, 0.2% (w/v) d-lactose, 25 mM Na₂HPO₄, 25 mM KH₂PO₄, 50 mM NH₄Cl, 5 mM Na₂SO₄, 2 mM MgSO₄, 10 µM FeCl₃, 4 µM CaCl₂, 2 µM MnCl₂, 2 µM ZnSO₄, 400 nM CoCl₂, 400 nM NiCl₂, 400 nM CuCl₂, 400 nM Na₂MoO₄, 400 nM Na₂SeO₃ and 400 nM H₃BO₃). The culture was grown at 37 °C with shaking at 200 rpm for 2 h, at which point the temperature was decreased to 22 °C and the culture was left to grow for 16 h.

Cells were harvested, lysed and purified following the protocol outlined earlier for MCL1 and MDM2, with some modifications. The cultures yielded a cell pellet amounting to 15 g L⁻¹. Lysis was completed using an IKA T18 microfluidizer at 450 psi, followed by lysate clarification by centrifugation at 16,000g for 15 min. All IMAC steps were conducted at 22 °C, except for the incubation of the lysate–resin mixture, which was performed at 4 °C. Proteins bound to the resin were eluted with 5 ml of elution buffer (50 mM Tris-HCl pH 8, 250 mM NaCl and 300 mM imidazole). SEC was then performed using a Superdex 200 Increase 10/300 GL column (Cytiva) equilibrated with TBS (50 mM Tris-HCl pH 8 and 250 mM NaCl). Fractions confirmed by SDS–PAGE were pooled and concentrated using Amicon Ultra-15 concentrators with a 30 kDa cutoff (Millipore Sigma, UFC9030) to a final volume of 1 ml. Downstream processing for SPR analysis was performed as described previously, with one modification. For biotinylation, the protein was first cleaved using HRV 3C protease with the reagents and protocol provided by the Pierce HRV 3C protease solution kit (Thermo Scientific, 88946). The digested samples were subsequently purified and verified, as outlined in earlier sections.

GABARAP and GABARAPL1 for crystallography

GABARAP and GABARAPL1 were expressed as glutathione S-transferase (GST) fusion proteins after transforming E. coli BL21(DE3) T1 cells with pGEX4T2-GABARAP and pGEX4T2-GABARAPL1 plasmids, respectively. Bacteria were cultivated in LB medium containing 100 µg ml⁻¹ ampicillin; gene expression was induced with 1 mM IPTG at an OD₆₀₀ of 0.6–0.8 and allowed to proceed for 20 h at 25 °C. Afterward, cells were harvested by centrifugation at 3,000g for 30 min at 4 °C. The bacterial pellet was washed with PBS (137 mM NaCl, 2.7 mM KCl, 1.8 mM KH₂PO₄ and 10 mM Na₂HPO₄) and resuspended in lysis buffer (PBS supplemented with 5% (v/v) glycerol, 0.01% (v/v) β-mercaptoethanol, 10 µg ml⁻¹ DNase (AppliChem, A3778) and cOmplete EDTA-free protease inhibitor cocktail (Roche, 11836170001)) before application to the cell disruptor (Constant Systems, model TS1.1) for three cycles with 1.9 kbar at 4 °C. Lysates were cleared by centrifugation at 4 °C with 45,000g for 45 min. The GST fusion proteins were purified from the supernatant by affinity chromatography using glutathione Sepharose 4B (Cytiva, 1705605). Cleavage with thrombin (Sigma-Aldrich, 1.12374) during dialysis against 10 mM Tris-HCl and 150 mM NaCl (pH 7.0) at 4 °C overnight yielded 119 amino acid proteins carrying an N-terminal Gly-Ser extension in addition to the native residues of GABARAP and GABARAPL1. Subsequently, samples were applied to a Hiload 26/60 Superdex 75 preparatory-grade size-exclusion column (GE Healthcare) equilibrated with 10 mM Tris-HCl and 150 mM NaCl (pH 7.0). Protein purity was assessed by SDS–PAGE and Coomassie staining. Fractions containing the eluted proteins were concentrated to 3–5 mg ml⁻¹ using Vivaspin 20 concentrators with a 3 kDa cutoff (Sartorius), flash-frozen in liquid N₂ and kept at −80 °C for long-term storage.

RbtA β-helix domain

For heterologous expression of the β-helix domain of RbtA (residues A20–I459) in E. coli, the gene was amplified and fused with a SNAC tag (GSHHWGS) at the C terminus using the following primers: forward, GCTGCCCAGCCGGCGATGGCCATGGGCGCTGATATTGAAGTCACAACTAC; reverse, CAGTGGTGGTGGTGGTGGTGCTCGAGGCTGCCCCAATGATGGCTGCCGATATATTCAATTGCGCCTAAAT⁴². The fragment was inserted into NcoI-digested and XhoI-digested pET-22b(+) by Gibson assembly to generate a construct with a C-terminal 6xHis fusion. The construct was confirmed by sequencing and transformed into E. coli Rosetta (DE3) cells.

To purify the β-helix domain of RbtA, an overnight culture of Rosetta (DE3) cells carrying the construct was back-diluted 1:300 in 2× YT broth and grown at 37 °C with shaking at 200 rpm until the OD₆₀₀ reached 0.4. The incubation temperature was reduced to 18 °C, IPTG was added to a final concentration of 0.3 mM and the culture was incubated for a total of 18 h. Cells were then collected by centrifugation and resuspended in lysis buffer containing 200 mM NaCl, 50 mM Tris-HCl pH 7.5, 10% glycerol (v/v), 5 mM imidazole, 0.5 mg ml⁻¹ lysozyme and 1 mU of benzonase. Cells were then lysed by sonication and cellular debris was removed by centrifugation at 35,000g for 30 min at 4 °C. The protein was purified from lysates using a 1 ml HisTrap HP column on an ÄKTA fast protein LC (FPLC) system. Column-bound protein was eluted using a linear imidazole gradient from 5 to 500 mM. Protein purity was assessed by SDS–PAGE and Coomassie staining. The fractions with high purity were concentrated using a 30 kDa cutoff Amicon filter and then further purified by FPLC using a HiLoad 16/600 Superdex 200 preparatory-grade column (GE Healthcare) equilibrated with sizing buffer (500 mM NaCl, 50 mM Tris-HCl pH 7.5 and 10% glycerol (v/v)). The fractions with high purity were concentrated and used for evaluation of macrocyclic binders or determination of X-ray structure.

For determination of the X-ray crystal structure of RbtA, the C-terminal 6xHis-tag was removed by chemical cleavage at the SNAC tag. In brief, the buffer of the concentrated protein was exchanged to cleavage buffer (0.1 M CHES, 0.1 M NaCl, 0.1 M acetone oxime and 5 mM Fos-choline-12, pH 8.6). The protein solution was diluted to 1 mg ml⁻¹, followed by the addition of 1 mM TCEP and 1 mM NiCl₂. The mixture was vortexed and incubated at room temperature for 16 h. The precipitate was removed by centrifugation at 35,000g for 30 min at 4 °C. The supernatant was concentrated and exchanged to Tris buffer (50 mM Tris-HCl pH 7.5 and 200 mM NaCl). The protein solution was incubated with a 1 ml bed volume of Ni-NTA beads to extract the cleaved 6xHis-tag. The resulting fraction was concentrated and then further purified by FPLC using a HiLoad 16/600 Superdex 200 preparatory-grade column.

Crystallization of protein–cyclic peptide complexes

MCL1 with cyclic peptide

MCL1 (18.5 mg ml⁻¹) and macrocycle MCB_D2 were mixed in a 1:2 molar ratio and incubated for 30 min at room temperature. Upon addition of MCB_D2 to the protein, we observed some precipitation. This precipitate was removed by centrifugation before crystallographic screening. Crystallization experiments for the MCL1–MCB_D2 complex were conducted using the sitting-drop vapor diffusion method. Initial crystallization trials were set up in 200 nl drops using 96-well crystallization plates. Crystal drops were imaged using the UVEX crystal plate hotel system by JANSi. Diffraction-quality crystals for the complex appeared in 0.2 M sodium chloride, 0.1 M Bis–Tris pH 6.5 and 25% (w/v) polyethylene glycol 3350 (Hampton Research) in 2 weeks.

GABARAP and GABARAPL1 with cyclic peptides

Cyclic peptides GAB_D8 and GAB_D23 were dissolved in 10 mM Tris-HCl and 150 mM NaCl (pH 7.0) and each mixed with both GABARAP and GABARAPL1, targeting a peptide-to-protein molar ratio of 3:2. After incubation for 10 min at room temperature, any insoluble components were removed by centrifugation (10 min at 20,000g and 4 °C). The protein–peptide complexes were concentrated using Amicon Ultra-0.5 centrifugal filter units with a 3 kDa cutoff (Merck) until a final protein concentration of 6–8 mg ml⁻¹ (GABARAPL1–GAB_D8) or 13–15 mg ml⁻¹ (GABARAP–GAB_D23) was reached. Samples were once again cleared of particles by centrifugation (30 min at 20,000g and 4 °C) before application in crystallization experiments. The search for crystallization conditions was performed by the sitting-drop vapor diffusion method using robotic systems Freedom Evo (Tecan) and Mosquito LCP (SPT Labtech) with commercially available screening sets. Experiments were set up by combining 200 nl of protein–peptide complex with 100 nl (for GABARAPL1–GAB_D8) or 200 nl (for GABARAP–GAB_D23) of reservoir solution and plates were incubated at 20 °C. Crystals appeared for a number of conditions, which were subjected to optimization as appropriate. Diffraction-quality samples used for X-ray structure determination developed with reservoir solutions containing 0.17 M ammonium sulfate, 25.5% (w/v) PEG 4000 and 15% (v/v) glycerol for GABARAPL1–GAB_D8 and 0.1 M MES pH 5.0 and 30% (w/v) PEG 6000 in the case of GABARAP–GAB_D23. Diffraction data (https://doi.esrf.fr/10.15151/ESRF-DC-1966164200 and https://doi.esrf.fr/10.15151/ESRF-DC-1979522808 for GABARAPL1–GAB_D8 and GABARAP–GAB_D23, respectively) were collected at 100 K on beamline BM07 of the European Synchrotron Radiation Facility (ESRF) tuned to an X-ray wavelength of 0.9795 Å, using a Pilatus 6M detector (DECTRIS).
Data processing was carried out with XDS and XSCALE⁴³ and included reflections up to a diffraction limit of 1.5 Å for GABARAP–GAB_D23 and 2.5 Å for GABARAPL1–GAB_D8. The GABARAP–GAB_D23 structure featuring space group C2 was determined by molecular replacement (MR) using MOLREP⁴⁴ with the structure of GABARAP from its K1 peptide complex (PDB 3D32)³⁰ as a template. For the GABARAPL1–GAB_D8 complex, initial evaluation suggested tetragonal symmetry but with strong indications of twinning. Data integration in maximal translationengleiche subgroups followed by MR search using MoRDa⁴⁵ revealed P2₁2₁2₁ as the true space group, with near-perfect pseudomerohedral twinning accounting for apparent Laue group 4/mmm. To avoid bias in cross-validation, this pseudosymmetry of the data was explicitly accounted for in flag assignment. The solution obtained for GABARAPL1–GAB_D8 was subjected to a round of automated rebuilding in phenix.autobuild⁴⁶. In either case, model refinement was performed with phenix.refine⁴⁷, alternating with interactive rebuilding in Coot⁴⁸, which included stepwise introduction of cyclic peptides GAB_D8 and GAB_D23. According to validation using MolProbity⁴⁹ and the wwPDB validation system (https://validate-rcsb-2.wwpdb.org/), both models featured good geometry. Detailed statistics of data collection and refinement can be found in Supplementary Table 6.

RbtA with cyclic peptide and apo RbtA

RbtA (10 mg ml⁻¹) and RBB_D10 were mixed in a 1:5 molar ratio and incubated for 30 min at room temperature. Initial crystallization trials were set up in 200 nl drops using 96-well crystallization plates and the experiments were conducted by the sitting-drop vapor diffusion method. Crystal drops were imaged using the UVEX crystal plate hotel system by JANSi. Diffraction-quality crystals for the RbtA–RBB_D10 complex appeared in 0.2 M lithium sulfate, 0.1 M Tris pH 8.5 and 40% (v/v) PEG 400 (JCSG Plus, Hampton Research). Additionally, we soaked the crystals in 22.32 mg ml⁻¹ RBB_D10 for 5 min before flash-freezing. Crystals for RbtA alone (18.7 mg ml⁻¹) were grown in 0.1 M Bis–Tris pH 6.5 and 20% (v/v) PEG 5,000 MME (SG1, Molecular Dimensions). All crystals were flash-cooled in liquid nitrogen before shipping to the synchrotron for data collection.

Diffraction data were collected at the NSLS2 beamline AMX/FMX (17-ID-1/17-ID-2). X-ray intensities and data reduction were evaluated and integrated by XDS⁴³ and merged and scaled by Pointless and Aimless in the CCP4i2 program suite⁵⁰. The X-ray crystal structure was determined by MR using the designed model for phasing by Phaser⁵¹. Next, the structure obtained from the MR was improved and refined by Phenix⁴⁷. Model building was performed by Coot⁴⁸ in between the refinement cycles. The final model was evaluated by MolProbity⁴⁹. Data collection and refinement statistics are reported in Supplementary Table 5.

SPR

SPR experiments were performed using a Cytiva Biacore 8K in HBS-EP+ buffer from Cytiva. Measurements were obtained by immobilization of biotinylated target protein using the biotin capture kit from Cytiva. Binding screens were performed by single-cycle kinetics experiments using the standard protocol in the Biacore 8K control software at 30 µl min⁻¹ with serial injections of 10 nM, 100 nM, 1 µM, 10 µM and 100 µM, an association time of 60 s and a dissociation time of 120 s. For MCL1 designs, a dissociation time of 150 s was used. To evaluate the affinity of successful designs, a nine-point single-cycle kinetics experiment was performed with an association time of 90 s and dissociation time of 300 s. The dilution series for MCB_D2 was twofold starting at 20 µM, that for MDB_D8 was fivefold starting at 50 µM, and those for GAB_D8, GAB_D23 and RBB_D10 were fivefold starting at 20 µM. Reported measurements were analyzed using Biacore Insight evaluation software; sensorgrams were double-referenced and fit with a 1:1 binding kinetics fit model.
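For readers unfamiliar with the 1:1 binding model, the sketch below generates a nine-point fivefold dilution series and evaluates the textbook 1:1 (Langmuir) response, in which dR/dt = k_on·C·(R_max − R) − k_off·R and K_d = k_off/k_on. It is an illustration only and does not reproduce the Biacore Insight fitting routine: the rate constants and R_max in the example are hypothetical, the function names are invented for this sketch, and the short dissociation gaps between injections in a real single-cycle run are ignored.

```python
import math


def dilution_series(top_conc: float, fold: float, n_points: int) -> list[float]:
    """Concentrations ordered from lowest to highest, as injected in single-cycle kinetics."""
    series = [top_conc / fold**k for k in range(n_points)]
    return series[::-1]


def association_step(r0, conc, k_on, k_off, r_max, t):
    """Response after t seconds of analyte at `conc` (M), starting from response r0."""
    k_obs = k_on * conc + k_off                # observed rate constant (1/s)
    r_eq = r_max * k_on * conc / k_obs         # steady-state response at this concentration
    return r_eq + (r0 - r_eq) * math.exp(-k_obs * t)


def dissociation_step(r0, k_off, t):
    """Response after t seconds of buffer flow, starting from response r0."""
    return r0 * math.exp(-k_off * t)


def single_cycle_response(concs, k_on, k_off, r_max, t_assoc, t_dissoc):
    """Response at the end of each injection and after the final dissociation phase."""
    r = 0.0
    end_of_injection = []
    for c in concs:
        r = association_step(r, c, k_on, k_off, r_max, t_assoc)
        end_of_injection.append(r)
    return end_of_injection, dissociation_step(r, k_off, t_dissoc)


if __name__ == "__main__":
    concs = dilution_series(top_conc=20e-6, fold=5, n_points=9)
    # Hypothetical kinetics: k_on = 1e5 1/(M s), k_off = 1e-3 1/s, so K_d = 10 nM.
    ends, final = single_cycle_response(concs, 1e5, 1e-3, 100.0, 90, 300)
    for c, r in zip(concs, ends):
        print(f"{c:.3e} M -> {r:6.2f} RU")
    print(f"after dissociation: {final:.2f} RU")
```

A fivefold series from 20 µM over nine points spans roughly 51 pM to 20 µM, which brackets a nanomolar K_d with several concentrations on either side.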

AlphaScreen assay

We used the AlphaScreen assay as described by Leveille et al.⁵² to measure inhibition of the GABARAP–K1 interaction by the computationally designed macrocycles. K1 is a previously described GABARAP binder with a K_d of 10 nM (ref. ²⁷). Biotin-labeled peptide K1 was used at a final concentration of 10 nM and incubated with 10 nM (final concentration) of 6xHis–GABARAP in a final reaction volume of 50 µl. Computationally designed inhibitor peptides were serially diluted with 1:3 dilutions using the highest final concentration of 50 µM and added to the reaction mixture. The buffer used was 25 mM HEPES pH 7.3, 150 mM NaCl, 0.01% Tween, 1 mg ml⁻¹ BSA and 0.5% DMSO. The plate was covered in foil, centrifuged at 1,500 rpm for 2 min and incubated for 150 min at room temperature with shaking. Then, 20 µg ml⁻¹ (final concentration) of the streptavidin donor beads and nickel chelate acceptor beads were added in the dark before incubating for another 45 min. Data were collected on a Tecan plate reader using excitation at 680 nm and emission at 520–620 nm. Data were normalized to 0% (buffer only) and 100% (protein and tracer peptide, no inhibitor) controls. IC₅₀ values were obtained from curve fits using GraphPad Prism 9 software, using the equation \(Y=\frac{100}{1+\left(\frac{X}{\mathrm{IC}_{50}}\right)^{h}}\), where X is the concentration of inhibitor and h is the Hill coefficient. At least three independent replicates were used to calculate the average IC₅₀ and the s.e.m.
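The normalization and dose-response equation above can be illustrated with a short sketch. The Hill function follows the stated equation; the coarse grid search is only a stand-in, chosen here to keep the example dependency-free, for the nonlinear regression performed in GraphPad Prism. The function names, the grid settings and the synthetic IC₅₀ of 100 nM in the example are assumptions made for illustration.

```python
import math


def hill_response(x, ic50, h):
    """Normalized signal (%) at inhibitor concentration x: 100 / (1 + (x / IC50)^h)."""
    return 100.0 / (1.0 + (x / ic50) ** h)


def normalize(raw, signal_0, signal_100):
    """Rescale a raw reading so the buffer-only control is 0% and the no-inhibitor control is 100%."""
    return 100.0 * (raw - signal_0) / (signal_100 - signal_0)


def threefold_series(top, n_points):
    """1:3 serial dilution starting from the highest concentration."""
    return [top / 3**k for k in range(n_points)]


def fit_ic50_grid(concs, responses, h_values=(0.5, 0.75, 1.0, 1.25, 1.5, 2.0), steps=400):
    """Coarse least-squares grid search over IC50 (log-spaced) and the Hill coefficient."""
    lo = math.log10(min(concs)) - 1
    hi = math.log10(max(concs)) + 1
    best = None
    for s in range(steps + 1):
        ic50 = 10 ** (lo + (hi - lo) * s / steps)
        for h in h_values:
            sse = sum((hill_response(x, ic50, h) - y) ** 2
                      for x, y in zip(concs, responses))
            if best is None or sse < best[0]:
                best = (sse, ic50, h)
    return best[1], best[2]


if __name__ == "__main__":
    concs = threefold_series(top=50e-6, n_points=10)
    # Synthetic, noise-free data generated from an assumed IC50 of 100 nM.
    responses = [hill_response(x, 1e-7, 1.0) for x in concs]
    ic50, h = fit_ic50_grid(concs, responses)
    print(f"estimated IC50 = {ic50:.2e} M, Hill coefficient = {h}")
```

Searching IC₅₀ on a logarithmic grid mirrors the way dose-response data are usually plotted and keeps the resolution uniform across the several orders of magnitude covered by a 1:3 dilution series.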

Statistics and reproducibility

No statistical method was used to predetermine sample size. One trial from the AlphaScreen that was used to determine the IC₅₀ of GAB_D8 was repeated and the repeated value is what was used. All data are included in the Source Data. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The design models and sequences are available in Supplementary Information. Crystal structures of MCB_D2 bound to MCL1, GAB_D8 bound to GABARAPL1, GAB_D23 bound to GABARAP, RBB_D10 bound to RbtA and apo RbtA were deposited to the PDB under accession codes 9CDT, 9HGC, 9HGD, 9CDU and 9CDV, respectively. Source data are provided with this paper.

Code availability

The code and scripts for running RFpeptides are available from Zenodo (https://doi.org/10.5281/zenodo.15264344)⁵³. The code and scripts for RFpeptides are also available from RFdiffusion GitHub repository (https://github.com/RosettaCommons/RFdiffusion).

References

Vinogradov, A. A., Yin, Y. & Suga, H. Macrocyclic peptides as drug candidates: recent progress and remaining challenges. J. Am. Chem. Soc. 141, 4167â€“4181 (2019).
ArticleÂ PubMedÂ CASÂ Google ScholarÂ
Muttenthaler, M., King, G. F., Adams, D. J. & Alewood, P. F. Trends in peptide drug discovery. Nat. Rev. Drug Discov. 20, 309â€“325 (2021).
ArticleÂ PubMedÂ CASÂ Google ScholarÂ
Tsomaia, N. Peptide therapeutics: targeting the undruggable space. Eur. J. Med. Chem. 94, 459â€“470 (2015).
ArticleÂ PubMedÂ CASÂ Google ScholarÂ
Atanasov, A. G., Zotchev, S. B. & Dirsch, V. M. International Natural Product Sciences Taskforce & Supuran, C. T. Natural products in drug discovery: advances and opportunities. Nat. Rev. Drug Discov. 20, 200â€“216 (2021).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Bhardwaj, G. et al. Accurate de novo design of membrane-traversing macrocycles. Cell 185, 3520â€“3532 (2022).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Bhardwaj, G. et al. Accurate de novo design of hyperstable constrained peptides. Nature 538, 329â€“335 (2016).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Hosseinzadeh, P. et al. Comprehensive computational design of ordered peptide macrocycles. Science 358, 1461â€“1466 (2017).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Mulligan, V. K. et al. Computationally designed peptide macrocycle inhibitors of New Delhi metallo-Î²-lactamase 1. Proc. Natl Acad. Sci. USA 118, e2012800118 (2021).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Hosseinzadeh, P. et al. Anchor extension: a structure-guided approach to design cyclic peptides targeting enzyme active sites. Nat. Commun. 12, 3384 (2021).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
MuratspahiÄ‡, E. et al. Design and structural validation of peptideâ€“drug conjugate ligands of the Îº-opioid receptor. Nat. Commun. 14, 8064 (2023).
ArticleÂ PubMedÂ PubMed CentralÂ Google ScholarÂ
Rettie, S. A. et al. Cyclic peptide structure prediction and design using AlphaFold2. Nat. Commun. 16, 4730 (2025).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Grambow, C. A., Weir, H., Cunningham, C. N., Biancalani, T. & Chuang, K. V. CREMP: conformerâ€“rotamer ensembles of macrocyclic peptides for machine learning. Sci. Data 11, 859 (2024).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Zhang, C. et al. HighFold: accurately predicting structures of cyclic peptides and complexes with head-to-tail and disulfide bridge constraints. Brief. Bioinform 25, bbae215 (2024).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Brixi, G. et al. SaLT&PepPr is an interface-predicting language model for designing peptide-guided protein degraders. Commun. Biol. 6, 1081 (2023).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Xie, X., Valiente, P. A., Kim, J. & Kim, P. M. HelixDiff, a score-based diffusion model for generating all-atom Î±-helical structures. ACS Cent. Sci. 10, 1001â€“1011 (2024).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Li, Q., Vlachos, E. N. & Bryant, P. Design of linear and cyclic peptide binders of different lengths from protein sequence information. Preprint at bioRxiv https://doi.org/10.1101/2024.06.20.599739 (2024).
Watson, J. L. et al. De novo design of protein structure and function with RFdiffusion. Nature 620, 1089â€“1100 (2023).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Baek, M. et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 373, 871â€“876 (2021).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
VÃ¡zquez Torres, S. et al. De novo design of high-affinity binders of bioactive helical peptides. Nature 626, 435â€“442 (2024).
ArticleÂ PubMedÂ Google ScholarÂ
Krishna, R. et al. Generalized biomolecular modeling and design with RoseTTAFold All-Atom. Science 384, eadl2528 (2024).
ArticleÂ PubMedÂ CASÂ Google ScholarÂ
Dauparas, J. et al. Robust deep learning-based protein sequence design using ProteinMPNN. Science 378, 49â€“56 (2022).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Wicky, B. I. M. et al. Hallucinating symmetric protein assemblies. Science 378, 56â€“61 (2022).
ArticleÂ PubMedÂ PubMed CentralÂ CASÂ Google ScholarÂ
Widden, H. & Placzek, W. J. The multiple mechanisms of MCL1 in the regulation of cell fate. Commun. Biol. 4, 1029 (2021).
Leman, J. K. et al. Macromolecular modeling and design in Rosetta: recent methods and frameworks. Nat. Methods 17, 665–680 (2020).
Kussie, P. H. et al. Structure of the MDM2 oncoprotein bound to the p53 tumor suppressor transactivation domain. Science 274, 948–953 (1996).
Szalai, P. et al. Autophagic bulk sequestration of cytosolic cargo is independent of LC3, but requires GABARAPs. Exp. Cell Res. 333, 21–38 (2015).
Brown, H. et al. Structure-based design of stapled peptides that bind GABARAP and inhibit autophagy. J. Am. Chem. Soc. 144, 14687–14697 (2022).
Ji, C. H. et al. The AUTOTAC chemical biology platform for targeted protein degradation via the autophagy–lysosome system. Nat. Commun. 13, 904 (2022).
Popelka, H. & Klionsky, D. J. Analysis of the native conformation of the LIR/AIM motif in the Atg8/LC3/GABARAP-binding proteins. Autophagy 11, 2153–2159 (2015).
Weiergräber, O. H. et al. Ligand binding mode of GABA_A receptor-associated protein. J. Mol. Biol. 381, 1320–1331 (2008).
van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Pedregosa, F. et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33, 2302–2309 (2005).
Dauparas, J. et al. Atomic context-conditioned protein sequence design using LigandMPNN. Nat. Methods 22, 717–723 (2025).
Sternberg, A. MaxCluster: a tool for protein structure comparison and clustering. http://www.sbg.bio.ic.ac.uk/maxcluster/index.html (accessed 1 November 2024).
Siew, N., Elofsson, A., Rychlewski, L. & Fischer, D. MaxSub: an automated measure for the assessment of protein structure prediction quality. Bioinformatics 16, 776–785 (2000).
Fire, E., Gullá, S. V., Grant, R. A. & Keating, A. E. Mcl-1–Bim complexes accommodate surprising point mutations via minor structural changes. Protein Sci. 19, 507–519 (2010).
Anil, B., Riedinger, C., Endicott, J. A. & Noble, M. E. M. The structure of an MDM2–Nutlin-3a complex solved by the use of a validated MDM2 surface-entropy reduction mutant. Acta Crystallogr. D 69, 1358–1366 (2013).
Sumida, K. H. et al. Improving protein expression, stability, and function with ProteinMPNN. J. Am. Chem. Soc. 146, 2054–2061 (2024).
Bhandari, B. K., Gardner, P. P. & Lim, C. S. Solubility-Weighted Index: fast and accurate prediction of protein solubility. Bioinformatics 36, 4691–4698 (2020).
Davis, G. D., Elisee, C., Newham, D. M. & Harrison, R. G. New fusion protein systems designed to give soluble expression in Escherichia coli. Biotechnol. Bioeng. 65, 382–388 (1999).
Dang, B. et al. SNAC-tag for sequence-specific chemical protein cleavage. Nat. Methods 16, 319–322 (2019).
Kabsch, W. XDS. Acta Crystallogr. D 66, 125–132 (2010).
Vagin, A. & Teplyakov, A. Molecular replacement with MOLREP. Acta Crystallogr. D 66, 22–25 (2010).
Vagin, A. & Lebedev, A. MoRDa, an automatic molecular replacement pipeline. Acta Crystallogr. A Found. Adv. 71, s19 (2015).
Terwilliger, T. C. et al. Iterative model building, structure refinement and density modification with the PHENIX AutoBuild wizard. Acta Crystallogr. D 64, 61–69 (2008).
Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr. D Struct. Biol. 75, 861–877 (2019).
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D 66, 486–501 (2010).
Williams, C. J. et al. MolProbity: more and better reference data for improved all-atom structure validation. Protein Sci. 27, 293–315 (2018).
Winn, M. D. et al. Overview of the CCP4 suite and current developments. Acta Crystallogr. D 67, 235–242 (2011).
McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).
Leveille, A. N. et al. Exploring arylidene–indolinone ligands of autophagy proteins LC3B and GABARAP. ACS Med. Chem. Lett. 16, 271–277 (2025).
Rettie, S. et al. Accurate de novo design of high-affinity protein-binding macrocycles using deep learning. Zenodo https://doi.org/10.5281/zenodo.15264344 (2025).
Hunter, J. D. Matplotlib: a 2D graphics environment. Comput. Sci. Eng. 9, 90–95 (2007).
Waskom, M. seaborn: statistical data visualization. J. Open Source Softw. 6, 3021 (2021).
Pettersen, E. F. et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 30, 70–82 (2021).

Acknowledgements

We thank L. Stuart, L. Stewart, K. van Wormer, L. Goldschmidt, M. Kennedy, I. Haydon, J. Woodward, X. Li, Z. Taylor, H. Osterstock, G. Zhou, G. Gökçe, J. Palmer, K. Lindenauer, K. Campbell, M. Gloegl, R. Ragotte and M. Sadilek for their helpful feedback, guidance and support. We also thank the IPD Peptide, Crystallography and Biologics, Vaccines and Process Development core labs and the University of Washington Chemistry MS Facility for providing instrumentation support and expertise. This work was supported by funds from the DARPA Harnessing Enzymatic Activity for Lifesaving Remedies program HR001120S0052 contract HR0011-21-2-0012 (to G.B., D.B., J.D.M. and M.L.), the Defense Threat Reduction Agency HDTRA1-19-1-0003 (to D.B., G.B. and S.A.R.), the National Institutes of Health (NIH) 5R21AI178088-02 (to G.B. and S.A.R.), the Howard Hughes Medical Institute (HHMI) Emerging Pathogens Initiative (to J.D.M., G.B. and V.A.), startup funds from the University of Washington’s Department of Medicinal Chemistry and Institute for Protein Design (to G.B.), the Audacious Project (to G.B., A.K.B. and A.K.), the C19 HHMI Initiative grant (to A.K.B. and A.K.), NIH R35-GM148407 (to J.A.K. and A.N.L.), the European Union’s Horizon Europe research and innovation program under the Marie Skłodowska-Curie 101059124 (to Y.F.B.), NIH R01-R0AI160052 (to A.K.B. and A.K.), Deutsche Forschungsgemeinschaft project ID 267205415-SFB 1208 (to D.W.) and the Bill and Melinda Gates Foundation GR047983 (to D.J.). All plots in this paper were generated using matplotlib or seaborn (refs. 54,55). Peptide structures were rendered using ChimeraX 1.9 (ref. 56) or PyMOL 2.5.4. Data were analyzed with Pandas version 1.4.3 and plotted using matplotlib version 3.7.0 and seaborn version 0.12.2. All figures were created with BioRender.com.
This research used resources (FMX/AMX) of the National Synchrotron Light Source II, a US Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Brookhaven National Laboratory under contract No. DE–SC0012704. The Center for BioMolecular Structure is primarily supported by the NIH National Institute of General Medical Sciences through a Center Core P30 Grant (P30GM133893) and by the DOE Office of Biological and Environmental Research (KP1607011). This publication resulted from data collected using the beamtime obtained through Northeastern Collaborative Access Team BAG proposal no. 311950. Lastly, we would like to thank the staff of the ESRF and European Molecular Biology Laboratory for assistance and support in using beamline BM07 under proposal number MX–2587.

Author information

Alina Üffing
Present address: Molecular Cell Biology of Autophagy Laboratory, The Francis Crick Institute, London, UK
These authors contributed equally: Stephen A. Rettie, David Juergens, Victor Adebomi.

Authors and Affiliations

Department of Medicinal Chemistry, University of Washington, Seattle, WA, USA
Stephen A. Rettie, Victor Adebomi, Yensi Flores Bueso, Maika Schneider, Vibha Vasireddy & Gaurav Bhardwaj
Institute for Protein Design, University of Washington, Seattle, WA, USA
Stephen A. Rettie, David Juergens, Victor Adebomi, Yensi Flores Bueso, Asim K. Bera, Alex Kang, Evans Brackenbrough, Mila Lamb, Stacey R. Gerben, Analisa Murray, Paul M. Levine, Maika Schneider, Vibha Vasireddy, David Baker, Frank DiMaio & Gaurav Bhardwaj
Molecular and Cellular Biology Program, University of Washington, Seattle, WA, USA
Stephen A. Rettie
Graduate Program in Molecular Engineering, University of Washington, Seattle, WA, USA
David Juergens
Department of Biochemistry, University of Washington, Seattle, WA, USA
Yensi Flores Bueso, David Baker & Frank DiMaio
Cancer Research @UCC, University College Cork, Cork, Ireland
Yensi Flores Bueso
Department of Microbiology, University of Washington, Seattle, WA, USA
Qinqin Zhao, Andi Liu & Joseph D. Mougous
Department of Chemistry, Tufts University, 62 Talbot Avenue, Medford, MA, USA
Alexandria N. Leveille & Joshua A. Kritzer
Heinrich–Heine–Universität Düsseldorf, Institut für Physikalische Biologie, Düsseldorf, Germany
Joana A. Wilms, Alina Üffing & Dieter Willbold
Forschungszentrum Jülich, Institute of Biological Information Processing, Structural Biochemistry (IBI–7), Jülich, Germany
Joana A. Wilms, Alina Üffing, Oliver H. Weiergräber & Dieter Willbold
Department of Chemistry, University of Washington, Seattle, WA, USA
Maika Schneider
Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
Sergey Ovchinnikov
Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
Joseph D. Mougous & David Baker

Contributions

S.A.R., D.J., V.A., F.D. and G.B. conceptualized the study. D.J. and F.D. implemented the cyclic relative positional offsets into RF2 and RFdiffusion. S.A.R., D.J., V.A. and G.B. developed the protocol for generating and filtering designs. A.L. and J.D.M. identified RbtA as a surface-exposed target in A. baumannii. S.A.R., V.A., M.L., P.M.L., M.S. and V.V. synthesized the designs. Y.F.B., Q.Z., M.L., S.R.G., A.L., J.A.W., A.U. and A.M. expressed and purified the target proteins. S.A.R., V.A. and A.N.L. biophysically characterized the designed macrocyclic peptides. A.K.B., J.A.W., A.U., A.K., E.B. and O.H.W. determined the X-ray crystal structures of the designed macrocyclic peptides bound to their targets. S.O., O.H.W., D.W., J.A.K., J.D.M., D.B., F.D. and G.B. offered supervision throughout the project. S.A.R., D.J., V.A., D.B. and G.B. wrote the paper. All authors read and contributed to the paper. S.A.R., D.J. and V.A. agree that the order of their respective names may be changed for personal pursuits to best suit their interests.

Corresponding authors

Correspondence to David Baker, Frank DiMaio or Gaurav Bhardwaj.

Ethics declarations

Competing interests

D.W. is a cofounder of Priavoid and Attyloid. D.B. and G.B. are cofounders, advisors and shareholders of Vilya. The other authors declare no competing interests.

Peer review

Peer review information

Nature Chemical Biology thanks Hyun Ho Park, Francesca Peccati and the other, anonymous reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Sections 1–5, Figs. 1–59 and Tables 1–10.

Reporting Summary

Source data

Source Data Fig. 1

Source data used in plots.

Source Data Fig. 2

Source data used in plots.

Source Data Fig. 3

Source data used in plots.

Source Data Fig. 4

Source data used in plots.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Rettie, S.A., Juergens, D., Adebomi, V. et al. Accurate de novo design of high-affinity protein-binding macrocycles using deep learning. Nat Chem Biol (2025). https://doi.org/10.1038/s41589-025-01929-w

Received: 31 December 2024
Accepted: 02 May 2025
Published: 20 June 2025
DOI: https://doi.org/10.1038/s41589-025-01929-w
