Abstract
Bacterial adhesion is a fundamental process which enables colonisation of niche environments and is key for infection. However, in Legionella pneumophila, the causative agent of Legionnaires' disease, these processes are not well understood. The Legionella collagen-like protein (Lcl) is an extracellular peripheral membrane protein that recognises sulphated glycosaminoglycans on the surface of eukaryotic cells, but also stimulates bacterial aggregation in response to divalent cations. Here we report the crystal structure of the Lcl C-terminal domain (Lcl-CTD) and present a model for intact Lcl. Our data reveal that Lcl-CTD forms an unusual trimer arrangement with a positively charged external surface and negatively charged solvent exposed internal cavity. Through molecular dynamics simulations, we show how the glycosaminoglycan chondroitin-4-sulphate associates with the Lcl-CTD surface via distinct binding modes. Our findings show that Lcl homologs are present across both the Pseudomonadota and Fibrobacterota-Chlorobiota-Bacteroidota phyla and suggest that Lcl may represent a versatile carbohydrate-binding mechanism.
Introduction
Legionella pneumophila is a Gram-negative bacterium that inhabits both natural and artificial freshwater systems. It thrives within a complex aquatic microbiome, which includes other biofilm associated bacterial species and cyanobacteria1,2. It infects and replicates within amoebae and ciliates3 but as an opportunistic pathogen it can also infect the lungs and cause Legionnaires' disease and the milder, self-limiting Pontiac fever4. Infection occurs via inhalation of water droplets from contaminated sources, after which the bacterium invades macrophages in the lungs and replicates intracellularly, resulting in pneumonia5. During infection L. pneumophila first binds the eukaryotic cell-surface, then after cell entry, it evades degradation through the formation of a specialised membrane bound replicative compartment, the Legionella containing vacuole (LCV)6. L. pneumophila utilises a type IV secretion system (T4SS/Dot/Icm) to transport >300 effectors directly into the host cytoplasm, which are key factors that drive LCV biogenesis and bacterial replication7,8. In addition, L. pneumophila employs a type II secretion system (T2SS/Lsp) to export >25 substrates/effectors out of the bacterium, and these play major roles in supporting the early stages of infection and during extracellular survival9,10,11,12,13,14,15,16,17,18.
We initially identified the Legionella collagen-like protein (Lcl) in a proteomic study of type-II dependent secretion in L. pneumophila strain 130b11. Subsequently, the lcl gene was detected in >500 other L. pneumophila strains examined, indicating that Lcl expression is a conserved trait of the L. pneumophila species19,20,21. Although limited in its broader prevalence within the Legionella genus, relative to that of other T2SS substrates, the lcl gene occurs in five out of 57 other Legionella species examined (i.e., Legionella oakridgensis, Legionella nagasakiensis, Legionella hackeliae, Legionella quateirensis and Legionella tucsonensis) and the majority of these are associated with human infection9. In addition to being detected in culture supernatants on multiple occasions11,19,22, Lcl appears to also be a peripheral membrane bound protein and upon its secretion from L. pneumophila it is targeted to the bacterial surface19,23 and found in outer membrane vesicles22. Lcl is important for L. pneumophila auto-aggregation and biofilm formation20,23,24,25, although this precise mechanism remains unclear. However, Lcl can also facilitate adhesion and cell entry of L. pneumophila to human lung epithelial (A549), lung mucoepidermoid (NCI-H292), and macrophage (U937) cell lines, and this indicates that Lcl has a fundamental role during infection of lung tissue19,20.
Lcl contains both an N-terminal region composed of collagen-like repeat (CLR) sequences, which are variable in length between different strains, and a C-terminal region with no overall sequence homology outside of the Legionella genus11,19,26. The C-terminal region of Lcl binds a range of sulphated glycosaminoglycan (GAG) polysaccharides that are present within the lung20, including heparin and chondroitin-4-sulfate, while the collagen-like region has been shown to bind fucoidan21, a heavily sulphated GAG found in many species of brown seaweed. GAGs are diverse linear carbohydrate structures that are formed from repeating disaccharide units of an amino sugar (N-acetylglucosamine or N-acetylgalactosamine) and glucuronic acid or galactose27. Sulphated GAGs exist as protein conjugates in the plasma membrane of nucleated cells and secreted into the extracellular matrix, and many bacterial pathogens including L. pneumophila use host GAGs as a means of adhesion during infection28,29,30. However, this is not well understood in L. pneumophila and there is a general lack in our structural and mechanistic understanding of cellular adhesion across the Legionella genus, which is a key step during colonisation and host invasion.
In this study, we report a structural model for full-length Lcl based on X-ray crystallographic, in silico modelling and nuclear magnetic resonance (NMR) spectroscopic data. We show that Lcl is also targeted to the surface of L. pneumophila 130b strain after its secretion, and this is mediated by its N-terminus. We present the crystal structure of the Lcl C-terminal domain (Lcl-CTD) which reveals an unusual trimer arrangement, and our structural and biochemical studies demonstrate a distinct GAG binding mechanism. Our work provides a molecular understanding of how Lcl can recognise and interact with a broad range of GAG ligands and provides strong evidence for the role of Lcl in facilitating direct recognition of glycosaminoglycans in host tissue during L. pneumophila infection.
Results
Lcl is expressed on the surface of L. pneumophila strain 130b
Previously, immunoblot analysis identified Lcl in an outer membrane fraction of wild-type strain Philadelphia-119, and an immunofluorescence assay detected the protein on the surface of strain Lp02, which is a lab-generated derivative of Philadelphia-123. Since Lp02 contains multiple point mutations and a large (~45-kb) deletion in the bacterial chromosome31, we began this study by determining whether Lcl is also surface-exposed in wild-type strain 130b in addition to being present within its culture supernatants. To that end, Lcl from strain 130b (numbered 1 to 401 from the mature protein; ORF lpw28961; lpg2644 in strain Philadelphia-1, lpp2697 in strain Paris)9,11 was expressed recombinantly in Escherichia coli, purified, and then used to generate polyclonal anti-Lcl antibodies. In confirmation of our earlier work11, immunoblot analysis revealed Lcl in the culture supernatants of strain 130b but not in the supernatants of either a T2SS (lspF) mutant or two constructed lcl mutants (Fig. 1a). By utilising a whole-cell enzyme-linked immunosorbent assay (ELISA) method that had previously examined the location of another T2SS-dependent protein, ChiA18, we determined that Lcl is in fact present on the surface of wild-type strain 130b but not the lspF mutant or lcl mutants (Fig. 1b).
a Analysis of Lcl secretion from BYE culture supernatants of wild-type 130b, lcl mutants NU468 and NU469, and lspF mutant NU275 reacted with anti-Lcl antibodies. Results are representative of two independent experiments. b Detection of bacterial surface binding. Whole cell ELISA of wild-type 130b, lcl mutants NU468 and NU469, and lspF mutant NU275 detected with anti-Lcl antibodies. Comparison of NU468, NU469, and NU275 to 130b shows a strong significant difference by two-tailed Student's t test (***p < 0.0001). Data are presented as mean values ± standard error of mean (SEM) derived from n = 3 biologically independent experiments. OD optical density. Source data are provided as an accompanying Source Data file.
Overall architecture of Lcl
We next turned our attention to the structural characterisation of Lcl. Using Multi-Angle Light Scattering (MALS), we determined a molecular mass of 123.4 ± 0.2 kDa (theoretical mass 42.3 kDa) for recombinant Lcl (Fig. 2a), which supported Lcl being a trimer in solution. Inspection of the Lcl sequence from strain 130b indicated three defined regions: a collagen-like repeat (CLR) region (consensus repeat: GPQGLPGPKGD(K/R)GEA) and C-terminal region (CTD) which contains a domain of unknown function (DUF1566)32, but also a 30 residue N-terminal helical region (N)33 (Fig. 2b and Supplementary Table 1). Recombinant Lcl was analysed by rotary shadowing electron microscopy and inspection of the micrographs revealed a clear 'lollipop-shaped' structure with a globular head and a stalk, consistent with a trimer of C-terminal domains and a triple helical collagen-like region, respectively (Fig. 2c, d and Supplementary Fig. 1).
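The inference of a trimer from the MALS measurement is a simple mass ratio, which can be sketched as follows. This is an illustrative calculation only: the function name is hypothetical, and the only values used are the measured and theoretical masses quoted above.

```python
def oligomeric_state(measured_kda, monomer_kda):
    """Return the mass ratio and the nearest whole number of subunits."""
    if monomer_kda <= 0:
        raise ValueError("monomer mass must be positive")
    ratio = measured_kda / monomer_kda
    return ratio, round(ratio)


# Measured MALS mass and theoretical monomer mass for recombinant Lcl (kDa).
ratio, n_subunits = oligomeric_state(123.4, 42.3)
print(f"mass ratio = {ratio:.2f}, nearest oligomer = {n_subunits}-mer")
```

A ratio close to an integer supports a single, well-defined oligomeric state; a value far from any integer would instead suggest a mixture of species.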
a Size-exclusion chromatography coupled to multi-angle light scattering (SEC-MALS) profile of recombinant Lcl and Lcl-CTD, using a Superose 6 Increase 10/300 column. Normalised refractive index (grey and green line) and average molecular weight calculated across the elution profile (orange and gold line) are shown for Lcl and Lcl-CTD, respectively. Void (Vo) and column (Vc) volumes are highlighted. RI refractive index. b Schematic of the Lcl domains with residue numbering based on mature sequence shown below. SS periplasmic signal sequence, N N-terminal helix, CLR collagen-like repeat region, CTD C-terminal domain, GFP green fluorescent protein, His6 6×histidine tag, Ac N-terminal peptide acylation, NH2 C-terminal peptide amidation. Lower: constructs used in this study with position of His6 affinity tags shown. Peptide modifications are annotated along with sequence and numbering. c Micrograph showing lollipop-shaped structures of Lcl trimers. The concentration of Lcl was 5 µg/ml. Scale bar = 50 nm. The globular shapes correspond to trimeric C-terminal domains (green arrow), while the stalks contain trimeric collagen-like region (grey arrow). d Schematic of the Lcl trimer presented on the bacterial surface.
Structural features of Lcl-N
Examination of the N-terminal region of Lcl (Lcl-N) using neural network-based modelling and solution 1H NMR nuclear Overhauser effect spectroscopy (NOESY), suggested that it forms an amphipathic helix34 and can bind to sodium dodecyl sulfate (SDS) micelles (Fig. 3a, b and Supplementary Fig. 2). This indicated that it may act as an extracellular membrane anchor for Lcl after its secretion, and so we next assessed its ability to bind to the surface of L. pneumophila. We created an Lcl-N GFP fusion (N-GFP) (Fig. 2b) and using size exclusion chromatography (SEC)17 observed both a major monomer species and a minor trimer species, but with the trimer population increasing with increased N-GFP concentration (Fig. 3c). Monomeric N-GFP was isolated and incubated with wild-type strain 130b and showed significant binding when compared with GFP alone (Fig. 3d). Using AlphaFold235 we produced consistent models of an Lcl-N trimer, where three parallel helices pack together through burial of their conserved hydrophobic face (Val10, Val14, Leu17, Leu21, Ile25) (Fig. 3e and Supplementary Fig. 2), and with other conserved residues localised to the N-terminal interface (Ser4) or contributing to a charged surface (Lys13, Lys18, Lys24). Three replica molecular dynamics (MD) simulations of 1 µs each were then run on the top ranked model to assess the stability of the trimer (Supplementary Table 2). While Lcl-N maintained the overall structure over the time course (Fig. 3f), we observed that one of the helices acts as a bridge between the other two, which share fewer interactions with one another, and this provides a potential mechanism for the exchange between monomer and trimer (Supplementary Fig. 2).
a Structural model of Lcl-N and helical wheel diagram generated by HELIQUEST34, with terminal residues numbered. Yellow/grey: large/small hydrophobic, pink/purple: large/small polar, blue/red: positively/negatively charged. b 1H-1H NOESY spectra of Lcl-N peptide in the presence/absence of 80 mM perdeuterated SDS, highlighting the amide region. c SEC profile of recombinant N-GFP and GFP, using a S200 column. Normalised absorbance (280 nm) across the elution profile is shown for N-GFP loaded at 2.5 mg/ml (N-GFP × 1; wheat), 12.5 mg/ml (N-GFP × 5; orange), and GFP (2.5 mg/ml; green). Void (Vo) and column (Vc) volumes are highlighted, as are monomeric (m) and trimeric (t) species. d Binding of purified N-GFP and GFP to the L. pneumophila 130b surface, detected through GFP fluorescence. Comparison of N-GFP to GFP shows a strong significant difference by two-tailed Student's t test (**p < 0.01). Data are presented as mean values ± SEM derived from n = 3 biologically independent experiments. Source data are provided as an accompanying Source Data file. e AlphaFold2 model of trimeric Lcl-N, with conserved residues shown as spheres (teal: 100% identical; green: >50% identical; purple: inserted sequence). f Map of the mean number of contacts between residues (centre of mass) from different monomers, within a 10 Å cut-off, during a 1 µs MD simulation of the Lcl-N trimer model. Source data are available at https://doi.org/10.5281/zenodo.10961237. g 1H-15N HSQC spectrum of 15N-glycine labelled Lcl-CLR peptide showing resonances for monomeric (m) and trimeric (t) states. Assignment of specific glycine residues in monomeric Lcl-CLR is shown with peak positions for trimeric Lcl-CLR glycine residues numbered from left to right in subscript.
Structural features of Lcl-CLR
We next probed the quaternary structure of the collagen-like region using standard multidimensional NMR. We designed a peptide that encompassed a consensus repeat sequence (Lcl-CLR) and observed a monomeric species that formed a polyproline II conformation in solution at 15 °C (Fig. 2b and Supplementary Fig. 3). When an Lcl-CLR peptide containing uniformly 15N labelled glycine residues was studied at 2 °C, a 1H-15N heteronuclear single quantum correlation (HSQC) spectrum showed six glycine cross-peaks from the monomeric peptide, but also a higher molecular weight species containing at least 16 distinct glycine residues (Fig. 2f). This suggested that the Lcl-CLR peptide is also in equilibrium between a monomeric and pseudo-symmetric trimer state under these conditions, containing six and 18 glycine residues, respectively. This was further supported by the comparison of cross-peaks in NOESY and rotating frame Overhauser effect spectroscopy (ROESY) spectra, where there were significant differences between NOE/ROE patterns for the higher molecular weight species at 2 °C, which disappeared at 37 °C (Supplementary Fig. 4). Analysis of intact Lcl using circular dichroism (CD) spectroscopy showed negative and positive peaks at 199 nm and 222 nm, respectively, which is in line with previous reports for Lcl from the Lp02 strain21, and is indicative of a collagen-like structure (Supplementary Fig. 5). Furthermore, while monitoring the peak at 199 nm over increasing temperatures, we observed a two-stage unfolding process with Tm values of 38 and 45 °C. Together this further supports the CLR region of Lcl forming a triple helical structure in solution.
Overall structure of Lcl-CTD
As anticipated, analysis of the Lcl C-terminal domain (Lcl-CTD, residues 252 to 401) with MALS again revealed a stable trimer in solution (55.0 ± 0.1 kDa; theoretical mass 18.6 kDa) and crystallographic studies were initiated. The structure of Lcl-CTD was determined using selenomethionine single wavelength anomalous dispersion (Se-SAD) phasing, with electron density maps refined to 1.9 Å (Supplementary Table 3). Lcl-CTD is composed of a trimer with disordered N-termini (Glu252 to Val270) that could not be modelled, with each domain having an identical conformation formed from two α-helices (H2, H3), two 310-helices (3101, 3102) and nine β-strands (S1–S9) (Fig. 3a, b and Supplementary Fig. 6). Using small angle X-ray scattering (SAXS) we confirmed that the crystal structure is consistent with solution measurements and that the N-terminus can form several conformations, with an Rg value of 2.7 nm and a Dmax of 9.7 nm (Supplementary Figs. 7, 8 and Supplementary Tables 4, 5).
The Lcl-CTD structure is stabilised through the burial of an unusually small surface area per subunit (~14,000 Å²) and this is due to a solvent accessible cavity permeating from the underside into the core of the trimer (Fig. 3c). Inter-subunit interactions are mediated by charge complementary (e.g. Asp316, Asp319, Arg342) and hydrophobic residues (e.g. Trp315, Ile321, Phe343) and while the internal surface contains negatively charged patches (e.g. Asp336, Glu368), the upper surface displays strong positive charge (e.g. Arg342, Lys369, Lys380, Lys385 and Lys391) (Fig. 3d, e). Using the DALI server36 we established that the Lcl-CTD monomer is similar to C-type lectin-like domains found in snake venom toxins and bacterial invasins/intimins37,38,39,40. However, Lcl-CTD lacks disulfide bonds and the expected motifs required for Ca2+/carbohydrate and integrin/Tir binding (Supplementary Fig. 9), and we could not identify any trimeric structures that share tertiary homology.
The DUF1566/pfam0760332 domain is found in diverse proteins from a wide range of prokaryotes. DUF1566 is also located between residues 314 to 399 of Lcl-CTD and is composed of the H2, 3101, H3 and 3102 helices, and S5-S9 strands (Supplementary Fig. 10). With truncation of the S1-S4 β-strands almost all inter-subunit interactions are still maintained, but with the depth of the internal cavity of Lcl-CTD greatly reduced. Highly conserved residues in the DUF1566 domain are largely located within the core of Lcl-CTD, with just three residues located on the surface: Trp315 at the inter-domain interface, and Arg338 and Glu344, which form an intra-domain salt bridge within the internal cavity. While a generic role for the DUF1566 domain is not clear, based on Lcl it could act in carbohydrate recognition and/or promote trimer formation. Further examination of DUF1566 containing proteins that also possess a collagen-like repeat region (gly_rich_SclB superfamily) shows Lcl actually belongs to a larger family, with homologues identified in Legionella bononiensis and Legionella longbeachae from the Legionella genus, but also in species across the Pseudomonadota phylum (Comamonas sp., Methylomonas paludis, Methylobacter sp., Thiocystis minor), and the Fibrobacterota-Chlorobiota-Bacteroidota (FCB) superphylum (Bacteroidetes bacterium, Bacteroidia bacterium, Candidatus Fluviicola riflensis, Chitinophagaceae bacterium, Flavobacteria bacterium, Formosa sp., Fluviicola sp., Nonlabens sp., Oceanihabitans sediminis, Psychroflexus planctonicus, Winogradskyella pacifica, and Winogradskyella wichelsiae) (Supplementary Data 1).
GAGs bind the charged surface of Lcl-CTD
Chondroitin is composed of repeating disaccharide units of [→4)GlcA(β1-3)GalNAc(β1-]n (GlcA: D-glucuronate; GalNAc: N-acetyl-D-galactosamine), with chondroitin-4-sulfate (C4S) sulphated at the C4 position of GalNAc27. Heparin is formed of repeating disaccharide units of [→4)IdoA(β1-4)GlcN(β1-]n (IdoA: L-iduronate; GlcN: D-glucosamine) and is highly sulphated, with sulphation at the 2O position of IdoA (IdoA(2S)) and the 6O and N positions of GlcN (GlcNS(6S)) being the most common form27. Both C4S and heparin are abundant in the lung41 and have variable molecular weights that range between ~5–50 kDa, which equates to ~15–135 disaccharide repeats in each GAG chain. Intact Lcl was previously shown to recognise a range of variable length commercially prepared sulphated GAGs, including C4S and heparin (Fig. 4a), with the isolated C-terminal domain of Lcl also showing binding to heparin20. We therefore attempted to crystallise Lcl-CTD in the presence of defined C4S (GlcA/GalNAc(4S)) and heparin (IdoA(2S)/GlcNS(6S)) fragments with 4, 6 and 8 disaccharide repeats (degree of polymerisation; dp4, dp6, dp8, respectively) but were unsuccessful. Nonetheless, we did identify a crystal form of Lcl-CTD grown from high concentrations of ammonium sulfate and solved its structure using molecular replacement and refined electron density maps to 1.9 Å (Supplementary Table 3). The two trimer structures are highly similar (Root Mean Square Deviation (RMSD) over all Cα atoms of 0.3 Å) (Supplementary Fig. 11) but in this form, two sulfate ions were also observed on the surface bound to residues Lys369 and Lys391 (Fig. 3e). As GAG binding sites are usually formed from clefts or relatively flat positively charged patches42, we speculated that Lys369 and Lys391, along with the adjacent Arg342, Lys380 and Lys385 residues, may recognise the negatively charged sulfate groups that decorate GAG polymers.
a Monomer of Lcl-CTD shown as cartoon and rotated by 180°. b Trimer of Lcl-CTD shown from the top as cartoon. c Trimer of Lcl-CTD shown from the side as a cut-away electrostatic surface highlighting the internal charged cavity. Position of Asp336 and Glu368 in two chains is shown. d Monomer of Lcl-CTD shown as electrostatic surface and rotated by 180°, with the inter-trimer interface highlighted with a yellow outline. Inter-trimer residues and charged surface residues are highlighted. e Crystal structure of trimeric Lcl-CTD/SO4 shown as electrostatic surface and highlighting the bound sulfate ions (yellow spheres) and charged surface residues.
To assess this GAG binding model, we created constructs carrying R342A, K369A, K380A, K385A or K391A mutations (Lcl-CTDR342A, Lcl-CTDK369A, Lcl-CTDK380A, Lcl-CTDK385A and Lcl-CTDK391A, respectively) which we anticipated would abrogate binding to GAGs. In addition, we also created constructs carrying a D386A mutation (Lcl-CTDD386A) located on the Lcl-CTD surface, and an E368A mutation (Lcl-CTDE368A) within the internal cavity, which we expected would not affect binding. Using SAXS, all constructs except for Lcl-CTDR342A produced scattering profiles like wild-type Lcl-CTD, confirming that they were still correctly folded (Supplementary Fig. 12 and Supplementary Table 4). However, Arg342 forms intermolecular hydrogen bonds with Asp316 and Asp319 on the adjacent chain, and the R342A mutation resulted in destabilisation into monomer/trimer (3:1 ratio) (Supplementary Fig. 12), and so this construct was not used for subsequent analysis. We then assessed the ability of the correctly folded mutants to bind immobilised commercially prepared C4S and heparin extracted from bovine trachea and porcine intestinal mucosa, respectively, using an ELISA method. As anticipated, constructs carrying the K369A, K380A, K385A or K391A mutations all displayed a significant reduction in their ability to bind these GAGs when compared with wild-type Lcl-CTD, while the D386A mutation showed no difference. However, the E368A mutation resulted in higher binding capacity (Fig. 4b).
C4S binds Lcl-CTD across multiple domains
Although the structures of C4S and dermatan sulfate differ in just the location of hydroxyl and carboxyl groups at the C2 and C5 positions of D-glucuronate and D-iduronate, respectively, Lcl does not bind dermatan sulfate20,43. In an attempt to understand this specificity, we started by using solution NMR spectroscopy to investigate interactions between Lcl-CTD and C4S. Using a partially deuterated sample and multidimensional transverse relaxation-optimised spectroscopy (TROSY) NMR we were able to assign 61% of the potential amide backbone resonances of Lcl-CTD (Supplementary Fig. 13). Most missing assignments were located at the N-terminus (Glu252 to Ser275), the H2 helix (Asp316 to Asn323; positioned at the inter-domain interface), and the adjacent S3-S4 loop and S4 strand (Val303 to Ser311) (Fig. 5a). Furthermore, many peaks displayed variable intensity and ~10% of residues were present in multiple conformational states (Supplementary Fig. 13). We then compared 1H-15N TROSY spectra of Lcl-CTD titrated against increasing concentrations of commercially prepared C4S and observed significant broadening that approached saturation at 0.5 mg/ml C4S (Supplementary Fig. 14). Although no reliable data could be measured for Lys369, Lys380, Lys385 and Lys391 due to spectral overlap, significant peak broadening was observed for the neighbouring residues Thr381 and Thr392 (Fig. 5b–d). Moreover, substantial broadening was also detected for residues adjacent to Lys369 (Tyr292, Thr313, Trp315, His326, Arg342, Met350).
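Peak broadening of this kind is typically quantified as the fractional loss of peak intensity on ligand addition. The following is a minimal sketch of such a classification, assuming the >85% reduction criterion given in the accompanying figure legend; the function name, residue labels and intensity values are hypothetical and are not taken from the source data.

```python
def classify_broadening(free, bound, threshold=0.85):
    """Classify amide peaks by fractional intensity loss on ligand addition.

    free, bound: dicts mapping a residue label to peak intensity without
    and with ligand. A missing or zero bound intensity is treated as a
    peak that has disappeared.
    """
    results = {}
    for residue, i_free in free.items():
        if i_free <= 0:
            raise ValueError(f"non-positive reference intensity for {residue}")
        i_bound = max(bound.get(residue, 0.0), 0.0)
        reduction = 1.0 - i_bound / i_free
        if i_bound == 0.0:
            label = "disappeared"
        elif reduction > threshold:
            label = "broadened"
        else:
            label = "unchanged"
        results[residue] = (reduction, label)
    return results


# Hypothetical intensities (arbitrary units), for illustration only.
free = {"res1": 100.0, "res2": 80.0, "res3": 50.0}
bound = {"res1": 10.0, "res2": 60.0}  # res3 not detected after ligand addition

for residue, (reduction, label) in classify_broadening(free, bound).items():
    print(f"{residue}: {reduction:.0%} loss, {label}")
```

Residues without a reliable reference intensity (for example, overlapped peaks) would need to be excluded before applying such a scheme.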
a Chemical structure of chondroitin-4-sulfate (C4S) and heparin. GlcA: D-glucuronate; GalNAc(4S): N-acetyl-D-galactosamine-4-O-sulfate; IdoA(2S): α-L-iduronate-2-O-sulfate; GlcNS(6S): 6-O-sulpho-2-(sulphoamino)-D-glucosamine. b ELISA analysis of binding between immobilised mixed length C4S or heparin and wild-type (WT) and mutant (E368A, K369A, K380A, K385A, D386A, K391A) His-tagged Lcl-CTD. BSA-coated wells (−) were used as controls. Comparison of mutated Lcl-CTD to their respective WT shows a strong significant difference by two-tailed Student's t test (***p < 0.001: E368A (heparin), K369A (C4S/heparin), K380A (C4S), K385A (C4S/heparin), K391A (C4S/heparin); **p < 0.001: E368A (C4S)) except for K380A (heparin; p = 0.241) and D386A (C4S/heparin; p = 0.588/0.931). Data are presented as mean values ± SEM derived from n = 4 biologically independent experiments. OD optical density. Source data are provided as an accompanying Source Data file. c Trimer of Lcl-CTD shown as surface representation with residues whose amides could be assigned coloured green, and those that could not be assigned coloured purple. d NMR 1H–15N TROSY spectrum of Lcl-CTD in presence (right) or absence (left) of 0.5 mg/ml mixed length C4S. Chemical shifts that have disappeared after addition of C4S are highlighted in red, and those that display significant broadening (reduction of >85% peak intensity) are highlighted in orange. e Same information as (d) shown as a bar graph with orange bars highlighting significant peak broadening on addition of C4S. Missing assignments have a value of zero and those where peaks disappear on addition of C4S are highlighted with red circles. Source data are provided as an accompanying Source Data file. f As (d) and (e) but mapped onto the surface trimer of Lcl-CTD.
We next carried out molecular docking with HADDOCK44,45, using monomeric and trimeric Lcl-CTD and dp4, dp6, dp8 and dp10 C4S oligosaccharides as starting structures, and ambiguous interaction restraints (AIRs) derived from the GAG binding ELISA and NMR chemical shift perturbations (CSP). Docking between monomeric Lcl-CTD and C4S dp8 (cluster one of the three major clusters) produced models consistent with the experimental data (Supplementary Fig. 15), and a trimer bound with one molecule of C4S dp8 was then created (HM model) and further examined using MD (Fig. 6a and Supplementary Fig. 16). Docking against trimeric Lcl-CTD (HT1 and HT2 models) caused changes to the trimer interface, and these models were unstable during MD simulations (Supplementary Fig. 16) and were not taken forward for further analysis. MD simulations were also run on Lcl-CTD (residue 271 to 401) alone (Supplementary Fig. 17). Analysis of the Root Mean Square Fluctuation (RMSF) profiles indicated a high flexibility for the first 7 residues of each monomer (Asp217 to Ile223), consistent with the disordered nature of the adjoining N-terminal part of the chain. However, the overall trimeric structure of the domain was stable throughout the simulations, with RMSD from the initial structure quickly reaching a plateau and staying below 2 Å in all the replicas.
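For readers less familiar with these two stability metrics, the definitions of RMSD and RMSF can be sketched in a few lines. This is a simplified illustration, not the analysis pipeline used in this study: it assumes that every frame has already been superposed onto the reference structure, and the function names and toy coordinates are hypothetical.

```python
import math


def rmsd(ref, coords):
    """RMSD between two equally sized lists of (x, y, z) positions.

    Assumes the two structures are already superposed.
    """
    if len(ref) != len(coords) or not ref:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(ref, coords))
    return math.sqrt(total / len(ref))


def rmsf(frames):
    """Per-atom RMSF about the mean position over pre-aligned frames."""
    n_frames, n_atoms = len(frames), len(frames[0])
    values = []
    for i in range(n_atoms):
        mean = tuple(sum(f[i][k] for f in frames) / n_frames for k in range(3))
        msf = sum(math.dist(f[i], mean) ** 2 for f in frames) / n_frames
        values.append(math.sqrt(msf))
    return values


# Toy two-atom trajectory (hypothetical coordinates in angstroms).
frames = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, -1.0, 0.0)],
]
print([round(rmsd(frames[0], f), 3) for f in frames])  # deviation from frame 0
print([round(v, 3) for v in rmsf(frames)])             # per-atom flexibility
```

RMSD summarises how far the whole structure drifts from its starting point over time, whereas RMSF reports which individual positions are most mobile, which is why the two are used together here.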
a Modified HADDOCK model (HM: C4S dp8 docked against a monomer of Lcl-CTD and reconstituted as a trimer) used as a starting structure for MD simulations. Surface residues in close contact of C4S are annotated and coloured red (reduction >100% peak intensity by NMR), orange (reduction >85% peak intensity) or blue (lysine residues identified by ELISA). Chains are labelled A to C. b Spatial distribution function (sdf) of the sulfur atoms of C4S dp8 during the simulations. The purple isosurface connects the points with sdf = 20 × average value. The protein surface (initial MD structure) is represented in white with the positions of Lys369, Lys380, Lys385 and Lys391 coloured blue. c Frequency of occurrence (occupancy) of contacts between C4S dp8 and the Lcl-CTD during the simulations colour-mapped onto the protein surface (initial MD structure) from white (0–10%) to orange (40.9%). Residues with an occupancy >10% in chain A are annotated as (a) or black (identified from MD). Source data are available at https://doi.org/10.5281/zenodo.10974841.
While the overall structure of Lcl-CTD was found to be stable in all replicas of simulations run on the HM model, C4S displayed highly dynamic binding to the Lcl-CTD surface (Supplementary Fig. 16). Although during the simulations C4S remained in contact with Lcl-CTD for much of the time, the different components of the glycan frequently detached and then reattached to different regions of the Lcl-CTD surface. From its starting position, during the simulations the polysaccharide either remained in the same region or explored other parts of the top surface of the protein. As indicated by the spatial distribution function (sdf) of C4S sulfur atoms (Fig. 6b) and the frequency of contacts between C4S and the Lcl-CTD trimer (Fig. 6c), C4S more often bound to the central region of the top surface.
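The contact frequency (occupancy) referred to here is the percentage of trajectory frames in which a residue lies close to the ligand. A minimal sketch of that calculation is given below. It is illustrative only: the 4 Å distance cut-off, the use of a single representative point per residue, the function name and the toy coordinates are all assumptions and are not taken from the study.

```python
import math


def contact_occupancy(residue_frames, ligand_frames, cutoff=4.0):
    """Percentage of frames in which each residue contacts the ligand.

    residue_frames: one dict per frame, {residue label: (x, y, z)}.
    ligand_frames: one list per frame of ligand atom positions (x, y, z).
    cutoff: contact distance in angstroms (assumed value, not from the source).
    """
    if len(residue_frames) != len(ligand_frames) or not residue_frames:
        raise ValueError("need the same non-zero number of frames for both")
    counts = {name: 0 for name in residue_frames[0]}
    for residues, ligand in zip(residue_frames, ligand_frames):
        for name, pos in residues.items():
            # A residue is in contact if any ligand atom lies within the cut-off.
            if any(math.dist(pos, atom) <= cutoff for atom in ligand):
                counts[name] += 1
    n_frames = len(residue_frames)
    return {name: 100.0 * c / n_frames for name, c in counts.items()}


# Toy example: two fixed residues and a one-atom ligand that moves between frames.
residue_frames = [{"resA": (0.0, 0.0, 0.0), "resB": (10.0, 0.0, 0.0)}] * 4
ligand_frames = [[(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)],
                 [(9.0, 0.0, 0.0)], [(20.0, 0.0, 0.0)]]
print(contact_occupancy(residue_frames, ligand_frames))
```

Because the ligand detaches and reattaches during the simulations, occupancies well below 100% are expected even for residues that form part of a preferred binding region.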
Structures from all the replicas were clustered using an optimised cut-off of 17.5 Å on the pairwise Cα RMSD values, the high value reflecting the variety of binding poses explored by C4S in the different replicas. Three major C4S binding modes were identified, and although they had a relatively high RMSD, they broadly reflected a preference for Lcl-CTD surface localisation of the bound glycan chain, although the clusters did not reflect a preference in orientation (Fig. 7a). After considering the 3-fold symmetry of the system, the first and third mode were found to be closely related, with C4S dp8 showing a similar position in the two modes, and these were therefore combined. The first and major binding mode is the most frequently observed (M-BM; 63% frequency) and represents C4S binding to the top, central region of Lcl-CTD, along the chain A/C interface. The second and minor binding mode (m-BM; 20% frequency) represents C4S binding primarily to Lcl-CTD chain A and resembles the initial input HM model (Fig. 6a).
a Representative C4S dp8 structures (cluster centre) of the first (MD cluster 1, population = 43%), second (MD cluster 2, population = 21%), and third (MD cluster 3, population = 20%) most populated clusters are shown as sticks, together with the initial protein structure (green surface). The position and orientation of cluster 3 is like that of cluster 1 when 3-fold rotational symmetry is considered. These therefore represent two major binding modes: clusters 1 and 3 (M-BM, major binding mode, population = 63%) and cluster 2 (m-BM, minor binding mode, population = 21%). Chains are labelled A to C. b Representative MD structures of C4S dp8 bound to Lcl-CTD selected to illustrate binding across 1-chain (population = 36%; derived from m-BM and M-BM), 2-chains (population = 35%; primarily derived from M-BM) and 3-chains (population = 28%; primarily derived from M-BM). Structures were selected from replica 7, 11 and 6, respectively. Hydrogen bonding interactions between C4S and Lcl-CTD detected by PLIP110 are shown as dashed red lines. The protein residues involved in the interactions are labelled. Cyan spheres indicate the location of C1 hydroxyl and C5 carboxyl groups within C4S D-glucuronate residues, which if switched would perturb binding. c Models of glycosaminoglycan (GAG) binding to the Lcl-CTD trimer. Schematics of the general major and minor binding mode of C4S are shown with bound glycan chain as an arrow, which can bind in either direction. The Lcl-CTD surface could support simultaneous binding to GAGs from one continuous chain (black connected arrows) and/or from multiple chains (olive and wheat arrows). Source data are available at https://doi.org/10.5281/zenodo.10974841.
A more detailed analysis of the distance and interactions between C4S and Lcl-CTD highlighted that C4S can bind across one (36% frequency), two (35% frequency) or all three (28% frequency) Lcl-CTD chains (Fig. 7b, Supplementary Fig. 18 and Supplementary Tables 6, 7). While binding of C4S to a single chain of Lcl-CTD is observed in both the major and minor binding modes, binding across multiple Lcl-CTD chains largely reflects the major binding mode alone. Scrutiny of these different complex formations indicates that the Lcl-CTD residues found to be most frequently involved in hydrogen bonding with C4S are Ser371, Ser390 and Lys391, which are located at the central region of the top surface (Fig. 6c and Supplementary Table 7). On the other hand, we observed that C4S dp8 binds to Lcl-CTD using 4 to 6 saccharide units (GalNAc(4S) and GlcA), either as a continuous stretch or with the glycan looped out in the middle of the chain, and forms hydrogen bonding interactions through its carboxylates, amides, sulfates, and hydroxyl groups. Moreover, examination of the representative structures of C4S dp8 binding across 1-, 2-, and 3-chains of Lcl-CTD suggests that the replacement of D-glucuronate with D-iduronate would result in the disruption of some hydrogen bond interactions (Fig. 7b). It would also require changes in the glycan conformation to avoid clashes within dermatan sulfate and between dermatan sulfate and Lcl-CTD, and together this provides at least some explanation for the selectivity of Lcl-CTD for different GAGs.
Discussion
Adhesion is a fundamental process in bacteria and adhesin proteins often work in synergy to enable colonisation of niche environments. Several adhesins have been identified in L. pneumophila that play important roles in the recognition of eukaryotic hosts, and these include the type IV pilus (T4P)46 and its associated PilY1 pilus tip adhesin47,48, Hsp6049, RtxA50, MOMP51, LaiA52, and Lcl19. We have determined that Lcl is a trimeric structure formed of three regions: an N-terminal helix/coiled-coil, an elongated collagen-like region, and a DUF1566-containing C-terminal region. We previously observed Lcl secreted in bacterial culture supernatants of L. pneumophila strain 130b11. Subsequently Lcl was detected on the bacterial surface in strains Philadelphia-119 and Lp0223, and we have now shown this in 130b. The L. pneumophila T2SS exports >25 proteins, and three of these associate with host organelles and/or the bacterial surface upon their secretion (i.e., ChiA, Lcl, ProA); we previously observed ChiA and ProA tethered to the LCV membrane53 and ChiA to the outer membrane surface18. Although the mode of ProA membrane binding is unclear, ChiA binds the L. pneumophila surface using its N3 domain, formed of a fibronectin III module domain-like fold18. In addition, we have shown that NttA binds phosphatidylinositol-3,5-bisphosphate (PtdIns(3,5)P2) and other phosphorylated phosphoinositides, which indicates that NttA may also be targeted to host organelles during infection17. In this study we have revealed that Lcl binds the bacterial surface through its N-terminal region. Using a GFP fusion, we observed that Lcl-N is predominantly a monomeric amphipathic helix, but it can also form a coiled-coil structure with increasing concentration.
Together with our MD analysis this suggests that within intact Lcl, Lcl-N will primarily form a coiled coil, but this is not symmetrical, and as binding was shown using the isolated monomeric species, the trimer may become displaced and bind as three independent amphipathic helices. Alternatively, purified Lcl-N will inevitably still contain trimers, and a trimeric binding mechanism to the bacterial membrane or to another yet to be identified outer-membrane structure cannot be ruled out; however, the surface of the current trimer model is not hydrophobic and would not be able to bind within a lipid membrane in its current conformation. Nonetheless, this represents a distinct mechanism that has not been observed for other T2SS substrates.
Bacterial collagen-like proteins have been identified in a wide range of Gram-positive and Gram-negative bacteria54. A defining feature of collagen is the presence of Gly-X-Y repeats, where in eukaryotes X and Y are often proline and hydroxyproline, respectively, with hydroxyproline mediating inter-chain hydrogen bonding to stabilise the triple helical structure. However, bacteria cannot make hydroxyproline, and their collagen-like structures do not have a requirement for proline. Instead, they contain a higher proportion of charged/polar residues, and these are predicted to interact across different chains55,56. As both bacterial and eukaryotic collagens display similar thermal stabilities (Tm ~ 35–39 °C and ~37 °C, respectively)55, it is unclear why eukaryotic systems do not produce bacterial-like collagen, although these structures may be unfavourable for the formation of higher-order fibrils, which are not observed in prokaryotes.
From our examination, Lcl from L. pneumophila strain 130b contains 12 repeats of a consensus 15 residue sequence (GPQGLPGPKGD(K/R)GEA) within its collagen-like region. Analysis of Lcl from other strains isolated from clinical samples, the environment, and hot springs, however, has demonstrated a high polymorphism within this region, with Lcl from Philadelphia-1 containing 19 repeats19. While hot spring isolates (≥40 °C) displayed a preference for 13 repeats, clinical and environmental isolates (≤37 °C) were bimodal with a preference for both 8 and 13/14 repeats. Using a 19-residue peptide encompassing a single Lcl CLR consensus sequence, we observed this peptide to be largely monomeric, but able to form triple helical structures with a reduction in temperature. Together this suggests that variability in Lcl repeats may reflect the minimal length of collagen required for Lcl to retain its folding under different environmental temperatures. Our CD spectroscopy analysis showed that Lcl has two melting temperatures of 38 °C and 45 °C (Supplementary Fig. 5), which are similar to other reported bacterial and eukaryotic collagens55. However, in eukaryotes, folding of collagens is initiated at their C-terminus and mediated by trimerisation domains, before being propagated through to the N-terminus57. This indicates that the Lcl C-terminal domain may also function to initiate folding of the collagen-like region.
Lcl has been shown to mediate adhesion/invasion of L. pneumophila to a range of host cell types. In one study, an lcl mutant displayed a ~30% reduction in binding to NCI-H292 lung mucoepidermoid cells, compared with the wild-type Lp0220. In another study, incubation of wild-type Philadelphia-1 with Lcl antibodies resulted in a ~50% drop in binding to A549 lung epithelial cells, and a 0–30% drop in binding to U937 macrophage cells, although no difference in binding was observed with the amoeba Acanthamoeba castellanii19. Specifically, Lcl binds to sulphated GAGs on the surface of host cells and both the collagen-like and C-terminal regions have been implicated here19,20,21. When lcl containing 14 or 19 repeats was expressed in Philadelphia-1, there was an increase in adhesion/invasion of A549 cells with 14 compared with 19 repeats, but the opposite was observed with U937 cells19. Fucoidan has also been shown to bind the collagen-like repeat region of Lcl from Lp02 with a higher affinity than to the C-terminal domain, and increasing the number of CLRs has been correlated with tighter binding21. Together this suggests that the C-terminal domain plays a general role in the recruitment of ligands, but at least for some GAGs, synergistic binding along the collagen-like region can provide further increases in overall affinity.
The C-terminal domain of Lcl is highly conserved (>97%) across different strains of L. pneumophila and we have determined that it forms a distinct trimer structure with a deep negatively charged internal cavity and positively charged external surface (Fig. 3). Intact Lcl from Lp02 binds fucoidan with 10-fold higher affinity than C4S (KD 18 nM and 173 nM, respectively)43, and this likely reflects the increased level of sulphation in fucoidan. Using mixed chain length heparin and C4S, we have demonstrated that the strong positive charge on the Lcl-CTD surface is important for glycan recognition (Fig. 4b). Furthermore, using MD simulations, and focussing on binding to a dp8 structure of C4S, we have identified two predominant binding modes for this ligand (Fig. 7a): a major mode (M-BM), which runs across the middle of the Lcl-CTD trimer, and a minor mode (m-BM), which is largely localised to a single chain of the trimer. Only Lys385 and Lys391 form hydrogen bonds with C4S during the simulations, with relatively low frequency, while Lys369, Lys380, Lys385 and Lys391 are all within proximity (Fig. 6b and Supplementary Tables 6, 7). This indicates that the primary role of these lysine residues is to provide general electrostatic attraction for the GAGs, rather than specific recognition. Furthermore, during the simulations the majority of the Lcl-CTD upper surface is involved in sampling dp8 C4S, mainly through serine, threonine, and asparagine residues (Fig. 6c), with C4S repeatedly dissociating and reassociating at different sites. This indicates that C4S does not bind at two distinct sites but is instead recognised through a range of conformational states, albeit preferentially within two regions. Such fuzzy binding has been observed with intrinsically disordered proteins, for example Cdc4 binding to multiply phosphorylated Sic158, and this may reflect a mechanism that enables recognition of a broad range of ligands.
Residues at the inter-trimer interface (Ser371, Ala372, Lys391) appear to have a more targeted role in binding, although it is not clear whether these are specific for C4S or represent a more general GAG binding mode. As C4S and other GAGs can contain up to ~135 disaccharide repeats it is feasible that one or more glycan chains could bind simultaneously at multiple sites on the Lcl-CTD surface (Fig. 7c).
Using NMR with mixed chain length heparin and C4S, we also observed major CSPs for residues on the side of the Lcl-CTD trimer (i.e., Arg342, Met350) (Fig. 5d), although this binding was not observed during the MD simulations using C4S dp8. Arg342 and Met350 could represent a lower affinity site that is only occupied once GAGs have bound to the top surface of Lcl-CTD but may facilitate single GAG chain binding between the C-terminal domain and the collagen-like repeat region. However, we also showed that Arg342 hydrogen bonds with Asp316 and Asp319 in an adjacent chain and stabilises the trimer, and an R342A mutation produced a mixed population of monomeric (major species) and trimeric (minor species) Lcl-CTD (Supplementary Fig. 12). Furthermore, in 1H-15N TROSY spectra of Lcl-CTD, many interfacial residues were either broadened out and could not be assigned (e.g. Asp316, Asp319) or were present as multiple peaks (e.g. Arg338, Thr341, Glu368) (Fig. 5a and Supplementary Fig. 13). This demonstrates that Lcl-CTD experiences conformational exchange and may be present in more than one trimeric state, although as this was not observed during the 400 ns MD simulations it must occur on the slow NMR time scale (µs to ms). Furthermore, the R342A SAXS scattering profile deviates from the wild-type at q > 0.15 Å⁻¹ (Supplementary Fig. 12 and Supplementary Table 4), indicating an alternate conformation of the Lcl-CTD trimer, but with the same overall global structure. Therefore, as Arg342 and Met350 are located at the domain interface, CSPs observed for these residues during NMR titrations with heparin and C4S may instead reflect indirect binding events due to stabilisation of one trimeric state upon association with GAGs.
We have now identified the lcl gene in eight Legionella species, however, except for L. quateirensis we see little conservation in the surface lysine residues that are present in L. pneumophila (Supplementary Fig. 19). Although this suggests that Lcl-CTD from these other species will not exhibit a large positively charged surface, we do see conservation of other key C4S binding residues mainly located at the inter-trimer interface (i.e. Ser370, Ser371, Ala372, Asn373, Asn378). This may indicate that the Lcl C-terminal domain has different glycan specificity outside of L. pneumophila. Lcl is also known to mediate auto-aggregation and biofilm formation of L. pneumophila in the presence of divalent cations20,25 and it has been suggested that trimeric Lcl from Lp02 can form higher-order structures which could function in clumping adjacent bacteria21. However, these observations were independent of divalent cations, and our biophysical characterisation of Lcl from 130b shows that it is extremely stable and homogenous (Fig. 2a, c and Supplementary Fig. 5). Glu368 is highly conserved across the Legionella genus and is located within the internal cavity, where three residues are in proximity perpendicular to the trimer 3-fold axis (Fig. 3c and Supplementary Fig. 19). When Glu368 was mutated to alanine, we observed a significant increase in binding of Lcl-CTD to both heparin and C4S (Fig. 3c), which again could be explained by this mutation stabilising one of the trimeric states and priming it for GAG recognition. We speculate that Glu368 may bind divalent cations and have a role in modulating the biofilm activity of Lcl, potentially through increasing the population of the alternate Lcl-CTD conformation, but further studies are needed.
Methods
Bacterial strains and media
All strains used in this study are listed in Supplementary Data 2. L. pneumophila strain 130b (American Type Culture Collection [ATCC] strain BAA-74; also known as strain AA100 or Wadsworth) served as wild type and parent for all mutants59. The L. pneumophila lspF mutant NU275 strain60 and all newly constructed mutants (NU468, NU469) were routinely grown at 37 °C on buffered charcoal yeast extract (BCYE) agar or in buffered yeast extract (BYE) broth61. Isotopically defined M9 minimal medium (pH 7.4) contained (per litre) 6.0 g Na2HPO4·7H2O, 3 g KH2PO4, 0.5 g NaCl, 0.12 g MgSO4·7H2O, 22 μg CaCl2, 40 μg thiamine, 8.3 mg FeCl3·6H2O, 0.5 mg ZnCl2, 0.1 mg CuCl2, 0.1 mg CoCl2·6H2O, 0.1 mg H3BO3 and 13.5 mg MnCl2·6H2O, supplemented with 2 g [U-13C6]glucose and/or 0.7 g 15NH4Cl (Cambridge Isotope Laboratories). M9 media was made up in deuterium oxide (Sigma) to produce perdeuterated protein samples and pH was adjusted using 1 M NaOH solution.
Mutant construction
All primers and plasmids used in this study are listed in Supplementary Data 3 and 4, respectively. To make the L. pneumophila NU468 and NU469 mutant strains that have a nonpolar, unmarked deletion within the lcl gene, we employed overlap extension PCR (OE-PCR) followed by allelic exchange, as before13,62. DNA fragments of the 5′ and 3′ regions flanking the lcl ORF were PCR-amplified from 130b DNA using the primer pairs lcl-UpF and lcl-UpR for 5′ lcl, and lcl-DownF and lcl-DownR for 3′ lcl. A kanamycin (Kn)-resistance gene flanked by Flp recombination target sites was PCR-amplified from the vector pKD463 using the primers lcl-P1 and lcl-P2. We then performed two-step OE-PCR to combine the 5′ and 3′ regions of lcl with the respective Kn-resistance cassette. A PCR product matching the correct target size was gel purified and ligated into pGEM-T Easy (Promega) to yield plcl::Kn. After transforming strain 130b with the newly made plasmid, bacteria containing an inactivated lcl gene were obtained by plating on BCYE agar containing Kn. Confirmation of the mutated lcl gene was done by PCR using the above-mentioned primers. Following transformation with pBSFLP, which encodes a Flp recombinase along with a gentamicin-resistance marker63, mutants harbouring the desired unmarked deletion and lacking pBSFLP were recovered by plating on BCYE agar containing 5% (w/v) sucrose and scored for loss of resistance to both Kn and gentamicin, as before64. The mutants were verified by sequencing of PCR amplicons.
Immunoblot analysis of bacterial culture supernatants
Wild-type and mutant L. pneumophila strains that had been grown for 3 days on BCYE agar were suspended into BYE broth to an OD660 of 0.3 and grown overnight at 37 °C to an OD660 of ~1.5. Supernatants were obtained by centrifugation, sterilised by passage through 0.2-μm filters (EMD Millipore), and then concentrated, as before53. Following dilution in SDS-loading buffer, the samples were subjected to PAGE and immunoblot analysis. To that end, purified recombinant Lcl protein (above) was submitted to Lampire Biological Laboratories (Pipersville, PA) at a concentration of 2 mg/ml for the production of rabbit polyclonal antisera, analogously to what we had done before for other secreted proteins of L. pneumophila53. Following an overnight incubation at 4 °C in 5% BSA (w/v)–Tris-buffered saline (TBST), the blot was incubated overnight at 4 °C with the primary anti-Lcl antiserum at 1:1000 in 5% BSA-TBST. After four 10-min washes with the TBST buffer, the membrane was further incubated for 1 h at room temperature with secondary goat anti-rabbit horseradish peroxidase antibody (Cell Signaling Technology, Catalog #704) at 1:10,000 in 5% BSA-TBST. Finally, after another series of washes, the blot was developed using Amersham ECL Prime reagent and exposed to X-ray film, as before13,18. The full scan blot is provided as an accompanying Source Data file.
Bacterial whole-cell ELISA
The assay for detecting protein on the surface of L. pneumophila was done as previously described for the detection of ChiA, another substrate of the L. pneumophila T2SS18. The bacterial strains (130b, NU468, NU469, NU275) were grown on BCYE agar for 3 days at 37 °C and then, using a sterile cotton swab, bacteria were resuspended in 1 ml sterile PBS to an OD660 of 0.3. These were centrifuged at 10,000 × g for 3 min, and then washed once with PBS. Bacteria were fixed in 4% (w/v) paraformaldehyde for 10 min, followed by two 1 ml washes in PBS, and then resuspension in coating buffer (100 mM bicarbonate/carbonate buffer, pH 9.6) to a final OD660 of 0.03. In total, 100 μl of this suspension was added into the wells of Nunc MaxiSorp immunoassay plates (Thermo Fisher Scientific) and incubated overnight at 4 °C. The wells were then washed three times with 200 μl of wash buffer (PBS, 0.05% Tween-20), followed by addition of 200 μl of blocking buffer (wash buffer with 5% dried milk) and incubation for 1 h at 25 °C. Blocking buffer was removed and then samples were incubated with 100 μl of primary anti-Lcl antibody (Lampire Biological Laboratories, Pipersville, PA) at 1:1000 dilution in blocking buffer for 1 h at 25 °C. Following three 200-μl washes with wash buffer, samples were incubated with 100 μl of secondary goat anti-rabbit horseradish peroxidase antibody (Cell Signaling Technology, Catalog #704) diluted 1:1000 in blocking buffer for 1 h at 25 °C. Following five washes with 200 μl wash buffer, samples were incubated with 100 μl 3,3′,5,5′-tetramethylbenzidine (TMB) substrate for 15 min at 25 °C, and then the reaction was stopped by addition of 50 μl of 2 N sulfuric acid. Absorbance values were measured at 450 nm with wavelength correction at 570 nm using a microplate reader (Synergy H1, BioTek).
Construction of recombinant expression plasmids
Intact lcl (residues 1–401) and its C-terminal fragment (residues 252–401) were amplified by PCR from L. pneumophila 130b gDNA using primer pairs RLC1/RLC2 and RLC3/RLC4, respectively. These were then cloned into the pET-46 Ek/LIC vector (Novagen) using ligation-independent cloning. Synthetic genes gRLCm1 to gRLCm7 (Synbio Technologies, https://synbio-tech.com), pNGFP and pGFP (GenScript, https://www.genscript.com), were cloned into pET28b vector using NcoI and XhoI restriction sites to create plasmids pRLCm1 to pRLCm7, pNGFP and pGFP, respectively. All plasmids and synthesised genes used in this study are listed in Supplementary Data 4 and 5, respectively.
Protein purification
Intact Lcl, N-GFP and GFP were expressed in E. coli strain BL21(DE3) (New England Biolabs) grown in LB media containing either 50 µg/ml ampicillin (Lcl) or 50 µg/ml kanamycin (N-GFP and GFP). Lcl-CTD was expressed in E. coli strain BL21(DE3) (New England Biolabs) grown with 50 µg/ml ampicillin in either LB media, minimal media supplemented with selenomethionine (Molecular Dimensions), minimal media containing 0.07% (w/v) 15NH4Cl (Cambridge Isotope Laboratories), 100% (v/v) D2O (Sigma) or minimal media containing 0.07% (w/v) 15NH4Cl, 0.2% (w/v) [13C]glucose (Cambridge Isotope Laboratories), 100% (v/v) D2O. Expression was induced with 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) at an OD600nm of 0.6 and cells were harvested after growth overnight at 18 °C. Samples were purified using nickel-affinity chromatography followed by gel filtration using a Superdex-200 gel-filtration column (GE Healthcare), equilibrated in 20 mM Tris-HCl pH 8.0, 200 mM NaCl. To ensure efficient back exchange of amide protons, perdeuterated Lcl-CTD samples were initially purified in the presence of 8 M urea and then after nickel-affinity chromatography they were refolded by dialysis against 20 mM Tris-HCl pH 8, 200 mM NaCl, 1 M urea, 5 mM ethylenediaminetetraacetic acid (EDTA) and then 20 mM Tris-HCl pH 8, 200 mM NaCl. Engineered Lcl-CTD carrying R342A, E368A, K369A, K380A, K385A, D386A and K391A mutations in the lcl-CTD gene (Lcl-CTDH326A, Lcl-CTDR342A, Lcl-CTDE368A, Lcl-CTDK369A, Lcl-CTDK380A, Lcl-CTDK385A, Lcl-CTDD386A and Lcl-CTDK391A, respectively) were purified as wild-type Lcl-CTD.
SEC-MALS
Lcl or Lcl-CTD were injected onto a Superose 6 Increase 10/300 column (GE Healthcare) coupled to a Wyatt Technology system and run in 20 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) pH 7.5, 200 mM NaCl. BSA was run as a monodisperse reference protein. A dn/dc value of 0.185 ml/g was used for molecular weight calculations and data analysis was performed with Astra V software.
Rotary shadowing electron microscopy
The overall structure of Lcl was analysed using transmission electron microscopy after rotary shadowing using an adapted mica sandwich technique65,66. Five µl of Lcl in 20 mM HEPES pH 7.5 (5 µg/ml) was sprayed on a freshly cleaved mica sheet, allowed to adsorb, and then washed with ultrapure water. The mica was mounted on the stage of a Polaron Freeze fracture instrument and then freeze dried at −100 °C. The temperature was lowered to −150 °C for shadowing with Pt/C at a low angle (5°) and a carbon backing layer was added for support. These were removed from the mica in distilled water and placed on 400 mesh copper grids. Micrographs were taken with a JEM 1230 transmission electron microscope operated at 80 kV.
Peptide modelling
Modelling of monomeric and trimeric Lcl-N was carried out using the sequence for Lcl residues 1–30 from L. pneumophila 130b strain with AlphaFold2 or AlphaFold2-multimer35, respectively. Sequence alignments and templates were generated through MMseqs267 and HHsearch68, and run through the ColabFold notebook69. No prior template information was provided, and sequences used during modelling were both paired from the same species and unpaired from multiple sequence alignment.
Peptide NMR
All peptides were synthesised by Thermo Scientific to >95% purity. Unlabelled Lcl-N peptide (KSNPASQAYVDGKVSELKNELTNKINSIPS-NH2) was resuspended to 1 mM in 25 mM NaPO4 pH 6.5, 100 mM NaCl, 10% (v/v) D2O with or without 80 mM perdeuterated d25-SDS, and 1H-1H NOESY spectra (200 ms mixing time) were recorded at 298 K on a 700 MHz Bruker Avance III HD equipped with cryoprobe. Unlabelled Lcl-CLR peptide (Ac-EAGPQGLPGPKGDRGEAGP-NH2) and Lcl-CLR peptide containing uniformly 15N labelled glycine residues (Ac-EAGPQGLPGPKGDRGEAGP-NH2; labelled positions underlined) were resuspended to 3 mM in 20 mM HEPES pH 6.0, 50 mM NaCl, 10% (v/v) D2O. Peptides were then incubated at 90 °C for 15 min and then 4 °C for 1 week. Full backbone and side chain assignments for the monomeric unlabelled peptide were achieved using standard double-resonance peptide assignment experiments (1H-15N HSQC, 1H-13C HSQC, 1H-13C total correlation spectroscopy (TOCSY), 1H-1H TOCSY, 1H-1H correlation spectroscopy (COSY), 1H-1H ROESY with 200 ms mixing time) recorded at 288 K on a 700 MHz Bruker Avance III HD equipped with cryoprobe. In addition, 1H-1H ROESY (200 ms mixing time), 1H-1H NOESY (240 ms mixing time) and 1H-15N HSQC spectra were recorded at 275 K, and a 1H-1H NOESY spectrum (240 ms mixing time) was recorded at 310 K, on an 800 MHz Bruker Avance III HD equipped with cryoprobe. All spectra were processed using NMRPipe70 and analysed using ANALYSIS71. Secondary structure propensity of the monomeric Lcl-CLR peptide at 288 K was calculated using the δ2D server, providing Cα, Cβ, Hα, N, NH backbone chemical shifts72. All data were acquired using TOPSPIN 3.5.6.
Molecular dynamics of the Lcl-N trimer
The top ranking AlphaFold2 model of trimeric Lcl-N was run through the CHARMM-GUI73 solution builder server and placed in 0.15 M NaCl solution with 6.62 × 6.62 × 6.62 nm³ box dimension, resulting in a total of 55 salt, 27969 water, and 1377 protein atoms. MD simulations were run using GROMACS 201974 and the CHARMM36 force field75. The system was thermalized, equilibrated, and simulated at 300 K and 1 bar pressure following the simulation protocol suggested by CHARMM-GUI73. From the final structure after pressure equilibration, three independent production trajectories of 1 µs, that resulted in statistically consistent configurational ensembles, were generated. Trajectories were analysed using the python package MDAnalysis76. The system setup for these simulations is summarised in Supplementary Table 2.
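As a rough sanity check on the system size quoted above, the number of salt ion pairs expected in a cubic box follows from concentration × volume × Avogadro's number. The sketch below is illustrative only: the function name is hypothetical, and it ignores the volume excluded by the protein and any counter-ions added to neutralise the protein charge, so it should not be expected to reproduce the reported 55 salt atoms exactly. For 0.15 M and a 6.62 nm edge it gives about 26 NaCl pairs, which is the same order as the reported value.

```python
AVOGADRO = 6.02214076e23   # particles per mole
LITRES_PER_NM3 = 1e-24     # 1 nm^3 = 1e-27 m^3 = 1e-24 L


def ion_pairs_for_box(conc_molar: float, box_edge_nm: float) -> float:
    """Expected number of salt ion pairs in a cubic box at a given molarity.

    Hypothetical helper: assumes the whole box volume is solvent.
    """
    volume_litres = box_edge_nm ** 3 * LITRES_PER_NM3
    return conc_molar * volume_litres * AVOGADRO


# Values taken from the text: 0.15 M NaCl, 6.62 nm cubic box edge
pairs = ion_pairs_for_box(0.15, 6.62)
print(f"{pairs:.1f} NaCl pairs, i.e. about {2 * pairs:.0f} ions")
```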
Bacterial surface binding assay
L. pneumophila strain 130b was grown on BCYE agar plates (Oxoid, UK) at 37 °C aerobically for 3 days. Colonies were emulsified in 5 ml sterile PBS (Oxoid, UK) with a sterile cotton swab to OD600 nm 0.3, centrifuged at 5000 × g for 10 min, and then washed once in sterile PBS to remove cell debris and unbound protein. N-GFP and GFP at 2.5 mg/ml (~80 µM) were purified by SEC and the monomeric species isolated. A 1.5 ml aliquot of resuspended cells was then incubated with either 20 μM N-GFP, 20 μM GFP, or sterile PBS for 1.5 h at room temperature with gentle end-over-end mixing. Following incubation, cells were pelleted by centrifugation at 12,000 × g and washed four times in sterile PBS, before being resuspended in 300 μl sterile PBS. Three 100 μl aliquots of resuspended cells were added to a 96-well microtitre plate and fluorescence intensity was measured at excitation/emission 489/508 nm (CLARIOstar Plus Microplate Reader), with fluorescence intensity being normalised against PBS only.
Circular dichroism
Far-UV CD spectra were measured in a Chirascan (Applied Photophysics) spectropolarimeter thermostated at 10 °C. Spectra for Lcl (0.05 mg/ml) in 10 mM HEPES pH 8.0 were recorded from 260 to 195 nm, at 0.5 nm intervals, 1-nm bandwidth, and a scan speed of 10 nm/min. Three accumulations were averaged for each spectrum. For thermal denaturation experiments, Lcl (0.05 mg/ml) in 10 mM HEPES pH 8.0 was recorded at 199 nm between 10 °C and 75 °C in 1 °C increments. Each increment was recorded in triplicate and then averaged.
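One common way to read a melting temperature off a thermal denaturation curve of this kind is to locate the steepest point of the signal versus temperature. The following is a minimal sketch of that idea and is not necessarily how the melting temperatures in this study were derived; the function name and the synthetic single-transition curve are assumptions made for illustration. A curve with two transitions, as reported for Lcl, would need peak detection on the derivative or a multi-state fit instead.

```python
import numpy as np


def estimate_tm(temps_c, signal):
    """Return the temperature at which |d(signal)/dT| is largest."""
    temps_c = np.asarray(temps_c, dtype=float)
    signal = np.asarray(signal, dtype=float)
    slope = np.gradient(signal, temps_c)   # numerical first derivative
    return float(temps_c[np.argmax(np.abs(slope))])


# Synthetic, illustrative melt: one sigmoidal transition centred at 45 °C,
# sampled from 10 to 75 °C in 1 °C steps as described in the text
temps = np.arange(10.0, 76.0, 1.0)
signal = -20.0 / (1.0 + np.exp((temps - 45.0) / 2.0))
print(estimate_tm(temps, signal))
```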
Crystal structure determination
Selenomethionine labelled Lcl-CTD (Lcl-CTD; 15 mg/ml) and native Lcl-CTD (Lcl-CTD/SO4; 20 mg/ml) in 20 mM Tris-HCl pH 8.0, 200 mM NaCl, 20 mM EDTA were crystallised using the sitting-drop vapour-diffusion method grown at 20 °C in either 2.0 M (NH4)2SO4, 0.1 M Bis-Tris pH 6.5 or 20% (v/v) glycerol, 20% (w/v) polyethylene glycol (PEG) 4000, 30 mM NaNO3, 30 mM Na2HPO4, 30 mM (NH4)2SO4, 100 mM Bicine, 100 mM Tris pH 8.5, respectively. Crystals were briefly soaked in well solution complemented with additional 30% or 10% (v/v) glycerol, respectively, before flash freezing in liquid nitrogen. Diffraction data were collected at 100 K at beamline I03 of the Diamond Light Source (DLS), United Kingdom, with wavelengths 0.97969 Å (Lcl-CTD) and 0.97625 Å (Lcl-CTD/SO4). Data were processed using XDS and scaled with AIMLESS, within the XIA2 pipeline77,78,79. For Lcl-CTD, two selenomethionine sites were located in each Lcl-CTD molecule using SHELXD80 and then phases were calculated using autoSHARP81 (figure of merit: acentric/centric 0.251/0.010; phasing power 0.865). After automated model building with ARPWARP82, the remaining structure was manually built within Coot83. Refinement was carried out with REFMAC84 using non-crystallographic symmetry (NCS) and translation-libration-screw (TLS) groups, and 5% of the reflections were omitted for cross-validation. For Lcl-CTD/SO4, molecular replacement was carried out in PHASER85 using a single chain of Lcl-CTD as the search model. Refinement was again carried out with REFMAC84 using non-crystallographic symmetry (NCS) and translation-libration-screw (TLS) groups, and 5% of the reflections were omitted for cross-validation. Both structures were run through PDBREDO86 as a final step of refinement. The quality of the Lcl-CTD and Lcl-CTD/SO4 models was assessed by MolProbity87. Ramachandran statistics showed 97.7% and 98.8% of residues in the most favoured region and 100% and 100% in the allowed regions, respectively.
Processing and refinement statistics of the final model can be found in Supplementary Table 3.
SEC-SAXS
Data were collected at beamline B21 at the Diamond Light Source (DLS), UK88. 60 μl of WT Lcl-CTD, Lcl-CTDR342A, Lcl-CTDE368A, Lcl-CTDK369A, Lcl-CTDK380A, Lcl-CTDK385A, Lcl-CTDD386A and Lcl-CTDK391A (5 mg/ml) in 20 mM Tris-HCl pH 8, 200 mM NaCl, 5 mM EDTA were applied to a Shodex KW403-4F column at 0.16 ml/min and SAXS data were measured over a momentum transfer range of 0.004 < q < 0.44 Å⁻¹. Peak integration and buffer subtraction were performed in CHROMIXS89. The radius of gyration (Rg) and scattering at zero angle (I(0)) were calculated from the analysis of the Guinier region by AUTORG90. The distance distribution function (P(r)) was subsequently obtained using GNOM90, yielding the maximum particle dimension (Dmax). The disordered N-terminus of the Lcl-CTD crystal structure was built using MODELLER91 and refinement of the N-terminus in the intact model against the corresponding SAXS curve was carried out with EOM292 with fixing of the ordered domains in the trimer. Bead modelling was carried out using DAMMIF/DAMMIN90, and compared with atomic structures using SUPCOMB90. CRYSOL90 was used to compare models against solution SAXS curves. Processing and refinement statistics can be found in Supplementary Tables 4 and 5.
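The Guinier analysis mentioned above rests on the low-angle approximation ln I(q) = ln I(0) − (Rg²/3) q², so Rg and I(0) follow from a straight-line fit of ln I against q². The sketch below illustrates this relationship on synthetic data; it is not the AUTORG algorithm, which also selects the fitting range automatically, and the function name, Rg and I(0) values are hypothetical. The fit is restricted to q·Rg < 1.3, a commonly used limit for globular particles.

```python
import numpy as np


def guinier_fit(q, intensity):
    """Least-squares fit of ln I = ln I0 - (Rg^2 / 3) q^2; returns (Rg, I0)."""
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    slope, intercept = np.polyfit(q ** 2, np.log(intensity), 1)
    rg = np.sqrt(-3.0 * slope)          # slope = -Rg^2 / 3
    return float(rg), float(np.exp(intercept))


# Synthetic data with hypothetical Rg = 25 Å and I(0) = 100,
# limited to the Guinier region q * Rg < 1.3
true_rg, true_i0 = 25.0, 100.0
q = np.linspace(0.004, 1.3 / true_rg, 50)            # Å^-1
intensity = true_i0 * np.exp(-(q * true_rg) ** 2 / 3.0)

rg, i0 = guinier_fit(q, intensity)
print(f"Rg = {rg:.2f} Å, I(0) = {i0:.1f}")
```

With real data the points would carry noise and the fitting window would have to be chosen, for example by iterating until the upper q limit satisfies q·Rg < 1.3 for the fitted Rg.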
GAG binding ELISA
Immulon 2-HB 96-well plates (VWR) were coated overnight at 4 °C with either 50 μl of heparin from porcine intestinal mucosa (Sigma) or chondroitin-4-sulfate from bovine trachea (Sigma) at 100 μg/ml in 50 mM carbonate/bicarbonate pH 9.6. Wells were blocked for 1 h at 25 °C with 200 μl of 0.1% (w/v) bovine serum albumin (BSA) in PBS–0.05% (v/v) Tween 20 and then washed once with 200 μl of incubation buffer (0.05% (w/v) BSA in PBS–0.05% (v/v) Tween 20). Wells were then incubated for 3 h at 25 °C with 50 μl of WT Lcl-CTD, Lcl-CTDE368A, Lcl-CTDK369A, Lcl-CTDK380A, Lcl-CTDK385A, Lcl-CTDD386A or Lcl-CTDK391A at 10 μM in incubation buffer. This was followed by four washes with 200 μl of incubation buffer and incubation with 50 μl of anti-His-HRP antibody (1 mg/ml; ThermoFisher Scientific, Catalog # MA1-21315-HRP), diluted 1:2000 in incubation buffer for 1 h at room temperature. After four washes with 200 μl of incubation buffer, 150 μl of o-phenylenediamine dihydrochloride (Sigma) was added for 30 min and then data were recorded at 450 nm.
TROSY NMR
Measurements were performed at 37 °C on a 2H15N13C-labelled Lcl-CTD sample (0.5 mM) in 20 mM HEPES pH 7.0, 50 mM NaCl, 5 mM EDTA, 10% D2O on a cryoprobe-equipped Bruker Avance III HD spectrometer with 900 MHz Oxford Instruments magnet. Backbone assignments for 61% of Lcl-CTD (not including the N-terminal His-tag and proline residues) were achieved using standard double- and triple-resonance assignment methods. NMR titration experiments were carried out on a Bruker Avance III HD 800 MHz spectrometer equipped with a cryoprobe. 2H15N-labelled Lcl-CTD (0.2 mM) in 20 mM HEPES pH 7.0, 50 mM NaCl, 10% D2O with the addition of 0, 10, 20, 50, 100 and 500 μg/ml chondroitin-4-sulfate from bovine trachea (Sigma) was used to measure 1H15N TROSY spectra at 37 °C. All spectra were processed using NMRPipe70 and analysed using the programme NMRVIEW93. Residues that displayed spectral overlap were not analysed for changes in peak intensity between different spectra. All data were acquired using TOPSPIN 3.2.
Experimentally driven docking
Molecular docking of C4S oligosaccharides to the Lcl-CTD monomer and trimer was carried out with HADDOCK44,45,94, modifying an approach previously used to dock heparin oligosaccharides94. Oligosaccharides dp4, dp6, dp8 and dp10 were generated by the GAG Builder server95. Active and passive residues were chosen based on the CSPs and ELISA-based mutational analysis. Topology and parameter files for the C4S oligosaccharides were generated using the PRODRG server96. Docking of dp4, dp6, dp8 and dp10 was performed for a 1:1 and 3:1 Lcl-CTD:C4S complex for the Lcl-CTD monomer and trimer, respectively. During initial rigid body docking a total of 1000 structures were generated, and then semi-flexible simulated annealing (SA) was performed on the best 200 structures followed by explicit solvent refinement. The final structures were clustered using an RMSD cut-off value of 7.5 Å and the clusters were sorted using RMSD and the HADDOCK score.
Molecular dynamics of C4S binding
MD simulations were carried out starting from the crystallographic structure of Lcl-CTD and from three HADDOCK-derived models of the complex between C4S and Lcl-CTD. Two models (HT1 and HT2) were obtained from HADDOCK where one molecule of C4S dp8 was docked against a trimer of Lcl-CTD (3:1 Lcl-CTD:C4S). The final model (HM) was obtained from HADDOCK with one molecule of C4S dp8 docked against a monomer of Lcl-CTD (1:1 Lcl-CTD:C4S) but then reconstituted as a trimer (3:1 Lcl-CTD:C4S) based on the crystal structure. Simulations were performed using GROMACS 2020 (ref. 74), with the Amber99SB*-ILDN97 force field for the Lcl-CTD and GLYCAM-06j for C4S98. GLYCAM is one of the most commonly used force fields to simulate glycans and it is fully compatible with Amber force fields99. A truncated octahedral box of TIP3P100 water molecules was used to solvate the systems, setting a minimum distance of 12 Å between the protein and the edges of the box. Residues with ionisable groups were set to their standard protonation states at pH 7. Counterions (Na+ and Cl−) were added to neutralise the system and reach an ionic strength of 100 mM, leading to a total of ~50,100 atoms (~14,600 water molecules). Periodic boundary conditions were applied. The equations of motion were integrated using the leap-frog method with a 2-fs time step. The LINCS101 algorithm was used to constrain all covalent bonds in the protein, while SETTLE102 was used for water molecules. Electrostatic interactions were evaluated with the Particle Mesh Ewald (PME) method103 using a 9-Å distance cut-off for the direct space sums, a 1.2 Å FFT grid spacing and a fourth-order interpolation polynomial for the reciprocal space sums. A 9 Å cut-off was set for van der Waals interactions and long-range corrections to the dispersion energy were included.
Each system was minimised in three stages: 5000 steps of steepest descent with positional restraints on heavy atoms, a further 5000 steps of steepest descent, and then 2000 steps of conjugate gradient. Positional restraints on heavy atoms were initially set to 4.8 kcal/mol/Å² and they were gradually decreased to 0, while the temperature was increased from 200 to 300 K at constant volume. The system was then allowed to move freely and was subjected to equilibration in NVT conditions at T = 300 K. This was followed by equilibration under NPT conditions with T = 300 K and p = 1 bar. For these equilibration steps, the Berendsen104 algorithm was used for both temperature and pressure regulation with coupling constants of 0.2 and 1 ps, respectively. Finally, NPT equilibration was run after switching to the v-rescale thermostat105 with a coupling constant of 0.1 ps and the Parrinello-Rahman barostat106 with a coupling constant of 2 ps. Longer equilibrations were run for Lcl-CTD in the presence of C4S (45 ns in total for all the steps) compared to Lcl-CTD alone (6.5 ns), to allow for relaxation of the GAG binding pose. Lcl-CTD simulations were run in three replicas (400 ns for each production, for a total production time of 1.2 μs). For the Lcl-CTD/C4S system, preliminary 50 ns production runs were first carried out. The trimeric structure of Lcl-CTD was very stable with the HM model, while unbinding of monomers from the rest of the protein was observed for HT1 and HT2, so only the HM model was retained for subsequent simulations. A total of 21 replicas were run for HM (150 ns production for each replica). The glycan remained in contact with Lcl-CTD in all but one replica, which was not considered for subsequent analysis, so that the overall production simulation time for Lcl-CTD/C4S was 3 μs.
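The integration, cut-off and coupling settings described above for the final NPT stage could be expressed as GROMACS .mdp parameters along the following lines. This is a minimal sketch, not the input file used in this study. The helper names `steps_for_ns` and `production_mdp` are hypothetical, and the values given for `tc-grps`, `pcoupltype`, `compressibility` and `DispCorr` are assumptions that are not stated in the text. Note also that `constraints = all-bonds` constrains every bond in the system, whereas the text specifies constraints on the covalent bonds of the protein.

```python
def steps_for_ns(nanoseconds: int, dt_fs: int = 2) -> int:
    """Number of integration steps for a run length in ns (1 ns = 1,000,000 fs)."""
    return nanoseconds * 1_000_000 // dt_fs


def production_mdp(nsteps: int) -> str:
    """Build the text of an .mdp file for an NPT production run."""
    params = {
        "integrator": "md",                # leap-frog integrator
        "dt": 0.002,                       # 2 fs time step (in ps)
        "nsteps": nsteps,
        "pbc": "xyz",                      # periodic boundary conditions
        "constraints": "all-bonds",        # assumption, see lead-in
        "constraint-algorithm": "lincs",
        "coulombtype": "PME",
        "rcoulomb": 0.9,                   # 9 A direct-space cut-off (in nm)
        "fourierspacing": 0.12,            # 1.2 A FFT grid spacing (in nm)
        "pme-order": 4,                    # fourth-order interpolation
        "rvdw": 0.9,                       # 9 A van der Waals cut-off (in nm)
        "DispCorr": "EnerPres",            # assumption: energy and pressure
        "tcoupl": "v-rescale",
        "tc-grps": "System",               # assumption: one coupling group
        "tau-t": 0.1,                      # ps
        "ref-t": 300,                      # K
        "pcoupl": "Parrinello-Rahman",
        "pcoupltype": "isotropic",         # assumption
        "tau-p": 2.0,                      # ps
        "ref-p": 1.0,                      # bar
        "compressibility": 4.5e-5,         # assumption: water, bar^-1
    }
    return "\n".join(f"{key:<22} = {value}" for key, value in params.items()) + "\n"


# One 150 ns Lcl-CTD/C4S replica at a 2 fs time step.
mdp_text = production_mdp(steps_for_ns(150))
print(mdp_text)
```

With a 2-fs time step, a 150 ns replica corresponds to 75,000,000 steps and a 400 ns replica to 200,000,000 steps.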
Contacts between C4S and the protein were analysed with the bio3D107 R-package. A residue was considered in contact with C4S if the minimum distance calculated over all pairs of non-hydrogen side chain atoms was lower than 4 Å. Frames sampled every 100 ps were analysed. The frequency of occurrence of a given contact was calculated as the percentage of frames in which that contact was observed. The highest occurring contact between any part of C4S and a given protein residue was calculated and averaged over all 20 replicas to give the final value. Cluster analyses were performed using the gromos108 method implemented in GROMACS on the pseudo-trajectories generated by concatenating all the replicas for a given system (production only; 3 × 400 ns for Lcl-CTD and 20 × 150 ns for Lcl-CTD/C4S), with frames sampled every 1000 ps. For both calculations, the Lcl-CTD Cα atoms, not including the flexible residues 271 to 277, were first fitted to the coordinates of the initial minimised structure. The distance between structures was calculated as the RMSD of the Cα atoms (271 to 277 excluded) for the Lcl-CTD simulations and the RMSD of all C4S atoms for the Lcl-CTD/C4S simulations. Cut-off values were determined to optimise the clustering for each system and were set to 1.1 Å for Lcl-CTD and 17.5 Å for Lcl-CTD/C4S. The large value for C4S reflects the variety of binding poses explored by the GAG in the different replicas. The structure with the highest number of neighbours for each cluster (central structure) was selected as cluster representative. The population of each cluster was adjusted to consider the 3-fold symmetry of the system. For a given cluster, each frame in the cluster was first rotated by ~120° in both directions. Rotation was carried out by superimposing symmetrically equivalent monomers.
If after rotation the C4S structure in the frame was found to be closer to a cluster representative different from the original cluster (as measured by the C4S RMSD), the frame was reassigned to that cluster. The spatial distribution function109 (sdf) of C4S sulfur atoms around the protein was calculated by running the GROMACS gmx spatial tool on the pseudo-trajectory of concatenated replicas (production only), with frames sampled every 1 ps. Each frame was first fitted to the minimised starting structure using best-fit superposition of Cα atoms (271 to 277 excluded). A grid spacing of 0.5 Å was used for the sdf calculation. The average of non-null sdf values was calculated and the isosurface connecting points with sdf = 20 × average was analysed. Each frame of the Lcl-CTD/C4S simulations was classified into one of three binding categories (1-chain, 2-chains, or 3-chains) by calculating the number of Lcl-CTD chains within 3 Å of C4S (with the distance calculated as minimum distance between all possible pairs of non-hydrogen atoms from C4S and Lcl-CTD). The system setup for these simulations is summarised in Supplementary Table 2.
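The two minimum-distance criteria used above (a 4 Å cut-off between non-hydrogen side chain atoms for residue contacts, and a 3 Å cut-off between non-hydrogen atoms for the binding categories) can be illustrated as follows. This is a minimal sketch on toy coordinates and is not the bio3D-based analysis used in this study: the function names `min_distance`, `contact_frequency` and `binding_category` are hypothetical, all coordinates are invented, and reading "within 3 Å" as a strict inequality is an assumption.

```python
from math import dist
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]  # one heavy-atom position in angstroms


def min_distance(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Minimum distance over all pairs of atoms from the two sets."""
    return min(dist(p, q) for p in a for q in b)


def contact_frequency(
    residue_frames: List[List[Coord]],
    ligand_frames: List[List[Coord]],
    cutoff: float = 4.0,
) -> float:
    """Percentage of frames in which the residue is in contact with the ligand."""
    hits = sum(
        min_distance(residue, ligand) < cutoff
        for residue, ligand in zip(residue_frames, ligand_frames)
    )
    return 100.0 * hits / len(ligand_frames)


def binding_category(
    chains: List[List[Coord]], ligand: List[Coord], cutoff: float = 3.0
) -> int:
    """Number of protein chains with any heavy atom closer than the cut-off."""
    return sum(min_distance(chain, ligand) < cutoff for chain in chains)


# Toy data: a one-atom "ligand" at the origin in four frames.
ligand_frames = [[(0.0, 0.0, 0.0)] for _ in range(4)]
residue_frames = [
    [(3.0, 0.0, 0.0), (6.0, 0.0, 0.0)],  # contact (3.0 A)
    [(5.0, 0.0, 0.0)],                   # no contact
    [(0.0, 3.5, 0.0)],                   # contact (3.5 A)
    [(10.0, 0.0, 0.0)],                  # no contact
]
print(contact_frequency(residue_frames, ligand_frames))

# Toy data: three chains, two of which approach the ligand.
chains = [[(2.0, 0.0, 0.0)], [(0.0, 2.5, 0.0)], [(8.0, 0.0, 0.0)]]
print(binding_category(chains, ligand_frames[0]))
```

In the toy input the residue is in contact in two of the four frames, and two of the three chains lie within the cut-off, which would place that frame in the 2-chains category.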
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Data availability
Atomic coordinates and structure factor files generated in this study have been deposited in the Protein Data Bank database under accession codes 8Q4E (Lcl-CTD) and 8QK8 (Lcl-CTD/SO4). NMR assignments have been deposited in the Biological Magnetic Resonance Data Bank database under accession codes 52394 (Lcl-CTD) and 52395 (Lcl-CLR peptide). SAXS curves have been deposited in the Small Angle Scattering Data Bank database under accession codes SASDUG7 (Lcl-CTD WT), SASDUH7 (Lcl-CTD R477A trimer), SASDUJ7 (Lcl-CTD R477A monomer), SASDUK7 (Lcl-CTD E503A), SASDUL7 (Lcl-CTD K504A), SASDUM7 (Lcl-CTD K515A), SASDUN7 (Lcl-CTD K520A), SASDUP7 (Lcl-CTD D521A), SASDUQ7 (Lcl-CTD K526A). Initial and final structures from MD simulations are available at https://doi.org/10.5281/zenodo.10961237 and https://doi.org/10.5281/zenodo.10974841. The authors will provide raw data, additional information, and materials, including plasmids for protein expression, upon request. Requests should be addressed to J.G. Source data are provided with this paper.
Code availability
GROMACS tools and bespoke Python scripts using the MDAnalysis library were used to analyse the molecular dynamics trajectories for Lcl-N. These are freely available at https://doi.org/10.5281/zenodo.10961237. The codes used to analyse the molecular dynamics trajectories of Lcl-CTD/C4S are available upon request; requests should be addressed to A.F.
References
Tison, D. L., Pope, D. H., Cherry, W. B. & Fliermans, C. B. Growth of Legionella pneumophila in association with blue-green algae (cyanobacteria). Appl. Environ. Microbiol. 39, 456–459 (1980).
Stewart, C. R., Muthye, V. & Cianciotto, N. P. Legionella pneumophila persists within biofilms formed by Klebsiella pneumoniae, Flavobacterium sp., and Pseudomonas fluorescens under dynamic flow conditions. PLoS ONE 7, e50560 (2012).
Rowbotham, T. J. Preliminary report on the pathogenicity of Legionella pneumophila for freshwater and soil amoebae. J. Clin. Pathol. 33, 1179–1183 (1980).
Fields, B. S., Benson, R. F. & Besser, R. E. Legionella and Legionnaires' disease: 25 years of investigation. Clin. Microbiol. Rev. 15, 506–526 (2002).
Steinert, M., Heuner, K., Buchrieser, C., Albert-Weissenberger, C. & Glockner, G. Legionella pathogenicity: genome structure, regulatory networks and the host cell response. Int. J. Med. Microbiol. 297, 577–587 (2007).
Isberg, R. R., O'Connor, T. J. & Heidtman, M. The Legionella pneumophila replication vacuole: making a cosy niche inside host cells. Nat. Rev. Microbiol. 7, 13–24 (2009).
Hubber, A. & Roy, C. R. Modulation of host cell function by Legionella pneumophila type IV effectors. Annu. Rev. Cell Dev. Biol. 26, 261–283 (2010).
Schroeder, G. N. The toolbox for uncovering the functions of Legionella Dot/Icm Type IVb secretion system effectors: current state and future directions. Front. Cell Infect. Microbiol. 7, 528 (2017).
White, R. C. & Cianciotto, N. P. Assessing the impact, genomics and evolution of type II secretion across a large, medically important genus: the Legionella type II secretion paradigm. Microb. Genom. 5, e000273 (2019).
Soderberg, M. A., Rossier, O. & Cianciotto, N. P. The type II protein secretion system of Legionella pneumophila promotes growth at low temperatures. J. Bacteriol. 186, 3712–3720 (2004).
DebRoy, S., Dao, J., Soderberg, M., Rossier, O. & Cianciotto, N. P. Legionella pneumophila type II secretome reveals unique exoproteins and a chitinase that promotes bacterial persistence in the lung. Proc. Natl Acad. Sci. USA 103, 19146–19151 (2006).
McCoy-Simandle, K. et al. Legionella pneumophila type II secretion dampens the cytokine response of infected macrophages and epithelia. Infect. Immun. 79, 1984–1997 (2011).
White, R. C. & Cianciotto, N. P. Type II secretion is necessary for optimal association of the Legionella-containing vacuole with macrophage Rab1B but enhances intracellular replication mainly by Rab1B-independent mechanisms. Infect. Immun. 84, 3313–3327 (2016).
White, R. C. et al. Type II secretion-dependent aminopeptidase LapA and acyltransferase PlaC are redundant for nutrient acquisition during Legionella pneumophila intracellular infection of amoebas. MBio 9, e00528-18 (2018).
White, R. C., Truchan, H. K., Zheng, H., Tyson, J. Y. & Cianciotto, N. P. Type II secretion promotes bacterial growth within the Legionella-containing vacuole in infected amoebae. Infect. Immun. 87, e00374-19 (2019).
Mallama, C. A., McCoy-Simandle, K. & Cianciotto, N. P. The type II secretion system of Legionella pneumophila dampens the MyD88 and toll-like receptor 2 signaling pathway in infected human macrophages. Infect. Immun. 85, e00897-16 (2017).
Portlock, T. J. et al. Structure, dynamics and cellular insight into novel substrates of the Legionella pneumophila type II secretion system. Front. Mol. Biosci. 7, 112 (2020).
Rehman, S. et al. Structure and functional analysis of the Legionella pneumophila chitinase ChiA reveals a novel mechanism of metal-dependent mucin degradation. PLoS Pathog. 16, e1008342 (2020).
Vandersmissen, L., De Buck, E., Saels, V., Coil, D. A. & Anne, J. A Legionella pneumophila collagen-like protein encoded by a gene with a variable number of tandem repeats is involved in the adherence and invasion of host cells. FEMS Microbiol. Lett. 306, 168–176 (2010).
Duncan, C. et al. Lcl of Legionella pneumophila is an immunogenic GAG binding adhesin that promotes interactions with lung epithelial cells and plays a crucial role in biofilm formation. Infect. Immun. 79, 2168–2181 (2011).
Abdel-Nour, M. et al. Polymorphisms of a collagen-like adhesin contributes to Legionella pneumophila adhesion, biofilm formation capacity and clinical prevalence. Front. Microbiol. 10, 604 (2019).
Galka, F. et al. Proteomic characterization of the whole secretome of Legionella pneumophila and functional analysis of outer membrane vesicles. Infect. Immun. 76, 1825–1836 (2008).
Mallegol, J. et al. Essential roles and regulation of the Legionella pneumophila collagen-like adhesin during biofilm formation. PLoS ONE 7, e46462 (2012).
Chatfield, C. H., Zaia, J. & Sauer, C. Legionella pneumophila attachment to biofilms of an acidovorax isolate from a drinking water-consortium requires the Lcl-adhesin protein. Int. Microbiol. 23, 597–605 (2020).
Abdel-Nour, M. et al. The Legionella pneumophila collagen-like protein mediates sedimentation, autoaggregation, and pathogen-phagocyte interactions. Appl. Environ. Microbiol. 80, 1441–1454 (2014).
Coil, D. A. et al. Intragenic tandem repeat variation between Legionella pneumophila strains. BMC Microbiol. 8, 218 (2008).
Gandhi, N. S. & Mancera, R. L. The structure of glycosaminoglycans and their interactions with proteins. Chem. Biol. Drug Des. 72, 455–482 (2008).
Jinno, A. & Park, P. W. Role of glycosaminoglycans in infectious disease. Methods Mol. Biol. 1229, 567–585 (2015).
Thomas, R. & Brooks, T. Common oligosaccharide moieties inhibit the adherence of typical and atypical respiratory pathogens. J. Med. Microbiol. 53, 833–840 (2004).
Yaradou, D. F. et al. Zinc-dependent cytoadherence of Legionella pneumophila to human alveolar epithelial cells in vitro. Microb. Pathog. 43, 234–242 (2007).
Rao, C., Benhabib, H. & Ensminger, A. W. Phylogenetic reconstruction of the Legionella pneumophila Philadelphia-1 laboratory strains through comparative genomics. PLoS ONE 8, e64129 (2013).
Lu, S. et al. CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Res. 48, D265–D268 (2020).
Buchan, D. W. A. & Jones, D. T. The PSIPRED protein analysis workbench: 20 years on. Nucleic Acids Res. 47, W402–W407 (2019).
Gautier, R., Douguet, D., Antonny, B. & Drin, G. HELIQUEST: a web server to screen sequences with specific alpha-helical properties. Bioinformatics 24, 2101–2102 (2008).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Holm, L. DALI and the persistence of protein shape. Protein Sci. 29, 128–140 (2020).
Huang, K. F. et al. Crystal structure of a platelet-agglutinating factor isolated from the venom of Taiwan habu (Trimeresurus mucrosquamatus). Biochem. J. 378, 399–407 (2004).
Horii, K., Okuda, D., Morita, T. & Mizuno, H. Crystal structure of EMS16 in complex with the integrin alpha2-I domain. J. Mol. Biol. 341, 519–527 (2004).
Luo, Y. et al. Crystal structure of enteropathogenic Escherichia coli intimin-receptor complex. Nature 405, 1073–1077 (2000).
Hamburger, Z. A., Brown, M. S., Isberg, R. R. & Bjorkman, P. J. Crystal structure of invasin: a bacterial integrin-binding protein. Science 286, 291–295 (1999).
Papakonstantinou, E. & Karakiulakis, G. The "sweet" and "bitter" involvement of glycosaminoglycans in lung diseases: pharmacotherapeutic relevance. Br. J. Pharmacol. 157, 1111–1127 (2009).
Xu, D., Prestegard, J. H., Linhardt, R. J. & Esko, J. D. Proteins that bind sulfated glycosaminoglycans. In Essentials of Glycobiology (eds Varki, A. et al.) Ch. 38 (Cold Spring Harbor Laboratory Press, 2022).
Su, H., Li, S., Terebiznik, M., Guyard, C. & Kerman, K. Biosensors for the detection of interaction between Legionella pneumophila collagen-like protein and glycosaminoglycans. Sensors 18, 2668 (2018).
van Zundert, G. C. P. et al. The HADDOCK2.2 Web Server: user-friendly integrative modeling of biomolecular complexes. J. Mol. Biol. 428, 720–725 (2016).
Dominguez, C., Boelens, R. & Bonvin, A. M. HADDOCK: a protein-protein docking approach based on biochemical or biophysical information. J. Am. Chem. Soc. 125, 1731–1737 (2003).
Stone, B. J. & Abu Kwaik, Y. Expression of multiple pili by Legionella pneumophila: identification and characterization of a type IV pilin gene and its role in adherence to mammalian and protozoan cells. Infect. Immun. 66, 1768–1775 (1998).
Hoppe, J. et al. PilY1 promotes Legionella pneumophila infection of human lung tissue explants and contributes to bacterial adhesion, host cell invasion, and twitching motility. Front. Cell Infect. Microbiol. 7, 63 (2017).
Treuner-Lange, A. et al. PilY1 and minor pilins form a complex priming the type IVa pilus in Myxococcus xanthus. Nat. Commun. 11, 5054 (2020).
Garduno, R. A., Garduno, E. & Hoffman, P. S. Surface-associated hsp60 chaperonin of Legionella pneumophila mediates invasion in a HeLa cell model. Infect. Immun. 66, 4602–4610 (1998).
Cirillo, S. L., Bermudez, L. E., El-Etr, S. H., Duhamel, G. E. & Cirillo, J. D. Legionella pneumophila entry gene rtxA is involved in virulence. Infect. Immun. 69, 508–517 (2001).
Bellinger-Kawahara, C. & Horwitz, M. A. Complement component C3 fixes selectively to the major outer membrane protein (MOMP) of Legionella pneumophila and mediates phagocytosis of liposome-MOMP complexes by human monocytes. J. Exp. Med. 172, 1201–1210 (1990).
Chang, B., Kura, F., Amemura-Maekawa, J., Koizumi, N. & Watanabe, H. Identification of a novel adhesion molecule involved in the virulence of Legionella pneumophila. Infect. Immun. 73, 4272–4280 (2005).
Truchan, H. K., Christman, H. D., White, R. C., Rutledge, N. S. & Cianciotto, N. P. Type II secretion substrates of Legionella pneumophila translocate out of the pathogen-occupied vacuole via a semi-permeable membrane. mBio 8, e00870-17 (2017).
Qiu, Y., Zhai, C., Chen, L., Liu, X. & Yeo, J. Current insights on the diverse structures and functions in bacterial collagen-like proteins. ACS Biomater. Sci. Eng. 9, 3778–3795 (2023).
Zhuoxin, Y., An, B., Ramshaw, J. A. M. & Brodsky, B. Bacterial collagen-like proteins that form triple-helical structures. J. Struct. Biol. 186, 451–461 (2014).
Han, R. et al. Assessment of prokaryotic collagen-like sequences derived from streptococcal Scl1 and Scl2 proteins as a source of recombinant GXY polymers. Appl. Microbiol. Biotechnol. 72, 109–115 (2006).
Boudko, S. P., Engel, J. & Bachinger, H. P. The crucial role of trimerization domains in collagen folding. Int. J. Biochem. Cell Biol. 44, 21–32 (2012).
Mittag, T. et al. Dynamic equilibrium engagement of a polyvalent ligand with a single-site receptor. Proc. Natl Acad. Sci. USA 105, 17772–17777 (2008).
Stewart, C. R., Burnside, D. M. & Cianciotto, N. P. The surfactant of Legionella pneumophila is secreted in a TolC-dependent manner and is antagonistic toward other Legionella species. J. Bacteriol. 193, 5971–5984 (2011).
Rossier, O., Starkenburg, S. R. & Cianciotto, N. P. Legionella pneumophila type II protein secretion promotes virulence in the A/J mouse model of Legionnaires' disease pneumonia. Infect. Immun. 72, 310–321 (2004).
Chatfield, C. H. & Cianciotto, N. P. Culturing, media, and handling of legionella. Methods Mol. Biol. 954, 151–162 (2013).
Campbell, J. A. & Cianciotto, N. P. Legionella pneumophila Cas2 promotes the expression of small heat shock protein C2 that is required for thermal tolerance and optimal intracellular infection. Infect. Immun. 90, e0036922 (2022).
Bryan, A., Harada, K. & Swanson, M. S. Efficient generation of unmarked deletions in Legionella pneumophila. Appl. Environ. Microbiol. 77, 2545–2548 (2011).
Bryan, A., Abbott, Z. D. & Swanson, M. S. Constructing unmarked gene deletions in Legionella pneumophila. Methods Mol. Biol. 954, 197–212 (2013).
Mould, A. P., Holmes, D. F., Kadler, K. E. & Chapman, J. A. Mica sandwich technique for preparing macromolecules for rotary shadowing. J. Ultrastruct. Res. 91, 66–76 (1985).
Ghosh, N. et al. Collagen-like proteins in pathogenic E. coli strains. PLoS ONE 7, e37872 (2012).
Steinegger, M. & Soding, J. MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets. Nat. Biotechnol. 35, 1026–1028 (2017).
Steinegger, M. et al. HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinform. 20, 473 (2019).
Mirdita, M. et al. ColabFold - making protein folding accessible to all. bioRxiv https://doi.org/10.1101/2021.08.15.456425 (2021).
Delaglio, F. et al. NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293 (1995).
Vranken, W. F. et al. The CCPN data model for NMR spectroscopy: development of a software pipeline. Proteins 59, 687–696 (2005).
Sormanni, P., Camilloni, C., Fariselli, P. & Vendruscolo, M. The s2D method: simultaneous sequence-based prediction of the statistical populations of ordered and disordered regions in proteins. J. Mol. Biol. 427, 982–996 (2015).
Jo, S., Kim, T., Iyer, V. G. & Im, W. CHARMM-GUI: a web-based graphical user interface for CHARMM. J. Comput. Chem. 29, 1859–1865 (2008).
Abraham, M. J. et al. GROMACS: high performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1–2, 19–25 (2015).
Huang, J. et al. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat. Methods 14, 71–73 (2017).
Michaud-Agrawal, N., Denning, E. J., Woolf, T. B. & Beckstein, O. MDAnalysis: a toolkit for the analysis of molecular dynamics simulations. J. Comput. Chem. 32, 2319–2327 (2011).
Kabsch, W. XDS. Acta Crystallogr. D Biol. Crystallogr. 66, 125–132 (2010).
Evans, P. R. & Murshudov, G. N. How good are my data and what is the resolution? Acta Crystallogr. D Biol. Crystallogr. 69, 1204–1214 (2013).
Winter, G., Lobley, C. M. & Prince, S. M. Decision making in xia2. Acta Crystallogr. D Biol. Crystallogr. 69, 1260–1273 (2013).
Sheldrick, G. M. A short history of SHELX. Acta Crystallogr. A 64, 112–122 (2008).
Vonrhein, C., Blanc, E., Roversi, P. & Bricogne, G. Automated structure solution with autoSHARP. Methods Mol. Biol. 364, 215–230 (2007).
Langer, G., Cohen, S. X., Lamzin, V. S. & Perrakis, A. Automated macromolecular model building for X-ray crystallography using ARP/wARP version 7. Nat. Protoc. 3, 1171–1179 (2008).
Casanal, A., Lohkamp, B. & Emsley, P. Current developments in coot for macromolecular model building of electron cryo-microscopy and crystallographic data. Protein Sci. 29, 1069–1078 (2020).
Winn, M. D., Murshudov, G. N. & Papiz, M. Z. Macromolecular TLS refinement in REFMAC at moderate resolutions. Methods Enzymol. 374, 300–321 (2003).
McCoy, A. J. et al. Phaser crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).
de Vries, I. et al. New restraints and validation approaches for nucleic acid structures in PDB-REDO. Acta Crystallogr. D Struct. Biol. 77, 1127–1141 (2021).
Williams, C. J. et al. MolProbity: more and better reference data for improved all-atom structure validation. Protein Sci. 27, 293–315 (2018).
Cowieson, N. P. et al. Beamline B21: high-throughput small-angle X-ray scattering at diamond light source. J. Synchrotron Radiat. 27, 1438–1446 (2020).
Panjkovich, A. & Svergun, D. I. CHROMIXS: automatic and interactive analysis of chromatography-coupled small-angle X-ray scattering data. Bioinformatics 34, 1944–1946 (2018).
Franke, D. et al. ATSAS 2.8: a comprehensive data analysis suite for small-angle scattering from macromolecular solutions. J. Appl. Crystallogr. 50, 1212–1225 (2017).
Sali, A. & Blundell, T. L. Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779–815 (1993).
Tria, G., Mertens, H. D., Kachala, M. & Svergun, D. I. Advanced ensemble modelling of flexible macromolecules using X-ray solution scattering. IUCrJ 2, 207–217 (2015).
Johnson, B. A. & Blevins, R. A. NMR view: a computer program for the visualization and analysis of NMR data. J. Biomol. NMR 4, 603–614 (1994).
Sepuru, K. M., Nagarajan, B., Desai, U. R. & Rajarathnam, K. Molecular basis of chemokine CXCL5-glycosaminoglycan interactions. J. Biol. Chem. 291, 20539–20550 (2016).
Singh, A., Montgomery, D., Xue, X., Foley, B. L. & Woods, R. J. GAG Builder: a web-tool for modeling 3D structures of glycosaminoglycans. Glycobiology 29, 515–518 (2019).
van Aalten, D. M. et al. PRODRG, a program for generating molecular topologies and unique molecular descriptors from coordinates of small molecules. J. Comput. Aided Mol. Des. 10, 255–262 (1996).
Lindorff-Larsen, K. et al. Systematic validation of protein force fields against experimental data. PLoS ONE 7, e32131 (2012).
Kirschner, K. N. et al. GLYCAM06: a generalizable biomolecular force field. Carbohydrates. J. Comput. Chem. 29, 622–655 (2008).
Foley, B. L., Tessier, M. B. & Woods, R. J. Carbohydrate force fields. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2, 652–697 (2012).
Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W. & Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 79, 926–935 (1983).
Hess, B., Bekker, H., Berendsen, H. J. C. & Fraaije, J. G. E. M. LINCS: a linear constraint solver for molecular simulations. J. Comput. Chem. 18, 1463–1472 (1997).
Miyamoto, S. & Kollman, P. A. Settle: an analytical version of the SHAKE and RATTLE algorithm for rigid water models. J. Comput. Chem. 13, 952–962 (1992).
Essmann, U. et al. A smooth particle mesh Ewald method. J. Chem. Phys. 103, 8577–8593 (1995).
Berendsen, H. J. C., Postma, J. P. M., van Gunsteren, W. F., DiNola, A. & Haak, J. R. Molecular dynamics with coupling to an external bath. J. Chem. Phys. 81, 3684–3690 (1984).
Bussi, G., Donadio, D. & Parrinello, M. Canonical sampling through velocity rescaling. J. Chem. Phys. 126, 014101 (2007).
Parrinello, M. & Rahman, A. Polymorphic transitions in single crystals: a new molecular dynamics method. J. Appl. Phys. 52, 7182–7190 (1981).
Skjærven, L., Yao, X.-Q., Scarabelli, G. & Grant, B. J. Integrating protein structural dynamics and evolutionary analysis with Bio3D. BMC Bioinform. 15, 399 (2014).
Daura, X. et al. Peptide folding: when simulation meets experiment. Angew. Chem. Int. Ed. 38, 236–240 (1999).
Fornili, A., Autore, F., Chakroun, N., Martinez, P. & Fraternali, F. Protein–water interactions in MD simulations: POPS/POPSCOMP solvent accessibility analysis, solvation forces and hydration sites. In Computational Drug Discovery and Design (ed. Baron, R.) (Springer, 2012).
Adasme, M. F. et al. PLIP 2021: expanding the scope of the protein–ligand interaction profiler to DNA and RNA. Nucleic Acids Res. 49, W530–W534 (2021).
Acknowledgements
This work was supported by the MRC (MR/M009920/1, MR/R017662/1, MR/W000814/1) and EPSRC (1806169) to J.G., and NIH (AI043987, AI175460) to N.C. Work was also supported by the Wellcome Trust (099185/Z/12/Z), and we thank HWB-NMR staff at the University of Birmingham for providing open access to their Wellcome Trust-funded 900 MHz spectrometer. In addition, this work was supported by the Francis Crick Institute through provision of access to the MRC Biomedical NMR Centre. The Francis Crick Institute receives its core funding from Cancer Research UK (FC001029), the MRC (CC1078), and the Wellcome Trust (CC1078). We also thank the Centre for Biomolecular Spectroscopy at King's College London for additional NMR access, funded by the Wellcome Trust (202767/Z/16/Z) and British Heart Foundation (IG/16/2/32273). This work made use of time on HPC granted via the UK High-End Computing Consortium for Biomolecular Simulation (HECBioSim), supported by EPSRC (EP/R029407/1, EP/X035603/1). We thank the beamline scientists at I03 and BL21 of the Diamond Light Source, United Kingdom. We would also like to thank Prof. Krishna Rajarathnam and Dr. K. Mohan Sepuru (UTMB) for their guidance in creating C4S input files for use in HADDOCK.
Author information
Authors and Affiliations
Contributions
Conceived and designed the experiments: S.R., A.K.A., I.M., H.Z., L.C., M.B., C.A., A.O., G.M., S.W., G.K., C.D., A.F., N.C. and J.G. Performed the experiments: S.R., A.K.A., I.M., H.Z., L.C., M.B., C.A., T.P., K.R., R.S., A.O., G.M., S.W., G.K., A.F. and J.G. Analyzed the data: S.R., A.K.A., I.M., H.Z., L.C., M.B., C.A., T.P., K.R., R.S., C.D., A.F., N.C. and J.G. Contributed reagents/materials/analysis tools: A.K.A., C.D., A.F., N.C. and J.G. Wrote the paper: S.R., A.K.A., L.C., M.B., C.A., C.D., A.F., N.C. and J.G.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Communications thanks Peter Davies, Michal Hammel, Michael Williamson, and the other, anonymous, reviewer for their contribution to the peer review of this work. A peer review file is available.
Additional information
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Source data
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Rehman, S., Antonovic, A.K., McIntire, I.E. et al. The Legionella collagen-like protein employs a distinct binding mechanism for the recognition of host glycosaminoglycans. Nat Commun 15, 4912 (2024). https://doi.org/10.1038/s41467-024-49255-4
DOI: https://doi.org/10.1038/s41467-024-49255-4









