LSECtin is one of a group of glycan-binding receptors encoded near the end of human chromosome 19 (Liu et al. 2004). Other receptors in the cluster include the dendritic and endothelial cell receptors dendritic cell-specific intercellular adhesion molecule-3-grabbing nonintegrin (DC-SIGN) and DC-SIGN-related receptor (DC-SIGNR, also known as L-SIGN) and the structurally related Fc-binding receptor CD23 (Soilleux et al. 2000). All of these receptors contain C-type carbohydrate-recognition domains (CRDs). Although a related cluster is found in the syntenic region of chromosome 8 in mouse, there are major differences in the organization of several of the genes, with eight mouse DC-SIGN-related genes replacing the two human genes (Powlesland et al. 2006). These differences make it difficult to identify clear mouse-human orthologs. In contrast, the human and mouse LSECtin genes appear to be much more closely related, although the two sequences differ at key residues that have been proposed to be involved in selective binding of ligands (Powlesland et al. 2008; Tang et al. 2009).
The mRNA for LSECtin is found in a restricted pattern, with the primary sites of expression being sinusoidal endothelial cells in liver and lymph nodes (Liu et al. 2004), although expression has also been reported in suitably stimulated dendritic cells and macrophages in vitro and in Kupffer cells (Domínguez-Soto et al. 2007; Domínguez-Soto et al. 2009). Unlike many glycan-binding receptors, LSECtin does not appear to function in clathrin-mediated endocytosis (Gramberg et al. 2008; Powlesland et al. 2008), although it can undergo antibody-induced internalization in myeloid cells (Domínguez-Soto et al. 2007). Several biological functions for LSECtin mediated by its ability to interact with specific glycans have been investigated. In vitro studies demonstrate that human LSECtin can provide a route for Ebola virus and SARS coronavirus infection of cells initiated by binding of LSECtin to glycans of the viral envelope glycoproteins (Gramberg et al. 2005; Domínguez-Soto et al. 2007; Powlesland et al. 2008). Studies in mice indicate that mouse LSECtin is able to bind T cells through glycans on the CD44 T-cell surface marker (Tang et al. 2010). This interaction leads to suppression of T-cell responses, and knockout mice lacking LSECtin show increased sensitivity to liver damage mediated by T cells following acute injury, suggesting that LSECtin plays a role in protection of the liver from immune attack (Tang et al. 2009). Finally, it has been suggested that the receptor expressed on myeloid cells could be involved in antigen capture and presentation (Domínguez-Soto et al. 2007).
Use of the knockout mice (Tang et al. 2009) and other mouse-specific tools to model interactions of the human receptor is based on the assumption that the human and mouse proteins function similarly. Work with DC-SIGN and its homologues provides a precedent for rapid evolution of function in this region of the genome, and there are some potentially important differences in the sequences of human and mouse LSECtin. There is therefore a need for empirical evidence comparing the biochemical properties of the two receptors in order to provide a foundation for extrapolating results from mice to humans.
In the studies described in this manuscript, biochemical characterization of mouse LSECtin is presented and close similarity in the sugar-binding properties of the two receptors, and their ability to interact with Ebola virus glycoprotein, is documented. In addition, it is demonstrated that the simple disaccharide GlcNAcβ1-2Man acts as an effective inhibitor of binding to the viral glycoprotein at micromolar concentrations.
Expression and characterization of the CRD from mouse LSECtin
LSECtin is a type 2 transmembrane protein in which the C-type CRD resides at the C-terminal end of the protein (Figure 1A). An initial survey of the distribution of LSECtin expression was undertaken using the polymerase chain reaction to amplify the LSECtin-coding region from a panel of cDNAs with a pair of primers that lie just outside the protein-coding region at each end (Figure 1B). Expression was detected exclusively in the liver, as was previously observed for human LSECtin in a survey of a similar set of human tissues (Liu et al. 2004). This distribution is consistent with the idea that mouse LSECtin performs a function similar to human LSECtin in sinusoidal endothelial cells, although it was not possible to document expression in lymph node as well because of the difficulty of obtaining a suitable cDNA library.
The CRD from mouse LSECtin was initially expressed in a bacterial system previously used for production of the human protein (Powlesland et al. 2008), in which the portion of the cDNA encoding the CRD is inserted immediately after the bacterial ompA signal sequence. The signal sequence allows folding of the CRD in the periplasm, so the protein could be extracted and purified directly by affinity chromatography on immobilized sugars (data not shown). However, for the mouse CRD, more efficient expression was obtained by production of the CRD as inclusion bodies that were dissolved in guanidine hydrochloride and renatured by dilution and dialysis (Figure 1C). Similar yields were obtained by purification on either fucose- or mannose-Sepharose. However, due to the relatively weak affinity of individual CRDs for the monosaccharide ligands, a 10-mL column was required to achieve effective purification. The CRD was used directly in the solid-phase binding assays.
For characterization of the binding specificity of mouse LSECtin by glycan array analysis, a tetrameric form of the CRD was created by attaching a C-terminal biotinylation sequence to the CRD and co-expressing the CRD and biotin ligase (Figure 2A). The resulting protein was purified by affinity chromatography as for the untagged protein. Incubation of an excess of biotin-tagged CRD with fluorescently labeled streptavidin resulted in formation of tetramers that could be purified away from uncomplexed CRDs by re-chromatography on fucose-Sepharose. The tetrameric protein is efficiently retained on a 1-mL affinity column, whereas uncomplexed CRD washes through. The bound CRD–streptavidin complex was then eluted from the affinity column with ethylenediaminetetraacetic acid (EDTA) (Figure 2B).
Probing of version 4.0 of the glycan array created by the Consortium for Functional Glycomics (Blixt et al. 2004) revealed strong signals for a highly restricted subset of glycans (Figure 2C). The common feature of all of these glycans is the presence of terminal GlcNAcβ1-2Man disaccharides. Every oligosaccharide that gives a positive signal on the array contains one or more such terminal structures. Only two glycans that contain terminal GlcNAcβ1-2Man yield poor signals on the array, and in each case, these are bi-antennary oligosaccharides in which one branch is extended, which may lead to steric hindrance of binding of the remaining exposed GlcNAcβ1-2Man group.
The specificity of mouse LSECtin for the GlcNAcβ1-2Man disaccharide was further demonstrated in competition binding assays (Figure 3A and Table I). In these experiments, the CRD was immobilized on polystyrene wells and probed with 125I-labelled mannose-bovine serum albumin (BSA), a neoglycoprotein ligand with a high density of terminal mannose residues. Based on the measured inhibition constants, the affinity for the disaccharide was ∼1000-fold higher than for any of the monosaccharides tested.
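The relationship between an inhibition constant and the shape of a competition curve can be sketched as follows. This is a minimal illustration, assuming simple single-site competition with the labeled ligand present well below its own dissociation constant, so that the concentration giving half-maximal inhibition approximates KI. The two KI values are hypothetical placeholders chosen only to reproduce the roughly 1000-fold ratio described above; they are not the measured values reported in Table I.

```python
def fraction_bound(inhibitor_conc, ki):
    """Fraction of reporter ligand still bound in the presence of a competitor.

    Assumes single-site competition with a trace amount of labeled ligand,
    so that binding falls to one half when inhibitor_conc equals ki.
    """
    return 1.0 / (1.0 + inhibitor_conc / ki)


# Hypothetical inhibition constants in mol/L (illustrative only).
KI_MONOSACCHARIDE = 5e-3
KI_DISACCHARIDE = 5e-6

# Relative affinity expressed as a ratio of inhibition constants.
fold_preference = KI_MONOSACCHARIDE / KI_DISACCHARIDE

# Residual binding at a single competitor concentration of 50 micromolar.
residual_with_disaccharide = fraction_bound(50e-6, KI_DISACCHARIDE)
residual_with_monosaccharide = fraction_bound(50e-6, KI_MONOSACCHARIDE)
```

Under these assumptions, a competitor concentration that removes most of the binding when the disaccharide is used leaves binding almost unchanged when a monosaccharide is used, which is the behavior expected for a large difference in KI.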
Comparison of ligand-binding properties of human and mouse LSECtin
The initial characterization of mouse LSECtin binding to glycan ligands suggested that its ligand-binding specificity is closely analogous to that of the human protein (Powlesland et al. 2008). Testing of the pH sensitivity of ligand binding indicated that, as with human LSECtin, the binding activity of mouse LSECtin is enhanced at low pH (Figure 3B). The combined results thus suggest close similarity in the mechanism of ligand binding to human and mouse LSECtin.
The observed high degree of similarity between the human and mouse proteins was not necessarily predicted based on sequence comparisons. Previous modeling studies led to the hypothesis that an important contribution to high affinity binding of terminal GlcNAcβ1-2Man residues might be interaction of the N-acetyl group with the side chain of residue Trp259 (Powlesland et al. 2008). Surprisingly, this residue was found to be substituted by arginine in the mouse protein (Figure 4A). The similarity of the ligand-binding activities of the human and mouse proteins does not support the hypothesis that interactions of the aromatic side chain are important. The role of this residue was tested directly by creating mutant versions of the human CRD in which Trp259 is substituted with arginine, as found in the mouse protein, or with alanine. Comparison of binding of the wild-type and mutant proteins to the exposed terminal GlcNAcβ1-2Man disaccharides of Ebola virus glycoprotein confirmed that the absence of the tryptophan side chain does not significantly affect the interaction (Figure 4B).
Human and mouse LSECtin binding to Ebola virus glycoprotein and characterization of simple sugars as inhibitors
The similarity in the binding specificities of human and mouse LSECtin suggested that, like the human protein, mouse LSECtin might be a target for binding to the surface glycoprotein of Ebola virus. The Ebola glycoprotein has been shown to bear an unusually large number of under-processed glycans with terminal GlcNAcβ1-2Man that are not further modified by addition of galactose (Powlesland et al. 2008). A direct binding assay confirmed that human and mouse LSECtin interact with the Ebola glycoprotein-Fc fusion protein with very similar affinities of 0.52 µg mL−1 for the mouse protein and 0.48 µg mL−1 for the human protein (Figure 5). This result provides important evidence that the mouse protein is a potentially useful model for Ebola virus interaction with human sinusoidal endothelial cells.
Based on the results in Figure 3, showing that the simple GlcNAcβ1-2Man disaccharide competes at micromolar concentrations for binding of neoglycoprotein ligands to mouse LSECtin, as well as published work showing similar affinities of the human protein (Powlesland et al. 2008), the disaccharide was tested for its ability to compete for binding of both receptors to the Ebola virus glycoprotein. The results demonstrate that the KI of the disaccharide is in the low micromolar range for both human and mouse receptors (Figure 6 and Table II). This result provides an important lead toward the development of high-affinity inhibitors of Ebola virus binding to sinusoidal endothelial cells and supports the relevance of studies in the mouse to humans.
The close similarity in the biochemical properties of human and mouse LSECtin is in stark contrast to the widely divergent properties of the mouse orthologs of DC-SIGN. The difference in the evolutionary history of these two protein families, in spite of their proximity in mammalian genomes, indicates that these proteins are subject to different evolutionary pressures. One interpretation of the conserved binding properties of LSECtin is that this receptor may bind a conserved endogenous glycan ligand, whereas DC-SIGN is evolving to interact with different pathogen glycans. This interpretation is consistent with the fact that DC-SIGN functions as an endocytic receptor able to direct pathogen uptake and presentation to the immune system (Engering et al. 2002; Guo et al. 2004), whereas LSECtin does not have constitutive endocytic activity (Gramberg et al. 2008; Powlesland et al. 2008). In this scenario, the binding of endogenous ligands, such as CD44 on T cells, would be the primary function of LSECtin, which has been hijacked by certain pathogens such as Ebola virus.
Few other glycan-binding receptors show such conserved properties between species as is documented here for LSECtin. DC-SIGN is one of many receptors that show extreme divergence between human and mouse (Powlesland et al. 2006). The CD33 family of siglecs represents another example of independent radiation of a group of glycan-binding receptors (Crocker et al. 2007), and some receptors, such as the Kupffer cell receptor and prolectin, are present in rodents but not humans or vice versa (Fadden et al. 2003; Graham et al. 2009). Aside from the evolutionary and functional implications of the conservation of LSECtin function, the close similarities in the ligand-binding properties of the human and mouse proteins, as well as the similar restricted distribution of tissues in which they are expressed, provide a firm foundation for use of mouse LSECtin as a model for the human receptor, which is not always possible in the case of more divergent receptors. The development of low-molecular-weight inhibitors that work equally well on the human and mouse proteins, as well as the description of mice in which LSECtin has been knocked out (Tang et al. 2009), thus provide opportunities for exploring the physiological functions of the human receptor using mouse models.
The GlcNAcβ1-2Man disaccharide was purchased from Dextra Laboratories (Reading, UK). Soluble Ebola glycoprotein-Fc fusion protein was produced by transient transfection of 293T cells as described previously (Powlesland et al. 2008). Purified protein was iodinated by the chloramine T method (Greenwood et al. 1963).
Analysis of LSECtin expression
The polymerase chain reaction was carried out using a mouse cDNA panel and Advantage 2 DNA polymerase from ClonTech-Takara Bio Europe (Saint-Germain-en-Laye, France) under the standard reaction conditions described by the manufacturer. Forward and reverse oligonucleotide primers, obtained from Invitrogen (Paisley, UK), were 5′-ccagggctggacgccaccaccacc-3′ and 5′-ctggggtcactaaagcatgcactggtcagg-3′. Following an initial denaturation at 95°C for 1 min, 40 cycles of 95°C for 30 s and 68°C for 1 min were executed, and the products were resolved on a 2% agarose gel containing ethidium bromide.
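For readers who want to reproduce the amplification, the cycling conditions and primers can be written down as data, as in the following sketch. The primer sequences and times are taken from the text above; the variable and function names are illustrative, and describing the 68°C step as a combined annealing and extension step is an interpretation of the two-step program rather than something stated in the text. The computed time covers programmed holds only and ignores temperature ramping.

```python
# Primer sequences as given in the text (5' to 3').
FORWARD_PRIMER = "ccagggctggacgccaccaccacc"
REVERSE_PRIMER = "ctggggtcactaaagcatgcactggtcagg"

# Cycling program from the text, with durations in seconds.
INITIAL_DENATURATION_S = 60  # 95 C for 1 min
CYCLES = 40
STEPS_PER_CYCLE_S = [
    ("denaturation, 95 C", 30),
    ("annealing/extension, 68 C (interpretation)", 60),
]


def gc_fraction(seq):
    """Fraction of G and C bases in a primer."""
    seq = seq.lower()
    return sum(base in "gc" for base in seq) / len(seq)


def programmed_time_s():
    """Total programmed hold time, excluding ramping between temperatures."""
    per_cycle = sum(duration for _, duration in STEPS_PER_CYCLE_S)
    return INITIAL_DENATURATION_S + CYCLES * per_cycle


for name, primer in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
    print(name, len(primer), "nt, GC fraction", round(gc_fraction(primer), 2))
print("programmed time:", programmed_time_s() / 60, "min")
```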
Expression system for CRD from mouse LSECtin
A cDNA for mouse LSECtin was isolated by polymerase chain reaction amplification from a mouse liver cDNA library from BD Biosciences (Oxford, UK). Initially, an expression system using the ompA signal sequence was constructed by direct analogy with the method previously used for the human protein (Powlesland et al. 2008). For more efficient expression, synthetic oligonucleotides were combined with restriction fragments from the cDNA to insert the region encoding the CRD into vector T5T (Eisenberg et al. 1990), with codons for an initiator methionine followed by an alanine codon and then the cDNA sequence beginning at residue 161 of mouse LSECtin, which encodes the sequence SerCysGlu at the N-terminal end of the CRD. For expression of biotin-tagged protein, the C-terminal end of the cDNA was similarly modified by insertion of synthetic oligonucleotides so that the expression vector codes for a protein in which the C-terminal cysteine residue of LSECtin is followed by the sequence GlyLeuAsnAspIlePheGluAlaGlnLysIleGluTrpHisGlu (Schatz 1993). All sequences were confirmed by sequencing on an Applied Biosystems 310 genetic analyzer.
The CRD of LSECtin was expressed in Escherichia coli strain BL21(DE3) grown in Luria–Bertani broth containing 50 µg mL−1 ampicillin. For addition of biotin, the cells were also transformed with plasmid birA, which encodes the gene for biotin ligase (Chapman-Smith and Cronan 1999), and chloramphenicol at 20 µg mL−1 was included in the medium. For protein production, a 200 mL starter culture was diluted to 6 L and grown at 37°C to OD550 of 0.7, at which point additions were made to achieve final concentrations of 100 µg mL−1 isopropyl-β-d-thiogalactoside and 12.5 µg mL−1 biotin. After further growth for 2.5 h at 37°C, bacteria were harvested by centrifugation for 15 min at 3000 × g, washed in 400 mL of 10 mM Tris–Cl, pH 7.8, and collected by centrifugation for 10 min at 6000 × g. The final pellet was resuspended in 200 mL of 10 mM Tris–Cl, pH 7.8, and sonicated at full power in a Branson Model 250 sonicator for 6 × 30 s, with cooling on ice between sonication steps. The insoluble inclusion bodies were collected by centrifugation for 15 min at 10,000 × g and dissolved in 100 mL of 6 M guanidine hydrochloride containing 100 mM Tris–Cl, pH 7.0, by brief sonication. Following addition of 10 µL of 2-mercaptoethanol, the mixture was stirred for 30 min at 4°C and centrifuged for 30 min at 10,000 × g. The supernatant was dialyzed against three changes of 2 L of loading buffer (150 mM NaCl, 25 mM CaCl2, 25 mM Tris–Cl, pH 7.8) and then centrifuged for 5 min at 8000 × g and 30 min at 100,000 × g. The resulting supernatant was filtered through glass wool and applied to a 10 mL column of mannose- or fucose-Sepharose (Fornstedt and Porath 1975). The column was washed with 16 mL of loading buffer and eluted in 2 mL fractions with elution buffer (150 mM NaCl, 2.5 mM EDTA, 25 mM Tris–Cl, pH 7.8).
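The induction additions can be converted from final concentrations to masses with a short calculation. The sketch below assumes that the final culture volume at induction is the 6 L stated above and ignores the small volume contributed by the additions themselves; the abbreviation IPTG for isopropyl-β-d-thiogalactoside and all names in the code are illustrative.

```python
CULTURE_VOLUME_ML = 6000  # 6 L culture, as stated in the text

# Final concentrations from the text, in micrograms per millilitre.
FINAL_CONC_UG_PER_ML = {
    "IPTG": 100.0,   # isopropyl-beta-d-thiogalactoside
    "biotin": 12.5,
}


def mass_mg(conc_ug_per_ml, volume_ml):
    """Mass in milligrams needed to reach a final concentration in a volume."""
    return conc_ug_per_ml * volume_ml / 1000.0


masses_mg = {
    name: mass_mg(conc, CULTURE_VOLUME_ML)
    for name, conc in FINAL_CONC_UG_PER_ML.items()
}
print(masses_mg)
```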
Site-directed mutagenesis of human LSECtin
Mutations were introduced into the expression plasmid for the CRD from human LSECtin by the overlap extension method with the polymerase chain reaction (Ho et al. 1989), using flanking forward primer gtcgactctagataacgaggcgc and flanking reverse primer ggtcagcagttgtgccttttctcac and mutagenic primers gggagagcccaatgacgctagggggcgcgagaactgtg and cacagttctcgcgccccctagcgtcattgggctctccc for the W259R mutation and gggagagcccaatgacgctgcggggcgcgagaactgtg and cacagttctcgcgccccgcagcgtcattgggctctccc for the W259A mutation. The wild-type and mutant human proteins were expressed in the folded form in the periplasm of E. coli and were purified as described previously.
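Because the overlap extension method relies on each pair of mutagenic primers being exactly complementary, a quick consistency check on the sequences listed above can be useful before ordering oligonucleotides. The following sketch uses the primer sequences from the text; the function names are illustrative, and the check says nothing about annealing behavior or reading frame.

```python
COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def differing_positions(seq_a, seq_b):
    """Zero-based positions at which two equal-length sequences differ."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


# Mutagenic primer pairs as listed in the text.
W259R = ("gggagagcccaatgacgctagggggcgcgagaactgtg",
         "cacagttctcgcgccccctagcgtcattgggctctccc")
W259A = ("gggagagcccaatgacgctgcggggcgcgagaactgtg",
         "cacagttctcgcgccccgcagcgtcattgggctctccc")

for label, (sense, antisense) in (("W259R", W259R), ("W259A", W259A)):
    print(label, "pair complementary:", reverse_complement(sense) == antisense)

# Positions at which the two sense-strand primers differ from each other.
print("W259R vs W259A:", differing_positions(W259R[0], W259A[0]))
```

The two sense-strand primers differ only at adjacent positions, consistent with a single codon being changed to encode either arginine or alanine.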
Screening of glycan array
Approximately 250 µg of biotin-tagged CRD in 6 mL of eluting buffer was adjusted to 25 mM CaCl2 and incubated overnight at 4°C with 150 µg of Alexa 488-labeled streptavidin (Invitrogen). The complex was re-applied to a 1-mL column of mannose-Sepharose, which was rinsed with 5 mL of loading buffer and eluted with 0.5-mL aliquots of eluting buffer. Version 4.0 of the glycan array of the Consortium for Functional Glycomics was screened in buffer containing 150 mM NaCl, 2 mM CaCl2, 2 mM MgCl2, 20 mM Tris–Cl pH 7.4, 0.05% Tween 20 and 1% BSA following their standard protocol.
For competition binding assays and pH-dependence assays, polystyrene wells coated with CRD from LSECtin at a concentration of 50 µg mL−1 were probed with 125I-labeled mannose-BSA or Ebola glycoprotein-Fc fusion protein (Mitchell et al. 2001). For direct binding assays, biotin-tagged CRD was applied to streptavidin-coated wells that were incubated with unlabeled Ebola glycoprotein-Fc fusion protein at various dilutions followed by alkaline phosphatase-protein A conjugate and p-nitrophenylphosphate substrate (Powlesland et al. 2008). Data were fitted to binding equations using the nonlinear least-squares fitting function of SigmaPlot (Systat Software, Hounslow, UK) to determine the half-maximal concentration for binding (KD) or for inhibition of binding (KI).
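The kind of fit used to obtain a half-maximal binding concentration can be illustrated with a small standard-library sketch. This is not the SigmaPlot procedure used in the study: it assumes a simple one-site saturation model, exploits the fact that the best maximal signal has a closed form for any trial KD, and searches KD by a logarithmic grid followed by golden-section refinement. The data points are synthetic and noiseless, generated from hypothetical values (KD of 0.5 and maximal signal of 2.0 in arbitrary units), and all names are illustrative.

```python
import math


def binding(conc, kd, bmax=1.0):
    """One-site saturation binding: signal = bmax * conc / (kd + conc)."""
    return bmax * conc / (kd + conc)


def fit_kd(concs, signals, kd_lo=1e-3, kd_hi=1e3, steps=200):
    """Least-squares estimate of (kd, bmax) for the one-site model."""

    def sse_and_bmax(kd):
        # For fixed kd the model is linear in bmax, so solve bmax directly.
        x = [c / (kd + c) for c in concs]
        bmax = sum(xi * yi for xi, yi in zip(x, signals)) / sum(xi * xi for xi in x)
        sse = sum((yi - bmax * xi) ** 2 for xi, yi in zip(x, signals))
        return sse, bmax

    # Coarse search on a logarithmic grid of trial kd values.
    grid = [kd_lo * (kd_hi / kd_lo) ** (i / (steps - 1)) for i in range(steps)]
    best = min(range(steps), key=lambda i: sse_and_bmax(grid[i])[0])

    # Golden-section refinement in log(kd) between the neighbouring grid points.
    lo = math.log(grid[max(best - 1, 0)])
    hi = math.log(grid[min(best + 1, steps - 1)])
    phi = (math.sqrt(5) - 1) / 2
    for _ in range(60):
        a = hi - phi * (hi - lo)
        b = lo + phi * (hi - lo)
        if sse_and_bmax(math.exp(a))[0] < sse_and_bmax(math.exp(b))[0]:
            hi = b
        else:
            lo = a
    kd = math.exp((lo + hi) / 2)
    return kd, sse_and_bmax(kd)[1]


# Synthetic, noiseless example data from hypothetical parameters.
CONCS = [0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0]
SIGNALS = [binding(c, kd=0.5, bmax=2.0) for c in CONCS]

fitted_kd, fitted_bmax = fit_kd(CONCS, SIGNALS)
print(round(fitted_kd, 3), round(fitted_bmax, 3))
```

With real, noisy data a dedicated nonlinear least-squares routine that also reports parameter uncertainties would be preferable; the sketch is intended only to show what the fitted KD represents.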
Other analytical procedures
Sodium dodecylsulfate (SDS)-polyacrylamide gel electrophoresis was performed by the method of Laemmli (1970). Protein concentrations were determined by the method of Bradford (1976).
This work was supported by the Wellcome Trust (075565 to M.E.T. and K.D.), the National Institute of General Medical Sciences (GM62116 to the Consortium for Functional Glycomics) and the Center for Infection Biology (to I.S.).
Conflict of interest
CRD, carbohydrate-recognition domain; BSA, bovine serum albumin; EDTA, ethylenediaminetetraacetic acid; SDS, sodium dodecylsulfate.