Intramolecular quality control: HIV-1 Envelope gp160 signal-peptide cleavage as a functional folding checkpoint

The membrane-tethering signal peptides that target secretory proteins to the endoplasmic reticulum generally are assumed to be removed during translation, which is a prerequisite for proper folding. Cleavage of the HIV-1 gp120 signal peptide is late however and regulated by gp120 folding. While attached, a conserved cysteine in the signal peptide that is important for viral fitness, sustains disulfide isomerization during gp120 folding. Assembly of the N-terminal β-sandwich with a single C-terminal β-strand sets off signal-peptide cleavage, which releases gp120 from the membrane and from this disulfide-attacking cysteine and stabilizes the native gp120 fold. This is the first example of post-translational regulation of signal-peptide cleavage for intramolecular quality control of folding, with the signal peptide acting as a membrane-embedded propeptide with redox-active cysteine. Considering the ∼15% secretory proteins in our genome, and the frequency of N-C contacts in protein structures, this regulatory role of the signal peptide is bound to be a more common mechanism in secretory protein biosynthesis.


Introduction
The endoplasmic reticulum (ER) is home to a wealth of resident chaperones and folding enzymes that cater to approximately one third of all mammalian proteins during their biosynthesis (Ellgaard et al., 2016;Kanapin et al., 2003). It is the site of N-linked glycan addition and disulfide-bond formation, both of which contribute to protein folding, solubility, stability, and function. Targeting to the mammalian ER in general is mediated by N-terminal signal peptides, which direct the ribosomenascent chain complex to the membrane and initiate co-translational translocation (Blobel and Dobberstein, 1975;Gorlich et al., 1992;Görlich et al., 1992;Jackson and Blobel, 1977;Walter, 1981). For soluble and type-I transmembrane proteins, the Nterminal signal peptide is 15-50 amino acids long and contains a cleavage site recognized by the signal peptidase complex (von Heijne, 1985). While a great deal of sequence variation can occur between signal sequences, conserved features do exist. These include a positively charged, N-terminal n-region, a hydrophobic hregion and an ER-lumenal c-region (von Heijne, 1983(von Heijne, , 1984(von Heijne, , 1985. Classic paradigm-establishing studies showed that cleavable signal peptides are removed co-translationally, immediately upon exposure of the cleavage site in the ER lumen (Blobel and Dobberstein, 1975;Jackson and Blobel, 1977). This would imply that signal peptides function only as cellular postal codes and that signalpeptide cleavage and folding are independent events. For several proteins however, evidence has emerged that increased nascent-chain lengths are required for cleavage (Daniels et al., 2003;Rutkowski et al., 2003), indicating that the signal peptidase does not cleave each consensus site immediately upon its translocation into the ER lumen. For example, signal-peptide cleavage of the influenza-virus hemagglutinin occurs on the longer nascent chain, after its glycosylation (Daniels et al., 2003). Other proteins have been shown to undergo cleavage only after translation termination. Examples of this post-translational cleavage include, EDEM1 (Tamura et al., 2011), human cytomegalovirus (HCMV) protein US11 (Rehm et al., 2001) and HIV-1 envelope glycoprotein gp160 (Li et al., 1994), suggesting that signal peptides function as more than mere postal codes.
Gp160 is the sole antigenic protein on the surface of the HIV-1 virion and mediates HIV-1 entry into target cells . It folds and trimerizes in the ER, leaves upon release by chaperones and packaging into COPII-coated vesicles, and is cleaved by Golgi furin proteases into two non-covalently associated subunits: the soluble subunit gp120 ( Figure 1A, in colors), which binds host-cell receptors, and the transmembrane subunit gp41 ( Figure 1A, uncolored), which contains the fusion peptide (Decroly et al., 1994;Earl et al., 1990;Earl et al., 1991;Hallenberger et al., 1992;Wyatt and Sodroski, 1998). The so-called outer domain residues [according to (Pancera et al., 2010)] are colored in pink ( Figure 1A), the inner domain, which folds from more peripheral parts of the gp120 sequence, in grey, and the variable loops in green. Correct function of gp160 requires proper folding including oxidation of the correct cysteine pairs into disulfide bonds (Bontjer et al., 2009;Land and Braakman, 2001;Land et al., 2003;Sanders et al., 2008;Snapp et al., 2017). Disulfide-bond formation in gp160 begins co-translationally, on the ribosome-attached nascent chain, and continues long after translation, until the correct set of ten conserved disulfide bonds have been formed (Land and Braakman, 2001;Land et al., 2003).
The soluble subunit gp120 can be expressed independently of gp41 and folds with highly similar kinetics as gp160 (Land et al., 2003). Signal-peptide cleavage only occurs once gp120 attains a near-native conformation and requires both Nglycosylation and disulfide-bond formation, but is gp41 independent (Land et al., 2003;Li et al., 1996). Mutation-induced co-translational signal-peptide cleavage changes the folding pathway of gp120 and is disadvantageous for viral function (Pfeiffer et al., 2006;Snapp et al., 2017). Given the interplay between signal-peptide cleavage and gp120 folding, we set out to investigate the mechanism that drives post-translational cleavage and its relevance for gp120 folding and viral fitness.
We used C-terminally truncated and mutant gp120 constructs in various kinetic oxidative-folding assays combined with functional studies on recombinant HIV strains encoding gp160 mutants. Signal-peptide cleavage was regulated by folding of gp120, and until cleavage, the conserved cysteine in the signal peptide drove disulfide isomerization in the gp120 ectodomain. We uncovered this functional, mutual regulation as an internal quality control during folding of a complex glycoprotein.

Signal-peptide cleavage requires the gp120 C-terminus
Of the nine disulfide bonds in gp120, five are critical for proper folding and signalpeptide cleavage, three in the constant regions of gp120 in the (grey) inner domain, and two in the outer (pink) domain at the base of variable loops (green) V3 and V4 [ Figure 1A, ]. Gp120 undergoes extensive disulfide isomerization during its folding process as seen from the smear of gp120 folding intermediates (IT) in the non-reducing gel upon 35 S-radiolabeling, from reduced gp120 down to beyond native gp120 [(Land et al., 2003); Figure 1B, NR, 0' chase].
This smear gradually disappears into a native band with discrete mobility, at around the time the signal peptide is removed ( Figure 1B R, from 15' chase). Yet, it is far from obvious which aspect of gp120 folding triggers signal-peptide cleavage. We therefore embarked on the linear approach and prepared C-terminal truncations of gp120 from 110-aa length (111X, a gp120 molecule truncated after position 110) to full-length and analyzed in which the signal peptide was cleaved ( Figure S1).
Radioactive pulse-chase experiments showed that only full-length gp120 (511 residues long) and 494X lost their signal peptides, but that in all shorter forms, including 485X, the signal peptide remained uncleaved and hence attached to the protein ( Figure S1). We continued with a time course for 485X and 494X to examine and compare their folding pathways ( Figure 1B). Both truncations encompass the entire gp120 sequence except for the last 26 and 17 amino acids, respectively ( Figure 1A).
Like wild-type gp120, immediately after pulse labeling (synthesis) the 2 C-terminally truncated mutants ran close to the position of reduced gp120 in non-reducing SDS-PAGE. The 485X truncation formed disulfide bonds towards a compact structure, as the folding intermediates IT ran lower in the gel than reduced protein. It failed to form native gp120 however (NT, Figure 1B) or another stable intermediate. Instead it acquired compactness far beyond the mobility of NT and remained highly heterogeneous, suggesting the formation of long-distance disulfide bonds that increased compactness and hence electrophoretic mobility (Figure 1B,Cells NR,. Only a fraction, if any, of the 485X mutant lost its signal peptide or acquired competence to leave the ER and be secreted ( Figure 1B, Medium), even though all cysteines were present in the 485X gp120 protein. In contrast, addition of only 9 residues made 494X behave like wild-type gp120: cleavage rate and secretion were indistinguishable ( Figure 1B). Oxidative folding progressed similarly as well, except for a transient non-native disulfide-linked population that ran more compact than NT ( Figure 1B, Cells NR, 0-1 h) and disappeared over time (Cells NR,(2)(3)(4). We concluded that signal-peptide cleavage required synthesis and folding of more than 484 out of the 511 amino acids of gp120. The downstream amino acids in the C5 region in the inner (grey) domain ( Figure 1A, teal) triggered the switch from noncleavable to cleavable.
A pseudo salt bridge in the inner-domain b-sandwich controls signal-peptide cleavage and gp160 function Amino acids 485-494 form a b-strand (Figures 2A and 2B, teal, b31), which is part of the b-sandwich in the inner (grey) domain of gp120 ( Figures 1A, 2A, and 2B) [coding of strands and helices from (Garces et al., 2015)]. This b-sandwich is formed by interactions of seven b-strands in constant domains C1,C2,and C5 (Figures 1A,2A,and 2B). Six of the strands are close to the N-terminus and the 6 th strand, b31 (teal), is contributed by the C-terminal C5 region. As the addition of b31 triggered signalpeptide removal we hypothesized that the complete and properly folded b-sandwich was the minimal requirement for cleavage.
To address this, we designed charge mutants aimed to prevent assembly of b31 with the N-terminal part of the b-sandwich ( Figure 2B). In the high-resolution crystal structure (Garces et al., 2015) K487 (in b31) forms hydrogen bonds with E47 (in b2), E91 (in b5), and the main-chain oxygen of N92 (in b5) ( Figure 2B). We created charge-reversed mutants of the N-terminal glutamates (E47K and E91K), the Cterminal lysine (K487E) and combinations thereof ( Figures 2C and S2). We did not include N92 in our mutagenesis study since its interaction involves the main-chain oxygen, which cannot be removed; we considered this inappropriate for our question.
As gp120 is the dominant subunit in gp160 folding and signal-peptide cleavage, and allows more detailed analysis because it is smaller than gp160, we subjected wildtype gp120 and all mutants to pulse-chase analysis of their oxidative folding, signalpeptide cleavage, and secretion . Like the C-terminal truncations (485X, Figure 1B), all charge mutants in the b-sandwich formed gp120 molecules with higher electrophoretic mobility than native gp120 (NT), implying appreciable non-native long-range disulfide bonding, persisting at all time points or disappearing into aggregates ( Figure S2B). K487E showed the strongest phenotype: minimal formation of NT and a much-delayed signal-peptide cleavage (Figures 2C,E and S2B,Cells NR and R,band Rc). This folding step was crucial for function as K487E mutant virus was non-infectious ( Figure 2F). A striking rescue of the strong folding defect of K487E was effected by the charge reversal at the N-terminus: the double mutant E47K K487E displayed improved gp120 oxidation (Cells NR), signalpeptide cleavage (Cells R), and secretion (Medium) (Figures 2C-E and S2C), and rescued infectivity to ~30% of wild type ( Figure 2F). All N-terminal E-to-K mutants (E47K, E91K) oxidized and accumulated in a native-like position in gel but failed to form much of the sharp NT band seen in wild type ( Figure 2C and S2A). The start of signal-peptide cleavage of both single mutants was delayed to 30-60 min after synthesis and total cleavage was 25-50% lower than wild type by 4 h (Cells R and Figure 2E and S2C). As a result, the secretion of all N-terminal mutants was decreased by ≥60% in 4 h compared to wild type (Medium,Figures 2C,D and S2A,C). The redundancy of two negative charges likely contributes to the intermediate folding phenotype of the N-terminal mutants and the ~10% residual viral infectivity of the E47K mutant ( Figure 2F).
We concluded that the C-terminal b-strand was essential for proper folding of the bsandwich in the inner domain, which completes folding of gp120 and triggers signalpeptide cleavage. Timing of cleavage hence represents a checkpoint for proper folding of gp120.

Retention of the signal peptide causes hypercompacting of gp160
During folding, gp120 undergoes extensive disulfide formation and isomerization before reaching its native state. These intermediates appear as "waves" on SDS-PAGE representing varying degrees of compactness of folding intermediates (Land et al., 2003). Because mutants of gp120 that exhibited delayed or absent cleavage all formed hypercompact forms that ran below NT ( Figures 1B and 2C), we asked whether this heterogeneous electrophoretic mobility represented continued isomerization of gp120. We substituted the alanine in the -1-position relative to the cleavage site for a valine, to prevent cleavage by the signal-peptidase complex ( Figure 2G, Cells R). Initial oxidative folding of A30V was similar to that of wild type ( Figure 2G, NR). The A30V mutant, however, did not form a native band but populated all forms, from reduced to hypercompact oxidized, probably isomerizing continuously. The monomeric forms gradually disappeared into disulfide-linked, SDS-insoluble aggregates that increased in size and eventually became too large to enter the gel ( Figure 2G, Agg).
In both wild-type and A30V gp120, an endoglycosidase H-resistant band appeared over time ( Figure 2G, EHr). For wild-type gp120 this represents molecules that have transited through the Golgi complex and acquired an N-acetylglucosamine residue on their N-glycans but have yet to be secreted. For A30V gp120 this population may be due to the inaccessibility of some sugars for removal due to formation of SDSinsoluble aggregates.
We concluded that retention of the signal peptide either promotes formation of these hypercompact forms or prevents recovery from them. Because all signal-peptideretaining mutants showed a high propensity of aggregation, it is likely that these SDS-insoluble aggregates are comprised of hypercompact forms of gp120.

The cysteine in the signal peptide interacts with cysteines in gp160
As the signal peptide stays attached to gp160 for at least 15 min after chain termination it influences both co-and post-translational folding, through the tethering of the N-terminus to the membrane, as well as through interactions with the mature gp120 sequence (Snapp et al., 2017). The hypercompacting in non-cleaved mutants by continued disulfide isomerization ( Figure 2G) implies that an unpaired cysteine must be available to keep attacking formed disulfide bonds. Opening once-formed disulfides may improve folding yield as the folding protein regains conformational freedom, and at the same time has a chance to recover from non-native disulfide bonding. Existing disulfides may be attacked by a cysteine from an oxidoreductase in the ER, or by an intramolecular cysteine in gp120 [as shown for BPTI Kim, 1992, 1993)]. The unpaired cysteine in position 28 within the signal peptide is a likely interaction candidate, because it is part of the consensus sequence for the signal peptidase and as such (partially) exposed to the ER lumen. During translocation, C28 may interact with gp160 cysteines while they pass through the translocon. Folding analysis as in Figures 1 and 2 however showed that mutating C28 had no detectable effect on oxidative folding ( Figure 3A, C28A): folding intermediates disappeared, folded NT appeared, and the signal peptide was cleaved similarly and at similar times as wild-type gp120. Either C28A was identical to wild type or differences are missed due to asynchrony of the folding gp120 population. To amplify mobility differences, we alkylated with iodoacetic acid, which adds a charge to each free cysteine it binds to. To better synchronize the folding cohort, we modified the pulse-chase protocol with a preincubation with puromycin to release unlabeled nascent chains before labeling and added cycloheximide in the chase media to block elongation of radiolabeled nascent chains. At each chase time, gp120 C28A ran higher on a reducing gel ( Figure S3), indicating that it has more free cysteines as it bound more iodoacetic acid than wild-type gp120. This could be due to either slower disulfide-bond formation, faster disulfide-bond reduction, or a combination of both, which suggests a role for C28 in the net gp120 disulfide formation or isomerization during folding.
The importance of C28 became clear when we removed disulfide bond 54-74 in C1.
Deletion of disulfide 54-74 prevents signal-peptide cleavage, but allows gp160 to reach a compact position just above natively folded protein NT [ Figure 3B, ]. When C28A was introduced into the 54-74 deletion, folding intermediates were blocked at a much earlier phase and remained significantly less compact ( Figure 3B). The phenotype was the same when we combined C28A with the individual deletions of C54 or C74 ( Figure S4). C28A not only prevented formation of compact folding intermediates, it also increased their heterogeneity. As C28 deletion aggravated folding defects of 54-74 disulfide bond mutants, C28 must have partially compensated for the 54-74 folding defect by participating in oxidative folding. We concluded that C28A was important for oxidative folding of incompletely folded gp120, most likely for isomerization of non-native disulfide bonds and is redundant with the 54-74 cysteines for this process.
To analyze whether C28, in addition to redundancy, interacted directly with the 54-74 disulfide bond we used a 110 amino-acid truncation (111X, Figure 3C) for simplicity as it retained the signal peptide and contains a single native cysteine pair. Because formation of this disulfide bond was not detectable by comparing reduced and nonreduced samples, we made use of an alkylation-switch assay [ Figure 3D, (Appenzeller-Herzog and Ellgaard, 2008)]. In short, we radiolabeled cells expressing 111X and blocked free cysteines with NEM. Cells then were homogenized and denatured in 2% SDS and incubated again with NEM to block any free cysteines previously shielded by structure. After immunoprecipitation and reduction of disulfide bonds with TCEP, we alkylated resulting free cysteines with mPEG-malemide 5,000, which adds ~5 kDa of mass for each cysteine alkylated. Samples were immunoprecipitated again to remove mPEG and were analyzed by non-reducing 4-15% SDS-PAGE ( Figure 3E).
The 111X construct only showed weak disulfide-bond formation with only ~13% of molecules forming a disulfide bond ( Figure 3E, Wt). Upon removal of the signalpeptide cysteine C28 however, the population that contained a disulfide bond increased significantly to ~22% ( Figure 3F). The presence of C28 thus further destabilized the already unstable 54-74 disulfide bond. The non-native disulfide bond 28-74 in the C54A mutant barely formed, whereas the 28-54 disulfide bond in the C74A mutant was highly variable ( Figure 3F). This suggests that disulfide bonds involving the signal-peptide cysteine 28 are unstable and may only occur transiently.
The N-terminal cysteines form long-range disulfides during early gp160 folding Gp120 undergoes constant disulfide isomerization during folding (Land et al., 2003) and prolonged association of the signal peptide appears to intrinsically sustain isomerization and may further destabilize the already unstable 54-74 disulfide bond.
Moreover, we have shown redundancy of C28 with disulfide bond 54-74, which is why we asked whether the three N-terminal cysteines in gp120 were taking part in long-range disulfide bonds during folding. We removed the V1V2 variable loops, which are not essential for folding and function (Bontjer et al., 2009), and inserted a cleavage site for the protease thrombin through mutagenesis (L125R). This removed all 3 disulfide bonds in V1V2, 126-196 and 131-157 by the loop deletion and 119-205 by mutation (C119-205A). We named the resulting construct gp120Th. Reduction after cleavage produces an N-terminal fragment of ~15 kDa containing the signal peptide and the 3 N-terminal cysteines C28, C54, and C74, plus a ~75 kDa fragment containing the rest of gp120 ( Figure 4A). If long-range disulfides between the N and C-terminal fragments indeed exist, the cleaved, non-reduced molecule should run in the same position as uncleaved in non-reducing conditions, and should dissociate into the 2 fragments under reducing conditions.
Radioactive pulse-chase experiments as described above were modified: instead of deglycosylation with EndoH we denatured the protein with 0.2% SDS and cleaved gp120 with 0.75 U Thrombin. The 2 fragments were separated by 15-20% discontinuous SDS-PAGE. As expected, gp120 that lacked all N-terminal cysteines did not form any long-distance disulfide bonds ( Figure 4B and C, C28A C54-74A).
We confirmed that wild-type gp120 contained long-distance disulfide bonds between N-terminal cysteines and the rest of the molecule during early folding ( Figure 4B and C, Wt). Removal of C28 significantly reduced the number of molecules with a longrange disulfide bond ( Figure 4D), likely due to increased stability of the 54-74 disulfide bond in the absence of C28. Strikingly, all cysteine mutants that retained a single cysteine yielded some long-distance disulfides, suggesting that all three Nterminal cysteines could form a non-native pair with downstream cysteines in gp120 ( Figure 4B-D).

Removal of V1/V2 disulfides causes more rapid gp120 folding
Perhaps counterintuitively, the thrombin-cleavage construct (gp120Th) folded faster than full-length gp120 ( Figure 5A). Directly after the pulse, gp120Th ran as a more diffuse band whereas full-length gp120 (gp120 Wt) remained close to the reduced-gp120 mobility (Figure 5A NR). This increased compactness shows that gp120Th had already formed more or larger-loop-forming disulfide bonds (Snapp et al., 2017).
As a result, signal-peptide cleavage of gp120Th was faster: almost complete for gp120Th after a 1-hour chase, compared to ~50% cleaved of gp120 Wt ( Figure 5A, The 54-74 disulfide-bond mutants in gp120-Wt background fold to a stable intermediate just above the native position [Figures 3B and S4, (van Anken et al., 2008)] whereas the same mutants lacking V1V2 (in gp120Th) failed to accumulate in a single band ( Figure 5B), reminiscent of the folding of gp120 C28A C54-74A ( Figure   3B). This indicates that V1V2 deletion phenocopied C28 deletion in the 54-74 mutants. Redundancy of V1V2 with C28 was confirmed by the lack of additional effect of C28 removal in the gp120Th 54-74 mutants ( Figure 5B).

C28A results in decreased HIV-1 production and pseudovirus infectivity
As the biochemical data suggested a role for C28 in gp120 folding, we examined the effect of C28A mutation on viral production and infectivity. For this we tranfected cells with a molecular clone containing the full HIV genome (LAI strain), containing either wild-type gp160 or the C28A mutant. As the reading frames of Env and HIV-1 Vpu overlap and mutations in the signal peptide of gp120 can cause changes in the C-terminus of Vpu, we produced the viruses in HEK 293T cells, which are deficient in CD4 and tetherin and therefore do not require Vpu to enhance virus production (Van Damme et al., 2008). We consistently detected significantly less C28A HIV virus than wild-type HIV produced ( Figure 6A). Strikingly, the C28A virus was significantly more infectious than wild-type HIV but displayed strong heterogeneity in infectivities, indicative of heterogeneity in C28A gp160 incorporated into the virions ( Figure 6B).
Due to the severe deficit in virus production, despite increased infectivity, C28A-gp160-containing HIV is not likely to be competitive in nature. Indeed, alignments of >4,300 gp160 sequences from across all subtypes show that C28 is ~87% conserved (www.hiv.lanl.gov).
To uncouple differences in virus production from infectivity we moved to a pseudovirus system, which would allow analysis of the effect of C28A gp160 on infectivity alone ( Figure 6C and D). As expected, we found very little difference in virus production between wild-type and C28A gp160 ( Figure 6C). Infectivity of the C28A gp160 pseudovirus was roughly 60% less than the infectivity of wild type ( Figure 6D). We concluded therefore that gp160, and as a result HIV, suffered from the removal of the signal-peptide cysteine.

Discussion
During and early after synthesis, the N-terminus of HIV-1 gp160 remains tethered to the ER membrane by its transient signal anchor. We here show that the cysteine in the signal peptide drives disulfide isomerization, in part via the 54-74 disulfide bond, until the gp120 C-terminus has assembled with the N-terminus, completing gp120 folding. This triggers signal-peptide cleavage, removing C28 from the protein, halting further isomerization and stabilizing the native form. This intramolecular qualitycontrol process is essential for viral fitness of HIV and can be impaired and restored by single charge reversals in the gp120 inner domain.

Hierarchy of gp120 folding
The inner domain, which contains a β-sandwich, and the outer domain, which contains a stacked double β-barrel, constitute the minimal folding-capable "core" of gp120 [ (Figure 2A), (Garces et al., 2015;Kwong et al., 1998)]. Gp120 is completed with the surrounding variable loops V1V2, V3, and V4 (green in Figures 1A, 2A and B). The core contains six of the nine disulfide bonds in gp120, including the five that are essential for correct folding and signal-peptide cleavage .
The inner-domain β-sandwich consists of seven strands, six of which are N-terminal.
Proper folding of the sandwich requires assembly with the C-terminal β31 strand and formation of the five essential disulfide bonds, which then leads to signal-peptide cleavage [ Figure 2A, ]. Some folding of the gp120 'hairpin' begins during translation and translocation into the ER (Land et al., 2003); however, the bulk occurs post-translationally ( Figure 7).
The (pink) outer domain is the first complete domain to emerge from the translocon and has low contact order, meaning that folding does not require integration of distal residues (Figure 7). It is the first domain to fold, which requires formation of the two native disulfide bonds (296-331 and 385-418) in the b-barrel: gp120 lacking either disulfide bond barely folds past the reduced position in SDS-PAGE van Anken et al., 2008). Folding of the inner domain leads to cleavage of the signal peptide. Until that time, the signal peptide acts as signal anchor because it adopts an α-helical conformation that extends past the cleavage site and prevents proteolytic cleavage (Snapp et al., 2017). Folding and integration of the inner domain must break the helix and allow cleavage to occur, as this early helical region is a β-strand in crystal structures of gp120 (Garces et al., 2015). As the disulfide bonds of V1/V2 and gp41 are dispensable for signal-peptide cleavage and their deletion shows no aberration in the folding pathway , those domains likely fold after or independent of the outer and inner domains (Figure 7). This is underscored by gp120 folding and signal-peptide cleavage being largely independent of gp41, and by N-and C-terminal sequences in the gp120 inner domain forming the binding site of gp41 (Garces et al., 2015;Julien et al., 2013;Lyumkis et al., 2013), which probably requires the properly folded, native gp120 conformation.Gp41 binding may explain apparent inconsistencies between folding and function of some inner-domain mutants (Garces et al., 2015;Yang et al., 2003).
The regulation of signal-peptide cleavage by folding of gp120 implies that the formation of the b-sandwich generates sufficient force to break the attached α-helix.
Alpha-helical proteins have lower mechanical strength than β-sheet proteins, which often need to resist dissociation and unfolding; lower mechanical strength facilitates conformational changes to expose transient binding sites or allow signaling (Chen et al., 2015). Only ~5 pN indeed suffices for exposure of a protease-cleavage site in an α-helix: for proteolytic activation of Notch, cleavage of the NRR domain by ADAM17 (Gordon et al., 2015), and of the talin R3 domain, a 4-helix bundle (del Rio et al., 2009;Yao et al., 2014); the von Willebrand factor A2 domain requires ~8 pN (Zhang et al., 2009). Pulling apart a b-sheet protein such as Ig domains, OspA, or ubiquitin by shearing needs >160 pN (Brockwell et al., 2003;Carrion-Vazquez et al., 2003;Hertadi et al., 2003), with less force required for peeling (Brockwell et al., 2003).
Physiological forces measured so far do not exceed ~40 pN, a force level at which proteins in general may be destabilized already (Chen et al., 2015). The α-helical region around the cleavage site in gp120 thus would lose the stability competition from the b-sheet in the inner domain, if their structures are incompatible; indeed, in the gp160 structure this α-helical region is a b-strand (Figures 2A and 7). First-time folding, i.e. the completion of the inner-domain b-sandwich by assembly of β31, is likely to generate sufficient force as well, as 7-12 pN allows constant binding of a filamin b-strand to a b-sheet (Rognoni et al., 2012). Formation of the inner-domain disulfide bonds may further raise the stabilizing force (Eckels et al., 2019). The completion of gp120 therefore likely generates the ~5-pN force needed to break the α-helix and allow the signal peptidase to cleave off the gp120 signal peptide.

Effects of the attached signal peptide
The postponed cleavage of the signal peptide makes it a transient signal anchor, which acts as membrane tether. This limits conformational freedom and may benefit integration of the C-terminal b31-strand into the folded inner-domain b-sandwich, as in knotted proteins (Soler and Faisca, 2012).
The prolonged proximity of the free signal-peptide cysteine (28) supports disulfide isomerization during gp120 folding. Gp160 requires a native set of disulfide bonds to attain its functional 3D structure, but already during synthesis non-native disulfides are formed, which reshuffle over time (Land et al., 2003). The isomerization is detected as "waves" in SDS-PAGE, where the heterogeneous population of folding intermediates oscillates between more and less compactness over time [( Figure   3A), (Land et al., 2003)]. Despite extensive isomerization during folding, wild-type gp120 only transiently occupied forms more compact than native. In contrast, the various β-sandwich and uncleavable mutants extensively populated hypercompact states with non-native long-range disulfide bonds, indicating that in the absence of stable integration of the N-and C-termini and removal of the signal peptide, isomerization continues unabated and drives the formation of these hypercompact structures.
The constant disulfide isomerization is sustained by cysteine 28 in the signal peptide, as its sulfhydryl group is free to attack existing disulfide bonds. Once gp120 folding has reached a state where isomerization is no longer preferred, the signal peptide is cleaved, removing an important, conserved driving force behind isomerization.
Signal-peptide cleavage then acts as a sink because it removes the disulfideattacking cysteine and pulls the folding equilibrium to the native structure.

Mode of action of C28
In a short construct, C28 seemed to favor a disulfide bond with C54 ( Figure 3F).
Despite a limited ability to form disulfide bonds with cysteines downstream of the 54-74 bond ( Figure 4D), deletion of C28 did not aggravate the C54-74A defective phenotype in the absence of V1V2. C28 hence likely sustains isomerization primarily by constant attack and destabilization of the 54-74 disulfide bond. This propagates a free sulfhydryl group downstream through the folding protein, as the result of an intramolecular electron-transport chain from the more C-terminal cysteines via C54 and C74 to C28 (Figure 7). In the 110-residue gp120 chain (111X), essentially a mimic of a released gp120 nascent chain, in maximally 20% of molecules the 54-74 bond had formed, demonstrating its inherent instability as well as the likelihood that C28 already acts on 54 and 74 during translation.
Only in the presence of V1V2, C28 showed redundancy with the 54-74 disulfide, implying that C28 can fulfill roles otherwise played by C54 and C74 (and vice versa).
This suggests that C28 is involved in folding (and isomerization of V1V2) and may play this role by direct interaction with V1V2 cysteines, in absence of C54 and C74, distinct from its 54-74-mediated role in downstream disulfide bonds formation. We cannot exclude that the attack of C28 on the V1V2 disulfides leads to an alternative electron transport chain from C-terminal cysteines via V1V2 to C28. Either way, C28 needs to be removed at the end of gp120 folding to ensure stability of the gp120 conformation.

Intramolecular oxidoreductase and quality control for proper folding
Built-in isomerase activity may seem redundant considering that gp120 folds in the ER, a compartment that contains >20 protein-disulfide-isomerase family members (Jansen et al., 2012). These oxidoreductases are large, bulky proteins however, which cannot catalyze disulfide-bond formation once areas have attained significant tertiary structure. Isomerase activity built into the folding protein allows free-thiol propagation (in essence electron transport) in areas otherwise unreachable by folding enzymes. This perhaps should not be too surprising given that in folded proteins the majority of cysteines are solvent inaccessible (Srinivasan et al., 1990) and during folding disulfide bonds become resistant to reduction with DTT (Tatu et al., 1993;Tatu et al., 1995).
An example of intramolecular disulfide isomerization is the cysteine in the propeptide of bovine pancreatic trypsin inhibitor (BPTI), which increased both the rate and yield of BPTI folding (Weissman and Kim, 1992). The majority of disulfide formation during in-vitro folding of BPTI results from intramolecular disulfide rearrangements (Creighton et al., 1993;Darby et al., 1995;Weissman and Kim, 1995). Transfer of free thiols between lumenal and transmembrane domains in the ER has been demonstrated for vitamin-K-epoxide reductase (Liu et al., 2014;Schulman et al., 2010), indicating that such exchanges are possible. While C28 is located in the transmembrane α-helix (Snapp et al., 2017), suggesting immersion in the membrane, sliding of transmembrane domains up and down in the membrane is possible (Borochov and Shinitzky, 1976;Danielson et al., 1994;Mowbray and Koshland, 1987). As, C28 is part of the consensus sequence for signal-peptide cleavage, it likely is exposed to the ER lumen.

Conserved and multiple roles for signal peptides
While pro-peptides such as in BPTI may display similar functionality to the gp120 signal-peptide, the key difference is in their topologies. Still, the membraneembedded signal peptide of gp120 may well share functions with its soluble counterparts, acting as oxidoreducase and/or chaperone during biosynthetic folding (Iacovache et al., 2011;Ohta et al., 1991;Silen and Agard, 1989). Post-translational signal-peptide cleavage of gp120 is conserved across different subtypes of HIV-1 as biochemical properties, even if sequences are not strictly conserved (Snapp et al., 2017). This appears to be more general, as in other organisms signal-peptides mutate at a lower rate than the surrounding mature peptide (Morrison et al., 2003;Williams et al., 2000), or they mutate at the same rate, but with an increased proportion of null (Veitia and Caburet, 2009) or function-preserving mutants (Garcia-Maroto et al., 1991).
Detailed kinetic analysis of signal-peptide cleavage has not been reported for a great number of proteins, but gp160 is not alone in its biosynthesis-dependent and biosynthesis-regulated signal-peptide cleavage (Anjos et al., 2002;Daniels et al., 2003;Matczuk et al., 2013;Rutkowski et al., 2003;Zschenker et al., 2001). Whereas structure regulates cleavage in HCMV US11 (Rehm et al., 2001;Tamura et al., 2011), function regulates cleavage in ERAD-associated protein EDEM1 (Rehm et al., 2001;Tamura et al., 2011). These studies demonstrate that a variety of conditions including nascent-chain length and N-glycan addition can play a role in signal peptide cleavage. Signal peptides are more than address labels and folding and signal-peptide cleavage are more interdependent than originally thought (Li et al., 1994;Rehm et al., 2001;Tamura et al., 2011).
We argue that late signal-peptide cleavage may be much more common than biochemical experiments have uncovered. Cleavage may occur at any time from cotranslationally until late post-translationally. Considering the low rate of protein synthesis, ~3 to 6 amino acids per second (Braakman et al., 1991;Horwitz et al., 1969;Ingolia et al., 2011;Knopf and Lamfrom, 1965), it can lead to long average synthesis times (~1.5-3 min for gp120 and ~2-5 min for gp160). In fact, translation rates are much more heterogeneous (Ingolia et al., 2019): nascent chains of influenza virus HA may take >15 min to complete, corresponding to a rate of less than a residue per second [ (Braakman et al., 1992); unpublished observations]. For large proteins, the difference between early and late co-translational cleavage leaves a window of several minutes, during which the signal peptide functions as an anchor tethering the protein to the ER membrane. Sequence features in the signal peptide, such as an exposed cysteine, may well play roles similar to soluble propeptides. The membrane tether as a constraint may also be beneficial for the formation of N-and C-terminal contacts in proteins, which are present in ~50% of soluble PDB structures (Krishna and Englander, 2005) and can be found in multiple multimeric viral glycoproteins (Chen et al., 1998;Garces et al., 2015;Gogala et al., 2014;Sauter et al., 1992;Sun et al., 2014).
Here we have presented evidence for the direct functional contribution of the signal peptide of HIV-1 gp160 to its folding. The signal peptide assists directly in the disulfide isomerization of gp120 during early folding and functions as a quality control mechanism, only being removed when a critical late folding step has been reached.
As evidence grows, it becomes clear that signal peptides demonstrate functions far beyond their originally assigned roles as cellular postal codes.

Acknowledgements:
We would like to thank members of the Braakman-Van der Sluijs and Sanders labs for their fruitful discussions and insights. In particular Peter van der Sluijs and Joseline Houwman for critical reading of the manuscript and assistance with preparing figures. This work was supported by grants from the Netherlands

Declaration of Interests:
The authors declare no competing interests.  Oligomannose and complex glycans are represented as three-or two-pronged forked symbols respectively [adapted from (Leonard et al., 1990)]. Amino-acid stretch 485-494 is marked in teal. B) HeLa cells transiently expressing gp120 Wt and C-terminal truncations were radiolabeled for 5 minutes and chased for the indicated times. After detergent lysis, samples were immunoprecipitated with polyclonal antibody 40336. After immunoprecipitation, samples were analyzed by non-reducing (Cells NR) and reducing (Cells R) 7.5% SDS-PAGE. Gp120 was immunoprecipitated from medium samples with antibody 40336 and directly analyzed by reducing 7.5% SDS-PAGE (Medium). Gels were dried and exposed to Kodak-MR films or Fujifilm phosphor screens for quantification. Ru: reduced, signal peptide cleaved gp120, Rc: reduced, signal-peptide-uncleaved gp120, IT: intermediates, NT: native.     Figure 1B except that HeLa cells were transfected with wild-type full-length gp120 (gp120 Wt) or thrombin-cleavable gp120 (gp120Th) and pulsed for 10 minutes. B) Pulse-chase experiments conducted as in Figure 1B except that HeLa cells were transfected with cysteine mutants of gp120Th.
Complete statistical values are listed in Table 1.

Plasmids, antibodies, reagents and viruses
The full-length molecular clone of HIV-1LAI (pLAI) was the source of wild-type and mutant viruses (Peden et al., 1991). The QuikChange Site-Directed Mutagenesis kit (Stratagene) was used to introduce mutations into Env in plasmid pRS1 as described before; the entire Env gene was verified by DNA sequencing (Sanders et al., 2004).
Mutant Env genes from pRS1 were cloned back into pLAI as SalI-BamHI fragments.
For transient transfection of gp120/160 we used the previously described pMQ plasmid (Snapp et al., 2017). C-terminal truncations were generated by PCR of wt gp120 and Gibson assembled back into XbaI/XhoI digested pMQ. The thrombincleavable construct was designed based on stable V1V2 loop deletion number 2 (Bontjer et al., 2009) and generated from gp120 C119-205A using Gibson assembly (Gibson et al., 2009). All point mutations were introduced using QuikChange Site-Directed mutagenesis as above.
For immunoprecipitation: we used the previously described polyclonal rabbit anti- Although we studied gp160 of the LAI isolate, we followed the canonical HXB2 residue numbering (GenBank: K03455.1), which relates to the LAI numbering as follows: because of an insertion of five residues in the V1 loop of LAI gp160, all cysteine residues beyond this loop have a number that is 5 residues lower in HXB2 than in LAI: until Cys131, numbering is identical, but Cys162 in LAI becomes 157 in HXB2, etc.

Virus production
Virus stocks were produced by transfecting HEK293T cells with wild-type or mutant pLAI constructs using the Lipofectamine 2000 Transfection Reagent (Invitrogen) per manufacturer's protocol. Production of virus stocks on C33A cells were done by calcium-phosphate precipitation. The virus-containing culture supernatants were harvested 2 days post-transfection, stored at -80°C, and the virus concentrations were quantitated by CA-p24 ELISA as described before (Moore and Jarrett, 1988).
These values were used to normalize the amount of virus used in subsequent infection experiments.

Single Cycle Infection
The TZM-bl reporter cell line stably expresses high levels of CD4 and HIV-1 coreceptors CCR5 and CXCR4 and contains the luciferase and β-galactosidase genes under the control of the HIV-1 long-terminal-repeat (LTR) promoter (Wei et al., 2002). Single-cycle infectivity assays were performed as described before (Bontjer et al., 2009;Bontjer et al., 2010). In brief, one day prior to infection, 17 All infections were performed in quadruplicate.

Folding assay
HeLa cells transfected with wild-type or mutant gp160/gp120 constructs were subjected to pulse-chase analysis as described before (McCaul et al., 2019;Snapp et al., 2017). In short, cells were starved for cysteine and methionine for 15-30 min and pulse labeled for 5 min with 55 µCi/ 35-mm dish of Easytag express 35 S protein labeling mix (Perkin Elmer). Where indicated (+DTT), cells were incubated with 5 mM DTT for 5 min before and during the pulse. The pulse was stopped, and chase started by the first of 2 washes with chase medium containing an excess of unlabeled cysteine and methionine. At the end of each chase, medium was collected, and cells were cooled on ice and further disulfide bond formation and isomerization was blocked with 20 mM iodoacetamide. Cells were lysed and detergent lysates and medium samples were subjected to overnight immunoprecipitation at 4°C with polyclonal antibody 40336 against gp160.

Deglycosylation, SDS-PAGE, and autoradiography
Where appropriate, to identify gp160 folding intermediates, glycans were removed from lysate-derived gp120 or gp160 with Endoglycosidase H (Roche) treatment of the immunoprecipitates as described before (Land et al., 2003).
Samples were subjected to non-reducing and reducing (25 mM DTT) SDS-PAGE.
Gels were dried and exposed to super-resolution phosphor screens (FujiFilm) or Samples were immunoprecipitated again via the HA-tag and analyzed by 4-15% non-reducing gradient SDS-PAGE (BioRad) and processed as before.

Thrombin cleavage
After HeLa cells transfected with various thrombin-cleavable constructs were pulselabeled as described above, detergent lysates were immunoprecipitated with  Figure 3F and 4D differences were assessed using paired t-tests between wild-type and mutants. For experiments in Figure 6A-D differences were assessed using unpaired t-tests. A complete list of all pairs examined, statistical methods and resulting p values can be found in Table   1.  A) Pulse-chase experiments were performed as in Figure 2C except that HeLa cells were transfected with the indicated mutants. B) Uncropped gels from Figure 2C. IT: folding intermediates, NT: native gp120, Ru: reduced signal-peptide-uncleaved gp120, Rc: reduced signal-peptide-cleaved gp120. C) Quantifications performed as in Figure 2D. D) Quantifications performed as in Figure 2E. E) Infection assays were performed as in Figure 2F. Error bars: SD. Figure S3. Synchronized folding of gp120 Wt and C28A.