Some Example Applications for Protein Characterization Studies:
1. Sedimentation velocity, light scattering, and/or native gels can be used to cross-validate SEC protocols for detection of protein aggregates. Some form of confirmation that the SEC protocol detects all relevant species is now nearly always required by the regulatory agencies. APL has now done this for dozens of products.
2. Comparability protocols are commonly required after a change in manufacturing process, cell line, formulation, or manufacturing site once a product has entered clinical trials or reached the market. They are also needed to show comparability of biosimilars to the Reference Listed Drug. It is important to demonstrate comparability of higher order structure (secondary, tertiary, and quaternary structure) as well as comparability of primary structure. (See our presentation at the FDA's State of the Art Analytical Methods meeting, June 2003, "Measuring Comparability of Conformation, Heterogeneity, and Aggregation with Circular Dichroism and Analytical Ultracentrifugation".)
APL offers a number of tools useful for comparability protocols, and has completed literally hundreds of such protocols:
CD (equivalent secondary and tertiary structure; equivalent conformational stability)
DSC (equivalent tertiary structure and thermal stability)
sedimentation equilibrium (equivalent quaternary structure and/or binding of ligands)
3. Pre-formulation studies of protein conformation and stability can reduce the range of conditions that need to be examined during formulation studies, saving considerable time and expense.
4. Recombinant vaccines are often quite heterogeneous in size/MW distribution, and may use virus-like particles as carriers and adjuvants. Sedimentation velocity can be very useful for characterizing vaccines and monitoring lot-to-lot uniformity of size distributions, and circular dichroism can determine whether the recombinant antigen has a regular ordered structure.
5. For proteins that form visible or sub-visible particulates (often leading to eventual precipitation), dynamic light scattering (DLS) is a valuable tool for detecting the early precursors (nuclei) that will eventually trigger formation of particulates. A DLS assay can often trace the damage that leads to particulate formation to a specific step in manufacturing, or give confidence that a change in formulation will truly solve the particulates issue.
6. For characterizing PEGylated proteins and nucleic acids (and the quality of the polymers used for conjugation) we recommend:
dynamic light scattering directly measures the increase in hydrodynamic size caused by PEGylation, a property which often correlates strongly with serum lifetime
on-line classical light scattering is very useful for measuring the extent of conjugation, and whether PEGylation has altered the state of association/aggregation
on-line classical light scattering is also very useful for verifying the true molecular weight and homogeneity of the PEG polymer
7. Assessing proper folding of novel proteins discovered through genomics or proteomics, for which no native protein from natural sources is available as a control, can avoid wasting time and money trying to assay biological activity or running screens using proteins that aren't properly folded. Ways to assess proper folding include:
near-UV CD (is there any regular tertiary structure?)
thermal unfolding by far-UV CD (is there a cooperative unfolding transition?)
8. Characterizing protein binding events can provide functional characterization of hormones, monoclonal antibodies, and small molecule drugs. Such data can be useful for:
selection among drug candidates
demonstration of comparability (e.g. after a process change)
quality control and process monitoring
development of protein mimetics
The graph shows, for a moderately-sized (20-70 kDa) therapeutic protein, the ratio of the apparent weight-average molar mass (Mapp) at each concentration to the value observed at infinite dilution. The apparent molar mass decreases strongly with concentration, rather than increasing as it would if reversible self-association dominated. Indeed, near 120 mg/mL Mapp is nearly 4-fold lower than at low concentration.
Qualitatively the graph above tells us these protein molecules exhibit strong repulsive interactions, but how do we quantitate that? The simplest model is the first-order virial expansion:
$$\frac{1}{M_{\mathrm{app}}} = \frac{1}{M_0} + 2B_{22}c \tag{1}$$
where the second virial coefficient B22 measures the strength of the net repulsion (positive values) or attraction (negative values). The graph below re-plots the data in a form where equation 1 predicts a straight line with a slope proportional to B22. While the data below ~20 mg/mL can be reasonably approximated by a straight line (the dashed green line), clearly at higher concentrations the data curve upward strongly (the repulsion between molecules grows stronger). This means that extrapolating the low-concentration data to high concentrations would grossly underestimate the strength of the repulsive interactions at higher concentrations.
The full data set was fitted to a more complex, third-order virial expansion:
$$\frac{1}{M_{\mathrm{app}}} = \frac{1}{M_0} + 2B_{22}c + 3B_{33}c^2 + 4B_{44}c^3$$
That fit is shown as the solid black curve in the graph above. For this sample, the third virial coefficient B33 turns out to be zero (within measurement error), but a large positive fourth virial coefficient B44 is necessary to fit the full range of data.
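The comparison between the low-concentration straight-line fit and the full third-order fit can be sketched in a few lines of Python. This is only an illustration: the data are synthetic and noise-free, generated from assumed coefficients (`M0_TRUE`, `B22_TRUE`, `B33_TRUE`, `B44_TRUE` are hypothetical values chosen so that Mapp drops roughly 4-fold at 0.12 g/mL), and ordinary polynomial least squares stands in for whatever fitting procedure was actually used.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical "true" parameters, NOT values from the study described above.
# Concentration c is in g/mL, so 1/M is in mol/g and B22 in mol*mL/g^2.
M0_TRUE = 50_000.0
B22_TRUE, B33_TRUE, B44_TRUE = 1.0e-4, 0.0, 5.0e-3


def inverse_mapp(c, m0, b22, b33, b44):
    """Third-order virial expansion for 1/Mapp as a function of concentration."""
    return 1.0 / m0 + 2.0 * b22 * c + 3.0 * b33 * c**2 + 4.0 * b44 * c**3


# Synthetic, noise-free data from 2 to 120 mg/mL.
c = np.linspace(0.002, 0.120, 30)
y = inverse_mapp(c, M0_TRUE, B22_TRUE, B33_TRUE, B44_TRUE)

# First-order fit using only the points at or below 20 mg/mL.
low = c <= 0.020
a0, a1 = P.polyfit(c[low], y[low], 1)      # coefficients, lowest order first
b22_linear = a1 / 2.0

# Third-order fit over the full concentration range.
k0, k1, k2, k3 = P.polyfit(c, y, 3)
m0_fit, b22_fit, b33_fit, b44_fit = 1.0 / k0, k1 / 2.0, k2 / 3.0, k3 / 4.0

# Compare what each model predicts for 1/Mapp at 120 mg/mL.
c_high = 0.120
linear_prediction = a0 + a1 * c_high
cubic_prediction = k0 + k1 * c_high + k2 * c_high**2 + k3 * c_high**3

print(f"B22 (low-c linear fit): {b22_linear:.3e}")
print(f"B22, B33, B44 (full fit): {b22_fit:.3e}, {b33_fit:.3e}, {b44_fit:.3e}")
print(f"1/Mapp at 120 mg/mL: linear {linear_prediction:.3e}, cubic {cubic_prediction:.3e}")
```

With these assumed numbers the straight line extrapolated from low concentration predicts a much smaller 1/Mapp at 120 mg/mL than the full fit, which is the underestimate of repulsion described in the text.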
This approach and these data are discussed in more detail in a poster presented at WCBP 2015.
Example 2: Is a Sequence Homolog a True Structure Homolog?
Tumor necrosis factor alpha (TNF) was the first known member of a family of signaling molecules involved in inflammation, apoptosis, and many other important functions. A hallmark of this family is that these proteins normally occur as trimers in solution.
A potential new member of this family was identified on the basis of sequence homology. However, when it was expressed in E. coli and refolded from inclusion bodies, it appeared to be a monomer based on its elution relative to standards on size-exclusion chromatography (SEC). Did this mean it was not truly a member of this family, or simply that it was not correctly refolded, or was the mass estimate from SEC wrong?
The graph below shows some sedimentation equilibrium data for this molecule, showing the concentration as a function of position within the cell as monitored by absorbance at 230 nm. Note that the total amount of protein for this experiment was <10 micrograms.
The next graph (below) shows those data re-plotted as the natural log of absorbance vs. radius²/2. In this type of plot a single species gives a straight line whose slope is proportional to mass. The light blue line indicates the theoretical slope calculated for the monomer mass (~17 kDa). The dark blue line (mostly hidden behind the data points) has the theoretical slope for the trimer mass. This plot, therefore, makes it obvious that this protein is indeed a trimer, and therefore it is indeed a homolog of TNF (and presumably is correctly folded).
Although results for only a single sample and rotor speed are shown here, to characterize a protein quantitatively and determine whether it self-associates we generally run 3-9 samples over a broad range of loading concentrations and at two or more rotor speeds, and then analyze all of these data simultaneously ("globally").
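The log plot described above relies on the standard result for a single ideal species at sedimentation equilibrium: the slope of ln(concentration) vs. radius²/2 equals M(1 − v̄ρ)ω²/RT. The sketch below turns a fitted slope back into a mass. Everything numerical in it is an assumption made for illustration (partial specific volume 0.73 mL/g, solvent density 1.0 g/mL, 20 °C, 15,000 rpm, and the synthetic absorbance profile); none of these values comes from the experiment shown.

```python
import math
import numpy as np

R_GAS = 8.314462e7  # gas constant in erg/(mol*K), for CGS units (cm, g, s)


def slope_per_unit_mass(rpm, vbar=0.73, rho=1.0, temp_k=293.15):
    """Slope of ln(c) vs r^2/2 per g/mol of molar mass, in cm^-2."""
    omega = rpm * 2.0 * math.pi / 60.0          # angular velocity, rad/s
    return (1.0 - vbar * rho) * omega**2 / (R_GAS * temp_k)


def mass_from_log_plot(radius_cm, absorbance, rpm, **solution):
    """Estimate molar mass from a straight-line fit of ln(A) vs r^2/2."""
    x = radius_cm**2 / 2.0
    slope, _intercept = np.polyfit(x, np.log(absorbance), 1)
    return slope / slope_per_unit_mass(rpm, **solution)


# Synthetic, noise-free profile for a hypothetical 51 kDa trimer (3 x 17 kDa).
RPM = 15_000                                    # assumed rotor speed
MONOMER, TRIMER = 17_000.0, 51_000.0            # g/mol
r = np.linspace(6.9, 7.1, 50)                   # radial positions, cm
A = 0.1 * np.exp(TRIMER * slope_per_unit_mass(RPM) * (r**2 - r[0]**2) / 2.0)

mass_est = mass_from_log_plot(r, A, RPM)
print(f"estimated mass: {mass_est / 1000:.1f} kDa")
print("closer to trimer" if abs(mass_est - TRIMER) < abs(mass_est - MONOMER)
      else "closer to monomer")
```

A single-line fit like this is only meaningful when the plot is actually straight; curvature indicates heterogeneity or self-association, which is why the global analysis of multiple samples and speeds is used in practice.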
Example 3: Functional Characterization of a Monoclonal Antibody
The function of many proteins is to bind to other proteins, and sedimentation equilibrium is a very powerful tool for studying such binding interactions.
The graph below summarizes the data (points) and fitted curves for 8 experiments on mixtures of a monoclonal antibody and its ~25 kDa protein antigen. The data sets cover experiments at different mixing ratios of antibody to antigen, and by using scans at either 280 or 230 nm they also cover a wide range of concentrations. (Note that this entire set of experiments used only ~80 micrograms of antibody.)
To analyze these data an appropriate binding model is needed. The model shown below is the simplest one possible for an antibody with two binding sites, and simply assumes that both sites have the same binding affinity and bind independently of one another (no cooperativity and no steric blocking of one site by antigen bound to the other).
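For readers who want to see what such a model implies numerically, here is a minimal sketch of the species distribution for two equivalent, independent sites. It assumes K1 is the intrinsic (per-site) dissociation constant, so the statistical factors give [AbAg] = 2[Ab][Ag]/K1 and [AbAg2] = [Ab][Ag]²/K1². The function name and the example concentrations are hypothetical, and this is not the custom fitting software mentioned elsewhere on this page.

```python
import math


def species_concentrations(ab_total, ag_total, k1):
    """Equilibrium concentrations for an antibody with two equivalent,
    independent sites. All arguments share one concentration unit (e.g. nM).

    Because the sites are independent, free antigen satisfies
        ag_total = Ag + 2 * ab_total * Ag / (k1 + Ag),
    which is a quadratic in Ag with one positive root.
    """
    b = k1 + 2.0 * ab_total - ag_total
    ag_free = (-b + math.sqrt(b * b + 4.0 * k1 * ag_total)) / 2.0
    x = ag_free / k1                      # free antigen in units of k1
    ab_free = ab_total / (1.0 + x) ** 2   # both sites empty
    return {
        "Ab": ab_free,
        "Ag": ag_free,
        "AbAg": 2.0 * ab_free * x,        # one site filled (two ways)
        "AbAg2": ab_free * x * x,         # both sites filled
    }


# Example: 100 nM antibody mixed with 200 nM antigen, K1 = 48 nM.
result = species_concentrations(ab_total=100.0, ag_total=200.0, k1=48.0)
for name, conc in result.items():
    print(f"{name:6s} {conc:8.2f} nM")
```

In a fit, a function like this would be evaluated for each mixing ratio and the resulting species masses used to predict the equilibrium concentration gradients, with K1 as the single shared parameter.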
In fitting these data one is essentially asking: Is there a single value of the dissociation constant, K1, that can explain all 8 experiments? The solid lines in the graph above represent the best fit of this model, with K1 = 48 nanomolar, and the fact that the lines follow the data points quite well shows that this is a good fit. Importantly, this good fit also implies that both binding sites on the antibody are active, and active simultaneously. This data analysis was done using custom software available only at Alliance Protein Laboratories.
The value of K1 is actually quite well determined, with statistical analysis indicating we can be 95% confident the true value is between 43 and 52 nM (a 5% standard error, or only 60 cal/mol in terms of binding energy!). While this statistical analysis probably overestimates the true precision at least several-fold, it is nonetheless clear that this approach can give very precise binding affinities.
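The link between an uncertainty in K1 and an uncertainty in binding energy follows from ΔG = RT ln(K1), so a change from K1 to K1' shifts the binding free energy by RT ln(K1'/K1). The short sketch below applies this to the limits of the confidence interval. The temperature (20 °C) is an assumption, since the experimental temperature is not stated here, and the function name is hypothetical.

```python
import math

R_CAL = 1.987204  # gas constant in cal/(mol*K)


def binding_energy_shift(k_ref, k_other, temp_k=293.15):
    """Free-energy difference (cal/mol) between two dissociation constants.

    k_ref and k_other must be in the same units; temp_k is an assumed 20 C.
    """
    return R_CAL * temp_k * math.log(k_other / k_ref)


best, lower_limit, upper_limit = 48.0, 43.0, 52.0   # nM, from the text above

shift_up = binding_energy_shift(best, upper_limit)
shift_down = binding_energy_shift(best, lower_limit)
print(f"upper limit: {shift_up:+.0f} cal/mol")
print(f"lower limit: {shift_down:+.0f} cal/mol")
```

Under this temperature assumption, the interval limits correspond to shifts of roughly 45 to 65 cal/mol in either direction, the same order as the ~60 cal/mol quoted above and tiny compared with the several kcal/mol of a typical binding energy.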
Importantly, this approach could be used to quantitatively compare different antibodies, different lots of the same antibody, loss of activity of aged samples, etc.