Success Rates in Protein Crystallography
  1. Obtain milligrams of pure, soluble protein.
    1. About 60% of sequences express well.
    2. Only one half (prokaryotic) to one quarter (eukaryotic) of these are soluble.

  2. Obtain high quality crystals.
    1. Typically hundreds of crystallization conditions are tested.
    2. Crystals must be single, sufficiently large, and preferably neither needles nor plates.
    3. Crystals with large numbers of copies in the asymmetric unit are problematic.
    4. About half of highly expressed soluble proteins crystallize, but only about one third of these crystals are suitable.
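
The fractions quoted in the two steps above can be chained to give a rough end-to-end yield. The sketch below assumes the stages are independent, so that the fractions simply multiply; the source does not state this explicitly. The function and label names are illustrative only.

```python
from fractions import Fraction

def funnel(soluble_fraction):
    """Return the cumulative fraction of sequences surviving each stage.

    Assumption: stages are independent, so fractions multiply.
    Stage labels are illustrative, not taken from any database.
    """
    stages = [
        ("expresses well", Fraction(60, 100)),    # about 60% of sequences
        ("soluble", soluble_fraction),            # 1/2 prokaryotic, 1/4 eukaryotic
        ("crystallizes", Fraction(1, 2)),         # about half crystallize
        ("crystals suitable", Fraction(1, 3)),    # about one third are suitable
    ]
    remaining = Fraction(1)
    rows = []
    for name, frac in stages:
        remaining *= frac
        rows.append((name, remaining))
    return rows

prokaryotic = funnel(Fraction(1, 2))
eukaryotic = funnel(Fraction(1, 4))

for label, rows in (("prokaryotic", prokaryotic), ("eukaryotic", eukaryotic)):
    print(label)
    for name, remaining in rows:
        print(f"  after '{name}': {float(remaining):.1%}")
```

Under these assumptions, roughly 1 in 20 prokaryotic and 1 in 40 eukaryotic sequences would be expected to yield suitable crystals. These are order-of-magnitude estimates, since each input fraction is itself approximate.
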
Overall success rates (Thornton):

By early 2004, of 24,000 targets cloned, 600 had been solved (about 2.5%).
Of these, the ~500 nonredundant solved targets represent about 3% of the 16,000-target goal of Structural Genomics.
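
As a quick check, the two overall ratios can be recomputed from the counts quoted above. The variable names are illustrative, and the count of nonredundant targets is approximate in the source, so the second ratio is an estimate.

```python
# Counts quoted above for early 2004.
cloned, solved = 24_000, 600
nonredundant, goal = 500, 16_000   # "~500" is approximate in the source

solved_rate = solved / cloned          # fraction of cloned targets solved
goal_progress = nonredundant / goal    # progress toward the 16,000-target goal

print(f"solved / cloned:     {solved_rate:.1%}")
print(f"nonredundant / goal: {goal_progress:.1%}")
```
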

by Eric Martz, University of Massachusetts, July 2003 (revised February 2004)
Thanks to Byron Rubin for some insights here.


Further reading:
  • TargetDB, maintained by the RCSB, can be queried by Status to find the current number of targets that have reached each stage. About 25% of completed structures are being solved by NMR.

  • Structural genomics takes off, Thornton, Trends Biochem. Sci. 26:88, 2001.
  • Structural genomics. Tapping DNA for structures produces a trickle, Service, Science 298:948, 2002.
  • Structural genomics: current progress, Gerstein et al., Science 299:1663, 2003.
      The authors make the point that overall progress is much larger than the number of solved structures, since each solution serves as a template for homology modeling a large family of sequences. Also, work is often stopped on a target when a sequence-related target is solved; hence not all uncompleted targets are "failures".