Protein Data Bank
The Protein Data Bank (PDB)[1] is a database of three-dimensional structural data for large biological molecules such as proteins and nucleic acids, overseen by the Worldwide Protein Data Bank (wwPDB). The structural data are obtained and deposited by biologists and biochemists worldwide using experimental methods such as X-ray crystallography, NMR spectroscopy, and, increasingly, cryogenic-sample electron microscopy. All submitted data are reviewed by expert biocurators and, once approved, are made freely available on the Internet under the CC0 Public Domain Dedication.[2] Global access to the data is provided by the websites of the wwPDB member organizations (PDBe,[3] PDBj,[4] RCSB PDB,[5] BMRB[6] and the EMDB[7]).
The PDB is a key in areas of
History
Two forces converged to initiate the PDB: a small but growing collection of sets of protein structure data determined by X-ray diffraction; and the newly available (1968) molecular graphics display, the Brookhaven RAster Display (BRAD), to visualize these protein structures in 3-D. In 1969, with the sponsorship of Walter Hamilton at the Brookhaven National Laboratory, Edgar Meyer (Texas A&M University) began to write software to store atomic coordinate files in a common format to make them available for geometric and graphical evaluation. By 1971, one of Meyer's programs, SEARCH, enabled researchers to remotely access information from the database to study protein structures offline.[10] SEARCH was instrumental in enabling networking, thus marking the functional beginning of the PDB.
The Protein Data Bank was announced in October 1971 in
Upon Hamilton's death in 1973, Tom Koetzle took over direction of the PDB for the subsequent 20 years. In January 1994, Joel Sussman of Israel's Weizmann Institute of Science was appointed head of the PDB. In October 1998,[12] the PDB was transferred to the Research Collaboratory for Structural Bioinformatics (RCSB);
Contents
The PDB is updated weekly (on Wednesdays, UTC+0), along with its holdings list.[18] As of 10 January 2023, the PDB comprised:
Experimental Method | Proteins only | Proteins with oligosaccharides | Protein/Nucleic Acid complexes | Nucleic Acids only | Other | Oligosaccharides only | Total |
---|---|---|---|---|---|---|---|
X-ray diffraction | 152277 | 8969 | 8027 | 2566 | 163 | 11 | 172013 |
NMR | 12104 | 32 | 281 | 1433 | 31 | 6 | 13887 |
Electron microscopy | 9226 | 1633 | 2898 | 77 | 8 | 0 | 13842 |
Hybrid | 189 | 7 | 6 | 12 | 0 | 1 | 215 |
Neutron | 72 | 1 | 0 | 2 | 0 | 0 | 75 |
Other | 32 | 0 | 0 | 1 | 0 | 4 | 37 |
Total: | 173900 | 10642 | 11212 | 4091 | 202 | 22 | 200069 |
- 162,041 structures in the PDB have a structure factor file.
- 11,242 structures have an NMR restraint file.
- 5,774 structures in the PDB have a chemical shifts file.
- 13,388 structures in the PDB have a 3DEM map file deposited in the EM Data Bank.
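The share of the archive contributed by each experimental method can be recomputed from the per-category counts in the holdings table. The following is a minimal sketch, not an official wwPDB tool: the names `HOLDINGS`, `method_totals`, and `method_shares` are illustrative, and the numbers are simply copied from the 10 January 2023 snapshot above.

```python
# Per-method counts copied from the holdings table (10 January 2023 snapshot).
# Column order: proteins only, proteins with oligosaccharides,
# protein/nucleic acid complexes, nucleic acids only, other,
# oligosaccharides only.
HOLDINGS = {
    "X-ray diffraction": [152277, 8969, 8027, 2566, 163, 11],
    "NMR": [12104, 32, 281, 1433, 31, 6],
    "Electron microscopy": [9226, 1633, 2898, 77, 8, 0],
    "Hybrid": [189, 7, 6, 12, 0, 1],
    "Neutron": [72, 1, 0, 2, 0, 0],
    "Other": [32, 0, 0, 1, 0, 4],
}


def method_totals(holdings):
    """Sum the category counts of each experimental method (row totals)."""
    return {method: sum(counts) for method, counts in holdings.items()}


def method_shares(holdings):
    """Return each method's fraction of all structures in the snapshot."""
    totals = method_totals(holdings)
    grand_total = sum(totals.values())
    return {method: total / grand_total for method, total in totals.items()}


if __name__ == "__main__":
    for method, share in method_shares(HOLDINGS).items():
        print(f"{method:<20} {share:6.2%}")
```

Summing the rows rather than hard-coding the "Total" column also serves as a consistency check on the table: the row totals should add up to the grand total of 200,069.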

Most structures are determined by X-ray diffraction, but about 7% of structures are determined by
For structures determined by X-ray diffraction that have a structure factor file, the electron density map may be viewed. The data for such structures may be viewed on the three PDB websites.
Historically, the number of structures in the PDB has grown at an approximately exponential rate, with 100 registered structures in 1982, 1,000 structures in 1993, 10,000 in 1999, 100,000 in 2014, and 200,000 in January 2023.[20][21]
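As a rough check on the "approximately exponential" description, the average yearly growth factor implied by each pair of consecutive milestones can be computed as (N2/N1)^(1/(t2 - t1)). The sketch below is illustrative only: `MILESTONES` and `annual_growth_factors` are hypothetical names, the counts are the rounded figures quoted above, and treating each milestone as falling on a whole calendar year is a simplifying assumption.

```python
# Milestones quoted in the text: (year, approximate number of structures).
MILESTONES = [
    (1982, 100),
    (1993, 1_000),
    (1999, 10_000),
    (2014, 100_000),
    (2023, 200_000),
]


def annual_growth_factors(milestones):
    """Average yearly multiplier between each pair of consecutive milestones.

    Returns a list of (start_year, end_year, factor) tuples, where
    factor = (n_end / n_start) ** (1 / (end_year - start_year)).
    """
    factors = []
    for (y0, n0), (y1, n1) in zip(milestones, milestones[1:]):
        factors.append((y0, y1, (n1 / n0) ** (1 / (y1 - y0))))
    return factors


if __name__ == "__main__":
    for start, end, factor in annual_growth_factors(MILESTONES):
        print(f"{start}-{end}: x{factor:.3f} per year")
```

A strictly exponential curve would give the same factor for every interval, so the spread between the printed values indicates how loosely the milestones follow a single exponential.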
File format
The file format initially used by the PDB was called the PDB file format. The original format was restricted by the width of
An XML version of PDB, called PDBML, was described in 2005.[24] The structure files can be downloaded in any of these three formats, though an increasing number of structures do not fit the legacy PDB format. Individual files are easily downloaded into graphics packages from Internet URLs:
- For PDB format files, use, e.g., http://www.pdb.org/pdb/files/4hhb.pdb.gz or http://pdbe.org/download/4hhb
- For PDBML (XML) files, use, e.g., http://www.pdb.org/pdb/files/4hhb.xml.gz or http://pdbe.org/pdbml/4hhb
The "4hhb" is the PDB identifier. Each structure published in PDB receives a four-character alphanumeric identifier, its PDB ID. (This is not a unique identifier for biomolecules, because several structures for the same molecule—in different environments or conformations—may be contained in PDB with different PDB IDs.)
Viewing the data
The structure files may be viewed using one of
See also
- Crystallographic database
- Protein structure
- Protein structure prediction
- Protein structure database
- PDBREPORT lists all anomalies (also errors) in PDB structures
- PDBsum—extracts data from other databases about PDB structures
- Proteopedia—a collaborative 3D encyclopedia of proteins and other molecules
References
- PMID 30357364.
- wwPDB.org. "wwPDB: Usage Policies". www.wwpdb.org. Retrieved 2024-04-16.
- "PDBe home < Node < EMBL-EBI". pdbe.org.
- "Protein Data Bank Japan – PDB Japan – PDBj". pdbj.org.
- Bank, RCSB Protein Data. "RCSB PDB: Homepage". rcsb.org.
- "Biological Magnetic Resonance Bank". bmrb.wisc.edu.
- EMDB, EMBL-EBI. "EMDB: Homepage". www.emdatabank.org.
- PMID 18156675.
- PMID 9433130.
- PMID 9232661.
- .
- PMID 10592235.
- "Research Collaboratory for Structural Bioinformatics". RCSB.org. Research Collaboratory for Structural Bioinformatics. Archived from the original on 2007-02-05.
- "RCSB PDB Newsletter Archive". RCSB Protein Data Bank.
- EMDB, EMBL-EBI. "EMDB: Homepage". www.emdatabank.org.
- ISBN 978-1-441-97664-2.
- "PDB Validation Suite". sw-tools.pdb.org.
- "PDB Current Holdings Breakdown". RCSB. Archived from the original on 2007-07-04. Retrieved 2007-07-02.
- PMID 30357364.
- PMID 24834514.
- Protein Data Bank. "PDB Statistics: Overall Growth of Released Structures Per Year". www.rcsb.org. Retrieved 12 January 2023.
- "wwPDB: File Formats and the PDB". wwpdb.org. Retrieved April 1, 2020.
- wwPDB.org. "wwPDB: 2019 News". wwpdb.org.
- PMID 15509603.
- "ICM-Browser". Molsoft L.L.C. Retrieved 2013-04-06.
- "Swiss PDB Viewer". Swiss Institute of Bioinformatics. Retrieved 2013-04-06.
- "STAR: Biochem - Home". web.mit.edu.
- "VisProt3DS". Molecular Systems Ltd. Retrieved 2013-04-06.
External links
- The Worldwide Protein Data Bank (wwPDB)—parent site to regional hosts (below)
- RCSB Protein Data Bank (US)
- PDBe (Europe)
- PDBj (Japan)
- BMRB, Biological Magnetic Resonance Data Bank (US)
- wwPDB Documentation—documentation on both the PDB and PDBML file formats
- Looking at Structures Archived 2011-03-24 at the Wayback Machine—The RCSB's introduction to crystallography
- PDBsum Home Page—Extracts data from other databases about PDB structures.
- Nucleic Acid Database, NDB—a PDB mirror especially for searching for nucleic acids
- Introductory PDB tutorial sponsored by PDB
- PDBe: Quick Tour on EBI Train OnLine