DBpedia

DBpedia
Developer(s)	Leipzig University; University of Mannheim
Initial release	10 January 2007 (17 years ago)
Stable release	DBpedia 2016-10 / 4 July 2017
Repository	github.com/dbpedia/
Written in	Scala; Java
Type	Semantic Web; Linked Data
License	GNU General Public License
Website	dbpedia.org

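DBpedia (from "DB" for "database") is a project that extracts structured content from Wikipedia and makes it available on the World Wide Web using OpenLink Virtuoso.^[1]^[2] DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets.^[3]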

The project was heralded as "one of the more famous pieces" of the decentralized Linked Data effort by Tim Berners-Lee, one of the Internet's pioneers.^[4]

As of June 2021, DBpedia contained over 850 million triples.

Background

The project was started by people at the

sui generis database rights.

Wikipedia articles consist mostly of free text, but also include structured information embedded in the articles, such as "infobox" tables (the pull-out panels that appear in the top right of the default view of many Wikipedia articles, or at the start of the mobile versions), categorization information, images, geo-coordinates and links to external Web pages. This structured information is extracted and put in a uniform dataset which can be queried.
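As a rough illustration of the extraction step described above, the following Python sketch turns a simplified infobox into subject-property-value triples. It is not the DBpedia extraction framework: the function name `infobox_to_triples`, the sample markup and the line-based parsing are assumptions made for illustration, and real infobox markup (nested templates, links, multi-line values) is considerably more complex.

```python
def infobox_to_triples(subject, wikitext):
    """Turn '| key = value' lines of a simplified infobox into triples."""
    triples = []
    for line in wikitext.splitlines():
        line = line.strip()
        # Only parameter lines of the form "| key = value" carry data.
        if not line.startswith("|") or "=" not in line:
            continue
        key, _, value = line[1:].partition("=")
        key, value = key.strip(), value.strip()
        if key and value:  # skip parameters that editors left empty
            triples.append((subject, key, value))
    return triples


# Hypothetical sample markup, not copied from a real article.
SAMPLE = """{{Infobox person
| name = Ada Lovelace
| birthplace = London
| occupation =
}}"""

triples = infobox_to_triples("Ada_Lovelace", SAMPLE)
for triple in triples:
    print(triple)
```

Each resulting tuple has the same subject-property-value shape as an RDF triple, which is what makes the extracted data uniform enough to query.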

Dataset

The 2016-04 release of the DBpedia data set describes 6.0 million entities, out of which 5.2 million are classified in a consistent ontology, including 1.5 million persons, 810,000 places, 135,000 music albums, 106,000 films, 20,000 video games, 275,000 organizations, 301,000 species and 5,000 diseases.^[8] DBpedia uses the Resource Description Framework (RDF) to represent extracted information and consists of 9.5 billion RDF triples, of which 1.3 billion were extracted from the English edition of Wikipedia and 5.0 billion from other language editions.^[8]

From this data set, information spread across multiple pages can be extracted. For example, a book's authorship can be assembled from the pages about the work and about its author.

One of the challenges in extracting information from Wikipedia is that the same concepts can be expressed using different parameters in infobox and other templates, such as |birthplace= and |placeofbirth=. Because of this, queries about where people were born would have to search for both of these properties in order to get more complete results. As a result, the DBpedia Mapping Language has been developed to help in mapping these properties to an ontology while reducing the number of synonyms. Due to the large diversity of infoboxes and properties in use on Wikipedia, the process of developing and improving these mappings has been opened to public contributions.^[9]
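The effect of such a mapping can be sketched in a few lines of Python. This sketch does not use the DBpedia Mapping Language's actual syntax; the `PROPERTY_MAPPING` table, the `normalize` function and the sample people are hypothetical, and only the two parameter names come from the text above.

```python
# Hypothetical mapping table: infobox parameter -> canonical ontology property.
PROPERTY_MAPPING = {
    "birthplace": "birthPlace",
    "placeofbirth": "birthPlace",
}


def normalize(triples, mapping=PROPERTY_MAPPING):
    """Rewrite raw infobox parameters to their canonical property names."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


# Hypothetical raw triples using two different spellings of the same concept.
raw = [
    ("Person_A", "birthplace", "Leipzig"),
    ("Person_B", "placeofbirth", "Berlin"),
]
normalized = normalize(raw)

# A single property now answers "where was X born?" for both people.
born_in = {s: o for s, p, o in normalized if p == "birthPlace"}
print(born_in)
```

Without the mapping step, the final lookup would have to test for every synonym; with it, one property name is enough.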

Version 2014 was released in September 2014.^[10] A main change since previous versions was the way abstract texts were extracted. Specifically, running a local mirror of Wikipedia and retrieving rendered abstracts from it made extracted texts considerably cleaner. Also, a new data set extracted from Wikimedia Commons was introduced.

As of June 2021, DBpedia contains over 850 million triples.^[11]

Examples

DBpedia extracts factual information from Wikipedia pages, allowing users to find answers to questions where the information is spread across multiple Wikipedia articles. Data is accessed using an SQL-like query language for RDF called SPARQL.
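A SPARQL query is typically sent to an endpoint over HTTP. The Python sketch below only builds such a request and does not send it. The endpoint URL `https://dbpedia.org/sparql` and the helper name `build_request` are assumptions not taken from the text above; the `query` parameter and the JSON results media type follow the SPARQL protocol.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://dbpedia.org/sparql"  # assumed public endpoint


def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) an HTTP GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request("SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")
print(request.full_url)
# To execute it, pass the request to urllib.request.urlopen().
```

Keeping request construction separate from sending makes the encoding step easy to inspect without network access.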

For example, suppose one were interested in the Japanese shōjo manga series Tokyo Mew Mew and wanted to find the genres of other works written by its illustrator, Mia Ikumi. DBpedia combines information from Wikipedia's entries on Tokyo Mew Mew, Mia Ikumi and this author's other works, such as Super Doll Licca-chan and Koi Cupid. Since DBpedia normalises information into a single database, the following query can be asked without needing to know exactly which entry carries each fragment of information, and will list the related genres:

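PREFIX dbprop: <http://dbpedia.org/ontology/>
PREFIX db: <http://dbpedia.org/resource/>
# List every work that shares an author with Tokyo Mew Mew,
# together with its genre where one is recorded.
SELECT ?who ?work ?genre WHERE {
  db:Tokyo_Mew_Mew dbprop:author ?who .
  ?work dbprop:author ?who .
  # OPTIONAL keeps works that have no genre statement.
  OPTIONAL { ?work dbprop:genre ?genre }
}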
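The way this query joins facts from several articles can be imitated on a miniature in-memory dataset. The Python sketch below is only an illustration of the join logic, not of how a SPARQL engine is implemented; the triples are a hypothetical stand-in for DBpedia's data and the genre values are placeholders.

```python
# Hypothetical miniature dataset; the genre values are placeholders.
TRIPLES = [
    ("Tokyo_Mew_Mew", "author", "Mia_Ikumi"),
    ("Super_Doll_Licca-chan", "author", "Mia_Ikumi"),
    ("Koi_Cupid", "author", "Mia_Ikumi"),
    ("Tokyo_Mew_Mew", "genre", "Genre_A"),
    ("Koi_Cupid", "genre", "Genre_B"),
]


def objects(subject, prop):
    """All values o such that (subject, prop, o) is in the dataset."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


def subjects(prop, obj):
    """All subjects s such that (s, prop, obj) is in the dataset."""
    return [s for s, p, o in TRIPLES if p == prop and o == obj]


def works_by_same_author(work):
    rows = []
    for who in objects(work, "author"):        # first pattern binds ?who
        for other in subjects("author", who):  # second pattern binds ?work
            # Like OPTIONAL: keep the row even when no genre is known.
            genres = objects(other, "genre") or [None]
            rows.extend((who, other, genre) for genre in genres)
    return rows


rows = works_by_same_author("Tokyo_Mew_Mew")
for row in rows:
    print(row)
```

Note that the starting work itself appears in the result, just as it would in the query above, because it also matches the second pattern.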

Use cases

DBpedia has a broad scope of entities covering different areas of

DBpedia Spotlight also include links to DBpedia.^[16]^[17]^[18] The BBC uses DBpedia to help organize its content.^[19]^[20] Faviki uses DBpedia for semantic tagging.^[21] Samsung also includes DBpedia in its "Knowledge Sharing Platform".

Such a rich source of structured cross-domain knowledge is fertile ground for artificial intelligence systems. DBpedia was used as one of the knowledge sources in IBM Watson's Jeopardy!-winning system.^[22]

Amazon provides a DBpedia Public Data Set that can be integrated into Amazon Web Services applications.^[23]

Data about creators from DBpedia can be used to enrich artworks' sales observations.^[24]

The crowdsourcing software company Ushahidi built a prototype of its software that leveraged DBpedia to perform semantic annotations on citizen-generated reports. The prototype incorporated the "YODIE" (Yet another Open Data Information Extraction system) service^[25] developed by the University of Sheffield, which uses DBpedia to perform the annotations. The goal for Ushahidi was to improve the speed and ease with which incoming reports could be validated and managed.^[26]

DBpedia Spotlight

DBpedia Spotlight is a tool for annotating mentions of DBpedia resources in text. This allows linking unstructured information sources to the

named entity recognition, and other information extraction tasks. DBpedia Spotlight aims to be customizable for many use cases. Instead of focusing on a few entity types, the project strives to support the annotation of all 3.5 million entities and concepts from more than 320 classes in DBpedia. The project started in June 2010 at the Web Based Systems Group at the Free University of Berlin.

DBpedia Spotlight is publicly available as a

API licensed via the Apache License. The DBpedia Spotlight distribution includes a jQuery plugin that allows developers to annotate pages anywhere on the Web by adding one line to their page.^[27] Clients are also available in Java or PHP.^[28] The tool handles various languages through its demo page^[29] and web services. Internationalization is supported for any language that has a Wikipedia edition.^[30]
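The basic idea of annotating mentions with DBpedia resources can be sketched with a toy dictionary lookup in Python. This is not DBpedia Spotlight's algorithm or API: the `SURFACE_FORMS` table and the `annotate` function are hypothetical, and the sketch ignores word boundaries and the disambiguation of mentions with several candidate resources.

```python
# Hypothetical surface-form dictionary: text mention -> DBpedia resource URI.
SURFACE_FORMS = {
    "berlin": "http://dbpedia.org/resource/Berlin",
    "free university of berlin": "http://dbpedia.org/resource/Free_University_of_Berlin",
}


def annotate(text, surface_forms=SURFACE_FORMS):
    """Return (offset, mention, uri) for the longest match at each position."""
    lowered = text.lower()
    # Try longer surface forms first so that they win over their substrings.
    forms = sorted(surface_forms, key=len, reverse=True)
    annotations, i = [], 0
    while i < len(lowered):
        for form in forms:
            if lowered.startswith(form, i):
                mention = text[i:i + len(form)]
                annotations.append((i, mention, surface_forms[form]))
                i += len(form)  # continue after the matched mention
                break
        else:
            i += 1  # no surface form starts here
    return annotations


annotations = annotate("Spotlight started at the Free University of Berlin.")
print(annotations)
```

Preferring the longest match means the university is annotated as a single mention rather than as the city it contains.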

Archivo ontology database

Since 2020, the DBpedia project has provided a regularly updated database of web-accessible ontologies written in the OWL ontology language.^[31] Archivo also provides a four-star rating scheme for the ontologies it scrapes, based on accessibility, quality, and related fitness-for-use criteria. For instance, SHACL compliance for graph-based data is evaluated when appropriate. Ontologies should also contain metadata about their characteristics and specify a public license describing their terms of use.^[32]^[33] As of June 2021, the Archivo database contained 1368 entries.

History

DBpedia was initiated in 2007 by Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak and Zachary Ives.^[5]

References

[1] ISSN 1570-8268. Archived from the original (PDF) on 10 August 2017. Retrieved 11 December 2015.

[2] "About DBpedia". DBpedia. Retrieved 14 January 2024.

[3] "Komplett verlinkt — Linked Data" (in German). 3sat. 19 June 2009. Archived from the original on 6 January 2013. Retrieved 10 November 2009.

[4] "Sir Tim Berners-Lee Talks with Talis about the Semantic Web". Talis. 7 February 2008. Archived from the original on 10 May 2013.

[5] "DBpedia: A Nucleus for a Web of Open Data".

[6] "Credits". DBpedia. Archived from the original on 21 September 2014. Retrieved 9 September 2014.

[7] "Home". March 2024.

[8] "YEAH! We did it again ;) – New 2016-04 DBpedia release". DBpedia. 19 October 2016. Retrieved 9 January 2019.

[9] "DBpedia Mappings". mappings.dbpedia.org. Retrieved 3 April 2010.

[10] "Changelog". DBpedia. September 2014. Retrieved 9 September 2014.

[11] Holze, Julia (23 July 2021). "Announcement: DBpedia Snapshot 2021-06 Release". DBpedia Association. Retrieved 28 July 2021.

[12] E. Curry, A. Freitas, and S. O'Riáin, "The Role of Community-Driven Data Curation for Enterprises", in Linking Enterprise Data, D. Wood, Ed. Boston, MA: Springer US, 2010, pp. 25-47. Archived 23 January 2012 at the Wayback Machine.

[13] "Statistics on links between Data sets", SWEO Community Project: Linking Open Data on the Semantic Web, W3C. Retrieved 24 November 2009.

[14] "Statistics on Data sets", SWEO Community Project: Linking Open Data on the Semantic Web, W3C. Retrieved 24 November 2009.

[15] "Zemanta API". dev.zemanta.com. Retrieved 26 July 2021.

[16] Sandhaus, Evan; Larson, Rob (29 October 2009). "First 5,000 Tags Released to the Linked Data Cloud". The New York Times Blogs. Retrieved 10 November 2009.

[17] "Life in the Linked Data Cloud". opencalais.com. Archived from the original on 24 November 2009. Retrieved 10 November 2009. Wikipedia has a Linked Data twin called DBpedia. DBpedia has the same structured information as Wikipedia – but translated into a machine-readable format.

[18] "Zemanta talks Linked Data with SDK and commercial API". ZDNet. Archived from the original on 28 February 2010. Retrieved 10 November 2009. Zemanta fully supports the Linking Open Data initiative. It is the first API that returns disambiguated entities linked to dbPedia, Freebase, MusicBrainz, and Semantic Crunchbase.

[19] "European Semantic Web Conference 2009 - Georgi Kobilarov, Tom Scott, Yves Raimond, Silver Oliver, Chris Sizemore, Michael Smethurst, Christian Bizer and Robert Lee. Media meets Semantic Web - How the BBC uses DBpedia and Linked Data to make Connections". eswc2009.org. Archived from the original on 8 June 2009. Retrieved 10 November 2009.

[20] "BBC Learning - Open Lab - Reference". BBC. Archived from the original on 25 August 2009. Retrieved 10 November 2009. Dbpedia is a database version of Wikipedia. It is used in a lot of projects for a wide range of different reasons. At the BBC we are using it for tagging content.

[21] "Semantic Tagging with Faviki". readwriteweb.com. Archived from the original on 29 January 2010.

[22] David Ferrucci, Eric Brown, Jennifer Chu-Carroll, James Fan, David Gondek, Aditya A. Kalyanpur, Adam Lally, J. William Murdock, Eric Nyberg, John Prager, Nico Schlaefer, and Chris Welty, "Building Watson: An Overview of the DeepQA Project", in AI Magazine, Fall 2010. Association for the Advancement of Artificial Intelligence (AAAI). Archived 6 November 2020 at the Wayback Machine.

[23] "Amazon Web Services Developer Community : DBpedia". developer.amazonwebservices.com. Archived from the original on 13 February 2010. Retrieved 10 November 2009.

[24] ISBN 978-3-319-26761-6.

[25] "GATE.ac.uk - applications/yodie.html". gate.ac.uk. Retrieved 11 May 2020.

[26] "ushahidi/platform-comrades". GitHub. 30 June 2019. Retrieved 9 March 2020.

[27] Mendes, Pablo. "DBpedia Spotlight jQuery Plugin". jQuery Plugins. Archived from the original on 3 April 2011. Retrieved 15 September 2011.

[28] DiCiuccio, Rob (25 September 2016). "PHP Client for DBpedia Spotlight". GitHub.

[29] "Demo of DBpedia Spotlight". Retrieved 8 September 2013.

[30] "Internationalization of DBpedia Spotlight". GitHub. Retrieved 8 September 2013.

[31] "DBpedia Archivo". Retrieved 8 July 2021.

[32] Frey, Johannes; Streitmatter, Denis; Götz, Fabian; Hellmann, Sebastian; Arndt, Natanael (27 October 2020). "DBpedia Archivo: a web-scale interface for ontology archiving under consumer-oriented aspects". In Sure-Vetter, York; Sack, Harald; Cudré-Mauroux, Philippe; Maleshkova, Maria; Pellegrini, Tassilo; Acosta, Maribel (eds.). Semantic systems: the power of AI and knowledge graphs. Cham, Switzerland: Springer. S2CID 219939266. Download as PDF or ePUB.

[33] Frey, Johannes; Streitmatter, Denis; Götz, Fabian; Hellmann, Sebastian; Arndt, Natanael (10 September 2020). DBpedia Archivo: a web-scale interface for ontology archiving under consumer-oriented aspects. Leipzig, Germany: Institut für Angewandte Informatik (InfAI). Retrieved 8 July 2021. YouTube video 00:10:38.


