Semantic spectrum
The
At the low end of the spectrum is a simple binding of a single word or phrase and its definition. At the high end is a full
With increased
Some steps in the semantic spectrum include the following:
- glossary: A simple list of terms and their definitions. A glossary focuses on creating a complete list of domain-specific terms and acronyms. It is useful for creating clear and unambiguous definitions, and because it can be created with simple word processing tools, few technical tools are necessary.
- controlled vocabulary: A simple list of terms, definitions and naming conventions. A controlled vocabulary frequently has some type of oversight process associated with adding or removing data element definitions to ensure consistency. Terms are often defined in relationship to each other.
- data dictionary: Terms, definitions, naming conventions and one or more representations of the data elements in a computer system. Data dictionaries often define data types, validation checks such as enumerated values and the formal definitions of each of the enumerated values.
- data model: Terms, definitions, naming conventions and one or more representations of the data elements, as well as the beginning of a specification of the relationships between data elements, including abstractions and containers.
- taxonomy: A complete data model in an inheritance hierarchy where all data elements inherit their behaviors from a single "super data element". The difference between a data model and a formal taxonomy is the arrangement of data elements into a formal tree structure where each element in the tree is a formally defined concept with associated properties.
- ontology: The W3C standard language for representing ontologies is the Web Ontology Language (OWL). Ontologies frequently contain formal business rules expressed as discrete logic statements that relate data elements to one another.
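The progression in the steps listed above can be illustrated with a minimal Python sketch that carries one term from a glossary entry to a data dictionary entry and then into a taxonomy. The `DataElement` class, its field names, and the "order status" example are hypothetical and chosen only for illustration; they do not come from any standard or metadata registry product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Glossary level: a term bound to a plain-text definition and nothing more.
glossary: Dict[str, str] = {
    "order status": "The current stage of a customer order.",
}


@dataclass
class DataElement:
    """Hypothetical data element used at the data dictionary and taxonomy levels."""
    name: str
    definition: str
    data_type: str = "string"
    # Data dictionary level: enumerated values, each with its own definition.
    enumerated_values: Dict[str, str] = field(default_factory=dict)
    # Taxonomy level: a single parent places the element in a tree.
    parent: Optional["DataElement"] = None

    def is_valid(self, value: str) -> bool:
        """Validation check: accept any value unless an enumeration is defined."""
        return not self.enumerated_values or value in self.enumerated_values

    def ancestors(self) -> List[str]:
        """Walk up the inheritance hierarchy to the root element."""
        names: List[str] = []
        node = self.parent
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


# Taxonomy level: every element ultimately descends from one root element.
root = DataElement("data element", "Root of the hierarchy.")
status = DataElement("status", "A named stage in some process.", parent=root)
order_status = DataElement(
    name="order status",
    definition=glossary["order status"],
    enumerated_values={
        "open": "The order has been placed but not shipped.",
        "shipped": "The order has left the warehouse.",
        "closed": "The order has been delivered and settled.",
    },
    parent=status,
)
```

Each level adds machine-usable structure to the one before it: the glossary only supports lookup by a human reader, the enumerated values let `order_status.is_valid("shipped")` be checked by a program, and the parent links let `order_status.ancestors()` recover the element's place in the tree. An ontology would go further by also naming and constraining relationships other than inheritance, which this sketch does not attempt.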
Typical questions for determining semantic precision
The following is a list of questions that may arise in determining semantic precision.
- correctness
- How can correct syntax and semantics be enforced? Are tools (such as XML Schema) readily available to validate syntax of data exchanges?
- adequacy/expressivity/scope
- Does the system represent everything that is of practical use for the purpose? Is an emphasis being placed on data that is externalized (exposed or transferred between systems)?
- efficiency
- How efficiently can the representation be searched or queried, and possibly reasoned on?
- complexity
- How steep is the ontology editor)
- translatability
- Can the representation easily be transformed (e.g. by semantic equivalence is ensured?
Determining location on the semantic spectrum
Many organizations today are building a metadata registry to store their data definitions and to perform metadata publishing. The question of where they are on the semantic spectrum frequently arises. To determine where your systems are, some of the following questions are frequently useful.
- Is there a centralized glossary of terms for the subject matter?
- Does the glossary of terms include precise definitions for each term?
- Is there a central repository to store data elements that includes data type information?
- Is there an approval process associated with the creation of and changes to data elements?
- Are coded data elements fully enumerated? Does each enumeration have a full definition?
- Is there a process in place to remove duplicate or redundant data elements from the metadata registry?
- Are there one or more classification schemes used to classify data elements?
- Are document exchanges and web services created using the data elements?
- Can the central metadata registry be used as part of a model-driven architecture?
- Are there staff members trained to extract data elements that can be reused in metadata structures?
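One simple way to work through the questions above is to record a yes or no answer for each and list the ones still answered "no". The following Python sketch does this. The short labels and the `summarize` function are hypothetical, and the source does not assign weights or map answers to a position on the spectrum, so the sketch only counts answers and makes no claim about a score.

```python
from typing import Dict, List, Tuple

# Hypothetical short labels, one per question in the list above.
CHECKLIST: List[str] = [
    "centralized glossary",
    "precise definitions",
    "central repository with data types",
    "approval process",
    "coded elements fully enumerated",
    "duplicate removal process",
    "classification schemes",
    "exchanges built from data elements",
    "registry usable for model-driven architecture",
    "staff trained to extract reusable elements",
]


def summarize(answers: Dict[str, bool]) -> Tuple[int, List[str]]:
    """Return the number of 'yes' answers and the labels not yet answered 'yes'.

    Questions missing from `answers` are treated as 'no'.
    """
    gaps = [label for label in CHECKLIST if not answers.get(label, False)]
    return len(CHECKLIST) - len(gaps), gaps


# Example answers for an organization that has only a glossary so far.
answers = {
    "centralized glossary": True,
    "precise definitions": True,
    "approval process": False,
}
met, gaps = summarize(answers)
print(f"{met} of {len(CHECKLIST)} questions answered yes")
for label in gaps:
    print(f"  still open: {label}")
```

Treating unanswered questions as "no" is a deliberate assumption: it keeps the list of open items conservative until someone has actually checked each point.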
Strategic nature of semantics
Today, much of the World Wide Web is stored as
In the past, many organizations that created custom database applications used isolated teams of developers who did not formally publish their data definitions. These teams frequently used internal data definitions that were incompatible with other computer systems. This made
The job title of an individual who is responsible for coordinating an organization's data is data architect.
History
The first reference to this term was at the 1999 AAAI Ontologies Panel. The panel was organized by Chris Welty, who at the prodding of Fritz Lehmann and in collaboration with the panelists (Fritz,
The concept of the Semantic precision in
This discussion centered on a 10-level partition that included the following levels (listed in order of increasing semantic precision):
- Simple Catalog of Data Elements
- Glossary of Terms and Definitions
- Thesauri, Narrow Terms, Relationships
- Informal "Is-a" relationships
- Formal "Is-a" relationships
- Formal instances
- Frames (properties)
- Value Restrictions
- Part-of
- General Logical Constraints
Note that there was formerly a special emphasis on adding formal is-a relationships to the spectrum, which has since been dropped.
The company Cerebra has also popularized this concept by describing the data formats that exist within an enterprise in terms of their ability to store semantically precise metadata. Their list includes:
- HTML
- Word processing documents
- Microsoft Excel
- Relational databases
- XML
- XML Schema
- Taxonomies
- Ontologies
What these concepts share is the ability to store information with increasing precision to facilitate intelligent agents.