COS4840 - Ontology Engineering
Ontology Basics¶
In computer science, an ontology is a formal representation of knowledge that describes the concepts and relationships between them in a particular domain. It provides a shared vocabulary and a set of rules for describing and classifying entities in a domain, such as objects, events, and concepts.
Gruber definition: "explicit specification of a conceptualization". Borst definition: "formal specification of a shared conceptualization". Studer definition: "formal, explicit specification of a shared conceptualization".
In philosophy, ontology is the study of the nature and structure of reality. In AI, the working stance is that what exists is that which can be represented.
Definitions¶
Universals are entities that can have instances. Particulars are entities that do not have instances. Entities are organized into concepts and relations. A taxonomy is a generalization/specialization hierarchy of concepts. For example, in a taxonomy of animals, we might define a set of classes such as "Mammal", "Bird", and "Fish". We could then specify that "Mammal" is a subtype of "Animal", and that "Bird" and "Fish" are also subtypes of "Animal". We could further specify that "Primate" is a subtype of "Mammal", and that "Chimpanzee" is a subtype of "Primate".
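As a rough illustration of the animal taxonomy above, the subtype links can be stored as a simple child-to-parent mapping. This is only a sketch: it assumes every concept has at most one direct supertype, and the names `SUBTYPE_OF`, `ancestors` and `is_subtype` are made up for this example.

```python
# Direct "is a subtype of" links, taken from the animal example above.
# Assumption: each concept has at most one direct supertype (a tree).
SUBTYPE_OF = {
    "Mammal": "Animal",
    "Bird": "Animal",
    "Fish": "Animal",
    "Primate": "Mammal",
    "Chimpanzee": "Primate",
}


def ancestors(concept):
    """Return all supertypes of a concept, nearest first."""
    result = []
    while concept in SUBTYPE_OF:
        concept = SUBTYPE_OF[concept]
        result.append(concept)
    return result


def is_subtype(concept, candidate):
    """True if `concept` is a direct or indirect subtype of `candidate`."""
    return candidate in ancestors(concept)


print(ancestors("Chimpanzee"))             # ['Primate', 'Mammal', 'Animal']
print(is_subtype("Chimpanzee", "Animal"))  # True
print(is_subtype("Bird", "Mammal"))        # False
```

The point of the sketch is that subtype links are followed transitively: "Chimpanzee" is never declared a subtype of "Animal" directly, yet the hierarchy implies it.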
Core ideas in an Ontology:¶
- Concepts - unary universals, e.g. Person, Manager, Researcher
- Relations - binary universals, e.g. cooperates-with
- Properties
- Instances
- Axioms
Conceptualization¶
Genesereth and Nilsson definition: “A body of formally represented knowledge is based on a conceptualization: the objects, concepts, and other entities that are assumed to exist in some area of interest and the relationships that hold among them. A conceptualization is an abstract, simplified view of the world that we wish to represent for some purpose. Every knowledge base, knowledge-based system, or knowledge-level agent is committed to some conceptualization, explicitly or implicitly”.
Committing to a conceptualisation¶
Formally represented knowledge has instances of concepts and relations (e.g. Bob is married to Mary). This is committed to a conceptualization that contains people as a concept, with name as a property and married-to as a relation.
Parts of a conceptualisation¶
- World - an ordered set of world states, corresponding to the system's evolution in time.
- World state - with respect to a specific system S we want to model, a world state for S is a unique assignment of values to all the observable variables that characterize the system.
- System - a piece of reality we want to model which is perceived by an observing agent (at a certain level of granularity) by means of an array of "observed variables".
- Extensional relation - a relation defined by explicitly enumerating (listing) all of its members, i.e. a set of tuples of domain elements.
- Extensional relational structure - in a given system S, an extensional relational structure is the tuple (D, R) where:
- S is the specific system which we want to model.
- D is a set of distinguishable elements within S (it is not the world, but unary particulars (instances of universals) within the system)
- R is a set of extensional relations on D
- Intensional relation (conceptual relation) - a relation defined by a membership condition, rather than by explicitly listing its members.
- Formally, it is a function that maps each world to an extensional relation on D. The extension is a set of tuples, and which tuples it contains in a given world is determined by the condition in the intensional relation.
- Intensional relational structure (conceptualization) - in a given system S, an intensional relational structure is a triple (D, W, ℝ) where:
- S is the specific system which we want to model.
- D is a set of distinguishable elements within S
- W is the set of possible worlds for S
- ℝ is a set of intensional relations on D
Extensional Relational Structure Example:
S = a company of 50 000 people:
D = {I000001, ..., I050000} // IDs of people in the company
R = {Person, Manager, Researcher, co-operates-with, reports-to, firstname} // unary or binary relations on D
World state W1:
Person = D
Manager = {..., I034820, ...} // unary extensional relation
Researcher = {..., I044443, ..., I046758, ...} // unary extensional relation
reports-to = {..., (I044443, I034820), (I046758, I034820), ...} // binary extensional relation
co-operates-with = {..., (I044443, I046758), ...} // binary extensional relation
firstname = {..., (I034820, "John"), ...} // binary extensional relation
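The extensional structure (D, R) above can be sketched with plain Python sets. This is a scaled-down illustration, not the full 50 000-person company: the domain is assumed to contain only the three IDs the example names explicitly, `firstname` is left out because its second component is a string rather than an element of D, and the helpers `is_relation_on` and `arity` are hypothetical names.

```python
# Toy domain D: only the three IDs that the example names explicitly.
D = {"I034820", "I044443", "I046758"}

# Extensional relations for world state W1, restricted to the toy domain.
# Unary relations are stored as sets of 1-tuples, binary ones as sets of pairs.
R = {
    "Person": {(x,) for x in D},
    "Manager": {("I034820",)},
    "Researcher": {("I044443",), ("I046758",)},
    "reports-to": {("I044443", "I034820"), ("I046758", "I034820")},
    "co-operates-with": {("I044443", "I046758")},
}


def is_relation_on(domain, extension):
    """Check that every tuple of the extension is built from domain elements."""
    return all(element in domain for row in extension for element in row)


def arity(extension):
    """Return the common tuple length (assumes a non-empty, uniform extension)."""
    lengths = {len(row) for row in extension}
    assert len(lengths) == 1, "mixed arities"
    return lengths.pop()


structure = (D, R)  # the extensional relational structure (D, R)
for name, extension in R.items():
    print(name, arity(extension), is_relation_on(D, extension))
```

Everything here is a listed set of tuples: nothing in the structure says why I034820 is a Manager, only that it is one in this world state.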
Intensional Relational Structure Example
S = a company of 50 000 people:
D = {I000001, ..., I050000} // IDs of people in the company
ℝ = {Person, Manager, Researcher, co-operates-with, reports-to, firstname} // unary or binary relations on D
One assumption we can make is that the conceptual relations Person, Manager and Researcher are rigid, i.e. they map to the same extension in every possible world. We do not make that assumption for the binary relations reports-to and cooperates-with.
World state W1:
Person = D
Manager = {..., I034820, I034822, ...}
Researcher = {..., I044443, ..., I046758, ...}
reports-to = {..., (I044443, I034820), (I046758, I034820), ...}
co-operates-with = {..., (I044443, I046758), ...}
firstname = {..., (I034820, "John"), ...}
World state W2:
Person = D
Manager = {..., I034820, I034822, ...}
Researcher = {..., I044443, ..., I046758, ...}
reports-to = {..., (I044443, I034822), (I046758, I034822), ...} // now report to the other manager
co-operates-with = {..., (I044443, I046758), ...}
firstname = {..., (I034820, "John"), ...}
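One way to picture an intensional relation is as a function from a world to an extension. The sketch below encodes W1 and W2 over a toy domain (only the IDs named in the example are assumed to exist) and checks which relations are rigid, i.e. have the same extension in both. The names `WORLDS`, `intensional` and `is_rigid` are invented for the illustration.

```python
# Two world states over a toy domain (only the IDs named in the example).
W1 = {
    "Manager": {"I034820", "I034822"},
    "Researcher": {"I044443", "I046758"},
    "reports-to": {("I044443", "I034820"), ("I046758", "I034820")},
}
W2 = {
    "Manager": {"I034820", "I034822"},
    "Researcher": {"I044443", "I046758"},
    "reports-to": {("I044443", "I034822"), ("I046758", "I034822")},
}
WORLDS = {"W1": W1, "W2": W2}


def intensional(name):
    """Build the intensional relation for `name`: a function world -> extension."""
    return lambda world: WORLDS[world][name]


def is_rigid(relation, worlds):
    """A relation is rigid if it yields the same extension in every world."""
    extensions = [frozenset(relation(w)) for w in worlds]
    return all(ext == extensions[0] for ext in extensions)


for name in ("Manager", "Researcher", "reports-to"):
    print(name, is_rigid(intensional(name), WORLDS))
```

Under these assumptions Manager and Researcher come out rigid, while reports-to does not, because the two researchers report to a different manager in W2.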
Ontology as Formal, explicit specification¶
Practically, we need a language to refer to the elements of a conceptualisation.
- Symbol or predicate symbol - a representation of a conceptual relation (e.g. cooperates-with).
- Signature/vocabulary - the collection of non-logical symbols in a language.
- Interpretation - the meaning of a symbol to an observer.
- Committing to a conceptualization - when a language accepts a symbol into its vocabulary, it commits to a conceptualization. (When an agent accepts a language, the agent commits to a conceptualization.)
- Problems of interpretation - even if we limit the interpretation domain to a subset of our cognitive domain, there are still many possible interpretation functions that map predicate symbols into proper subsets of the domain of discourse. In other words, a logical signature can be interpreted in many different ways, and we need to admit only those models that are intended according to our conceptualization.
- Axiom or meaning postulate - a constraint on the predicates (symbols) in a logical signature (vocabulary).
To specify what these intended models are, we need to explicitly specify our conceptualization, which typically exists only in people's minds and is therefore implicit.
Ontology - the formal (machine readable) explicit (written down) specification (listed out) of a conceptualization (the intensional relation structure).
- Extensional specification - listing the extensions of every relation for all possible worlds (e.g. listing every possible set of co-operates-with pairs in the 50 000-employee company), which quickly becomes impractical.
- Intensional specification - fix a language for talking about the conceptualization and constrain the interpretations of that language intensionally, through axioms or meaning postulates (e.g. cooperates-with is symmetric, irreflexive and intransitive, while reports-to is asymmetric and intransitive).
Properties that relations have:¶
- Symmetric - if A relates to B, then B relates to A.
- Asymmetric - if A relates to B, then B does not relate to A. It follows that A cannot relate to itself.
- Reflexive - every A relates to itself.
- Irreflexive - no A relates to itself.
- Transitive - if A relates to B and B relates to C, then A relates to C.
- Intransitive - if A relates to B and B relates to C, then A does not relate to C. (This is stronger than merely failing to be transitive.)
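These properties can be checked mechanically when a binary relation is given as a set of pairs. The sketch below uses the standard set-theoretic readings: asymmetric means the reverse pair is never present, and intransitive is read strictly, so a chain A to B to C rules out A to C. The sample relations and the people in them are hypothetical.

```python
def is_symmetric(rel):
    """Every pair (a, b) has its reverse (b, a)."""
    return all((b, a) in rel for (a, b) in rel)


def is_asymmetric(rel):
    """No pair (a, b) has its reverse (b, a); this also rules out (a, a)."""
    return all((b, a) not in rel for (a, b) in rel)


def is_reflexive(rel, domain):
    """Every element of the domain relates to itself."""
    return all((a, a) in rel for a in domain)


def is_irreflexive(rel):
    """No element relates to itself."""
    return all(a != b for (a, b) in rel)


def is_transitive(rel):
    """Whenever a -> b and b -> d hold, a -> d holds too."""
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)


def is_intransitive(rel):
    """Strict reading: whenever a -> b and b -> d hold, a -> d does not."""
    return all((a, d) not in rel for (a, b) in rel for (c, d) in rel if b == c)


# Hypothetical sample data.
reports_to = {("ann", "bob"), ("bob", "cat")}
cooperates_with = {("ann", "bob"), ("bob", "ann")}

print(is_asymmetric(reports_to), is_intransitive(reports_to))        # True True
print(is_symmetric(cooperates_with), is_irreflexive(cooperates_with))  # True True
```

Note that reflexivity needs the domain as an extra argument, because it makes a claim about every element, including those that appear in no pair.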
An ontology is a set of meaning postulates or axioms: a logical theory designed to capture the intended models corresponding to a certain conceptualization and to exclude the unintended ones. The result is an approximate specification of a conceptualization: the better the ontology, the more intended models are captured and the more non-intended models are excluded.
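The idea that axioms narrow down the admissible models can be made concrete on a tiny scale. The sketch below enumerates every possible extension of a single binary predicate over a hypothetical three-element domain and keeps only those that satisfy the axioms named earlier for cooperates-with (symmetric, irreflexive, intransitive in the strict sense). It is a brute-force illustration, not how ontology reasoners work.

```python
from itertools import combinations, product

DOMAIN = ["a", "b", "c"]  # hypothetical three-person domain
PAIRS = list(product(DOMAIN, repeat=2))  # all 9 ordered pairs


def all_interpretations():
    """Yield every possible extension of one binary predicate over DOMAIN."""
    for size in range(len(PAIRS) + 1):
        for subset in combinations(PAIRS, size):
            yield frozenset(subset)


# Axioms (meaning postulates) for cooperates-with.
def symmetric(rel):
    return all((b, a) in rel for (a, b) in rel)


def irreflexive(rel):
    return all(a != b for (a, b) in rel)


def intransitive(rel):
    # Strict reading: a chain a -> b -> d rules out a -> d.
    return all((a, d) not in rel for (a, b) in rel for (c, d) in rel if b == c)


AXIOMS = [symmetric, irreflexive, intransitive]


def admitted_models(axioms):
    """Keep only the interpretations that satisfy every axiom."""
    return [rel for rel in all_interpretations() if all(ax(rel) for ax in axioms)]


print(len(admitted_models([])))                        # 512: no axioms, nothing excluded
print(len(admitted_models([symmetric])))               # 64
print(len(admitted_models([symmetric, irreflexive])))  # 8
print(len(admitted_models(AXIOMS)))                    # 7
```

Each added axiom excludes more interpretations. The seven survivors are still only an approximation of the intended models: the axioms cannot say which particular people actually cooperate.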
Creating a good ontology depends on:
1. The richness of the domain of discourse.
2. The richness of the vocabulary chosen.
3. The axiomatization.
Ontology vs conceptual data models (UML): a UML model is an application-specific, implementation-independent representation of data, whereas an ontology provides an application-independent representation of a subject domain.
Interoperability¶
Ontologies can help with interoperability by providing a common vocabulary and a shared understanding of concepts and relationships within a particular domain. This shared understanding can enable different systems, applications, and users to exchange and interpret data in a consistent and meaningful way.
Ontologies can help in:
- Data integration
- Semantic interoperability
- Decision support
- Knowledge management
Open and closed world assumptions¶
- Closed world assumption (CWA): anything that is not known to be true is assumed to be false. A statement that cannot be derived from the stated facts is therefore treated as false; in other words, the absence of evidence is treated as evidence of absence. Only facts that are explicitly stated (or derivable from them) are considered true.
- Open world assumption (OWA): anything that is not known to be false may be true. A statement that can be shown neither true nor false from the stated facts is treated as unknown; in other words, the absence of evidence is not treated as evidence of absence. Explicitly stated facts (and what follows from them) are considered true, and everything else is unknown rather than false.
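The difference between the two assumptions shows up when a query asks about a fact the knowledge base does not mention. The following sketch uses a hypothetical three-valued answer type and a tiny made-up knowledge base; only "Bob is married to Mary" comes from the earlier example, and the other names are invented.

```python
from enum import Enum


class Truth(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Hypothetical knowledge base: explicitly asserted and explicitly denied facts.
KNOWN_TRUE = {("married-to", "Bob", "Mary")}
KNOWN_FALSE = {("married-to", "Bob", "Alice")}


def ask_closed_world(fact):
    """CWA: whatever is not known to be true is treated as false."""
    return Truth.TRUE if fact in KNOWN_TRUE else Truth.FALSE


def ask_open_world(fact):
    """OWA: a fact is false only if it is known to be false, otherwise unknown."""
    if fact in KNOWN_TRUE:
        return Truth.TRUE
    if fact in KNOWN_FALSE:
        return Truth.FALSE
    return Truth.UNKNOWN


query = ("married-to", "Mary", "John")  # not mentioned anywhere in the knowledge base
print(ask_closed_world(query))  # Truth.FALSE
print(ask_open_world(query))    # Truth.UNKNOWN
```

Both functions agree on stated facts; they differ only on the unmentioned query, which the closed-world version rejects and the open-world version leaves undecided.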