Relations-between-classes based Constraints and Constraint-based Ontology-Completenesses
Dr Philippe A. MARTIN (www.phmartin.info; article .REMOVE_THIS_TEXT. @ .REMOVE_THIS_TEXT. phmartin.info)
This article is a very strong extension of
the one accepted at Quatic 2018
which, for example, did not include
the (content of the) subsections 2.2.2.1, 2.2.2.3, 2.3.1.1, 2.3.2, 3.2.3, 3.2.5, 3.7,
4.1 and 4.2. A preliminary (smaller and older) version of the one for Quatic 2018
has also been published in French in the proceedings of EGC 2018,
with Jun H. JO as co-author.
In the current version of this article, the text is displayed
i) in dark green if it has been added or modified in July 2018,
ii) in violet if it has been added or modified in September 2018,
and iii) in red if it has been added or modified after September
2018.
Abstract.
This article first distinguishes constraints from rules, and
descriptive constraints from prescriptive ones.
Both kinds can be used for calculating constraint-based completenesses
(as opposed to real-world-based completeness),
i.e. evaluating how much of a knowledge base is complete with respect to
some constraints, e.g. for evaluating how well this base follows given
ontology design patterns or best practices.
Such evaluations may also guide knowledge elicitation and modeling.
This article explores the ways constraints can be represented via relations
between classes, hence via any knowledge representation language (KRL) that
has an expressiveness at least equal to RDF or RDFS.
This article also focuses on the checking of such constraints
using SPARQL or slight extensions of it.
Compared to the popular practice of both representing and checking constraints
via queries, this approach is as simple, offers more possibilities for
exploiting both knowledge and constraints, and permits the selection and
use of inference engines adapted to the expressiveness of the exploited
knowledge instead of the use of restricted or ad hoc constraint-validation tools.
This approach is also modular in the sense that it separates content from usage:
the represented “content focused constraints” can then be exploited
via a few “content independent” queries, one for each usage and kind
of constraint.
Keywords:
prescriptive and descriptive constraints, ontology completeness, OWL, SPARQL.
1. Introduction
2. Constraints: Definitions and Representation Approaches
2.1. Definitions
2.1.1. Considered Constraints: Those Not Used For Inferences, Just For Checks
2.1.2. Descriptive vs. Prescriptive (Hence Not Using All Possible Inferences)
2.1.3. Restriction of Some Constraint Parts to Named or Anonymous Individuals
2.2. General Approach for Representing And Exploiting Constraints
2.2.1. Using Constraint Types
2.2.2. Using “Clones Without Inferred Relations” For Bypassing Some Inferences
When Checking The Conclusions Of Prescriptive Constraints
2.2.2.1. Checking Prescriptive Constraints Via the Cloning of At Least KB0\Tbox
When R is Exploited Via Forward Chaining Saturation
2.2.2.2. Checking Prescriptive Constraints Via the Cloning of At Least KB0\Tbox
Using “Clones Without Types”
2.2.2.3. Checking Prescriptive Constraints Via the Creation of Saving Forms for R
and the Creation of a Temporary Copy of (a Relevant Part of) the KB without R
When All Inferences are Performed Via Backward Chaining
2.3. Representing Constraints Via Relations Between Classes
2.3.1. Approaches
2.3.1.1. Comparison With OWL-ER
2.3.2. Proposed Types
2.3.3. Examples
2.3.3.1. Individual-based Constraints
2.3.3.2. SubclassOf-analogous Constraints
2.3.3.3. SubclassOf-based Constraints
3. Exploiting Relations-between-classes Constraints With a SPARQL-like language or With Javascript
3.1. Examples of KB Pre-treatments With SPARQL
3.1.1. Finding Named Individuals and Marking Them As Such
3.1.2. Temporarily Removing Individuals For Checking Constraints on Types
With Classic Description Logic Inference Engines
3.2. Checking Individual-based Constraints With a SPARQL-like language
3.2.1. Checking Individual-based Positive Descriptive Constraints
3.2.2. Checking Individual-based Inconsistency-implying Constraints
3.2.3. Checking Individual-based Positive Prescriptive Constraints
Via the Cloning (By Reification) of At Least KB0\Tbox
When R is Exploited Via Forward Chaining Saturation
3.2.3.1. Checking The Prescriptive Constraints With Already Created Clones
3.2.3.2. Checking The Constraints Via a Dynamic Creation of the Clones
3.2.3.3. KB Pre-treatment Creating the Saving Forms and the Clones
3.2.3.4. Removing Inferred Relations From Clones
3.2.4. Checking Individual-based Positive Prescriptive Constraints
Via the Cloning of At Least KB0\Tbox Using “Clones Without Types”
3.2.4.1. KB Pre-treatment Creating “Clones Without Types”
3.2.5. Checking Individual-based Positive Prescriptive Constraints Via the Creation of
Saving Forms for R and the Creation of a Temporary Copy of the KB without R
When All Inferences are Performed Via Backward Chaining
3.2.5.1. KB Pre-treatment Creating Saving Forms For R In an OWL based KB
3.2.5.2. Checking the Prescriptive Constraints Via the Checking of Conclusions
Within a Temporary Copy of (a Relevant Part of) the KB without R
3.3. Checking SubclassOf-analogous Constraints With a SPARQL-like language
3.4. Checking SubclassOf-based Constraints With a SPARQL-like language
3.5. Checking Binary Relations Instead of Individuals With a SPARQL-like language
3.6. Evaluating a Constraint-based Completeness in a SPARQL-like language
3.7. JavaScript Function for Ontology Completeness Evaluation
4. Illustration of Applications (or Use Cases) and Limits of the Approach
By Showing How To Represent Some Important Ontology Design Patterns
Via Relations-between-classes based Constraints
4.1. Examples of Useful General Descriptive Constraints
4.1.1. Checking or Evaluating the Design of a Subtype Hierarchy
4.1.1.1. Approach and Rationale
4.1.1.2. Implementation of the Subclass ODP with OWL2
4.1.2. Generalization To Check the Use of Any Transitive Relation
4.2. Examples of Useful General Prescriptive Constraints
4.2.1. Checking that Certain Relation Types Are Systematically Used
From Objects of Certain Types
5. Evaluation and Comparisons
6. Conclusion
7. Acknowledgments
8. References
Table of Contents
Knowledge representations (KRs) are formal descriptions
enabling automatic logical inferencing, and thus
automatic KR comparison, search, merge, etc. KRs are logic formulas, e.g.
the binary predicates of 1st-order logic; these predicates are called
triples or property instances in RDF
and binary relations in Conceptual Graphs (CGs) [Sowa, 1992].
For the purpose of clarity, this article uses the
intuitive terminology of CGs:
(information) objects are either types or individuals, and
types are either relation types or
concept types (classes and datatypes in RDF).
A formal knowledge base (KB) is a collection of such objects written using
a KR language (KRL).
An ontology is a KB that is essentially about types,
rather than about individuals.
Creating a KB or evaluating its quality
– for knowledge sharing or exploitation purposes, or
for designing or generating software, or evaluating their qualities –
is difficult. Models and constraints (e.g. design patterns) help with these tasks and
can be stored in an ontology. For example, the author of this article is building an
ontology representing and organizing ontology design patterns as well as
software design patterns; both kinds are organized by the same top-level types.
[Zaveri et al., 2016], a survey on quality assessment for Linked Data,
provides many dimensions and metrics for evaluating the quality of KBs and
hence for helping the selection or design of KBs. One of the quality dimensions is
the (degree of) completeness of a KB, with respect to some criteria or,
more concisely, “its completeness”.
Evaluating such a degree is common in various tasks or fields but
is performed differently by different tools and sometimes
in implicit or ad hoc ways. Examples of such tasks or fields are:
i) the automatic/manual extraction of knowledge or the creation of a KB,
ii) the exploitation of ontology design patterns,
KB design libraries (e.g., the KADS library) or
top-level ontologies (e.g., DOLCE), and
iii) the evaluation of ontologies or, more generally, datasets.
In this third field, as noted in
[Zaveri et al., 2016],
completeness commonly refers to a degree to which the
“information required to satisfy some given criteria or a
given query” are present in the considered dataset.
To complement this very general definition, this article distinguishes
two kinds of completeness: real-world-based completeness and constraint-based completeness.
One way to define or calculate a constraint-based completeness of a KB is
to divide “the number of statements satisfying the constraints in that KB”
by “the total number of statements in the KB”.
When calculating real-world-based completeness, aggregating the
results of completeness oracles or making comparisons with the gold standard
dataset may not be as simple.
As a variant, one may want to consider objects instead of just statements, and
thus measure the percentage of objects for which all relations from/to them
satisfy the constraints.
Other variants may be defined by considering only certain kinds of objects or
statements.
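As an illustration of the two ratios just described, here is a minimal Python sketch. It assumes that the KB is a plain list of (source, relation, destination) triples and that constraint satisfaction is given by a predicate on statements; the function names, the toy KB and the example constraint are hypothetical and are not part of the approach proposed in this article.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

def statement_completeness(kb: Iterable[Triple],
                           satisfies: Callable[[Triple], bool]) -> float:
    """Share of the statements of the KB that satisfy the constraints."""
    statements = list(kb)
    if not statements:
        return 1.0  # assumption of this sketch: an empty KB is vacuously complete
    return sum(1 for t in statements if satisfies(t)) / len(statements)

def object_completeness(kb: Iterable[Triple],
                        satisfies: Callable[[Triple], bool]) -> float:
    """Share of the objects for which all relations from/to them satisfy the constraints."""
    statements = list(kb)
    objects: Set[str] = {s for s, _, _ in statements} | {o for _, _, o in statements}
    if not objects:
        return 1.0
    ok = {obj for obj in objects
          if all(satisfies(t) for t in statements if obj in (t[0], t[2]))}
    return len(ok) / len(objects)

# Toy KB and a hypothetical constraint: a "hasPart" relation must not point to "unknown".
kb = [("car1", "hasPart", "wheel1"),
      ("car1", "hasPart", "unknown"),
      ("wheel1", "type", "Wheel"),
      ("car1", "type", "Car")]

def no_unknown_part(t: Triple) -> bool:
    return not (t[1] == "hasPart" and t[2] == "unknown")

stmt_ratio = statement_completeness(kb, no_unknown_part)  # 3 of the 4 statements
obj_ratio = object_completeness(kb, no_unknown_part)      # 3 of the 5 objects
```

Restricting the ratio to certain kinds of objects or statements only requires filtering the list before calling these functions.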
Defining constraints via KRs in the KB, instead of via queries, permits the
definition of “content-independent queries” to exploit these constraints.
Otherwise, a different (content-dependent) query
has to be created for each variant of constraint-based checking or completeness.
Because of this lack of modularity, when stored in an ontology, content-dependent
queries are also less easily organized than content-independent ones.
This article does not address real-world based completeness but the
techniques this article proposes for representing and exploiting constraints
may also be used for representing certain domain-specific parts of the rules
used for calculating real-world based completeness.
From now on, unless otherwise explicitly specified, “completeness”
refers to constraint-based completeness.
Section 2.1 explores the first research question
of this article:
what does the expression “must and must not be represented in the
dataset” entail or,
more precisely, given the “descriptive vs. prescriptive” distinction,
what kinds of constraints need to be considered for evaluating
constraint-based completeness
via content-independent queries?
Section 2.3 proposes a
general approach to answer the second research question of this article:
how to represent constraints in a KRL independent way
– or, more precisely, in any KRL that has an expressiveness at least
equal to RDF or
RDFS –
even though actually defining the semantics of some of these constraints
would require much more expressive logics?
The proposed solution relies
i) on the representation of constraints via restricted constructs based
on relations between classes (or to classes), e.g. rules using the
rdfs:subClassOf
relation for representing a restricted
implication between the condition and conclusion of the rule,
ii) on the use of certain special types for specifying that a statement
is a constraint of a certain kind, and
iii) on the exploitation of these types in content-independent queries.
Such constraint representations can then be exploited via most inference engines
and KR query languages instead of tools tied to a particular KRL or goal
(knowledge acquisition, ontology design pattern application,
ontology evaluation, etc.).
Sections 3.1 to 3.6 show commands (queries or update requests) that may be used
as an answer to a third research question: how to implement the above cited
general approach in a SPARQL-like language:
SPARQL or slight extensions of it.
A restriction associated with such a query language is that the relations must be binary.
Section 3.7 shows that using a
procedural language instead of a query language permits the implementation of the
proposed general approach in a simpler and more flexible way.
Section 4 illustrates applications or use cases
for the given content-independent queries.
Like Section 3, it also shows how some
– but not all – expressiveness limits of subclassOf-based constraints
can be overcome by pre-treating the KB, e.g. via SPARQL.
Section 5 evaluates the proposed approach
and compares it to other ones.
In this article, as in
KIF (Knowledge Interchange Format) [Genesereth & Fikes, 1992],
a rule is a statement that can be represented
– directly or via a loss-less translation –
in the form “ A rule allowing the derivation of a non-modal statement is a rule that
can be represented in the form “ [Chein & Mugnier, 2008]
defines constraints as positive or negative, respectively
expressing statements of the form
“if A, B must be true” and
“if A, B must be false”.
Thus, the authors define constraints as rules where the conclusion has
a “must” modality.
These are the kinds of constraints considered in this article,
with the interpretation that
in such constraints the “must” entails that the constraints
can only be used for checking statements, not deriving statements.
Thus, constraints are not rules allowing the derivation of non-modal statements.
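The difference between the two readings of a same “if A then B” statement can be sketched as follows in Python. This is only an illustration under simplifying assumptions: facts are (predicate, individual) pairs, and the predicate and function names are hypothetical.

```python
from typing import List, Set, Tuple

Fact = Tuple[str, str]  # (predicate, individual), e.g. ("Person", "ann")

def apply_as_rule(facts: Set[Fact], cond: str, concl: str) -> Set[Fact]:
    """Rule reading of 'if cond then concl': derive the missing conclusions."""
    return facts | {(concl, x) for (p, x) in facts if p == cond}

def check_as_constraint(facts: Set[Fact], cond: str, concl: str) -> List[str]:
    """Constraint reading of 'if cond, concl must be true': report, never derive."""
    return sorted(x for (p, x) in facts if p == cond and (concl, x) not in facts)

facts = {("Person", "ann"), ("HasBirthDate", "ann"), ("Person", "bob")}

violations = check_as_constraint(facts, "Person", "HasBirthDate")  # bob is reported
saturated = apply_as_rule(facts, "Person", "HasBirthDate")         # bob gets a derived fact
```

Once the statement has been used as a rule, the same check no longer reports anything: this is why a statement meant as a constraint must not also be used for derivations.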
More formally, this means that such positive or negative constraints can
respectively be translated into the forms
“ In this article, constraints that are directly represented in a form
ending by “
In the research literature on constraints, these are generally not
represented – or checked – via modal logic based KRLs but
rather using queries, e.g. via SPARQL or the
nonmonotonic-epistemic-logic query language EQL-Lite
[Calvanese et al.].
In (unidirectional) rule based systems, rules with empty conclusions
(or “ As noted in [Assmann & Wagner, 2006],
one common distinction between engineering models
is whether they are
i) descriptive of some reality, like most ontologies are, or
ii) prescriptive of what must be in the considered dataset, as with
system specifications, meta-models, XML schemas, database schemas,
SHACL, etc.
Similarly, this article distinguishes two kinds of
constraints: descriptive constraints and prescriptive ones.
As detailed at the end of this subsection, the notion of
prescriptive constraints as defined in this section is original.
Before seeing precise definitions, let us first see different purposes
of these two kinds of constraints, as well as general (purpose-derived) definitions and
examples for these constraints.
In the approach proposed by this article, constraints have types
or use particular relation types. These types permit the distinction between
the different kinds of constraints and their distinction from non-modal rules:
the “must” modalities are represented via these types.
These distinctions can thus be exploited by content-independent queries,
inference engines or special procedures.
A constraint in inconsistency-implying form (e.g.,
“ When trying to match an object (in the dataset) to the
condition of a constraint, in order to avoid missing a match
i) the expressiveness used for the inference regime must be at least equal to the
language expressiveness that was necessary for describing
the object, and thus
ii) there is no reason to restrict the exploitation of
certain inference supporting statements.
This is also true when trying to match an object to the conclusion of a
descriptive constraint.
However, when trying to match an object O to the conclusion C of a prescriptive
constraint, a certain set of inference supporting statements should not be used
since a prescriptive constraint is meant to ensure that certain
uses of certain formal terms
– i.e. the use of these terms in certain relations –
are not due to inferences. In this article, this set is called R. It is the set of
all inference supporting statements that could automatically associate relations to
an object – and thus, in the above generic case, to the object O –
except for type relations (and subtype relations if the object is a type).
There are several reasons for that exception. In the following arguments, the expression
“type relations” also refers to subtype relations and thus
“non-type relations” also exclude subtype relations.
One pragmatic reason for the exception is that, for matching O to C, it is necessary to
exploit type relations from O as well as from all the objects that are destinations of
non-type relations from O. Fortunately, this is not the sole reason: there is also no
point for the conclusion of a prescriptive constraint to check if O is of a certain type.
Indeed, this does not help ensure that non-inferred genuine knowledge is
associated to O since
i) without their definitions involving other relations than type relations,
types would not constrain any use for their instances and, in that sense,
would not have any genuine meaning, and
ii) exploiting these definitions for checking whether O has a certain relation
would be checking if O has a certain inferred relation, something which a
prescriptive constraint is not meant to support.
This last point is also a reason of its own for the above cited exception.
For a definition of what prescriptive constraints mean with respect to how they are checked,
first note that inferencing via forward chaining saturation statically adds relations
to objects. On the other hand, with backward chaining, inferred relations are temporarily
or virtually added or associated to objects.
Second, consider KB0, the version of the KB before
the exploitation of R by forward chaining saturation adds the inferred relations.
Then, whether or not inferencing involves forward chaining saturation, testing if O
matches C is trying to find if O0 permits the derivation of C via
inference supporting statements in “KB0 minus R” (KB0\R),
i.e. testing if “O0 (KB0\R) ⊢ C”.
In OWL, only a class or an instance of a class can have an
automatically associated non-type relation – with “non-type” still
also excluding subtype. E.g., a relation type cannot have such an associated relation;
associating a property chain to a relation type is no exception since this association
is made via a subtype relation.
Thus, in OWL, the automatic association of a non-type relation to an object
is only supported by class definitions via property restrictions, i.e. instances of
When the system can exploit two KBs at the same time.
In this case, one KB (KB0') can be initialized with KB0
and the other (KB0\R') with KB0\R, although in both KBs each existentially quantified
anonymous object must be given a generated unique identifier – so that
from each object in KB0\R', its possibly extended counterpart in KB0' can be retrieved.
Then, for each prescriptive constraint C and object O in KB0, C can be checked
if the system can
i) test if O in KB0' matches the condition of C (in KB0'), and then, if needed,
ii) test if O in KB0\R' matches the conclusion of C (in KB0\R').
This method works whether inferences involve
forward chaining saturation or just backward chaining.
If the constraint checking system is implemented in an ad hoc procedural way,
via calls to a function testing if an object matches another one in a given KB,
specifying and connecting the two above cited tests is not difficult.
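A procedural version of these two connected tests may look like the following Python sketch. It relies on strong simplifying assumptions that are not those of a real inference engine: the KB is a set of triples, R is kept outside the KB as a list of hypothetical (class, relation, value) defaults, so that KB0\R is simply the set of triples, and matching is reduced to testing the presence of one relation.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Default = Tuple[str, str, str]  # (class, relation, value): toy stand-in for a statement of R

def saturate(kb0: Set[Triple], r: List[Default]) -> Set[Triple]:
    """Forward chaining with R: typed objects receive the relations R associates to them."""
    kb = set(kb0)
    for cls, rel, val in r:
        kb |= {(s, rel, val) for s, p, o in kb0 if p == "type" and o == cls}
    return kb

def has(kb: Set[Triple], s: str, rel: str, val: Optional[str] = None) -> bool:
    """True if s has a relation rel (to val, or to anything when val is None) in kb."""
    return any(s2 == s and p == rel and (val is None or o == val) for s2, p, o in kb)

def violations(kb0, r, cond, concl, prescriptive=True):
    """Objects matching the condition (tested with R) but not the conclusion."""
    with_r = saturate(kb0, r)                   # plays the role of KB0'
    concl_kb = kb0 if prescriptive else with_r  # KB0\R' for prescriptive constraints
    subjects = {s for s, _, _ in kb0}
    return sorted(s for s in subjects
                  if has(with_r, s, *cond) and not has(concl_kb, s, *concl))

kb0 = {("ann", "type", "Person"), ("ann", "hasParent", "carl"), ("bob", "type", "Person")}
r = [("Person", "hasParent", "some_person")]
cond, concl = ("type", "Person"), ("hasParent", None)

prescriptive_violations = violations(kb0, r, cond, concl)
descriptive_violations = violations(kb0, r, cond, concl, prescriptive=False)
```

With the descriptive reading, the relation inferred for bob satisfies the conclusion; with the prescriptive reading, it does not, since only relations present without R are considered.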
In the current standard versions of SPARQL, using “inline data” (and
hence the When the system can exploit only one KB but, within this KB,
at least KB0\Tbox can be duplicated in a way that does not trigger R (or at least not
in a problematic way).
The idea here is to distinguish between inferred statements and those in KB0. To do so,
a first way is that each inferred object or statement is automatically marked as such,
e.g. via a special type or relation associated to it or to at least one of its parts.
A second way is that, when initializing the KB with KB0, each object in KB0\Tbox
– or the whole KB0 for simplicity purposes – is duplicated into
a “saving form” that later permits the regeneration of the object as it
was in KB0. One such form is a reification of the object, with a
special type or special relations for
i) distinguishing this “reification for duplication purposes” from
other reifications,
ii) distinguishing its initial relations from
those that might be added to it via inferences, and
iii) automatically discarding this saving form from the results of a query
when it appears in such results.
With both ways, for each prescriptive constraint C and object O that is not a duplication,
C can be checked in two steps. The first step is to test if O matches the
condition of C. The second step is to test the conclusion, if still needed.
With the first way, this second step consists in performing
this test without involving the marked inferred objects or statements.
Since in the general case a general purpose inference engine cannot be
parameterized to support this first way, it is not explored in this article.
With the second way, i.e. the saving form based way, the conclusion matching test is
performed not with the current O but with a complete or partial copy of O as it was in KB0.
In this article, this copy is called a
“clone without inferred relations” or simply a clone. This clone is not
necessarily a complete copy of the initial O. It is only a partial copy when
i) this is not detrimental for the conclusion matching test, and
ii) this permits the clone to be directly the “saving form” instead of
being regenerated from this form.
In practice, “initializing the KB” is simply opening or importing the
file containing KB. Creating the saving form can be performed by pre-treating the KB,
i.e. modifying the KB as soon as it is initialized and before any potential forward
chaining saturation process is triggered. KB treatments or pre-treatments may for example
be performed via SPARQL update requests. Regenerating clones from saving forms may be
performed statically, i.e. via a treatment of the KB before the checking of
prescriptive constraints. It may also be performed dynamically, i.e. by temporarily
generating certain clones just before checking the conclusion of certain constraints.
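The saving form based way can be sketched as follows in Python, assuming a KB reduced to a set of triples. The “sf:” names used for the reification are hypothetical, and the forward chaining saturation is only simulated by adding one triple by hand.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]
SAVED = "sf:SavedStatement"  # hypothetical type marking a "reification for duplication purposes"

def add_saving_forms(kb0: Set[Triple]) -> Set[Triple]:
    """Pre-treatment: reify each initial statement before any saturation is triggered."""
    kb = set(kb0)
    for i, (s, p, o) in enumerate(sorted(kb0)):
        node = f"sf:stmt{i}"
        kb |= {(node, "type", SAVED), (node, "sf:subject", s),
               (node, "sf:predicate", p), (node, "sf:object", o)}
    return kb

def clone_of(kb: Set[Triple], obj: str) -> Set[Triple]:
    """Regenerate the relations that obj had in KB0 (its clone without inferred relations)."""
    nodes = {s for s, p, o in kb
             if p == "sf:subject" and o == obj and (s, "type", SAVED) in kb}
    clone = set()
    for n in nodes:
        pred = next(o for s, p, o in kb if s == n and p == "sf:predicate")
        val = next(o for s, p, o in kb if s == n and p == "sf:object")
        clone.add((obj, pred, val))
    return clone

def without_saving_forms(kb: Set[Triple]) -> Set[Triple]:
    """Discard the saving forms from a result set."""
    nodes = {s for s, p, o in kb if p == "type" and o == SAVED}
    return {t for t in kb if t[0] not in nodes}

kb0 = {("ann", "type", "Person"), ("ann", "knows", "bob")}
kb = add_saving_forms(kb0)
kb.add(("ann", "hasParent", "some_person"))  # simulated result of saturation

clone = clone_of(kb, "ann")
visible = without_saving_forms(kb)
```

In this sketch the clone is regenerated dynamically and returned as a separate set of relations; a static regeneration would instead store the clones in the KB under their own identifiers.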
When the system can exploit only one KB but all inferences
are performed via backward chaining and
either i) R can be temporarily disabled or removed, or
ii) KB0\R can be temporarily duplicated.
With backward chaining, inferred relations are only temporarily or
virtually added or associated to objects.
Thus, if all inferences are performed via backward chaining, KB0 is not statically
modified and, more importantly, testing whether or not an
object O matches the conclusion of a prescriptive constraint C can be performed by
i) temporarily removing R or disabling the use of R, just for the time of
the test, or ii) testing the match in a temporary copy of the KB where R is absent.
Since in the general case a general purpose inference engine does not support
specifications for temporarily disabling the use of certain inference supporting
statements, this method is not explored in this article.
The two other cited ways first involve duplicating R into saving forms
– before each conclusion matching test or, once and for all,
when the KB is initialized.
Then, just before testing if an object O matches the conclusion, either
i) R is removed before the test and restored from the saving forms after the test, or
ii) a temporary copy of (a relevant part of) the KB except for R is created and the
test is performed within this copy. In this last described approach,
“relevant” means “all the objects that can be reached from O via a
sequence of relations” instead of all the KB.
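The creation of such a temporary copy can be sketched as a simple graph traversal. The Python code below assumes that both the KB and R are sets of triples and that the statements of R are stored in the KB like any other statement; the example names are hypothetical.

```python
from collections import deque
from typing import Set, Tuple

Triple = Tuple[str, str, str]

def relevant_copy(kb: Set[Triple], r: Set[Triple], start: str) -> Set[Triple]:
    """Temporary copy of the relations reachable from start, leaving out the statements of R."""
    usable = [t for t in kb if t not in r]
    copy: Set[Triple] = set()
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in usable:
            if s == node:
                copy.add((s, p, o))
                if o not in seen:  # follow each destination only once
                    seen.add(o)
                    queue.append(o)
    return copy

kb = {("car1", "type", "Car"),
      ("car1", "hasPart", "w1"),
      ("w1", "type", "Wheel"),
      ("Car", "subClassOf", "restriction1"),  # hypothetical inference supporting statement
      ("bike1", "type", "Bike")}
r = {("Car", "subClassOf", "restriction1")}

copy_for_car1 = relevant_copy(kb, r, "car1")
```

The conclusion matching test would then be run against `copy_for_car1` only, which contains neither R nor the objects that cannot be reached from the tested object.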
Section 2.2.2
proposes three methods based on clones and/or saving forms: two methods based on the
last described approach in the above second point and one method based on the last
described approach in the third point.
These methods exploit a general purpose inference engine and a SPARQL-like language.
Each method has different limitations.
The notion of prescriptive constraints as defined above is original:
to our knowledge, in other works on constraints, constraints apply to
statements of the KB whether they have been inferred or originally provided.
Since these works do not provide ways to distinguish between
those two kinds of statements, the constraints they handle are
by default descriptive.
In other words, since these works do not provide a mechanism for temporarily disabling
– or ignoring the results of – some inferences only when
checking the conclusions of the constraints, the only way to handle
these constraints as prescriptive constraints (in the sense defined above) is to
disable – or ignore the results of – some inferences not only during
the checking of their conclusions
but also during the checking of their conditions. Thus, some inferences may be
missing for a complete checking of the conditions.
A constraint, whether descriptive or prescriptive, is
intended to check the existence (or absence of existence)
of certain objects in a KB. To that end,
constraint checking needs to exploit some form of closed world assumption.
When SPARQL is used for checking constraints, the
closed world assumption related parts can be represented via the
operators
It is sometimes interesting to create constraints allowing certain
relations from individuals to be inferred only if these individuals
are named, i.e. only if they are not anonymous individuals (hence, in the RDF
terminology, not “blank nodes”).
E.g., assume that a user wants to ensure that each instance of As long as named individuals are represented in the
constraints the same way they are in the rest of the KB, any way can be used and no special
content-independent query has to be introduced. One way is to type named individuals with
[Tao et al., 2010] shows that representing and checking certain kinds
of integrity constraints that exploit some forms of the Unique Name Assumption
or Closed World Assumption can be performed via SPARQL queries.
Instead, as explained in the introduction, the goal is here to enable
the representation of constraints that
i) can be exploited via content-independent queries,
ii) can be represented via any KRL that has an expressiveness at least
equal to RDF or RDFS, and
iii) can be marked as descriptive or prescriptive
(this distinction is not made in [Tao et al., 2010]).
To that end, the proposed approach is to introduce a few types for constraints.
By setting instanceOf or subtypeOf relations from certain KRs to some of
those types, KB authors can state that these KRs are constraints and
can indicate which kind of constraints.
Thus, these constraints can be exploited by content-independent queries
or inference engines that understand the used constraint types.
This means that these engines know that these types represent particular
“must” modalities and know how to handle such modalities.
This approach is similar to the use of
OWL types in
RDF statements and their exploitation by OWL-aware inference engines.
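To illustrate how typing KRs as constraints permits content-independent exploitation, here is a small Python sketch. The “ex:” type and relation names are hypothetical placeholders, not the actual terms of the CSTR ontology, and a constraint is reduced to a condition class and a conclusion class.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

def constraints_of_kind(kb: Set[Triple], kind: str) -> List[Tuple[str, str, str]]:
    """Content-independent selection of the KRs declared as constraints of a given kind."""
    found = []
    for s, p, o in kb:
        if p == "type" and o == kind:
            cond = next(v for s2, p2, v in kb if s2 == s and p2 == "ex:condition")
            concl = next(v for s2, p2, v in kb if s2 == s and p2 == "ex:conclusion")
            found.append((s, cond, concl))
    return sorted(found)

def violating_instances(kb: Set[Triple], kind: str) -> Dict[str, List[str]]:
    """For each constraint of that kind: instances of its condition class lacking its conclusion class."""
    types: Dict[str, Set[str]] = {}
    for s, p, o in kb:
        if p == "type":
            types.setdefault(s, set()).add(o)
    return {name: sorted(x for x, ts in types.items() if cond in ts and concl not in ts)
            for name, cond, concl in constraints_of_kind(kb, kind)}

kb = {("c1", "type", "ex:DescriptiveConstraint"),
      ("c1", "ex:condition", "Person"),
      ("c1", "ex:conclusion", "Agent"),
      ("ann", "type", "Person"), ("ann", "type", "Agent"),
      ("bob", "type", "Person")}

report = violating_instances(kb, "ex:DescriptiveConstraint")
```

The two functions never mention Person or Agent: the content of the constraint stays in the KB, and the same code serves any constraint typed the same way.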
The name of the proposed ontology of constraint types is CSTR.
In this ontology,
For the reasons given in Section 2.1.2,
this section proposes three clone based methods
supporting the checking of prescriptive constraints for when the system can exploit
only one KB at any given time.
The methods of the next three subsections exploit a general purpose
inference engine and a SPARQL-like language.
They are each dependent on different features of the used system but
they do not rely on a particular logic, KRL, inference engine or tool.
In other words, these methods are KRL independent and tool independent in
the sense that they can be used with any KRL and any tool satisfying the listed features.
Thus, depending on the domain and application, different inference engines can
be reused to check or evaluate ontology completeness.
This method has two or three steps.
First, just after the KB has been initialized with KB0, i.e. when the file containing KB0
has been imported, each object of at least KB0\Tbox is duplicated into a saving form,
e.g. by reification. For simplicity purposes, the whole KB may be duplicated this way.
The clones may also be created at that time
(Section 3.2.3.3
gives a SPARQL command to perform such a step).
If so, the second step is to exploit the saving forms for removing inferred relations
from the clones (cf.
Section 3.2.3.4)
and the third step is to perform the checking of the prescriptive constraints
(for each object O matching the condition of a constraint C, testing the matching
of the conclusion should be done with the clone of O, not O;
cf. Section 3.2.3.1).
If the clones are not created along with the saving forms, the second step can be
the generation of these clones from these forms, and the third step is the constraint
checking.
A third option is to merge the second and third steps, i.e. to dynamically
generate each needed clone (based on the saving forms) just before testing the match of a
constraint conclusion
(cf. Section 3.2.3.2).
This method relies on the following conditions.
This method differs from the previous one in two ways.
First, each saving form is directly a clone. It has the same form as the
original object but does not have its type relations
(and, if it is a named individual, it has an identifier different from the original object).
In this article, each of such clones is called a
clone without type. Indeed, this method assumes that R is such that all inferences
which automatically associate relations to an object do so based on its type.
Given this first difference, the second one is that this method can be used
whether or not the used inference engine performs forward chaining saturation.
If the engine only uses backward chaining, i.e. if no inferred relation has been statically
added to objects in the KB via forward chaining saturation, it is possible to dynamically
create a clone without type for each object before testing if this clone matches the
conclusion of a prescriptive constraint.
However, the alternative of using a KB pre-treatment for cloning all the objects once and
for all, when the KB is initialized, is more efficient and works regardless of whether or not
forward chaining saturation is involved. This is why no LDScript query
for the dynamic generation of clones is proposed in
Section 3.2.4.
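The pre-treatment creating “clones without types” can be sketched as follows in Python. This is a toy illustration under stated assumptions: the KB is a set of triples, the “ex:clone” relation and the “_clone” suffix are hypothetical, the destinations of the copied relations are not cloned, and R is simulated by a single type-based default.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

def add_clones_without_types(kb0: Set[Triple], suffix: str = "_clone") -> Set[Triple]:
    """Pre-treatment: give each source object a clone that has its non-type relations only."""
    kb = set(kb0)
    for s, p, o in kb0:
        kb.add((s, "ex:clone", s + suffix))  # link used to retrieve the clone later
        if p != "type":
            kb.add((s + suffix, p, o))
    return kb

def type_based_saturation(kb: Set[Triple], cls: str, rel: str, val: str) -> Set[Triple]:
    """Stand-in for R: every object typed cls receives the relation rel to val."""
    return kb | {(s, rel, val) for s, p, o in kb if p == "type" and o == cls}

def clone_has(kb: Set[Triple], obj: str, rel: str) -> bool:
    """Conclusion test performed on the clone of obj instead of obj itself."""
    clones = {o for s, p, o in kb if s == obj and p == "ex:clone"}
    return any(s in clones and p == rel for s, p, _ in kb)

kb0 = {("ann", "type", "Person"), ("ann", "hasParent", "carl"), ("bob", "type", "Person")}
kb = type_based_saturation(add_clones_without_types(kb0),
                           "Person", "hasParent", "some_person")

ann_ok = clone_has(kb, "ann", "hasParent")  # relation present in KB0, copied to the clone
bob_ok = clone_has(kb, "bob", "hasParent")  # relation only inferred, hence absent from the clone
```

Since the clones have no type, the type-based inference never reaches them, whether it is applied before or after the test.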
Section 3.2.4.1
gives a SPARQL
This method relies on the following conditions. The last three are
the counterparts of the last three conditions given for the previous method.
Section 2.1.2 summarized
the two steps of this method. Here are more details.
The first step is to duplicate R into saving forms.
This may be done once and for all, via a KB pre-treatment.
A less efficient way is to save R just before testing if an object matches the
conclusion of a prescriptive constraint.
The saving can for example be performed by reification, as in the first method.
Section 3.2.5.1
provides a SPARQL query implementing the KB pre-treatment for an OWL based KB.
The second step is to check the prescriptive constraints, in the following way:
just before each test of whether an object matches a conclusion,
create a temporary copy of the relations that can be directly or indirectly reached
from the object, except for the relations in R, and perform the test within this copy.
Section 3.2.5.2
provides an LDScript query implementing this second step.
This method relies on the following conditions.
They are the counterparts of the last three conditions given for the previous method.
One way to represent and exploit (simple) rules
in a KRL that
has an expressiveness at least equal to
RDF or RDFS is to
use an In other words, using subclassOf-based constraints when inferences based on
subclassOf relations then have to be ignored is generally not relevant.
However, the idea of using classes for representing the conditions and conclusion
of a constraint without using variables is interesting.
Here are the two simplest ways to do so. A disadvantage of any solution using relations from/to classes when these
relations are not subclassOf ones is that the result requires a KRL with
an expressivity at least equal to RDF. For OWL-based representations, this means interpreting them with the
RDF-Based Semantics, not the
OWL Direct Semantics.
Since each of the three approaches has its advantages and drawbacks,
this article proposes types and commands for the three approaches:
Section 2.3.2 gives the types,
Section 2.3.3 gives examples
and Section 3.1 gives the requests.
SubclassOf-based constraints could be generalized using a subtypeOf relation
instead of a subclassOf one.
Conforming to the terminology given in the first paragraph of the introduction,
this would permit the use of relation types and datatypes, hence not just classes,
in the condition and conclusion of a constraint.
However, checking objects such as relations and instances of datatypes can
most often be done via constraints on class instances related to these objects.
For example, relations can generally be checked via their sources or
destinations, or via the assertions that include these relations, i.e.
instances of classes such as
[Baget et al., 2015]
shows that a sublanguage of OWL2 called OWL2-ER using subclassOf-based rules can
represent many kinds of “existential rules”
(hence the suffix “-ER”),
i.e. rules in which the conclusion (or head) can include
anonymous existentially quantified objects. A sublanguage of OWL2-ER is
OWL2-RL [OWL 2 profiles] which
can be translated into Datalog (definite Horn rules with equality), a purely
declarative subset of Prolog. Similarly, OWL2-ER can be translated into
– and hence represented by –
Datalog+ [Baget et al.,
2015] (the extension of Datalog to existential rules) and then into
RuleML.
Thus, not all OWL2 class expressions are allowed in OWL2-ER: rules are
restricted to the logical form “∀X ∀Y
(B(X,Y) => ∃Z H(X,Z))”, where
“X, Y and Z are sets of variables, B (the body) is a conjunction
with variables in X and Y, and H (the head) is a conjunction with
variables in X and Z; hence Z denotes the set of variables that
occur in H but not in B” [Baget et al., 2015].
Given its restrictions with respect to OWL2, OWL2-ER is roughly the part of
Datalog+ that can be expressed using only OWL2, hence with binary
relations only and without the syntactic possibility of directly and
explicitly using variables for relating objects shared by both
the condition and conclusion of a rule.
In OWL2-ER, negative constraints can be represented in two ways:
i) by using a class expression equivalent to the type
Since the proposed approach is purposely KRL independent and not restricted to
subclassOf-based constraints (and, a fortiori, to OWL2-ER statements),
more formal details on OWL2-ER – or on what representing a rule
using
Figure 1 uses a concise version of
UML
to show subclassOf relations between some classes of CSTR.
Figure 2 uses the FL notation
– a notation somewhat similar to Turtle but more expressive and –
to show subtypeOf relations between main types of CSTR.
This figure shows how many of the first-order classes are instances of
some second-order classes that are similarly organized and somewhat
similarly named. E.g., to state that a class (class expression or named class)
is the condition of a subclassOf-based constraint one can mark it as subtype of
In its constraint examples, this article only uses 1st-order classes.
Indeed, most inference engines for description logics cannot handle
a KB that defines 2nd-order types, i.e. a KB that exploits 2nd-order types
not predefined in those description logics.
As explained in the second paragraph of
Section 2.3.1, the two other proposed
representation approaches use other relations to classes and
hence cannot be handled by the above-cited inference engines.
Because of this limitation of these engines,
to use them for checking constraints on class definitions, it is necessary
to pre-treat the KB by removing the individuals.
In such a case, 1st order classes become individuals and their subclass
relations must be removed too.
Section 3.1 shows the basis of
how to do that in SPARQL.
Conversely, if such engines must be used to check constraints on individuals
the 2nd-order types must first be removed.
Legend.
The class names with no prefix are from the CSTR ontology.
Legend. In this article, except in Figure 1 and Figure 2, the
Turtle notation is
used when SPARQL is not used since SPARQL reuses this notation for
representing relations. For clarity purposes, the names of relation types
have a lowercase initial while other names have an uppercase initial.
In SPARQL, Turtle or other graph-based notations, a statement of the form
“ SourceConcept
relationName1 DestConcept1ForRelName1, DestConcept2ForRelName1;
relationName2 DestConcept1ForRelName2,
DestConcept2ForRelName2 . ”
can for example be read “SourceConcept has for
relationName1 DestConcept1ForRelName1
as well as DestConcept2ForRelName1,
and has for relationName2 DestConcept1ForRelName2,
and has for relationName2 DestConcept2ForRelName2”.
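The reading rule above can be illustrated with a minimal Python sketch. The helper name `expand` is hypothetical and only mirrors the abbreviation rules: “;” separates relation names for a same source, and “,” separates destinations for a same relation name. The sketch shows that the abbreviated statement stands for four separate relations.

```python
def expand(source, predicate_objects):
    """Expand a Turtle-style abbreviated statement into explicit triples.

    predicate_objects is a list of (relation name, [destinations]) pairs:
    ';' separates the pairs and ',' separates the destinations of one relation.
    """
    return [(source, relation, destination)
            for relation, destinations in predicate_objects
            for destination in destinations]


triples = expand("SourceConcept", [
    ("relationName1", ["DestConcept1ForRelName1", "DestConcept2ForRelName1"]),
    ("relationName2", ["DestConcept1ForRelName2", "DestConcept2ForRelName2"]),
])
for triple in triples:
    print(triple)  # one line per explicit relation
```

Each printed tuple corresponds to one “has for” clause of the informal reading.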
Here is the positive prescriptive constraint
“if x is a Person, x must have a parent
in the base of facts (prescriptive-must)”
represented as an individual-based constraint using the CSTR ontology.
For the general descriptive version of this
constraint, it is sufficient to replace
“prescriptive” by “descriptive” in the previous
representation.
Here is a version where parents are restricted to be named individuals.
The negative constraint
“if x is a Person_without_parent, x must not have a parent”
may be translated into the inconsistency-implying form
“
As referred to in the description of the “clones without types” based method
in Section 2.1.2,
here are examples of how to write inconsistency-implying constraints that are equivalent
to the use of
For the positive prescriptive constraint
“if x is a Person, x must have a parent
in the base of facts (prescriptive-must)”,
using the CSTR ontology:
For the general descriptive version of this constraint,
it is sufficient to replace “prescriptive”
by “descriptive” in the previous representation.
Here is a version where parents are restricted to be named individuals.
Here is the negative constraint
“x is a Person_without_parent =>> false” represented
as a subclassOf-analogous constraint using the CSTR ontology.
Here are the subclassOf-analogous versions of the constraints
“a
For the positive prescriptive constraint
“if x is a Person, x must have a parent
in the base of facts (prescriptive-must)”,
using the CSTR ontology:
For the general descriptive version of this constraint,
it is sufficient to replace “prescriptive”
by “descriptive” in the previous representation.
However, as explained in
Section 2.3.1, such descriptive versions require using an
inference engine that takes into account the special meaning of
Using the CSTR ontology, the negative constraints
of the previous subsection are identical to their subclassOf-analogous versions
except for the fact that
2. Constraints: Definitions and Representation Approaches
2.1. Definitions and General Approaches
2.1.1. Considered Constraints: Those Not Used For Inferences, Just For Checks
X =>> Y
” where “=>>
” is a
restricted version of the logical implication ( “X => Y
”):
it only supports modus ponens, not modus tollens.
An inference supporting statement is either a rule or a
statement that can be represented in the form “X => Y
”.
X =>> Y
”
where Y does not include a modality (e.g., must).
An example is “if x is a Person then x has a parent”.
If this statement and “Tom is a Person” are in a KB,
an inference engine can derive the non-modal statement
“Tom has a parent”.
A ∧ ¬B =>> false
” and
“A ∧ B =>> false
” where
A and B do not contain a “must” modality and A may be empty.
As an example, consider the positive constraint
“if x is a Person, x must have a parent”.
From this constraint and the fact “Tom is a Person”,
an inference engine must not derive “Tom has a parent”.
It may derive “Tom must have a parent” but, in practice,
such derivation is not performed.
As a somewhat opposite example,
RDFS-aware
engines do not exploit relations of type rdfs:domain
or
rdfs:range
as relation signature constraints but as
inference supporting statements: when a relation r
has a type partially defined with an rdfs:domain
(vs. rdfs:range
) relation, RDFS-aware engines
infer a type for the source (vs. destination) of r.
=>> false
”
– or, equivalently, “=>> ⊥
” –
are called constraints in inconsistency-implying form.
Not all KRLs allow the representation of rules (instead of
– or in addition to – implications);
in those that do, representing negative constraints using the
inconsistency-implying form is easy but using this form for representing
positive constraints may not be possible: the KRL may not permit the
representation of the negation in the “¬B
” part.
This is why in this article
i) negative constraints are represented in inconsistency-implying form, and
ii) positive constraints are in the form
“A =>> B
” but have a
type that distinguishes them from actual rules.
Furthermore, as in most rule-based systems, in the rest of this article
the A
and B
parts share variables. More precisely,
these parts are
representations of relations from a same object (i.e. from a type or an
individual, including a relation or a more complex statement since they
are particular kinds of individuals). Thus, checking
if a positive constraint is violated by an object in a dataset is checking
if the object matches – that is, specializes –
the condition of the constraint but not its conclusion. In other words, both
the condition and conclusion of a positive constraint should be logically
derivable from the object, possibly using various inference supporting
statements from the KB.
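The violation test described above can be sketched in Python as follows. This is only an illustration: the toy data, the class name `Thing_with_a_parent` and the function name `violators` are assumptions, and the mapping `types_of` stands for the result of whatever inferences the KB supports.

```python
# Hypothetical toy KB: each object is mapped to the set of classes it matches,
# i.e. the classes it has for type once the available inferences were applied.
types_of = {
    "Tom": {"Person", "Thing_with_a_parent"},
    "Ann": {"Person"},
    "Rex": {"Dog"},
}


def violators(types_of, condition, conclusion):
    """Objects matching the condition of a positive constraint but not its conclusion."""
    return sorted(obj for obj, classes in types_of.items()
                  if condition in classes and conclusion not in classes)


# "if x is a Person, x must have a parent": Ann matches the condition only.
print(violators(types_of, "Person", "Thing_with_a_parent"))
```

Rex is not reported since it does not match the condition: a constraint says nothing about objects outside its condition.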
false
” as conclusions) are handled like constraints.
However, this is a particularity of these systems. It should not be
relied upon for general knowledge representation purpose. For such a
purpose, the special semantics of constraints has to be made explicit
via special syntactic sugar or special types.
Since KRLs rarely propose syntactic sugar for expressing constraints,
a more generic approach for expressing that a statement is a constraint,
as opposed to an inference supporting statement, is to state that this statement
is an instance of a type expressing a particular kind of constraint,
as explained in Section 2.1.4.
Then, these constraints can be retrieved and exploited by
content-independent queries such as those provided below. These constraints
can also be directly interpreted and exploited by inference engines designed
to take into account the used constraint types. In any case, either
i) constraints are not represented in a way they can be exploited
as inference supporting statements, or
ii) the results of these inferences must not be detrimental, i.e.,
must not influence the checking of constraints.
Both techniques will be illustrated in
Section 2.3.1 and
Section 3.
2.1.2. Prescriptive (i.e. Not Using All Possible Inferences) vs. Descriptive
A ∧ B =>> false
”) is meant to ensure
that certain uses of terms do not occur in a KB. In this case,
“prescriptive” means that the non-existence of these uses is
not due to inferences, and hence “descriptive” means that
it does not matter whether or not the non-existence of these uses is due to inferences.
However, what would be the interest of such prescriptive constraints?
Indeed, inferences leading to the existence of unwanted relations need to be caught too.
The situation is different from the one for wanted relations where, as above illustrated,
it may be interesting to specify whether or not these relations can be inferred.
Thus, at least in this article, all constraints in inconsistency-implying form
are considered to be “descriptive”.
If the used KRL supports negations, a constraint in inconsistency-implying form
can specify that the non-existence of a certain use of a term is an inconsistency
and thus that this certain use must exist in the KB
– e.g. “A ∧ ¬B =>> false
”
may be used to ensure that B is true when A is true.
Such a constraint is descriptive since it allows the existence of this certain use
to be due to any inference.
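The difference between the two readings of a positive constraint can be sketched in Python. The sketch assumes, as a simplification, that the KB content is split into explicitly asserted relations and relations derived by an inference engine; the data and the function names are hypothetical.

```python
# Hypothetical data: relations explicitly asserted in the KB, and relations
# derived by an inference engine (e.g. from a restriction on Person).
asserted = {("Tom", "parent", "Mary")}
inferred = {("Ann", "parent", "_:b1")}
persons = ["Ann", "Bob", "Tom"]


def lacks_relation(obj, relation, facts):
    """True if no relation of the given type has obj as source in facts."""
    return not any(s == obj and p == relation for (s, p, _o) in facts)


def violations(objects, relation, asserted, inferred, prescriptive):
    """Objects violating 'must have <relation>'.

    Prescriptive reading: only asserted relations count.
    Descriptive reading: asserted and inferred relations both count.
    """
    facts = asserted if prescriptive else asserted | inferred
    return [o for o in objects if lacks_relation(o, relation, facts)]


print(violations(persons, "parent", asserted, inferred, prescriptive=True))
print(violations(persons, "parent", asserted, inferred, prescriptive=False))
```

Under the prescriptive reading Ann is reported because her parent relation is only inferred; under the descriptive reading only Bob is reported.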
owl:Restriction
. In other words, this association is only supported by
class definitions not restricted to specifying that the class is
i) subclass, superclass, intersection or union of other classes, or
ii) type or instance of other objects.
Thus, in an OWL based KB, R is the set of restrictions, and KB0\R is KB0 where
the body or content of each restriction has been removed.
In other kinds of KBs, R is different since they offer other ways to express universal
quantification and thus other inference-based means to associate non-type relations
to objects.
In any case, removing inferred non-type relations or not taking them into account can be
done automatically, in different ways depending on the features of the system used:
general purpose inference engine, ad hoc procedural implementation, etc.
The following summary of approaches
not only refers to R but also to Tbox, i.e. R plus the types, their subtype relations
and their type relations. In a Description Logic based KB, Tbox refers to its T-box
while KB\Tbox refers to its A-box, i.e. the individuals,
the relations between them, and the type relations from individuals to types.
VALUES
keyword) or “named graphs” (and
hence the FROM
keyword), it is possible to perform queries on
different KBs and use a same object identifier in the queries to relate them.
However, it is not yet possible to use a same variable to relate the results
of queries on different KBs. With an extension of SPARQL such as
LDScript
[Corby, Faron-Zucker & Gandon, 2017],
based on the results of one query, it would be possible to generate another
query with the results of the first query hardcoded in it, and thus solve the
problem. The names of the KBs would also have to be hard-coded.
This approach is not explored in this article.
EXISTS
and NOT EXISTS
.
In this article, SPARQL is used for the content-independent
queries (cf. Section 3) and only the operator EXISTS
had to be
used in these queries.
Furthermore, with such queries, there is no need to represent any
closed world assumption related part in the constraints themselves.
Thus, the inference engines used for matching objects of the KB to the constraints
need not use the closed world assumption.
Hence, regular inference engines for OWL can for example be reused.
2.1.3. Restriction of Some Constraint Parts to Named or Anonymous Individuals
Person
has a parent
relation, inferred or not as long as its destination is
named.
To do so, using a prescriptive constraint is not possible since, by definition,
it would not accept inferred named parents. A solution is to write two constraints:
i) an inconsistency implying one stating that an instance of Person
without a(n inferred or not) parent
relation is an error, and
ii) a descriptive constraint specifying that the destination of each of these
parent
relations must be named.
owl:NamedIndividual
– this is the way used in one of the examples of
Section 2.3.3.
Since doing so by hand may be tedious,
Section 3.1.2 proposes a
SPARQL update request that automatically adds such a type to named individuals.
2.2. General Approach for Representing And Exploiting Constraints
2.2.1. Using Constraint Types
cstr:Constraint
is the supertype of all types
of constraints. Similarly, the type cstr:Prescriptive_constraint
,
a subtype of cstr:Constraint
, enables one to state that some
rules are actually prescriptive constraints or to retrieve all and only
such constraints.
The prefix “cstr:
” in these identifiers is an abbreviation
for the namespace http://www.webkb.org/kb/it/CSTR
.
CSTR also includes types for constraint conditions and
types for relations between the condition and conclusion of a constraint.
These types are listed in
Section 2.3.2.
2.2.2. Using “Clones Without Inferred Relations” For Bypassing Some Inferences
When Checking The Conclusions Of Prescriptive Constraints
2.2.2.1. Checking Prescriptive Constraints Via the Cloning of At Least KB0\Tbox
When R is Exploited Via Forward Chaining Saturation
Furthermore, forward chaining saturation is not triggered when the original
version of objects are regenerated, e.g. because such an inferencing can be
temporarily disabled.
These conditions and the one of the previous point are for example verified in
Corese [Corby, Faron-Zucker & Gandon, 2017],
an OWL2-RL inference engine and a SPARQL engine.
CONSTRUCT
sub-requests are
allowed by LDScript, not INSERT
or DELETE
sub-requests.
For reasons explained in Section 3.2.3.2, using CONSTRUCT
for that sub-request
requires the temporary duplication of nearly the whole KB
before each conclusion matching test. Thus, the used KB management system (KBMS)
must be able to handle this duplication in terms of memory and required speed.
For a big enough KB, no such KBMS may exist.
2.2.2.2. Checking Prescriptive Constraints Via the Cloning of At Least KB0\Tbox
Using “Clones Without Types”
INSERT
request to perform the above cited KB pre-treatment.
The SPARQL query for checking prescriptive constraints given in
Section 3.2.3.1 not only
works for the method described in the previous subsection, it also works for the method
described in the present subsection. Thus, this query is not repeated in
Section 3.2.4
which therefore has only one subsection.
rdfs:domain
or
rdfs:range
relations, and
ii) definitions based on owl:equivalentClass
relations to
restrictions (there is no problem if only rdfs:subClassOf
is used).
Here is an example of how the use of rdfs:domain
relation can lead certain
inference engines to regenerate types for some clones.
Assuming there is an rdfs:domain
relation from the relation type
parent
to the type Person
, if an object of type
Person
is source of a parent
relation and
this object has its type removed, an inference engine may set it again.
To avoid such a case, instead of using rdfs:domain
or
rdfs:range
relations, one may write inconsistency-implying
constraints that are equivalent to these relations except that they are
usable only for checking purposes.
An example is given at the end of each of the subsections of
Section 2.3.3.
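The contrast between an inference-supporting use of rdfs:domain and its checking-only counterpart can be sketched in Python. This is an illustration under simplifying assumptions: triples are plain tuples, the two function names are hypothetical, and `Tom_clone` stands for a clone whose type has been removed.

```python
def infer_domain_types(triples, prop, domain_class):
    """RDFS-style use of rdfs:domain: add a type to every source of prop."""
    inferred = {(s, "rdf:type", domain_class)
                for (s, p, _o) in triples if p == prop}
    return triples | inferred


def domain_violations(triples, prop, forbidden_class):
    """Checking-only counterpart: report the sources of prop that are
    instances of forbidden_class; nothing is added to the KB."""
    typed = {s for (s, p, o) in triples
             if p == "rdf:type" and o == forbidden_class}
    return sorted({s for (s, p, _o) in triples if p == prop and s in typed})


clone = {("Tom_clone", "parent", "Mary")}  # clone whose type was removed
rock = {("Rock1", "rdf:type", "Not_a_person"), ("Rock1", "parent", "Mary")}

# The rdfs:domain-like inference sets the removed type again on the clone.
print(("Tom_clone", "rdf:type", "Person")
      in infer_domain_types(clone, "parent", "Person"))
# The constraint-like check only reports the incorrect relation source.
print(domain_violations(clone | rock, "parent", "Not_a_person"))
```

With the checking-only version the clone stays without type, which is what the “clones without types” method needs.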
2.2.2.3. Checking Prescriptive Constraints Via the Creation of Saving Forms for R
and the Creation of a Temporary Copy of (a Relevant Part of) the KB without R
When All Inferences are Performed Via Backward Chaining
2.3. Representing Constraints Via Relations Between Classes
2.3.1. Approaches
rdfs:subClassOf
relation for representing the
implication between the condition and conclusion of a rule.
However then, either this implication must not be used for modus tollens
or the results must not be detrimental.
The situation is not much more complex when subclassOf rules are used as a way to
represent constraints.
There are three cases.
owl:Nothing
, i.e. if the
inconsistency-implying form is used, the rule is semantically a
constraint and, depending on the inference engines, modus tollens may or
may not be a danger.
cstr:SubclassOf-based_constraint_condition
or
instance of cstr:Type_of_subclassOf-based_constraint_condition
.
cstr:descriptive_constraint_conclusion
and cstr:prescriptive_constraint_conclusion
.
cstr:condition_class
and
cstr:conclusion_class
.
Since this way uses an individual to relate two classes, it can be
seen as indirectly relation-between-classes based, as opposed to the
subclassOf-based and subclassOf-analogous ways which are
directly relation-between-classes based.
A similar solution would be to use a type instead of an individual
for representing the constraint. This solution is
slightly closer to the two other ones but a bit artificial. Another reason
it is not used in this article is that the individual-based solution
is easier to name and distinguish from the other two basic solutions.
owl:NegativeObjectPropertyAssertion
.
Hence, for relating the condition and conclusion of a constraint,
this article does not propose the use of subtypeOf relations that are not
rdfs:subClassOf
relations.
2.3.1.1. Comparison With OWL2-ER
owl:Nothing
in a rule conclusion, hence a rule of the form
∀x (ClassExpression(x) => ⊥)
,
ii) by using the type owl:NegativeObjectPropertyAssertion
to
express negated facts of the form
¬ ∃x ClassExpression(x)
.
Neither OWL2-ER nor Datalog+ can directly represent a positive constraint
and hence also do not distinguish between descriptive and prescriptive
constraints. However, OWL2-ER statements can be used for expressing
and exploiting constraints with the types and approach proposed in this
article.
Conversely, with the proposed approach, subclassOf-based constraints are not
restricted to OWL2-ER or OWL2 class expressions.
Here are examples of OWL2-ER rules given in [Baget et al., 2015];
to express the meaning of constraints reusing them, it is sufficient to
replace the words “is” or “has” by the words
“must be” or “must have” in the
informal descriptions (each of them is prefixed by “#”).
∀X ∀Y ∀Z (isProject(X, Y, Z) => isMember(Z,X))
#Every leader of a project is a member of this project
∀X ∀Y (Researcher(X) ∧ hasExpertise(X,Y) => ∃Z ∃L (isProject(Z,Y,L) ∧ isMember(X,Z)))
#Every researcher expert in an area is a member of a project in that area
∀X ∀Y ∀Z (isProject(X,Y,Z) ∧ isProject(X,Y,Z´) => Z = Z´)
#Every project has at most one leader
rdfs:subClassOf
entails – would be irrelevant in
this article.
See [Baget et al., 2015] and
[Swan, 2016] for those formal points.
The SPARQL code given in Section 3
provides the remaining necessary formal details.
2.3.2. Proposed Types
cstr:SubclassOf-based_prescriptive_constraint_condition
or instance
of cstr:Type_of_subclassOf-based_prescriptive_constraint_condition
.
The first case implies the second one but it also specifies that, if the
marked class has subclasses, each definition of these subclasses is also a
condition for a constraint. The other second-order classes have the same
purpose.
                cstr:Constraint                   cstr:Constraint_condition
                /\           /\                             /\
                |             |                             |
cstr:Descriptive_constraint   |         cstr:SubclassOf-based_constraint_condition
                    cstr:Prescriptive_constraint  /\                        /\
                                                  |                         |
          cstr:SubclassOf-based_descriptive_constraint_condition            |
                              cstr:SubclassOf-based_prescriptive_constraint_condition
Each arrow “->” represents a subClassOf relation.
For concision purposes, boxes around classes are not drawn.
cstr#binary_relation rdf#type: rdf#Property,
> exclusion //cstr#binary_relation has the following 5 exclusive subtypes
{ cstr#id // (this exclusion set is not complete: this is not a partition)
cstr#descriptive_constraint_conclusion cstr#condition_class
cstr#prescriptive_constraint_conclusion cstr#conclusion_class
(cstr#NonInferredRelation < rdf#Statement)
(cstr#RelationFromPropertyRestriction < rdf#Statement)
};
cstr#Constraint rdf#type: cstr#Type_of_constraint,
> cstr#Descriptive_constraint cstr#Prescriptive_constraint;
cstr#Constraint_condition
> (cstr#SubclassOf-based_constraint_condition
rdf#type: cstr#Type_of_subclassOf-based_constraint_condition,
> (cstr#SubclassOf-based_descriptive_constraint_condition
rdf#type: cstr#Type_of_subclassOf-based_descriptive_constraint_condition)
(cstr#SubclassOf-based_prescriptive_constraint_condition
rdf#type: cstr#Type_of_subclassOf-based_prescriptive_constraint_condition)
);
owl#Class
> exclusion //owl#Class has the following 3 exclusive subclasses
{ cstr#Type_of_constraint
(cstr#Type_of_constraint_condition
> (cstr#Type_of_constraint_condition_that_is_a_type
> (cstr#Type_of_subclassOf-based_constraint_condition
> cstr#Type_of_subclassOf-based_descriptive_constraint_condition
cstr#Type_of_subclassOf-based_prescriptive_constraint_condition
) ) )
};
“>” means “(has for) subtype”.
“<” means “(has for) supertype”.
Namespace prefixes end with “#”, not “:”.
Thus, the syntactic form “X r: Y” can be read “X has for r Y”
(although FL syntactic sugar such as “exclusion”
slightly changes this reading: see the comments prefixed by “//”).
2.3.3. Examples
2.3.3.1. Individual-based Constraints
[] rdf:type cstr:Prescriptive_constraint;
cstr:condition_class :Person;
cstr:conclusion_class
[rdf:type owl:Class;
owl:equivalentClass [rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom :Person] ].
[] rdf:type cstr:Descriptive_constraint;
cstr:condition_class :Person;
cstr:conclusion_class
[rdf:type owl:Class;
owl:equivalentClass [rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom :Named_person] ].
#with:
:Named_person owl:equivalentClass
[rdf:type owl:Class;
owl:intersectionOf ( :Person owl:NamedIndividual )].
x is a Person_without_parent =>> false
”. Here is its
representation as an individual-based constraint using the CSTR ontology.
[] rdf:type cstr:Descriptive_constraint; #this rdf:type relation is optional
cstr:condition_class # for a negative constraint
[rdf:type owl:Class;
owl:equivalentClass [rdf:type owl:Restriction; owl:onProperty :parent;
owl:maxCardinality "0"^^xsd:nonNegativeInteger] ];
cstr:conclusion_class owl:Nothing.
rdfs:domain
or rdfs:range
relations except that they are
usable only for checking purposes.
These examples respectively represent that
“if a parent
relation has something that is Not_a_person
as source (alias, domain), this relation is incorrect” and
“if a parent
relation has something that is Not_a_person
as destination (alias, range), this relation is incorrect”.
[] cstr:condition_class #constraint-based version of an rdfs:domain relation:
[rdf:type owl:Class;
owl:intersectionOf ( :Not_a_person
[rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom owl:Thing ] ) ];
cstr:conclusion_class owl:Nothing.
[] cstr:condition_class #constraint-based version of an rdfs:range relation:
[rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom :Not_a_person];
cstr:conclusion_class owl:Nothing.
#with:
[] rdf:type owl:AllDisjointClasses; owl:members (:Person :Not_a_person).
2.3.3.2. SubclassOf-analogous Constraints
:Person #class and constraint condition
cstr:prescriptive_constraint_conclusion
[rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom :Person].
:Person #class and constraint condition
cstr:descriptive_constraint_conclusion
[rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom :Named_person]. #as previously defined
:Person_without_a_parent #class and constraint condition
owl:equivalentClass #definition and constraint condition
[rdf:type owl:Restriction; owl:onProperty :parent;
owl:maxCardinality "0"^^xsd:nonNegativeInteger];
cstr:prescriptive_constraint_conclusion owl:Nothing.
parent
relation has something that is Not_a_person
as source (alias, domain) =>> false” and
“a parent
relation has something that is Not_a_person
as destination (alias, range) =>> false”.
:Not-a-person_with_a_parent #constraint-based version of an rdfs:domain relation:
owl:equivalentClass #definition and constraint condition
[rdf:type owl:Class;
owl:intersectionOf ( :Not_a_person
[rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom owl:Thing ] ) ];
cstr:prescriptive_constraint_conclusion owl:Nothing.
:Thing_having_for_parent_a_Not-a-person #constraint for an rdfs:range relation:
owl:equivalentClass
[rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom :Not_a_person ];
cstr:prescriptive_constraint_conclusion owl:Nothing.
2.3.3.3. SubclassOf-based Constraints
:Person #class and constraint condition
rdfs:subClassOf cstr:SubclassOf-based_prescriptive_constraint_condition;
rdfs:subClassOf #conclusion (types of relations that must be present):
[rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom :Person].
cstr:SubclassOf-based_descriptive_constraint_condition
.
Here is a particular case where parents are restricted to be named individuals.
:Person #class and constraint condition
rdfs:subClassOf cstr:SubclassOf-based_descriptive_constraint_condition;
rdfs:subClassOf #conclusion (types of relations that must be present):
[rdf:type owl:Restriction; owl:onProperty :parent;
owl:someValuesFrom :Named_person]. #as previously defined
cstr:prescriptive_constraint_conclusion
is replaced by
rdfs:subClassOf
.
Only Section 3.7 uses JavaScript.
All the other subsections use a SPARQL-like language, specifically
SPARQL or LDScript.
In some extensions of SPARQL,
e.g. LDScript,
the proposed next SPARQL-like commands (queries or update requests) can be grouped
into scripts or functions.
The commands of Section 3.1 are KB pre-treatments
and hence update requests.
The commands of the remaining subsections – except 3.7 –
are queries.
These queries are organized by kind and size. At first, the difficulty of
understanding them is directly related to their size.
In SPARQL and LDScript, variable names begin with “?”.
In its
The next command shows a reversible way to remove individuals
– as well as subclassOf relations between their types –
to allow classic Description Logic inference engines to handle user-defined
relations between types, as explained in
the paragraph before Figure 1 in Section 2.3.
To find individuals the
The next query lists every object violating at least one of the
individual-based positive descriptive constraints.
As shown by the code, such an object satisfies two conditions.
First, this object matches – and hence has for type – the
condition of a constraint
In the rest of this Section 3, when a query that checks constraints has some
code that has not been used in a previous command, this code is in italics.
The next query lists every object violating an individual-based
inconsistency-implying constraint.
The next four subsections give a SPARQL-like implementation of the method
described in Section 2.2.2.1.
Two of the discussed variants are here supported.
The next query assumes that
i) the clones without inferred relation have already been created,
e.g. via the SPARQL requests of
Section 3.2.3.3 and
Section 3.2.3.4
below, and
ii) these clones are accessible from regular objects via relations of type
Here is an LDScript query which, for each object satisfying the condition of a constraint,
calls a function before testing if the clone of this object matches the conclusion of this
constraint. This function generates the clone in a temporary graph ?g,
via a
This technique of generating a temporary graph and then querying within it
could also be used for
dynamically typing named individuals with
For every object
This
This section is about a SPARQL-like implementation of the method
described in Section 2.2.2.2. However,
as explained in that subsection, this one has only one subsection. One reason is that the SPARQL
query given in Section 3.2.3.1
also works for this method.
The next command is similar to the one in
Section 3.2.3.3.
The differences are that, when treating an object, the next command does not also reify it and
does not copy its type into the clone. This last difference is highlighted via the use of
italic characters.
This section is about a SPARQL-like implementation of the method
described in Section 2.2.2.3.
In an OWL based KB, R is the set of restrictions and KB0\R is KB0
where the body of each restriction is empty. The next command reifies
the relations of such bodies.
Here is an LDScript query which, to test if an object matches a constraint,
i) tests if the object matches the condition of the constraint, and if so,
ii) via a
The usable content-independent queries here are identical to their counterparts
in Section 3.2
except for the initialization of
There are other ways to write the queries. For example:
The previous queries do not rely on
inference engines to take into account the special meaning of CSTR classes.
Hence, as explained in
Section 2.3.1, these queries cannot be adapted for checking
subclassOf-based constraints representing positive descriptive constraints.
For prescriptive constraints, the queries are the same as their counterparts in
Section 3.2 except for the
initialization of
The next query lists every object violating an individual-based
inconsistency-implying constraint.
Such an object matches – and hence has for type –
a class
Except as a module for calculating the completeness degree of a KB,
the previous query is useless if, when building the KB, its consistency is already
checked by an inference engine that delivers an error message when detecting
that an object is an instance of a subclass of
To list binary relations violating constraints – instead of
individuals that have some relations violating constraints –
it is sufficient to replace
Like queries on individuals, queries on relations can use additional filters.
E.g., for the last query
(Section 3.4) to operate
only on negative facts, one may add at the end of its body:
A simple way to define or calculate a completeness degree for a KB
is to divide “the number of relations (in the KB) that do not violate
constraints” by “the total number of relations”.
Another completeness degree may be obtained by considering
only prescriptive constraints. Yet another one may be obtained by dividing
“the number of individuals that do not violate prescriptive constraints” by
“the total number of individuals”.
The next query implements a variant of this last definition:
instead of individuals, this query exploits
“objects that are the source of at least one relation to another object”.
Furthermore, this query assumes that the constraints are represented as
individual-based constraints. It also considers
constraints in inconsistency-implying form as (negative) prescriptive constraints.
Via the method given in Section 3.5,
this query can be adapted to exploit
binary relations violating constraints instead of
individuals violating constraints.
The next function shows a way to list
each constraint violation and calculate the completeness of the ontology
with respect to these violations. There are however a few differences with the command given in
the previous subsection. JavaScript is used here instead of pseudo-code since its syntax is rather
well known and intuitive, and since it is object-oriented. This last feature is required
here to keep the code short, clear and generic with respect to error handling.
The objects in the ontology are assumed to be accessible in the object-oriented way
used in the code.
The complexity of the next function is the complexity of the object matching
multiplied by the number of given constraints, multiplied by the number of
given classes, multiplied by the average number of instances per class.
3. Exploiting Relations-between-classes Based Constraints
With a SPARQL-like language or With JavaScript
3.1. Examples of KB Pre-treatments With SPARQL
3.1.1. Finding Named Individuals and Marking Them As Such
WHERE clause, the next command
uses a heuristic to find each individual in an RDF compatible KB.
The heuristic is that each “object ?o which
i) has a type having a superclass, and ii) is not typed as a class”
is considered an individual.
The WHERE clause also checks that ?o is named, via the
SPARQL function isIRI.
The INSERT clause then types the found named individual ?o
with owl:NamedIndividual.
The heuristic used proved more reliable than simply looking for each object that
is not a relation and that does not have rdfs:Class as type.
With this other method, many conditions would have to be added for
filtering out objects such as
i) classes defined via an equivalence to a restriction, and
ii) owl:Thing and some other types from OWL, RDFS and XSD.
This is why this heuristic is also used in every command
presented below that requires detecting that an object is an individual.
INSERT { ?o rdf:type owl:NamedIndividual } WHERE
{ ?o rdf:type ?t. #?o has a type ?t
?t rdfs:subClassOf ?superClass. #?t has a superclass
FILTER NOT EXISTS { ?o rdf:type rdfs:Class } #?o is not a class
FILTER isIRI(?o) #?o is named with an IRI
FILTER NOT EXISTS { ?o rdf:type owl:NamedIndividual } #?o not yet typed as owl:NamedIndividual
}
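The heuristic can also be tried outside a SPARQL engine. The following JavaScript sketch applies it to an in-memory array of triples. The {s, p, o} triple representation, the "_:" prefix used to mark unnamed objects and the function name findNamedIndividuals are assumptions made for this illustration only; they are not part of the approach itself.

```javascript
// Hypothetical in-memory representation: each triple is {s, p, o}; identifiers are
// strings, and unnamed objects (blank nodes) start with "_:" (assumption of this sketch).
const RDF_TYPE = "rdf:type";
const RDFS_SUBCLASSOF = "rdfs:subClassOf";
const RDFS_CLASS = "rdfs:Class";

function findNamedIndividuals(triples) {
  // Types that have at least one superclass.
  const typesWithSuperclass = new Set(
    triples.filter(t => t.p === RDFS_SUBCLASSOF).map(t => t.s));
  // Objects explicitly typed as classes.
  const classes = new Set(
    triples.filter(t => t.p === RDF_TYPE && t.o === RDFS_CLASS).map(t => t.s));
  const found = new Set();
  for (const t of triples) {
    const isNamed = !t.s.startsWith("_:");   // counterpart of isIRI(?o)
    if (t.p === RDF_TYPE && typesWithSuperclass.has(t.o)
        && !classes.has(t.s) && isNamed)
      found.add(t.s);
  }
  return [...found].sort();
}

const exampleTriples = [
  { s: "ex:Man", p: RDFS_SUBCLASSOF, o: "ex:Person" },
  { s: "ex:Man", p: RDF_TYPE, o: RDFS_CLASS },
  { s: "ex:Tom", p: RDF_TYPE, o: "ex:Man" },
  { s: "_:b1", p: RDF_TYPE, o: "ex:Man" },    // unnamed, hence not listed
  { s: "ex:Rex", p: RDF_TYPE, o: "ex:Dog" },  // ex:Dog has no superclass
];
const namedIndividuals = findNamedIndividuals(exampleTriples);
```

In this example only ex:Tom is found: ex:Man is typed as a class, _:b1 is unnamed, and the type of ex:Rex has no superclass.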
3.1.2. Temporarily Removing Individuals For Checking Constraints on Types
With Classic Description Logic Inference Engines
WHERE clause of this command uses the same
heuristic as in the previous subsection. It does not rely on the type
owl:NamedIndividual since this would tie this content-independent
query to a particular OWL2 way to state that an object is a named individual.
The DELETE clause removes the rdf:type relations from
each found individual ?o and, from the destinations of these
relations, removes the rdfs:subClassOf relations when they exist.
As replacements for these rdfs:subClassOf relations,
the INSERT clause sets cstr:type and cstr:subClassOf relations.
These substitute relations later permit
the re-assertion of the initial rdf:type and rdfs:subClassOf
relations, when needed, via a similar command.
DELETE { ?o rdf:type ?t . ?t rdfs:subClassOf ?superClass }
INSERT{ ?o cstr:type ?t . ?t cstr:subClassOf ?superClass }
WHERE { ?o rdf:type ?t . ?t rdfs:subClassOf ?superClass
FILTER NOT EXISTS { ?o rdf:type rdfs:Class }
}
3.2. Checking Individual-based Constraints
With a SPARQL-like language
3.2.1. Checking Individual-based Positive Descriptive Constraints
?posConstr that is of type cstr:Descriptive_constraint.
Second, this object does not match
– and hence does not have for type – the conclusion of the
constraint.
Thus, this query requires a SPARQL engine with an
entailment regime
enabling the matching (alias, categorization) of an individual to
a class expression and thence the deduction of an rdf:type relation
between them. In the code of the commands in this Section 3,
such deduced rdf:type relations are highlighted in bold.
For inferencing completeness purposes, such deductions require an entailment
regime able to handle the expressiveness used in the constraints and the rest
of the KB.
SELECT ?objectNotMatchingPosConstr ?posConstr_condition ?posConstr_conclusion
WHERE #"posConstr" abbreviates "positive constraint"
{ ?posConstr rdf:type cstr:Descriptive_constraint;
cstr:condition_class ?posConstr_condition;
cstr:conclusion_class ?posConstr_conclusion.
FILTER (?posConstr_conclusion != owl:Nothing)
?objectNotMatchingPosConstr rdf:type ?posConstr_condition. #matches condition
FILTER NOT EXISTS #objects satisfying the conclusion must NOT be listed
{ ?objectNotMatchingPosConstr rdf:type ?posConstr_conclusion }
}
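The logic of the query above can be illustrated in JavaScript on a small in-memory model. This is only a sketch under a strong assumption: the map typesOf is supposed to already contain, for each object, all its types including those deduced by an inference engine, since the sketch itself performs no inference. The names typesOf and objectsNotMatchingPositiveConstraints are hypothetical.

```javascript
// Assumption of this sketch: typesOf maps each object to the set of ALL its types,
// including the rdf:type relations deduced by an inference engine.
function objectsNotMatchingPositiveConstraints(typesOf, constraints) {
  const violations = [];
  for (const c of constraints) {
    if (c.conclusion === "owl:Nothing") continue;  // counterpart of the FILTER on owl:Nothing
    for (const [obj, types] of typesOf) {
      // the object matches the condition but not the conclusion
      if (types.has(c.condition) && !types.has(c.conclusion))
        violations.push({ object: obj, condition: c.condition, conclusion: c.conclusion });
    }
  }
  return violations;
}

const typesOf = new Map([
  ["ex:Tom", new Set(["ex:Person", "ex:Thing_with_a_parent"])],
  ["ex:Ann", new Set(["ex:Person"])],
]);
const constraints = [
  { condition: "ex:Person", conclusion: "ex:Thing_with_a_parent" },
  { condition: "ex:Person", conclusion: "owl:Nothing" },  // ignored here
];
const violations = objectsNotMatchingPositiveConstraints(typesOf, constraints);
```

Here only ex:Ann is reported, since ex:Tom also has the conclusion class as type.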
3.2.2. Checking Individual-based Inconsistency-implying Constraints
SELECT ?objectMatchingNegativeConstr ?negativeConstr_condition WHERE
{ ?negativeConstr cstr:condition_class ?negativeConstr_condition;
cstr:conclusion_class owl:Nothing.
?objectMatchingNegativeConstr rdf:type ?negativeConstr_condition.
}
3.2.3. Checking Individual-based Positive Prescriptive Constraints
Via the Cloning (By Reification) of At Least KB0\Tbox
When R is Exploited Via Forward Chaining Saturation
3.2.3.1. Checking the Prescriptive Constraints With Already Created Clones
cstr:cloneWithoutInferredRelation.
In this query, only two lines differ from the query for descriptive constraints
given in Section 3.2.1.
The first is the line for accessing the clone. The second is the line testing if the clone
– instead of the regular object – matches the constraint conclusion.
SELECT ?objectNotMatchingPosConstr ?posConstr_condition ?posConstr_conclusion WHERE
{ ?posConstr rdf:type cstr:Prescriptive_constraint;
cstr:condition_class ?posConstr_condition;
cstr:conclusion_class ?posConstr_conclusion.
FILTER (?posConstr_conclusion != owl:Nothing)
?objectNotMatchingPosConstr rdf:type ?posConstr_condition. #matches condition
?objectNotMatchingPosConstr cstr:cloneWithoutInferredRelation ?clone
FILTER NOT EXISTS #objects with clones satisfying the conclusion must not be listed
{ ?clone rdf:type ?posConstr_conclusion }
}
3.2.3.2. Checking the Constraints Via a Dynamic Creation of the Clones
CONSTRUCT command. Like SPARQL, LDScript does not permit the use of
the same variable to relate the results of queries on different KBs or graphs
– indeed, as explained in Section 2.1.2,
the same object identifier can be used in two queries as a way to insert in the second
query the result of the first, but a variable cannot be used for the same purpose.
Thus, the conclusion matching test
has to be performed within the generated graph. Hence, this graph cannot solely contain the
clone of the initially tested object: it must also duplicate
– and thus permit access to – all the relations directly or
indirectly accessible from this initial object, except for those which have been inferred,
i.e. those that have no saving forms. Here,
the saving forms are assumed to have been created via the KB pre-treatment
specified in the next subsection although without also creating the clone (to that end,
in this KB pre-treatment the line to remove is indicated via a comment).
Since CONSTRUCT is used, not INSERT, running this command
does not require the privilege to modify the KB.
SELECT ?objectNotMatchingPosConstr ?posConstr_condition ?posConstr_conclusion WHERE
{ ?posConstr rdf:type cstr:Prescriptive_constraint;
cstr:condition_class ?posConstr_condition;
cstr:conclusion_class ?posConstr_conclusion.
FILTER (?posConstr_conclusion != owl:Nothing)
?objectNotMatchingPosConstr rdf:type ?posConstr_condition. #matches condition
#the two functions called in the next two lines are defined below
BIND( cstr:createCloneOf(?objectNotMatchingPosConstr) as ?clone )
BIND( cstr:createTemporaryGraphContainingTheGivenCloneAndTheWholeKBminusTheGivenObject
(?clone,?objectNotMatchingPosConstr) as ?g )
FILTER NOT EXISTS { GRAPH ?g { ?clone rdf:type ?posConstr_conclusion } }
}
FUNCTION cstr:createCloneOf (?object) #the returned clone does not yet have relations
{ uri(concat(str(?object),"_cloneWithoutInferredRelation"))
}
FUNCTION cstr:createTemporaryGraphContainingTheGivenCloneAndTheWholeKBminusTheGivenObject
(?clone, ?objectNotMatchingPosConstr)
{ LET (?g = CONSTRUCT { ?clone ?r ?dest . ?x ?r2 ?y } WHERE
{ VALUES ?clone { UNDEF } #the clone is not predefined to be only certain objects
?objectNotMatchingPosConstr ?r ?y. #anything reachable from the object
?x ?r2 ?y. FILTER (?x != ?objectNotMatchingPosConstr) #only indirectly reachable
?m rdf:subject ?clone. ?m rdf:predicate ?r. ?m rdf:object ?dest.
?m rdf:type cstr:NonInferredRelation
})
{ xt:entailment(?g) } #triggers forward chaining saturation on ?g
}
owl:NamedIndividual.
Thus, this technique could permit the checking of constraints restricted
to named individuals without using the KB pre-treatment given in
Section 3.1.1, hence
without modifying the KB.
3.2.3.3. KB Pre-treatment Creating the Saving Forms and the Clones
?o in the whole KB, if ?o is an individual, the next command
i) if needed, creates ?o2, a clone of ?o that has the same relations, and
ii) reifies each of these relations to store the fact it is not an inferred relation.
The object ?o2 has for identifier the one of ?o but with the suffix
“_cloneWithoutInferredRelation”.
For the sake of clarity, two assumptions were made when writing this command.
One is that the KB does not include user-defined 2nd-order types.
Another assumption is that KB0\Tbox is the set of individuals of the KB
along with non-type relations associated to them. This is the case if the KB is
RDFS based or OWL based.
INSERT #the next line must be removed if the clones must not also be created
{ ?o cstr:cloneWithoutInferredRelation ?o2. ?o2 ?r ?dest.
?m rdf:type cstr:NonInferredRelation. #this last type is subtype of rdf:Statement
?m rdf:subject ?o2. ?m rdf:predicate ?r. ?m rdf:object ?dest #reification of ?r by ?m
}
WHERE
{ ?o rdf:type ?t . FILTER NOT EXISTS { ?o rdf:type rdfs:Class }
?t rdfs:subClassOf ?superClass . #?o is a typed individual
FILTER NOT EXISTS { ?o cstr:cloneWithoutInferredRelation ?c1 } #?o must not have a clone
FILTER NOT EXISTS { ?c2 cstr:cloneWithoutInferredRelation ?o } #?o must not be a clone
{ #Case 1: supporting the cloning of each individual ?o having at least 1 relation of type
# ?r different from owl:sameAs (the INSERT clause adds the relations "?o2 ?r ?dest"
# from the clone and also reifies these relations)
?o ?r ?dest. FILTER(?r!=owl:sameAs)
}
UNION #Case 2: preventing the useless cloning of an individual having only 1 relation when
# this relation is of type owl:sameAs
{ ?o ?r ?dest #?o has at least one relation from it
FILTER NOT EXISTS {?o ?r2 ?dest2. FILTER(?r2!=owl:sameAs) }
}
BIND( uri(concat(str(?o),"_cloneWithoutInferredRelation")) as ?o2 )
#the identifier of ?m includes those of ?o2, ?r and ?dest:
BIND( uri(concat(str(?o2),concat("_",
concat(str(?r),concat("_",
concat(str(?dest),"_reification")))))) as ?m )
}
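To make the naming scheme of the clones and of their reifications easier to follow, here is a minimal JavaScript sketch of what the INSERT clause adds for one individual. It mirrors the two BIND expressions of the command above; the function names and the {s, p, o} triple representation are assumptions of this sketch.

```javascript
const CLONE_SUFFIX = "_cloneWithoutInferredRelation";

// Counterpart of: BIND( uri(concat(str(?o),"_cloneWithoutInferredRelation")) as ?o2 )
function cloneIriOf(objectIri) {
  return objectIri + CLONE_SUFFIX;
}

// Counterpart of the BIND building ?m from ?o2, ?r and ?dest
function reificationIriOf(cloneIri, relationIri, destIri) {
  return cloneIri + "_" + relationIri + "_" + destIri + "_reification";
}

// Builds the triples added for one individual; relations is an array of {r, dest}
// (hypothetical representation of the relations from the individual).
function cloneAndReify(objectIri, relations) {
  const o2 = cloneIriOf(objectIri);
  const triples = [{ s: objectIri, p: "cstr:cloneWithoutInferredRelation", o: o2 }];
  for (const { r, dest } of relations) {
    if (r === "owl:sameAs") continue;        // as in Case 1: owl:sameAs is not copied
    const m = reificationIriOf(o2, r, dest);
    triples.push({ s: o2, p: r, o: dest },   // relation copied onto the clone
                 { s: m, p: "rdf:type", o: "cstr:NonInferredRelation" },
                 { s: m, p: "rdf:subject", o: o2 },
                 { s: m, p: "rdf:predicate", o: r },
                 { s: m, p: "rdf:object", o: dest });
  }
  return triples;
}

const tomTriples = cloneAndReify("ex:Tom", [
  { r: "ex:parent", dest: "ex:Ann" },
  { r: "owl:sameAs", dest: "ex:Thomas" },
]);
```

For ex:Tom, the sketch produces one clone link plus five triples for the copied and reified ex:parent relation; the owl:sameAs relation is skipped.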
3.2.3.4. Removing Inferred Relations From Clones
DELETE command is supposed to be run after forward chaining saturation is
performed and before running a query for checking prescriptive constraints.
This may be done manually or, at least with LDScript, by using a script that runs this
command just before running the query.
This command removes the inferred relations associated to the clones, i.e. the relations
which have no reification stating they are not inferred.
DELETE { ?o2 ?r ?dest } WHERE
{ ?o cstr:cloneWithoutInferredRelation ?o2 . ?o2 ?r ?dest.
FILTER NOT EXISTS { ?m rdf:subject ?o2. ?m rdf:predicate ?r. ?m rdf:object ?dest.
?m rdf:type cstr:NonInferredRelation }
}
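The effect of this DELETE command can be sketched in JavaScript on an in-memory array of triples. The {s, p, o} representation and the function name removeInferredRelationsFromClones are assumptions of this illustration; the sketch keeps a relation from a clone only if a reification typed cstr:NonInferredRelation exists for it.

```javascript
function removeInferredRelationsFromClones(triples) {
  // Objects that are clones, i.e. destinations of cstr:cloneWithoutInferredRelation.
  const clones = new Set(
    triples.filter(t => t.p === "cstr:cloneWithoutInferredRelation").map(t => t.o));
  // Reifications stating that a relation is not inferred.
  const nonInferred = new Set(
    triples.filter(t => t.p === "rdf:type" && t.o === "cstr:NonInferredRelation")
           .map(t => t.s));
  // Collect subject, predicate and object of each such reification.
  const parts = new Map();
  for (const t of triples) {
    if (!nonInferred.has(t.s)) continue;
    if (t.p === "rdf:subject" || t.p === "rdf:predicate" || t.p === "rdf:object") {
      const entry = parts.get(t.s) || {};
      entry[t.p] = t.o;
      parts.set(t.s, entry);
    }
  }
  const saved = new Set();
  for (const e of parts.values())
    saved.add(JSON.stringify([e["rdf:subject"], e["rdf:predicate"], e["rdf:object"]]));
  // Keep every triple that is not from a clone, or that has a saving form.
  return triples.filter(
    t => !clones.has(t.s) || saved.has(JSON.stringify([t.s, t.p, t.o])));
}

const cloneTriples = [
  { s: "ex:Tom", p: "cstr:cloneWithoutInferredRelation", o: "ex:Tom_clone" },
  { s: "ex:Tom_clone", p: "ex:parent", o: "ex:Ann" },    // asserted, reified below
  { s: "ex:Tom_clone", p: "rdf:type", o: "ex:Child" },   // inferred, no reification
  { s: "ex:m1", p: "rdf:type", o: "cstr:NonInferredRelation" },
  { s: "ex:m1", p: "rdf:subject", o: "ex:Tom_clone" },
  { s: "ex:m1", p: "rdf:predicate", o: "ex:parent" },
  { s: "ex:m1", p: "rdf:object", o: "ex:Ann" },
];
const cleanedTriples = removeInferredRelationsFromClones(cloneTriples);
```

In this example, the inferred rdf:type relation of the clone is removed while its reified ex:parent relation is kept.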
3.2.4. Checking Individual-based Positive Prescriptive Constraints
Via the Cloning of At Least KB0\Tbox Using “Clones Without Types”
3.2.4.1. KB Pre-treatment Creating “Clones Without Types”
INSERT { ?o cstr:cloneWithoutInferredRelation ?o2 . ?o2 ?r ?dest } WHERE
{ ?o rdf:type ?t. FILTER NOT EXISTS { ?o rdf:type rdfs:Class }
?t rdfs:subClassOf ?superClass. #?o is a typed individual
FILTER NOT EXISTS { ?o cstr:cloneWithoutInferredRelation ?c1 } #?o must not have a clone
FILTER NOT EXISTS { ?c2 cstr:cloneWithoutInferredRelation ?o } #?o must not be a clone
{ #Case 1: supporting the cloning of each individual ?o having at least 1 relation of type
# ?r different from owl:sameAs and rdf:type (the INSERT clause adds the relations
# "?o2 ?r ?dest" from the clone)
?o ?r ?dest. FILTER(?r!=owl:sameAs) FILTER(?r!=rdf:type)
}
UNION #Case 2: preventing the useless cloning of an individual having only 1 relation when
# this relation is of type owl:sameAs or rdf:type
{ ?o ?r ?dest #there is at least one relation from ?o
FILTER NOT EXISTS {?o ?r2 ?dest2. FILTER(?r2!=owl:sameAs) FILTER(?r2!=rdf:type) }
}
BIND( uri(concat(str(?o),"_cloneWithoutInferredRelation")) as ?o2 )
}
3.2.5. Checking Individual-based Positive Prescriptive Constraints
Via the Creation of Saving Forms for R and
the Creation of a Temporary Copy of the KB without R
When All Inferences are Performed Via Backward Chaining
3.2.5.1. KB Pre-treatment Creating Saving Forms For R In an OWL based KB
INSERT{ ?m rdf:subject ?restr. ?m rdf:predicate ?r. ?m rdf:object ?dest. #reification of ?r
?m rdf:type cstr:RelationFromPropertyRestriction. #subtype of rdf:Statement
}
WHERE { ?restr rdf:type owl:Restriction. ?restr ?r ?dest. FILTER(?r != owl:equivalentClass)
#the identifier of ?m includes those of ?restr, ?r and ?dest:
BIND( uri(concat(str(?restr),concat("_",
concat(str(?r),concat("_",
concat(str(?dest),"_reification")))))) as ?m )
}
3.2.5.2. Checking the Prescriptive Constraints Via the Checking of Conclusions
Within a Temporary Copy of (a Relevant Part of) the KB without R
CONSTRUCT sub-command, temporarily creates a copy of all the
relations that can be directly or indirectly reached from the object, except for the relations
reified via the command in the previous subsection, and
iii) tests if the object matches the conclusion of the constraint within the
temporary graph.
SELECT ?objectNotMatchingPosConstr ?posConstr_condition ?posConstr_conclusion WHERE
{ ?posConstr rdf:type cstr:Prescriptive_constraint;
cstr:condition_class ?posConstr_condition;
cstr:conclusion_class ?posConstr_conclusion.
FILTER (?posConstr_conclusion != owl:Nothing)
?objectNotMatchingPosConstr rdf:type ?posConstr_condition. #matches condition
#the function called in the next line is defined below
BIND( cstr:createTemporaryGraphContainingTheKBminusR(?objectNotMatchingPosConstr) as ?g )
FILTER NOT EXISTS { GRAPH ?g { ?objectNotMatchingPosConstr rdf:type ?posConstr_conclusion } }
}
FUNCTION cstr:createTemporaryGraphContainingTheKBminusR (?objectNotMatchingPosConstr)
{ LET (?g = CONSTRUCT { ?objectNotMatchingPosConstr ?r ?y } WHERE
{ ?objectNotMatchingPosConstr ?r ?y. #anything reachable from the object
FILTER NOT EXISTS { ?m rdf:subject ?x. ?m rdf:predicate ?r. ?m rdf:object ?y.
?m rdf:type cstr:RelationFromPropertyRestriction }
})
}
3.3. Checking SubclassOf-analogous Constraints
With a SPARQL-like language
?posConstr_condition and ?posConstr_conclusion
since now they are related by
a cstr:descriptive_constraint_conclusion relation or
a cstr:prescriptive_constraint_conclusion relation.
E.g., here is a query for checking subclassOf-analogous positive descriptive
constraints. See the line in italics for the new initialization.
SELECT ?objectNotMatchingPosConstr ?posConstr_condition ?posConstr_conclusion WHERE
{ ?posConstr_condition cstr:descriptive_constraint_conclusion ?posConstr_conclusion.
FILTER (?posConstr_conclusion != owl:Nothing)
?objectNotMatchingPosConstr rdf:type ?posConstr_condition. #matches condition
FILTER NOT EXISTS #objects satisfying the conclusion must NOT be listed
{ ?objectNotMatchingPosConstr rdf:type ?posConstr_conclusion }
}
FILTER (?posConstr_conclusion != owl:Nothing)”
could be replaced by “FILTER NOT EXISTS {
?posConstr_conclusion cstr:prescriptiveConclusion owl:Nothing }”.
The first way has the advantage of not being dependent on the chosen
representation for constraints and hence this way minimizes the difference between
the queries. On the other hand, with this way, owl:Nothing cannot be
replaced by equivalent class expressions (in SPARQL).
?objectNotMatchingPosConstr cstr:cloneWithoutInferredRelation
?cloneWithoutInferredRelation”
before “FILTER NOT EXISTS” could be replaced by the line
“BIND( uri( concat( str(?objectNotMatchingPosConstr),
"_cloneWithoutInferredRelation" ) ) as ?cloneWithoutInferredRelation )”
within the “FILTER NOT EXISTS” block.
3.4. Checking SubclassOf-based Constraints
With a SPARQL-like language
?posConstr_condition and ?posConstr_conclusion.
E.g., for a positive prescriptive constraint, this initialization now is:
?posConstr_condition rdfs:subClassOf cstr:SubclassOf-based_prescriptive_constraint_condition,
?posConstr_conclusion.
?negConstr_condition which
i) has owl:Nothing as conclusion, and optionally
ii) is a subclass of the type cstr:SubclassOf-based_constraint_condition.
If this previous type is not used in the negative constraints of the KB,
this second condition must not be used in the query.
SELECT ?objectMatchingNegConstr ?negConstr_condition WHERE
{ ?negConstr_condition rdfs:subClassOf owl:Nothing;
rdfs:subClassOf cstr:SubclassOf-based_constraint_condition. #optional line
?objectMatchingNegConstr rdf:type ?negConstr_condition.
}
owl:Nothing.
By default, some Description Logic inference engines such as
Corese [Corby, Faron-Zucker & Gandon, 2017] do not deliver error messages
or warning messages when detecting such objects.
Having to make inferences on instances of a subclass of owl:Nothing
also makes Corese behave abnormally. E.g., when an individual violates
a subclassOf-based inconsistency-implying constraint – and hence
is an instance of owl:Nothing – as well as a positive constraint,
this object does not appear in the results of the above described queries
searching objects violating positive constraints.
Thus, as noted in Section 2.3.1, using subclassOf-based constraints is not relevant
for general-purpose knowledge sharing.
3.5. Checking Binary Relations Instead of Individuals
With a SPARQL-like language
rdf:type by the “logical
implication relation between statements” in the previous
content-independent queries that check positive constraints.
For referring to such relations, Tim Berners-Lee uses the type name
log:implies [Berners-Lee et al., 2008] in his Notation3 KRL.
However, for this replacement to work, the used SPARQL engine must exploit an
inference engine that can deduce the existence of such a relation when it
exists between the matched statements. Description Logic inference engines
generally do not do so.
?objectMatchingNegativeConstr rdf:type owl:NegativePropertyAssertion.
3.6. Evaluating a Constraint-based Completeness
With a SPARQL-like language
SELECT ( ((?nbObjs - ?nbAgainstPosCs - ?nbMatchingNegCs) / ?nbObjs)
AS ?completeness )
{ {SELECT (COUNT(DISTINCT ?o) AS ?nbObjs)
WHERE { ?o ?r ?o2 } #any object source of a relation to another object
# For considering only objects that have a type:
# { {?o rdf:type ?t1} UNION {?o cstr:type ?t2} }
}
{SELECT(COUNT(DISTINCT ?objectNotMatchingPosConstr) AS ?nbAgainstPosCs)
WHERE { ... #the body of a query checking an individual-based positive
# prescriptive constraint (see Section 3.2.3.1) must be copied here
#if ?objectNotMatchingPosConstr also violates a negative constraint
# it must not be counted here (otherwise it would be counted twice),
FILTER NOT EXISTS # hence this code here
{ ?negativeConstr cstr:condition_class ?negativeConstr_condition;
cstr:conclusion_class owl:Nothing.
?objectNotMatchingPosConstr rdf:type ?negativeConstr_condition
}
}
}
{SELECT (COUNT(DISTINCT ?objectMatchingNegativeConstr) AS ?nbMatchingNegCs)
WHERE { ?negativeConstr cstr:condition_class ?negativeConstr_condition ;
cstr:conclusion_class owl:Nothing.
?objectMatchingNegativeConstr rdf:type ?negativeConstr_condition
}
}
}
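The arithmetic performed by the query above can be sketched in JavaScript as follows. The function name completenessDegree and the use of sets of object identifiers are assumptions of this sketch, as is the convention of returning 1 for a KB without any relevant object. As in the query, an object violating both a positive and a negative constraint is counted only once.

```javascript
// objects: set of the objects that are the source of at least one relation
// violatingPositive: set of the objects violating a positive prescriptive constraint
// matchingNegative: set of the objects matching the condition of a negative constraint
function completenessDegree(objects, violatingPositive, matchingNegative) {
  if (objects.size === 0) return 1;  // convention of this sketch for an empty KB
  let nbViolating = 0;
  for (const o of objects)
    if (violatingPositive.has(o) || matchingNegative.has(o))
      nbViolating++;                 // counted once, even if both kinds are violated
  return (objects.size - nbViolating) / objects.size;
}

const objects = new Set(["ex:a", "ex:b", "ex:c", "ex:d"]);
const degree = completenessDegree(objects,
                                  new Set(["ex:b", "ex:c"]),  // violate positive constraints
                                  new Set(["ex:c"]));         // also violates a negative one
```

With four objects of which two violate at least one constraint, the resulting degree is (4 - 2) / 4 = 0.5.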
3.7. JavaScript Function for Ontology Completeness Evaluation
specializes is called to perform the
matching between relations associated to an object and relations in a
constraint condition or conclusion, i.e., to test if each relation in the
second set of relations has a specialization in the first set (details on
such a specialization test can be found in
[Chein & Mugnier, 2008] and
[Swan, 2016]).
This method can ignore the instanceOf relations from the handled object.
It does so when the checked constraint is a prescriptive one, i.e.,
if its parameter is an instance of cstr:Prescriptive_constraint.
In other words, for such constraints this method is based
on the same idea as the one based on “clones without types” but
does not need to generate such clones.
With such a method, the code below need not include a temporary removal of
instanceOf relations from the objects.
function checkThatInstancesOfTheseClassesSatisFyTheseConstraints
         (setOfClasses,      //here assumed to be an array, e.g.: [Man, Woman]
          setOfConstraints,  //array, e.g.: ["if x is a Person, x has a parent"]
          errorHandling)     //contains a class for generic error handling
{ if (setOfConstraints.length == 0)
    return errorHandling.returnCodeIfNoPropertyToCheck;
  if (setOfClasses.length == 0)
    return errorHandling.returnCodeIfNoClassToCheck;
  for (const co of setOfConstraints) //co may or may not be inconsistency-implying
  { let numberOfObjsSatisfyingCo= 0, numberOfObjsNotSatisfyingCo= 0;
    for (const cl of setOfClasses)
      for (const obj of cl.instances())
        if (obj.specializes(co.condition))
        { //an object matching the condition of a negative constraint violates it
          if (co.isPositiveConstr() && obj.specializes(co.conclusion))
            numberOfObjsSatisfyingCo++;
          else { numberOfObjsNotSatisfyingCo++;
                 if (errorHandling.alertAtEachError)
                   alert("Relations from '" + obj.toString() +
                         "' do not satisfy:\n  " + co.toString());
                 //e.g.: Relations from 'Tom' do not satisfy:
               } //         if x is a Person, x has a parent
        }
    const numberOfCheckedObjs= numberOfObjsSatisfyingCo + numberOfObjsNotSatisfyingCo;
    //if no object matches the condition, the constraint is trivially satisfied
    const completenessForThatConstraint= (numberOfCheckedObjs == 0) ? 1
                                         : numberOfObjsSatisfyingCo / numberOfCheckedObjs;
    if (completenessForThatConstraint < 1)
    { if (errorHandling.alertAtEachIncompleteness)
        alert("Only " + (100 * completenessForThatConstraint) + "% of instances"
              + " of the given classes satisfy the constraint:\n  "
              + co.toString());
      if (errorHandling.returnCodeIfIncompletenessForSomeConstraint)
        return errorHandling.returnCodeIfIncompletenessForSomeConstraint;
    }
  }
  return errorHandling.returnCodeIfNoError;
}
Ontology design patterns or best practices (ODPs), e.g. those
recommended by the W3C
or those of the
“ODP catalog”, are
i) informal descriptions about how certain things should be represented,
and/or
ii) collections of types that should be reused whenever possible, or
iii) lexical or syntactic rules to follow when importing or exporting
formal or informal knowledge.
As shown in Section 2,
descriptive or prescriptive constraints are ways to represent two meanings of
“must be reused whenever possible”
and hence ways to formalize and implement ODPs related to the second point.
The next subsections show how this can be done with the proposed approach.
To seek the limits of this approach and to be as generic as possible,
the used example ODPs are not tied to a particular ontology. If not directly
re-used for a particular ontology, these examples can be used as templates
for writing simpler constraints for particular ontologies.
Several research works in knowledge acquisition, model-driven engineering or
ontology engineering, e.g. [Marino, Rechenmann & Uvietta, 1990]
[Bachimont, Isaac & Troncy, 2002] [Dromey, 2006] [Rector et al., 2012],
have advocated the use of tree structures when designing a subtype hierarchy,
hence the use of
i) single inheritance only, and
ii) multiple views or viewpoints, one for each tree.
Indeed, they argue that objects have a unique place in such trees and hence
that such trees can be used as decision trees or ways to avoid redundancies,
normalize KRs and ease KR search/handling.
This is true but, for the same reasons and as further detailed in the next
paragraph, the same advantages can be obtained by subtyping solely via
either complete sets of exclusive (direct) subtypes
– i.e. subtype partitions, e.g. via the use of
disjoint unions of types –
or incomplete sets of exclusive subtypes.
Furthermore, this last method has the additional advantages of
i) still allowing multi-inheritance,
ii) not requiring a special inference engine to handle views
(e.g. [Marino, Rechenmann & Uvietta, 1990]
[Djakhdjakha, Hemam & Boufaïda, 2014]),
iii) being less cumbersome for knowledge providers since with the last method
they do not have to specify or use views each time they create subtypes, and
iv) when this can be done, specifying that a set of (direct) subtypes is
complete supports interesting inferences for checking or reasoning purposes.
The fact that OWL-EL, OWL-RL or OWL-QL do not include disjoint unions of types
is not a reason for not using them in ontologies designed for knowledge sharing
purposes: this fact only means that, before using such ontologies with
inference engines dedicated to OWL-EL, OWL-RL or OWL-QL, some expressiveness
has to be lost, e.g. by automatically converting each combination of
disjoint union and class equivalence into a combination of
class disjointness and subtyping.
Indeed, a side contribution of this article is to note that in order to keep the
above cited advantages of the tree structures, it is sufficient (and necessary)
that whenever
two types are exclusive – i.e., whenever they must not share common
subtypes or instances –
the exclusion relation that links these two types must be specified,
explicitly or implicitly. With tree structures, such exclusion relations are
implicit but still specified.
By systematically using sets of exclusive subtypes, all exclusion relations can be
easily and explicitly specified, e.g. via the use of binary partitions such as
“Spatial_entity / Non-spatial_entity”.
Exclusion relations are a simple and efficient way to support ontology checking
and knowledge retrieval. For example, when searching for particular
“achievements” in a KB, people may not know which exact type to use and
then which query to use: whether a query for specializing objects, for
generalizing ones, for doing both with a maximal number of
specializing/generalizing relations, etc. In this case, it may be easier for
them to search for instances of all types not exclusive with some type
they have in mind. This query may well collect all and only the objects they
are interested in, if the ontology is systematically structured using
sets of exclusive subtypes.
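The retrieval idea of the previous paragraph can be sketched in JavaScript. This is only an illustration under assumptions: the subtype hierarchy is given as a map from each type to its direct supertypes, exclusion relations are given as pairs of types, and two types are considered exclusive when an exclusion relation holds between them or between some of their supertypes. All the names used below are hypothetical.

```javascript
// Returns the given type and all its direct or indirect supertypes.
function ancestorsOrSelf(type, supertypesOf) {
  const result = new Set([type]);
  const stack = [type];
  while (stack.length > 0) {
    const t = stack.pop();
    for (const sup of supertypesOf.get(t) || [])
      if (!result.has(sup)) { result.add(sup); stack.push(sup); }
  }
  return result;
}

// Two types are exclusive if an exclusion relation links them or their supertypes.
function areExclusive(t1, t2, supertypesOf, exclusions) {
  const a1 = ancestorsOrSelf(t1, supertypesOf);
  const a2 = ancestorsOrSelf(t2, supertypesOf);
  return exclusions.some(
    ([x, y]) => (a1.has(x) && a2.has(y)) || (a1.has(y) && a2.has(x)));
}

function instancesOfTypesNotExclusiveWith(queryType, allTypes, supertypesOf,
                                          exclusions, instancesOf) {
  const result = new Set();
  for (const t of allTypes)
    if (!areExclusive(t, queryType, supertypesOf, exclusions))
      for (const i of instancesOf.get(t) || []) result.add(i);
  return [...result].sort();
}

const supertypesOf = new Map([
  ["Spatial_entity", ["Entity"]],
  ["Non-spatial_entity", ["Entity"]],
  ["Process", ["Non-spatial_entity"]],
  ["Achievement", ["Process"]],
]);
const exclusions = [["Spatial_entity", "Non-spatial_entity"]];
const instancesOf = new Map([
  ["Spatial_entity", ["paris"]],
  ["Process", ["some_run"]],
  ["Achievement", ["apollo11_landing"]],
]);
const allTypes = ["Entity", "Spatial_entity", "Non-spatial_entity", "Process", "Achievement"];
const retrieved = instancesOfTypesNotExclusiveWith(
  "Process", allTypes, supertypesOf, exclusions, instancesOf);
```

In this example, a search starting from Process retrieves apollo11_landing and some_run but not paris, since Spatial_entity is exclusive with a supertype of Process.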
However, there are some kinds of types for which using only subtype partitions
– or sometimes even just sets of exclusive subtypes –
is cumbersome. An example for these kinds is the one of
non-natural types [Guizzardi, 2005] [Mizoguchi, Kozaki & Kitamura, 2012]
such as role types and phase types.
Examples of general role types are Agent, Experiencer, Recipient, Cause and
Instrument. Examples of more specific ones are Driver, Employee and King.
Examples of phase types are Beginning and Child.
Examples of role type partitions are “Cause / Non-cause” and
“King / Non-king”.
Systematically specifying such partitions is often cumbersome and useless since
role types such as Non-cause and Non-king will probably not be instantiated.
Always using sets of exclusive subtypes for non-natural types is also sometimes
cumbersome, e.g. when only one subtype is known or relevant to represent.
The situation is similar for some other kinds of types
– such as those for attributes, information objects or relations –
for which the distinction between “natural type” and
“non-natural type” is hard to make or not relevant.
Thus, even though systematically using subtype partitions or sets of
exclusive subtypes is an interesting ontology design pattern (ODP),
for practical purposes this ODP should be generalized to include options
allowing ontology designers not to use exclusive subtypes in certain
circumstances, as long as such choices are explicitly represented.
To that end, first considering only subclasses (not all subtypes),
the use of certain subtypes of
When – and only when – an ontology using such subtypes of
The next subsection is only about the Subclass ODP. This article does not propose
a similar constraint for a “Subproperty ODP” to support the checking
of relation subtypes. Indeed, RDFS+OWL provides the
Defining
If the last three lines are removed from this
definition, the constraint for the Subclass ODP will check that all subclasses
are defined only via subtype partitions. In other words, the last three lines
provide three more subtyping options.
Defining
Defining
The problem with this definition is that an OWL list
of disjoint classes is not source of
One solution is to define membership in a way that is
partial but sufficient for correctly handling the content of the KB, as
illustrated here:
Another solution is to use a KRL that permits a full
definition of
A third solution is to pre-treat the KB for adding
Defining
The previous paragraph explained why the use of
Defining the constraint for the Subclass ODP.
A positive descriptive constraint can now be defined to ensure that
“if there is a subclass relation between two classes, this relation
must be of type
This constraint can also be translated into a negative constraint and
represented in inconsistency-implying form. That way, an OWL2-RL inference engine
can be used for checking this constraint.
The above constraint seems worth being generalized and applied
to all transitive relations used in an ontology, e.g. part relations.
Indeed, the reasons why the destinations of subtype relations are worth being
organized into subtype partitions are also reasons why the destinations
of part relations are worth being organized into part partitions.
The definition of a part partition for an object mirrors the one for a
subtype partition: it is a complete set of disjoint parts of the object;
“disjoint” meaning that the parts cannot share common subparts,
and “complete” meaning that the union of the parts is equal to the
object.
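The definition of a part partition can be made concrete with a small JavaScript sketch. It relies on a simplifying assumption that is not made in this article: each object is modelled as a set of atomic subparts, so that “disjoint” and “complete” can be tested via set operations. The function name isPartPartition is hypothetical.

```javascript
// whole: Set of the atomic subparts of the object
// parts: array of Sets, each being the atomic subparts of one part
function isPartPartition(whole, parts) {
  const seen = new Set();
  for (const part of parts)
    for (const atom of part) {
      if (!whole.has(atom)) return false;  // this "part" is not within the object
      if (seen.has(atom)) return false;    // subpart shared by two parts: not disjoint
      seen.add(atom);
    }
  return seen.size === whole.size;         // complete: the union equals the object
}

const car = new Set(["engine", "body", "wheel1", "wheel2"]);
const okPartition = isPartPartition(car,
  [new Set(["engine"]), new Set(["body"]), new Set(["wheel1", "wheel2"])]);
const incomplete = isPartPartition(car,
  [new Set(["engine"]), new Set(["body"])]);
const overlapping = isPartPartition(car,
  [new Set(["engine", "body"]), new Set(["body", "wheel1", "wheel2"])]);
```

Only the first set of parts is a part partition: the second one is incomplete and the third one has two parts sharing a subpart.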
Since OWL does not allow the use of variables referring to relation types,
OWL does not enable the specification of a unique constraint for all
transitive relations used in an ontology.
Hence, for each transitive relation type
that needs to be checked, a specific constraint must be created.
However, this creation can be automated using the next
SPARQL command. It takes as a parameter each instance of
Thus, based on the next partial definitions of
The use of
The previous examples of generated constraints were descriptive: they stated that
if transitive relations of certain general types are used,
these relations must also be of certain specific types.
A prescriptive constraint is needed for checking that if
certain kinds of relation types are declared in an ontology,
every object instance of the domain (resp. range) of such relation types
must be the source (resp. destination) of such relations.
To support the generation of such constraints, one must first indicate which these relation types are,
e.g. by typing them with the 2nd-order type Similarly, for specifying that every class not marked as a subclass leaf
must have a subclass, one can write:
Below is a SPARQL command that generates a prescriptive constraint for each
instance of This generation-based approach can also be used to check that
instances of a certain class – or,
equivalently, instances of certain classes since a class may have
subclasses – are systematically connected to at least one instance
of a certain class via relations of a certain type, e.g. a transitive relation
type or one defined via The specifications given in Section 4.1
do not enforce the use of subtype partitions or part partitions. However,
if the specifications given above in this subsection are also used, these
partitions become mandatory except for non-natural types in the case of subtype partitions.
Thus, if there are no non-natural types, doing so specifies that any two classes
in the KB are either connected by a subclass relation or an exclusion relation.
This could not be specified using only one prescriptive constraint
since RDF+OWL does not allow the explicit use of variables for relating objects
shared by both the condition and conclusion of a constraint.
4. Illustration of Applications (or Use Cases) and Limits
of the Approach By Showing How To Represent Some
Important Ontology Design Patterns
Via Relations-between-classes based Constraints
4.1. Examples of Useful General Descriptive Constraints
4.1.1. Checking or Evaluating the Design of a Subtype Hierarchy
4.1.1.1. Approach and Rationale
sub:subclass
(inverse of
rdfs:subClassOf
) seems the easiest way compatible with OWL and
the proposed approach.
This is why this article introduces the relation types
sub:nonNaturalSubclass
and sub:loneSubclass
.
The prefix “sub:
” is an abbreviation
for the namespace http://www.webkb.org/kb/it/SUB
.
SUB is the ontology declaring and organizing the types proposed in this article.
The next subsection concludes with the representation of a constraint that
enforces the above cited generalized ODP (restricted to subclasses).
This ODP, which from now on is called the “Subclass ODP”, works
by checking that each used subclassOf relation has for type a certain subtype
of rdfs:subClassOf
. The relation types
sub:nonNaturalSubclass
and sub:loneSubclass
can be used for (exceptionally) not using exclusive subclasses. The first
type also implies that the source of the relation is a non-natural class.
rdfs:subClassOf
has to be exploited by an inference engine
that cannot handle such subtypes,
a pre-treatment that expands the definition of these subtypes
– and thus removes these subtypes – must first be applied
on the ontology.
rdfs:subPropertyOf
and owl:propertyDisjointWith
but does not
allow the use of owl:disjointUnionOf
or owl:unionOf
on properties.
4.1.1.2. Implementation of the Subclass ODP with OWL2
sub:subclassInExclusionSetOrAlone
, the supertype of the
subtyping relation types allowed by the Subclass ODP. For the reasons given in
Section 2.3.3,
the Turtle notation is used.
sub:subclass owl:inverseOf rdfs:subClassOf.
sub:subProperty owl:inverseOf rdfs:subPropertyOf.
sub:subclassInExclusionSetOrAlone
  rdfs:subPropertyOf sub:subclass;
  sub:subProperty sub:subclassInDisjointUnion, #for a subclass in a subtype partition
                  sub:subclassInSetOfExclusiveSubclasses,
                  sub:loneSubclass, #for a subclass not in a set of exclusive subtypes
                  sub:nonNaturalSubclass. #idem but for a non natural type
sub:nonNaturalSubclass
.
With the following full definition of this relation type, it does not
have to be used explicitly for sub:subclassInExclusionSetOrAlone
relations to be inferred:
rdfs:subClassOf
relation can also be used instead as long as all
non-natural classes are categorized as subtypes of sub:NonNaturalClass
.
sub:nonNaturalSubclass rdfs:subPropertyOf sub:subclassInExclusionSetOrAlone;
rdfs:domain owl:Class; rdfs:range sub:NonNaturalClass.
sub:subclassInDisjointUnion
.
If the following full definition of this relation type
sub:subclassInDisjointUnion
was correct, it
would not have to be used explicitly for
sub:subclassInExclusionSetOrAlone
relations to be inferred:
owl:disjointUnionOf
relations or
equivalent representations could be used instead.
sub:subclassInDisjointUnion
owl:propertyChainAxiom (owl:equivalentClass owl:disjointUnionOf rdfs:member).
rdfs:member
relations to these
classes: these classes are supposed to be accessed sequentially from the first
(via an rdf:first
relation) to the last (via a sequence of
rdf:rest
relations followed by a rdf:first
relation).
Since OWL does not allow regular expressions or recursive definitions, it does
not allow a full definition of membership based on rdf:rest
and
rdf:first
relations. Thus, this article proposes the following three
solutions to support the checking of sub:subclassInExclusionSetOrAlone
relations and equivalent representations, via the proposed content-independent
queries.
sub:listMember
sub:subclass sub:listMember1, sub:listMember2, sub:listMember3. #and so on if needed
sub:listMember1 owl:equivalentProperty rdf:first.
sub:listMember2 owl:propertyChainAxiom (rdf:rest rdf:first).
sub:listMember3 owl:propertyChainAxiom (rdf:rest rdf:rest rdf:first). #and so on if needed
Then, the following definition of sub:subclassInDisjointUnion
is sufficient.
However, these definitions are not within OWL2-RL, OWL2-EL and OWL2-QL since
owl:disjointUnionOf
is not within them.
sub:subclassInDisjointUnion
owl:propertyChainAxiom (owl:equivalentClass owl:disjointUnionOf sub:listMember).
sub:listMember
. Representations that use this type
can still be translated into OWL2 by using the above kind of
“partial but sufficient” representations.
sub:subclassInDisjointUnion
relations based on
owl:disjointUnionOf
relations in
the KB. Here is a SPARQL1.1 update request supporting this.
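INSERT { ?t sub:subclassInDisjointUnion ?st1. #from the class to each member of its partition,
         ?t sub:subclassInDisjointUnion ?st2 } # as for sub:subclass (inverse of rdfs:subClassOf)
WHERE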
{ ?t owl:equivalentClass [rdf:type rdfs:Class;
owl:disjointUnionOf/rdf:rest*/rdf:first ?st1;
owl:disjointUnionOf/rdf:rest*/rdf:first ?st2]
FILTER (?st1 != ?st2)
}
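The update request above can be tried on a small KB. The following SPARQL 1.1 request is only a sketch for doing so: the ex: namespace and the three class names are hypothetical, and the blank node is explicitly typed with rdfs:Class because the WHERE clause above requires it.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX ex:   <http://example.org/>   # hypothetical namespace, for illustration only

# ex:Animal is declared equivalent to the disjoint union of two classes.
INSERT DATA
{ ex:Animal owl:equivalentClass
    [ rdf:type rdfs:Class;
      owl:disjointUnionOf (ex:Vertebrate ex:Invertebrate) ] .
}
```

Once this data is loaded, running the update request should relate ex:Animal to ex:Vertebrate and to ex:Invertebrate by sub:subclassInDisjointUnion relations, since the path rdf:rest*/rdf:first reaches every member of the list whatever its length.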
sub:subclassInSetOfExclusiveSubclasses
.
If the following full definition of this relation type was correct,
it would not have to be used explicitly for
sub:subclassInExclusionSetOrAlone
relations to be inferred:
owl:disjointUnionOf
relations or
equivalent representations could be used instead.
sub:subclassInSetOfExclusiveSubclasses owl:propertyChainAxiom
(sub:subclass owl:equivalentClass owl:disjointUnionOf rdfs:member).
rdfs:member
relations is incorrect and how there are three solutions
to fix it. In the current case, the partial but sufficient definition is:
sub:subclassInSetOfExclusiveSubclasses owl:propertyChainAxiom
(sub:subclass owl:equivalentClass owl:disjointUnionOf sub:listMember).
sub:subclassInExclusionSetOrAlone
”.
An equivalent constraint, albeit class-focused instead of relation-focused, is:
“if a class C1 has a subclass relation to a class C2, this relation must
be of type sub:subclassInExclusionSetOrAlone
”.
However, OWL does not allow the use of variables to refer to the relation or to C2.
The OWL compatible version of this constraint is then
“if a class C1 has a subclass relation,
all subclass relations from C1 must be of type
sub:subclassInExclusionSetOrAlone
”.
If “some” was used instead of “all”, not all
the subclass relations from C1 would be checked.
Using OWL2 and the CSTR types, here is a way to write this constraint without
having to define a second-order class.
The type owl:AllDisjointClasses
is allowed in OWL2-EL, OWL2-QL and OWL2-RL.
However, although an OWL2-RL inference engine can check if an expression using this
type is respected, this engine cannot use this type for inferring whether or not
an object is instance of the conclusion class below.
[] rdf:type cstr:Descriptive_constraint;
cstr:condition_class # if C is a class that has a subclass ...
[rdf:type owl:Class;
owl:equivalentClass [rdf:type owl:Restriction; #"any class that has a subclass"
owl:onProperty sub:subclass; owl:someValuesFrom rdfs:Class] ];
cstr:conclusion_class #... then C has no subclass relation that is not
# of type sub:subclassInExclusionSetOrAlone
sub:ClassWithNoRelationOfType_subclassButNot-subclassInExclusionSetOrAlone.
#(=> all subclass relations of type sub:subclassInExclusionSetOrAlone)
#with:
[] rdf:type owl:AllDisjointClasses;
owl:members #this relation type is usable even though rdfs:member is not
# usable for accessing each member of the destination list!
(sub:ClassWithNoRelationOfType_subclassButNot-subclassInExclusionSetOrAlone
sub:ClassWithSomeRelationOfType_subclassButNot-subclassInExclusionSetOrAlone).
sub:ClassWithSomeRelationOfType_subclassButNot-subclassInExclusionSetOrAlone
owl:equivalentClass
[rdf:type owl:Restriction;
owl:onProperty sub:subclassButNot-subclassInExclusionSetOrAlone;
owl:someValuesFrom owl:Class].
sub:subclassButNot-subclassInExclusionSetOrAlone
rdfs:subPropertyOf sub:subclass;
owl:propertyDisjointWith sub:subclassInExclusionSetOrAlone.
[] rdf:type cstr:Descriptive_constraint;
cstr:condition_class # if C is a class that has a subclass relation that is not
# of type sub:subclassInExclusionSetOrAlone
sub:ClassWithSomeRelationOfType_subclassButNot-subclassInExclusionSetOrAlone;
cstr:conclusion_class owl:Nothing. #... then there is a problem
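For debugging, the relations that violate the Subclass ODP can also be listed directly. The query below is a minimal sketch and not one of the content-independent queries of Section 3: it hard-codes the constraint, and it assumes that sub:subclassInExclusionSetOrAlone relations have already been asserted or materialized (by an inference engine or by update requests). The “#” separator in the sub: prefix IRI is also an assumption.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX sub:  <http://www.webkb.org/kb/it/SUB#>   # "#" separator is an assumption

# Lists each subclass relation that is not also of the type
# recommended by the Subclass ODP.
SELECT ?class ?subclass
WHERE
{ ?subclass rdfs:subClassOf ?class .
  FILTER (?subclass != ?class)
  FILTER NOT EXISTS { ?class sub:subclassInExclusionSetOrAlone ?subclass }
}
```

If the queried engine applies RDFS entailment, indirect subclass relations are matched too, so the result may then be larger than the set of relations actually asserted in the KB.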
4.1.2. Generalization To Check the Use of Any Transitive Relation
owl:TransitiveProperty
specified in the ontology,
i.e. each transitive relation type ?trRelType
,
e.g. sub:subclass
and sub:part
which is the
inverse of the general partOf relation type.
From each ?trRelType
, via a
sub:aloneOrInPartitionTrRelType
relation
the command accesses the supertype of recommended relation types to use
when representing ?trRelType relations.
For example, for sub:subclass
relations, the previous section showed
that this recommended supertype can be
sub:subclassInExclusionSetOrAlone
. If subtyping should rather be
done only via subtype partitions, the recommended supertype would rather be
sub:subclassInPartition
(same definition as the previous type but
without the last three lines). For sub:part
relations, the
recommended supertype can be sub:partInPartition
.
INSERT
{ [] rdf:type cstr:Descriptive_constraint;
cstr:condition_class #class for the source of a transitive relation
[?TrRelTypeDomain owl:equivalentClass
[rdf:type owl:Restriction; #class source of a transitive relation
owl:onProperty ?trRelType; #e.g., a sub:part relation
owl:someValuesFrom owl:Thing] ]; #any thing (type or individual)
cstr:conclusion_class
[?conclusion_class owl:equivalentClass
[rdf:type owl:Restriction;
owl:onProperty ?aloneOrInPartitionTrRelType; #e.g., sub:partInPartition
owl:someValuesFrom owl:Thing] ]
}
WHERE{ ?trRelType rdf:type owl:TransitiveProperty;
sub:aloneOrInPartitionTrRelType ?aloneOrInPartitionTrRelType;
rdfs:domain ?TrRelTypeDomain #e.g., owl:Thing (see below)
}
sub:subclass
and sub:part
, the above command generates
i) a descriptive constraint for checking sub:part
relations, and
ii) a descriptive constraint for checking sub:subclass
relations.
This last generated constraint is different from the one given in the previous
subsection, for two main reasons.
sub:subclassInExclusionSetOrAlone
(or, more exactly,
a partial but sufficient definition of this type) but
OWL does not permit a similar definition for sub:partInPartition
nor for any other transitive type.
However, if needed, sub:partInPartition
relations and other
transitive relations can be generated via SPARQL update requests similarly to
the way relations of type sub:subclassInDisjointUnion
can be generated, as seen in the previous subsection.
log:implies
based approach described in
Section 3.5.
Otherwise, these constraints are ambiguous, as the OWL compatible version
in the previous subsection would be if “some” was used instead of “all”.
However, as noted, Description Logic inference engines generally do not
deduce log:implies
relations between statements (triples, ...)
and hence cannot be reused with this solution.
sub:subclass rdf:type owl:TransitiveProperty; rdfs:domain rdfs:Class;
sub:aloneOrInPartitionTrRelType sub:subclassInExclusionSetOrAlone.
sub:part rdf:type owl:TransitiveProperty; rdfs:domain owl:Thing;
sub:aloneOrInPartitionTrRelType sub:partInPartition.
owl:Thing
in this last example may be a problem for some
classic Description Logic inference engines. For example, owl:Thing
cannot be used as source or destination of an rdfs:subClassOf
relation
in OWL2-RL [OWL 2 profiles].
It is therefore preferable to use a class other than owl:Thing
as
the domain of a transitive relation type. It is also often easy and relevant,
as in using sub:Spatial_entity
for the domain of
sub:spatial_part
.
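Since the WHERE clause of the generation command above requires both an rdfs:domain relation and a sub:aloneOrInPartitionTrRelType relation, a transitive relation type lacking either of them yields no constraint and is thus silently left unchecked. The following query is a sketch for listing such relation types before running the command; the “#” separator in the sub: prefix IRI is an assumption.

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX sub:  <http://www.webkb.org/kb/it/SUB#>   # "#" separator is an assumption

# Transitive relation types for which no descriptive constraint
# would be generated, because a required declaration is missing.
SELECT ?trRelType
WHERE
{ ?trRelType rdf:type owl:TransitiveProperty .
  FILTER NOT EXISTS
  { ?trRelType rdfs:domain ?domain;
               sub:aloneOrInPartitionTrRelType ?recommendedSupertype }
}
```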
4.2. Examples of Useful General Prescriptive Constraints
4.2.1. Checking that Certain Relation Types Are Systematically Used
From or To Objects of Certain Types
sub:MandatoryOutRelationType
(or sub:MandatoryInRelationType
).
From such relation types, relations of type sub:leafObjectType
can also
be used for indicating the type of objects (classes or individuals) that,
as an exception, must not be checked. As an example, the following representation
states that every dividable object
– i.e. every instance of sub:DividableThing
–
must be the source of a sub:part
relation except
for each object instance of sub:PartDestLeaf
.
By stating such an exception, ontology creators can make explicit their choice of
not using a certain object as source of a sub:part
relation even when it is possible.
sub:part rdf:type sub:MandatoryOutRelationType; rdfs:domain sub:DividableThing;
  sub:leafObjectType sub:PartDestLeaf.
sub:subclass rdf:type sub:MandatoryOutRelationType; rdfs:domain owl:Class;
  sub:leafObjectType sub:SubclassDestLeaf.
sub:MandatoryOutRelationType
.
In this command, each occurrence of “out” must be replaced by
“in” if the same is to be done for each instance of
sub:MandatoryInRelationType
.
When the prescriptive constraint generated by this command
is about relations which are not between classes,
e.g. sub:part
relations,
a classic Description Logic inference engine can be reused to check this
constraint, although this may require a temporary removal of
relations from types such as sub:leafObjectType
relations.
INSERT
{ [] rdf:type cstr:Prescriptive_constraint;
cstr:condition_class #source class of mandatory "out relation"
[?MandatoryOutRelTypeDomain owl:equivalentClass
[rdf:type owl:Class; #?MandatoryOutRelTypeDomain minus ?OutRelLeafDestType
owl:intersectionOf ( ?MandatoryOutRelTypeDomain
[rdf:type owl:Class;
owl:complementOf ?OutRelLeafDestType] ) ] ];
cstr:conclusion_class
[?conclusion_class owl:equivalentClass
[rdf:type owl:Restriction; #there must be a relation of type ?outRelType
owl:onProperty ?outRelType; owl:someValuesFrom owl:Thing] ]
}
WHERE { ?outRelType rdf:type sub:MandatoryOutRelationType;
sub:leafObjectType ?OutRelLeafDestType;
rdfs:domain ?MandatoryOutRelTypeDomain
}
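As an illustration of what the constraint generated for sub:part amounts to under a closed-world reading, the following query lists the dividable objects that are neither declared as leaves nor the source of a sub:part relation. It is only a sketch: it hard-codes one constraint instead of relying on the content-independent queries of Section 3, it assumes that the rdf:type relations to sub:DividableThing and sub:PartDestLeaf are asserted or materialized, and the “#” separator in the sub: prefix IRI is an assumption.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sub: <http://www.webkb.org/kb/it/SUB#>   # "#" separator is an assumption

# Dividable objects that should have a part but have none,
# excluding the objects explicitly marked as leaves.
SELECT ?object
WHERE
{ ?object rdf:type sub:DividableThing .
  FILTER NOT EXISTS { ?object rdf:type sub:PartDestLeaf }
  FILTER NOT EXISTS { ?object sub:part ?part }
}
```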
owl:propertyChainAxiom
.
For example, to generate a constraint checking that every instance of
sub:Component
is related via sub:part
relations to
an instance of sub:Component
, one can write:
sub:partOf rdf:type sub:MandatoryOutRelationType;
rdfs:domain sub:Component; rdfs:range sub:DividableThing.
The originality of the approach proposed in this article is that it enables
i) the representation of constraints independently of their exploitation (the
latter being specified within content-independent queries),
ii) the representation of both descriptive and prescriptive constraints with
any KRL the expressiveness of which is at least equal to RDF or RDFS,
and hence
iii) the exploitation of most inference engines, especially via SPARQL queries.
Since the proposed approach relies on other methods and tools chosen by
each user of the approach, it inherits from their theoretical or practical
improvements. It would thus not be relevant to focus on theoretical aspects of
a particular method or tool in this article. For a general comparison,
[Baget et al., 2015] and
[Swan, 2016]
list theoretical points relevant to the proposed approach.
Regarding the use of SPARQL to check constraints,
[Tao et al., 2010]
shows that SPARQL can be used for both expressing and validating
integrity constraints based on some partial forms of the
Unique Name Assumption and Closed World Assumption.
It also shows that this validation is sound and complete when
the expressiveness used for the constraints and the rest of the KB are
respectively only
“SROIQ and SRI” or only “SROI and SROIQ”.
In the proposed approach, queries are used only for validating constraints,
not expressing them, but this is only a generalization of the
approach of [Tao et al., 2010] which does not change the
associated theoretical results.
In [Tao et al., 2010], the used partial forms of the
Unique Name Assumption and Closed World Assumption
are specified in SPARQL via its operator
The proposed approach was validated experimentally by testing the degree to
which a few constraints
– including all those presented in
this article – were followed in
i) the “family relationship” focused sample ontology given in the
OWL2 Primer W3C document and
ii) a few ontologies from
LOV, a Linked Data repository.
The validation came from finding the right constraint violations and the right
degrees with the proposed queries and, when necessary, preliminary
update requests.
Besides testing these constraints, queries and requests, another goal of this
validation phase was to represent ontology design patterns or best
practices (ODPs) as constraints. As shown by the previous sections,
the proposed kinds of descriptive or prescriptive constraints are ways
to represent ODPs that are collections of types that
should be reused whenever possible.
However, during the above cited validation phase, no ODP satisfying the
two following criteria was found:
i) the ODP could be implemented via a constraint, and
ii) the ODP was likely not to lead to a completeness degree close to
0% for a randomly chosen ontology.
More generally, no widely followed ODP was found.
Querying a KB for detecting anti-patterns in it is analogous to querying it
for detecting violations of ODPs in it. However, like the SPARQL based works of
[Roussey & Zamazal, 2013], many works on anti-pattern
detection use queries essentially as a way not to use an expressive inference
engine for detecting certain problems.
Instead, the proposed approach exploits inference engines.
With a sufficiently powerful KRL,
any anti-pattern can be expressed as a negative constraint in
inconsistency-implying form.
The introduction of this article summarized the strong distinction that exists
between constraint-based completeness and
represented-world-based completeness, and hence the reason why it would
not be relevant to further compare the proposed approach with those of tools such as
SWIQA and Sieve.
Since the proposed approach is based on a particular use
of RDFS, it should be compared to SHACL
and SPIN. Some transformation languages or systems exploit KRs.
[Zamazal & Svátek, 2015] and [Corby & Faron-Zucker, 2015] present
such systems. Although few of them explicitly have a function
that detects KR patterns without also transforming the matched KRs
(e.g., PatOMat [Zamazal & Svátek, 2015] is an exception),
these languages or systems could easily be adapted to have such a
function and hence be used for handling prescriptive constraints.
However, all such systems appear to use rule-based languages with more
expressiveness than what relations-between-classes based constraints allow.
Typically, these languages allow the direct and explicit use of variables
for relating objects shared by both the condition and conclusion of a rule.
Indeed, using such languages can simplify the writing of prescriptive constraints.
However, regarding what can be expressed and checked via
constraints, this article shows that
i) much can be achieved simply using relations-between-classes
based constraints and SPARQL1.1, and
ii) the power of the proposed approach then relies on the power of the
inference engine used for object matching, rather than on the
used language.
Some transformation systems, like PatOMat [Zamazal & Svátek, 2015],
issue SPARQL queries for detecting patterns, based on non-SPARQL specifications
for patterns and their transformations.
Some other transformation systems directly propose an extension of SPARQL
such as STTL [Corby & Faron-Zucker, 2015] to write specifications for
patterns and their transformations. For instance, as shown in
[Corby, Faron-Zucker & Gazzotti, 2016], STTL can be combined with
LDScript [Corby, Faron-Zucker & Gandon, 2017] to specify STTL queries
(compiled into SPARQL queries) for detecting patterns and then transforming
the results. However, [Corby, Faron-Zucker & Gazzotti, 2016]
does not discuss the exploitation of object matching capabilities of
inference engines and it does not distinguish between prescriptive constraints
and non-prescriptive ones.
To sum up, the proposed SPARQL commands, including those generating
prescriptive constraints, could also be reused in these transformation systems,
although in an adapted form.
Alternatively, using scripts may be more flexible, as shown with JavaScript in
Section 3.7.
This work will continue in both directions, respectively using
STTL+LDScript and JavaScript, especially for prescriptive constraints related
to the use of given KRL models and KRL notations. To date, bases for this
future work are mostly unrelated and discussed in
[Corby, Faron-Zucker & Gazzotti, 2016] and
[Martin & Bénard, 2017a, 2017b].
5. Evaluation and Comparisons
NOT EXISTS
and relations of type
owl:sameAs
or owl:differentFrom
.
These forms can similarly be expressed via the commands seen in
Section 3 and the use of relations of
type owl:sameAs
or owl:differentFrom
in the constraints.
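As a rough illustration of such a partial Unique Name Assumption, the query below reports objects that have two values for a relation meant to be functional, treating two values as distinct unless an owl:sameAs relation is stated between them. This is a sketch in the spirit of the above, not the translation defined in [Tao et al., 2010]; ex:hasMother and the ex: namespace are hypothetical.

```sparql
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX ex:  <http://example.org/>   # hypothetical namespace, for illustration only

# Pairs of values of ex:hasMother for the same object that are
# not stated to denote the same individual.
SELECT ?x ?value1 ?value2
WHERE
{ ?x ex:hasMother ?value1, ?value2 .
  FILTER (STR(?value1) < STR(?value2))   # keeps one ordering of each pair
  FILTER NOT EXISTS { ?value1 owl:sameAs ?value2 }
  FILTER NOT EXISTS { ?value2 owl:sameAs ?value1 }
}
```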
spin:rule
and spin:constraint
,
the (possibly recursive) calls of SPARQL queries or JavaScript functions
for adding nodes or values to the KB.
Thus, SPIN enables procedural attachments in a KB and thereby
also supports the extension of SPARQL. However, the use of SPIN requires
a SPIN aware engine.
The approach proposed in this article is KRL independent (and hence
not based on procedural attachments). SPIN could be used
for storing the SPARQL commands (queries and update requests)
proposed in Section 3 and
Section 4, thus not only
procedurally defining the types proposed for constraints but also
providing a way to trigger such commands
automatically.
SPIN can also be used for checking constraints in other ways that are
less modular (i.e., not using content-independent queries) or less
logic-based (i.e., more procedural), hence in ways that offer
fewer possibilities for knowledge comparison, translation, inferencing,
reuse or, more generally, exploitation. The widespread use of such other ways
may be a reason why SHACL has been designed. This article provides a
less restricted alternative.
The author also works on a knowledge translation tool exploiting
ontology based specifications of conversions, including for constraints.
This article focuses on the original goal of supporting the checking of
constraint-based completeness
via a few predefined content-independent queries or procedures
as opposed to creating special-purpose queries for each constraint
or ad hoc procedures not reusing inference engines.
An example application is checking compliance with
ontology design patterns (ODPs),
KB design libraries (e.g., the KADS library [Breuker & van de Velde, 1994]) or
top-level ontologies (e.g., DOLCE) in order to validate the quality of a KB or,
during its design, help elicit missing knowledge from experts.
Section 2 answers the first two research questions:
what kinds of constraints need to be considered for evaluating constraint-based
completeness via content-independent queries, and
how to represent constraints in any KRL that has an expressiveness at
least equal to RDF or RDFS?
Section 2 does so via complementary means. Section 3 answers the third research question:
how to implement the general approach with query languages such as SPARQL or slight
extensions of it as well as in procedural languages?
For query languages, Section 3 shows that only a few (kinds of)
content-independent queries are necessary and that pre-treatments on the KB are
a way to counter-balance some lack of expressiveness of some query languages
as well as implementing inference bypassing methods.
For procedural languages, Section 3.7
shows that more control and flexibility are possible at lower costs.
Section 4 provides an ODP for the use of
subClassOf relations and generalizes
it into a multi-inheritance based ODP that
i) works on any transitive relation,
ii) keeps the advantages of the single inheritance based ODP, and
iii) is more flexible and not dependent on special tools.
More importantly, Section 4 shows how these ODPs can be represented as
descriptive constraints exploitable by content-independent queries.
Section 4 also shows how prescriptive constraints can be generated for
checking the systematic use of all the subtypes of a certain type.
Section 5 evaluates the proposed techniques and
compares the approach to other ones.
A complement to this work will be to
i) represent ODPs in several research areas
(knowledge sharing, cooperation, security, etc.),
using only relations between classes whenever possible,
ii) organize them by relations of specialization or other kinds, and
iii) test these ODPs via STTL+LDScript and JavaScript.
The resulting analyses will be added to this article.
6. Conclusion
Many thanks to Dr Olivier Corby (member of the Wimmics and SPARKS teams of,
respectively, the INRIA and I3S CNRS laboratories at the University
Côte d'Azur, France) for his questions and remarks on the approaches
presented in this article and his help during the implementation of these
approaches with SPARQL and LDScript via the Corese tool.
The interesting feedback provided by reviewers of published parts of
this article was also much appreciated.
7. Acknowledgments
8. References
ftp://www.ksl.stanford.edu/pub/KSL_Reports/KSL-92-86.ps.gz
https://pdfs.semanticscholar.org/590c/e95343370578dd1d357319c84d1dc7ff8fac.pdf