This article strongly extends
our article published in the proceedings of ECKM 2017.
Abstract.
There are many knowledge representation languages (KRLs), i.e., many
KRL notations and KRL abstract structure models. They suit different
needs. E.g., knowledge modeling and sharing require expressive and
concise KRLs to support and ease the entering of precise
knowledge. Many KRLs are more suited to knowledge exploitation with
computational tractability constraints. Current KR-based tools
– including KR translators – allow the use of
only one or a few KRLs, and hardly allow their end-users to adapt these
KRLs to their needs, e.g., the need to exploit even ad hoc
KRs. Indeed, some systematic ad hoc usages can be automatically
interpreted. Finally, it is difficult to compare KRLs and KRs
according to criteria or KRL related best practices.
The approach presented in this article addresses these problems by
answering an original research question: “can KR import or export
methods be specified in a generic way and, if so, how can they and
their resources be specified?”.
The approach is based on an ontology of KRLs, hence on KRs about
KRLs. It is named KRLO. It has three originalities:
i) it represents very different KRL abstract models in a uniform way,
ii) it represents KRL notations, and
iii) it specifies methods for importing and exporting KRs,
and hence also translating them.
This article presents principles and uses for this approach. We have
built JavaScript functions and tools that import and export KRs by
exploiting KRLO and a parser generator. For these tools to use new
KRLs or KRL presentations, their end-users can add or adapt
specifications in KRLO. Other tools can use these tools or functions
as Web services or modules. No translator between each pair of KRLs
needs to be written. At least for export purposes, KRLO can also be
exploited via inference engines for OWL2 or Datalog, or via simple
path retrieval mechanisms, e.g., via SPARQL queries.
Categories and Subject Descriptors
I.2.4 [Artificial Intelligence]:
Knowledge Representation Formalisms and Methods - Representation languages.
General Terms: Languages, Management, Standardization.
Keywords:
Knowledge representation languages (KRLs), Ontology of KRLs,
Knowledge parsing/transformation/integration/sharing, Language technologies,
KRL syntactic/structural/semantic translation, Semantic interoperability.
1. Introduction
2. Introductory Examples of Content and Uses of KRLO
  2.0. Terminology and Conventions
  2.1. Examples of Relations Between Elements of Different Logics
  2.2. Methods Specified in KRLO and Associated Tools
  2.3. Examples of Translations Between Different KRLs
  2.4. Examples of Translations Between Ad-hoc Representations
  2.5. Examples of Transformations Or Evaluations With Respect To Some Best Practices
  2.6. Examples of Web Interfaces and Uses in Documents
3. Principles and Examples For the Specification of Abstract Elements
  3.1. For Abstract Elements
  3.2. For Translating between KRLs
4. Principles and Examples For the Specification of Concrete Elements
5. Principles and Examples For the Specification of Parsing/Formatting/Translating Methods
  5.1. For Importing Knowledge
  5.2. For Exporting into a KRL
  5.3. For Translating Between KRLs
6. Validation of KRLO and Non-declarative Implementation of Tools Exploiting KRLO
7. Comparison With Other Approaches and Kinds of Use Cases
  7.1. For Importing Knowledge
  7.2. For Translating Between KRLs
  7.3. For Exporting into a KRL
8. Conclusion
9. References
KRLs are languages for representing information in logic-based forms – knowledge representations (KRs) – within knowledge bases (KBs). KRs can be exploited by inference engines or KB management systems (KBMSs), e.g., for precision-oriented information sharing or retrieval and problem solving. The W3C has popularized the use and interconnection of “KRs on the Web”, which it calls “Linked Data”. The “Semantic Web” is the set of representations that use KRLs advocated by the W3C.
Many KRLs exist.
A single one would not be adequate for every kind of
knowledge modelling or exploitation, nor for every person or tool.
An expressive KRL with a rich and concise textual notation is useful for
modelling and sharing complex information such as
i) the content of some natural language sentences, or
ii) an ontology that precisely defines complex types of concepts or relations.
To such ends, it is preferable or necessary to use KRLs with a second-order
notation and syntactic sugar for meta-statements and numerical quantifiers.
On the other hand, less expressive KRLs can be simpler to learn and they
have better computational properties that some applications require.
This is why the W3C proposes different KRLs, all of which are less expressive
than First-Order Logic (FOL). However, expressiveness restrictions also
lead people not to represent some knowledge or to represent it in biased ways.
For knowledge reuse purposes, this is problematic because many applications
can benefit from the unbiased knowledge. E.g.,
i) they can use more powerful inference engines,
ii) they use very expressive KRs only for KB checking purposes, or
iii) they are information retrieval applications only using
structure matching techniques
instead of complete and consistent deduction techniques.
Thus, many KBs use expressive KRLs and many applications require handling many KRLs.
More generally, for knowledge entering, reuse and interoperability purposes,
importing, exporting or translating KRs expressed in different KRLs
is needed, especially on the Web.
KRLs may have abstract models, e.g., the W3C Resource Description Framework (RDF), the W3C OWL2 Web Ontology Language model, and Common Logics (CL) [CL 2007], the ISO/IEC standard for KRLs based on FOL. These are abstract data structure models, such as the models or meta-models of Model Driven Engineering (MDE). They are not the theory models of model-theoretic semantics. Different abstract models may follow different logics, e.g., FOL or the SHOIN(D) description logic.
An abstract model may be formally (re-)presented with different
notations, e.g., the Knowledge Interchange Format
(KIF), Turtle or XML-based notations.
Notations are also called concrete models, concrete syntaxes or formal
presentations.
From now on, unless preceded by “concrete”, the word “model”
refers to an abstract model.
Models and notations are themselves KRLs: a KRL is
a model and/or a notation.
In this article, an element is a KRL element.
A concrete element (CE) is a notation element, e.g.,
an infix or prefix representation,
as in “3 = 2 + 1” and “= (3 +(2 1) )”
or “(= 3 (+ 2 1))”.
An abstract element (AE) is an element of a model.
Conversely, a model is a set of AEs. An AE may be
i) a formula, i.e., something denoting a fact,
ii) an abstract term, i.e., something denoting a logic object,
e.g., a constant, a variable, a function call, or
iii) a symbol, e.g., one for a quantifier, variable or constant.
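The distinction between an AE and its possible CEs can be sketched in Python. This is only a minimal illustration under stated assumptions, not the representation used in KRLO: the tuple encoding and the function names `to_infix` and `to_prefix` are hypothetical. One abstract term is rendered into two of the notations quoted above.

```python
# Hypothetical encoding (an assumption of this sketch): an abstract element
# is either an atom (a string) or a tuple (operator, arg1, arg2, ...).
AE = ("=", "3", ("+", "2", "1"))

def to_infix(ae):
    """Render a binary AE as an infix concrete element."""
    if isinstance(ae, str):
        return ae
    op, left, right = ae
    return f"{to_infix(left)} {op} {to_infix(right)}"

def to_prefix(ae):
    """Render an AE as a LISP-like prefix concrete element."""
    if isinstance(ae, str):
        return ae
    op, *args = ae
    return "(" + " ".join([op] + [to_prefix(a) for a in args]) + ")"

print(to_infix(AE))   # 3 = 2 + 1
print(to_prefix(AE))  # (= 3 (+ 2 1))
```

The same AE yields both CEs, which is why a notation can be described separately from the model it presents.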
Importing KRs is done by a syntactic parser and a semantic analyser. One input of the parser is a file, e.g., containing a text or a graphic. Another input is a concrete grammar, i.e., a concrete model. The parser outputs a collection of CEs, almost always organized into a Concrete Syntax Tree (CST). It may also output syntax-related AEs, typically organized into an Abstract Syntax Tree (AST). From these CEs or AEs, the semantic analyser creates semantically structured AEs, typically organized into an Abstract Semantic Graph (ASG). If these AEs are not the ones required by the importing tool, a translation to other AEs also occurs. The parser may also directly be a semantic analyser, e.g., when the input notations are homo-iconic to the model of the KRL used in the inputs. A KRL is homo-iconic, or a KRL notation is homo-iconic to a KRL model, when its “CST less the CEs that are not syntactic sugar for AEs (e.g., parentheses)” is isomorphic to the AST or ASG for the model. This happens when the notation forces its user to write structured CEs using the names of their related AEs, as with XML, LISP, N-Triples and Prolog. With LISP, all structured AEs can be assimilated to functions. In N-Triples, all structured AEs are binary relations. In Prolog, all structured AEs can be assimilated to relations. Prolog allows the use of “:-” as an abbreviation for an implication relation, and allows the writing of this relation in an infix form – while other relations must be written in a prefix form – but the above definition still holds. On the other hand, RDF/XML, Turtle and JSON-LD are not homo-iconic notations for RDF.
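The notion of homo-iconicity can be illustrated with a small Python sketch for a LISP-like notation. It is a simplified illustration, not the generic parser described in this article: the functions `tokenize`, `parse_cst` and `cst_to_ast` are hypothetical names, and well-formed input is assumed. The CST keeps the parenthesis tokens, and dropping them directly yields a tree with the shape of the AST.

```python
import re

def tokenize(text):
    """Split a LISP-like string into parentheses and atoms."""
    return re.findall(r"[()]|[^\s()]+", text)

def parse_cst(tokens):
    """Build a concrete syntax tree that keeps the parenthesis tokens.
    Assumes well-formed input (balanced parentheses)."""
    token = tokens.pop(0)
    if token != "(":
        return token                      # an atom
    node = ["("]
    while tokens[0] != ")":
        node.append(parse_cst(tokens))
    node.append(tokens.pop(0))            # the closing ")"
    return node

def cst_to_ast(cst):
    """Drop the CEs that do not stand for AEs (here, the parentheses)."""
    if isinstance(cst, str):
        return cst
    return [cst_to_ast(child) for child in cst if child not in ("(", ")")]

cst = parse_cst(tokenize("(= 3 (+ 2 1))"))
ast = cst_to_ast(cst)
print(cst)  # ['(', '=', '3', ['(', '+', '2', '1', ')'], ')']
print(ast)  # ['=', '3', ['+', '2', '1']]
```

For a notation that is not homo-iconic to its model, `cst_to_ast` would need restructuring steps rather than a simple filtering of tokens.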
Exporting KRs expressed in a KRL goes in the reverse direction. This process is sometimes called unparsing or formatting.
Translating KRs is translating their logic and/or non-logic objects. Thus, if the source and target KRLs are different, KRL translation first occurs, e.g., between CL AEs and RDF+OWL AEs or between CLIF CEs and RDF+OWL/XML CEs. Then, KR content translation occurs if the source and target content ontologies are different, i.e., if different non-logic objects or different names for these objects are used for expressing the source and target KRs. This is for example the case when translating between physical units. KR content translation exploits relations between objects in the source and target content ontologies, e.g., between one medical ontology and another. Such relations may come from ontology mapping. KR translation, i.e., the translation of KRLs or KR content, may be directly between CEs. More flexibility and genericity is achieved when translation between AEs is involved, i.e., when the import, translation and export processes are separated. Thus, current research works on translation focus on translation between AEs. However, they assume that the import and export processes are done separately, via other techniques.
This article is not about KR content translation. It is about the KRL related part of importing, translating and exporting KRs. The advantages of our approach first come from its exploitation of an ontology of KRLs in each of the last three processes. Furthermore, these advantages also come from the three originalities of the particular exploited ontology of KRLs. We named it KRLO (KRL Ontology). It is the first to represent KRL abstract models of different families in a uniform way, e.g., the RDF+OWL models and the CL model. KRLO is also the first to include an ontology that defines KRL notations. Finally, KRLO is the first to include rules and functions specifying default methods for input, translation and export purposes. Thus, this article is also about this ontology. We have designed some tools to help its exploitation. Our approach, KRLO and these tools are meant to ease the implementation of applications handling many KRLs, including KRLs specified by end-users. These tools are: a generic KRL parser and semantic analyser, a KRL translation Web server based on it, and a Web server allowing its users to complete KRLO, for example with new models and notations. This last server directly reuses our shared KB server WebKB-2. Its KB sharing protocols have already been published [Martin, 2011] and hence will not be presented in this article. KRLO and these servers are accessible from http://www.webkb.org/kb/it/o_KR/o_KRL/.
Section 2 gives use cases for our approach and situates it with respect to other ones. Section 3 gives examples and principles for the specification of KRLs. Section 4 introduces the default rules or functions represented in KRLO for importing, translating and exporting knowledge. Section 5 concludes.
The following terminology and conventions (set in a smaller font in the original layout) are not absolutely necessary for understanding the rest of the article: they are only a plus.
The terminology used in KRLO and in this article is derived from the terminologies used by research communities in knowledge representation, model-based approaches and formal language parsing. The most common and precise terms have been reused. The elements of a KRL model are called “abstract terms” – from now on, abbreviated as “ATs” – while the elements of a notation are called “concrete terms”. A KB can be seen as a set of (quantified or not) ATs with semantic relations between them. Quantifiers, variables, values, functions, relations or statements are themselves such ATs.
A “link” is an instance of a “binary relation type”. In OWL, such a type is an instance of the 2nd-order type “owl:ObjectProperty” or “owl:DatatypeProperty”. As illustrated by these last two identifiers, in the text of this document, as in most W3C notations, “:” is used as separator between the namespace identifier shortcut (here “owl”) and the identifier within the namespace. However, in FL and hence in some examples within tables or figures below, the separator used is “#” because “:” is used for separating a link and its destination node, as in most frame-based notations. A frame is a statement composed of several links connected to the same source node. A link source/destination node is called a “concept node” and either is a named individual (a named term that is not a type) or refers to one or several individuals (e.g., by using a type and a quantifier; in RDF, the quantifier is implicitly the existential quantifier and an unnamed resource node is called a “blank node”). Named terms that are neither individuals nor relation types are called “concept types” (“classes” in RDF). In the examples of this article, the default namespace is for the types introduced by KRLO.
For readability purposes, in KRLO and this article, each name for a concept type or individual is a nominal expression beginning with an uppercase letter, as in “Model” and “KRL_Model”. In this article, the name of a relation type introduced by KRLO begins with “has_” (in the general case) or “rc_” if this is a type of link having a concrete term as destination (the only exception is the relation type named “rc” which connects an AT to one of its possible CTs). In KRLO, the name of a relation type introduced by KRLO begins with “r_” or “rc_”. Thus, in the illustrations of this article, all the names not following these conventions and not prefixed by a namespace are KRL keywords. Within nominal expressions, “_” and “-” are used for separating words. When both are used, “-” connects words that are more closely associated. Since nominal expressions are used for the introduced types, the common convention for reading links in graph-based KRLs can be used, i.e., links of the form “X R: Y” can be read “X has as R Y”. However, when a keyword such as “of” is used for reversing the direction of a link, the form “X R of: Y” should rather be read “X is the R of Y”. The syntactic sugar of Formalized English (FE) [Martin, 2002] – a KRL notation – makes this reading convention explicit. Since following this convention leads to the use of nouns instead of verbs and adjectives as type names, it normalizes representations. It also eases the categorization and reading of statements, e.g., the concept nodes “any Small_entity” and “2 Transformation” are easier to read and categorize than the concept nodes “any Small” and “2 Transform”.
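The reading convention for links can be made concrete with a short Python sketch. The function `read_link`, its parameter `reversed_link` and the relation names “part” and “agent” are hypothetical and only serve this illustration.

```python
def read_link(source, relation, destination, reversed_link=False):
    """Verbalize a link following the reading convention described above."""
    if reversed_link:
        # Form "X R of: Y", read "X is the R of Y".
        return f"{source} is the {relation} of {destination}"
    # Form "X R: Y", read "X has as R Y".
    return f"{source} has as {relation} {destination}"

print(read_link("any Bird", "part", "2 Wing"))
# any Bird has as part 2 Wing
print(read_link("any Bird", "agent", "a Flight", reversed_link=True))
# any Bird is the agent of a Flight
```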
In FL, if a link is neither a subtype link nor another “link from a type”, its source node is quantified and its default quantifier is “any”. This one is the “forall” quantifier for definitions: the link represents a necessary condition for all instances of the node type. A destination node can also be the source of links if they are delimited by parentheses.
Given these explanations, the content of the tables in this article can now be read. Every keyword not introduced above will be explained via a comment near it. Comments use the C++ and Java syntax. In these figures, bold characters are only used for highlighting some important types and for readability purposes.
Finally, in the examples of this article, all relation type names are in italics.
Translations of an English sentence into different KRLs are given below. The most common notation families – except for XML-based ones – are illustrated, e.g., prefixed notations, frame-based infix ones and graphic notations. This example shows the diversity of KRL models and notations, and hence gives a feeling for the difficulty of translating between KRLs. The sentence is a simple definition and uses cardinality restrictions. For the notations not having syntactic sugar for cardinality restrictions, the OWL2 model is used, except in KIF since this KRL allows the definition of the numeric quantifier “exactlyN” (a much more expressive construct than cardinality restrictions, hence one that normalizes representations, in the sense explained above). OWL-Lite would have been sufficient but the qualified cardinality restrictions of OWL2 make the representations more readable. The English sentence is about birds: we reused and adapted a classic example in Artificial Intelligence. It is represented in different KRLs, including UML via Table 2.0, roughly from the representation most similar to English, to the least similar. The names of some of these KRLs are composed of the model(s) they use, followed by the notation they use. E.g., the “RIF+OWL/RIF-PS” KRL follows the RIF-FLD model plus the OWL ontology (in order to represent cardinality restrictions) and uses the RIF-PS (RIF Presentation Syntax) notation. Indeed, such a use of “/” has been popularized by the W3C with the “RDF/XML” notation. This last convention is used in Figure 4.1 for illustrating some relations between models and notations.
Apart from KIF, “OWL Functional Style” (FSS) and RIF-PS, all notations below are graph-based: they directly show the concept nodes – and relation nodes relating them – of a graph-based model. Apart from UML, these graph-based notations are, at least sometimes, frame-based: the order of their concept nodes may be important for understanding them. A notation that is not graph-based is positional or name-based: the concept nodes appear as positional or named arguments of a relation node which looks like a function call in traditional programming languages. No example of named argument is given below: the respective positions of the arguments are therefore important. There is at least one normalized AT for the different representations below. The second FL representation is a direct map for one such normalized AT. The UML representation is nearly a direct map for it (the underlying structure is slightly less generic). The first FL representation seems more intuitive.
English: By definition, a "flying_bird_with_2_wings" is a bird that flies and has two wings.
| |
FE: any Flying_bird_with_2_wings has as has_type Bird, is has_agent of a Flight,
has as has_part 2 Wing.
| |
FL: Flying_bird_with_2_wings = ^(Bird - has_agent: a Flight, has_part: 2 Wing);
// "^(...)": lambda-abstraction; "-" reverses the direction of a relation
| |
CGLF: type Flying_bird_with_2_wings (*b)
[ [Bird: *b]-{ ->(has_agent)->[Flight]; ->(has_part)->[Wing:{*}@2]; }]
| |
KIF: (defrelation Flying_bird_with_2_wings (?x) :=
(exists ((?f Flight)) (and (Bird ?x) (has_agent ?f ?x)
(exactlyN 2 '?w Wing ^(has_part ,?x ?w)))) )
| |
CL+OWL2/CGIF+XMLnamespace, a KRL using the CL and OWL2 abstract models and, for linearizing notation, CGIF for structures and XML for namespaces:
[If: [Flying_bird_with_2_wings: *b]
[Then: [Flight: *f] [Bird: *b] (has_agent ?f ?b)
("rdf:type" *b ["owl:Restriction": *r] )
("owl:onProperty" *r has_part) ("owl:onClass" *r Wing)
("owl:qualifiedCardinality" *r 2) ] ]
//and conversely, i.e., with the part right to the "If:" and the part right to the "Then:" swapped
| |
RIF+OWL2/RIF-PS, KRL with the RIF and OWL2 models, linearized with the RIF-PS notation.
RIF-PS is prefixed except for i) some infix forms, e.g., for implications (cf. ":-" below) and ii) the frames, each being a list of properties in infix form (cf. the "->"), delimited by [ ] and prefixed by their source :
Forall ?b ?r ( ?b[rdf:type Flying_bird_with_2_wings] :- // ":-" means "<="
And( ?b[rdf:type->Bird rdf:type->?r] Exists ?f ( ?f[rdf:type->Flight has_agent->?b] )
?r[owl:onProperty->has_part owl:onClass->Wing owl:qualifiedCardinality->2] ) )
//and conversely, i.e., with the two sides of the ":-" swapped
| |
RDF+OWL2/N3 (fully frame-based KRL; in frames delimited by [ ], the source of the properties is implicit):
:Flying_bird_with_2_wings owl:intersectionOf
(:Bird [rdf:type owl:Restriction; owl:onProperty :has_agent; owl:someValuesFrom :Flight]
[rdf:type owl:Restriction; owl:onProperty :has_part; owl:qualifiedCardinality 2;
owl:onClass :Wing] ) .
| |
OWL2QL / OWL Manchester: Class: :Flying_bird_with_2_wings EquivalentTo:
Bird and has_agent some Flight and has_part exactly 2 Wing
| |
OWL2QL / OWL Functional-style:
EquivalentClasses( Flying_bird_with_2_wings
ObjectIntersectionOf( Bird ObjectSomeValuesFrom( :has_agent :Flight )
ObjectExactCardinality( 2 :has_part :Wing ) ) )
| |
RDF+OWL/XML:
<owl:Class rdf:about="Flying_bird_with_2_wings"> <owl:equivalentClass> <owl:Class>
<owl:intersectionOf rdf:parseType="Collection"> <owl:Class rdf:about="Bird">
<owl:Restriction><owl:onProperty rdf:resource="#has_agent"/>
<owl:someValuesFrom rdf:resource="#Flight"/> </owl:Restriction>
<owl:Restriction><owl:onProperty rdf:resource="#has_part"/>
<owl:qualifiedCardinality rdf:datatype="&xsd:nonNegativeInteger">2</owl:qualifiedCardinality>
<owl:onClass rdf:resource="Wing"/> </owl:Restriction>
</owl:intersectionOf> </owl:Class> </owl:equivalentClass> </owl:Class>
| |
UML_model / UML_concise_notation: |
Legend for this graph-based notation. Each arrow "->" represents a super-type (here, subClassOf) link. For other links, the arrow "→" is used with an associated link type and also a destination cardinality when this cardinality is different from 0..*, i.e. 0–N. For concision purposes, boxes around classes (types) and associations (links) are not drawn. |
Figure 2.0 shows some subtype and has_part relations between some KRL models
and between some KRL notations.
This makes it possible to organize and compare notations or models, families of them,
and hence also to modularize KRL specifications.
For KRL models, Figure 2.0 gives one subtype partition, i.e., one complete and
disjoint set of subtypes for a type. This partition represents the distinction
between general models that can store Higher-order logics based representations
(e.g., the RIF-FLD model) and other models. Many other partitions can be made.
RIF-FLD has as part RIF-BLD which follows (or “can be interpreted with”)
First-order Logic (with equality) plus some additional features such as i) relations
with “values” (i.e., “concrete domains”, e.g., integers and strings)
as destinations, and ii) the possibility to state that a relation is transitive.
These RIF-BLD features are outside (classic) FOL but are also in OWL2, hence the
type FOL_part_of_OWL2 in Figure 2.0.
The relations between models and the logics they follow are not shown,
nor are relations between models and notations.
For notations, Figure 2.0 illustrates
one subtype partition and one example of has_part relation: the one between
(any instance of any version of) Notation3 (N3) and Turtle.
Legend:
same as for the legend of the UML_concise_notation in Table 2.0 plus,
still for readability reasons, the associations are in italics and their cardinalities are not displayed since
they are all 0-N in both directions. |
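The notion of subtype partition used above, i.e., a complete and disjoint set of subtypes for a type, can be checked extensionally with a few lines of Python. This is a minimal sketch: the function `is_subtype_partition` and the two subtype names are hypothetical, and the membership of the three models follows what this article states about them.

```python
def is_subtype_partition(instances_of_type, instances_by_subtype):
    """True if the subtypes are pairwise disjoint and jointly cover the type."""
    seen = set()
    for members in instances_by_subtype.values():
        if seen & members:            # overlap: the subtypes are not disjoint
            return False
        seen |= members
    return seen == instances_of_type  # completeness

models = {"RIF-FLD", "RIF-BLD", "CL"}
by_subtype = {
    "Model_able_to_store_HOL_representations": {"RIF-FLD"},  # hypothetical type name
    "Other_model": {"RIF-BLD", "CL"},                        # hypothetical type name
}
print(is_subtype_partition(models, by_subtype))  # True
```

If one model were listed under both subtypes, or under none, the function would return False.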
We noted that the structure of any KRL element can be represented in a uniform way, like the structure of a function or as in functional programming, i.e., as an operator with an optional set of arguments and a result. Thus, in KRLO, the six most important primitive relation types for relating AEs are named has_operator, has_argument, has_arguments, has_result, has_parts and has_part. The first two are subtypes of the last since the operator and arguments of an AE are also its parts. Thus, in KRLO, the structure of a relation is defined to include as operator a relation type, as arguments a list of AEs and as result a boolean. E.g., the structure for the relation “has_part(USA Iowa)” has as operator has_part, as arguments USA and Iowa, and as result True. The structure of a quantification is defined to include as operator a quantifier, some arguments and as result a boolean. A function call is defined to have a function type as operator, some arguments and a result. A variable or an identifier is (partially) defined to have as operator a name, as arguments an empty list of AEs and as result an AE of a certain type. Thus, in KRLO, an AE operator may be a function/relation/collection type, a quantifier or a value. Being an operator is only a structural role that different kinds of elements may have. When the AE is a formula, representing its structure is not representing its meaning directly. E.g., a relation does not actually have a boolean as result. In RDF, this is illustrated by the difference between a statement and its reification. As formally represented in Figure 3.1 (Section 3.4), KRLO distinguishes things representing a description content (e.g., types and statements) from the instruments for describing them, e.g., languages, symbols (for types, variables, ...) and structural descriptions of statements. Hence, for reasons of coherence, uniformity and simplification, all language elements are represented as description instruments. 
What they are description instrument of can be referred to via inverses of relations of type has_descr_instrument. The top-level of KRLO, i.e., the part reused and specialized by specifications for particular KRLs, currently includes over 900 AE types.
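The uniform “operator, arguments, result” structure can be sketched as a Python data class. This is only an illustration of the idea, not the KRLO specification: the class `AbstractElement`, its field names and the sample values for the variable and the function call are assumptions of this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class AbstractElement:
    """Structural description of an AE (hypothetical encoding)."""
    operator: Any                       # relation type, function type, quantifier, name, ...
    arguments: List[Any] = field(default_factory=list)
    result: Any = None

    def parts(self):
        """The operator and the arguments of an AE are also its parts."""
        return [self.operator] + list(self.arguments)

# The relation "has_part(USA Iowa)" from the text above.
relation = AbstractElement("has_part", ["USA", "Iowa"], True)
# A variable: a name as operator, no argument, an AE of some type as result.
variable = AbstractElement("?x", [], "some AE")
# A function call: a function type as operator, some arguments and a result.
call = AbstractElement("+", [2, 1], 3)

print(relation.parts())  # ['has_part', 'USA', 'Iowa']
```

As noted above, such a record only describes the structure of a formula; it does not state that a relation actually returns a boolean.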
OWL2-RL expressiveness is enough for relating types of KRL elements by subtype relations and defining structural relations on the instances of a type. Indeed, for representing or using such relations (e.g., has_result, has_parts or has_part), only the following description logic features are needed: type restrictions on the destination and either existential quantification or “at-most 0/1 cardinality restrictions”. If the description of an unusual KRL was actually more precise with other cardinality restrictions, this precision could be ignored without consequence for our analyser and export rules. OWL2-RL is the subset of OWL2 that can be fully defined in RIF-BLD (as done in [OWL2 in RIF, 2013]). I.e., OWL2-RL can be implemented in rule-based reasoning engines handling RIF-BLD expressiveness, i.e., the expressiveness of definite Horn rules with equality and a standard first-order semantics. More precisely, OWL2 and RIF are only interoperable for their OWL2-RL and RIF-BLD subsets. Indeed, for RIF-BLD, both the “OWL 2 Direct Semantics” (following the first-order logic nature of the Description Logic semantics) and the “RDF-based Semantics” can be used, as detailed in [OWL2-RL to RIF-BLD, 2013]. Thus, an ontology not requiring more than (OWL2-RL and) RIF-BLD expressiveness i) permits the reuse of RDF graphs, ii) can be translated into KRLs understood by many reasoning engines, and iii) allows “scalable reasoning” (details in [OWL2, 2012] and [Krötzsch et al., 2013]).
Since structures can be defined in OWL2-RL, many equivalences or implications between structures can be defined with RIF-BLD (this is less possible with OWL2-RL because of its lack of quantified variables). [OWL2 in RIF, 2013] defines OWL2-RL types in RIF-BLD. [OWL2-RL to RIF-BLD, 2013] gives rules for translating OWL2-RL representations into RIF-BLD, and only in that direction. Since RIF does not define and organize its types into a KRL model ontology and since it has no notation ontology for its syntax, these rules of [OWL2-RL to RIF-BLD, 2013] are somewhat syntactic: they translate prefixed functional representations of OWL2-RL structures into prefixed functional representations of RIF-BLD structures. Frame properties (relations with type owl:Property or an equivalent type) are kept as such: they are not translated into relations with a binary predicate (e.g., First-order logic binary relations) since, in some cases, this would be incorrect; this would equate the above cited two semantics for RIF-BLD. In KRLO, both types are generalized by the type Binary_relation (cf. Figure 3.1). In this article, we refer to both as binary relations. Since few KRLs allow both of these two similar types of elements, translating between them – and hence also between frames and conjunction of relations with binary predicates from a same source – under the control of the end-user and in both directions, is interesting. The second row of Table 3.2.1 gives a RIF-BLD rule proposed by KRLO to enable this. Here, “propose” means that the types of AEs such a rule applies to can be restricted by the end-user simply by restricting the source or destination of an equivalence relation. KRLO has other similar rules and will soon have more, e.g., to translate between i) non-binary relations and binary ones (with a list as second argument; cf. 
third row of Table 3.2.1), ii) different structures for meta-statements (formulas about formulas), iii) different structures for numerical quantifiers, and iv) some kinds of definitions and some uses of universal quantification with implication or equivalence relations. Indeed, when possible, the end-user should be able to select which of such structures are generated.
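Two of the structural translations mentioned above can be sketched in Python. These functions are not the RIF-BLD rules of Table 3.2.1: the names `frame_to_relations`, `relations_to_frame` and `nary_to_binary`, the tuple encodings and the sample relation “between” are assumptions of this sketch. As explained above, equating frame properties with binary predicates is not always semantically correct, which is why such translations should remain under the control of the end-user.

```python
def frame_to_relations(source, links):
    """Expand a frame (a source and its links) into binary relations
    that all share the same source."""
    return [(rel, source, dest) for rel, dest in links]

def relations_to_frame(relations):
    """Group binary relations having a common source back into a frame."""
    sources = {src for _, src, _ in relations}
    if len(sources) != 1:
        raise ValueError("relations must share a single source")
    return sources.pop(), [(rel, dest) for rel, _, dest in relations]

def nary_to_binary(relation_type, args):
    """Turn a non-binary relation into a binary one with a list
    as second argument."""
    first, *rest = args
    return (relation_type, first, rest)

frame = ("Tweety", [("has_part", "Wing1"), ("has_part", "Wing2")])
relations = frame_to_relations(*frame)
print(relations)
# [('has_part', 'Tweety', 'Wing1'), ('has_part', 'Tweety', 'Wing2')]
print(relations_to_frame(relations) == frame)   # True
print(nary_to_binary("between", ["b", "a", "c"]))
# ('between', 'b', ['a', 'c'])
```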
KRLO does not have pairwise equivalence relations between all equivalent structures but between one and the others. Transitively, translations can be found between all equivalent structures. When the target KRL is not formally expressive enough to represent an AE – e.g., when a statement such as “in 2015, at least 78% of birds in UK were able to fly, according to ...” has to be translated into RDF+OWL – ad-hoc forms can be used. If the specification of a target KRL describes such forms, our generic export rules use them. Otherwise, the source forms are kept but isolated within comments or annotations to distinguish them from formal translations in the target KRL.
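The transitive search for a translation between equivalent structures can be sketched as a breadth-first search over equivalence links. This is a minimal sketch and not the method implemented in our tools: the function `translation_path` and the structure names in `links` are hypothetical.

```python
from collections import deque

def translation_path(equivalences, source, target):
    """Breadth-first search for a chain of equivalence links."""
    graph = {}
    for a, b in equivalences:             # equivalence is symmetric
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    queue, seen = deque([[source]]), {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None                           # no chain of equivalences exists

# One structure is related to the others; there are no pairwise links.
links = [
    ("normal_form", "frame_form"),
    ("normal_form", "binary_relation_form"),
    ("normal_form", "list_argument_form"),
]
print(translation_path(links, "frame_form", "list_argument_form"))
# ['frame_form', 'normal_form', 'list_argument_form']
```

With n equivalent structures, n-1 equivalence links suffice for every pair to be connected, at the cost of a translation in two steps.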
Correctly defining the structures of AEs and CEs, their generalization relations and their structure implications permits their parsing and export, and hence what we call their structural translations. They preserve semantics if the target KRL is expressive enough to represent the translated knowledge correctly. They permit lexical and syntactic translations as well as, to a certain extent, semantic translations. For more complete semantic translations, an inference engine will require – and be able to handle – more complete definitions of the semantics of each exploited AE type. Such definitions can be imported from ontologies that are complementary to KRLO. E.g., a RIF-BLD enabled inference engine that does not understand the types of OWL2-RL can import the RIF-BLD definitions of these types [OWL2 in RIF, 2013] in order to perform more inferences (although these particular definitions will soon be in KRLO). KRLO also does not define types which are not logic-related, e.g., types for scalar quantities and physical quantities or dimensions (Ontolingua [Farquhar et al., 1997] defines such types), types for some process-related concepts or relations, or domain-related types. Thus, if a KRL notation has some syntactic sugar for such types, to enable translations other than structural ones, the notation specification has to reuse types that are not defined in KRLO but in other ontologies.
In Figure 3.1.1, the part above the types Language_element and Language (in bold font) situates them with respect to some top-level types representing common distinctions. We gave the rationale for this categorization at the beginning of Section 3.1. It is easy to relate these categories to those of the common kinds of semiotic triangles (or rectangles), also called “triangles of reference/meaning”. Such a specification is missing in RIF [RIF-FLD, 2013], even informally, which makes the categorization (with respect to the above cited categories) of some of its types difficult. The ontology of SBVR (Semantics of Business Vocabulary and Business Rules; a KRL proposed by the OMG) [SBVR, 2008] includes a similar specification but unfortunately not (also) in terms of the previously cited categories. This makes the SBVR types sometimes difficult to categorize, understand and hence reuse.
The part below the type Language_element represents some of its most useful general partitions. Although subtype partitions (or disjoint unions of types) are not allowed in OWL2-RL and not needed for what we defined as structural translations, they are used here for readability reasons. The definitions of five major types of ATs – in terms of their parts (operator and/or arguments) and result – are shown. Before finding these general partitions, it was difficult for us to categorize AE types and modularize or organize the descriptions of the KRL model ontology. A good categorization was needed since we found many ways to generalize or group these types and many of the created supertypes proved useful for KRL representation: KRLO already includes over 900 AE types. For example, the types Reference_to_a_phrase, NR_phrase (phrase that is not a reference) and Phrase (formula or modularizing phrase that is not a formula, e.g., an import directive) are useful for distinguishing, organizing and checking the use of such AEs. E.g., it is necessary to distinguish functions returning phrases since such functions are not usable in models not based on a HOL (higher-order logic), such as Common Logic and RIF-BLD (unlike RIF-FLD which allows functions to return a formula but not a modularizing phrase). To allow the representation and categorization of AEs from any KRL, KRLO does not make restrictions. Finding a generic normalized representation for types of modules and their components (module directives and sentences of the module body) was difficult. Finally, we found that a module could be represented as a set of meta-statements – more precisely, a set of binary relations (the module directives) on a formula (the module body) – and we found a way, using only lists and path expressions, to represent different meanings (depending on KRLs) for a relation on a formula (and its sub-formulas). This permitted us to represent different kinds of directives and different kinds of modules – from those of JSON-LD to those of RIF-FLD – with the same primitive structures.
Legend: same as in Figure 3.1.1 plus,
|
Figure 3.1.2 shows some important subtypes of NR_phrase.
KRLO is similarly structured for types of values, references and CEs.
The dual meaning of frames is represented via the types
Frame_as_NR-phrase (subtype of NR_phrase) and
Frame_as_reference (subtype of Reference, like Function_call). In some notations
– e.g., FL [Martin, 2009], the notation that was used for initially writing KRLO –
the syntax makes it possible to distinguish between these two AE types,
the one for a Frame_as_NR_phrase and
the one for the source of the properties of a frame.
In other notations, the representation of a frame with its properties
always refers either to a formula (as in RIF-PS)
or to its source (e.g., as in notations where relations cannot be on formulas).
Legend: same as in Figure 3.1.1 plus
|
Table 3.2.1 illustrates implications between AE structures: i) from properties in KRLO to reified RDF statements, ii) between frames and binary relations from the same source, and iii) from a non-binary relation to a binary one using a list as second argument. In all of them, relations of type has_descr_instrument (or a subtype of it) are used to access the things represented by certain AEs. Indeed, the AEs are different but what they are description instruments of can be the same or be related by equivalence relations. Hence the translation rules apply. The equivalence relations can be modified by the end-user to adapt the translations. No rule for generating type names during translation is illustrated.
//1) For properties, the use of has_op and has_args implies RDF property reification
// (the AE in bold font is the AE (partially) defined by this rule: ":-" means "<=";
// each has_descr relation relates a Descr_content (thing) to an AE;
// has_descr is subtype of the relation type has_descr_instrument specified in Figure 3.1.1):
|
//2) The next rule relates the structures of two AEs (?c and ?f) that are
// description instruments of something that is both a
// Thing_that_can_be_represented_via_binary-relations_from_a_same_source
// and a Thing_that_can_be_represented_via_a_frame.
// By default, in KRLO, these two types are related by an equivalence relation (see Figure 3.1.1).
// However, as explained in Section 3, if the represented knowledge has
// more expressiveness than OWL-RL and RIF-BLD, this equivalence is not (always) true.
// Hence, to get the translation results she wants, i.e., for the next rule to apply only
// to certain types of frames and certain types of conjunctions of binary relations from
// a same source, the end-user can modify this equivalence relation by specializing its
// source type or destination type, hence restricting them. She can also remove this
// equivalence or keep it and have more complete results even if they may not always
// be correct.
// Note: alone, the next rule is not sufficient to correctly handle quantifiers around
// the conjunction or the use of cardinality restrictions within the frame. This is
// handled by other rules.
|
//3) The next rule relates the structures of two AEs that are description instruments of
// something that is both a Thing_that_can_be_represented_via_a_non-binary_relation and a
// Thing_that_can_be_represented_via_a_binary_relation. By default, in KRLO, these two types
// are related by an equivalence relation (the next rule can be used for translations in both directions).
|
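The structural implications ii) and iii) described for Table 3.2.1 can be illustrated outside of RIF-BLD with a minimal Python sketch. It is not the KRLO rule set: the classes `Relation` and `Frame` and the four function names are hypothetical stand-ins chosen for this illustration, and quantifiers and cardinality restrictions are ignored, as in rule 2 taken alone.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    """Hypothetical AE: an operator applied to a tuple of arguments."""
    op: str
    args: tuple

@dataclass(frozen=True)
class Frame:
    """Hypothetical AE: a source with (property, destination) pairs."""
    source: str
    properties: tuple

def frame_to_binary_relations(frame):
    # One binary relation per property, all sharing the frame source.
    return [Relation(prop, (frame.source, dest)) for prop, dest in frame.properties]

def binary_relations_to_frame(relations):
    # Only valid for binary relations that all start from the same source.
    sources = {rel.args[0] for rel in relations}
    if len(sources) != 1 or any(len(rel.args) != 2 for rel in relations):
        raise ValueError("expected binary relations from a same source")
    return Frame(sources.pop(), tuple((rel.op, rel.args[1]) for rel in relations))

def nary_to_binary(rel):
    # Keep the first argument; gather the others in a list-like second argument.
    if len(rel.args) < 3:
        raise ValueError("expected a non-binary relation")
    first, *rest = rel.args
    return Relation(rel.op, (first, tuple(rest)))

def binary_to_nary(rel):
    # Inverse direction: unfold the list-like second argument.
    first, rest = rel.args
    return Relation(rel.op, (first, *rest))
```

Since both pairs of functions are inverses of each other on their valid inputs, they mirror the fact that the corresponding KRLO types are, by default, related by equivalence relations.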
Using a subtype partition, the second row of Table 3.2.2 lists the
types of formulas that a rule premise is allowed to be in RIF-BLD.
For two of these types, further restrictions are illustrated using
relations of type has_argument.
All the used types are restrictions for RIF-BLD of types from the top-level of KRLO.
To refer to such types, the function f_in is used.
The first row of Table 3.2.2 defines it.
All definitions for AEs and CEs of a given KRL use it.
Forall ?elemType ?languageType ?elemTypeInLanguage ?elem ?language ( ?elem[rdf:type->?elemType]
:- And( ?elem [rdf:type->?elemTypeInLanguage ]
?language [rdf:type->?languageType has_direct-or-not_part->?elem]
f_in(?elemType ?languageType) = ?elemTypeInLanguage ) )
Explanations. The above rule is a definition of the function “f_in” in
RIF-BLD/RIF-PS.
Legend: same as in Figure 3.1.2. Notes: f_in returns a KRL element type, here always an AE type.
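As an informal reading aid for the f_in rule, the following Python sketch applies the same implication to a small set of facts. The type names in `F_IN`, the element names and the data layout are assumptions made for this illustration only; they are not taken from KRLO.

```python
# Hypothetical excerpt of f_in: (top-level type, language type) -> restricted type.
F_IN = {
    ("Formula", "RIF-BLD"): "Formula_in_RIF-BLD",
    ("Rule", "RIF-BLD"): "Rule_in_RIF-BLD",
}

def f_in(elem_type, language_type):
    """Return the restriction of a top-level element type for a language type."""
    return F_IN[(elem_type, language_type)]

def infer_top_level_types(elem, elem_types, languages):
    """Apply the rule once: elem gets type T if it has type f_in(T, L) and
    is a (direct or indirect) part of some language of type L."""
    inferred = set()
    for language_type, parts in languages.values():
        if elem not in parts:
            continue
        for elem_type, f_language_type in F_IN:
            if (f_language_type == language_type
                    and f_in(elem_type, language_type) in elem_types.get(elem, set())):
                inferred.add(elem_type)
    return inferred

# Toy facts: r1 is part of a RIF-BLD language, r2 is part of no language.
ELEM_TYPES = {"r1": {"Rule_in_RIF-BLD"}, "r2": {"Rule_in_RIF-BLD"}}
LANGUAGES = {"kb_language": ("RIF-BLD", {"r1"})}
```

With these facts, only `r1` is inferred to be a `Rule`, because the rule premise also requires the part relation from a language of the relevant type.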
A CE – i.e., a presentation for an AE in a given notation – can be
derived from the presentation of each direct part of this AE, and so on recursively,
at least when the notation has a formal (recursively defined) grammar.
With textual notations, this derivation is often a simple ordering on the
presentation of these parts (e.g., in a prefix, infix or postfix way)
plus some syntactic sugar to delimit them.
The structure of a CE can also be represented like the structure of a function
(e.g., since it can be derived from the structure of an AE).
For all the KRLs we know, we noted that this structure could be specified
in a generic way in an LL(1) or LALR(1) grammar, hence a deterministic
context-free grammar that can be efficiently parsed.
This was an important discovery since it meant that
i) we could build one efficient generic parser-analyser for all these KRLs,
and ii) the primitives of a notation ontology could represent these structures.
As an example for the underlying idea, consider an AE composed of
an operator “o” with two arguments “x” and “y”. If
parentheses are mandatory delimiters and if spaces are the only usable
separators, this AE has only the next five possible CEs (not counting
optional uses of spaces and parentheses) in all the notations we know:
- “o (x y)” (prefix functional form as in RIF-PS),
- “(o x y)” (prefix list-like form as in KIF),
- “(x o y)” (infix form as in Turtle and some RIF-PS formulas),
- “(x y o)” (postfix list-like form) and
- “(x y) o” (postfix functional form).
Five rules of an LL(1) or LALR(1) grammar can be used for specifying these
five possibilities and they can also be generalized for any number of arguments,
not just two. Furthermore, if – as with the Lex&Yacc parser
generators –
the grammar can be divided into a lexical grammar and a non-lexical grammar,
the separators can be made generic in the non-lexical part via terminal symbols
such as Placeholder_for_begin-mark_of_the_arguments_of_a_prefix-function-like_element
and Placeholder_for_end-mark_of_the_arguments_of_a_postfix-list-like_element.
Finally, in the lexical part, it is possible to specify rules for detecting
various kinds of tokens and various kinds of separators rather than specific ones.
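The five presentation forms listed above, and their generalization to any number of arguments, can be made concrete with a small Python sketch. It only covers the export direction, and the function name `present`, the form names and the `begin`/`end`/`sep` parameters (which play the role of the generic placeholders for delimiters and separators) are assumptions of this illustration, not part of KRLO or of our analyser.

```python
def present(op, args, form, sep=" ", begin="(", end=")"):
    """Return one concrete presentation of an operator applied to arguments."""
    inner = sep.join(args)
    if form == "prefix_functional":
        return f"{op}{sep}{begin}{inner}{end}"
    if form == "prefix_list":
        return begin + sep.join([op, *args]) + end
    if form == "infix":
        # The operator is repeated between each pair of adjacent arguments.
        return begin + f"{sep}{op}{sep}".join(args) + end
    if form == "postfix_list":
        return begin + sep.join([*args, op]) + end
    if form == "postfix_functional":
        return f"{begin}{inner}{end}{sep}{op}"
    raise ValueError(f"unknown presentation form: {form}")

FORMS = ["prefix_functional", "prefix_list", "infix",
         "postfix_list", "postfix_functional"]

for form in FORMS:
    print(form, "->", present("o", ["x", "y"], form))
```

Changing `begin`, `end` or `sep` is enough to obtain variants of a form with other delimiters, which is the reason for keeping them generic.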
Following this idea and using Flex&Bison (GNU variants of Lex&Yacc), we created a generic (parser-)analyser for any KRL that can have an LALR(1) grammar. This grammar for this KRL does not have to be found since it is generalized by the generic LALR(1) grammar that our analyser uses. Given a CE and its KRL (notation and model, plus, via KRLO, their specifications and relationships), this analyser directly generates an abstract semantic graph, i.e., a data structure for an AE in the given model. As a comparison, a classic parser generator is given a notation specification (a concrete grammar) and generates a parser which, given a CE, creates an abstract syntax tree for it.
The top-level of the ontology of notations of KRLO does not categorize all possible prefix/infix/postfix notation forms for an AE structure. It only contains the primitive relations that permit describing them. Indeed, we found that there were too many possible combinations of these primitives for a categorization to be helpful. Since the primitives also proved too cumbersome to be used directly, we defined intermediary functions accepting list-based descriptions. Table 4.1 uses such functions – e.g., fc_OP and fc_ARG (in KRLO, the names of functions returning a CE or CE_type begin with “fc_”) – and defines simpler ones. These functions are explained in a technical way after the table. First, here is a higher-level view of what Table 4.1 illustrates. Via the above cited functions and three relations of type rc_spec, the graph of Table 4.1 specifies the default presentation form of CEs in RIF-PS for three types of RIF AEs, first for the type of all AEs in RIF. For this type, Table 4.1 defines the prefixed functional presentation form that all RIF AEs may have in RIF-PS, one that most RIF AEs must have. Then, Table 4.1 defines the types of AEs that should preferably have an infix presentation in RIF-PS (rules, equalities, ...) and defines this presentation. The third rc_spec relation specifies the RIF-PS presentation of some parts of AEs such as quantification variables and context-related parts of a document in RIF (directives, ...). When a type of AEs has an associated specification of a presentation for a given type of notations, this one complements a presentation specification for that type of notations or a supertype of it. The export rules of KRLO always use the specification of the most specialized relevant AE type. If RIF-PS did not also allow the prefix presentation to be used for all RIF AEs, even for those that should preferably be presented in an infix form, Table 4.1 would have had to specify that. To that end, the OWL2-RL owl:allValuesFrom restriction could have been used on the rc_spec relation. The export rules of KRLO are generic in the sense that they are based on the operator+argument structures of AEs and CEs, and their rc_spec relationships.
//Here is a definition of the function "f_in" in RIF-BLD/RIF-PS. This is a partial definition
// since ":-" means "<=". This function is used in the UML graph below for referring to the
// "specialization in a model" of a type from the top-level of KRLO, i.e., a type not related
// to a particular model. This definition exploits the fact that in KRLO, from any
// Language_element to a Language, there is a chain of part or member relations.
// f_in is one of many ideas that led to simpler and better specifications in KRLO compared
// to its version before September 2015.
|
//Here is a (partial) definition of the function "f_dest" in RIF-BLD/RIF-PS.
// This function is used below for referring to the destination of a property of given type,
// from a given source.
|
Legend: same as in Figure 4.1.2 except that
Notes: f_in returns a KRL element type, here always an AE type. |
Using RIF-BLD/RIF-PS, a rather intuitive KRL introduced in Table 4.2.1, Table 4.1 first illustrates partial definitions of some functions used in the graph. The other functions used in this graph are similarly defined in KRLO. The second argument of fc_type is a list of notation types since i) an AE may have identical types of CEs in different notations, and ii) fc_type returns a type for CEs that are members of notations of such types. The first argument of fc_type is an ordered list of CE specifications: syntactic sugar (delimiters or separators) or results of functions such as fc_OP_from and fc_ARGS_from. These functions are similar to fc_type but they i) respectively work on the operator and arguments of their AE parameter, and ii) permit specifying the role of each CE in the specified list: operator, argument or separator. Thanks to these roles, the grammar rule to use for parsing such a list of terms can be selected. For exporting into non-indented textual CEs, only the order of the terms in this list is important. The fc_ARGS and fc_ARGS_from functions may have a second argument to enable the specification of a non-space separator to use between the arguments. To conclude, the first rc_spec relation in Table 4.1 specifies that any instance of AE in RIF has (at least, by default) RIF-PS CEs composed of the following sequence of elements, separated by at least one spacing character: 1) the (concrete) representation (in RIF-PS) of the operator of the AE, 2) an opening parenthesis, 3) the representation of its arguments separated by at least one spacing character, and 4) a closing parenthesis. This specifies what we called “the prefix functional form”. Simply modifying the order of the elements in this list permits generating the other presentation forms.
The functions fc_OP or fc_ARG simply indicate that the CE associated to the AE that is parameter of these functions has the specified role (respectively, operator or argument) in the ordered list of CE specifications of the embedding fc_type function. These functions provide another degree of flexibility: any AE can be used in the list of CE specifications, not just the (direct) operator and arguments of the AE to which the CE specifications are associated. This is often useful, e.g., when a notation has a special syntax for some types of AE which leaves certain parts of the AE implicit. The second rc_spec relation in Table 4.1 illustrates the use of fc_ARG.
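The idea of an ordered list of CE specifications with roles can be sketched in Python as follows. This is a simplified illustration under stated assumptions: `OP`, `ARG` and `ARGS` are hypothetical stand-ins for the roles given by functions such as fc_OP, fc_ARG and fc_ARGS, an AE is reduced to an (operator, arguments) pair of strings, and neither notation types nor fc_type itself are modelled.

```python
# Role markers (hypothetical stand-ins for fc_OP, fc_ARG and fc_ARGS).
OP = ("OP",)

def ARG(index):
    return ("ARG", index)

def ARGS(separator=" "):
    return ("ARGS", separator)

# Ordered lists of CE specifications: plain strings are syntactic sugar.
PREFIX_FUNCTIONAL = [OP, "(", ARGS(), ")"]
INFIX = [ARG(0), OP, ARG(1)]

def export(ae, spec):
    """Concatenate, in order, the pieces designated by the specification."""
    op, args = ae
    pieces = []
    for item in spec:
        if isinstance(item, str):      # delimiter or separator
            pieces.append(item)
        elif item[0] == "OP":          # the operator of the AE
            pieces.append(op)
        elif item[0] == "ARG":         # one designated argument
            pieces.append(args[item[1]])
        elif item[0] == "ARGS":        # all arguments, with their separator
            pieces.append(item[1].join(args))
    return " ".join(pieces)
```

For example, `export(("o", ["x", "y"]), PREFIX_FUNCTIONAL)` gives `o ( x y )`, and reordering the same kinds of elements as in `INFIX` gives `x o y`, which reflects the remark that only the order of the terms matters for non-indented textual export.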
Thus, the presentation for an AE (i.e., for an abstract semantic graph, not just an abstract syntax tree) in one or several notations can be specified in a concise way, with fc_type controlling that this presentation is LALR(1). Before finding this way, we tried creating a function for each kind of prefix/infix/postfix notation form. This led to many functions with long names and many parameters. Despite the long names, it was hard to understand or remember what each function specified exactly.
Table 4.1 notes that structural relations can also be associated to AEs for specifying structural and semantic restrictions on AEs in a particular type of model, e.g., for representing a restriction of a RIF-BLD AE compared to its counterpart in RIF-FLD. To sum up, Table 4.1 shows how particular KRLs (model and/or notation) can be specified in a concise or modular way, exploiting inheritance along subtype relations. In KRLO, the types for KRLs themselves are also organized via subtype relations (as illustrated in Figure 4.0 for notations); this is not a new idea for models but was not used in ontologies of KRLs.
Figure 4.1 gives a concrete example for the exploitation of model and notation specifications for KRL import/export/translation purposes. Indeed, Figure 4.1 shows relations between some (types or individuals for) AEs and CEs involved in the representation or export of “A bird flies” in RIF-PS, FL and N-Triples. This statement is represented both via a “frame that is also a connective formula” and a “conjunction of binary relations that is a frame”, two semantically equivalent AEs. In Figure 4.1, all the AEs with names that are not in the courier font come from the AE related part of the top-level of KRLO.
Legend: same as in the previous figure, plus
|
In KRLO, each AE has one and only one (inherited or overriding) rc_spec relation for a given type of notation. The “only one” part is ensured by our knowledge server which prevents the entering of ambiguities when it detects them. The other “one” part comes from the default presentation specifications of KRLO. Thus, for a given AE and notation, the multiple possible presentations, i.e., the corresponding CEs, are declaratively and unambiguously described. This description is recursive when a CE specification refers to a component AE (via the first parameter of functions such as fc_OP or fc_ARGS) since this component AE has itself a CE specification. The spacing between CEs is similarly defined in a declarative and unambiguous way. More precisely, KRLO allows the definition of different kinds of spacing, e.g., for different types of indentation methods, and a notation specification generally has no associated spacing definition, but KRLO has default rules for choosing the spacing. A notation specification or the end-user may specify overriding rules as long as they are unambiguous. Thus, the export rules of KRLO can also be translated into functions or type definitions for inference engines that do not handle rules. These export rules simply perform the recursive exploration of the specifications and the concatenation of the resulting CEs. If the relations in KRLO are correct, the export process is semantics-preserving. It is also complete – if what is expressed by the relations is complete – since there is one and only one (inherited or overriding) rc_spec relation for a given type of notation.
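The combination of inheritance (use of the most specialized relevant specification) and recursion (each component AE has its own specification) can be illustrated with the following Python sketch. The type names, the `SUPERTYPE` and `RC_SPEC` tables and the string-based role markers are hypothetical simplifications for a single notation; they are not the actual content of KRLO.

```python
# Hypothetical subtype hierarchy: type -> direct supertype.
SUPERTYPE = {"Rule": "AE", "Atomic_formula": "AE", "AE": None}

# Hypothetical presentation specifications for one notation.
RC_SPEC = {
    "AE": ["OP", "(", "ARGS", ")"],   # default: prefix functional form
    "Rule": ["ARG0", "OP", "ARG1"],   # overriding: infix form
}

def most_specialized_spec(ae_type):
    """Walk up the subtype hierarchy until a specification is found."""
    current = ae_type
    while current is not None:
        if current in RC_SPEC:
            return RC_SPEC[current]
        current = SUPERTYPE[current]
    raise LookupError(f"no presentation specification for {ae_type}")

def export(ae):
    """Recursively explore the specifications and concatenate the results."""
    if isinstance(ae, str):           # a leaf, e.g., a variable or constant
        return ae
    ae_type, op, args = ae
    parts = []
    for item in most_specialized_spec(ae_type):
        if item == "OP":
            parts.append(op)
        elif item == "ARGS":
            parts.append(" ".join(export(arg) for arg in args))
        elif item in ("ARG0", "ARG1"):
            parts.append(export(args[int(item[3:])]))
        else:                         # syntactic sugar
            parts.append(item)
    return " ".join(parts)

rule = ("Rule", ":-", [("Atomic_formula", "p", ["?x"]),
                       ("Atomic_formula", "q", ["?x"])])
print(export(rule))
```

In this toy setting, `Atomic_formula` has no specification of its own and inherits the default one, whereas `Rule` overrides it, so exactly one specification applies to each AE.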
Even in the absence of ambiguities, it is necessary to use a function like fc_type which returns a type for CEs since specifying a notation implies defining a type of CE for a type of AE, not a particular CE for a particular AE from a particular KB. On the other hand, all the above cited CEs are identical, at least structurally since spacing definition is generally not directly associated to a notation. Thus, to express that an AE can be represented in different ways, it is necessary to make each version explicit using subtype relations. E.g., consider the fact that in notations for RDF, a “blank node” (i.e., in CG terminology, an existentially quantified concept node) can be linearized with or without an associated identifier whereas an RDF AE must not have such an identifier (probably because RDF graphs can only be connected). To represent this in KRLO, it is necessary to make explicit the fact that there exists a “Standard RDF” (which does not allow identifier relations on its blank nodes) and an “RDF which allows identifier relations on its blank nodes”, both subtypes of a more general “RDF-like model”. In order to bridge the two models, it is also necessary to write an equivalence rule between RDF graphs using blank node identifiers and those not using them. Then, notations for RDF can use both models. It would be incorrect – and fortunately impossible with the few primitives of KRLO – to represent the identifier as syntactic sugar, i.e., as something which does not represent an AE. Indeed, an identifier within a CE denotes the existence of an AE having a relation to this identifier. Furthermore, handling identifiers or such relations is something that equivalence relations between rules can typically do.
Our generic analyser shows the usefulness of our fully ontology based approach. However, this analyser is not in line with our approach since it is not declaratively specified. Only programmers can reuse it, via its API, or adapt it by modifying its source code. This is why we have begun specifying two complementary knowledge import methods in KRLO.
The first method generates rules that directly parse a given KRL. The manual creation of such rules is not uncommon in Prolog via the use of Definite Clause Grammars (DCGs). E.g., [Wielemaker et al., 2003] discusses the rules, uses and good performance of an “RDF compiler on top of the sgml2pl package” in SWI-Prolog.
For the second method, we have begun i) representing models and notations for grammars and their associated actions, and ii) extending our knowledge translation and export methods for generating procedural code for these actions. These generations will allow KRLO based systems to reuse any parser generator.
Both methods exploit specifications. Both are being implemented via RIF-BLD rules and via functions. These two implementations maximize the possibilities of reuse by different inference engines. Furthermore, we are making a Javascript module for these functions to be executable by a browser. Thus, they will be usable in client-side scripts and adaptable by Javascript aware end-users.
Given an AE and specifications related to a target KRL, the default export method specified in KRLO generates a CE for this AE by
The translation and export processes are complete with respect to what is expressed by these relations, rules or functions since there is one rc_spec relation for each AE in a given type of notation. Each user can control the translation and export processes. Indeed, she can select not only the translation properties but also the target notation and the target model or the target AEs. She can also extend KRLO or adapt her copy of KRLO, e.g., to adapt some translation rules as previously explained. In the general case, knowledge export and translation are arbitrary in the sense that knowledge can be translated and exported in various ways. However, this is not the case with KRLO in the sense that particular ways can be represented and then selected by the user. The default presentation choices represented in KRLO permit the user not to do this work if she does not want to.
However, KRLO does not represent all information necessary for any export or translation to be semantically complete with respect to any application. E.g., KRLO does not yet represent any inference strategy and hence the order of statements generated by translation and export processes may not be adequate for a particular inference strategy. As an example, rules may be generated in an order that leads to infinite loops if they are used with a Prolog inference engine.
Ad hoc forms can be used when the target KRL is not formally expressive enough to represent an AE, e.g., when a statement such as “in 2015, at least 78% of birds in UK were able to fly, according to ...” has to be translated into RDF+OWL. If the specification of a target KRL describes such forms, our default export method uses them. Otherwise, the source forms are kept but within comments or annotations to isolate them from formal translations.
For KRL translation, in addition to equivalence or implication relations between types of AE, KRLO currently proposes some equivalences or implications via functions and rules. These rules and functions are for translations between structures, e.g., between i) non-binary relations and binary ones, ii) different structures for meta-statements (formulas about formulas), and iii) some kinds of definitions and some uses of universal quantification with implication or equivalence relations. These structural translations are simple: they can be expressed via RIF-BLD rules and do not require the complex strategies of general term/graph rewriting techniques. Backward chaining is sufficient to exploit them. Thus, KRLO does not specify a default translation method for combining these translation rules or functions.
For translations not yet supported by KRLO, the user has to import complementary ontologies. E.g., a RIF-BLD enabled inference engine that does not hard-code the special semantics of the types of OWL2-RL can import the RIF-BLD definitions of these types [OWL2-RL in RIF-BLD, 2013]. KRLO does not define types which are not logic-related, e.g., types for physical quantities or dimensions. Thus, if a KRL notation has some syntactic sugar for such types, the notation specification has to reuse types that are not defined in KRLO but in other ontologies. Using them for translation may require special translation methods.
KRLO and our tools are validated together by ensuring that, for each KRL represented in KRLO, translating KRs to any other KRL of KRLO and translating back the results lead to the original KRs, except for optional white space. For each KRL, the first test is actually simpler: importing and exporting a file of KRs must lead to the same file, except for optional white space. “Each KRL” means “each possible pair of model and notation”. The first input file is generated to include at least an instance of each AE type of the tested KRL. Then, the other tests use real-world files in this KRL, especially files from major ontology repositories, e.g., Ontohub (ontohub.org), DBpedia (dbpedia.org), Wikidata (wikidata.org) and YAGO3 (datahub.io/dataset/yago). Finally, KRLO itself is translated from FL to the target KRL – complemented by IKLmE if necessary – and back. Currently, given the implementation stage of our tools, we have not tested all the features of all represented KRLs, only those related to the pairing of the FL, Turtle and JSON-LD notations with RDF+OWL2Full or IKLmE models. However, more features are actively implemented.
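The simplest of these tests, importing and exporting KRs and comparing the result with the input except for optional white space, can be sketched as follows. The sketch is not our test suite: `parse` and `export` are toy functions for the prefix list-like form only, written for this illustration, and “optional white space” is approximated by comparing token sequences.

```python
import re

def tokens(text):
    """Split a text into parentheses and atoms, ignoring white space."""
    return re.findall(r"[()]|[^\s()]+", text)

def parse(text):
    """Toy importer for the prefix list-like form, e.g., '(o x (p y))'."""
    toks = tokens(text)
    pos = 0

    def read():
        nonlocal pos
        tok = toks[pos]
        pos += 1
        if tok != "(":
            return tok
        items = []
        while toks[pos] != ")":
            items.append(read())
        pos += 1                      # skip the closing parenthesis
        return tuple(items)

    result = read()
    if pos != len(toks):
        raise ValueError("unexpected trailing input")
    return result

def export(ae):
    """Toy exporter producing the same form as the one accepted by parse."""
    if isinstance(ae, str):
        return ae
    return "(" + " ".join(export(part) for part in ae) + ")"

def roundtrip_ok(text):
    """True if import followed by export preserves the token sequence."""
    return tokens(export(parse(text))) == tokens(text)
```

A translation round trip between two KRLs can be checked in the same way by replacing `export(parse(text))` with the composition of the two translations.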
Classic parser generators, such as Lex&Yacc, are given a concrete grammar with actions associated to its rules for building AEs in memory. Implementing and updating the code for these actions is – or is akin to – a programming task, hence long and error prone.
Interactive programming environment generators are designed to ease this task, including in knowledge engineering [Corby & Dieng, 1996]. E.g., Centaur [Borras et al., 1988] proposed declarative languages for specifying concrete grammars, abstract grammars and rules bridging them. Based on them, Centaur could generate language specific structured editors, parsers, type checkers, interpreters, compilers and translators. However, these Prolog-like declarative languages are execution oriented rather than modelling oriented: they do not ease the creation and reuse of ontologies with given or calculated subsumption and exclusion relations [Dromey, 2006] [Guizzardi, 2010]. Thus, with them, as with procedural code, i) small changes in the KRLs to specify often lead to important changes in the specifications, and ii) the specifications are difficult to organize into an ontology and hence are not as easy to compare, automatically analyze and reuse as in an ontology based approach. The API based approach, e.g., via OKBC for knowledge exchange between KBMSs, is also a procedural approach.
For concision, genericity or flexibility purposes, some languages have “language extension features”, e.g., macros, higher-order functions or functions with names such as “eval” or “apply” which interpret statements created via these languages. Lisp, KIF and Prolog have functions of the last kind. XBNF [Botting, 2012], which is a KRL, is also an EBNF-like language and thus permits specifying languages or extending itself. XBNF is a KRL since a syntax rule is also a type definition and it can use logical relations as well as sets. However, in any language, such “extensions” do not change the language model and its expressiveness. Our approach, which can be seen as a generalization of this extension-based approach, may also be seen as escaping its problem since it permits different inference engines or inferencing methods to be used for different tasks.
Some meta-languages such as XML enable and force their users to include AE type names when describing a structure that follows these AEs. With these meta-languages, notations are homo-iconic to their model. E.g., the notation RDF/XML is homo-iconic to the RDF model. MOF, the Meta-Object Facility of the OMG (Object Management Group), along with any of its notations – e.g., MOF-HUTN – constitutes another such meta-language for describing structures. Other such meta-languages are used in Model Driven Engineering (MDE). Rather generic tools allow parsing and checking structures described with such meta-languages. E.g., XML tools can parse RDF/XML structures and check their syntax. However, i) these “rather generic tools” do not perform logical inferences, and ii) the concrete descriptions they exploit are often not concise or high-level enough to be used directly for knowledge entering/display or by tools for knowledge handling. E.g., which KBMS or inference engine uses XML objects internally? Translations from/to other models or notations are still necessary. Our approach provides an ontology-based concise alternative to the use of XML as a meta-language for creating KRLs that follow given KRL ontologies. Thus, any notation can be used and XML is not a required pivot notation.
To sum up, a first kind of use case for an approach based on an ontology like KRLO is to ease the task of implementing tools that parse and check KRs written in many KRLs. Our generic KRL parser can be reused and adapted. Alternatively, since KRLO has the expressiveness of RIF-BLD [RIF-BLD 2013], any inference engine handling this expressiveness can be reused to parse KRLO. Then, such an engine can exploit the KRL specifications in KRLO for importing knowledge written in these KRLs. RIF-BLD (Rule Interchange Format - Basic Logic Dialect) is the W3C recommendation and interchange model for definite Horn rules with equality, i.e., for Datalog-like KRLs. Since there is no negation, no closed world assumption is necessary. More precisely, in FL [Martin, 2009], i.e., in the KRL in which KRLO was originally designed, KRLO uses a bit more expressiveness than RIF-BLD. E.g., KRLO uses subtype partitions, i.e., complete sets of disjoint subtypes for types. However, this additional expressiveness is not exploited by the import, translation and export methods we propose, and hence can be dropped when translating KRLO to a RIF-BLD compatible KRL.
A related second kind of use case is to
enable the end users of tools to extend or adapt KRLs
in more advanced ways than what other approaches can support.
E.g., even though the Model Driven Engineering tool BAM [Feja et al., 2011]
is designed to handle many models and notations, its meta-model is predefined and hence,
for notation extension purposes, BAM only proposes macros and informal annotations.
An ontology based approach lets each user specify new abstract models or notations,
e.g., by adapting existing ones via menus, as in an ontology editor tailored to
a particular domain.
This permits application developers as well as knowledge providers or
consumers to tailor notations to their tastes and needs, or to the ones of a group of
persons. Indeed, in many cases, implementing or extending a KRL parser, translator,
displayer or navigator is either not an option or a cumbersome one. Yet, it is also
often interesting to add syntactic sugar or new structures to a particular notation,
to gain conciseness, ease readability or lift some expressiveness restrictions.
Each time we designed complex ontologies, to help us visualize the KRs and their
relationships, we added syntactic sugar to our in-house KRL (FL), e.g., for numeric
quantifiers, particular interpretations of relations to sets or meta-statements,
and some ubiquitous functions such as the f_in and f_dest in KRLO (cf. Table 1).
Nowadays, when people want to represent knowledge that cannot be fully expressed
with the KRLs they need or wish to use, they represent the knowledge in
incomplete or ad hoc and biased ways. Instead, with our approach, they can extend their
KRLs.
Conversely, our approach also provides a way to exploit KBs even if they include
syntactically or semantically incorrect KRs, as long as the kinds
of errors are systematic. Indeed, with our approach, the used KRL model,
notation or parser can be easily adapted to interpret some systematic
incorrect usages in special ways.
This can often be useful: in the study of [Beek et al., 2014], only 37.6% of
Datahub resources for Linked Data proved fully machine-processable.
Adaptation by an end-user is hardly possible when no declarative specification
is used for the KRL models.
With Centaur, this was possible but complex, if only because the specifications were
not ontology based.
Since any structure used in a Web resource
can be described in KRLO, the approach we propose for KRLs can be extended and used for
generating Semantic Web wrappers [Bühmann et al., 2014], i.e.,
tools parsing structures in certain resources to extract KRs.
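
As an illustration only, the following Python sketch shows how a declaratively specified notation might be adapted to accept a systematic incorrect usage. The toy one-statement-per-line notation, the two specification dictionaries and the function parse_statement are hypothetical; they are not taken from KRLO or from our tools. The point is that the adaptation is a change to the specification, not to the parsing code.

```python
import re

# Hypothetical declarative notation: each entry maps a token kind to a regex.
# Illustration only; this is not how KRLO represents notations.
STRICT_NOTATION = {
    "iri": r"<[^<>\s]+>",
    "end": r"\.",
}

# Adapted copy: also accept a systematic error (IRIs written without angle
# brackets) and make the final dot optional.
TOLERANT_NOTATION = dict(
    STRICT_NOTATION,
    iri=r"<[^<>\s]+>|[A-Za-z][A-Za-z0-9+.-]*:[^\s<>]+",
    end=r"\.?",
)

def parse_statement(line, notation):
    """Parse 'subject predicate object end'; return a triple or None."""
    iri, end = notation["iri"], notation["end"]
    pattern = rf"\s*({iri})\s+({iri})\s+({iri})\s*(?:{end})\s*$"
    match = re.match(pattern, line)
    if match is None:
        return None
    # Normalise: strip angle brackets so both usages yield the same triple.
    return tuple(term.strip("<>") for term in match.groups())
```

With the strict specification, a line whose IRIs lack angle brackets is rejected; with the adapted one, it yields the same triple as the well-formed line.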
A third kind of use case is the checking of input files with respect to
a KRL model and notation. E.g., specifications in KRLO for
profiles of OWL2
can be exploited to check if a given KB follows this particular profile.
Profile discovery can similarly be done.
In KRLO, the informal presentation of KRs – the use of a certain form of
indentation – can also be specified and then checked.
This can for example be used to specify presentation best practices.
Then, their automatic checking can for example be used to teach
knowledge engineering students how to follow these presentation best practices
when entering knowledge.
However, KRLO does not yet provide rules or functions specifying a way to do
those checks or profile matching.
Nevertheless, the possibilities and flexibility offered by an ontology of KRLs
exceed what is possible with tools and languages solely based on XML, XSLT and CSS.
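
To make the idea of profile checking and profile discovery concrete, here is a minimal Python sketch. It is not part of KRLO, which does not yet provide such checks. The profile names, the lists of allowed constructs and the nested-tuple encoding of statements are all assumptions made for the example; the "toy" profiles are not the actual OWL2 profiles.

```python
# Hypothetical profile specifications: each profile lists the construct kinds
# it allows. The names are illustrative, not the actual OWL2 profiles.
PROFILES = {
    "toy-EL": {"SubClassOf", "ObjectSomeValuesFrom", "ObjectIntersectionOf"},
    "toy-QL": {"SubClassOf", "ObjectSomeValuesFrom", "ObjectInverseOf"},
}

def constructs_used(expression):
    """Collect operator names from a nested (operator, arg1, ...) tuple."""
    if not isinstance(expression, tuple):
        return set()  # a leaf, e.g. a class or property name
    operator, *arguments = expression
    used = {operator}
    for argument in arguments:
        used |= constructs_used(argument)
    return used

def violations(kb, profile):
    """Return the constructs used in the KB that the profile does not allow."""
    used = set()
    for statement in kb:
        used |= constructs_used(statement)
    return used - PROFILES[profile]

def matching_profiles(kb):
    """Profile discovery: names of all profiles the KB conforms to."""
    return sorted(name for name in PROFILES if not violations(kb, name))
```

Under these assumptions, checking a KB against a profile reduces to a set difference, and profile discovery to running that check for every specified profile.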
KRL translation is often specified from one KRL to another one, between CEs or between AEs. This is the direct mapping approach. E.g., in Centaur, translation rules could be specified between two given abstract grammars. Tools such as ODE [Corcho, 2004] proposed rule-based languages for specifying lexical, syntactic, semantic and pragmatic translations between CEs or AEs, from one KRL to another. [OWL2-RL to RIF-BLD, 2013] and Annex 8 of [OWL2-RL in RIF-BLD, 2013] propose rules translating CEs of the OWL2 Functional-Style Syntax (FSS) into CEs of the RIF-PS notation for the RIF-BLD model. The W3C does not propose translations between AEs since i) RIF-BLD was not represented in an ontology, and ii) CEs of FSS directly represent AEs of the OWL2 Structural Specification. In [OWL2 FSS from/to RDF triples, 2012] the W3C proposes rules to translate CEs of FSS into CEs of the Triples notation for the RDF model. Thus, more generally, the W3C adopted the layered approach, i.e., translations based on models of increasing expressiveness with direct mappings between each level of expressiveness. An alternative way to reduce the number of direct mappings between KRLs is the pivot approach, i.e., two-way translations between each KRL and an expressive interlingua. In the 1990s, this led to the creation of the Knowledge Interchange Format (KIF) [Genesereth & Fikes, 1992], a FOL based KRL with a second-order notation, and then to Ontolingua [Gruber, 1993] [Farquhar et al., 1997], a library of ontologies written in KIF. In the 2000s, these works led to the CL model and its standard notations (CLIF, CGIF and XCL), COLORE (Common Logic Repository) and IKL [Hayes, 2006]. IKL is based on CL but, like KIF, can represent notions which are important for knowledge sharing but which are usually only found in KRLs based on Higher-order Logics (HOLs), e.g., certain kinds of meta-statements and numeric quantifiers.
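
The difference between the direct mapping and pivot approaches can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the two notations ("prefix" and "infix"), the triple used as interlingua and all function names are hypothetical and do not correspond to any of the cited systems.

```python
import re

def direct_mapping_count(n):
    """One one-way translator per ordered pair of distinct KRLs."""
    return n * (n - 1)

def pivot_count(n):
    """One importer and one exporter per KRL, to and from the interlingua."""
    return 2 * n

# Toy interlingua: a (relation, subject, object) triple.
def import_prefix(text):
    match = re.fullmatch(r"\s*(\w+)\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)\s*", text)
    if match is None:
        raise ValueError(f"not a prefix statement: {text!r}")
    return match.groups()

def import_infix(text):
    subject, relation, obj = text.split()
    return (relation, subject, obj)

def export_prefix(triple):
    relation, subject, obj = triple
    return f"{relation}({subject}, {obj})"

def export_infix(triple):
    relation, subject, obj = triple
    return f"{subject} {relation} {obj}"

IMPORTERS = {"prefix": import_prefix, "infix": import_infix}
EXPORTERS = {"prefix": export_prefix, "infix": export_infix}

def translate(text, source, target):
    """Pivot translation: import to the interlingua, then export from it."""
    return EXPORTERS[target](IMPORTERS[source](text))
```

For n KRLs, the direct mapping approach needs n(n-1) one-way translators whereas the pivot approach needs 2n, which is fewer as soon as n exceeds 3. Adding a notation to this sketch means adding one importer and one exporter, not one translator per existing notation.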
None of the above cited works specifies or uses an ontology of KRLs. E.g., since KIF came to be the de-facto interlingua, Ontolingua did not include specifications for particular KRL models or notations. Its only KRL related ontology was named the “Frame Ontology”. It included definitions for objects similar to those of OWL. [Euzenat & Stuckenschmidt, 2003] showed that the mapping, layered and pivot approaches are generalized when an ontology of KRL models is used, one where AEs – and then models – are related by certain translation relations. Each of these relations has an associated definition for performing the translation. Each relation also represents translation properties, i.e., whether or not the translation preserves the model-theoretic semantics, interpretations and logical consequences of the translated AEs. Thus, given AEs to translate, a tool exploiting this approach may let its users choose different target models according to what they want the translations to preserve. As a proof-of-concept, [Euzenat & Stuckenschmidt, 2003] used XSLT to implement about 40 translation relations between AEs belonging to 25 description logics. The LATIN (Logic Atlas and Integrator) Project (2009-2012) [Codescu et al., 2011] went further by representing such translation relations between many different logics. Via HETS (Heterogeneous Tool Set), LATIN exploits several FOL KRLs and HOL KRLs, e.g., Isabelle and HasCASL. Via DOL (Distributed Ontology, Modeling, and Specification Language) [DOL, 2016], the OMG recently proposed a standard KRL for i) specifying particular kinds of translation relations between KRL models, and ii) using several KRLs in the same DOL document. DOL is also implemented via HETS and the authors of DOL see some results of LATIN as avenues for future extensions. Ontohub is a DOL based repository which includes KRL models and translation relations between them. DOL and HETS do not specify notations. They rely on external parsers and exporters.
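
The selection of target models according to the properties that a translation preserves can be sketched as follows. This is only a sketch: the model names, the property names and the three relations are placeholders invented for the example, not relations taken from the cited works.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TranslationRelation:
    source: str
    target: str
    preserves: frozenset  # names of the properties the translation preserves

# Hypothetical relations; model names and properties are placeholders.
RELATIONS = [
    TranslationRelation(
        "ModelA", "ModelB",
        frozenset({"semantics", "interpretations", "consequences"}),
    ),
    TranslationRelation("ModelA", "ModelC", frozenset({"consequences"})),
    TranslationRelation(
        "ModelB", "ModelC", frozenset({"interpretations", "consequences"})
    ),
]

def candidate_targets(source, required):
    """Targets reachable in one step by a translation preserving `required`."""
    required = frozenset(required)
    return sorted(
        relation.target
        for relation in RELATIONS
        if relation.source == source and required <= relation.preserves
    )
```

A tool could present the result of candidate_targets to its users so that they pick a target model knowing what the translation keeps.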
None of the above cited works specifies or exploits an ontology representing AEs of different KRL models in a uniform and organized way. KRLO does so by setting as many generalization relations and partOf relations as possible and, for these last ones, using the operator-arguments schema detailed in Section 3. These relations make it possible to represent many semantics-preserving structural translations without using functions or a rule-based format, hence in a simple and concise way. KRLO specifies some other translations via functions and rules. Since only structures are considered, these translations so far only require the expressiveness of definite Horn rules with equality. Thus, KRLO has a version using OWL-RL and RIF-BLD, the W3C KRL model for rules with such expressiveness. In KRLO, since the translated structures are akin to reified statements, the original statements can even follow an HOL model: this does not change the expressiveness required to convert structures. Currently, translations represented via rules in KRLO are only structure-preserving structure translation rules. Thus, for other translations, KRLO users also have to exploit complementary ontologies and, possibly, more powerful inference engines. In the future, KRLO will be added to Ontohub and made exploitable via HETS.
None of the above cited works specifies or exploits an ontology of notations. KRLO does. This permits the import, translation and export tasks to exploit one and the same ontology of KRLs. Thus, our approach extends the 'family of languages' approach and may be seen as a pivot approach based on such an ontology of KRLs instead of a KRL. As illustrated in Section 4, the use of generalization relations between AEs, models or notations can maximize modularity and reuse.
Besides KRLO and Ontohub, there is another ontology relating KRL models not belonging to the same family: ODM 1.1 [ODM, 2014], an OMG specification. It uses UML for representing four KRL models: RDF, OWL, CL and Topic Maps. It also relates a few AEs of different models via semantic relations such as generalization or equivalence relations. Finally, it gives QVT rules for direct mappings between AEs of different models. Since direct mappings are used instead of a few primitives for defining and relating the various AEs, the heterogeneity of the various KRL models is not eliminated. This heterogeneity also makes the AEs difficult to compare or exploit in a uniform way. Finally, QVT rules are not directly usable by inference engines and translating them into KRLs may not be easy. We are not aware of works using ODM for KR import, translation or export purposes. Similarly, we have not found ontologies for notations, even for RDF notations. Hence, apart from our KRLO based translator, it seems there is no other translator based on an ontology for a KRL model and a notation. There are translators between notations for the RDF model, e.g., RDF-translator, RDF Distiller, EasyRDF, RDF2RDF, JSON-LD Playground, the "Transformation tool" of Corese and the OWL Syntax Converter which exploits the OWL API. Their Web interfaces or APIs propose limited or no ways to parameterize their knowledge import, translation and export processes. The JSON-LD Playground is the most advanced of these tools regarding the parameterization of a notation since it handles four formats for the JSON-LD notation: "Expanded", "Compacted", "Flattened" and "Framed". However, the user cannot adapt these formats: they are predefined. They are described in the official JSON-LD specification.
To sum up the above paragraphs of this Section 3.2, regarding KRL translation, a kind of use case for our approach is an increased simplification of the implementation of this process. A related kind of use case is to enable end users to parameterize, extend or adapt KRL translations. Adding a new KRL specification to KRLO – e.g., via its Web servers – may be sufficient to specify the structural translation of this KRL to other represented KRLs: no transformation rule or function may be required. Another convenient feature is that no specific KRL such as those of ODE is necessary: any KRL with at least RIF-BLD expressiveness can be used. Finally, as detailed by the authors of DOL, LATIN and [Euzenat, 2001], it is easier to implement a tool generating proofs for the properties of its translations if the tool exploits rules or relations in an ontology.
Without abstract and concrete models, exporting has to be done procedurally. Centaur exploited concrete and abstract grammars for both export and import purposes. For KRL models that have a MOF or XML based notation, languages such as Mof2Text, XSLT and CSS can be used for generating KRs in other notations. However, this is not easy since these languages are not KRLs or KR query languages. This is why there have been several works on rule-based and/or style-sheet based transformation languages for RDF. They specify how RDF AEs may be presented, e.g., in a certain notation, in a certain order, in bold, in a pop-up window, etc. Examples of such tools are Xenon [Quan, 2005], Fresnel [Pietriga et al., 2006], OWL-PL [Brophy & Heflin, 2009] and SPARQL Template [Corby & Faron-Zucker, 2014]. These tools were not initially meant to use a notation ontology: they initially required the use of a new style-sheet for each target notation. However, some of these tools – e.g., SPARQL Template – could exploit KRLO since it can be represented in RDF+OWL2-RL or SWRL [Horrocks et al., 2004], an RDF-based KRL for representing Horn rules. This is upcoming work with the author of SPARQL Template.
Here again, a kind of use case for our approach is the simplification of KR export according to each user's preferences. Exporting KRs may include the generation of hyperlinks to let users navigate from CEs in query results to other KRs of the queried KB, and then again from these KRs to other ones. A related kind of use case is to enable end users to parameterize, extend or adapt such exports, e.g., for knowledge pretty-printing.
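
As a rough illustration of export driven by notation specifications rather than by procedural code per notation, consider the Python sketch below. The nested-tuple encoding of abstract elements, the two notation names and their templates are assumptions made for the example; KRLO's actual notation specifications are not reproduced here.

```python
# Hypothetical notation specifications: for each abstract operator, a template
# giving its concrete form. Names and templates are illustrative only.
NOTATIONS = {
    "functional": {
        "subClassOf": "SubClassOf({0} {1})",
        "and": "ObjectIntersectionOf({0} {1})",
    },
    "infix": {
        "subClassOf": "{0} < {1}",
        "and": "({0} and {1})",
    },
}

def export(element, notation):
    """Render a nested (operator, arg1, ...) tuple in the given notation."""
    if isinstance(element, str):
        return element  # a leaf, e.g. a type name, is written as-is
    operator, *arguments = element
    template = NOTATIONS[notation][operator]
    rendered = [export(argument, notation) for argument in arguments]
    return template.format(*rendered)
```

In this sketch, letting an end-user adapt an export amounts to editing or adding an entry in NOTATIONS; the export function itself stays unchanged.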
One contribution of this article was to present the interest of exploiting a homogeneous ontology of KRL models and notations when implementing tools importing, translating and exporting knowledge in many KRLs. The beneficiaries of this exploitation are the tool designers and the tool end-users. Homogeneous domain ontologies are commonly used instead of rules or procedural code to reduce coding effort and increase the possibilities of reuse or parameterization. KRLO makes this possible for the handling of KRLs. It can also be used to ease the definition and comparison of KRLs.
Another contribution was to show how we created such an ontology, which main ideas we used, and which first tools we designed to help its exploitation. A related underlying contribution is the resulting ontology, KRLO, which i) represents and relates several models and notations into a unified framework, ii) declaratively specifies some import and export methods, and iii) can be extended with additional specifications by Web users.
Our approach provides an ontology-based concise alternative to the use of XML as a meta-language for creating KRLs that follow KRL ontologies. Thus, it is also a complement to GRDDL, the W3C recommendation for specifying where a software agent can find tools – e.g., XSLT ones – to convert a given KRL to RDF/XML. This new research avenue is important given the frequent need for applications to i) integrate or import and export from/to an ever growing number of models and notations, ii) exploit ad hoc uses of them, and iii) let application end-users adapt them.
Our translation server and its inference engine have recently been implemented by the second author of this article, an employee of the software company Logicells. This company will use this work in some of its software products for them to i) collect and aggregate knowledge from knowledge bases, and ii) enable end-users to adapt the input and output formats to their needs. The goal behind these two points is to make these products – and the other ones they interconnect – more (re-)usable, flexible, robust and inter-operable. By itself, our translation server is currently only a proof of concept, not a claimed contribution. Our generic KRL analyser is a contribution but, as noted, is not fully in line with our approach. We shall continue our work on KRLO to i) implement declarative import methods, ii) integrate more abstract models and notations for KRLs as well as query languages and programming languages, and iii) complement our notation ontology by a presentation ontology with concepts from style-sheets and, more generally, user interfaces.