Project Name: WebKB
Project Title: Knowledge Warehousing and Querying
Contact Researcher: Philippe MARTIN
E-mail: pm .@. phmartin dot info
Phone: +61 7 5552 8923
Project Team:
Dr Philippe MARTIN | DSTC | 100 |
The WebKB project is focused on precision-oriented Knowledge Representation. It has been a sub-project of the KVO project since January 1997, and has been supported by DSTC since January 2000. Since January 2003, the WebKB project has been separate from the KVO project, which has re-centered on its FCA-based data classification activities (FCA: Formal Concept Analysis).
The line of research of the WebKB project is to permit and help (i) the representation or structuring of any kind (and amount) of information in a formal or semi-formal way, manually or semi-automatically, and (ii) the retrieval and comparison of this represented information.
Before January 2000, research was focused on the design and implementation
of WebKB-1, a private knowledge annotation server, i.e. a server
permitting internet/intranet users
(i) to create documents that contain knowledge representations (formal
statements), possibly mixed with usual document elements (text, images),
and possibly indexing sentences or images in other Web documents,
(ii) to retrieve these knowledge representations, or the document elements
they index/annotate, via knowledge queries.
In other words, each user may create and document private knowledge bases (KBs)
within Web files, and ask WebKB-1 to parse the statements or queries within
these files. (Associating WebKB queries with hyperlinks is a simple way to create
virtual/dynamic documents.)
Since 1999, the "Semantic Web" activity of the W3C has focused on designing
a language (RDF) that permits the creation of such private knowledge bases.
However, RDF is not very expressive, is difficult to read and use, includes
neither queries nor a command language, does not permit the statements
to be mixed with other document elements, and has only basic indexing
capabilities. It is also too low-level to ease the sharing/re-use of knowledge
between knowledge providers.
Work on WebKB-1 was stopped in January 2000. Since then, WebKB research
has been focused on easing knowledge sharing between users.
Since the W3C approach (i.e. letting users set specialisation or identity
links between categories within private KBs) is difficult and extremely
far from the optimum (compared to what a more centralised approach can offer),
the adopted approach was to design WebKB-2 (www.webkb.org),
a shared knowledge annotation server, i.e. a server permitting
internet/intranet users to update a shared KB on the server machine.
WebKB-2 has several advantages over other current knowledge servers:
(i) it has most of the features of WebKB-1, i.e. it also supports
private KBs within documents and the indexation of document elements,
(ii) its shared KB can be very large and has a large default initial
ontology which is a merge, correction and extension of the WordNet natural
language ontology and several top-level ontologies,
(iii) WebKB-2 proposes high-level intuitive and expressive input/output
notations and can exploit the shared KB to generate menus to guide knowledge
entering,
(iv) its category naming scheme and its update/cooperation protocols make
it the only system permitting its users to update the same KB without
lexical/semantic conflicts or redundancies and without forcing the
users to agree with each other,
(v) it has a procedural language of commands and various query/navigation
mechanisms that permit people or automated agents to explore the KB easily and
filter out the knowledge of certain users or kinds of users if necessary.
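Points (iv) and (v) can be illustrated with a minimal sketch. It is not the WebKB-2 implementation or its notations: the `Statement` and `SharedKB` names, the `user#name` category identifiers, and the user names are all hypothetical. It only assumes that each statement in a shared KB records its author, so that users need not agree with each other and a query can leave out the knowledge of chosen users.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One assertion in the shared KB, attributed to its author (hypothetical model)."""
    author: str    # hypothetical user identifier
    subject: str   # hypothetical category identifier, prefixed by its creator
    relation: str
    obj: str


class SharedKB:
    """Toy shared knowledge base: stores attributed statements and filters by author."""

    def __init__(self):
        self._statements = []

    def add(self, statement):
        # Reject exact duplicates so the same assertion is not stored twice.
        if statement in self._statements:
            return False
        self._statements.append(statement)
        return True

    def query(self, relation=None, exclude_authors=()):
        # Return statements, optionally restricted to one relation and
        # leaving out everything asserted by the excluded authors.
        excluded = set(exclude_authors)
        return [
            s for s in self._statements
            if s.author not in excluded
            and (relation is None or s.relation == relation)
        ]


kb = SharedKB()
kb.add(Statement("pm", "pm#hotel", "subtype_of", "wn#accommodation"))
kb.add(Statement("u2", "u2#hotel", "subtype_of", "wn#building"))
kb.add(Statement("u2", "u2#hotel", "located_in", "u2#sunshine_coast"))

# Both users keep their own view of "hotel"; a reader may ignore user "u2".
without_u2 = kb.query(exclude_authors=["u2"])
subtype_links = kb.query(relation="subtype_of")
```

In this sketch the two users define "hotel" differently without either definition overwriting the other, because each category identifier carries its creator's prefix and each statement carries its author.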
Because the implementation of WebKB-2 began in January 2000 and because, unlike WebKB-1, it has been built on top of a free object-oriented database system, its IP fully belongs to DSTC. Although many small features still need to be implemented, WebKB-2 and its resources (e.g. notations and ontology) may be considered at the "pilot stage" for the goal of manual "knowledge storage, retrieval and comparison". Many small examples have been built (see http://www.webkb.org/kb/) and a customised application ("locating and comparing short-term accommodations on the Sunshine Coast") is in development (http://www.webkb.org/SC/). This application will be progressively extended to include other Yellow-Pages-like or Auction-like services for the same area (the Sunshine Coast). Its interface is more user-friendly than the generic interface of WebKB-2 but retains much flexibility for the knowledge representation process.
Tools to (semi-)automatically extract knowledge from general documents would be very interesting since, despite the features of WebKB-2, manual knowledge representation remains a slow and difficult process. However, to be used for query answering, the extracted representations must not only be of good quality (i.e. explicit and factual) but must also be converted into a normalized form that permits their comparison and hence logical inferences. Both aspects are very difficult and have never been achieved outside very limited domains. After conducting some tests on sentences from research articles, I believe the first aspect might be reached within a year but the second aspect (in its general case) would need much more time. This work is clearly at the "idea stage".
To re-use knowledge produced by people not using WebKB-2 (e.g. knowledge in RDF) or to permit the content of the KB to be exploited by other knowledge representation tools, some import/export wrappers need to be added to WebKB-2. The implementation of two such import wrappers (one for RDF and one for KIF) has begun. They are at the "exploration stage". However, the relevance of this work is uncertain. KIF (or its future successor CommonLogic) will probably remain an academic standard for a long time. On the other hand, RDF (or its extension OWL), the probable future industry standard, is very likely to remain too low-level to support (non-basic) knowledge re-use/sharing (except in peer-to-peer ad-hoc ways).
Thus, in the short term, a good part of the work on WebKB-2 will be focused
on extending the above-cited application or other ones.
Another emerging direction is the application of the WebKB-2 model to extend
UML for knowledge representation purposes. To that end, cooperation with
Kerry Raymond on answering an "Ontology Definition Metamodel RFP" from the OMG
has just begun.
1. A long description (35p) of WebKB-2, its principles, and its relevance to the "Semantic Web" was published in January as part of a book titled "Web Intelligence" (Springer-Verlag, Eds.: N. Zhong, J. Liu, Y. Yao).
2. A paper on the rationale and methodology for correcting and extending WordNet has been accepted for publication at ICCS 2003 (the conference will be in July). The resulting ontology has been translated into various formats (e.g. RDF) and made Web-accessible (www.webkb.org/doc/wn/). It has also been advertised on three mailing lists, and private enquiries about it have been answered.
3. An abstract has been submitted to Evolve 2003.
4. The application of WebKB-2 for "locating and comparing short-term accommodations on the Sunshine Coast", which began mid-December, has reached a usable form. A presentation and demo were given at the Participant Forum in February. The original idea of this application (using WebKB-2 to complement the hotel brokering services of the Wotif Web site) comes from Andy Bond. A meeting with Paul Young, director of Wotif, will help decide which directions should be explored further.
Cooperation with Kerry Raymond on answering an "Ontology Definition Metamodel RFP" from the OMG has just begun.
The initial investigation (3 days) into the automatic, normalized representation of English sentences extracted from research papers revealed the extreme difficulty of the task. This (semi-)automatic representation project is at the heart of a proposal for a possible next "Smart Internet CRC" that Prof. Abdul Sattar (Griffith Uni.; leader of the "Knowledge Representation and Reasoning Unit") would like me to be involved in.
The "Wotif application" has been implemented. | Demo given at the Participant forum. | Further extensions will be implemented. |
Wotif application | To be further extended. Demonstration at Demo Day in May |
Ontology Definition Metamodel RFP | To be completed by September |
Comments (if applicable):
Organisation | Paper/Chapter | Journal/Conference/Book | Published Date |
Ph. Martin | Knowledge Representation, Sharing and Retrieval on the Web. | Chapter of a book titled "Web Intelligence" | Springer Verlag, LNCS, January 2003 |
Ph. Martin | Correction and Extension of WordNet 1.7 | ICCS 2003 | Springer Verlag, LNCS, July 2003 |
Comments (if applicable):
Organisation | and Organisation | and Dates | |
Elemental? | Kerry Raymond | Ontology Definition Metamodel RFP | 8/04 (4 h), 16/04 (1 h) |
Comments (if applicable):
9.1 Presentations to participants:
and Organisation | and Organisation | presentation/demonstration | |
Philippe MARTIN | Participants meeting at DSTC | Knowledge warehousing with WebKB-2 | 4/03/2003 |
Comments (if applicable):
and Organisation | Organisation, Location | | | |
Ph. Martin | Technische Universitaet, Dresden, Germany | ICCS 2003 + recreation leave | 20/06 to 12/08 (conference: 21-25/07) | DSTC |
Comments (if applicable):