Towards more semantically structured and integrated course materials, assignments and student feedback
A subject for an M.Sc. thesis (more precisely, a "D.E.A." in French)
Introduction
Computer-assisted learning, and hence e-learning, most often relies only on
email, a discussion forum and Web-accessible hypermedia documents for
the course materials. However, most advances in CSCW (Computer Supported
Cooperative Work), or more generally Information Technology, can be
re-used for computer-assisted learning (e.g.,
online games and
simulations can be used for motivating students and letting them experiment).
Here, we focus on the application and refinement of certain knowledge representation
and sharing techniques. We distinguish four kinds of research in that general
direction.
- Creating a formal knowledge base (KB) representing one or several research domains
to support automatic problem solving (and then the explanation of the provided
solution). These are ambitious and long-term projects.
One example is the QED Project
which aims to build a formal KB of all important, established mathematical knowledge.
Another example is the "Digital Aristotle"
project which aims to build a system capable of teaching much of the world's
scientific knowledge by
(i) adapting
to its students' knowledge and preferences (Hillis, 2004), and
(ii) preparing and answering (with explanations) test questions for them
(this implies the encoding of the knowledge in a formal way and
meta-reasonings on the problem-solving strategies).
- Creating tools that let people (usually researchers) add to a database (DB) or a
large KB to represent the factual knowledge of a domain in a (rather) formal way,
for knowledge
retrieval/sharing/teaching purposes rather than for automatic problem solving purposes.
A DB contains a lot of data that follow a small and fixed conceptual schema.
On the other hand, a KB contains an ontology, that is, a set of
user-provided definitions for categories (concept types, relation types, instances),
and a set of facts or rules written using the defined categories.
Examples of such DBs/KBs/ontologies are the botanical databases, the
Gene ontology and the
OpenGalen medical ontology.
In order to ease the creation of a shared KB, a KB system must be Web-accessible
(it is then called a KB server or an ontology server), should provide user-friendly
interfaces/notations and a default large lexical ontology, and should have
multi-user cooperation protocols.
WebKB-2 (Martin, 2003) is, to our knowledge,
currently the only knowledge server that has these last three characteristics.
- Supporting the manual, semi-automatic or automatic indexation/representation,
and then retrieval, of (parts of) documents (e.g., research articles or
user manuals) with categories or formal statements for information retrieval (IR)
purposes.
Depending on its focus, such research may be catalogued in different areas:
Document Retrieval, precision-oriented IR, Question Answering, Semantic Web,
Knowledge Extraction, Natural Language Understanding/Extraction,
Terminological Analysis, etc. With formal annotation servers such as
WebKB-1 (Martin & Eklund, 2000)
people can mix formal indexations/representations with informal statements within the
same document. In all cases, however, information sharing is restricted to adding
a document to a collection of documents.
- Letting people (e.g., teachers, researchers, students) add, edit and structure
information in a semi-formal way into a shared repository.
This leads to less information redundancy, and eases information
retrieval/understanding/comparison. Some tools for that are
argumentation systems such as AAA (Schuler & Smith, 1990),
typed hypertext systems such as MacWeb (Nanard et al., 1993),
semi-formal KB systems such as CODE4 (Skuce & Lethbridge, 1995),
formal KB servers such as WebKB-2, and nowadays
semantic wikis,
that is, wiki systems such as MediaWiki (which supports Wikipedia, Wiktionary, etc.)
but allowing the use of some semantic relations such as subtypeOf, partOf and
agentOf. Some of these systems support (large) KBs and hence can also be used
for Point 2.
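The distinction made in Point 2 between a DB (fixed schema) and a KB (user-provided
ontology plus facts written with the defined categories) can be illustrated with a
minimal Python sketch. The class name KnowledgeBase and the example categories are
hypothetical; they are not taken from WebKB-2 or any other cited system, and a real
KB server would support definitions, rules and inference rather than bare triples.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Toy KB (hypothetical): declared categories plus facts that use them."""
    concept_types: set = field(default_factory=set)
    relation_types: set = field(default_factory=set)
    facts: list = field(default_factory=list)  # (source, relation, target) triples

    def add_fact(self, source, relation, target):
        # A fact is accepted only if every term it uses has been declared first.
        if relation not in self.relation_types:
            raise ValueError(f"undeclared relation type: {relation}")
        for term in (source, target):
            if term not in self.concept_types:
                raise ValueError(f"undeclared category: {term}")
        self.facts.append((source, relation, target))


kb = KnowledgeBase()
# Unlike a fixed DB schema, the ontology is extended by the users themselves.
kb.concept_types.update({"parser", "chart_parser", "parse_tree"})
kb.relation_types.update({"subtypeOf", "output"})
kb.add_fact("chart_parser", "subtypeOf", "parser")
kb.add_fact("parser", "output", "parse_tree")
```

In this sketch, adding a fact that uses an undeclared relation type (e.g., partOf)
raises an error until a user declares that type, which is the sense in which the
facts are "written using the defined categories".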
The research described below relates to Point 2 and Point 4.
We believe that learning will be improved if a good part of the information from the
course-related materials, assignments and discussions with/between the students is (or
can incrementally become) semantically structured (and hence inter-related or integrated).
More precisely, teachers and students should be able to represent the connections from
an object (concept, definition, assertion, question) to other objects at the level of
precision they want but in an explicit and semantically valid way, and in an easily
retrievable way. For example, for a course about Natural Language Parsing,
a semantic network of the various involved concepts (techniques, possible goals, features
and inputs or outputs of the techniques, tasks, subtasks, strategies, building blocks,
problems, debated ideas, existing tools, possible features of such tools, etc.)
should be given and organised using specialization relations, partOf relations,
argumentation relations, etc. More concrete examples are referred to below.
Such a semantic network has several advantages over a "bunch of sentences".
For example,
(i) it is much less ambiguous than informal texts (e.g., it is often difficult
to understand and remember the specialization relations between techniques or features
when reading a book or a manual, and hence understand or compare techniques or tools),
(ii) it allows easy access to a particular object and its related objects
(e.g., its features, specializations, arguments, counter-arguments), and then again
this eases the understanding of an object or a whole domain,
(iii) it can be extended by a teacher or any student for feedback purposes (e.g.,
for a student to clarify or ask for a clarification of a point that she found ambiguous or
badly explained in a course material or an assignment) or for evaluation purposes
(correctly extending a part of such a network shows a good understanding of the related
part of the course), and
(iv) if it is a good state of the art, it could be adopted by researchers in
the domain to represent their research, thereby becoming an alternative (and much more
efficient for knowledge sharing purposes) way to publish or retrieve information on a
domain than journals or Web pages.
Updating a semantic network (correctly) is more difficult than writing sentences
since it forces analysis and the respect of some normalization rules.
This is both a challenge (how to make updates easy enough) and an advantage (the network
can be searched, filtered or, more generally, exploited, in much deeper ways than
informal sentences). Not all course-related materials, assignments and discussions
need to be organised into the global semantic network, only the parts for
which this seems worthwhile.
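As an illustration of advantage (ii) above, here is a minimal Python sketch of a
semantic network with typed relations, in which the direct and indirect
specializations of a concept can be retrieved. The class SemanticNetwork and the
Natural Language Parsing concepts used as data are assumptions made for this example
only; they are not an excerpt of an existing course model or of a WebKB-2 notation.

```python
from collections import defaultdict


class SemanticNetwork:
    """Toy network (hypothetical): typed, directed links between named objects."""

    def __init__(self):
        # _links[relation][source] is the set of targets of that relation.
        self._links = defaultdict(lambda: defaultdict(set))

    def add(self, source, relation, target):
        self._links[relation][source].add(target)

    def related(self, source, relation):
        """Objects directly linked from `source` by `relation`."""
        return set(self._links[relation].get(source, set()))

    def specializations(self, concept):
        """All direct and indirect subtypes of `concept`."""
        found, stack = set(), [concept]
        while stack:
            current = stack.pop()
            for source, targets in self._links["subtypeOf"].items():
                if current in targets and source not in found:
                    found.add(source)
                    stack.append(source)
        return found


net = SemanticNetwork()
net.add("top_down_parsing", "subtypeOf", "parsing_technique")
net.add("bottom_up_parsing", "subtypeOf", "parsing_technique")
net.add("recursive_descent_parsing", "subtypeOf", "top_down_parsing")
net.add("lexical_analysis", "partOf", "natural_language_parsing")

print(sorted(net.specializations("parsing_technique")))
print(net.related("lexical_analysis", "partOf"))
```

A student extending such a network (e.g., by adding a new subtype of
top_down_parsing) makes the new object immediately retrievable from every more
general concept, which informal sentences do not permit.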
Here are various resources that are important to consider, re-use and extend
for this M.Sc. thesis.
- This article (Martin et al., 2005)
describing how WebKB-2 and its cooperation protocols (which permit its users to store
and tightly interconnect their knowledge into a same large KB without having to
agree on terminology or beliefs) can be extended for handling semi-formal knowledge
(e.g., structured discussions) and comparing tools or techniques.
- These first quick modellings of various domains:
Logics,
Information Sciences,
Knowledge Management,
Conceptual Graphs,
Formal Concept Analysis.
- This comparison of CG tools.
- These structured discussions:
about abortion,
about XML for Knowledge Representation.
- This experiment about creating a semantic
network for Workflow Management concepts and letting students extend it
(note: no knowledge server was used during this experiment).
- This experiment evaluating different
computer-supported argumentation approaches.
- This article about
concept maps. Note: this article is a good source of information but,
because concept maps are not formal enough, they are often less interesting
(less understandable, more ambiguous) than the sentences they represent (this is
also why they can often be automatically extracted from these sentences).
However, concept maps are already used in teaching for reasons similar to the
ones outlined above (e.g., see
this site, or search the Web for "concept map" and "teaching").
- Semantic wikis are also
used in teaching for reasons similar to the ones outlined above (e.g., search the Web for
"semantic wiki" and "teaching"). Although still very interesting for learning,
Wikipedia is regrettably not structured enough (strangely, Wikipedia even has
policies against structuring) and many semantic wikis are also mainly informal.
Proposed tasks for this thesis
Modelling tasks:
- Representing and comparing the concepts, tools and techniques related to
knowledge-oriented approaches usable for computer-assisted learning (especially
for the above cited Point 4).
If time permits, the student should also do so for another domain of his/her choice.
The models to follow are described in the above cited
article,
comparison of CG tools and
first quick modellings of various domains
(hence, WebKB-2 and its ontology should be used whenever (semi-)formal parts are
used).
- Organising the debates about the
knowledge-oriented approaches usable for computer-assisted learning (especially
for the above cited Point 4) into
structured discussions.
If time permits, the student should also do so for another domain of his/her choice.
This may lead to suggestions of extensions of the current notation.
Some research into how to extend the "algorithm to quantify the popularity and
originality of each contribution and contributor" or other ways to exploit
structured discussions should be considered.
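To make the last point more concrete, here is a minimal Python sketch of one naive
way to exploit a structured discussion. It is NOT the algorithm of the above cited
article, whose details are not reproduced here: the scoring rule (+1 for each
argument received, -1 for each objection received), the function names and the
example statements about XML are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    ident: str
    author: str
    text: str


def contribution_scores(contributions, links):
    """Naive score (assumed rule): +1 per argument received, -1 per objection.

    `links` holds (source_id, relation, target_id) triples, with relation
    being either "argument" or "objection".
    """
    scores = {c.ident: 0 for c in contributions}
    for _source, relation, target in links:
        if relation == "argument":
            scores[target] += 1
        elif relation == "objection":
            scores[target] -= 1
    return scores


def contributor_scores(contributions, links):
    """Sum, for each author, the scores of his/her contributions."""
    per_contribution = contribution_scores(contributions, links)
    totals = {}
    for c in contributions:
        totals[c.author] = totals.get(c.author, 0) + per_contribution[c.ident]
    return totals


# Hypothetical discussion; the statements are invented for this example.
discussion = [
    Contribution("c1", "alice", "XML is a convenient exchange syntax for KR"),
    Contribution("c2", "bob", "Many existing tools can parse XML"),
    Contribution("c3", "carol", "XML-based notations are verbose"),
    Contribution("c4", "bob", "Tools can hide the notation from users"),
    Contribution("c5", "dave", "XML is already a widely used standard"),
]
links = [
    ("c2", "argument", "c1"),
    ("c3", "objection", "c1"),
    ("c4", "objection", "c3"),
    ("c5", "argument", "c1"),
]
print(contribution_scores(discussion, links))
print(contributor_scores(discussion, links))
```

Such a rule ignores, for instance, who made each argument and whether an objection
was itself refuted; possible refinements of this kind are part of what the research
task above would have to consider.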
Programming tasks: (yet to be decided)
All modellings and code should be in English only.
If this is legally possible, the thesis should be written in English too.
References
W.D. Hillis (2004).
"Aristotle" (The Knowledge Web).
Edge Foundation, Inc., No 138, May 6, 2004.
Ph. Martin & P. Eklund (2000).
Knowledge Indexation and Retrieval and the World Wide Web.
IEEE Intelligent Systems,
special issue "Knowledge Management and Knowledge Distribution over the Internet",
May/June 2000.
Ph. Martin (2003a).
Knowledge Representation, Sharing and Retrieval on the Web.
Chapter of a book titled "Web Intelligence",
(Eds.: N. Zhong, J. Liu, Y. Yao), Springer-Verlag, Jan. 2003.
Ph. Martin (2003b).
Correction and Extension of WordNet 1.7.
Proc. of ICCS 2003
(Dresden, Germany, July 2003), Springer Verlag, LNAI 2746, 160-173.
Ph. Martin, M. Blumenstein and P. Deer (2005).
Toward cooperatively-built knowledge repositories.
Proc. of ICCS 2005,
(Kassel, Germany, July 2005), Springer Verlag, LNAI 3596, 411-424.
W. Schuler and J.B. Smith (1990).
Author's Argumentation Assistant (AAA): A Hypertext-Based Authoring Tool for
Argumentative Texts. Proc. of ECHT'90 (INRIA, France, Nov. 1990),
Cambridge University Press, 137-151.
D. Skuce and T.C. Lethbridge (1995).
CODE4: A Unified System for Managing Conceptual Knowledge.
International Journal of Human-Computer Studies, 42, 413-451.
See also the successor / commercial version: Fact Guru.
D.A. Smith (1998).
Computerizing computer science.
Communications of the ACM, 41(9), 21-23.
G. Kassel, M.-H. Abel, C. Barry, P. Boulitreau, C. Irastorza & S. Perpette (2000).
Construction et exploitation d'une ontologie pour la gestion des connaissances d'une
équipe de recherche. Actes de la Conférence en Ingénierie des Connaissances,
IC 2000 (Toulouse, France), 251-259.
J. Nanard, M. Nanard, A. Massotte, A. Djemaa, A. Joubert, H. Betaille,
J. Chauché (1993). Integrating Knowledge-based Hypertext and Database for
Task-oriented Access to Documents, in Proc. DEXA 1993, Prague, Springer Verlag,
LNCS Vol. 720, 721-732.