![]() |
The semantic web is a mesh of information linked up in such a way as to be easily processable
by machines, on a global scale. A characteristic of semantic web content is that it is
annotated in a machine-understandable fashion. In our opinion, three properties
ensure machine understandability: being explicit makes an annotation publicly accessible,
being formal makes an annotation publicly agreeable, and being unambiguous makes an
annotation publicly identifiable. The process of upgrading existing web pages into
machine-understandable semantic web pages is called web semantic annotation.
Currently, web semantic annotation research is my major interest. There are also some other research projects in which I have been or am currently involved. |
|
|
| The figure on the right shows our two-layer annotation model. The lower-layer conceptual annotator uses an ontology-based IE tool. Since ontology-based IE tools are resilient to changes in web pages, so is the conceptual annotator. Hence the conceptual annotator can work immediately on web pages within the domain as soon as they come online. The upper-layer structural annotator uses a layout-based IE tool. Since layout-based IE tools execute fast and, when properly constructed, have high accuracy, the structural annotator also executes fast and has high accuracy. In general, the system passes an arbitrary input web page to the conceptual annotator to fulfill the requirement of resiliency. When there is a large set of input documents that follow a similar layout pattern, the system automatically builds a structural annotator from the results the conceptual annotator produces on a small set of sample web pages. The system then uses this dynamically created structural annotator to annotate the remaining, much larger set of documents quickly and with high accuracy. | ![]() |
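
The control flow of the two-layer model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not our actual implementation: the regular-expression `RECOGNIZERS` stand in for an ontology-based IE tool, the prefix/suffix rules stand in for a layout-based IE tool, and the names `conceptual_annotate`, `build_structural_annotator`, and `annotate_collection` are hypothetical.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

Annotation = Dict[str, str]  # concept name -> extracted value

# Hypothetical stand-in for an ontology-based IE tool: one recognizer per concept.
RECOGNIZERS = {
    "Price": re.compile(r"\$\d{1,3}(?:,\d{3})*"),
    "Year": re.compile(r"\b(?:19|20)\d{2}\b"),
}

def conceptual_spans(page: str) -> Dict[str, Tuple[int, int]]:
    """Lower layer: layout-independent, scans the whole page per concept."""
    spans = {}
    for concept, pattern in RECOGNIZERS.items():
        match = pattern.search(page)
        if match:
            spans[concept] = match.span()
    return spans

def conceptual_annotate(page: str) -> Annotation:
    return {c: page[s:e] for c, (s, e) in conceptual_spans(page).items()}

def common_prefix(strings: List[str]) -> str:
    prefix = strings[0]
    for s in strings[1:]:
        while not s.startswith(prefix):
            prefix = prefix[:-1]
    return prefix

def common_suffix(strings: List[str]) -> str:
    return common_prefix([s[::-1] for s in strings])[::-1]

def build_structural_annotator(
    samples: List[str],
) -> Optional[Callable[[str], Annotation]]:
    """Upper layer: derive (left, right) delimiters per concept from the
    conceptual annotator's results on the sample pages (naive wrapper induction)."""
    if not samples:
        return None
    spans = [conceptual_spans(page) for page in samples]
    rules: Dict[str, Tuple[str, str]] = {}
    for concept in RECOGNIZERS:
        if not all(concept in s for s in spans):
            continue  # concept missing from a sample: no layout rule for it
        left = common_suffix([p[: s[concept][0]] for p, s in zip(samples, spans)])
        right = common_prefix([p[s[concept][1]:] for p, s in zip(samples, spans)])
        if left and right:
            rules[concept] = (left, right)
    if not rules:
        return None

    def annotate(page: str) -> Annotation:
        result = {}
        for concept, (left, right) in rules.items():
            i = page.find(left)
            if i < 0:
                continue
            start = i + len(left)
            end = page.find(right, start)
            if end >= 0:
                result[concept] = page[start:end]
        return result

    return annotate

def annotate_collection(pages: List[str], sample_size: int = 2) -> List[Annotation]:
    samples = pages[:sample_size]
    results = [conceptual_annotate(page) for page in samples]
    structural = build_structural_annotator(samples)
    for page in pages[sample_size:]:
        fast = structural(page) if structural else {}
        # Fall back to the resilient layer when the layout rules do not fit.
        complete = len(fast) == len(RECOGNIZERS)
        results.append(fast if complete else conceptual_annotate(page))
    return results
```

The fallback in `annotate_collection` is the design point: the structural rules are only trusted on pages that share the sampled layout, and any page they fail on is handed back to the conceptual layer. The delimiter learning here is deliberately naive and can overfit when sample values happen to share characters; a real layout-based tool would need a more robust induction step.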
|
|
|
The purpose of this ontology assembling research is to maximize the reuse of
existing ontologies and minimize the work of constructing new ontologies.
The figure on the right illustrates this goal. If some part of
(or, in the best case, all of) the required
formal semantics has already been built in a collection of knowledge, we want to adopt the existing semantics to avoid the
work of reconstructing it from scratch. Therefore, the task becomes determining how
the system can find useful knowledge components and assemble them into the
domain ontology that describes the information in a web page presented by the user.
The collection of knowledge contains previously used ontologies, ontology components, and data frames. In addition, the collection also contains previously used mapping information. Our policy, however, is not to enforce any integration of the knowledge within the collection unless a process demands it. The collection therefore stays "open-minded" instead of "stubborn". When a web page is presented, the system selects components of existing formal semantics using their data recognition mechanisms. Then we both use existing mapping information and apply new mapping procedures to assemble these components into a new domain ontology. During the process, when some required knowledge is not already contained in the collection, the system will link users to a manual ontology generation tool so that they can create it and add the new formal semantics to the collection of knowledge. |
![]() |
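
The selection and assembly steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not our system: `Component`, `select_components`, and `assemble` are hypothetical names, regular expressions stand in for the data recognition mechanisms of data frames, stored mappings are modeled as a simple renaming table, and the caller-supplied `required` set is an assumed way of detecting knowledge that the collection lacks.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

@dataclass
class Component:
    """A reusable piece of formal semantics with its own recognizers (assumed shape)."""
    name: str
    recognizers: Dict[str, str]  # concept name -> regex recognizing its data

def select_components(page: str, collection: List[Component]) -> List[Tuple[Component, Set[str]]]:
    """Keep each component together with the concepts whose recognizers fire on the page."""
    selected = []
    for component in collection:
        hits = {c for c, rx in component.recognizers.items() if re.search(rx, page)}
        if hits:
            selected.append((component, hits))
    return selected

def assemble(
    page: str,
    collection: List[Component],
    mappings: Dict[Tuple[str, str], str],
    required: Set[str],
) -> Tuple[Dict[str, Set[str]], Set[str]]:
    """Merge the selected concepts into one ontology: concept -> contributing components.

    Mappings are consulted only here, when assembly demands them; the
    collection itself is never integrated in advance.
    """
    ontology: Dict[str, Set[str]] = {}
    for component, hits in select_components(page, collection):
        for concept in hits:
            canonical = mappings.get((component.name, concept), concept)
            ontology.setdefault(canonical, set()).add(component.name)
    # Concepts the collection could not supply; these would be sent to a
    # manual ontology generation tool.
    missing = required - ontology.keys()
    return ontology, missing

collection = [
    Component("CarAds", {"Price": r"\$\d", "Make": r"\b(?:Honda|Ford)\b"}),
    Component("Dates", {"Year": r"\b(?:19|20)\d{2}\b"}),
    Component("RealEstate", {"Cost": r"\$\d", "Bedrooms": r"\bbedrooms?\b"}),
]
mappings = {("RealEstate", "Cost"): "Price"}
ontology, missing = assemble(
    "2003 Honda Civic, $12,999", collection, mappings, {"Price", "Make", "Mileage"}
)
```

In this example the stored mapping folds the `Cost` concept of `RealEstate` into `Price`, while `Mileage` ends up in `missing` because no component in the collection recognizes it.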
|
|
David W. Embley (Computer Science, BYU)
Stephen W. Liddle (Business School, BYU)
Deryle W. Lonsdale (Linguistics, BYU)
Yuri A. Tijerino (Applied Media Informatics, Kwansei Gakuin University, Japan)
Troy Walker (Google)
Alan Wessman (CS graduate, BYU)
Li Xu (Computer Science, U. of Arizona, South)
|
|
| Last updated: Apr 10th, 2006 |