The objective of this research is to explore a new paradigm for effectively utilizing and evaluating, from the users' perspective rather than the creators', the diverse cultural resource data archived today at libraries, museums, research institutes, universities, and corporations. The research starts by applying the Topic Map paradigm to a relatively small data set. The data size is then gradually scaled up in order to examine the applicability of Topic Map technology to large-scale databases, where it is expected to be most effective. For the small data set, we use 227 pieces of graphic data from "Genji Monogatari (the Tale of Genji)" and related hypertext data in Japanese and English. Because topics can be extracted from graphic data, grouped, and associated with other graphic data and topics with ease, we expect that our experiment will serve as a model that users may follow when they later attempt to build Topic Maps on the same graphic data. We are still in the early phase of the research, and much remains to be done before we attain our ultimate goal. In this paper, we put forth our view of anticipated future developments in the research method as well as in the research itself, and we invite advice and criticism from readers. In the later phase of the research we plan to apply the Topic Map system to a large-scale database, evaluate its usefulness and effectiveness, and discuss issues found in the process. We hope that the knowledge and theory developed through this research will eventually help manage, analyze, and evaluate the massive and diverse cultural resource data that have accumulated in our society over hundreds of years, and also help bring the users' point of view to the forefront in the world of computing.
| Keywords: | Topic Maps, user-oriented, XML, "Genji Monogatari (the Tale of Genji)" |
Whereas every conventional paper book has an index, typically at the end of the book, electronic documents are rarely equipped with such aids for users. Worse, neither common rules nor protocols exist for preparing indexes for digital documents. Admittedly, with today's high-speed search engines, one can easily retrieve, "precisely" and in alphabetical order, digital documents containing the text he/she specifies, out of the billions placed on the Internet. Using portal sites like Yahoo, one can also obtain a long list of URLs related to the topic of his/her choice. The retrieved documents or search results, however, often do users more harm than good in practice. In some cases, it takes the user too much time to go through all the "hits". In others, the "hits" are simply irrelevant documents that happen to contain relevant keywords.
Today we input a large amount of data from paper documents into computers every day, with the lofty goal of accumulating and sharing our knowledge and wisdom. Unfortunately, the resulting database is often of little value to its users, because it is edited and organized from its creators' point of view only. As Steve Pepper, one of the proponents of the Topic Map paradigm, pointed out by saying "A book without an index is just like a country without a map", an electronic document without an index is useless when it comes to building a solid knowledge base. The Topic Map system was invented to solve this problem with users in mind and was subsequently formalized by the ISO as an international standard.
In this research we attempt to examine the usefulness and effectiveness of the Topic Map system by applying it to "Genji Monogatari (the Tale of Genji)", a masterpiece of classical Japanese literature, and evaluating the results. We also attempt to develop and evaluate a prototype of a Topic Map authoring tool.
We hope that, with the authoring tool, anyone who wants to build Topic Maps on his/her own will be able to do so without knowing the syntax or being a Topic Maps expert.
With the support of the Universite de Charles-de-Gaulle-Lille III, the French government, the Japanese government and private funds, the graphic database was developed in France from 1996 to 1998 by one of the authors of this paper. It can now be viewed on the UNESCO site (http://webworld.unesco.org/genji/).
The Tale of Genji was written by Murasaki-shikibu about 1000 years ago in the Heian era (794-1192). The 227 pictures we use in this research were scanned into the computer from "Ehon Genji Monogatari", an illustrated book for Genji Monogatari. The pictures were created as woodcuts in 1650 by Harumasa Yamamoto, one of the finest Japanese lacquer artists of the Edo era (1603-1867). The book was used for the processing, instead of the original woodcuts that are archived at the Library at Tsurumi University and the Office of Japanese Literature at Tokyo University, largely because the originals were in poor condition and their availability for observation was restricted. Another factor was that the book featured a five-to-six line caption for each picture, allowing readers to follow the story with ease.
The 227 pictures can be grouped under the following six themes.
Themes 1 (Waka), 2 (Music performance), 3 (Party), and 4 (Love) together total 123 pictures, roughly 54% of all the pictures. Many pictures, however, fall into multiple categories, so the percentage is not necessarily accurate.
Figures 1-4 below show selected scenes from some of the categories. The picture numbers in the figure captions give the volume number and the serial number of the picture in the original database.
Music was made one of the categories not just because there were many music related pictures but more importantly because music played such a significant role in the Heian culture.
Figure 1 is from the scene in Volume 2 well known as "Amayo no Shinasadame", in which the characters talk of the varieties of women. When Sama-no-kami visits his lover at her house, he plays the flute in her front yard to let her know of his arrival, and the lover answers by playing the harp in return.
Musical instruments appear in many scenes. They play a major role in dating, consolation, and dancing scenes, not to mention party and concert scenes. The musical instruments include wagoto (a Japanese harp), karagoto (a Chinese harp), flutes, drums, biwa (a four-stringed Japanese lute), and sho (a wind instrument composed of a mouthpiece and seventeen bamboo pipes of various lengths).
| figure 1: Picture 2-8 | figure 2: Picture 4-18 |
Figure 2 is a scene from Volume 4, where Hikaru Genji, the main character of the story, is about to leave the house of his lover, Rokujo-no-miyasudokoro, on a misty morning. The lover, who is older than Genji, watches him leave from her bedroom with swollen eyes.
Genji, captivated by her beauty, flirts passionately with a high-ranking female servant at the corner of the hall, out of sight of Rokujo-no-miyasudokoro. In the meantime, Genji sends a girl to the yard to pick bush clovers and morning glories as a gift for his lover. This perfectly depicts Genji, a legendary playboy.
Figure 3 is a scene of a fuji-no-hana (wisteria flower) viewing party at the house of Udaijin, a high-ranking government official. There are two pictures showing scenes of people enjoying fuji-no-hana, in addition to others with red leaves, cherry blossoms, and plum blossoms. This may be because wisteria flowers are purple, and the Japanese lacquer artist thought that wisteria flowers befit Murasaki-shikibu, the author of "Genji Monogatari (the Tale of Genji)." Note: Murasaki means purple in Japanese.
| figure 3: Picture 33-124 | figure 4: Picture 3-14 |
Figure 4 is a scene where Utsusemi and Nokiba-no-ogi play the game of go (a traditional Japanese board game). Three of the 227 pictures show go-related scenes. Two of the three are go playing scenes, while the other is a scene where Murasaki-no-ue stands on the go board and has Genji trim her hair. There are many go playing scenes in the story, including one where princesses play go betting their favorite trees.
As many as 35 pictures are devoted to scenes of play, including hunting, boating, playing with dolls, playing sugoroku (Japanese backgammon), and dancing, which makes this category the second largest of all.
As shown above, by illustrating various scenes of life, the graphic data provide invaluable information about people's lives in the Heian era that is hard to convey with text data only. What is interesting is that many of the pictures, including the go playing scene, the music playing scene, and the love scene, are presented as if someone else were peeping at the scenes. These pictures tell us much about what life without privacy was like. The world surrounded by folding screens, lattice windows, partitions, and bamboo blinds shown in these woodcuts certainly helps us understand the way of life and the architecture of those days.
The grouping of the pictures shown above is very much a matter of personal view. Although there may be some interpretations that many people share, people basically feel differently, have different impressions, and find different things when they look at the woodcuts Harumasa Yamamoto created. If so, the data should be reorganized in such a way that users can personalize the data based on their own views. The electronic documents should be made available as a database useful to most people. One way to do so is to use Topic Maps.
Topic maps are a method to organize information on the semantic and metadata level and are also a hyperlink network composed of the following three elements and three kinds of subject indicator.
Three elements: topic, association, and occurrence.
figure 5: Topic, Association and Occurrence
Topic maps enable users of information to organize the information from the users' point of view and can also be an effective measure against infoglut. Topic maps are a tool that can help the use of the computer evolve from data processing to information processing to knowledge processing. The syntax of Topic Maps was first based on HyTime. In 2001 the XML-based syntax was added, and now one can build Topic Maps using either the HyTime-based syntax (HyTM) or the XML-based syntax (XTM).
The Topic Map paradigm was formalized by the ISO/IEC and became ISO/IEC 13250 in 2000. Currently, related specifications are under discussion, which include Reference Model, Standard Application Model, TMQL (Topic Map Query Language), and TMCL (Topic Map Constraint Language).
Reference Model is a PMTM4 (Processing Model for XTM 1.0)-based reference model for Topic Maps, which can be used to define the relationships to other knowledge representations.
Standard Application Model is an Infoset Model-based formal data model for Topic Maps. Standard Application Model can be serialized into the HyTM or XTM format, which in turn can be de-serialized back into Standard Application Model.
TMQL is a query language for Topic Maps, while TMCL is a schema language for Topic Maps.
The specifications under development or discussion, their status and relationships are presented in the material prepared for the ISO/IEC JTC1 SC34 conference, which was held in Orlando, Florida, USA in December, 2001. (See "Topic maps, roadmap for further work" for further details.)
figure 6: Topic maps, roadmap for further work
Besides the ISO/IEC, OASIS (the Organization for the Advancement of Structured Information Standards) is active in the development of XTM 2.0 and in the standardization and promotion of published subjects. Many members are involved in activities at both organizations, and the resulting synergy is expected to facilitate the standardization of the related specifications.
In Japan, ISO/IEC 13250 and XTM 1.0 are soon to become part of JIS (Japanese Industrial Standards) and JIS TR respectively, and the JIS organization plans to discuss whether to adopt the standards currently being prepared by the ISO/IEC as they become available for discussion.
Non-profit organizations active in developing and promoting Topic Map-related standards and specifications are as follows.
Currently, signing up for mailing lists and visiting web sites are the best ways to learn more about the Topic Map system. We recommend the following mailing lists and websites as excellent sources of Topic Map-related information.
Mailing list:
Below is a partial list of businesses offering Topic Maps related products and services including software tools, education, and consulting.
In Japan, Synergy Incubate Inc. is the most active organization in promoting the Topic Map paradigm and its standardization as a JIS and also in participating in experimental projects with the Topic Map technology.
In this research we applied the Topic Map system to the graphic data from Genji Monogatari described in Chapter 2, in order to acquire know-how on Topic Map building and to evaluate the effectiveness of the Topic Map. In building the Topic Map, we followed the process developed by Steve Pepper, shown below, as guidance.
As the first step of the process, we defined the subject domain as everything covered in the illustrated book for Genji Monogatari.
We then analyzed the domain and extracted candidate topics, associations, and occurrences. The pictures contain many subjects, so various topic candidates can be extracted from various points of view. For example, we can define them as follows.
Topics:
| Type | Example instances |
| --- | --- |
| Event | peeping at Utsusemi, making love with Nokiba-no-ogi, meeting Yugao for the first time |
| Place | Kii-no-kami's residence, Daini-no-menoto's next door, Genji's residence, Rokujo-no-fujin's residence |
| Person − women | Utsusemi, Nokiba-no-ogi, Yugao |
| Person − men | Genji, Kokimi, jusha (attendant), Koremitsu |
| Entertainment − play | go (traditional Japanese board game) |
| Entertainment − nature | jugoya-no-tsuki (August full moon), kuretake (bamboo), yugao (moonflower), asagao (morning glory), koyo (red leaves), hagi (bush clover) |
| Entertainment − party | |
| Entertainment − tanka (traditional Japanese poem) | |

Associations:
| Type | Example instances |
| --- | --- |
| Superclass − Subclass relationship | person (superclass) − woman (subclass) |
| | person (superclass) − man (subclass) |
| | entertainment (superclass) − play (subclass) |
| | entertainment (superclass) − nature (subclass) |
| | entertainment (superclass) − party (subclass) |
| | entertainment (superclass) − tanka (subclass) |
| Class − Instance relationship | place (class) − Kii-no-kami's residence (instance) |
| | place (class) − Daini-no-menoto's next door (instance) |
| | event (class) − peeping at Utsusemi (instance) |
| | event (class) − meeting Yugao for the first time (instance) |
| Event at Place with Person relationship | peeping at Utsusemi (event) − Kii-no-kami's residence (place) − Genji (person), Utsusemi (person), Nokiba-no-ogi (person) |
| | meeting Yugao for the first time (event) − Daini-no-menoto's next door (place) − Genji (person), Yugao (person) |
| Person Play Game with Person relationship | Utsusemi (person) − play the game of go with − Nokiba-no-ogi (person) |

Occurrences:
| Image files |
| Japanese text files |
| English text files |
Figure 7 is an example of a schematic view of the ontology.
figure 7: Example of ontology
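As a rough illustration of the ontology described above, the following Python sketch models a handful of the listed topics, associations, and occurrences in memory. It is not the representation used in this research: the class names, topic identifiers, helper functions, and the image file name are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Topic:
    topic_id: str                       # hypothetical identifier
    name: str
    instance_of: Optional[str] = None   # Class-Instance relationship
    subclass_of: Optional[str] = None   # Superclass-Subclass relationship
    # Occurrences as (kind, resource) pairs; resource names are hypothetical.
    occurrences: List[Tuple[str, str]] = field(default_factory=list)


@dataclass
class Association:
    assoc_type: str
    members: List[Tuple[str, str]]      # (role, topic_id) pairs


topics: Dict[str, Topic] = {
    t.topic_id: t
    for t in [
        Topic("person", "Person"),
        Topic("woman", "Woman", subclass_of="person"),
        Topic("man", "Man", subclass_of="person"),
        Topic("place", "Place"),
        Topic("event", "Event"),
        Topic("genji", "Genji", instance_of="man"),
        Topic("utsusemi", "Utsusemi", instance_of="woman",
              occurrences=[("image", "images/picture-3-14.jpg")]),
        Topic("nokiba-no-ogi", "Nokiba-no-ogi", instance_of="woman"),
        Topic("kii-residence", "Kii-no-kami's residence", instance_of="place"),
        Topic("peeping", "peeping at Utsusemi", instance_of="event"),
    ]
}

associations: List[Association] = [
    Association("event-at-place-with-person", [
        ("event", "peeping"), ("place", "kii-residence"),
        ("person", "genji"), ("person", "utsusemi"),
        ("person", "nokiba-no-ogi"),
    ]),
    Association("person-play-game-with-person", [
        ("person", "utsusemi"), ("person", "nokiba-no-ogi"),
    ]),
]


def is_a(topic_id: str, class_id: str) -> bool:
    """True if the topic is an instance of class_id or of one of its subclasses."""
    current = topics[topic_id].instance_of
    while current is not None:
        if current == class_id:
            return True
        current = topics[current].subclass_of
    return False


def instances_of(class_id: str) -> List[str]:
    """Names of all instances of a class, following subclass links."""
    return sorted(t.name for t in topics.values() if is_a(t.topic_id, class_id))


def related_to(topic_id: str) -> List[str]:
    """Names of topics that share at least one association with topic_id."""
    found = set()
    for assoc in associations:
        ids = [tid for _, tid in assoc.members]
        if topic_id in ids:
            found.update(i for i in ids if i != topic_id)
    return sorted(topics[i].name for i in found)
```

With this structure, `instances_of("person")` collects the instances of both subclasses, and `related_to("genji")` navigates from a person to the events, places, and people associated with him, which is the kind of user-driven navigation a Topic Map is meant to support.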
We used an ordinary text editor to build the Topic Map and Ontopia's Omnigator to display it. Omnigator was selected because it was readily available for free and also supported the Japanese language.
Initially, we built the Topic Map manually using the text editor and made improvements by switching back and forth between Omnigator and the editor. In the process of developing the Topic Map, we also reviewed the ontology from time to time and refined it as necessary.
Our initial approach to Topic Map building was just-do-it, using samples as a model and the published standard as a reference. In the meantime we actively participated in tutorial sessions to learn more about Topic Maps and gain skills in Topic Map building. We received much valuable advice from Ontopia.
In addition to how to construct the ontology, major problems we found while developing the Topic Map were as follows.
We worked on developing the prototype of the authoring tool that would solve the above problems. We set our goals as follows.
At the time of this writing, we can input types and instances of the major Topic Map elements (topic, association, and occurrence) through an interface generated using JSP and XSLT. We can also create XTM syntax-based Topic Maps from the input data, which are viewable with Ontopia's Topic Map navigation tool, Omnigator. The structure of the tool is as follows.
figure 8: Structure of system
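To make the step from input data to XTM output more concrete, here is a minimal sketch of how topics and associations could be serialized into XTM 1.0 syntax. It uses Python's standard `xml.etree.ElementTree` rather than the JSP and XSLT of the prototype, so it should be read as an illustration of the output format only. The helper names and topic identifiers are assumptions, and the occurrence simply points at the UNESCO site mentioned earlier.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by XTM 1.0 and XLink.
XTM_NS = "http://www.topicmaps.org/xtm/1.0/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", XTM_NS)
ET.register_namespace("xlink", XLINK_NS)

HREF = "{%s}href" % XLINK_NS


def xtm(tag):
    """Qualify an element name with the XTM namespace."""
    return "{%s}%s" % (XTM_NS, tag)


def add_topic(topic_map, topic_id, name, type_id=None, occurrence_url=None):
    """Append a <topic> with an optional type and an optional occurrence."""
    topic = ET.SubElement(topic_map, xtm("topic"), {"id": topic_id})
    if type_id is not None:
        inst = ET.SubElement(topic, xtm("instanceOf"))
        ET.SubElement(inst, xtm("topicRef"), {HREF: "#" + type_id})
    base = ET.SubElement(topic, xtm("baseName"))
    ET.SubElement(base, xtm("baseNameString")).text = name
    if occurrence_url is not None:
        occ = ET.SubElement(topic, xtm("occurrence"))
        ET.SubElement(occ, xtm("resourceRef"), {HREF: occurrence_url})
    return topic


def add_association(topic_map, type_id, members):
    """Append an <association>; members is a list of (role_id, player_id)."""
    assoc = ET.SubElement(topic_map, xtm("association"))
    inst = ET.SubElement(assoc, xtm("instanceOf"))
    ET.SubElement(inst, xtm("topicRef"), {HREF: "#" + type_id})
    for role_id, player_id in members:
        member = ET.SubElement(assoc, xtm("member"))
        role = ET.SubElement(member, xtm("roleSpec"))
        ET.SubElement(role, xtm("topicRef"), {HREF: "#" + role_id})
        ET.SubElement(member, xtm("topicRef"), {HREF: "#" + player_id})
    return assoc


def build_sample_topic_map():
    """Build a tiny topic map; all identifiers here are hypothetical."""
    topic_map = ET.Element(xtm("topicMap"))
    add_topic(topic_map, "person", "Person")
    add_topic(topic_map, "plays-go-with", "Person Play Game with Person")
    add_topic(topic_map, "genji", "Genji", type_id="person",
              occurrence_url="http://webworld.unesco.org/genji/")
    add_topic(topic_map, "utsusemi", "Utsusemi", type_id="person")
    add_topic(topic_map, "nokiba-no-ogi", "Nokiba-no-ogi", type_id="person")
    add_association(topic_map, "plays-go-with",
                    [("person", "utsusemi"), ("person", "nokiba-no-ogi")])
    return topic_map


if __name__ == "__main__":
    print(ET.tostring(build_sample_topic_map(), encoding="unicode"))
```

The printed document can be saved with an `.xtm` extension and should be loadable by tools that read XTM 1.0, although this sketch has not been validated against the XTM 1.0 DTD.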
In designing the authoring tool, we considered the following features.
First, in order to understand the XTM data model, we designed the tool by simply creating the input fields that the XTM structure requires. In the next phase, we plan to discuss how to enable someone with little knowledge of the Topic Map syntax to build Topic Maps with as little input work as possible.
figure 9: Sample screen of the authoring tool
The authoring tool we have developed in this research is admittedly very primitive, and at this stage we have not evaluated it quantitatively. Even so, we have noticed a number of benefits, based on our observations. The benefits are as follows.
We have also identified issues to be addressed.
The Topic Map system allows users to create data maps outside and independent of data resources and to navigate freely between various points of view. Our impression is that Topic Map technology has significant potential. As for cultural resource data, Topic Maps are expected to benefit users in many ways, including:
On the other hand, we see the need for further development and improvement of the Topic Map system and Topic Map tools. Indeed, precisely because of its high potential, a pile of issues exists. We call them "the ten issues" and discuss them, together with possible solutions, in the following section.
If topics had attributes and methods, just as objects do in object-oriented technology, they would have more power to represent themselves. For example, if such attributes as "size", "color", "singing voice", "habitat" and "diet" are attached to a topic "bird", the information that the topic carries becomes richer.
Currently, only "SuperClass-SubClass" and "Class-Instance" are available as common association types. It would be beneficial to users to add more association types, such as "Whole-Part", that represent common relationships between topics.
Man ------------------------- Woman (Class)
Husband-Wife
Parent-Child
Other Types
Genji ----------------------- Kiritsubo-no-koi (Instance)
Parent-Child
To illustrate, consider the association types used for the relationships between "Man" and "Woman". Currently, it is not clear whether the association types defined for these "Class" topics (Husband-Wife, Parent-Child, and Other Types in the above case) are all automatically inherited by the "Instance" topics, even though the only correct association type for the relationship between Genji and Kiritsubo-no-koi is Parent-Child. A mechanism that enables users to actively control what to inherit and what not to inherit is needed.
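One way such a control mechanism might look is sketched below in Python. This is purely illustrative and is not part of the Topic Maps standard or of our prototype: the table names and the `association_types` function are hypothetical, and the default of inheriting every class-level type when the user has made no selection is an assumption of the sketch.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]

# Association types defined between the "Class" topics.
CLASS_ASSOCIATION_TYPES: Dict[Pair, Set[str]] = {
    ("Man", "Woman"): {"Husband-Wife", "Parent-Child", "Other Types"},
}

# Class-Instance relationships for the example.
INSTANCE_OF: Dict[str, str] = {
    "Genji": "Man",
    "Kiritsubo-no-koi": "Woman",
    "Utsusemi": "Woman",
}

# Explicit user selections: which class-level association types a given
# pair of instances is allowed to inherit.
INHERIT_ONLY: Dict[Pair, Set[str]] = {
    ("Genji", "Kiritsubo-no-koi"): {"Parent-Child"},
}


def association_types(a: str, b: str) -> Set[str]:
    """Association types available between two instance topics.

    Types are looked up on the classes of the two instances. If the user
    has made a selection for this pair, only the selected types are
    inherited; otherwise every class-level type is inherited (assumed default).
    """
    inherited = CLASS_ASSOCIATION_TYPES.get(
        (INSTANCE_OF[a], INSTANCE_OF[b]), set()
    )
    selected = INHERIT_ONLY.get((a, b))
    if selected is None:
        return set(inherited)
    return inherited & selected
```

Here `association_types("Genji", "Kiritsubo-no-koi")` yields only Parent-Child, while a pair with no explicit selection still sees every type defined on the classes. Whether inheritance should be opt-in or opt-out by default is a design question this sketch leaves open.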
One of the major new trends that have been brought by growing networks is personalization. Today one can have a glimpse of the trend toward personalization in many areas, particularly in business. One-to-one marketing and on-demand publishing are just a few such examples.
Meanwhile, as information technology grows more advanced and sophisticated, knowledge sharing over the network has become one of the hot topics in our society, attracting much public interest and discussion. On-line libraries and on-line museums, the classic cases of knowledge sharing over the network, are expected to be available soon.
In our opinion, however, those systems that archive and manage massive information resources for the public should be the first to apply the concept of personalization. Supplying information tailored to individual needs and personalizing databases from the users' point of view are the keys to the effective use of information.
In addition to the traditional database building with administrators' logic, the mapping of electronic documents from users' perspective will be critical in the future. The goal should be the restructuring and reorganization of information resources on the semantic and metadata level, with the use of new paradigms for information resource management such as Topic Maps.
This restructuring, however, calls for published subjects to be developed for each individual domain of knowledge and then evaluated for validity. Expertise in knowledge information, information technology and artificial intelligence is necessary for successful development. Involvement of experts in each domain of knowledge and study of information user behavior are also indispensable.
The next step will be to explore ways to restructure information resources on the semantic and metadata level. The ultimate goal will be to complete the standardization of topics for each domain of knowledge and make the restructured information resources available to users as semantic networks. This research is a small step toward the final destination.