What is a Personal Information Organizer (PIM)?
Soon after trying to use the Web, every user encounters the same problem -- finding a way
of archiving the locations of useful resources for future reference. This
was recognized early in the design of Web software, so that all browsers contain some
mechanism for storing lists of interesting resources. Such lists are known as hotlists,
or bookmarks. Originally such lists were simple flat files. More recently,
browsers have supported hierarchical lists, such that items on similar topics can be
grouped into a folder, a subfolder under a folder, and so on. This allows for increased
flexibility in the storage of bookmarked entries.
The names or titles chosen for each of the entries are, by default, taken from
the content of the TITLE element in the HTML document being archived. Thus to a
large extent the cataloging of the entries is determined by the author of the document,
and not by the archiver of the bookmark. Users can modify the titles associated with their
bookmarks, but my observations show that this is only rarely done.
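The default naming rule described above can be sketched in a few lines of Python. This is only an illustration, not the code any particular browser uses: the names `TitleParser` and `default_bookmark_name` are hypothetical, and the fallback to the URL when no TITLE exists is assumed to mirror the usual browser behavior.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Hypothetical helper: collects the text of the first TITLE element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case, so <TITLE> arrives as "title".
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def default_bookmark_name(url, html_text=""):
    """Return the TITLE text, or the URL itself when no usable TITLE exists."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    # Collapse runs of whitespace so the name fits on one line.
    title = " ".join("".join(parser.parts).split())
    return title or url
```

A user-supplied name would simply override the value returned here; the sketch covers only the default.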
Folder titles must be selected by the user: in general, users choose folder titles that
associate well with the folder content.
These methods of archiving work well, provided the lists do not grow too large or too
stale in the user's mind. When the lists get very large (greater than 50 or so items),
traditional retrieval problems start to occur -- the user knows that a URL was recorded,
but cannot find it. In addition, the user will often add a bookmark for a resource that
already exists in the bookmark collection, having forgotten where the original entry lay.
Finally, the user may enter two bookmarks for the same collection, but referencing
slightly different locations (e.g., one referencing the Table of Contents, the other the
Introduction). To summarize, the possible problems are:
- The user cannot remember the TITLE of the desired resource.
- Some archived objects, such as images, FTP, or mail URLs, do not have a TITLE.
The default is to use the URL, which is not terribly informative.
- The user cannot remember under which folder the item was stored.
- The user thinks the item was stored under one folder, but in fact it is in another.
- The user has entered duplicate bookmark entries for the same resource, having
forgotten about (or been unable to find) the earlier entry.
- The user has entered similar bookmark entries for the same resource -- for example,
entries pointing to the Table of Contents, or Introduction, of the same collection.
- The link is no longer functional, because the original document has been deleted or
moved.
- The link is no longer relevant, as the target resource has changed, and is no longer
related to the original archived resource.
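Two of the problems listed above, exact duplicates and near-duplicates within the same collection, lend themselves to a simple automatic check. The following is a minimal sketch under stated assumptions: the function names are hypothetical, "duplicate" is taken to mean "same URL after ignoring letter case in the host and any fragment", and "same collection" is approximated as "same host and same directory". A real tool would likely need finer rules.

```python
import posixpath
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Key for exact duplicates: lower-case scheme and host, drop the fragment."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def collection_key(url):
    """Key for similar entries: same host and same directory (an assumption)."""
    parts = urlsplit(url)
    return (parts.netloc.lower(), posixpath.dirname(parts.path or "/"))


def find_groups(urls, key):
    """Group URLs by key(url) and return only groups with more than one member."""
    groups = defaultdict(list)
    for url in urls:
        groups[key(url)].append(url)
    return [group for group in groups.values() if len(group) > 1]
```

Calling `find_groups(urls, normalize)` reports exact duplicates, while `find_groups(urls, collection_key)` reports entries such as a Table of Contents and an Introduction stored side by side. Broken or stale links are not covered here, since detecting them requires fetching the resource.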
Finally, there is a semantic problem associated with the very idea of a hierarchical
bookmark list. Many entries do not belong in a single place in the hierarchy, but rather
in multiple locations. Thus it would be nice to find another way of storing bookmarks that provides
a better organizational model, along with a better interface for browsing or searching the
bookmarks collection.
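One possible alternative to a strict hierarchy is to let each entry carry several categories and to look entries up by any combination of them. The sketch below is an assumption about how such a store might look, not a description of an existing product; `BookmarkStore` and its methods are hypothetical names.

```python
from collections import defaultdict


class BookmarkStore:
    """Hypothetical store in which one bookmark may sit under many categories."""

    def __init__(self):
        self.titles = {}                     # url -> title
        self.by_category = defaultdict(set)  # category -> set of urls

    def add(self, url, title, categories):
        self.titles[url] = title
        for category in categories:
            # Category names are compared case-insensitively.
            self.by_category[category.lower()].add(url)

    def find(self, *categories):
        """Return the URLs filed under all of the given categories, sorted."""
        sets = [self.by_category.get(c.lower(), set()) for c in categories]
        if not sets:
            return sorted(self.titles)
        return sorted(set.intersection(*sets))
```

With this layout an entry filed under both "HTML" and "Tutorials" is found from either starting point, which a single-parent folder tree cannot offer without duplicating the entry.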
The above information is straightforwardly added to a bookmark database -- the hard
part is the semantic structuring of the information. This must be done in a way that
reflects the meaning the user associates with the bookmarked entry, while the interface
by which entries are encoded into this index must be simple, as otherwise it will not be
used.
What information is there to work with? We really have two things:
- The document text content
- User selection of some parameters
There are several ways these can be processed.
- The Document Content -- Intelligent Software
  - Determines the structural type of the document (resource list,
    text-based material, mostly graphics, FORM interface to a tool). This could be based on text
    content, as well as information in the document head (LINK and META
    elements).
  - Determines and extracts document keywords -- the software could look through the
    document and locate important keywords, and use these to index the content.
  - Correlates the text with predefined categories or keywords -- the index may have
    predefined categories and/or keywords for indexing purposes, and the software could test
    the document against these, and choose appropriate categories.
- User Selection of Parameters
  - User selects arbitrary keywords and categories -- not very good, as the user is
    unlikely to do it, and the results are not well organized.
  - User selects keywords and categories from a predetermined list -- easier to do, but
    the user must also be able to add categories when necessary.
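The keyword extraction and category correlation steps listed above might be approximated as follows. This is a deliberately naive sketch, assuming plain word-frequency counting stands in for "intelligent software"; the stop-word list, the `CATEGORIES` table, and both function names are made up for illustration.

```python
import re
from collections import Counter

# Illustrative stop words only; a real list would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with"}

# Hypothetical predefined categories, each with a small vocabulary.
CATEGORIES = {
    "Programming": {"code", "compiler", "python", "software"},
    "Cooking": {"recipe", "oven", "flour"},
}


def extract_keywords(text, limit=5):
    """Return up to `limit` of the most frequent non-trivial words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(limit)]


def suggest_categories(text, categories=CATEGORIES):
    """Return categories whose vocabulary overlaps the text, best match first."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    scores = {name: len(words & vocab) for name, vocab in categories.items()}
    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [name for name, score in ranked if score > 0]
```

The output of `suggest_categories` could serve as the predetermined list offered to the user, who would then confirm a suggestion or add a new category when none fits.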
TextIndexer helps to store, categorize, and sort web pages.
It also helps you store search words and search results.
- Links to Internet addresses within articles, including e-mail addresses and local files.
- Provides hot links to any web site or even to local files.
- Browses the WWW using the default or a user-defined browser.
- Stores sequences of Internet search words.