Summary
The manipulation of large quantities of information, once a task for specialists, is now required of people in all areas, not only in their professional lives but in most everyday activities. Quality in information retrieval has become crucial for many professions, and better systems can have a large impact, especially for collections of heterogeneous documents.
Web search has been an excellent testing ground for large-scale information retrieval and massive automatic indexing. State-of-the-art search engines perform well in the retrieval of text documents.
The notion of a document is changing, however, as more complex items are manipulated. Multimedia documents are commonly created by ordinary applications, and audiovisual information needs to be managed both at large production sites and in home archives.
Current retrieval systems rely on the query words to obtain relevant documents. Because document collections are indexed on their textual content, documents with no text become invisible. A better system requires indexing on the visual and audio content and a more elaborate extraction of document meaning. The results of a query on "university sports activities" should include both a video recording of a university tournament and a research paper on the impact of sports on student success. The retrieval system must be able to process documents and create descriptions that relate these items, using words, underlying ontologies and the analysis of audio and visual items.
To go beyond text retrieval, the analysis of document components such as audio or image segments is required. The Metamedia prototype (FEUP) currently incorporates the extraction of audiovisual features and the automatic association of the corresponding descriptors to the document. Further work is required on audiovisual extraction, in order to create more expressive descriptors. Ontologies will be used at this point: both the inclusion of audiovisual descriptors in ontologies and their combination with domain ontologies will be explored.
A second component is dialog management. The retrieval task can be significantly improved by gathering information from the user interaction and analyzing the dialog to extract user intentions and plans. Retrieval is an intrinsically imprecise task, and therefore this line of work has to be complemented by appropriate evaluation procedures and tools.
A third line of research is the refinement of the database model to encompass the association of metadata with objects at different levels, compliance with audiovisual standards and the use of heterogeneous descriptors in the computation of similarity measures for retrieval. Audiovisual descriptors are commonly multi-dimensional and quantitative; the similarity measures required in retrieval leave ample room for new approaches.
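As an illustration of how heterogeneous descriptors might enter a single similarity measure, the sketch below combines a quantitative, multi-dimensional descriptor with a textual one. It is not the Metamedia model: the descriptor names (`color_histogram`, `keywords`), the choice of cosine and Jaccard measures, and the weighting scheme are all assumptions made for the example. Descriptors missing from either document are skipped and the weights renormalized, so that a document with no text can still be compared on its remaining descriptors.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors.

    For non-negative vectors (e.g. histograms) the result lies in [0, 1].
    Returns 0.0 if either vector is all zeros.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Jaccard similarity of two keyword collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical descriptor names, each paired with an assumed measure.
MEASURES = {
    "color_histogram": cosine_similarity,
    "keywords": jaccard_similarity,
}


def combined_similarity(doc_a, doc_b, weights):
    """Weighted average of per-descriptor similarities.

    Only descriptors present in both documents contribute; the weights of
    the contributing descriptors are renormalized to sum to 1.
    """
    total = 0.0
    weight_sum = 0.0
    for name, measure in MEASURES.items():
        if name in doc_a and name in doc_b:
            w = weights.get(name, 0.0)
            total += w * measure(doc_a[name], doc_b[name])
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0


# Illustrative data only: a video with visual and keyword descriptors,
# and a paper with keywords but no visual descriptor.
video = {
    "color_histogram": [0.5, 0.3, 0.2],
    "keywords": {"university", "tournament"},
}
paper = {"keywords": {"university", "sports", "students"}}
weights = {"color_histogram": 0.5, "keywords": 0.5}

print(combined_similarity(video, paper, weights))
```

In this example the video and the paper share only the keyword descriptor, so the score reduces to their keyword overlap (one shared term out of four distinct terms, 0.25). A weighted average is only one possible design; how to normalize and combine measures over descriptors of different kinds is precisely the open question raised above.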