Summary
Organizations increasingly rely on databases as the main component of their record-keeping systems. However, as the amount and detail of the information held in such systems grows, so does the concern that within a few years most of it may be lost, once the current hardware, operating systems, database management systems (DBMS) and applications become obsolete and render the data repositories unreadable. The paperless office increases the risk of losing significant chunks of organizational memory, and thus of harming cultural heritage.
Significant research has already been conducted on this problem. Its conclusions discard approaches now considered naive, such as preserving specimens of the machines, system software and applications, in all their main versions, so that the backups of every significant system can be used whenever needed. A variant of this approach suggests simulating the older hardware on newer machines instead of preserving the hardware itself. More promising research suggests converting database contents into an open, neutral format with a significant amount of associated semantics (XML dialects), so that the data becomes independent of the details of the actual DBMS.
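To make the idea of a DBMS-independent export concrete, the following is a minimal sketch that dumps one table, together with its column declarations, into a self-describing XML document. It is an illustration under stated assumptions, not the format proposed by the research mentioned above: SQLite stands in for the source DBMS, and the function name `export_table_to_xml`, the element names (`table`, `schema`, `column`, `rows`, `row`, `value`) and the sample `employee` table are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_table_to_xml(conn, table):
    """Dump one table and its column declarations to an XML element.

    Hypothetical layout, not a standard dialect. The table name is
    assumed to come from a trusted source (it is interpolated into SQL).
    """
    cur = conn.cursor()
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = cur.execute(f'PRAGMA table_info("{table}")').fetchall()

    root = ET.Element("table", name=table)

    # Keep the schema next to the data so the file is self-describing.
    schema = ET.SubElement(root, "schema")
    for _cid, name, decl_type, notnull, _default, pk in columns:
        ET.SubElement(
            schema,
            "column",
            name=name,
            type=decl_type,
            nullable=str(not notnull).lower(),
            primaryKey=str(bool(pk)).lower(),
        )

    rows = ET.SubElement(root, "rows")
    names = [col[1] for col in columns]
    for record in cur.execute(f'SELECT * FROM "{table}"'):
        row = ET.SubElement(rows, "row")
        for name, value in zip(names, record):
            cell = ET.SubElement(row, "value", column=name)
            if value is None:
                # Mark NULL explicitly instead of emitting an empty string.
                cell.set("null", "true")
            else:
                cell.text = str(value)
    return root


def demo():
    """Build a tiny in-memory database and return its XML export as text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, left_on TEXT)"
    )
    conn.execute("INSERT INTO employee VALUES (1, 'Ana', NULL)")
    conn.execute("INSERT INTO employee VALUES (2, 'Rui', '2004-06-30')")
    root = export_table_to_xml(conn, "employee")
    conn.close()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(demo())
```

Storing the declared types and the explicit NULL marker alongside the values is what lets a future reader interpret the file without access to the original DBMS. A real preservation format would also need to carry keys, constraints and provenance metadata.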
This project stems from this principle but tries to go a step further, based on the following observation: there is a parallel between the attitude of a data warehouse designer who approaches a database-centered operational information system (IS) to specify a data warehouse (DW) and that of an archivist who analyses a document-centered organizational IS to specify an archiving policy and system. Both seek an integrated model of the organization, merging information from a diversity of sources, systems and technologies. Both follow a process-centric methodology, specifying data marts or classifying related series of documents. Both have long-term validity and integrity requirements. Both adopt an evaluation attitude, leaving out irrelevant details in the data or in the series of documents to concentrate on the essential. Both want to build an archive that remains basically unchanged, except for the addition of newer data or documents, and both want to expose the respective information contents in a simple and systematic way. Of course there are differences, first of all in the respective goals. The DW designer usually tries to answer the information needs of the organization's management from the point of view of decision support, monitoring, trend analysis and forecasting, while the archivist wants to preserve the memory of the organization and its processes for future generations. Concrete decisions on evaluation and elimination procedures may therefore differ according to the specific requirements, but the general working framework seems similar.
Following this basic intuition, the goal of the project is to explore the adequacy of the DW approach as a vehicle for performing, with respect to a given IS, the functions considered essential from an archival viewpoint, such as appraisal, classification, elimination, description and access, while respecting properties such as authenticity and integrity.