PANDORA Digital Archiving System (PANDAS)
The PANDORA Digital Archiving System, known as PANDAS, was developed by the National Library following an unsuccessful attempt to find an off-the-shelf system (or systems) that would provide integrated, web-based management of web archiving.
The need for such a system became evident as the scale of the Library's archiving activity increased. It was required if the best possible efficiencies were to be achieved in building a collaborative, selective and quality-assessed web archive. It was also necessary to enable other PANDORA participants to contribute to the Archive from various geographic locations.
PANDAS was first implemented in June 2001. A much enhanced version, PANDAS 2, was released in August 2002. PANDAS version 3, a completely reengineered and enhanced version of the software, was deployed on 27 June 2007. This remains the current production system, although there have been many incremental enhancements since it was first deployed.
Workflows
PANDAS was designed to support the workflows defined by the staff of the National Library's Web Archiving Section, which have also been adopted by the other PANDORA participants. These workflows include:
- identifying, selecting and registering candidate titles;
- seeking and recording permission to archive;
- setting harvest regimes;
- gathering (harvesting) files;
- undertaking quality assurance checking;
- initiating archiving processes; and
- organising access, display and discovery routes to, and metadata for, the archived resources.
Functions
PANDAS supports these workflows by means of the following functions:
- the management of administrative metadata about titles that have been either selected for archiving, rejected, or are being monitored pending a decision;
- the management of access restrictions;
- the scheduling and initiation of the harvesting of titles selected for archiving;
- the management of the quality checking and assurance process and associated problem fixing;
- the preparation and organisation of archived instances for public display through title entry pages, and title and subject listings; and
- the provision of defined management reports.
Manuals
For further information on how PANDAS supports these functions, refer to the PANDAS Manual.
Persistent identifiers
PANDAS assigns a system-generated running number to each title when it is registered. This number becomes part of the persistent URL for each archived site's title entry page, which is the access point for PANDORA content in the Australian Web Archive. These persistent URLs are in the form: http://nla.gov.au/nla.arc-12345.
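The URL form above can be illustrated with a minimal Python sketch. This is not PANDAS code: the function names build_persistent_url and parse_title_number are hypothetical, and the sketch assumes that the running number is a non-negative integer appended directly after "nla.arc-", as in the example URL.

```python
import re

# Prefix taken from the example persistent URL in the text.
PERSISTENT_URL_PREFIX = "http://nla.gov.au/nla.arc-"

# Assumption: the title number consists of decimal digits only.
_PERSISTENT_URL_RE = re.compile(r"http://nla\.gov\.au/nla\.arc-(\d+)")


def build_persistent_url(title_number: int) -> str:
    """Return the persistent URL for a registered title number (hypothetical helper)."""
    if title_number < 0:
        raise ValueError("title number must be non-negative")
    return f"{PERSISTENT_URL_PREFIX}{title_number}"


def parse_title_number(url: str) -> int:
    """Extract the title number from a persistent URL (hypothetical helper)."""
    match = _PERSISTENT_URL_RE.fullmatch(url.strip())
    if match is None:
        raise ValueError(f"not a recognised persistent URL: {url!r}")
    return int(match.group(1))


if __name__ == "__main__":
    url = build_persistent_url(12345)
    print(url)
    print(parse_title_number(url))
```

Keeping the prefix in one constant means the construction and parsing steps cannot drift apart if the URL form is ever adjusted.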