Archival Tools & Systems

♦ e-RECORDS DIGITĀLAYA (ई-रिकार्ड डिजिटालय): Electronic Records Management and Archival System
C-DAC has developed an Electronic Records Management and Archival System named e-RECORDS DIGITĀLAYA for the preservation of word processing documents, PostScript documents, spreadsheets, e-mails, images, scanned documents, presentations, text and XML documents in a wide variety of file formats. e-RECORDS DIGITĀLAYA provides a searchable database of record retention schedules and archival strategies based on file format and preservation duration. It demonstrates a model implementation of the e-GOV-PID Preservation Metadata Standard notified by MeitY. In this system, electronic records management includes processes such as identification, archival, migration, disposal and transfer of eligible e-records to a trusted digital repository.

e-RECORDS DIGITĀLAYA allows record depositors to manage the access rights for e-records and to encrypt private records for confidentiality. The integrity and authenticity of e-records can be ascertained using the digital evidence maintained by the archival system. Authorized users can search, retrieve, view and download electronic records as per their access rights. The system complies with the requirements specified in ISO standards related to records management, archival and digital repository audit.

Main features are briefly outlined as under-
  • Client-server architecture with user types like e-Record Depositor, Record Officer, Archivist, Director, Archive Administrator
  • Login name, password and biometric authentication for information security
  • Built-in searchable database of record retention schedules for government organizations
  • Archival strategies based on file formats and preservation duration, migration into open and standards-based file formats, record of migration history
  • Deposit of individual, bulk records and batches of digitized records
  • Model implementation of e-GOV-PID preservation metadata standard notified by MCIT
  • ERM procedures like identification, archival, migration, disposal and transfer to trusted digital repository
  • Archival fonds and storage management
  • Access control as per the access rights definitions for each e-record, encryption of private records for confidentiality
  • Access module includes search and retrieval as per the access rights, provision of extracts of e-Records with secure sharing of documents
  • Preservation of digital provenance, integrity and evidence of e-records as per the IT Act
  • Complies with the requirements specified in ISO standards on records management, archival and digital repository audit.
  • Dashboards, graphical reports, audit logs
The DIGITĀLAYA framework has been adopted and customized to meet the preservation requirements of the following organizations, where pilot digital preservation initiatives have been started.
  • Stamps & Registration Department, Andhra Pradesh
  • National Archives of India, New Delhi
  • Indira Gandhi National Centre for Arts, New Delhi
♦ SANSKRITI DIGITĀLAYA (संस्कृति डिजिटालय): e-Library and Archival System
SANSKRITI DIGITĀLAYA (संस्कृति डिजिटालय): e-Library and Archival System is a dedicated archival solution for cultural data preservation. It supports audio, video, image and document file formats. The metadata support includes the MARC21 and Dublin Core metadata standards. SANSKRITI DIGITĀLAYA has been developed with support for audio/video streaming and object-based storage.
Main features are briefly outlined as under-
  • Multi-institutional/organization data management system
  • Supports MARC21 and Dublin Core metadata cataloging standards
  • Collaborative framework for metadata enrichment
  • Data tracking module to track complete cycle of the data
  • Templates (SIP, AIP, DIP) to define preservation policies
  • Generates searchable PDF and EPUB
  • Supports Object Based Storage (OBS)
  • Virus scanning module scans all incoming SIPs
  • Supports streaming for audio/video content to enable efficient public access
  • Audio/Video timecode marking
  • Configurable scheduler that periodically verifies the integrity of AIPs
  • Comprehensive ingest process
    • Extraction of technical metadata
    • Generate cataloging metadata
    • Generate dissemination content (jpeg for images, mp3 for audio, mp4 for video)
    • Generate fixity
  • Ingest-failed and validation-failed registers to track failed records
  • Workflow to reprocess failed records
  • Archival Register to find all archived records for preview (content and metadata), publishing (to portal) and withdrawal (from portal)
  • Record distribution module that distributes records among users to balance the workload
  • User wise dashboards
  • Record and storage statistics
  • Notifications
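
The fixity generation and scheduled AIP integrity check listed above can be illustrated with a minimal sketch. It assumes SHA-256 as the checksum algorithm and treats an AIP as a plain directory; the function names (`file_digest`, `generate_fixity`, `verify_fixity`) are hypothetical and are not taken from SANSKRITI DIGITĀLAYA.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def generate_fixity(package_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its digest (the fixity record)."""
    return {
        path.relative_to(package_dir).as_posix(): file_digest(path)
        for path in sorted(package_dir.rglob("*"))
        if path.is_file()
    }


def verify_fixity(package_dir: Path, recorded: dict[str, str]) -> list[str]:
    """Return relative paths that are missing, altered or unexpected."""
    current = generate_fixity(package_dir)
    return sorted(
        name
        for name in recorded.keys() | current.keys()
        if recorded.get(name) != current.get(name)
    )
```

A scheduler would call `verify_fixity` periodically with the fixity record stored at ingest; an empty list means the package is unchanged.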
Built-in Public Access Module
  • Search options like basic, advanced and command search
  • Supports full text searching
  • Filtering options on various fields like organization, author, publisher, date etc.
  • Search within search functionality to further drill down the search results
  • Supports streaming for audio/video content to enable efficient public access
  • Browse records collection wise
  • Book View functionality that gives the user the feel of reading the actual book
  • Content preview (book, audio, video) with direct jumps to specific content using page tagging and audio/video timecode markings
♦ DATĀNTAR (डेटांतर): e-Records Extraction Tool
C-DAC has developed an e-Records Extraction Tool named DATĀNTAR for automatic extraction of preservation metadata in compliance with the eGOV-PID standard and for extracting the electronic records stored in the database of an e-governance system. It allows the user to connect to an e-gov database, upload e-record schemas and map them to the database, map preservation metadata as per the eGOV-PID standard, schedule extraction of e-records, schedule transfer of e-records for digital preservation using the Open Archival Information System (OAIS), capture migration history and maintain batch logs. DATĀNTAR has been deployed for extracting the registered documents stored in the database of Computer Aided Administration of Registration Departments (CARD), Hyderabad. In this pilot implementation, around 25 lakh (2.5 million) documents with preservation metadata have been successfully extracted for preservation.

A separate version of the e-Records Extraction Tool has been developed for the e-District pilot. It has been tested using a sample database comprising 10,000 records provided by UP e-District.
Main features are briefly outlined as under-
  • Login name, password and biometric authentication for information security
  • Connect with e-gov database
  • Upload e-record schemas for mapping with e-gov database
  • Mapping of preservation metadata as per eGOV-PID standard
  • Schedule the extraction and transfer of e-records to eGOV-DIGITĀLAYA for archival
  • Captures e-record conversion / migration history and digital evidences
  • Maintains batch logs of entire extraction process, marks the failed e-records
♦ e-RUPĀNTAR (ई-रूपांतर): Pre-archival Processing Tool
A large amount of digitization has already been carried out by several organizations without following the best practices essential for long-term preservation. Therefore, the pre-archival processing tool called e-RUPĀNTAR has been developed to help bring digitized records into an acceptable form as per information science and preservation best practices. This software provides a collection of digital preservation best practices such as policy-based and batch assignment of Unique Record Identifiers (URIs), batch cropping, image enhancement, image watermarking, OCR, structural tagging, conversion to PDF/A and creation of Submission Information Packages (SIPs) as per the OAIS standard. It also provides user management and a registry of URIs.

Main features are briefly outlined as under-
  • User management
  • Centralized admin/ settings for all best practices
  • Enforcement of best practices in multi-user environment
  • Integrated and customizable workflow with best practices
  • Saving of workflows based on the type of collection / data for re-use
  • Content organization
  • Batch image cropping with shortcuts
  • Image enhancement features
  • User defined URI policies and schemas
  • URI schemas in XML for name validation
  • Uniform definition of classifications and sub-categories for URIs
  • Batch assign Unique Record Identifiers
  • Automatic / Selective OCR XML output
  • Structural metadata (Tagging) (single page and double page) for navigation
  • Access Image conversion
  • Image Watermarking
  • PDF/A-1b Conversion
  • Reverse a step partially or completely
  • Maintain registry of URIs with fixity
  • Validate content against URI
  • Define collection wise process flows
  • Production of valid SIP acceptable to OAIS
New exploration
  • Automatic tagging (accuracy depends on the quality of the text) with manual verification and enhancements
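
Policy-based batch assignment of Unique Record Identifiers, together with name validation, could work roughly as sketched below. The identifier layout (`ORG-COLLECTION-YEAR-SEQUENCE`), the class `UriPolicy` and the functions `batch_assign` and `is_valid` are invented for illustration. e-RUPĀNTAR defines its URI schemas in XML; the regular expression here merely stands in for that.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class UriPolicy:
    """Hypothetical URI policy: ORG-COLLECTION-YEAR-SEQUENCE."""
    organization: str
    collection: str
    year: int
    digits: int = 6

    def pattern(self) -> re.Pattern[str]:
        """Regular expression that every identifier under this policy must match."""
        return re.compile(
            rf"{re.escape(self.organization)}-{re.escape(self.collection)}"
            rf"-{self.year}-\d{{{self.digits}}}"
        )

    def make_uri(self, sequence: int) -> str:
        """Build the identifier for one sequence number, zero-padded."""
        return (
            f"{self.organization}-{self.collection}-{self.year}"
            f"-{sequence:0{self.digits}d}"
        )


def is_valid(policy: UriPolicy, uri: str) -> bool:
    """Check an identifier against the policy's naming pattern."""
    return policy.pattern().fullmatch(uri) is not None


def batch_assign(policy, filenames, start=1, registry=None):
    """Assign sequential identifiers to files in sorted order.

    `registry` maps already-issued identifiers to file names, so an
    identifier is never issued twice.
    """
    registry = {} if registry is None else registry
    assigned = {}
    sequence = start
    for name in sorted(filenames):
        uri = policy.make_uri(sequence)
        while uri in registry:  # skip identifiers that are already taken
            sequence += 1
            uri = policy.make_uri(sequence)
        registry[uri] = name
        assigned[name] = uri
        sequence += 1
    return assigned
```

Keeping the registry separate from the policy allows the same policy to be reused across batches while the registry guarantees uniqueness.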
♦ META-PARIVARTAN (मेटा-परिवर्तन): Any To Any Metadata Conversion Tool
Digital repository managers, archivists and librarians need to convert existing metadata into other metadata formats for various purposes. There is a plethora of metadata schemes being designed and used in various fields, and it is very difficult to find ready-made tools that can help in mapping metadata schemas and converting large volumes of XML files.

Therefore, the META-PARIVARTAN software tool has been developed to enable large-scale conversion from one metadata format into another, such as MARC21 to Dublin Core, e-Gov PID to Dublin Core, METS to MODS, etc. The tool is completely metadata-neutral and flexible in usage. It can be integrated with any other project or work on its own, and it supports all versions of metadata formats. Using its UI, one can visually map the tags of one metadata schema to those of another, after which the conversion is automated.
Main features are briefly outlined as under-
  • Any Metadata to any Metadata conversion based on available schema
  • Automatic namespace detection
  • Provision for manual addition of namespaces
  • Tree representation of schema elements
  • Element to element mapping
  • Project-wise mapping and saving of multiple schemas
  • Automatic XSLT code generation based on user actions
  • Supports most XSLT functions and gives provision to customize them
  • XML Validation
  • Provision to add/edit XSLT code
  • Mapping XSLT file creation
  • Metadata conversion based on XSLT
  • Validation of input XML against schema
  • Conversion of source metadata into target metadata as per the mapping XSLT
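
The idea of generating a mapping XSLT from element-to-element mappings can be sketched as follows. This is only an illustration under stated assumptions: the source schema is un-namespaced, the target is Dublin Core, and the function `build_xslt` and its parameters are hypothetical rather than part of META-PARIVARTAN. The Python standard library can build the stylesheet but cannot execute it, so applying it would need an XSLT processor.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_xslt(mapping: dict[str, str], record_xpath: str,
               target_root: str = "record") -> str:
    """Generate an XSLT 1.0 stylesheet from a source-to-Dublin-Core mapping.

    `mapping` maps a source XPath (relative to the matched record) to the
    name of a Dublin Core element, e.g. {"name": "title"}.
    """
    ET.register_namespace("xsl", XSL_NS)
    ET.register_namespace("dc", DC_NS)
    stylesheet = ET.Element(f"{{{XSL_NS}}}stylesheet", {"version": "1.0"})
    template = ET.SubElement(
        stylesheet, f"{{{XSL_NS}}}template", {"match": record_xpath}
    )
    root = ET.SubElement(template, target_root)  # literal result element
    for source_xpath, dc_element in mapping.items():
        # One for-each per mapping, so repeated source elements are all copied.
        loop = ET.SubElement(
            root, f"{{{XSL_NS}}}for-each", {"select": source_xpath}
        )
        target = ET.SubElement(loop, f"{{{DC_NS}}}{dc_element}")
        ET.SubElement(target, f"{{{XSL_NS}}}value-of", {"select": "."})
    return ET.tostring(stylesheet, encoding="unicode")


# Example with a hypothetical source schema: <book><name/><writer/></book>
print(build_xslt({"name": "title", "writer": "creator"}, "/book"))
```

Building the stylesheet as an element tree rather than by string concatenation keeps the output well-formed even when XPath expressions contain quotes or angle brackets.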
♦ SUCHI SAMEKAN (सूची समेकन): Metadata Importing and Aggregation Tool
During digital preservation, just as the metadata stored in a database needs to be extracted in XML form, it is also necessary to import XML records back into the database. Therefore, a software tool named SUCHI SAMEKAN (सूची समेकन) has been developed for importing and aggregating metadata.

Main features are briefly outlined as under-
  • Parsing of standard metadata XML and importing it into the database 
  • Supports various kinds of standard metadata such as Dublin Core (DC), Extended Dublin Core (ext-DC), Open Archives Initiative Dublin Core (OAI-DC) and Machine-Readable Cataloguing (MARC21)
  • Mapping of any kind of database to the 15 Dublin Core metadata elements as generic fields, with other more specific fields as specific facets
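
Parsing Dublin Core XML and importing it into a database might look like the sketch below. The `<records>`/`<record>` wrapper elements, the table `dc_values` and the function `import_records` are assumptions made for this example, and SQLite stands in for whatever database SUCHI SAMEKAN actually targets.

```python
import sqlite3
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# The 15 elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def import_records(xml_text: str, connection: sqlite3.Connection) -> int:
    """Store every Dublin Core value of every <record>; return record count."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS dc_values "
        "(record_id INTEGER, element TEXT, value TEXT)"
    )
    # Continue numbering after any records imported earlier.
    (last_id,) = connection.execute(
        "SELECT COALESCE(MAX(record_id), 0) FROM dc_values"
    ).fetchone()
    prefix = f"{{{DC_NS}}}"
    count = 0
    for record in ET.fromstring(xml_text).iter("record"):
        count += 1
        rows = [
            (last_id + count, child.tag[len(prefix):], (child.text or "").strip())
            for child in record
            if child.tag.startswith(prefix)
            and child.tag[len(prefix):] in DC_ELEMENTS
        ]
        connection.executemany("INSERT INTO dc_values VALUES (?, ?, ?)", rows)
    connection.commit()
    return count
```

Storing one row per element value keeps repeated elements (for example several `dc:creator` entries) without changing the table layout; elements outside the Dublin Core namespace are ignored here rather than mapped to facets.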
♦ DATA HASTĀNTAR (डेटा-हस्तांतर): Data Encryption and Transfer Tool
Data depositors normally copy the data onto DVDs, flash drives or other storage media and send it for archival by courier, by hand or over a network. In the absence of a secure means of transfer, the data is sent unprotected and is vulnerable to theft and unauthorized access. Therefore, DATA HASTĀNTAR (डेटा-हस्तांतर): Data Encryption and Transfer Tool has been developed, which helps data depositors encrypt the data package before transferring it for archival.
Main features are briefly outlined as under-
  • Data packaging
  • Package encryption
  • Register of encrypted packages
  • PDF report of encrypted packages

The data encrypted by DATA HASTĀNTAR (डेटा-हस्तांतर) Tool is decrypted by e-SANGRAHAN (ई-संग्रहण) Tool at the time of e-acquisition.

♦ e-SANGRAHAN (ई-संग्रहण): e-Acquisition Tool
In case of manual acquisition of physical documents, the archives have to maintain a physical register / inventory of the documents received from any organization. In contrast, when digital data is received, the archives are not in a position to manually register the millions of files stored on the digital media.

Therefore, in order to address this requirement, e-SANGRAHAN (ई-संग्रहण): e-Acquisition Tool has been developed. It receives digital data on offline storage media such as DVDs, flash drives and tapes, automatically extracts the list of contents received in the package, and sends an acknowledgement report in PDF format by e-mail to the data depositor. Data deposits are as important and valuable to data depositors as financial deposits, so it is extremely important to capture all evidentiary details of the entire e-acquisition process. The software automatically extracts technical properties of the storage media such as serial number, make details, volume name, storage capacity, actual data size, file count, etc. It also supports a web camera interface for capturing the external details of the storage media. The e-SANGRAHAN Tool maintains an e-acquisition register with search and retrieval for ensuring consistency between the received data and the archived data.

Main features are briefly outlined as under-
  • Webcam interface to photograph the storage media label
  • Extract media properties
  • Copy the contents on acquisition storage
  • Decrypt the data package
  • Extract the list of contents on transit location
  • Generate PDF report of e-acquisition
  • Combined single report for a data package split across multiple storage media
  • Send e-acquisition report to depositor
  • Maintain e-acquisition register
  • Search and retrieval within e-acquisition register
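
Extracting the list of contents and combining a single report for a package split across several storage media can be sketched as below. The function names (`list_contents`, `combined_report`) and the report layout are hypothetical. Hardware properties such as serial number and make are platform-specific and are left out; only file names, sizes and counts are collected.

```python
import os
from pathlib import Path


def list_contents(media_root: Path) -> list[tuple[str, int]]:
    """Return (relative path, size in bytes) for every file on one medium."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(media_root):
        for name in filenames:
            path = Path(dirpath) / name
            entries.append(
                (path.relative_to(media_root).as_posix(), path.stat().st_size)
            )
    return sorted(entries)


def combined_report(media: dict[str, Path]) -> dict:
    """Merge the content lists of several media into one acquisition report.

    `media` maps a media label (e.g. the label photographed by webcam)
    to the mount point or copy location of that medium.
    """
    per_media = {}
    for label, root in media.items():
        entries = list_contents(root)
        per_media[label] = {
            "file_count": len(entries),
            "total_bytes": sum(size for _name, size in entries),
            "files": entries,
        }
    return {
        "media": per_media,
        "file_count": sum(m["file_count"] for m in per_media.values()),
        "total_bytes": sum(m["total_bytes"] for m in per_media.values()),
    }
```

The resulting dictionary could then feed a PDF generator or an e-acquisition register; both are outside the scope of this sketch.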
♦ ABHILEKH DIGITĀLAYA (अभिलेख डिजिटालय): Government Archival System

It is an adaptation of OAIS for use by government archives that manage huge volumes of paper- and microfilm-based records which are being digitized.

  • Receive SIPs from e-Rupantar tool
  • Linking and integration of legacy database
  • Mechanism to correct errors and remove repetition in legacy database
  • Automatic record matching suggestions for mapping a digital object with metadata
  • Metadata enrichment forms, metadata entry form and collaborative framework for record creation
  • Mechanism for SIP distribution
  • Support for both automated batch ingest and manual ingest
  • Dashboards to display current activities like notifications, user activities, record statistics etc.
  • Activity dashboard with system activity details such as hard disk and RAM utilization, AIPs, SIPs, user accounts, etc.
  • User management for creating and managing users in archival system
  • Main accession register for listing AIP in archival and record management tasks
  • Scheduling and manual checking of integrity of selected or all AIPs in archive
  • Work reports based on various filters like specific user, department, ministry etc.
  • Transfer of AIPs to LTFS storage medium
  • Security measures like biometric authentication, single device user login, single browser, single user login
  • User statistics such as logged-in users, user browsing statistics, etc.
  • Content generation for Dissemination Information Package (DIP)

Public Access Module

  • Public access to Dissemination Information Package (DIP) of Archive on web portal
  • Metadata search and retrieval for public, private and cartographic records
  • Request for downloading of DIP by web user
  • Fuzzy search, faceted, exact match on catalogue metadata and digital archived data
  • Record display feature consisting of metadata, images and PDF
  • Access of record pages based on indexing metadata
♦ Records Reporting System (अभिलेख सूचना प्रणाली)
Records Reporting System (अभिलेख सूचना प्रणाली) is an online system for reporting the transfer of records to the National Archives of India by various ministries and government departments.
Main features are briefly outlined as under-
  • Digital transformation of forms notified by Public Records Rules, 1997.
  • Automation of the process of transferring of records from ministries to archive.
  • Multi-parameter sorting of records based on criteria like ministry, department, year, form types, etc. 
  • Online generation of record transfer list, reacquisition slips, destruction list, etc., in compliance with Public Records Rules, 1997.
  • Searching and sorting based on year and user of the submitted online forms.
  • User management, user verification, notifications by email.
♦ eGOV DIGITĀLAYA (ई-गव्ह डिजिटालय): Migrated Data Authentication and Access Portal
The eGOV DIGITĀLAYA portal has been developed and deployed for authentication of migrated records by sub-registrars in Andhra Pradesh. At present, 67 sub-registrars from various districts of AP have authenticated around 25 lakh (2.5 million) records using this system.
Main features are briefly outlined as under-
  • Sub-Registrar can register and upload digital signature certificate in the eGOV DIGITĀLAYA portal.
  • Sub-Registrar can access the archived records in eGOV DIGITĀLAYA portal through intranet
  • Compare the source document with the migrated document and then affix the signature based on access rights
  • Supports batch signing of converted PDF/A documents using e-Tokens through intranet
  • Verification and preview of digitally signed converted documents 
  • Provision to download documents with its metadata 
  • Full text search and retrieval based on document identifier, keyword, document sorting by major parameters, and OCR text
  • Bookmarking of searched documents and adding them to favorites
    The archival process is completed only after the migrated document in PDF/A-1b format is digitally signed by the sub-registrar through the network
  • Migration history maintains the digital evidences of all processes like extraction, conversion and re-authentication
♦ PDF/A-1b Converter Tool
Most e-governance projects and archiving institutions have to produce documents in PDF format. However, the scanners and PDF libraries used by e-Gov application developers tend to produce proprietary PDF documents, which are not recommended for preservation. Developers are either unaware of PDF/A-1b, a conformance level of the ISO 19005-1 open archival format meant for long-term preservation, or unable to produce or convert documents into PDF/A-1b. Therefore, the PDF/A-1b Converter Tool has been developed.

Main features are briefly outlined as under-
  • PDF to PDF/A-1b conversion (both scanned and born-digital)
  • TIF to PDF/A-1b conversion (single, multiple and multi-page TIFs)
  • JPG to PDF/A-1b conversion (both single and multiple JPGs)