Digitization
There are many good reasons to digitize departmental and organizational records. Digitization can consolidate disparate files, improve discoverability (e.g., through searching), and mitigate the loss of information caused by deteriorating physical originals. This page offers guidance on best practices before, during, and after any digitization effort.
Considerations prior to beginning any digitization project
- Have I documented a digitization project plan for present use and future reference?
- Do I have a good inventory of the items I want to digitize?
- Do I have appropriate resources (staff, time, equipment) to digitize these items?
- How do I want to organize the digital files? Do I have a good naming system and organizational structure that anyone could understand?
- Where will I store the digital files? Do I have a long-term plan for storing the files? Do I have a plan for back-ups and digital preservation?
- Where will I store my originals once they are scanned? Do I want to donate them to the Archives?
The archivists within Archives and Special Collections are experts in the digitization and preservation of materials. Before starting any digitization project, please discuss your plans with an archivist. Below are functional aspects to consider when undertaking any digitization project: digitization parameters, formats, and metadata.
Optimal Digitization Parameters
Resolution refers to the sharpness or clarity of an image, or the ability of an imaging system to resolve fine detail. Original items should be scanned with long-term care, use, and access in mind in order to preserve content at archival quality. Not adhering to archival scan standards may result in a need to rescan items or loss of information.
Digital files should match the look of the original as closely as possible. Color originals should be digitized in color; pages yellowed over time may be digitized in grayscale.
The appropriate scan resolution depends on the size of the original. A fixed number of pixels might display adequately in a 2” x 2” passport photo, yet will not display well at 8” x 10”, because those same pixels are too few to render the image at the larger size. The result is a poor-quality image that may be blurry, with details that are difficult to distinguish. Image resolution is measured in dpi (dots per inch) or ppi (pixels per inch). Most scanning/imaging equipment and accompanying software offer high-resolution options.
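The relationship between physical size, resolution, and pixel count can be sketched as simple arithmetic. The example below is a minimal illustration; the function names and the 300 dpi starting value are assumptions chosen for the example, not Archives standards.

```python
def pixel_dimensions(width_in: float, height_in: float, dpi: int) -> tuple:
    """Return the (width, height) in pixels produced by scanning at `dpi`."""
    return round(width_in * dpi), round(height_in * dpi)


def effective_ppi(pixels_wide: int, display_width_in: float) -> float:
    """Return the pixels per inch available when an image is shown at a given width."""
    return pixels_wide / display_width_in


# Assumed example: a 2" x 2" original scanned at 300 dpi.
small = pixel_dimensions(2, 2, 300)

# Spreading the same pixels across 8 inches lowers the pixels available per inch.
stretched = effective_ppi(small[0], 8)

print(small, stretched)
```

Enlarging the 2-inch scan to 8 inches divides the available pixels per inch by four, which is why smaller originals call for higher scan resolutions.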
Archives and Special Collections has set standards for the minimum resolution needed when scanning various types of media. Some situations may require digitization at higher resolutions.
PHOTOGRAPHIC PRINTS
- Sizes up to 3.5” x 5.0” – 1200 dpi
- Sizes above 3.5” x 5.0” – 600 dpi
NEGATIVES
Some scanners cannot scan negatives. Before purchasing any equipment, confirm its capabilities.
- 35 mm negatives or slides – 1600 dpi
- Negatives up to 3.5” x 5.0” (excluding borders) – 1200 dpi
- Negatives above 3.5” x 5.0” (excluding borders) – 600 dpi
TEXT
All text should be scanned at a minimum of 400 dpi and a maximum of 600 dpi.
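The minimums listed above can be expressed as a small lookup, which may be handy when planning a batch of scans. This is a sketch only: the function name, the media labels ("print", "negative", "35mm", "text"), and the reading of "up to 3.5 x 5.0" as "shorter side at most 3.5 inches and longer side at most 5.0 inches" are assumptions made for illustration.

```python
def minimum_dpi(media: str, width_in: float = 0.0, height_in: float = 0.0) -> int:
    """Return the minimum scan resolution for an item, per the lists above.

    Dimensions are in inches and are only needed for prints and
    larger-format negatives (measured excluding borders).
    """
    if media == "text":
        return 400  # text: minimum 400 dpi, maximum 600 dpi
    if media == "35mm":
        return 1600  # 35 mm negatives or slides
    if media in ("print", "negative"):
        # Assumption: orientation does not matter, so compare sorted sides.
        short_side, long_side = sorted((width_in, height_in))
        fits_small_size = short_side <= 3.5 and long_side <= 5.0
        return 1200 if fits_small_size else 600
    raise ValueError(f"Unknown media type: {media!r}")


print(minimum_dpi("print", 5.0, 3.5))
print(minimum_dpi("negative", 4.0, 5.0))
```

These are minimums; as noted above, a particular item may warrant a higher resolution.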
Format Recommendations
While there are many format options available, only a handful are considered sound archival options that will remain accessible into the future. Many common file types, such as MP3 and JPEG, compress information to reduce file sizes. While this is good for quick access and lower storage needs, the compression discards some of the original information and may increase the possibility of a corrupted file over time. Best practice dictates that the first scan, or preservation file, be created in an archival format. A secondary file, or access file, can then be created from the preservation file. The most widely accepted preservation format for scans is TIFF. Access formats include JPEG, PDF, and many others.
Scan your images or text to TIFF files. Audio should be reformatted to WAV, and video to MOV or MP4. Then choose one or more access formats to best suit your needs.
For more on the best formats for digitization, visit the complete Archives and Special Collections file format recommendations page.
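One way to apply the format recommendations above is to check file extensions before files go into long-term storage. The sketch below is illustrative: the content-type labels, the function name, and the accepted extension spellings (for example, both ".tif" and ".tiff") are assumptions, not an official list.

```python
from pathlib import PurePath

# Preservation formats named above, keyed by an assumed content-type label.
PRESERVATION_EXTENSIONS = {
    "image": {".tif", ".tiff"},
    "text": {".tif", ".tiff"},
    "audio": {".wav"},
    "video": {".mov", ".mp4"},
}


def is_preservation_format(content_type: str, file_name: str) -> bool:
    """Return True if the file extension matches a recommended preservation format."""
    extension = PurePath(file_name).suffix.lower()
    return extension in PRESERVATION_EXTENSIONS[content_type]


print(is_preservation_format("image", "yearbook_0001.TIF"))
print(is_preservation_format("image", "yearbook_0001.jpg"))
```

A JPEG would fail this check as a preservation file but remains a reasonable access copy derived from the TIFF.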
Metadata
Information about the digitized item, known as metadata, is essential to understanding the item's content and context. Without the date, subject, or names of the people captured in an archival item, the file may become less useful or not useful at all. Capture metadata to ensure that digital files retain their background details. Content to capture may include the following:
- Title
- Description
- Subject(s)
- Creator(s)
- Date and/or Date Range of Original
- Marginalia
- Size of Original Item (Height, Width, Depth)
- Collection Title/Record Series Title/Folder Title
- Digital Specifications (scanner/camera, resolution, date digitized)
- File name
This is a basic list of metadata elements; many other types of metadata exist.
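The elements listed above could be recorded in a simple structured form so that they stay with each digital file. The following is a minimal sketch, not a prescribed schema: the class name, field names, and sample values are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ItemMetadata:
    """One record per digitized item, mirroring the list above."""
    file_name: str
    title: str
    description: str = ""
    subjects: list = field(default_factory=list)
    creators: list = field(default_factory=list)
    date_of_original: str = ""
    marginalia: str = ""
    size_of_original: str = ""        # height, width, depth
    collection_title: str = ""        # collection / record series / folder
    digital_specifications: str = ""  # scanner or camera, resolution, date digitized


# Hypothetical sample values, for illustration only.
record = ItemMetadata(
    file_name="example_0001.tif",
    title="Example photograph",
    subjects=["Campus buildings"],
    size_of_original="3.5 x 5.0 in",
    digital_specifications="flatbed scanner, 1200 dpi",
)

# Serialize to JSON so the record can be stored alongside the digital file.
record_json = json.dumps(asdict(record), indent=2)
print(record_json)
```

A spreadsheet with one row per item and one column per element would serve the same purpose; the key point is that each record carries the file name.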
Summary
Before beginning any digitization activities, review and prepare the physical arrangement of the materials. A good file arrangement often translates into a useful starting point for creating digital file structures and unique file names and identifiers.
Digitization takes time and resources. The total volume, formats, equipment, and detail of metadata all influence the length of any digitization project.
It is important to ensure the metadata, digital file, and physical original can all be related to one another; to do so, use the file name or another unique identifier to maintain that relationship.
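As an illustration of using a unique identifier to tie files and metadata together, the sketch below builds file names from an identifier and reports any files or metadata records that lack a counterpart. The naming pattern, the "ua023" collection code, and the function names are hypothetical.

```python
from pathlib import PurePath


def build_file_name(collection_id: str, item_number: int, extension: str = ".tif") -> str:
    """Build a file name such as 'ua023_0007.tif' from a collection code and item number."""
    return f"{collection_id}_{item_number:04d}{extension}"


def identifier_of(file_name: str) -> str:
    """Return the identifier portion of a file name (the name without its extension)."""
    return PurePath(file_name).stem


def find_unmatched(file_names, metadata_ids):
    """Return (files without metadata, metadata without files) as sorted identifier lists."""
    file_ids = {identifier_of(name) for name in file_names}
    record_ids = set(metadata_ids)
    return sorted(file_ids - record_ids), sorted(record_ids - file_ids)


files = [build_file_name("ua023", 1), build_file_name("ua023", 2)]
records = ["ua023_0002", "ua023_0003"]
print(find_unmatched(files, records))
```

Writing the same identifier on the folder or enclosure of the physical original extends the link to all three: metadata, digital file, and original.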
Consult with the University Archives before destroying any records, regardless of whether digitization is planned or already complete. The Archives is happy to answer any questions and help preserve the history of the university.