I first proposed founding the Last McCoy Library in a paper written for my class in Digitization. It outlined my goals for the project and all the specifications I hoped to use for the preservation and digitization of my family's oldest photographs. It was submitted in June of 2021. The timeline has since fallen behind, but otherwise the original proposal still reflects what I hope to accomplish. Below is the original proposal.
Digitization Proposal: The Last McCoy Library's Aging Family Photographs
by Jaylyn McCoy
University of Denver
LIS 4820, Section 1
Professor Krystyna Matusiak
June 11, 2021
The Last McCoy Library Collections Overview
The Last McCoy Library currently consists of six collections that capture the life of the McCoy family from 1866-2000. Family members have passed down photographs, letters, and ephemera through the decades until they came into the possession of the current curator, who is the last remaining child that carries the McCoy surname. Inventory of the items remains an ongoing process, but the library currently consists of 811 photographs, 65 books, 34 newspaper clippings, 21 letters, 13 certificates/forms, 4 postcards, 2 telegrams, and 54 pieces of ephemera. Many of the items originated more than a hundred years ago. They have been stored in dubious containers, such as vintage suitcases, and left to their own devices until now.
Project Overview and Scope
The following proposal outlines the first digitization effort of the Last McCoy Library (LML). The proposal focuses exclusively on photographic materials contained within LML, specifically those originating before the year 1945. The scope of this project is limited to photographs in this category, equaling 605 photographs. The proposal will cover the digitization and preservation of the physical originals as well as the preservation and online display of the digital copies. The original photographs range in size from 3.4-inch cabinet cards to 11x14 inch portraits. Workflows will prioritize older photographs much in need of preservation, specifically those originating before the year 1900.
Project Goals
The project hopes to achieve two main goals. First, this project will preserve older materials within the library, ranging from 1866-1945. Preservation will occur for the physical items as well as the digital versions to ensure the family history remains intact for future generations. Second, this project will provide access to the digital collections for family members and interested parties. This will include a blog detailing the journey of the collection and the curator as well as an open-access digital library.
Figure 1 – Digitization Project Goals: Preservation, Access, Workflow Testing, and Skills Development
The project will also promote several secondary goals. The project will serve as the development and test run for a digitization workflow that LML will use to complete future projects. The end goal is to digitize photos with preservation needs first, but eventually to digitize the entire collection, including ephemera. The project will also encourage the development of necessary skills for the curator to bring to future employment opportunities. These skills include utilization of free and open-access digitization practices, knowledge and practice of industry standards, experience with digitization and metadata processes, management of Digital Collection Management Systems, and project management capabilities.
Significance
The photographs selected for priority digitization include those with preservation needs. Due to poor storage, many of the photographs are in danger of deteriorating beyond recognition. If the photographs are to be accessible to family members and other interested parties, action must be taken to ensure their survival. LML also represents a convergence of artifacts from various branches of the McCoy family that cannot be found elsewhere. Items have been inherited from the McCoys, the Shepherds, the Cherrys, the Sailers, and the Perrys. In addition, these items have historical significance. The Jean Shepherd collection contains photographs of female officer training in World War II. There are many photographs from World War I as well. Both wars are historic events of wide popular interest. Even older photographs, though not documenting historic events, serve to capture the feeling of bygone eras and provide details of daily life.
Target Audience
The main target audience of LML remains McCoy family members and those of related families captured in the collections. However, the wide scope of time and place in the photographs allows for a larger audience. This includes history enthusiasts and amateur genealogists. The photographs were taken across New England and even abroad, allowing for interest from those local communities as well. The LML blog will focus on the process of digitization and preservation, thus drawing in an audience interested in such workflows including existing digitization and preservation professionals as well as students.
Resources
Because this project exists outside the institutional sphere, resources must be minimized as much as possible. Funding for the project comes from the curator alone, although family members may donate to the cause when solicited. For this reason, all resources used for the project will be free and open access when available. This will also serve as an opportunity for the curator to learn more about maximization of the budget and the capabilities of free resources.
To digitize LML’s aging resources, the curator will utilize as many free resources as possible. This includes scanners available through local resources such as the Arapahoe Library and the Westminster Law Library at the University of Denver. The Arapahoe Library system allows patrons to check out an Epson Perfection V550 Photo Scanner. It allows for 48-bit color scanning and 16-bit grayscale scanning with a 6,400 dpi resolution. The tray holds items up to 8.5x11.7 inches and allows for reflective and transparent materials (Epson America, Inc, 2021). The kit includes the scanner, cables, film holders, and software (Arapahoe Libraries, 2021). The item is in high demand so time with the equipment will be limited. Therefore, the curator will also utilize the BookEye 4 V1A overhead scanner at the Westminster Law Library. The BookEye 4 allows for 36-bit color scanning and 12-bit grayscale scanning with 600 ppi resolution. The tray holds items up to 18x24.4 inches (Image Access, Inc., 2018). To assist with digitization, a color target, ruler, and multiple flash drives will be necessary.
For image editing, the curator will use open access programs to reduce the costs of the project. Digital copies of the photographs will be edited using GIMP, which provides free and open-source high quality photo manipulation software (GIMP, 2021).
For online collection delivery, OMEKA will serve as the content management system. The project will start with the hosted version at the “Plus” level: 2 GB of storage, 2 sites, 20 plug-ins, and 8 themes at $35 per year (Corporation for Digital Scholarship, 2021). If further expansion is needed, the hosted version offers affordable increases. OMEKA allows for the creation of separate collections as well as exhibits. It requires the use of Dublin Core metadata. The functionality is basic, but many plug-ins have been created to enhance the capacity of the site. While other options exist for content management systems, OMEKA is the only one that provides a low-end pricing model that can be utilized by a single person. It also avoids a hefty initial fee that many systems demand while still delivering a useable and functional content management system.
The curator will invest time in learning the capabilities of the freely available scanners as well as the finer technicalities of both GIMP and OMEKA. Training received in a Master of Library and Information Science program will serve as a starting point that will be supplemented with online tutorials, time spent tinkering, and questions to knowledgeable sources.
Finally, LML will need physical preservation materials for the originals. This will include binders with preservation quality sleeves. The originals have thus far been stored loose in pieces of luggage, cardboard boxes, and plastic bags. Preservation resources will allow the originals to be properly maintained for future use.
Technical Specifications
Figure 2 – Technical Specification for the Digitization Project - Resolution at 4000 pixels on the long dimension, Bit Depth at 36-bit or 48-bit, Color Profile in full color, and File Outputs as Tiff or Jpg
This project focuses on delicate and one-of-a-kind photographs with a main goal of preservation. To this end, photographs will be scanned at the highest resolution the available equipment allows. The resulting resolution will vary depending on the size of the photograph, but the goal will be to capture 4,000 pixels across the long dimension. This will depend on availability of scanners. Most of the photographs are black and white, but for preservation purposes they will be captured at 36-bit or 48-bit color depth depending on availability of scanners. This will help capture sepia tones and handwriting on the verso of the images. Preference will be given for older photographs to be captured at the higher bit depth to ensure better detail in accordance with the Federal Agencies Digital Guidelines Initiative (FADGI, 2016).
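The relationship between print size and scanner setting can be worked out with simple arithmetic. The following is a minimal sketch, not part of the original workflow: the function names are hypothetical, and the 4x6 inch print is an assumed example size. The 4,000-pixel target, the 11x14 inch portrait, and the 600 ppi limit of the BookEye 4 come from the proposal.

```python
import math

TARGET_LONG_EDGE_PX = 4000  # goal stated in the proposal


def required_ppi(width_in: float, height_in: float,
                 target_px: int = TARGET_LONG_EDGE_PX) -> int:
    """Smallest whole-number ppi that yields target_px on the long edge."""
    long_edge_in = max(width_in, height_in)
    if long_edge_in <= 0:
        raise ValueError("print dimensions must be positive")
    return math.ceil(target_px / long_edge_in)


def scanner_reaches_target(max_ppi: int, width_in: float,
                           height_in: float) -> bool:
    """True if a scanner limited to max_ppi can hit the pixel target."""
    return required_ppi(width_in, height_in) <= max_ppi


# 11x14 inch portrait (largest size named in the proposal)
print(required_ppi(11, 14))               # 286

# Assumed 4x6 inch print on a 600 ppi overhead scanner
print(scanner_reaches_target(600, 4, 6))  # False
```

Under these assumptions, smaller prints would need the higher-resolution flatbed scanner to reach the target, while larger portraits fall well within the range of either device.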
Resulting digital images will be saved as uncompressed TIFF files for the archival master. Service masters will be created as JPEG2000 with lossy compression. Service masters will be edited in GIMP to correct straightness, crop extraneous backgrounds, adjust color levels, and sharpen. From the service master, smaller online access files will be created. GIMP will be used again to lower resolution to 72 ppi and reduce the image size to 1024 pixels on the long edge. The reason for a smaller image size is two-fold: to take up less space on OMEKA and to prevent the copying of images from the site. Higher resolution images will be available upon request.
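The proposal performs the downsizing step in GIMP, but the target dimensions of an online access file can be checked ahead of time. The sketch below is illustrative only: the function name is hypothetical, and the 4000x3000 pixel service master is an assumed example. Only the 1024-pixel long edge comes from the proposal.

```python
ACCESS_LONG_EDGE_PX = 1024  # long-edge size stated in the proposal


def access_dimensions(width_px: int, height_px: int,
                      long_edge: int = ACCESS_LONG_EDGE_PX) -> tuple[int, int]:
    """Scale dimensions so the long edge equals long_edge, keeping aspect ratio.

    Images already at or below the target are returned unchanged,
    since enlarging them would add no detail.
    """
    if width_px <= 0 or height_px <= 0:
        raise ValueError("pixel dimensions must be positive")
    longest = max(width_px, height_px)
    if longest <= long_edge:
        return width_px, height_px
    new_w = max(1, round(width_px * long_edge / longest))
    new_h = max(1, round(height_px * long_edge / longest))
    return new_w, new_h


# Assumed 4000x3000 service master
print(access_dimensions(4000, 3000))  # (1024, 768)
```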
The naming convention for the digital images will follow the same pattern across all collections. All images will start with the three-letter institutional marker LML. This will be followed by a three-letter collection marker, such as JSC for the Jean Shepherd Collection. Finally, there will be four numbers that are unique to each item starting with 0001 and counting upwards. The resulting name will look as follows: LML_JSC_0001.jpg. An “s” will be added to the end for service masters and a “w” will be added to the end for online access files. All types of images will be stored in separate folders for Archival Master, Service Master, and Online Access Files.
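The naming convention can be expressed as a small helper so that names are generated consistently rather than typed by hand. This is a sketch under stated assumptions: the function name, the role labels, and the "tif" default extension are hypothetical, and the suffix is assumed to sit immediately before the file extension, which the proposal does not spell out.

```python
import re

INSTITUTION = "LML"  # three-letter institutional marker from the proposal

# Suffix per file role: "s" for service masters, "w" for online access files.
SUFFIXES = {"archival": "", "service": "s", "access": "w"}


def build_name(collection: str, item_number: int,
               role: str = "archival", extension: str = "tif") -> str:
    """Return a file name such as LML_JSC_0001.tif."""
    if not re.fullmatch(r"[A-Z]{3}", collection):
        raise ValueError("collection marker must be three uppercase letters")
    if not 1 <= item_number <= 9999:
        raise ValueError("item number must fit in four digits (1-9999)")
    if role not in SUFFIXES:
        raise ValueError(f"unknown role: {role!r}")
    return f"{INSTITUTION}_{collection}_{item_number:04d}{SUFFIXES[role]}.{extension}"


print(build_name("JSC", 1, "archival", "jpg"))  # LML_JSC_0001.jpg
print(build_name("JSC", 1, "access", "jpg"))    # LML_JSC_0001w.jpg
```

The four-digit counter caps each collection at 9,999 items, which the helper enforces rather than silently producing a longer name.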
Metadata Specifications
The OMEKA content management system requires users to utilize basic Dublin Core metadata. However, a plugin can be used to include Extended Dublin Core elements. This standard will ensure compliance with the Open Archives Initiative Protocol for Metadata Harvesting (Open Archives, 2015) and allow the collections to be easily accessible. Metadata will be provided for individual items. The customized Dublin Core Extended metadata will follow the template created by the curator in LIS 4010. This includes title, creator, date, description, language, subject, coverage, relation, type, format, source, publisher, rights, and identifier. The following standardized vocabularies will be used: LC Thesaurus for Graphic Materials (subject), Getty Thesaurus of Geographic Names (coverage – location), Library of Congress Primary Source Timeline (coverage – time), DCMI Type (type), and Internet Media Type (format).
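Before ingest, each item record could be checked against the element list above. The following sketch is not an OMEKA feature and does not use any OMEKA API: the function name is hypothetical and the sample values are placeholders, not real catalog data. It simply reports which of the fourteen template elements are still empty.

```python
# The fourteen elements named in the curator's template.
TEMPLATE_ELEMENTS = (
    "title", "creator", "date", "description", "language", "subject",
    "coverage", "relation", "type", "format", "source", "publisher",
    "rights", "identifier",
)


def missing_elements(record: dict) -> list[str]:
    """Return template elements that are absent or blank in a record."""
    return [
        element for element in TEMPLATE_ELEMENTS
        if not str(record.get(element, "")).strip()
    ]


# Placeholder record for illustration only
draft = {
    "title": "Placeholder title",
    "identifier": "LML_JSC_0001",
    "type": "Still Image",   # DCMI Type vocabulary term
    "format": "image/jpeg",  # Internet Media Type
    "description": "   ",    # blank values count as missing
}

print(missing_elements(draft))
```

A record is ready for ingest under this check when the returned list is empty.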
Workflow
Photos will be selected from all collections based on their year of origin. After selection and planning, the photographs will be digitized. After digitization, the physical photographs will be placed in preservation sleeves arranged in collection level binders. Images will be quality controlled on a 100% basis as they are edited and corrected in GIMP. Metadata will then be created for each photograph. Finally, items will be ingested and tested in OMEKA. The process may then begin again on the next project. Several ongoing processes will coincide with this loop. The first is a survey that will go out to family members to ascertain their interests in the photographs. It will also solicit donations of both money to fund the project and items they may wish to contribute. Surveys will go out as more family members are discovered to ensure appropriate display and treatment of all items. The LML Blog will document processes and major events within the project and beyond. Maintenance of the website and preservation of the original photographs and digital images will also be ongoing processes in the workflow.
Figure 3 – LML Photograph Digitization Workflow - Planning and Selection then Digitization then Conservation of Originals then Quality Control then Metadata Creation and finally Ingest and Test. Ongoing processes include family survey, blog, maintenance, and preservation.
Preservation of Files
Because of the personal nature of the project, paying for a preservation level repository is not an option. Therefore, the preservation of files will be done through the free functions of Google Drive and Dropbox cloud storage. Google Drive provides 15 GB of storage for free to all users with an account. After that, payment options start at $1.99/month for 100 GB up to $9.99/month for 2 TB (Google One, 2021). Dropbox offers only 2 GB of storage for free. After that, the next level offers 2 TB of storage for $9.99 a month (Dropbox, 2021). Dropbox will be used for older photographs to take advantage of the higher security offered by the service. However, most of the photographs will be stored on Google Drive as there is more storage available for free. Storage space will be assessed as the project enters the quality control phase, but for this proposal it will be assumed that one of the plans must be purchased to hold all the original TIFF files for preservation purposes.
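A rough estimate supports the assumption that a paid plan will be needed. The sketch below is an approximation only: it assumes every archival master measures 4000x3000 pixels (the proposal fixes only the long edge), and it ignores TIFF header overhead. The function names are hypothetical. The photograph count, bit depths, and 15 GB free tier come from the proposal.

```python
PHOTO_COUNT = 605         # photographs in scope, per the proposal
FREE_GOOGLE_DRIVE_GB = 15  # free tier cited in the proposal


def uncompressed_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Raw pixel data size of one uncompressed image, in bytes."""
    return width_px * height_px * bits_per_pixel // 8


def total_gigabytes(count: int, width_px: int, height_px: int,
                    bits_per_pixel: int) -> float:
    """Total size of count identical images, in decimal gigabytes."""
    return count * uncompressed_bytes(width_px, height_px, bits_per_pixel) / 1e9


# Assumed 4000x3000 pixel masters at the two bit depths in the proposal
for depth in (36, 48):
    size_gb = total_gigabytes(PHOTO_COUNT, 4000, 3000, depth)
    print(f"{depth}-bit: about {size_gb:.1f} GB, "
          f"exceeds free tier: {size_gb > FREE_GOOGLE_DRIVE_GB}")
```

Under these assumptions the archival masters come to roughly 33 GB at 36-bit and roughly 44 GB at 48-bit, both beyond the free tier but well within the 100 GB plan.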
Other methods will be used to ensure preservation, keeping with the practice of LOCKSS – lots of copies keep stuff safe (Stanford University, 2020). OMEKA offers a hosted option for the content management system. This will serve as a virtual storage for the lower quality, online access images. Full size, high resolution images will be stored on the curator’s computer. External hard drives will be utilized as well, with one in Colorado and one in Virginia. Additional external hard drives may be purchased by family members so the curator can upload the library contents and ship the drive to them. These will serve as a final safeguard should anything happen to the cloud storage while also providing invested parties with access to photographs of their family history.
Project Timeline
Initial work will begin in mid-June to construct the survey so it can be emailed to family members for input. The selection process will then commence with a focus on older items that need preservation. Physical preservation supplies will also be purchased at this time to prepare for the next steps. The curator will document the process on the LML Blog. In July, digitization efforts will begin in earnest. This should take approximately 20 hours, but extra time has been allotted to account for scanner learning curves and transportation of materials. After digitization, photos will be arranged in sleeves and binders per collection.
Figure 4 – Digitization Project Timeline - June 2021: Family survey, selection, and blog creation, July 2021: Digitization and preservation, August 2021: Quality control and image editing, Autumn 2021: Metadata creation, and Winter 2021: Ingest, test, and launch of website.
In August, digital images will be edited and checked for quality. This will take approximately 20 hours, but extra time has been allotted to learn the finer points of the GIMP software. Starting in September, metadata will be created for the digital images as well as printed and kept with the original photos. This will take approximately 100 hours. By mid-November, the digital images should be ready for ingest into OMEKA. Once the OMEKA site has been tested, LML will officially launch before the new year. Time estimates were calculated using the Digitization Cost Calculator (DLF Assessment Interest Group, 2021).
Project Budget
The budget differs significantly from regular institutional projects as this project will be operated on a personal level by the curator. Therefore, the project has been scaled back significantly. Items that require purchase have been vetted for both quality and price point to allow for a balance between usability and affordability. The itemized budget follows in Table 1. Item names link to vendor websites for further information.
Table 1 – Digitization Project Budget
The project will also utilize free or existing resources to maximize the existing budget. These items include the locally accessible scanners, ruler, and editing software GIMP. The curator will donate time to complete the project, thereby providing free labor and skills.
Project Evaluation
The success of the project will be evaluated through four processes. The first is the quality control step in the workflow. The curator will record errors that require re-digitization during this process. The resulting statistics will be used to judge the success of the digitization stage of the project. Accuracy of digitization should improve over time. The second is a family survey. The first survey to family members will ask for input on areas of interest for digitization. The follow-up survey will ask for ratings of the OMEKA website as well as the quality of images and other aspects of the project, both quantitative and qualitative. The resulting answers will determine if the project has been a success in the eyes of interested parties.
Figure 5 – Project Evaluation Methods - Quality control, family survey, web presence visits, and photo views and downloads.
The final two measures of success will derive from the website itself. The curator will monitor website visit statistics as well as photo views/downloads. This will determine the success of the website itself based on usage. Numbers should start modestly and increase as the number of materials on the website increases. The blog will also be monitored for traffic to gauge interest, with a focus on whether the blog drives traffic to the website and vice versa.
References
Arapahoe Libraries. (2021). Epson Perfection V550 Photo Color Scanner. Arapahoe Libraries. https://arapahoelibraries.bibliocommons.com/v2/record/S115C2115745.
Corporation for Digital Scholarship. (2021). Omeka.net. https://www.omeka.net/.
DLF Assessment Interest Group. (2021). Digitization Cost Calculator. Digitization Dashboard. https://dashboard.diglib.org/.
Dropbox. (2021). Choose the Right Dropbox for You. Dropbox. https://www.dropbox.com/plans?tab=personal.
Epson America, Inc. (2021). Epson Perfection V550 Photo Color Scanner. Epson US. https://epson.com/Clearance-Center/Scanners/Epson-Perfection-V550-Photo-Color-Scanner/p/B11B210201.
FADGI. (2016, September). Technical Guidelines for Digitizing Cultural Heritage Materials. Federal Agencies Digital Guidelines Initiative. http://www.digitizationguidelines.gov/guidelines/FADGI_Still_Image_Tech_Guidelines_2016.pdf.
GIMP. (2021). GIMP GNU Image Manipulation Program. GIMP. https://www.gimp.org/.
Google One. (2021). Plans & Pricing. Google One. https://one.google.com/about/plans.
Image Access, Inc. (2018). Bookeye® 4 V1A – Digitization Excellence and High Productivity. Image Access, Inc. | Bookeye 4 V1A. https://www.imageaccess.com/bookeye4v1a.
Open Archives. (2015). Open Archives Initiative Protocol for Metadata Harvesting. https://www.openarchives.org/pmh/.
Stanford University. (2020, October 30). Lots of Copies Keep Stuff Safe. LOCKSS. https://www.lockss.org/.