By DSG Group on June 14, 2017
This document was written by John Lynch, with contributions from Lisa Snyder, Annelie Rugg, Deidre Whitmore, Lucian Tucker, Todd Presner, Miriam Posner, and Patrik Svensson.
Every publication method, digital or analog, has a likely lifespan, that is, the length of time before it will stop being accessible unless it is significantly overhauled or replaced. The likely lifespan of well-cared-for paper books can be measured in decades, if not centuries; the likely lifespan of some digital artifacts is better measured in months, or even hours. This means that, without careful planning and management, digital scholarship is at high risk of being lost permanently. We’ve prepared this guide to help scholars understand the level of risk associated with various digital publication methods (based on our collective experience) and how to minimize that risk. Armed with this knowledge, a scholar can choose a publication method better matched to personal risk tolerance.
Start by asking the question: What is my risk tolerance?
- Low. I want to experiment with digital scholarship, but I want to take every reasonable precaution to ensure the longevity of my work. Likely lifespan: 10+ years.
  - Publish your object as a plain-text (ASCII) file, such as .txt or .csv, as a PDF/A file (the “A” stands for “archival”), or as a .TIFF or .WAV file, and deposit it in a repository, such as the UC’s eScholarship platform. You can also query http://www.re3data.org/ to find a repository specific to your discipline. Discuss video file options with a professional archivist. Best partner: Digital Repository.
  - If interactivity is necessary, publish your underlying data and your web interface separately. That will make it easier to deposit your data in a well-resourced digital repository, such as Merritt or Zenodo. (Or, query http://www.re3data.org/ to find a repository specific to your discipline.) That way, even if you can’t sustain your interactivity, your underlying data is likely to be preserved. Best partner: Library or repository for data storage; Institutional or Private Web Hosting Provider and Systems Administrator for the web object.
  - Publish your web object as a supplement/companion to a traditional publication (or publish a paper book as a companion to your web object). To the greatest extent possible, also include the scholarly content of your web object in the paper publication and describe its interactivity, with images. That way, even if you can’t sustain your interactivity, future scholars can recreate it. Best partner: Traditional publisher.
- Medium. I need to push boundaries, and I’m prepared to take some risks to do it. Likely lifespan: 5-10 years.
  - Publish your scholarship as a flat HTML page, with no scripts, databases, etc. Use .txt, .csv, or PDF/A for documents. Use only .JPG or .TIFF image files and .WAV audio files. Discuss video file options with a professional archivist; YouTube might be a viable option. Best partner: Private Web Hosting Provider and Systems Administrator.
  - Publish your scholarship using a popular and well-supported content management system such as WordPress, Drupal, Omeka, or GitHub Pages + Jekyll. Use only core features, with no plugins or extensions. Use only the most commonly used file types for your content, such as .txt, .rtf, .csv, .xlsx, .docx, .pdf, .jpg, .tif, .wav. Discuss video file options with a professional archivist; YouTube might be a viable option. Best partner: Institutional or Private Web Hosting Provider and Systems Administrator.
- High. My scholarship is only possible in interactive digital environments, and I’m prepared to take risks to realize it. Likely lifespan: 3-5 years.
  - Publish your scholarship using a content management system or similar platform. For websites, that could be WordPress, Drupal, Scalar, or Omeka; for video games, that could be Unity; etc. Use whatever plugins and file types you want, although try to use the most common and best-supported ones whenever possible. For video files, YouTube is a viable option. Best partner: Institutional or Private Web Hosting Provider and Systems Administrator.
  - Build a custom application to publish your scholarship. Work with your technical partners to decide which tools currently best fit your needs. Best partner: Institutional or Private Web Hosting Provider and Systems Administrator.
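
The file-format advice in the tiers above can be checked mechanically before a deposit. The following is a minimal sketch, not a definitive tool: the function name `audit_formats` and the `PREFERRED_EXTENSIONS` set are hypothetical, the set is simply assembled from the formats this guide names for the low- and medium-risk tiers, and a matching extension does not prove conformance (for example, a `.pdf` file is not necessarily PDF/A).

```python
from pathlib import Path

# Assumption: this set mirrors the formats named in the guide's low- and
# medium-risk tiers (.jpeg is included as an alternate spelling of .jpg).
# Adjust it to match your own repository's guidance.
PREFERRED_EXTENSIONS = {
    ".txt", ".csv", ".pdf", ".tif", ".tiff", ".wav", ".jpg", ".jpeg", ".html",
}


def audit_formats(project_dir, preferred=PREFERRED_EXTENSIONS):
    """Return relative paths of files whose extension is not in `preferred`.

    Only the file extension is inspected (case-insensitively); the file
    contents are never opened or validated.
    """
    root = Path(project_dir)
    flagged = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() not in preferred:
            flagged.append(path.relative_to(root).as_posix())
    return flagged
```

Calling `audit_formats("my-project")` (where `my-project` is a placeholder folder name) returns a list of files worth discussing with an archivist before you deposit them.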
There are a lot of ways to reduce the risk that your digital publication will become inaccessible. Here are some ideas.
- If possible, plan a web object that creates ongoing scholarly value for you. Much digital scholarship vanishes because its creators treat it as a book that they publish and “shelve”. This means no one notices when it falls apart. If you plan a digital tool that will be an active component of your scholarly toolkit instead of a one-off, your active engagement with it over the long term will help you stay ahead of most of the issues that disable digital scholarship.
- Partner with a librarian or other digital archival specialist to design a curatorial statement for your web object. A web object normally has multiple distinct components, such as assets, products, and design. The curatorial statement should identify which of your project’s components are long-term scholarly resources that must be preserved as-created (=archival content), and which, while important to the success of the project, can be ephemeral or can be preserved in archival images and videos (=ephemeral content). That will let you plan and budget for the necessary preservation work, which will make it much easier down the road.
- Try to “publish” even the most experimental digital projects in an analog environment. For example, there are peer-reviewed digital humanities-friendly journals (both print and digital) that will publish a text discussion of a digital project (e.g., RIDE). So before you begin your digital scholarship, read a few such articles to get an understanding of what that entails, and then build the funding or other resources that you’ll need to produce such an article into your project budget. Plan to include numerous images of the ephemeral content of your project in your publication. That way, if you can’t sustain your project, it can still be partially recreated by curious scholars down the road.
- Build a team. If you’re the only person who has access to the project materials, then you are also a “single point of failure” for the entire project. Collaborating with a small team, on the other hand, makes your digital object much more resilient. Other people will be able to access the materials in case anything happens to you, and since they share a sense of ownership, they’ll be more likely to try to preserve the web object without you.
- Budget for someone to create a “flat copy” of your object as soon as it becomes functional, and again every time it experiences a significant change (e.g., you add a lot of new content). A “flat copy” is a copy of a web object made using a longer-lifespan technology, normally sacrificing most interactivity for sustainability. Options include static HTML (e.g., using HTTrack), PDF (e.g., with Adobe Acrobat Pro or another PDF editor), or even a collection of PNG screenshots (using your computer’s built-in capabilities). Then, when you can no longer maintain the ephemeral content, draft a brief statement about the need to archive the project, deposit those copies of your site along with that statement in a repository (whether an academic one, like Merritt or Zenodo, or the Internet Archive), and update all of your project links to direct users to the new location. That way, visitors will understand that the project has been archived and will be able to explore a record of how it looked at its best, rather than encounter a neglected site full of broken links and broken functionality.
- Consider releasing any custom code for your project under an open-source license (e.g., MIT, Apache 2.0, or GPL; Creative Commons licenses are intended for content such as text and images rather than software), and deposit it along with your flat copy, or link to it in any article that you publish about your project. Computer code is almost always written as plain text, which makes it very easy to preserve. While the code itself might not work in twenty years, having access to it (especially if it is well-commented) will let future scholars recreate your project much more easily. However, make sure to consult with your university’s legal department before you do this, both to make sure that you license your code properly to protect your intellectual property, and to make sure that any contract that you sign with an outside web developer allows you to do this.
- Be strategic when partnering with institutional technology support groups on your projects. Such teams can be excellent partners for short- or medium-term projects (e.g., 3-10 years), but may not be able to sustain a partnership for the duration of longer projects. Also, these partners may be constrained to working with “active” members of their institution: when scholars leave or retire, or students graduate, the capability to continue the partnership may vanish. If you do partner with such a team, try to design your project to match their strengths (in alignment with the above recommendations, of course) and maintain an active line of communication with them, e.g., by scheduling an in-person check-in meeting with the manager in charge of your hosting at least once per year. Keeping that relationship strong will help you get earlier warnings and more support in case any problems occur.
- Find ongoing funding. Most digital projects “disappear” because they are only funded by one-time grants. That pays for their creation, but not for the maintenance necessary to keep them operational. If you can secure ongoing funding for your project (e.g., from an endowed chair or via active and ongoing fundraising efforts), however, you can easily solve many of the issues that compromise interactive digital objects.
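
The flat-copy recommendation above ends with depositing the copies together with a brief archival statement. One way to prepare that deposit is sketched below, under stated assumptions: the function name `bundle_flat_copy`, the file name `ARCHIVAL-STATEMENT.txt`, and the `flat-copy/` folder inside the archive are all hypothetical choices, and your repository may prefer a different packaging format than a zip file.

```python
import zipfile
from datetime import date
from pathlib import Path


def bundle_flat_copy(flat_copy_dir, statement_text, output_dir="."):
    """Zip a flat copy together with an archival statement; return the zip path.

    `flat_copy_dir` is a folder that already holds the flat copy (for
    example, static HTML pages or PNG screenshots).
    """
    source = Path(flat_copy_dir).resolve()
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")

    # Date-stamp the bundle so successive flat copies do not overwrite each other.
    archive_name = f"{source.name}-flat-copy-{date.today().isoformat()}.zip"
    archive_path = (Path(output_dir) / archive_name).resolve()

    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as bundle:
        # The statement sits at the top level so a visitor sees it first.
        bundle.writestr("ARCHIVAL-STATEMENT.txt", statement_text)
        for path in sorted(source.rglob("*")):
            # Skip directories, and skip the bundle itself if it is being
            # written inside the folder that is being archived.
            if path.is_file() and path != archive_path:
                arcname = (Path("flat-copy") / path.relative_to(source)).as_posix()
                bundle.write(path, arcname=arcname)
    return archive_path
```

For example, `bundle_flat_copy("site-export", "This project was archived because ...")` (both arguments are placeholders) would write a date-stamped zip file to the current folder, ready to upload to the repository you have chosen.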
Here are a few other resources on digital preservation: