CIRCA:Archive-IT

Revision as of 10:47, 27 November 2012 by VictoriaSmith

Archive-It Project Plan

Archive-It is a web archiving service developed by the Internet Archive "that helps organizations to harvest, build, and preserve collections of digital content" [1].

The University of Alberta Libraries holds an Archive-It subscription. Our project, which collects content related to the history of Humanities Computing, is the first research collaboratory at the University of Alberta to make use of this service.

Step One

  • Before our project team begins crawling and collecting content, we need to run practice test crawls to become familiar with the service and to learn how its various parameters affect the breadth and scope of what is collected.
  • Our test crawl will begin with an item of historical interest that was active in a specific time frame, such as a newsletter or report.
  • Following this test crawl, we will see whether we can extract data from the search results and whether any of the text analysis tools at our disposal can be applied to it. If so, we will continue with the project.
  • Before beginning any crawl, we will check the Wayback Machine to make sure the Internet Archive is not already crawling the content; we do not want to duplicate existing crawls.
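
The Wayback Machine check in the last point could be partly automated. The following is a minimal sketch, assuming the Internet Archive's public availability endpoint at archive.org/wayback/available and its JSON response layout (an "archived_snapshots" object holding a "closest" capture). The function names and the example site are hypothetical and not part of the project plan. A single capture only shows that the content was archived at some point; it does not show how complete or how regular the coverage is, so a manual look at the Wayback Machine would still be needed.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Internet Archive's public Wayback availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query_url(site_url, timestamp=None):
    """Build the availability query URL for a candidate site."""
    params = {"url": site_url}
    if timestamp:
        # Optional YYYYMMDD hint: the service returns the capture nearest to it.
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urllib.parse.urlencode(params)


def closest_snapshot(response):
    """Return (snapshot_url, timestamp) from a parsed response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def check_site(site_url, timestamp=None):
    """Query the live service for one site (requires network access)."""
    query = build_query_url(site_url, timestamp)
    with urllib.request.urlopen(query, timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

For example, check_site("example.org/newsletter", "20050101") would return the capture closest to 1 January 2005, or None if the Wayback Machine holds no capture, in which case the site is a candidate for our own crawl.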

Step Two

  • We will identify ten different types of sites to crawl. For example:
    • Institute
    • Journal
    • Technological
    • Blogs (conference notes)
    • Tweets (hashtags)
    • Events

Step Three

  • We will evaluate the project and determine whether:
    • (1) We would like to continue making and analyzing crawls; and
    • (2) We should continue using a portion of the Library's subscription or purchase our own subscription to Archive-It from the Internet Archive.
  • We will share our experience of collaborating with the Library with other members of the campus community and with broader audiences.