Test crawls

Running a test crawl lets you see how a crawl will behave with the current seed settings without committing to keeping the data it captures. It is a safe way to test changes to seeds or collections and to catch scoping errors before they cause problems in production crawls. The data from a test crawl can be saved if the crawl completed as expected, or deleted if there were errors that need to be addressed. The first step in properly scoping any seed is to run a test crawl.

Test Crawl Individual Seeds

To see how adding a new seed will affect a collection, start by running a test crawl on that seed. To run a test crawl on an individual seed:

1) Open the Collection Overview page for the collection that you have added the seed to.

2) Check the checkbox next to the seed you wish to run a test crawl on.

3) Click on 'Run Crawl'.

4) Select 'Test Crawl' as the crawl type.

5) Set the duration equal to the duration of the production crawl the seed will be added to. For example, if the seed will be in a collection with a recurring monthly crawl that lasts three days, set the time limit to three days.

6) Click on 'Crawl'.

Test Crawl a Batch of Seeds

Archive-It limits the number of test crawls that can run concurrently. If you are adding multiple seeds to a collection that will be crawled with identical frequency and duration, you can run a single test crawl that includes all of them. To do so, follow the same steps as above, but check the checkbox next to every seed that you wish to include in the test crawl.
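If you keep your own list of new seeds outside Archive-It, a short script can help decide which seeds can share a test crawl. The following is only a sketch: the `Seed` record, its field names, and the `batch_for_test_crawls` function are hypothetical and are not part of Archive-It. It groups seeds that have the same crawl frequency and duration, so that each group can be selected together in a single test crawl.

```python
from collections import defaultdict
from typing import NamedTuple


class Seed(NamedTuple):
    # Hypothetical record for planning purposes; not an Archive-It data model.
    url: str
    frequency: str      # frequency of the production crawl, e.g. "monthly"
    duration_days: int  # time limit of the production crawl, in days


def batch_for_test_crawls(seeds):
    """Group seed URLs by (frequency, duration) so each group can share one test crawl."""
    batches = defaultdict(list)
    for seed in seeds:
        batches[(seed.frequency, seed.duration_days)].append(seed.url)
    return dict(batches)


# Example seeds (placeholder URLs).
seeds = [
    Seed("https://example.org/", "monthly", 3),
    Seed("https://example.com/news/", "monthly", 3),
    Seed("https://example.net/", "weekly", 1),
]

batches = batch_for_test_crawls(seeds)
for (frequency, days), urls in batches.items():
    print(f"{frequency}, {days} day(s): {len(urls)} seed(s)")
```

Each resulting group corresponds to one test crawl: check the checkbox next to every seed in the group and set the duration to the group's value.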

Test Crawl Reports

Test crawl reports are identical to production crawl reports. They give you a complete overview of the data captured during a crawl, as well as links to the Wayback captures of the site for quality assurance (QA) review. To find test crawl reports, go to the main 'Crawls' tab of the Archive-It admin interface and click on 'Test Crawls'. There are a few things to pay particularly close attention to when evaluating the effectiveness of a test crawl.

Convert Test Crawl to Production Crawl

Once you have finished QA on a test crawl and are confident that the seed is ready to be added to a production crawl, do the following:

1) Open the landing page of the collection that contains the seed

2) Click on the 'Seeds' tab

3) Click the checkbox next to the seed you wish to add to a production crawl

4) Click 'Edit Settings'

5) Check the checkbox labeled 'Visible to the public'

6) Set the crawl frequency

7) Click 'Save'