LinkScan for Unix. Reference Manual. Section 10. Import Scanning
The LinkScan Import function may be used to:
Validate a list of Links exported from some arbitrary data source (e.g. a database management system).
Validate a list of Documents (e.g. an arbitrary subset of pages from a web site) and all the links contained within them. The list might contain the most critical or popular pages, perhaps extracted by an HTTP logfile analysis program. It could also represent an arbitrary user session, including a sequence of form submissions with specific data values. Such sequences may be easily captured with the LinkScan Recorder.
When processing a list of Links, each URL is checked in turn and its status is stored in the LinkScan database. When processing a list of Documents, each document and every link within that document is checked and its status is stored.
To use the Import function, carry out the following steps:
Prepare the Import File
LinkScan will import a simple ASCII file of the following format:
URL ... one or more tab characters ... URL-Description
URLs may be absolute, or relative to the Home URL for the current server. The URL-Description is imported and carried through to the LinkScan Reports for identification purposes. You may use any ASCII string, for example a database record number.
Import files may also include URLs that use the extended LinkScan conventions for form submissions (GET, POST and Multi-Part POST). See How to Submit Forms.
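As an illustration of the format above, the following Python sketch builds import-file lines from (URL, description) pairs such as rows exported from a database. The function names and the sample record values are hypothetical and are not part of LinkScan. The sketch assumes the default tab separator and collapses any whitespace inside a description so that each entry stays on one line.

```python
def format_import_lines(records):
    """Return import-file lines of the form URL<TAB>URL-Description.

    `records` is an iterable of (url, description) pairs. This helper is
    hypothetical; only the output format comes from the manual.
    """
    lines = []
    for url, description in records:
        # Tabs or newlines inside the description would be mistaken for
        # field or record separators, so collapse them to single spaces.
        clean = " ".join(str(description).split())
        lines.append(f"{url.strip()}\t{clean}")
    return lines


def write_import_file(path, records):
    """Write the formatted lines to `path` as a plain ASCII file."""
    with open(path, "w", encoding="ascii") as handle:
        for line in format_import_lines(records):
            handle.write(line + "\n")
```

For example, write_import_file("test.txt", [("http://www.example.com/a", 1017)]) would produce a one-line file containing the URL, a tab, and the record number.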
An alternative field separator may be specified by including a special command as the first line of the file:
## \s+
The command starts with '##' in column one, followed by a Perl regular expression that specifies the field delimiter. In the example above, '\s+' means one or more whitespace characters (such as tabs or spaces).
Lines with a '#' in column one, and blank lines, are ignored as comments.
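The parsing rules above can be sketched in Python as a pre-flight check for an import file before it is handed to LinkScan. This is not LinkScan's own parser. The function name is hypothetical, the delimiter is compiled with Python's re module (which agrees with Perl for simple patterns such as '\s+' but not for every Perl expression), and splitting each line only at the first delimiter is an assumption.

```python
import re

# Default field separator per the manual: one or more tab characters.
DEFAULT_DELIMITER = r"\t+"


def parse_import_lines(lines):
    """Return a list of (url, description) pairs from import-file lines."""
    delimiter = re.compile(DEFAULT_DELIMITER)
    entries = []
    for index, raw in enumerate(lines):
        line = raw.rstrip("\r\n")
        # A '##' command is only recognised on the first line of the file.
        if index == 0 and line.startswith("##"):
            delimiter = re.compile(line[2:].strip())
            continue
        # Blank lines and lines with '#' in column one are comments.
        if not line.strip() or line.startswith("#"):
            continue
        # Assumption: split at the first delimiter only, so a description
        # may itself contain delimiter characters.
        fields = delimiter.split(line, maxsplit=1)
        url = fields[0]
        description = fields[1] if len(fields) > 1 else ""
        entries.append((url, description))
    return entries
```

Reading a file with open(path) and passing the file object to parse_import_lines gives a quick way to confirm that every entry has the URL and description you expect.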
To use the Import Function, open the linkscan.cfg file for the appropriate Project, and edit the Importfile setting. Supply the full pathname to the prepared ASCII import file. For example:
Importfile = /usr/home/linkscan/importfiles/test.txt
Then select the import mode by changing the Import setting. Valid values are:
Import = 0    Import mode disabled
Import = 1    Import a list of links
Import = 2    Import a list of documents
Import = 3    Import a list of documents with caching disabled
When importing a list of documents, LinkScan by default checks each document listed in the Import File and every link within it, but it does not follow those links to scan the entire site. Optionally, you may set Maxclicks to force LinkScan to execute a deeper scan. For example, with Maxclicks = 3, LinkScan will check the Import File, the documents listed in the Import File, and the children (but not the grandchildren) of those documents.
Special Considerations
LinkScan de-duplicates the list of links within an Import Document list. This means that LinkScan will validate each unique URL within the list only once.
However, you may force LinkScan to process an Import Sequence so that the same URL or document is checked more than once by making the repeated URLs appear unique. To do so, add a dummy name-value pair to the query string of each repeated URL. This also provides a means to differentiate the test results for each step:
http://www.example.com/cookie_sensitive?dummyseq=1
[...]
http://www.example.com/set_cookie
[...]
http://www.example.com/cookie_sensitive?dummyseq=2
If a URL already includes a query string, simply append the additional parameter to the existing query and change:
http://www.example.com/foo?name=value
to:
http://www.example.com/foo?name=value&dummyseq=1
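The renaming described above can be automated when an import sequence is generated by a script. The following Python sketch appends a dummy parameter to every URL that occurs more than once in a list, numbering the occurrences in order. The function names are hypothetical, and the sketch assumes ordinary URLs; URLs written with the extended LinkScan form-submission conventions may need different handling.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def add_dummy_seq(url, seq, param="dummyseq"):
    """Append param=seq to the query string of `url`."""
    parts = urlsplit(url)
    extra = f"{param}={seq}"
    # Extend an existing query with '&', otherwise start a new query.
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


def make_unique(urls, param="dummyseq"):
    """Number each occurrence of any URL that appears more than once."""
    totals = Counter(urls)
    seen = Counter()
    result = []
    for url in urls:
        if totals[url] > 1:
            seen[url] += 1
            result.append(add_dummy_seq(url, seen[url], param))
        else:
            # URLs that occur only once are left untouched.
            result.append(url)
    return result
```

The parameter name is arbitrary; it should be one that the target application ignores, so that the added pair does not change the response.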
Normally, LinkScan maintains the status of each link in a cache while it scans a site. This dramatically improves performance, since LinkScan does not need to re-check commonly used images and other components over and over. However, caching may be undesirable with some stateful sequences, for example when the same URL produces a completely different result before and after a cookie is set.
In those situations, you may use a special option (Import = 3), which forces LinkScan to flush its cache after each imported document has been validated.
LinkScan for Unix. Reference Manual. Section 10. Import Scanning
LinkScan Version 12.3
© Copyright 1997-2012
Electronic Software Publishing Corporation (Elsop)
LinkScan and Elsop are Trademarks of Electronic Software Publishing Corporation