ixCreateRobotsTxtParser
Name
ixCreateRobotsTxtParser
Synopsis
RobotsTxtParserT ixCreateRobotsTxtParser(StatusCodeT *Status)
Arguments
Status: A pointer to a value of type StatusCodeT that receives the error code if an error occurs.
Returns
A parser of type RobotsTxtParserT for robots.txt files. Once it has been given a site's robots.txt file, the parser can also report whether permission is granted for URLs on that site.
If an error occurs, Status is set to the error number.
Description
Robots.txt is a standard file that webmasters use to tell web crawlers (the robots behind web "search engines") which files and directories to exclude. A full description of the standard may be found at http://info.webcrawler.com/mak/projects/robots/norobots.html. After providing the parser with the robots.txt file from a given site, you can test URLs from that site against the parser to see whether you have permission to download and index them. The location of the robots.txt file is the same for every site: sitename/robots.txt. For example, the robots.txt file for Webcrawler may be found at http://www.webcrawler.com/robots.txt.
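The sitename/robots.txt rule described above can be sketched in C as follows. This is only an illustration: build_robots_url is a hypothetical helper, not part of the ix API, and it assumes the caller passes the site as a scheme plus host name with no path (for example "http://www.webcrawler.com").

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the ix API): writes the robots.txt URL
 * for `site` into `out`, following the sitename/robots.txt rule.
 * Assumes `site` is a scheme plus host name with no path component.
 * Returns 0 on success, or -1 if an argument is invalid or `out` is too small.
 */
int build_robots_url(const char *site, char *out, size_t out_size)
{
    if (site == NULL || out == NULL || out_size == 0)
        return -1;

    size_t len = strlen(site);

    /* Drop trailing slashes so "http://host/" does not become "http://host//robots.txt". */
    while (len > 0 && site[len - 1] == '/')
        len--;

    if (len == 0 || len > (size_t)INT_MAX)
        return -1;

    /* "%.*s" prints only the first `len` characters of `site`. */
    int written = snprintf(out, out_size, "%.*s/robots.txt", (int)len, site);
    if (written < 0 || (size_t)written >= out_size)
        return -1; /* output error or truncated result */

    return 0;
}
```

Called with "http://www.webcrawler.com" and a sufficiently large buffer, the helper produces http://www.webcrawler.com/robots.txt, the address given above. Fetching that file and handing its contents to the parser are separate steps that this sketch does not cover.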
See Also
Robots.txt, Robots Spec
ixDeleteRobotsTxtParser, ixSetRobotName, ixParseRobotsTxt, ixRobotsPermissionGranted, ixRobotsTxtLength