perl-CrawlerCommons-RobotRulesParser - parser for robots.txt files

Property Value
Distribution ALT Linux Sisyphus
Repository Autoimports noarch Official
Package filename perl-CrawlerCommons-RobotRulesParser-0.03-alt1.noarch.rpm
Package name perl-CrawlerCommons-RobotRulesParser
Package version 0.03
Package release alt1
Package architecture noarch
Package type rpm
Homepage -
License -
Maintainer -
Download size 46.05 KB
Installed size 46.05 KB
This module is a fairly close reproduction of the Crawler-Commons
From BaseRobotsParser javadoc:
Parse the robots.txt file in content, and return rules appropriate
for processing paths by userAgent. Note that multiple agent names
may be provided as comma-separated values; the order of these shouldn't
matter, as the file is parsed in order, and each agent name found in the
file will be compared to every agent name found in robotNames.
Also note that names are lower-cased before comparison, and that any
robot name you pass shouldn't contain commas or spaces; if the name has
spaces, it will be split into multiple names, each of which will be
compared against agent names in the robots.txt file. An agent name is
considered a match if it's a prefix match on the provided robot name. For
example, if you pass in "Mozilla Crawlerbot-super 1.0", this would match
"crawlerbot" as the agent name, because of splitting on spaces,
lower-casing, and the prefix match rule.
The method failedFetch is not implemented.
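
The matching rule quoted above (split on commas and spaces, lower-case, then prefix match) can be illustrated with a minimal sketch. This is a standalone Python illustration, not the module's Perl code and not the Crawler-Commons implementation. The function names split_robot_names and agent_matches are hypothetical, and special cases such as a wildcard agent are deliberately left out.

```python
from typing import List


def split_robot_names(robot_names: str) -> List[str]:
    """Lower-case the caller's names and split them on commas and spaces.

    Hypothetical helper: mirrors the description only, not the real parser.
    """
    return robot_names.lower().replace(",", " ").split()


def agent_matches(agent_name: str, robot_names: str) -> bool:
    """Return True if the robots.txt agent name is a prefix of any supplied name."""
    agent = agent_name.strip().lower()
    if not agent:
        # An empty agent name would be a prefix of everything; treat it as no match.
        return False
    return any(token.startswith(agent) for token in split_robot_names(robot_names))


# The example from the description: "crawlerbot" is a prefix of "crawlerbot-super".
print(agent_matches("crawlerbot", "Mozilla Crawlerbot-super 1.0"))  # True
print(agent_matches("googlebot", "Mozilla Crawlerbot-super 1.0"))   # False
```

Under this reading every token of the supplied name is a candidate, so an agent entry of "mozilla" in robots.txt would match the same input as well, which is why the description advises passing names without spaces.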



Requires

Name Value
/usr/share/perl5 -
perl(Const/ >= 0.014
perl( >= 2.800
perl(Log/ -
perl( >= 2.180.400
perl(MooseX/ >= 0.290
perl(MooseX/Log/ >= 0.470
perl(Try/ >= 0.240
perl( >= 1.710
perl(URI/ >= 3.310
perl(namespace/ >= 0.280
perl( -
perl-base >= 1:5.10.1
rpmlib(PayloadIsLzma) -


Provides

Name Value
perl(CrawlerCommons/ = 0.030
perl(CrawlerCommons/ = 0.030
perl(CrawlerCommons/ = 0.030
perl(CrawlerCommons/ = 0.030
perl(CrawlerCommons/ = 0.030
perl-CrawlerCommons-RobotRulesParser = 0.03-alt1


Download

Type URL
Binary Package perl-CrawlerCommons-RobotRulesParser-0.03-alt1.noarch.rpm
Source Package perl-CrawlerCommons-RobotRulesParser-0.03-alt1.src.rpm

Install Howto

  1. Add the following line to /etc/apt/sources.list:
    rpm [Sisyphus] noarch autoimports
  2. Update the package index:
    # apt-get update
  3. Install perl-CrawlerCommons-RobotRulesParser rpm package:
    # apt-get install perl-CrawlerCommons-RobotRulesParser



See Also

Package Description
perl-Criteria-Compile-0.047-alt1.noarch.rpm Describe wanted objects / data using grammar
perl-Crixa-0.13-alt1.noarch.rpm A Cleaner API for Net::RabbitMQ
perl-Cron-RunJob-0.06-alt1.noarch.rpm Monitor Cron Jobs
perl-CrowdControl-0.01-alt1.noarch.rpm User Management Framework for Web Applications
perl-CryoTel-CryoCon-0.0.6-alt1.noarch.rpm A module for interfacing with CryoTel Cryocontrollers via TCP
perl-Crypt-AES-CTR-0.03-alt1.noarch.rpm This is a port of Chris Veness' AES implementation
perl-Crypt-Affine-0.15-alt1.noarch.rpm Interface to the Affine cipher
perl-Crypt-AllOrNothing-0.10-alt1.noarch.rpm perl module Crypt-AllOrNothing
perl-Crypt-AllOrNothing-Util-0.09-alt1.noarch.rpm perl module Crypt-AllOrNothing-Util
perl-Crypt-AppleTwoFish-0.051-alt1.noarch.rpm An Apple iTMS internal key descrambling algorithm
perl-Crypt-CBCeasy-0.24-alt1.noarch.rpm perl module Crypt-CBCeasy
perl-Crypt-CFB-0.02-alt1.noarch.rpm perl module Crypt-CFB
perl-Crypt-CVS-0.03-alt1.noarch.rpm Substitution cipher for CVS passwords
perl-Crypt-Caesar-0.01-alt1.noarch.rpm perl module Crypt-Caesar
perl-Crypt-Camellia_PP-0.02-alt1.noarch.rpm perl module Crypt-Camellia_PP