perl-Text-Scan - Fast search for very large numbers of keys in a body of text

Property Value
Distribution ALT Linux Sisyphus
Repository Autoimports i586
Package filename perl-Text-Scan-0.31-alt4.1.i586.rpm
Package name perl-Text-Scan
Package version 0.31
Package release alt4.1
Package architecture i586
Package type rpm
Category Development/Perl
Homepage -
License -
Maintainer -
Download size 94.64 KB
Installed size 94.64 KB
This module provides facilities for fast searching on strings with very many search keys. The basic object behaves somewhat like a Perl hash, except that you can retrieve values based on a superstring of any stored keys. Simply scan a string and you will get back a Perl hash (or list) of all keys found in the string, along with their associated values and/or positions. All keys present in the text are returned.
There are several ways to influence the behavior of the match, chiefly through global character classes. These differ from regular expression character classes in that they apply to the entire text and to all keys. They consist of the "ignore" class, the "boundary" class, the "inclboundary" class, and any user-defined classes.
Using "ignore" characters you can have the scan pretend a char in the text simply does not exist. This is useful if you want to avoid tokenizing your text. So for instance, if the period '.' is in your "ignore" class, the text will be treated exactly as if all periods had been deleted.
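The effect of an "ignore" class can be pictured with a small plain-Perl sketch. It does not call Text::Scan at all: it only emulates the described behavior by deleting the ignored character and then searching with index(). The sample text and key are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the module's documentation.
my $text = "The U.S.A. is large.";
my $key  = "USA";

# Emulate an "ignore" class containing '.': drop every period from a copy
# of the text, so the search sees it as if the periods were never there.
(my $stripped = $text) =~ tr/.//d;

# Plain substring search stands in for the real scan.
my $pos = index($stripped, $key);
print $pos >= 0 ? "found '$key'\n" : "no match\n";
```

Without the deletion step the key "USA" would not match "U.S.A.", which is the situation the "ignore" class is meant to handle without pre-tokenizing the text.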
To define which characters may count as the delimiter of a match (a single space by default), use the "boundary" class. For instance, you can count punctuation as a boundary, so that phrases ending in punctuation will match.
User-defined character classes let you treat different characters as equivalent. For instance, this mechanism is used internally to implement case-insensitive matching.
A note on Unicode/UTF-8 strings: Text::Scan operates at the octet level, so it is not aware of Unicode or UTF-8 encoding. If you work with such strings, it is recommended to pass octet strings to Text::Scan, using Encode::encode_utf8() to produce them. Text::Scan will then return the found keys as UTF-8 encoded octet strings.
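The encode/decode round trip described above might look like the following sketch. It uses only the core Encode module; the octet-level search is represented by a plain index() call as a stand-in rather than by Text::Scan itself, and the sample key and text are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Character strings containing non-ASCII characters (hypothetical sample data).
my $key  = "caf\x{e9}";
my $text = "Un caf\x{e9}, s'il vous pla\x{ee}t";

# Convert character strings to UTF-8 octet strings before handing them
# to an octet-level matcher.
my $key_octets  = encode_utf8($key);
my $text_octets = encode_utf8($text);

# Stand-in for the octet-level search: index() on the octet strings.
my $pos = index($text_octets, $key_octets);

# A found key comes back as octets; decode it to get characters again.
my $found = decode_utf8(substr($text_octets, $pos, length $key_octets));

printf "octet length %d, character length %d\n",
    length($key_octets), length($found);
```

The two lengths differ because the accented character occupies two octets in UTF-8, which is why positions reported by an octet-level scan are octet offsets rather than character offsets.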


Package Version Architecture Repository
perl-Text-Scan-0.31-alt4.1.x86_64.rpm 0.31 x86_64 Autoimports


Name Value
/usr/lib/perl5 - - - - - - -
perl( -
rpmlib(PayloadIsLzma) -
rpmlib(SetVersions) -
rtld(GNU_HASH) -


Name Value
perl(Text/ = 0.310
perl-Text-Scan = 0.31-alt4.1


Type URL
Binary Package perl-Text-Scan-0.31-alt4.1.i586.rpm
Source Package perl-Text-Scan-0.31-alt4.1.src.rpm

Install Howto

  1. Add the following lines to /etc/apt/sources.list:
    rpm [Sisyphus] i586 autoimports
    rpm [Sisyphus] noarch autoimports
  2. Update the package index:
    # sudo apt-get update
  3. Install perl-Text-Scan rpm package:
    # sudo apt-get install perl-Text-Scan



See Also

Package Description
perl-Text-SpeedyFx-0.012-alt3.1.i586.rpm tokenize/hash large amount of strings efficiently
perl-Text-SpeedyFx-scripts-0.012-alt3.1.i586.rpm Text-SpeedyFx scripts
perl-Text-TNetstrings-1.2.0-alt4.1.i586.rpm Data serialization using typed netstrings
perl-Text-Tidx-0.94-alt4.2.i586.rpm Index a delimited text file containing start-stop positions
perl-Text-Tidx-scripts-0.94-alt4.2.i586.rpm Text-Tidx scripts
perl-Text-Tmpl-0.33-alt4.1.i586.rpm perl module Text-Tmpl
perl-Text-Tokenizer-0.4.6-alt4.1.i586.rpm Perl extension for tokenizing text(config) files
perl-Text-Unaccent-1.08-alt3_17.i586.rpm Remove accents from a string
perl-Text-Upskirt-0.100-alt4.1.i586.rpm turns baubles into trinkets
perl-Text-Ux-0.11-alt3.1.i586.rpm More Succinct Trie Data structure (binding for ux-trie)
perl-Text-Ux-scripts-0.11-alt3.1.i586.rpm Text-Ux scripts
perl-Text-VCardFast-0.11-alt3.1.i586.rpm Perl extension for very fast parsing of VCards
perl-Text-VisualWidth-0.02-alt4.1.i586.rpm perl module Text-VisualWidth
perl-Text-Wrap-Smart-XS-0.06-alt4.1.i586.rpm Wrap text fast into chunks of similar length
perl-Thread-Channel-0.003-alt4.1.i586.rpm Fast thread queues