Zeta Components Manual :: Docs For Class ezcDocumentRstTokenizer
Class ezcDocumentRstTokenizer
Tokenizer for RST documents
Source for this file: /Document/src/document/rst/tokenizer.php
Version: | //autogen// |
Constants
SPECIAL_CHARS
= '!"#$%&\'()*+,./:;<=>?@[\\]^_`{|}~-'
|
Allowed character sets for headlines. |
TEXT_END_CHARS
= '`*_\\\\[\\]|()"\':.\\r\\n\\t '
|
Characters ending a pure text section. |
WHITESPACE_CHARS
= ' \\t'
|
Common whitespace characters. The vertical tab is excluded, because it causes strange problems with PCRE. |
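The constant values read like PHP string literals meant to be embedded in PCRE character classes. The following is a minimal sketch under that assumption: the WHITESPACE_CHARS value is copied into a local variable, and the pattern around it is illustrative only, not taken from the tokenizer source.

```php
<?php
// Local copy of the documented WHITESPACE_CHARS value. As a PHP literal,
// ' \\t' is three characters: a space, a backslash and the letter "t".
$whitespaceChars = ' \\t';

// Inside a character class, PCRE reads the backslash-t sequence as a tab.
// This pattern is an illustration, not one of the tokenizer's own patterns.
$pattern = '(\\A[' . $whitespaceChars . ']+)';

// The subject starts with two spaces and one real tab character.
$matched = preg_match($pattern, "  \tindented text", $match);
$leadingWhitespace = $matched === 1 ? $match[0] : '';

// Print the number of leading whitespace characters that were matched.
echo strlen($leadingWhitespace), "\n";
```

Keeping the escape sequence inside the pattern, rather than a raw tab character, leaves the character class readable when it is concatenated into larger expressions.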
Member Variables
protected array |
$tokens
= array()
List of tokens, each paired with a regular expression that matches it. The tokens are tried in the order in which they are listed. |
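Order-dependent matching against a token list can be sketched as below. This is not the library's implementation: the function sketchTokenize(), the token names and the four patterns are hypothetical and exist only to show why the order of the list matters.

```php
<?php
/**
 * Hypothetical helper: split $input into tokens using an ordered map of
 * token name => anchored regular expression. Not the library's code.
 */
function sketchTokenize(string $input, array $tokens): array
{
    $result = array();
    $offset = 0;
    $length = strlen($input);

    while ($offset < $length) {
        $rest = substr($input, $offset);
        $matched = false;

        foreach ($tokens as $name => $expression) {
            // The first expression that matches wins, so order matters.
            if (preg_match($expression, $rest, $match) === 1 && $match[0] !== '') {
                $result[] = array('type' => $name, 'content' => $match[0]);
                $offset += strlen($match[0]);
                $matched = true;
                break;
            }
        }

        if (!$matched) {
            // Mirrors the documented behaviour of failing on unmatched input.
            throw new RuntimeException("No token matches at offset $offset.");
        }
    }

    return $result;
}

// Illustrative token list; every pattern is anchored to the start of the rest.
$sketchTokens = array(
    'whitespace' => '(\\A[ \\t]+)',
    'newline'    => '(\\A\\r?\\n)',
    'special'    => '(\\A[*`_|:])',
    'text'       => '(\\A[^ \\t\\r\\n*`_|:]+)',
);

$sketchResult = sketchTokenize("*emphasis* text\n", $sketchTokens);
foreach ($sketchResult as $token) {
    echo $token['type'], ': ', json_encode($token['content']), "\n";
}
```

Because the first matching expression is taken, more specific tokens have to come before more general ones. In this sketch the catch-all 'text' entry is listed last for that reason.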
Method Summary
public void |
__construct(
)
Construct tokenizer |
protected void |
convertTabs(
$token
)
Convert tabs to spaces |
public array |
tokenizeFile(
$file
)
Tokenize the given file |
public array |
tokenizeString(
$string
)
Tokenize the given string |
Methods
__construct
Construct tokenizer
Create the token array, with a regular expression matching each respective token.
convertTabs
Convert tabs to spaces
Convert all tabs to spaces, as defined in: http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#whitespace
Parameters:
Name | Type | Description |
---|---|---|
$token | ezcDocumentRstToken | |
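The whitespace rules referenced above place tab stops at every 8th column. The sketch below shows that expansion on a plain string. It is an illustration rather than the body of convertTabs(): the real method receives an ezcDocumentRstToken and returns nothing, whereas the hypothetical helper sketchExpandTabs() takes and returns a string.

```php
<?php
/**
 * Hypothetical helper: expand each tab in a single line to the next tab stop.
 * Assumes tab stops at every 8th column and single-byte characters.
 */
function sketchExpandTabs(string $line, int $tabWidth = 8): string
{
    $expanded = '';
    $column = 0;

    foreach (str_split($line) as $char) {
        if ($char === "\t") {
            // Pad with spaces up to the next multiple of $tabWidth.
            $spaces = $tabWidth - ($column % $tabWidth);
            $expanded .= str_repeat(' ', $spaces);
            $column += $spaces;
        } else {
            $expanded .= $char;
            $column++;
        }
    }

    return $expanded;
}

// A tab after two characters is replaced by six spaces.
echo '[', sketchExpandTabs("ab\tc"), "]\n";
```

A tab therefore does not have a fixed width: the number of spaces depends on the column at which the tab occurs.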
tokenizeFile
Tokenize the given file
The method tries to tokenize the passed file and returns an array of ezcDocumentRstToken structs on success, or throws an ezcDocumentTokenizerException if something could not be matched by any token.
Parameters:
Name | Type | Description |
---|---|---|
$file | string | |
tokenizeString
Tokenize the given string
The method tries to tokenize the passed string and returns an array of ezcDocumentRstToken structs on success, or throws an ezcDocumentTokenizerException if something could not be matched by any token.
Parameters:
Name | Type | Description |
---|---|---|
$string | string | |
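A possible way to call the tokenizer is sketched below. It assumes the Zeta Components Document classes are autoloadable in the calling application and uses the class, method and exception names exactly as they appear on this page. The wrapper function countRstTokens() is hypothetical and not part of the component.

```php
<?php
/**
 * Hypothetical wrapper: tokenize an RST string and return the token count.
 * Returns null when the tokenizer class cannot be loaded or when the input
 * cannot be tokenized.
 */
function countRstTokens(string $rst): ?int
{
    if (!class_exists('ezcDocumentRstTokenizer')) {
        // Assumption: an autoloader for the Document component is set up
        // elsewhere. Without it there is nothing to demonstrate.
        return null;
    }

    $tokenizer = new ezcDocumentRstTokenizer();

    try {
        $tokens = $tokenizer->tokenizeString($rst);
    } catch (ezcDocumentTokenizerException $e) {
        // Exception class named as documented on this page.
        return null;
    }

    return count($tokens);
}

$tokenCount = countRstTokens("Title\n=====\n\nSome *emphasized* text.\n");
echo $tokenCount === null ? "Tokenizer unavailable or input rejected\n" : "$tokenCount tokens\n";
```

According to the method summary, tokenizeFile() takes a file name in place of the string and returns the same kind of array.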