TYPO3 4.1.0: tx_indexedsearch_lexer Class Reference

tx_indexedsearch_lexer::tx_indexedsearch_lexer ()

Constructor. Initializes the charset class, t3lib_cs.

Returns: void

Definition at line 105 of file class.lexer.php.

References t3lib_div::makeInstance().

tx_indexedsearch_lexer::split2Words ($wordString)

Splits a string into words. Used for indexing; can also be used to find words in a query.

Parameters:

	string	String with UTF-8 content to process.

Returns: array Array of words in UTF-8

Definition at line 116 of file class.lexer.php.

References addWords(), and get_word().
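
To make the idea of word splitting concrete, here is a minimal standalone sketch. It is not the TYPO3 implementation: the function name sketch_split_words is hypothetical, and it uses PCRE Unicode properties rather than get_word() and addWords(). It also ignores the printjoins setting and any special CJK handling, so a token such as "4.1" is split at the dot here.

```php
<?php
// Hypothetical stand-in, NOT the TYPO3 source: collects runs of letters
// and digits from a UTF-8 string.
function sketch_split_words($text)
{
    // \p{L} matches any letter, \p{N} any number; the "u" modifier
    // makes PCRE treat pattern and subject as UTF-8.
    $count = preg_match_all('/[\p{L}\p{N}]+/u', $text, $matches);

    // preg_match_all() returns 0 for no match and false on error.
    return $count ? $matches[0] : array();
}

print_r(sketch_split_words('Searching, indexing and TYPO3'));
```

The real method works on byte positions and hands each candidate word to addWords(), which is where CJK sequences are treated separately.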

tx_indexedsearch_lexer::addWords (&$words, &$wordString, $start, $len)

Adds a word to the word array. This function should be used to make sure CJK sequences are split up in the right way.

Parameters:

	array	Array of accumulated words
	string	Complete Input string from where to extract word
	integer	Start position of word in input string
	integer	The Length of the word string from start position

Returns: void

Definition at line 178 of file class.lexer.php.

References charType(), and utf8_ord().

Referenced by split2Words().

tx_indexedsearch_lexer::get_word (&$str, $pos = 0)

Get the first word in a given utf-8 string (initial non-letters will be skipped)

Parameters:

	string	Input string (reference)
	integer	Starting position in input string

Returns: array 0: start, 1: len; or false if no word has been found

Definition at line 239 of file class.lexer.php.

References utf8_is_letter().

Referenced by split2Words().

tx_indexedsearch_lexer::utf8_is_letter (&$str, &$len, $pos = 0)

See if a character is a letter (or a string of letters or non-letters).

Parameters:

	string	Input string (reference)
	integer	Byte-length of character sequence (reference, return value)
	integer	Starting position in input string

Returns: boolean letter (or word) found

Definition at line 264 of file class.lexer.php.

References charType(), t3lib_div::inList(), and utf8_ord().

Referenced by get_word().

tx_indexedsearch_lexer::charType ($cp)

Determines the type of a character.

Parameters:

	integer	Unicode number to evaluate

Returns: array Type of character; index 0 holds the main type: num, alpha or CJK (Chinese / Japanese / Korean)

Definition at line 329 of file class.lexer.php.

Referenced by addWords(), and utf8_is_letter().
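
The classification into num, alpha and CJK can be illustrated with a small standalone sketch. The function name sketch_char_type, the lowercase type labels, the 'other' fallback and the code point ranges below are all assumptions chosen for illustration; the ranges actually used in class.lexer.php are broader and are not reproduced here.

```php
<?php
// Hypothetical classifier, NOT the TYPO3 source. The ranges are a
// deliberately small subset picked for illustration.
function sketch_char_type($cp)
{
    // ASCII digits 0-9
    if ($cp >= 0x30 && $cp <= 0x39) {
        return array('num');
    }
    // ASCII letters A-Z and a-z
    if (($cp >= 0x41 && $cp <= 0x5A) || ($cp >= 0x61 && $cp <= 0x7A)) {
        return array('alpha');
    }
    // Hiragana and Katakana, CJK Unified Ideographs, Hangul Syllables
    if (($cp >= 0x3040 && $cp <= 0x30FF)
        || ($cp >= 0x4E00 && $cp <= 0x9FFF)
        || ($cp >= 0xAC00 && $cp <= 0xD7A3)) {
        return array('cjk');
    }
    // Anything else is treated as a non-letter in this sketch.
    return array('other');
}

$type = sketch_char_type(0x4E2D); // U+4E2D is a CJK ideograph
echo $type[0], "\n";
```

As in the documented method, the result is an array whose index 0 carries the main type, which leaves room for additional detail in later indexes.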

tx_indexedsearch_lexer::utf8_ord (&$str, &$len, $pos = 0, $hex = false)

Converts a UTF-8 multibyte character to a Unicode code point

Parameters:

	string	UTF-8 multibyte character string (reference)
	integer	The length of the character (reference, return value)
	integer	Starting position in input string
	boolean	If set, then a hex. number is returned

Returns: integer Unicode code point

Definition at line 383 of file class.lexer.php.

Referenced by addWords(), and utf8_is_letter().
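
The conversion from a UTF-8 byte sequence to a code point follows the standard UTF-8 bit layout, which the sketch below walks through. The function name sketch_utf8_ord is hypothetical and this is not the TYPO3 source: it assumes well-formed input, omits the $hex option, and returns false for a byte that cannot start a character.

```php
<?php
// Hypothetical decoder, NOT the TYPO3 source. Assumes well-formed UTF-8.
function sketch_utf8_ord($str, &$len, $pos = 0)
{
    $lead = ord($str[$pos]);

    if ($lead < 0x80) {                  // 0xxxxxxx: one byte (ASCII)
        $len = 1;
        return $lead;
    }
    if (($lead & 0xE0) === 0xC0) {       // 110xxxxx: two bytes
        $len = 2;
        $cp = $lead & 0x1F;
    } elseif (($lead & 0xF0) === 0xE0) { // 1110xxxx: three bytes
        $len = 3;
        $cp = $lead & 0x0F;
    } elseif (($lead & 0xF8) === 0xF0) { // 11110xxx: four bytes
        $len = 4;
        $cp = $lead & 0x07;
    } else {                             // continuation or invalid lead byte
        $len = 1;
        return false;
    }

    // Each continuation byte (10xxxxxx) contributes its low six bits.
    for ($i = 1; $i < $len; $i++) {
        $cp = ($cp << 6) | (ord($str[$pos + $i]) & 0x3F);
    }
    return $cp;
}

$len = 0;
// "\xC3\xA9" is U+00E9; it starts at byte offset 1 in this string.
echo sketch_utf8_ord("a\xC3\xA9", $len, 1), ' (', $len, " bytes)\n"; // prints "233 (2 bytes)"
```

Returning the byte length through the reference parameter, as the documented signature does, lets a caller advance its position by exactly one character per call.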

tx_indexedsearch_lexer::$lexerConf

Initial value:

 array(
                'printjoins' => array(  // This is the Unicode numbers of chars that are allowed INSIDE a sequence of letter chars (alphanum + CJK)
                        0x2e,   // "."
                        0x2d,   // "-"
                        0x5f,   // "_"
                        0x3a,   // ":"
                        0x2f,   // "/"
                        0x27,   // "'"
                        // 0x615,       // ARABIC SMALL HIGH TAH
                ),
                'casesensitive' => FALSE,       // Set, if case sensitive indexing is wanted.
                'removeChars' => array(         // List of unicode numbers of chars that will be removed before words are returned (eg. "-")
                        0x2d    // "-"
                )
        )

Definition at line 83 of file class.lexer.php.
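
As a rough illustration of what the removeChars setting describes, the sketch below strips the configured characters from a word. The helper sketch_remove_chars is hypothetical and is not taken from class.lexer.php. It relies on chr(), which is only valid because the code point configured here is ASCII (below 0x80).

```php
<?php
// Mirrors only the 'removeChars' part of the initial value shown above.
$conf = array(
    'removeChars' => array(
        0x2d    // "-"
    )
);

// Hypothetical helper, NOT the TYPO3 source.
function sketch_remove_chars($word, $conf)
{
    foreach ($conf['removeChars'] as $cp) {
        // chr() maps one byte value to a character, so this only works
        // for ASCII code points; other values would need a UTF-8 encoder.
        $word = str_replace(chr($cp), '', $word);
    }
    return $word;
}

echo sketch_remove_chars('e-mail', $conf), "\n"; // prints "email"
```

With the default configuration, "-" appears both in printjoins and in removeChars: it may occur inside a word, and it is removed before the word is returned.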


Public Member Functions
	tx_indexedsearch_lexer ()
	split2Words ($wordString)
	addWords (&$words, &$wordString, $start, $len)
	get_word (&$str, $pos=0)
	utf8_is_letter (&$str, &$len, $pos=0)
	charType ($cp)
	utf8_ord (&$str, &$len, $pos=0, $hex=false)
Public Attributes
	$debug = FALSE
	$debugString = ''
	$csObj
	$lexerConf

TYPO3 documentation by Ameos
