TYPO3 Documentation by Ameos

tx_indexedsearch_indexer Class Reference

Public Member Functions

 hook_indexContent (&$pObj)
 backend_initIndexer ($id, $type, $sys_language_uid, $MP, $uidRL, $cHash_array=array(), $createCHash=FALSE)
 backend_setFreeIndexUid ($freeIndexUid, $freeIndexSetId=0)
 backend_indexAsTYPO3Page ($title, $keywords, $description, $content, $charset, $mtime, $crdate=0, $recordUid=0)
 init ()
 initializeExternalParsers ()
 indexTypo3PageContent ()
 splitHTMLContent ($content)
 getHTMLcharset ($content)
 convertHTMLToUtf8 ($content, $charset='')
 embracingTags ($string, $tagName, &$tagContent, &$stringAfter, &$paramList)
 typoSearchTags (&$body)
 extractLinks ($content)
 extractHyperLinks ($string)
 indexExternalUrl ($externalUrl)
 getUrlHeaders ($url)
 indexRegularDocument ($file, $force=FALSE, $contentTmpFile='', $altExtension='')
 readFileContent ($ext, $absFile, $cPKey)
 fileContentParts ($ext, $absFile)
 splitRegularContent ($content)
 charsetEntity2utf8 (&$contentArr, $charset)
 processWordsInArrays ($contentArr)
 procesWordsInArrays ($contentArr)
 bodyDescription ($contentArr)
 indexAnalyze ($content)
 analyzeHeaderinfo (&$retArr, $content, $key, $offset)
 analyzeBody (&$retArr, $content)
 metaphone ($word, $retRaw=FALSE)
 submitPage ()
 submit_grlist ($hash, $phash_x)
 submit_section ($hash, $hash_t3)
 removeOldIndexedPages ($phash)
 submitFilePage ($hash, $file, $subinfo, $ext, $mtime, $ctime, $size, $content_md5h, $contentParts)
 submitFile_grlist ($hash)
 submitFile_section ($hash)
 removeOldIndexedFiles ($phash)
 checkMtimeTstamp ($mtime, $phash)
 checkContentHash ()
 checkExternalDocContentHash ($hashGr, $content_md5h)
 is_grlist_set ($phash_x)
 update_grlist ($phash, $phash_x)
 updateTstamp ($phash, $mtime=0)
 updateSetId ($phash)
 updateParsetime ($phash, $parsetime)
 updateRootline ()
 getRootLineFields (&$fieldArr)
 removeLoginpagesWithContentHash ()
 includeCrawlerClass ()
 checkWordList ($wl)
 submitWords ($wl, $phash)
 freqMap ($freq)
 setT3Hashes ()
 setExtHashes ($file, $subinfo=array())
 md5inthash ($str)
 makeCHash ($paramArray)
 log_push ($msg, $key)
 log_pull ()
 log_setTSlogMessage ($msg, $errorNum=0)
 fe_headerNoCache (&$params, $ref)

Public Attributes

 $reasons
 $excludeSections = 'script,style'
 $external_parsers = array()
 $defaultGrList = '0,-1'
 $tstamp_maxAge = 0
 $tstamp_minAge = 0
 $maxExternalFiles = 0
 $forceIndexing = FALSE
 $crawlerActive = FALSE
 $defaultContentArray
 $wordcount = 0
 $externalFileCounter = 0
 $conf = array()
 $indexerConfig = array()
 $hash = array()
 $file_phash_arr = array()
 $contentParts = array()
 $content_md5h = ''
 $internal_log = array()
 $indexExternalUrl_content = ''
 $cHashParams = array()
 $freqRange = 32000
 $freqMax = 0.1
 $csObj
 $metaphoneObj
 $lexerObj

Detailed Description

Definition at line 141 of file class.indexer.php.


Member Function Documentation

tx_indexedsearch_indexer::hook_indexContent ( &$pObj )

Parent Object (TSFE) Initialization

Parameters:
object Parent Object (frontend TSFE object), passed by reference
Returns:
void

Definition at line 207 of file class.indexer.php.

References $indexerConfig, indexTypo3PageContent(), init(), t3lib_extMgm::isLoaded(), log_pull(), log_push(), and log_setTSlogMessage().

tx_indexedsearch_indexer::backend_initIndexer ( id,
type,
sys_language_uid,
MP,
uidRL,
cHash_array = array(),
createCHash = FALSE 
)

Initializing the "combined ID" of the page (phash) being indexed (or for which external media is attached)

Parameters:
integer The page uid, &id=
integer The page type, &type=
integer sys_language uid, typically &L=
string The MP variable (Mount Points), &MP=
array Rootline array of only UIDs.
array Array of GET variables to register with this indexing
boolean If set, calculates a cHash value from the $cHash_array. Probably you will not do that since such cases are indexed through the frontend and the idea of this interface is to index non-cachable pages from the backend!
Returns:
void

Definition at line 308 of file class.indexer.php.

References init(), and makeCHash().

tx_indexedsearch_indexer::backend_setFreeIndexUid ( freeIndexUid,
freeIndexSetId = 0 
)

Sets the free-index uid. Can be called right after backend_initIndexer()

Parameters:
integer Free index UID
integer Set id - an integer identifying the "set" of indexing operations.
Returns:
void

Definition at line 347 of file class.indexer.php.

tx_indexedsearch_indexer::backend_indexAsTYPO3Page ( title,
keywords,
description,
content,
charset,
mtime,
crdate = 0,
recordUid = 0 
)

Indexing records as the content of a TYPO3 page.

Parameters:
string Title equivalent
string Keywords equivalent
string Description equivalent
string The main content to index
string The charset of the title, keyword, description and body-content. MUST BE VALID, otherwise nothing is indexed!
integer Last modification time, in seconds
integer The creation date of the content, in seconds
integer The record UID that the content comes from (for registration with the indexed rows)
Returns:
void

Definition at line 365 of file class.indexer.php.

References indexTypo3PageContent().
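
The three backend_* methods documented above are meant to be called in sequence. The following is a minimal sketch of that call order, not code taken from class.indexer.php. RecordingIndexerStub is a hypothetical stand-in that only records calls so the example runs outside TYPO3, and every argument value (page uid 42, the rootline, the GET parameter name) is made up for illustration. In a real installation the object would be an instance of tx_indexedsearch_indexer.

```php
<?php
// Hypothetical stand-in for tx_indexedsearch_indexer: it only records calls,
// so the call order can be shown without a TYPO3 installation.
class RecordingIndexerStub
{
    public $calls = array();

    public function backend_initIndexer($id, $type, $sys_language_uid, $MP, $uidRL, $cHash_array = array(), $createCHash = false)
    {
        $this->calls[] = 'backend_initIndexer';
    }

    public function backend_setFreeIndexUid($freeIndexUid, $freeIndexSetId = 0)
    {
        $this->calls[] = 'backend_setFreeIndexUid';
    }

    public function backend_indexAsTYPO3Page($title, $keywords, $description, $content, $charset, $mtime, $crdate = 0, $recordUid = 0)
    {
        $this->calls[] = 'backend_indexAsTYPO3Page';
    }
}

$indexer = new RecordingIndexerStub();

// 1. Identify the page the record belongs to (all values are examples).
$indexer->backend_initIndexer(42, 0, 0, '', array(1, 7, 42), array('tx_myext[uid]' => 5), false);

// 2. Optionally register the free-index uid right after initialization.
$indexer->backend_setFreeIndexUid(3, 1);

// 3. Hand over the record; the charset must be valid or nothing is indexed.
$indexer->backend_indexAsTYPO3Page(
    'Record title',
    'keyword one, keyword two',
    'Short description',
    'Main body text of the record',
    'utf-8',
    time(),
    0,
    5
);
```

The stub keeps the parameter lists identical to the signatures listed in this reference, so the calls can be copied to a real indexer object with only the values changed.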

tx_indexedsearch_indexer::init (  ) 

Initializes the object. $this->conf MUST be set with proper values prior to this call!!!

Returns:
void

Definition at line 416 of file class.indexer.php.

References t3lib_div::getUserObj(), initializeExternalParsers(), t3lib_div::intInRange(), t3lib_div::makeInstance(), metaphone(), and setT3Hashes().

Referenced by backend_initIndexer(), and hook_indexContent().

tx_indexedsearch_indexer::initializeExternalParsers (  ) 

Initialize external parsers

Returns:
void (private)
See also:
init()

Definition at line 468 of file class.indexer.php.

References t3lib_div::getUserObj().

Referenced by init().

tx_indexedsearch_indexer::indexTypo3PageContent (  ) 

Start indexing of the TYPO3 page

Returns:
void

Definition at line 509 of file class.indexer.php.

References charsetEntity2utf8(), checkContentHash(), checkMtimeTstamp(), checkWordList(), extractLinks(), indexAnalyze(), is_grlist_set(), log_pull(), log_push(), log_setTSlogMessage(), md5inthash(), t3lib_div::milliseconds(), processWordsInArrays(), splitHTMLContent(), submitPage(), submitWords(), update_grlist(), updateParsetime(), updateRootline(), updateSetId(), and updateTstamp().

Referenced by backend_indexAsTYPO3Page(), and hook_indexContent().

tx_indexedsearch_indexer::splitHTMLContent ( content  ) 

Splits HTML content and returns an associative array, with title, a list of metatags, and a list of words in the body.

Parameters:
string HTML content to index. To some degree expected to be made by TYPO3 (i.e. splitting the header by ":")
Returns:
array Array of content, having keys "title", "body", "keywords" and "description" set.
See also:
splitRegularContent()

Definition at line 596 of file class.indexer.php.

References embracingTags(), t3lib_div::get_tag_attributes(), and typoSearchTags().

Referenced by indexTypo3PageContent().

tx_indexedsearch_indexer::getHTMLcharset ( content  ) 

Extract the charset value from HTML meta tag.

Parameters:
string HTML content
Returns:
string The charset value if found.

Definition at line 642 of file class.indexer.php.

Referenced by convertHTMLToUtf8().
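
As an illustration of the charset lookup described for getHTMLcharset(), here is a minimal standalone sketch. It assumes the charset is declared in a Content-Type meta tag. The function name getHtmlCharsetSketch and both regular expressions are illustrative assumptions, not the patterns used in class.indexer.php.

```php
<?php
// Sketch only: find a Content-Type <meta> tag and pull out its charset value.
function getHtmlCharsetSketch(string $content): string
{
    $metaPattern = '/<meta\s+[^>]*http-equiv\s*=\s*["\']content-type["\'][^>]*>/i';
    if (preg_match($metaPattern, $content, $metaTag)) {
        // Search only inside the matched tag, not the whole document.
        if (preg_match('/charset\s*=\s*([a-z0-9_-]+)/i', $metaTag[0], $charset)) {
            return $charset[1];
        }
    }
    return ''; // no charset declared
}

$html = '<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"></head></html>';
$charset = getHtmlCharsetSketch($html);
```

Returning an empty string when nothing is found lets a caller fall back to a default charset, which matches the optional $charset argument of convertHTMLToUtf8().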

tx_indexedsearch_indexer::convertHTMLToUtf8 ( content,
charset = '' 
)

Converts an HTML document to UTF-8.

Parameters:
string HTML content, any charset
string Optional charset (otherwise extracted from HTML)
Returns:
string Converted HTML

Definition at line 657 of file class.indexer.php.

References getHTMLcharset().

tx_indexedsearch_indexer::embracingTags ( string,
tagName,
&$tagContent,
&$stringAfter,
&$paramList 
)

Finds the first occurrence of embracing tags and returns the embraced content and the original string with the tag removed in the two passed variables. Returns false if no match is found. Useful, for example, for finding the <title> of a document or removing <script> sections.

Parameters:
string String to search in
string Tag name, eg. "script"
string Passed by reference: Content inside found tag
string Passed by reference: Content after found tag
string Passed by reference: Attributes of the found tag.
Returns:
boolean Returns false if tag was not found, otherwise true.

Definition at line 685 of file class.indexer.php.

Referenced by splitHTMLContent().
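
The by-reference contract of embracingTags() is easier to see in code. The sketch below is a simplified standalone version written for this reference, not the method body from class.indexer.php. The name embracingTagsSketch is hypothetical, and the handling of an unclosed tag (everything after the opening tag counts as content) is an assumption.

```php
<?php
// Sketch: locate the first <tagName ...>...</tagName> pair, case-insensitively.
function embracingTagsSketch(string $string, string $tagName, &$tagContent, &$stringAfter, &$paramList): bool
{
    $startTag = '<' . $tagName;
    $endTag = '</' . $tagName . '>';

    $start = stripos($string, $startTag);
    if ($start === false) {
        return false; // tag not present
    }
    $openEnd = strpos($string, '>', $start);
    if ($openEnd === false) {
        return false; // opening tag never closed with ">"
    }
    // Attributes are whatever sits between "<tagName" and ">".
    $attrStart = $start + strlen($startTag);
    $paramList = trim(substr($string, $attrStart, $openEnd - $attrStart));

    $close = stripos($string, $endTag, $openEnd + 1);
    if ($close === false) {
        // Assumption: without an end tag, the rest of the string is the content.
        $tagContent = substr($string, $openEnd + 1);
        $stringAfter = substr($string, 0, $start);
    } else {
        $tagContent = substr($string, $openEnd + 1, $close - $openEnd - 1);
        $stringAfter = substr($string, 0, $start) . substr($string, $close + strlen($endTag));
    }
    return true;
}

$html = '<html><head><title lang="en">Hello</title></head><body>x</body></html>';
$found = embracingTagsSketch($html, 'title', $title, $rest, $attributes);
```

Note that this sketch matches on the prefix "<title", so a tag whose name merely starts with the given name would also match; a stricter version would check the character that follows the tag name.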

tx_indexedsearch_indexer::typoSearchTags ( &$body )

Removes content that shouldn't be indexed according to TYPO3SEARCH-tags.

Parameters:
string HTML Content, passed by reference
Returns:
boolean Returns true if a TYPO3SEARCH_ tag was found, otherwise false.

Definition at line 712 of file class.indexer.php.

Referenced by splitHTMLContent().
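
The effect of typoSearchTags() can be sketched as follows, assuming the markers are HTML comments of the form <!--TYPO3SEARCH_begin--> and <!--TYPO3SEARCH_end-->. The function name typoSearchTagsSketch is hypothetical and the code is a simplified illustration rather than the method from class.indexer.php: only text between a begin marker and the next marker is kept.

```php
<?php
// Sketch: keep only the text that follows a TYPO3SEARCH_begin marker,
// up to the next marker. Returns false (body untouched) if no marker exists.
function typoSearchTagsSketch(string &$body): bool
{
    $parts = preg_split('/<!--\s?TYPO3SEARCH_/', $body);
    if (count($parts) < 2) {
        return false;
    }
    array_shift($parts); // text before the first marker is never indexed

    $kept = '';
    foreach ($parts as $part) {
        // Each part now starts with "begin-->..." or "end-->...".
        $pieces = explode('-->', $part, 2);
        if (trim($pieces[0]) === 'begin') {
            $kept .= $pieces[1] ?? '';
        }
    }
    $body = $kept;
    return true;
}

$body = 'menu<!--TYPO3SEARCH_begin-->article text<!--TYPO3SEARCH_end-->footer';
$hadMarkers = typoSearchTagsSketch($body);
```

Because the body is passed by reference, the caller sees the filtered text directly and can use the boolean only to decide whether filtering took place.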

tx_indexedsearch_indexer::extractLinks ( content  ) 

Extract links (hrefs) from HTML content and if indexable media is found, it is indexed.

Parameters:
string HTML content
Returns:
void

Definition at line 741 of file class.indexer.php.

References extractHyperLinks(), t3lib_div::getFileAbsFileName(), t3lib_div::htmlspecialchars_decode(), includeCrawlerClass(), indexExternalUrl(), indexRegularDocument(), t3lib_div::isAllowedAbsPath(), t3lib_extMgm::isLoaded(), log_setTSlogMessage(), and t3lib_div::makeInstance().

Referenced by indexTypo3PageContent().

tx_indexedsearch_indexer::extractHyperLinks ( string  ) 

Extracts all links to external documents from content string.

Parameters:
string Content to analyse
Returns:
array Array of hyperlinks
See also:
extractLinks()

Definition at line 827 of file class.indexer.php.

References t3lib_div::makeInstance(), and t3lib_div::shortMD5().

Referenced by extractLinks().

tx_indexedsearch_indexer::indexExternalUrl ( externalUrl  ) 

Indexes the HTML content of an external URL.

Parameters:
string URL, eg. "http://typo3.org/"
Returns:
void
See also:
indexRegularDocument()

Definition at line 886 of file class.indexer.php.

References getUrlHeaders(), indexRegularDocument(), t3lib_div::tempnam(), and t3lib_div::writeFile().

Referenced by extractLinks().

tx_indexedsearch_indexer::getUrlHeaders ( url  ) 

Gets the HTTP response headers of a URL.

Parameters:
string The URL
integer Timeout, presumably in seconds (listed in the source comment, but not part of the signature shown above)
Returns:
mixed If no answer, returns false. Otherwise an array where HTTP headers are keys

Definition at line 917 of file class.indexer.php.

References t3lib_div::getURL(), and t3lib_div::trimExplode().

Referenced by indexExternalUrl().
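
The documented return value of getUrlHeaders() (false on no answer, otherwise an array keyed by header name) can be illustrated without any network access. The sketch below parses a raw header block that is assumed to have been fetched already. The name parseHeaderBlockSketch is hypothetical, and skipping the status line is a choice made for this example.

```php
<?php
// Sketch: turn a raw HTTP header block into an array keyed by header name.
function parseHeaderBlockSketch(string $raw)
{
    if ($raw === '') {
        return false; // mirrors the documented "no answer" case
    }
    $headers = array();
    foreach (preg_split('/\r\n|\n|\r/', $raw) as $line) {
        if (trim($line) === '') {
            break; // an empty line ends the header block
        }
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            $headers[trim($pair[0])] = trim($pair[1]);
        }
        // Lines without a colon (the status line) are skipped.
    }
    return $headers;
}

$raw = "HTTP/1.1 200 OK\r\nContent-Type: text/html; charset=utf-8\r\nContent-Length: 1234\r\n\r\n<html>";
$headers = parseHeaderBlockSketch($raw);
```

A caller such as indexExternalUrl() could then inspect the Content-Type entry to decide whether the document is HTML before indexing it.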

tx_indexedsearch_indexer::indexRegularDocument ( file,
force = FALSE,
contentTmpFile = '',
altExtension = '' 
)

Indexing a regular document given as $file (relative to PATH_site, local file)

Parameters:
string Relative Filename, relative to PATH_site. It can also be an absolute path as long as it is inside the lockRootPath (validated with t3lib_div::isAbsPath()). Finally, if $contentTmpFile is set, this value can be anything, most likely a URL
boolean If set, indexing is forced (despite content hashes, mtime etc).
string Temporary file with the content to read it from (instead of $file). Used when the $file is a URL.
string File extension for temporary file.
Returns:
void

Definition at line 963 of file class.indexer.php.

References $content_md5h, $contentParts, checkExternalDocContentHash(), checkMtimeTstamp(), checkWordList(), fileContentParts(), t3lib_div::getFileAbsFileName(), indexAnalyze(), t3lib_div::isAbsPath(), t3lib_div::isAllowedAbsPath(), log_pull(), log_push(), log_setTSlogMessage(), md5inthash(), t3lib_div::milliseconds(), processWordsInArrays(), readFileContent(), setExtHashes(), submitFile_section(), submitFilePage(), submitWords(), updateParsetime(), and updateTstamp().

Referenced by extractLinks(), and indexExternalUrl().

tx_indexedsearch_indexer::readFileContent ( ext,
absFile,
cPKey 
)

Reads the content of an external file being indexed. The content from the external parser MUST be returned in utf-8!

Parameters:
string File extension, eg. "pdf", "doc" etc.
string Absolute filename of file (must exist and be validated OK before calling function)
string Pointer to section (zero for all types other than PDF, which will have an indication of the pages into which the document should be split.)
Returns:
array Standard content array (title, description, keywords, body keys)

Definition at line 1069 of file class.indexer.php.

Referenced by indexRegularDocument().

tx_indexedsearch_indexer::fileContentParts ( ext,
absFile 
)

Creates an array with pointers to divisions of document.

Parameters:
string File extension
string Absolute filename (must exist and be validated OK before calling function)
Returns:
array Array of pointers to sections that the document should be divided into

Definition at line 1086 of file class.indexer.php.

Referenced by indexRegularDocument().

tx_indexedsearch_indexer::splitRegularContent ( content  ) 

Splits non-HTML content (from external files for instance)

Parameters:
string Input content (non-HTML) to index.
Returns:
array Array of content, having the key "body" set (plus "title", "description" and "keywords", but empty)
See also:
splitHTMLContent()

Definition at line 1104 of file class.indexer.php.

tx_indexedsearch_indexer::charsetEntity2utf8 ( &$contentArr,
charset 
)

Convert character set and HTML entities in the value of input content array keys

Parameters:
array Standard content array
string Charset of the input content (converted to utf-8)
Returns:
void

Definition at line 1137 of file class.indexer.php.

Referenced by indexTypo3PageContent().

tx_indexedsearch_indexer::processWordsInArrays ( contentArr  ) 

Processes the words in the array returned by the split*Content functions.

Parameters:
array Array of content to index, see splitHTMLContent() and splitRegularContent()
Returns:
array Content input array modified so each key is now a unique array of words

Definition at line 1160 of file class.indexer.php.

Referenced by indexRegularDocument(), indexTypo3PageContent(), and procesWordsInArrays().

tx_indexedsearch_indexer::procesWordsInArrays ( contentArr  ) 

Processes the words in the array returned by the split*Content functions. This function is only a wrapper kept under the old, misspelled name; it forwards to processWordsInArrays() (see above).

Parameters:
array Array of content to index, see splitHTMLContent() and splitRegularContent()
Returns:
array Content input array modified so each key is now a unique array of words
Deprecated:

Definition at line 1185 of file class.indexer.php.

References processWordsInArrays().

tx_indexedsearch_indexer::bodyDescription ( contentArr  ) 

Extracts the sample description text from the content array.

Parameters:
array Content array
Returns:
string Description string

Definition at line 1195 of file class.indexer.php.

References t3lib_div::intInRange().

Referenced by submitFilePage(), and submitPage().

tx_indexedsearch_indexer::indexAnalyze ( content  ) 

Analyzes content to use for indexing.

Parameters:
array Standard content array: an array with the keys title,keywords,description and body, which all contain an array of words.
Returns:
array Index Array (whatever that is...)

Definition at line 1217 of file class.indexer.php.

References analyzeBody(), and analyzeHeaderinfo().

Referenced by indexRegularDocument(), and indexTypo3PageContent().

tx_indexedsearch_indexer::analyzeHeaderinfo ( &$retArr,
content,
key,
offset 
)

Calculates relevant information for headercontent

Parameters:
array Index array, passed by reference
array Standard content array
string Key from standard content array
integer Bit-wise priority to type
Returns:
void

Definition at line 1238 of file class.indexer.php.

References metaphone().

Referenced by indexAnalyze().

tx_indexedsearch_indexer::analyzeBody ( &$retArr,
content 
)

Calculates relevant information for bodycontent

Parameters:
array Index array, passed by reference
array Standard content array
Returns:
void

Definition at line 1257 of file class.indexer.php.

References metaphone().

Referenced by indexAnalyze().

tx_indexedsearch_indexer::metaphone ( word,
retRaw = FALSE 
)

Creating metaphone based hash from input word

Parameters:
string Word to convert
boolean If set, returns the raw metaphone value (not hashed)
Returns:
mixed Metaphone hash integer (or raw value, string)

Definition at line 1277 of file class.indexer.php.

Referenced by analyzeBody(), analyzeHeaderinfo(), checkWordList(), and init().
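
To show what a "metaphone based hash" can look like, the sketch below combines PHP's built-in metaphone() with a 28-bit MD5-derived integer. This is an assumption about the general technique, not the method body from class.indexer.php, which may also delegate to $metaphoneObj. The name metaphoneHashSketch and the choice to return 0 for an empty phonetic key are illustrative.

```php
<?php
// Sketch: phonetic key via PHP's metaphone(), optionally reduced to an integer.
function metaphoneHashSketch(string $word, bool $retRaw = false)
{
    $raw = metaphone($word); // built-in phonetic key, e.g. for matching sound-alikes
    if ($retRaw) {
        return $raw; // raw string, not hashed
    }
    if ($raw === '') {
        return 0; // assumption: no phonetic key means no hash
    }
    // 7 hex digits = 28 bits, so the value fits a signed 32-bit integer.
    return (int) hexdec(substr(md5($raw), 0, 7));
}

$rawKey = metaphoneHashSketch('indexing', true);
$hashed = metaphoneHashSketch('indexing');
```

Storing the integer rather than the string keeps the word table compact, at the cost of possible collisions between unrelated phonetic keys.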

tx_indexedsearch_indexer::submitPage (  ) 

Updates db with information about the page (TYPO3 page, not external media)

Returns:
void

Definition at line 1319 of file class.indexer.php.

References bodyDescription(), removeOldIndexedPages(), submit_grlist(), and submit_section().

Referenced by indexTypo3PageContent().

tx_indexedsearch_indexer::submit_grlist ( hash,
phash_x 
)

Stores gr_list in the database.

Parameters:
integer Search result record phash
integer Actual phash of current content
Returns:
void
See also:
update_grlist()

Definition at line 1393 of file class.indexer.php.

References $hash, and md5inthash().

Referenced by submitFile_grlist(), submitPage(), and update_grlist().

tx_indexedsearch_indexer::submit_section ( hash,
hash_t3 
)

Stores the section. $hash and $hash_t3 are the same for TYPO3 pages, but different for external files.

Parameters:
integer phash of TYPO3 parent search result record
integer phash of the file indexation search record
Returns:
void

Definition at line 1413 of file class.indexer.php.

References $hash, and getRootLineFields().

Referenced by submitFile_section(), and submitPage().

tx_indexedsearch_indexer::removeOldIndexedPages ( phash  ) 

Removes records for the indexed page, $phash

Parameters:
integer phash value to flush
Returns:
void

Definition at line 1431 of file class.indexer.php.

Referenced by removeLoginpagesWithContentHash(), and submitPage().

tx_indexedsearch_indexer::submitFilePage ( hash,
file,
subinfo,
ext,
mtime,
ctime,
size,
content_md5h,
contentParts 
)

Updates db with information about the file

Parameters:
array Array with phash and phash_grouping keys for file
string File name
array Array of "cHashParams" for files: this is, for instance, the page index for a PDF file (for other document types it will be zero)
string File extension determining the type of media.
integer Modification time of file.
integer Creation time of file.
integer Size of file in bytes
integer Content HASH value.
array Standard content array (using only title and body for a file)
Returns:
void

Definition at line 1474 of file class.indexer.php.

References $content_md5h, $contentParts, $hash, bodyDescription(), and removeOldIndexedFiles().

Referenced by indexRegularDocument().

tx_indexedsearch_indexer::submitFile_grlist ( hash  ) 

Stores file gr_list for a file IF it does not exist already

Parameters:
integer phash value of file
Returns:
void

Definition at line 1540 of file class.indexer.php.

References $hash, md5inthash(), and submit_grlist().

tx_indexedsearch_indexer::submitFile_section ( hash  ) 

Stores file section for a file IF it does not exist

Parameters:
integer phash value of file
Returns:
void

Definition at line 1554 of file class.indexer.php.

References $hash, and submit_section().

Referenced by indexRegularDocument().

tx_indexedsearch_indexer::removeOldIndexedFiles ( phash  ) 

Removes records for the indexed file, $phash

Parameters:
integer phash value to flush
Returns:
void

Definition at line 1568 of file class.indexer.php.

Referenced by submitFilePage().

tx_indexedsearch_indexer::checkMtimeTstamp ( mtime,
phash 
)

Checks the mtime / tstamp of the currently indexed page/file (based on phash). Returns a positive integer if the page needs to be indexed.

Parameters:
integer mtime value to test against limits and indexed page (usually this is the mtime of the cached document)
integer "phash" used to select any already indexed page to see what its mtime is.
Returns:
integer Result code. Generally, <0 means no indexing and >0 means indexing (see $this->reasons):
-2: Min age was NOT exceeded, so indexing cannot occur.
-1: mtime matched, so there is no need to reindex the page.
0: N/A
1: Max age exceeded; the page must be indexed again.
2: mtime of the indexed page does not match the mtime given for the current content, so the page must be indexed.
3: No mtime was set, so the page is indexed.
4: No indexed page was found, so the page is indexed.

Definition at line 1604 of file class.indexer.php.

References log_setTSlogMessage(), and updateTstamp().

Referenced by indexRegularDocument(), and indexTypo3PageContent().
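
The result codes of checkMtimeTstamp() can be read as a small decision tree. The sketch below is one ordering of the checks that is consistent with the documented codes and with the $reasons messages; it is not the method body from class.indexer.php. The function name, the explicit $now argument and the row keys 'tstamp' and 'item_mtime' are assumptions made so the example is self-contained, and the database lookup and the updateTstamp() call are left out.

```php
<?php
// Sketch: $row is the already indexed record, or null if none exists.
// Assumed keys: 'tstamp' (time of last indexing), 'item_mtime' (mtime stored then).
function checkMtimeTstampSketch(int $mtime, ?array $row, int $minAge, int $maxAge, int $now): int
{
    if ($row === null) {
        return 4; // never indexed before
    }
    if ($maxAge > 0 && $row['tstamp'] + $maxAge < $now) {
        return 1; // max age exceeded, always reindex
    }
    if ($minAge > 0 && $row['tstamp'] + $minAge >= $now) {
        return -2; // min age not yet exceeded, do not reindex
    }
    if ($mtime === 0) {
        return 3; // no mtime given, reindex to be safe
    }
    // Reindex only if the content's mtime differs from the stored one.
    return $row['item_mtime'] !== $mtime ? 2 : -1;
}

$now = 1000;
$row = array('tstamp' => 100, 'item_mtime' => 50);
$code = checkMtimeTstampSketch(60, $row, 0, 0, $now);
```

A caller would treat any positive code as "index now" and could look the code up in $reasons to log why.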

tx_indexedsearch_indexer::checkContentHash (  ) 

Check content hash in phash table

Returns:
mixed Returns true if the page needs to be indexed (that is, there was no result), otherwise the phash value (in an array) of the phash record to which the grlist_record should be related!

Definition at line 1640 of file class.indexer.php.

Referenced by indexTypo3PageContent().

tx_indexedsearch_indexer::checkExternalDocContentHash ( hashGr,
content_md5h 
)

Checks the content hash for external documents. Returns true if the document needs to be indexed (that is, there was no result).

Parameters:
integer phash value to check (phash_grouping)
integer Content hash to check
Returns:
boolean Returns true if the document needs to be indexed (that is, there was no result)

Definition at line 1657 of file class.indexer.php.

References $content_md5h.

Referenced by indexRegularDocument().

tx_indexedsearch_indexer::is_grlist_set ( phash_x  ) 

Checks if a grlist record has been set for the phash value input (looking at the "real" phash of the current content, not the linked-to phash of the common search result page)

Parameters:
integer Phash integer to test.
Returns:
void

Definition at line 1671 of file class.indexer.php.

Referenced by indexTypo3PageContent().

tx_indexedsearch_indexer::update_grlist ( phash,
phash_x 
)

Checks if a grlist entry for this hash exists and, if not, writes one.

Parameters:
integer phash of the search result that should be found
integer The real phash of the current content. The two values are different when a page with userlogin turns out to contain the exact same content as another already indexed version of the page; This is the whole reason for the grlist table in fact...
Returns:
void
See also:
submit_grlist()

Definition at line 1684 of file class.indexer.php.

References log_setTSlogMessage(), md5inthash(), and submit_grlist().

Referenced by indexTypo3PageContent().

tx_indexedsearch_indexer::updateTstamp ( phash,
mtime = 0 
)

Update tstamp for a phash row.

Parameters:
integer phash value
integer If set, update the mtime field to this value.
Returns:
void

Definition at line 1699 of file class.indexer.php.

Referenced by checkMtimeTstamp(), indexRegularDocument(), and indexTypo3PageContent().

tx_indexedsearch_indexer::updateSetId ( phash  ) 

Update SetID of the index_phash record.

Parameters:
integer phash value
Returns:
void

Definition at line 1714 of file class.indexer.php.

Referenced by indexTypo3PageContent().

tx_indexedsearch_indexer::updateParsetime ( phash,
parsetime 
)

Update parsetime for phash row.

Parameters:
integer phash value.
integer Parsetime value to set.
Returns:
void

Definition at line 1729 of file class.indexer.php.

Referenced by indexRegularDocument(), and indexTypo3PageContent().

tx_indexedsearch_indexer::updateRootline (  ) 

Update section rootline for the page

Returns:
void

Definition at line 1742 of file class.indexer.php.

References getRootLineFields().

Referenced by indexTypo3PageContent().

tx_indexedsearch_indexer::getRootLineFields ( &$fieldArr )

Adding values for root-line fields. rl0, rl1 and rl2 are standard. A hook might add more.

Parameters:
array Field array, passed by reference
Returns:
void

Definition at line 1757 of file class.indexer.php.

Referenced by submit_section(), and updateRootline().

tx_indexedsearch_indexer::removeLoginpagesWithContentHash (  ) 

Removes any indexed pages with userlogins which have the same contentHash. NOT USED anywhere inside this class!

Returns:
void

Definition at line 1776 of file class.indexer.php.

References log_setTSlogMessage(), and removeOldIndexedPages().

tx_indexedsearch_indexer::includeCrawlerClass (  ) 

Includes the crawler class

Returns:
void

Definition at line 1793 of file class.indexer.php.

References t3lib_extMgm::extPath().

Referenced by extractLinks().

tx_indexedsearch_indexer::checkWordList ( wl  ) 

Adds new words to db

Parameters:
array Word List array (where each word has information about position etc).
Returns:
void

Definition at line 1820 of file class.indexer.php.

References log_setTSlogMessage(), and metaphone().

Referenced by indexRegularDocument(), and indexTypo3PageContent().

tx_indexedsearch_indexer::submitWords ( wl,
phash 
)

Submits RELATIONS between words and phash

Parameters:
array Word list array
integer phash value
Returns:
void

Definition at line 1857 of file class.indexer.php.

References freqMap().

Referenced by indexRegularDocument(), and indexTypo3PageContent().

tx_indexedsearch_indexer::freqMap ( freq  ) 

Maps a frequency from a real number in [0;1] to an integer in [0;$this->freqRange], with anything above $this->freqMax treated as 1, and back.

Parameters:
double Frequency
Returns:
integer Frequency in range.

Definition at line 1881 of file class.indexer.php.

Referenced by submitWords().
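
One way to read the forward direction of this mapping is as a clamp followed by a linear scale. The sketch below uses the default values listed under Public Attributes ($freqRange = 32000, $freqMax = 0.1). The exact formula in class.indexer.php may differ, so freqMapSketch and its rounding are assumptions, and the reverse direction ("and back") is not covered.

```php
<?php
// Sketch: clamp the frequency at $freqMax, then scale linearly to [0, $freqRange].
// Assumes $freq is already in [0, 1].
function freqMapSketch(float $freq, int $freqRange = 32000, float $freqMax = 0.1): int
{
    $clamped = min($freq, $freqMax) / $freqMax; // value in [0, 1]
    return (int) round($clamped * $freqRange);
}

$half = freqMapSketch(0.05); // half of freqMax
$full = freqMapSketch(0.5);  // above freqMax, treated as the maximum
```

With these defaults, a word making up 10% or more of a document already reaches the top of the integer range, which keeps resolution for the much more common low frequencies.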

tx_indexedsearch_indexer::setT3Hashes (  ) 

Get search hash, T3 pages

Returns:
void

Definition at line 1914 of file class.indexer.php.

References md5inthash().

Referenced by init().

tx_indexedsearch_indexer::setExtHashes ( file,
subinfo = array() 
)

Get search hash, external files

Parameters:
string File name / path which identifies it on the server
array Additional content identifying the (subpart of) content. For instance; PDF files are divided into groups of pages for indexing.
Returns:
array Array with "phash_grouping" and "phash" inside.

Definition at line 1940 of file class.indexer.php.

References $hash, and md5inthash().

Referenced by indexRegularDocument().
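
The relationship between "phash_grouping" and "phash" can be sketched like this: the grouping hash depends only on the file, while the full hash also covers the sub-information (for example a PDF page range). This is a guess at the structure based on the description above, not the method body from class.indexer.php. The names setExtHashesSketch and intHashForSketch, the use of serialize(), and the 'key' entry in $subinfo are all assumptions.

```php
<?php
// Hypothetical helper: 28-bit integer taken from the first 7 hex digits of an MD5.
function intHashForSketch(string $str): int
{
    return (int) hexdec(substr(md5($str), 0, 7));
}

// Sketch: one hash for the file as a whole, one for the file plus its subpart.
function setExtHashesSketch(string $file, array $subinfo = array()): array
{
    $identity = array('file' => $file);
    $hash = array();
    $hash['phash_grouping'] = intHashForSketch(serialize($identity));

    $identity['subinfo'] = $subinfo; // e.g. which pages of a PDF
    $hash['phash'] = intHashForSketch(serialize($identity));
    return $hash;
}

$pages1to5 = setExtHashesSketch('fileadmin/manual.pdf', array('key' => '1-5'));
$pages6to10 = setExtHashesSketch('fileadmin/manual.pdf', array('key' => '6-10'));
```

Both results share the same phash_grouping, which is what allows search results for several page ranges of one document to be grouped under a single file.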

tx_indexedsearch_indexer::md5inthash ( str  ) 

md5 integer hash. Uses 7 hex digits instead of 8 because that keeps the integers below 32 bit (28 bit), so they do not interfere with UNSIGNED integers or with PHP versions which have varying output from the hexdec function.

Parameters:
string String to hash
Returns:
integer Integer interpretation of the md5 hash of input string.

Definition at line 1964 of file class.indexer.php.

Referenced by indexRegularDocument(), indexTypo3PageContent(), setExtHashes(), setT3Hashes(), submit_grlist(), submitFile_grlist(), and update_grlist().
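
The "7 instead of 8" remark refers to hex digits: 7 digits carry 28 bits. A minimal sketch of such a hash, written for this reference under the assumption that the leading digits of the MD5 digest are used (the name md5IntHashSketch is hypothetical):

```php
<?php
// Sketch: integer from the first 7 hex digits of the MD5 digest (7 * 4 = 28 bits).
function md5IntHashSketch(string $str): int
{
    return (int) hexdec(substr(md5($str), 0, 7));
}

$hash = md5IntHashSketch('some page identifier');
```

Since the largest possible value is 0xfffffff (268435455), the result is always a plain integer, never a float, even on 32-bit PHP builds.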

tx_indexedsearch_indexer::makeCHash ( paramArray  ) 

Calculates the cHash value of input GET array (for constructing cHash values if needed)

Parameters:
array Array of GET parameters to encode
Returns:
void

Definition at line 1974 of file class.indexer.php.

References t3lib_div::cHashParams(), t3lib_div::implodeArrayForUrl(), and t3lib_div::shortMD5().

Referenced by backend_initIndexer().

tx_indexedsearch_indexer::log_push ( msg,
key 
)

Push function wrapper for TT logging

Parameters:
string Title to set
string Key (?)
Returns:
void

Definition at line 2006 of file class.indexer.php.

Referenced by hook_indexContent(), indexRegularDocument(), and indexTypo3PageContent().

tx_indexedsearch_indexer::log_pull (  ) 

Pull function wrapper for TT logging

Returns:
void

Definition at line 2015 of file class.indexer.php.

Referenced by hook_indexContent(), indexRegularDocument(), and indexTypo3PageContent().

tx_indexedsearch_indexer::log_setTSlogMessage ( msg,
errorNum = 0 
)

Set log message function wrapper for TT logging

Parameters:
string Message to set
integer Error number
Returns:
void

Definition at line 2026 of file class.indexer.php.

Referenced by checkMtimeTstamp(), checkWordList(), extractLinks(), hook_indexContent(), indexRegularDocument(), indexTypo3PageContent(), removeLoginpagesWithContentHash(), and update_grlist().

tx_indexedsearch_indexer::fe_headerNoCache ( &$params,
ref 
)

Frontend hook: If the page is not being re-generated this is our chance to force it to be (because re-generation of the page is required in order to have the indexer called!)

Parameters:
array Parameters from frontend
object TSFE object (reference under PHP5)
Returns:
void

Definition at line 2051 of file class.indexer.php.

References t3lib_extMgm::isLoaded().


Member Data Documentation

tx_indexedsearch_indexer::$reasons

Initial value:

array(
    -1 => 'mtime matched the document, so no changes detected and no content updated',
    -2 => 'The minimum age was not exceeded',
    1 => "The configured max-age was exceeded for the document and thus it's indexed.",
    2 => 'The minimum age was exceed and mtime was set and the mtime was different, so the page was indexed.',
    3 => 'The minimum age was exceed, but mtime was not set, so the page was indexed.',
    4 => 'Page has never been indexed (is not represented in the index_phash table).'
)

Definition at line 144 of file class.indexer.php.

tx_indexedsearch_indexer::$defaultContentArray

Initial value:

array(
    'title' => '',
    'description' => '',
    'keywords' => '',
    'body' => '',
)

Definition at line 171 of file class.indexer.php.


The documentation for this class was generated from the following file:

class.indexer.php

Generated by Les experts TYPO3 with doxygen 1.4.6