Getting Started with Zend_Search_Lucene. Zend_Search_Lucene Introduction

Zend, “Getting Started with Zend_Search_Lucene. Zend_Search_Lucene Introduction”, public translation into Russian from English.

The Zend_Search_Lucene component is intended to provide a ready-to-use full-text search solution. It doesn't require any PHP extensions (the UTF-8 analyzer is the exception: it needs mbstring) or additional software to be installed, and can be used immediately after Zend Framework installation.

Zend_Search_Lucene is a pure PHP port of Apache Lucene, the popular open source full-text search engine. See http://lucene.apache.org/ for details.

Information must be indexed before it can be searched. Both Zend_Search_Lucene and Java Lucene are built around the concept of a document, which serves as the "atomic indexing item."

Each document is a set of fields: <name, value> pairs where name and value are UTF-8 strings. Any subset of the document fields may be marked as "indexed" to include field data in the text indexing process.
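
The document model described above can be pictured with a small toy sketch. It is written in Python purely for illustration and is not the Zend_Search_Lucene API; the `Field` class, the `indexed_values` helper, and the sample field names and values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """One <name, value> pair of a document (hypothetical toy type)."""
    name: str
    value: str
    indexed: bool  # whether the value takes part in text indexing


def indexed_values(document):
    """Return {field name: value} for the fields marked as indexed."""
    return {f.name: f.value for f in document if f.indexed}


# A document is simply a collection of fields; the sample data is made up.
document = [
    Field("title", "Getting started", indexed=True),
    Field("body", "Information must be indexed to be searchable", indexed=True),
    Field("url", "/docs/getting-started", indexed=False),
]

searchable = indexed_values(document)
```

Here only `title` and `body` would feed the indexing process, while `url` rides along with the document without being searchable.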

Field values may or may not be tokenized while indexing. If a field is not tokenized, then the field value is stored as one term; otherwise, the current analyzer is used for tokenization.

Several analyzers are provided within the Zend_Search_Lucene package. The default analyzer works with ASCII text, because the UTF-8 analyzer needs the mbstring extension to be turned on. It is case-insensitive and skips numbers. Use another analyzer or create your own if you need to change this behavior.
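
To make the difference between tokenized and untokenized fields concrete, here is a minimal Python sketch. It only imitates the behavior attributed to the default analyzer above (ASCII letters, case-insensitive, numbers skipped); `toy_analyzer` and `terms_for` are hypothetical names, and the real analyzer may differ in details.

```python
import re


def toy_analyzer(text):
    """Keep runs of ASCII letters and lowercase them; digits are skipped."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


def terms_for(value, tokenized):
    """Tokenized fields yield many terms; untokenized fields yield one term."""
    return toy_analyzer(value) if tokenized else [value]


tokenized_terms = terms_for("Lucene 2 Index Format", tokenized=True)
single_term = terms_for("Lucene 2 Index Format", tokenized=False)
```

With these assumptions, the tokenized field produces the terms `lucene`, `index`, and `format`, while the untokenized field is kept as the single term `Lucene 2 Index Format`.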

    Note: Using analyzers during indexing and searching

    Search queries are also tokenized using the "current analyzer", so the same analyzer must be set as the default during both indexing and searching. This guarantees that the source text and the search text are transformed into terms in the same way.
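
The effect of mixing analyzers can be shown with a toy inverted index. This is a Python sketch under simplifying assumptions, not the Zend_Search_Lucene implementation: both analyzers, `build_index`, and `search` are hypothetical helpers, and the sample texts are made up.

```python
import re


def lowercasing_analyzer(text):
    """Toy analyzer: ASCII letter runs, lowercased."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


def case_sensitive_analyzer(text):
    """Toy analyzer: ASCII letter runs, case preserved."""
    return re.findall(r"[A-Za-z]+", text)


def build_index(texts, analyzer):
    """Map each term to the set of document numbers that contain it."""
    index = {}
    for doc_id, text in enumerate(texts):
        for term in analyzer(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query, analyzer):
    """Return the documents containing every term of the analyzed query."""
    hits = [index.get(term, set()) for term in analyzer(query)]
    return set.intersection(*hits) if hits else set()


texts = ["Zend Search Lucene", "Apache Lucene"]
index = build_index(texts, lowercasing_analyzer)

same_analyzer = search(index, "LUCENE", lowercasing_analyzer)
other_analyzer = search(index, "LUCENE", case_sensitive_analyzer)
```

Searching with the analyzer used for indexing finds both documents, because `LUCENE` becomes the term `lucene`. Searching with the case-preserving analyzer finds nothing, because the term `LUCENE` was never written to the index.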

Field values are optionally stored within an index. This allows the original field data to be retrieved from the index while searching. This is the only way to associate search results with the original data (internal document IDs may be changed after index optimization or auto-optimization).
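
The following Python sketch illustrates why stored fields, rather than internal document IDs, are the reliable link back to the original data. Everything in it is an assumption made for illustration: `make_index`, `find`, and the sample paths are hypothetical, and `optimize` merely reverses the entries as a stand-in for the renumbering that a real optimization may cause.

```python
def make_index(records):
    """Each entry keeps its indexed terms plus an optional stored payload."""
    return [
        {"terms": set(r["text"].lower().split()), "stored": r.get("stored", {})}
        for r in records
    ]


def find(index, term):
    """Return (internal id, stored payload) for every entry with the term."""
    return [
        (doc_id, entry["stored"])
        for doc_id, entry in enumerate(index)
        if term.lower() in entry["terms"]
    ]


def optimize(index):
    """Stand-in for optimization: entries end up with different ids."""
    return list(reversed(index))


records = [
    {"text": "Zend Search Lucene", "stored": {"path": "/docs/zend.html"}},
    {"text": "Apache Lucene", "stored": {"path": "/docs/apache.html"}},
]
index = make_index(records)

before = find(index, "zend")
after = find(optimize(index), "zend")
```

In this toy model the internal id of the matching entry changes from 0 to 1, but the stored `path` value is the same in both results, so it can still be used to locate the original record.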

Keep in mind that a Lucene index is not a database. It provides no backup mechanism other than backing up the file system directory that holds the index. It provides no transactional mechanisms, although concurrent index updates, as well as concurrent updates and reads, are supported. It also cannot match a database in data retrieval speed.
