Using Sphinx full-text indexing in phpcms [performance testing in progress]
Sphinx is a full-text search engine. The latest stable release is 0.9.9-release.
Sphinx features
* high indexing speed (up to 10 MB/sec on modern CPUs);
* high search speed (avg query is under 0.1 sec on 2-4 GB text collections);
* high scalability (up to 100 GB of text, up to 100 M documents on a single CPU);
* ....
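English introduction: http://www.sphinxsearch.com/docs/manual-0.9.9.html
I. First, install Sphinx on the server
Installing Sphinx on Windows
1. Download the package with MySQL support: http://www.sphinxsearch.com/downloads/sphinx-0.9.9-win32.zip
2. Unzip sphinx-0.9.9-win32.zip to D:\sphinx
3. Install the Sphinx service by running this on the command line: D:\sphinx\searchd --install --config D:\sphinx\sphinx.conf --servicename SphinxSearch (if searchd.exe is under D:\sphinx\bin, as the batch files later in this post assume for indexer.exe, use that path instead)
English reference: http://www.sphinxsearch.com/docs ... #installing-windows
Installing Sphinx on a Linux server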
1. Download the source package: http://www.sphinxsearch.com/downloads/sphinx-0.9.9.tar.gz
- $ tar xzvf sphinx-0.9.9.tar.gz
- $ cd sphinx-0.9.9
- $ ./configure --prefix=/usr/local/sphinx --with-mysql=/usr/local/mysql
- $ make
- $ make install
sphinx.conf example
- source main
- {
- type = mysql
- sql_host = 10.228.129.199 # host address
- sql_user = admin # user name
- sql_pass = admin # password
- sql_db = demo # database name
- sql_port = 3306 # port, default is 3306
- sql_query_pre = SET NAMES utf8
- sql_query_pre = REPLACE INTO phpcms_counter SELECT 1, MAX(searchid) FROM phpcms_search
- sql_query = SELECT searchid, type, data FROM phpcms_search \
- WHERE searchid>=$start AND searchid<=$end
- sql_query_range = SELECT 1,max_doc_id FROM phpcms_counter WHERE counter_id=1
- sql_range_step = 5000
- sql_query_info = SELECT * FROM phpcms_search WHERE searchid=$id
- }
- source delta : main
- {
- sql_query_pre = SET NAMES utf8
- sql_query = SELECT searchid, type, data FROM phpcms_search \
- WHERE searchid >( SELECT max_doc_id FROM phpcms_counter WHERE counter_id=1 )
- }
- index main
- {
- source = main
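- # index directory
- path = D:\sphinx\data\main # main index path
- # character encoding
- charset_type = utf-8
- # character table for utf-8
- charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
- # simple n-gram tokenization; only 0 and 1 are supported; set to 1 to search Chinese text
- ngram_len = 1
- # characters to tokenize; to search Chinese text, keep this line uncommented
- ngram_chars = U+3000..U+2FA1F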
- }
- index delta : main
- {
- source = delta
- path = D:\sphinx\data\delta # delta index path (think of it as a secondary index for now)
- }
- indexer
- {
- mem_limit = 128M # memory limit for indexing
- }
- searchd
- {
- port = 9312
- log = D:\sphinx\data\phpcms\searchd.log # service log path
- query_log = D:\sphinx\data\phpcms\query.log # query log path
- read_timeout = 5
- max_children = 30
- pid_file = D:\sphinx\data\phpcms\searchd.pid
- max_matches = 1000
- seamless_rotate = 0
- preopen_indexes = 0
- unlink_old = 1
- }
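As a rough illustration of how sql_query_range and sql_range_step interact in source main: the indexer reads a minimum and a maximum document id, then runs sql_query once per chunk, substituting $start and $end. The Python sketch below only mimics that substitution. The helper names ranged_queries and expand are made up for this example, and the id range 1..12000 is an assumed value, not taken from a real phpcms_search table.

```python
# Sketch only: mimics how ranged queries split one big fetch into chunks.
# ranged_queries() and expand() are hypothetical helpers, not part of Sphinx.

QUERY = ("SELECT searchid, type, data FROM phpcms_search "
         "WHERE searchid>=$start AND searchid<=$end")


def ranged_queries(min_id, max_id, step):
    """Yield (start, end) pairs covering min_id..max_id in chunks of `step` ids."""
    start = min_id
    while start <= max_id:
        # The last chunk is clamped to max_id.
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1


def expand(min_id, max_id, step=5000):
    """Return the SQL statements produced by substituting $start and $end."""
    return [
        QUERY.replace("$start", str(start)).replace("$end", str(end))
        for start, end in ranged_queries(min_id, max_id, step)
    ]


if __name__ == "__main__":
    # Assumed values: ids 1..12000 with sql_range_step = 5000.
    for sql in expand(1, 12000):
        print(sql)
```

With the assumed range, the query runs three times, which keeps each fetch small instead of pulling the whole table in one statement.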
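II. Upgrade the phpcms search module
Download the upgrade package and overwrite the search module directory with it.
Download: search.zip (16.39 KB)
Go to the admin backend and configure full-text search.
Create the data table
- CREATE TABLE `phpcms_counter` (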
- `counter_id` INT(11) NOT NULL,
- `max_doc_id` INT(11) NOT NULL,
- PRIMARY KEY (`counter_id`)
- ) ENGINE=MYISAM DEFAULT CHARSET=gbk
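The counter table is what lets the main and delta sources split the documents between them. The sketch below replays the SQL from the sample sphinx.conf against an in-memory SQLite database purely for illustration (the post itself targets MySQL). The function names and the sample rows are hypothetical, and the main source is simplified to a single query without $start/$end ranges.

```python
import sqlite3

# Sketch only: SQLite stands in for MySQL, and the helpers are hypothetical.


def make_db():
    """Create empty phpcms_search and phpcms_counter tables in memory."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE phpcms_search (
            searchid INTEGER PRIMARY KEY, type TEXT, data TEXT);
        CREATE TABLE phpcms_counter (
            counter_id INTEGER PRIMARY KEY, max_doc_id INTEGER NOT NULL);
        """
    )
    return db


def add_docs(db, count):
    """Insert `count` placeholder documents; ids are assigned in order."""
    for _ in range(count):
        db.execute(
            "INSERT INTO phpcms_search (type, data) VALUES ('news', 'text')")


def index_main(db):
    """Record the current max id (as sql_query_pre does), then fetch up to it."""
    db.execute(
        "REPLACE INTO phpcms_counter "
        "SELECT 1, MAX(searchid) FROM phpcms_search")
    rows = db.execute(
        "SELECT searchid FROM phpcms_search WHERE searchid <= "
        "(SELECT max_doc_id FROM phpcms_counter WHERE counter_id=1) "
        "ORDER BY searchid")
    return [row[0] for row in rows]


def index_delta(db):
    """Fetch only documents added after the last main indexing run."""
    rows = db.execute(
        "SELECT searchid FROM phpcms_search WHERE searchid > "
        "(SELECT max_doc_id FROM phpcms_counter WHERE counter_id=1) "
        "ORDER BY searchid")
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = make_db()
    add_docs(db, 3)
    print("main :", index_main(db))
    add_docs(db, 2)
    print("delta:", index_delta(db))
```

In this sketch, documents added after a main run show up only in the delta fetch until main is indexed again, which is why the delta index can be rebuilt frequently at low cost.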
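III. Set up scheduled tasks to update the index
1. On Windows
Scheduled tasks need to be set up:
# Merge the indexes at 4 a.m. by running merge.bat
# At all other times, update the index every minute by running delta.bat
merge.bat
- @ECHO off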
- D:\sphinx\bin\indexer.exe --config D:\sphinx\sphinx.conf --merge main delta --rotate
- echo indexing, window will close when complete
delta.bat
- @ECHO off
- D:\sphinx\bin\indexer.exe --config D:\sphinx\sphinx.conf delta --rotate
- echo indexing, window will close when complete
2. On Linux, edit the scheduled tasks with crontab -e
- # Merge the indexes at 4 a.m.; update the delta index every minute during hours 0-3 and 6-23
- * 0-3 * * * /usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx.conf delta --rotate
- * 6-23 * * * /usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx.conf delta --rotate
- 0 4 * * * /usr/local/sphinx/bin/indexer --config /usr/local/sphinx/etc/sphinx.conf --merge main delta --rotate
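To check which job the three crontab entries above trigger at a given time, here is a small Python sketch. It only understands the field forms used in this post ("*", "a-b", and a single number) and only looks at the minute and hour fields; ENTRIES, field_matches and jobs_at are names invented for this example and are not part of cron.

```python
# Sketch only: a tiny matcher for the minute and hour fields used above.
# It is not a full cron parser (no lists, steps, or day/month fields).

ENTRIES = [
    ("*", "0-3", "delta"),   # every minute, hours 0-3
    ("*", "6-23", "delta"),  # every minute, hours 6-23
    ("0", "4", "merge"),     # 04:00 only
]


def field_matches(field, value):
    """Return True if a cron field of the form '*', 'a-b' or 'n' matches value."""
    if field == "*":
        return True
    if "-" in field:
        low, high = field.split("-")
        return int(low) <= value <= int(high)
    return int(field) == value


def jobs_at(hour, minute):
    """List the jobs whose minute and hour fields both match the given time."""
    return [
        job
        for minute_field, hour_field, job in ENTRIES
        if field_matches(minute_field, minute) and field_matches(hour_field, hour)
    ]


if __name__ == "__main__":
    for hour, minute in [(0, 0), (4, 0), (4, 30), (5, 30), (23, 59)]:
        print(f"{hour:02d}:{minute:02d}", jobs_at(hour, minute))
```

As the entries are written, no delta update runs between 04:01 and 05:59; if that gap is not wanted, the hour ranges would need adjusting.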
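Note: back up your files before upgrading to avoid accidents.
All paths and permissions must match the server the application runs on.
For example:
sphinx.conf needs the following settings configured: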
sql_host
sql_user
sql_pass
sql_db
sql_port
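The phpcms table prefix in the sample is phpcms_
Index path: D:\sphinx\data\delta
Sample sphinx.conf using coreseek Chinese word segmentation
- Chinese reference: http://www.coreseek.cn/products-install/
- Installation steps:
- Follow the installation steps in the "Chinese reference"; once you complete "三、coreseek中文全文检索测试" (Part 3: coreseek Chinese full-text search test), the installation has succeeded
- coreseek.conf example: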
- source main
- {
- type = mysql
- sql_host = 10.228.129.199 # host address
- sql_user = admin # user name
- sql_pass = admin # password
- sql_db = demo # database name
- sql_port = 3306 # port, default is 3306
- sql_query_pre = SET NAMES utf8
- sql_query_pre = REPLACE INTO phpcms_counter SELECT 1, MAX(searchid) FROM phpcms_search
- sql_query = SELECT searchid, type, data FROM phpcms_search \
- WHERE searchid>=$start AND searchid<=$end
- sql_query_range = SELECT 1,max_doc_id FROM phpcms_counter WHERE counter_id=1
- sql_range_step = 5000
- sql_query_info = SELECT * FROM phpcms_search WHERE searchid=$id
- }
- source delta : main
- {
- sql_query_pre = SET NAMES utf8
- sql_query = SELECT searchid, type, data FROM phpcms_search \
- WHERE searchid >( SELECT max_doc_id FROM phpcms_counter WHERE counter_id=1 )
- }
- index main
- {
- source = main
- # index directory
- path = D:\sphinx\data\main # main index path
- # Unsegmented (n-gram) version; for details see http://www.coreseek.cn/products-install/ngram_len_cjk/
- # character encoding
- #charset_type = zh_cn.utf-8
- # character table for utf-8
- #charset_table = 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F
- # simple n-gram tokenization; only 0 and 1 are supported; set to 1 to search Chinese text
- #ngram_len = 1
- # characters to tokenize; uncomment to search Chinese text
- #ngram_chars = U+3000..U+2FA1F
- # Word-segmentation version; for details see http://www.coreseek.cn/products-install/ngram_len_cjk/
- charset_dictpath=D:\sphinx\etc
- # character encoding
- charset_type = zh_cn.utf-8
- # character table for zh_cn.utf-8
- #charset_table =
- ngram_len = 0
- #ngram_chars =
- }
- index delta : main
- {
- source = delta
- path = D:\sphinx\data\delta # delta index path (think of it as a secondary index for now)
- }
- indexer
- {
- mem_limit = 128M # memory limit for indexing
- }
- searchd
- {
- port = 9312
- log = D:\sphinx\data\phpcms\searchd.log # service log path
- query_log = D:\sphinx\data\phpcms\query.log # query log path
- read_timeout = 5
- max_children = 30
- pid_file = D:\sphinx\data\phpcms\searchd.pid
- max_matches = 1000
- seamless_rotate = 0
- preopen_indexes = 0
- unlink_old = 1
- }
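The two samples differ mainly in how Chinese text is split into keywords: with ngram_len = 1, every character in the ngram_chars range is indexed as its own keyword, while the coreseek sample (ngram_len = 0 plus charset_dictpath) relies on dictionary-based word segmentation instead. The Python sketch below approximates only the first, unigram behaviour. The function unigram_tokens is a hypothetical helper, and it simplifies charset_table handling to lowercased alphanumerics and underscores.

```python
# Sketch only: approximates ngram_len = 1 with ngram_chars = U+3000..U+2FA1F.
# unigram_tokens() is a hypothetical helper, not Sphinx's tokenizer.


def unigram_tokens(text, low=0x3000, high=0x2FA1F):
    """Split text so each character in low..high becomes its own token."""
    tokens = []
    word = []

    def flush():
        # Emit the pending non-CJK word, if any.
        if word:
            tokens.append("".join(word))
            word.clear()

    for ch in text:
        if low <= ord(ch) <= high:
            flush()
            tokens.append(ch)          # one token per character in the range
        elif ch.isalnum() or ch == "_":
            word.append(ch.lower())    # simplified case folding
        else:
            flush()                    # any other character separates words
    flush()
    return tokens


if __name__ == "__main__":
    print(unigram_tokens("Sphinx 全文检索"))
```

Unigram indexing needs no dictionary, but a query then matches on individual characters; dictionary segmentation keeps multi-character words together at the cost of maintaining the dictionary under charset_dictpath.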
http://bbs.phpcms.cn/thread-149380-1-1.html