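ElasticSearch is demanding on hardware: the machine needs at least 4 GB of RAM with 2 GB free, and a per-user thread limit of at least 4096.
For learning, install ElasticSearch on Linux or macOS. Windows is strongly discouraged: it has too many pitfalls, production servers are not deployed on Windows anyway, and working through Windows-only problems is a waste of your time. If you are on Windows and still want to follow along, use a machine with at least 16 GB of RAM and run Linux in a virtual machine.
On Linux, avoid release 6/6.5 (at startup ElasticSearch checks for kernel 3.5+, although the check can be disabled); 7+ is recommended.
What if your own machine is underpowered? The deep-pockets answer is to rent a cloud server and work there.
If your machine does not meet the first two requirements above and you do not want to pay for a cloud server, there is little point in reading on: ElasticSearch will not start unless these prerequisites are met.
For this walkthrough I used macOS with a virtual machine running CentOS 6.5, JDK 13, and ElasticSearch 7.8.1. The packages I used are provided below. If you are learning, match my versions as closely as possible so that you hit the same problems I did. I ran into quite a few pitfalls during installation and have written them all up here, so reading carefully should be enough to get everything working.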
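Commonly used search sites: Baidu and Google.
Structured data is data with a fixed format or bounded length, such as database records and metadata. It is usually stored and searched through tables in a relational database (MySQL, Oracle), and it can be indexed so that data structures such as B-trees make lookups fast.
Unstructured data, also called full-text data, has no fixed length or format, for example email and Word documents. There are two main ways to search it: sequential scanning and full-text search.
Sequential scanning works as the name suggests: go through the content in order looking for a specific keyword. Suppose you are asked to find which paragraphs of a basketball news article mention "Kobe". You have to read the article from start to finish and mark every place the keyword appears.
This is clearly the least efficient approach. If the article runs to tens of thousands of words, finding every "Kobe" takes a long time.
Since sequentially scanning unstructured data is slow, can we do better? The idea is to give the unstructured data some structure: extract part of the information, reorganize it into a structured form, and search that structured form instead, which is much faster. This is the basic idea behind full-text search, and the extracted, reorganized information is called the index.
According to Baidu Baike, full-text search engines are the mainstream search engines in wide use today. An indexing program scans every word in each document and builds an index entry for it, recording how many times and where the word occurs. When a user issues a query, the search program looks it up in the prebuilt index and returns the results.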
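To make the indexing idea concrete, here is a minimal Python sketch of an inverted index that records where each word occurs. It is an illustration only, not how ElasticSearch (which builds on Lucene) implements it; the function names build_inverted_index and search and the sample documents are hypothetical.

```python
import re
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase word to {doc_id: [positions where it occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].setdefault(doc_id, []).append(position)
    return index

def search(index, word):
    """Return ids of documents containing the word, without rescanning any text."""
    return sorted(index.get(word.lower(), {}))

# Hypothetical sample documents
docs = {
    1: "Kobe scored 81 points",
    2: "The Lakers retired both Kobe jerseys",
    3: "Curry is a great shooter",
}
index = build_inverted_index(docs)
print(search(index, "kobe"))  # [1, 2]
print(index["kobe"])          # {1: [0], 2: [4]}
```

The scan over the text happens once, at indexing time; each later lookup is a dictionary access instead of a full read of every document.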
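Note: I set this up on Linux. It can be done on Windows too (strongly discouraged, too many pitfalls). A JDK must be installed before ElasticSearch; I used JDK 13. Because Linux ships with its own JDK, uninstall that version first and then install the JDK you want.
In a development environment, turn the firewall off to avoid unnecessary trouble. In production, open only the ports you need.
service iptables stop    # stops the firewall now; it comes back after a reboot
chkconfig iptables off   # stops the firewall from starting at boot
ElasticSearch depends heavily on the JDK, so install a matching JDK version and set the related environment variables. ES 7.x, for example, needs JDK 8 or later, from an official source. At startup you may see a message asking for JDK 11, because JDK 11 is the officially recommended version for ES 7 and later; it is usually only a warning and does not prevent startup.
For compatible JDK versions, see the support matrix on the ES website.
ES runs on the JVM and is memory hungry, so make sure the machine has at least 2 GB of free memory. Linux is recommended, and a local virtual machine works fine.
ES must be started by a non-root account; this is enforced. For security reasons ElasticSearch refuses to start as root and reports an error, so create a dedicated user and run everything as that user. If you installed ES as root, first hand the installation directory over to that user, for example chown -R <user>:<group> <install path>, and then start ES as the non-root user. Adjust the exact permissions to your needs.
Recent ElasticSearch releases bundle their own JDK. On Linux I installed JDK 13 and did not use the bundled one; feel free to explore it if you are interested.
Official website: see the Elastic site.
Link: https://pan.baidu.com/s/1jjNEErHtBu93HmvxKCT5Sw Password: kbcs
1. Edit elasticsearch-x.x.x/config/elasticsearch.yml so that it contains mainly the following: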
cluster.name: my-application
node.name: node-1
network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["127.0.0.1", "[::1]"]
cluster.initial_master_nodes: ["node-1"]
bootstrap.system_call_filter: false
http.cors.allow-origin: "*"
http.cors.enabled: true
http.cors.allow-headers : X-Requested-With,X-Auth-Token,Content-Type,Content-Length,Authorization
http.cors.allow-credentials: true
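2. Go to elasticsearch-x.x.x/bin and run: sh elasticsearch. Startup fails with an error; edit the elasticsearch-env configuration file.
3. Create the user and group
groupadd elsearch
# Add a group. Syntax: groupadd <group>
useradd elsearch -g elsearch -p elasticsearch
# Add a user and put it in the group. Syntax: useradd <user> -p <password> -g <group>
# Note that -p expects an already-encrypted password, so set the real password with passwd afterwards.
chown -R elsearch:elsearch elasticsearch-7.8.1
Note ================= run everything above as root ===============
Note ================= run everything below as the es user ================
Note: if you cannot log in with the es user's password, switch back to root and reset it. Syntax: passwd <user>
4. Log in as the es user and start ElasticSearch again: sh elasticsearch
It fails with the following error:
java.lang.UnsupportedOperationException: seccomp unavailable: requires kernel 3.5+ with CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER compiled in
Cause: I am using CentOS 6.5, whose Linux kernel is 2.6, while ElasticSearch's system call filter (seccomp) requires kernel 3.5 or later.
Fix: disable the filter.
Edit elasticsearch.yml and add the following setting at the bottom:
bootstrap.system_call_filter: false
5. Start ElasticSearch again: sh elasticsearch
The following changes require root privileges.
It reports these 4 errors: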
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
[2]: max number of threads [1024] for user [es] is too low, increase to at least [4096]
[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[4]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
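========== divider ===============
Fixes:
1. Edit /etc/security/limits.conf (vim /etc/security/limits.conf) and add:
* soft nofile 65535
* hard nofile 65535
2. In the same /etc/security/limits.conf file, add:
* soft nproc 4096
* hard nproc 4096
3. Edit /etc/sysctl.conf (vim /etc/sysctl.conf) and add:
vm.max_map_count=262144
4. Edit /var/soft/es7.8.1/elasticsearch-7.8.1/config/elasticsearch.yml and add:
cluster.initial_master_nodes: ["node-1"]
After making these changes, reboot. Reboot, reboot, reboot: important things are worth saying three times!
If the thread limit in item 2 does not take effect, try the following method instead.
ElasticSearch 7.8.1 error: [1]: max number of threads [1024] for user [es] is too low, increase to at least [4096]
Depending on the Linux distribution, a more drastic fix is sometimes needed.
Create the file /etc/security/limits.d/test-limits.conf:
cat>>test-limits.conf
Then enter the following:
* soft nofile 65535
* hard nofile 65535
* soft nproc 4096
* hard nproc 4096
Press Ctrl+D to save,
then reboot the server.
1. I hit quite a few pitfalls the first time I configured this, and all of them are recorded above.
2. If none of the methods above solves your problem, check the ElasticSearch logs and search online for the answer.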
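Go to the bin directory under the installation directory.
Run: sh elasticsearch (starts in the foreground)
Go to the bin directory under the installation directory.
Run: sh elasticsearch -d -p pid (-d starts as a daemon; -p writes the process ID to the file named pid)
Open a browser and go to 127.0.0.1:9200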
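Type | Description | Default location | Setting
bin | Binary scripts, including elasticsearch to start a node | {path.home}/bin |
conf | Configuration files, including elasticsearch.yml | {path.home}/config | path.conf
data | Location of the data files of each index/shard allocated on the node. Can hold multiple locations. | {path.home}/data | path.data
logs | Log file location | {path.home}/logs | path.logs
plugins | Plugin file location. Each plugin is contained in a subdirectory. | {path.home}/plugins | path.plugins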
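In a traditional database, the steps to query data are: create a database -> create a table -> insert data -> query.
An index can be thought of as a relational database.
A type is like a kind of table, for example a user table or an order table.
Note
1. In ES 5.x an index can have multiple types.
2. In ES 6.x an index can have only one type.
3. From ES 7.x on, types are deprecated: the only type name is _doc, and types are removed entirely in 8.x.
A mapping defines the type and other properties of each field. It corresponds to the table schema in a relational database.
A document corresponds to one row in a relational database.
A field corresponds to a column of a relational database table.
A cluster consists of one or more nodes and has the default name "elasticsearch".
A node is a member of a cluster: one machine or one process.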
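Action | Description
HEAD | Retrieve only the headers of a resource
GET | Retrieve a resource
POST | Create or update a resource
PUT | Create or update a resource
DELETE | Delete a resource
GET /user: list all users
POST /user: create a user
PUT /user/ID: update the specified user
DELETE /user/ID: delete the specified user
Get the elasticsearch status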
curl -X GET "http://localhost:9200"
Create a document
curl -X PUT "localhost:9200/xdclass/_doc/1" -H 'Content-Type: application/json' -d '{
"user" : "louis",
"message" : "louis is good"
}'
Delete a document
curl -X DELETE "localhost:9200/xdclass/_doc/1"
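Querying cba again at this point returns one extra line in the JSON response.
The closed-index marker disappears.
Define the structure of the index. We created the nba index earlier without defining its structure, so now we create its mapping.
type="keyword": the value is a keyword and is not analyzed (tokenized)
type="text": the value is analyzed and indexed for full-text search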
{
"properties": {
"name": {
"type": "text"
},
"team_name": {
"type": "text"
},
"position": {
"type": "keyword"
},
"play_year": {
"type": "keyword"
},
"jerse_no": {
"type": "keyword"
}
}
}
{
"persistent": {
"action.auto_create_index": "false"
}
}
With auto_create_index set to false, try adding a document to an index that does not exist:
{
"name":"杨超越",
"team_name":"梦之队",
"position":"组织后卫",
"play_year":"0",
"jerse_no":"18"
}
PUT request: ip:port/xxx/_doc/1?op_type=create
{
"docs": [{
"_index": "nba",
"_type": "_doc",
"_id": "1"
},
{
"_index": "nba",
"_type": "_doc",
"_id": "2"
}
]
}
{
"script": "ctx._source.age = 18"
}
{
"script": "ctx._source.remove(\"age\")"
}
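upsert: when the specified document does not exist, the content of the upsert parameter is inserted into the index as a new document; when it does exist, ElasticSearch applies the specified update logic instead.
Create the index and specify its mapping: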
{
"mappings": {
"properties": {
"name": {
"type": "text"
},
"team_name": {
"type": "text"
},
"position": {
"type": "text"
},
"play_year": {
"type": "long"
},
"jerse_no": {
"type": "keyword"
}
}
}
}
192.168.199.170:9200/nba/_doc/1
{
"name": "哈登",
"team_name": "⽕箭",
"position": "得分后卫",
"play_year": 10,
"jerse_no": "13"
}
192.168.199.170:9200/nba/_doc/2
{
"name": "库⾥",
"team_name": "勇⼠",
"position": "控球后卫",
"play_year": 10,
"jerse_no": "30"
}
192.168.199.170:9200/nba/_doc/3
{
"name": "詹姆斯",
"team_name": "湖⼈",
"position": "⼩前锋",
"play_year": 15,
"jerse_no": "23"
}
A term query does not analyze the query text; a document matches only if an indexed term is exactly equal to the query string.
{
"query": {
"term": {
"jerse_no": "23"
}
}
}
{
"query": {
"terms": {
"jerse_no": [
"23",
"13"
]
}
}
}
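ElasticSearch first analyzes the query string and splits it into terms. A document matches and is returned if the analyzed field contains any one of those terms (or all of them); if it contains none of them, the document does not match the query.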
{
"query": {
"match_all": {}
},
"from": 0,
"size": 10
}
{
"query": {
"match": {
"position":"后卫"
}
},
"from": 0,
"size": 10
}
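The difference between term and match can be sketched in plain Python. The analyze function below is a toy stand-in for an analyzer (lowercase, then split on non-word characters), and the two sample titles and the helper names term_query and match_query are made up for illustration. Real analyzers and relevance scoring are more involved.

```python
import re

def analyze(text):
    """Toy analyzer: lowercase the text and split it on non-word characters."""
    return [t for t in re.split(r"\W+", text.lower()) if t]

# Hypothetical documents: id -> title, analyzed at index time
docs = {2: "the best shooter", 3: "the best small forward"}
indexed = {doc_id: set(analyze(title)) for doc_id, title in docs.items()}

def term_query(value):
    # The value is compared with the indexed terms exactly as given.
    return sorted(d for d, terms in indexed.items() if value in terms)

def match_query(text):
    # The text is analyzed first; a document matches if it holds any resulting term.
    wanted = set(analyze(text))
    return sorted(d for d, terms in indexed.items() if terms & wanted)

print(term_query("shooter"))        # [2]
print(term_query("Best Shooter"))   # []
print(match_query("Best Shooter"))  # [2, 3]
```

The second call finds nothing because "Best Shooter" is not itself an indexed term, which is why term queries are a better fit for keyword fields than for analyzed text fields.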
{
"query": {
"multi_match": {
"query": "shooter",
"fields": ["title", "name"]
}
}
}
POST 192.168.199.170:9200/nba/_update/2
{
"doc": {
"name": "库⾥",
"team_name": "勇⼠",
"position": "控球后卫",
"play_year": 10,
"jerse_no": "30",
"title": "the best shooter"
}
}
Similar to a term query: an exact match on the phrase
Prefix matching
{
"query": {
"match_phrase_prefix": {
"title": "the best s"
}
}
}
POST 192.168.199.170:9200/nba/_update/3
{
"doc": {
"name": "詹姆斯",
"team_name": "湖⼈",
"position": "⼩前锋",
"play_year": 15,
"jerse_no": "23",
"title": "the best small forward"
}
}
The standard analyzer is the default; it is used when no analyzer is specified.
{
"analyzer": "standard",
"text": "The best 3-points shooter is Curry!"
}
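The simple analyzer splits text into terms whenever it meets a character that is not a letter, and lowercases every term.
The whitespace analyzer splits text into terms whenever it meets a whitespace character.
The stop analyzer is much like the simple analyzer; the only difference is that it also removes stop words, using the English stop word list by default.
stopwords: a predefined list of stop words, such as the, a, an, this, of, at, and so on.
Language analyzers target a specific language (for example english, the English analyzer). Built-in languages: arabic, armenian, basque, bengali, brazilian, bulgarian, catalan, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, lithuanian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, thai
The pattern analyzer splits text into terms with a regular expression; the default is \W+.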
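The behavior of the simple, whitespace, and stop analyzers can be approximated in a few lines of Python. This is a rough sketch under stated assumptions: it handles only ASCII letters, and STOP_WORDS is a small illustrative subset rather than the real English stop word list. The function names are hypothetical.

```python
import re

STOP_WORDS = {"a", "an", "the", "is", "of", "at", "this"}  # illustrative subset only

def simple_analyzer(text):
    """Split on anything that is not an ASCII letter, then lowercase."""
    return [t.lower() for t in re.split(r"[^A-Za-z]+", text) if t]

def whitespace_analyzer(text):
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()

def stop_analyzer(text):
    """Same as the simple analyzer, with stop words removed."""
    return [t for t in simple_analyzer(text) if t not in STOP_WORDS]

text = "The best 3-points shooter is Curry!"
print(simple_analyzer(text))      # ['the', 'best', 'points', 'shooter', 'is', 'curry']
print(whitespace_analyzer(text))  # ['The', 'best', '3-points', 'shooter', 'is', 'Curry!']
print(stop_analyzer(text))        # ['best', 'points', 'shooter', 'curry']
```

Note how the whitespace analyzer keeps "Curry!" with its exclamation mark and capital letter, so a field analyzed this way matches "Curry!" but not "curry".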
PUT 192.168.199.170:9200/my_index
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"type": "whitespace"
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "text"
},
"team_name": {
"type": "text"
},
"position": {
"type": "text"
},
"play_year": {
"type": "long"
},
"jerse_no": {
"type": "keyword"
},
"title": {
"type": "text",
"analyzer": "my_analyzer"
}
}
}
}
{
"name": "库⾥",
"team_name": "勇⼠",
"position": "控球后卫",
"play_year": 10,
"jerse_no": "30",
"title": "The best 3-points shooter is Curry!"
}
{
"query": {
"match": {
"title": "Curry!"
}
}
}
{
"analyzer": "standard",
"text": "⽕箭明年总冠军"
}
Restart after installing.
{
"analyzer": "smartcn",
"text": "⽕箭明年总冠军"
}
sh elasticsearch-plugin remove analysis-smartcn
Download: see the plugin's download page.
To install, unzip it into the plugins directory.
Then restart.
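JSON has no date type, so ES decides whether a string is a date by checking it against the formats defined in format.
The default format is strict_date_optional_time||epoch_millis
Formats
Date strings such as "2022-01-01" or "2022-01-01T12:10:30Z". A string like "2022/01/01 12:10:30" is accepted only if the mapping declares a matching custom format.
The number of milliseconds since the epoch (00:00 UTC on January 1, 1970)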
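Because epoch_millis counts milliseconds, a timestamp given in seconds is read as a moment in January 1970. The Python sketch below shows the conversion in both directions. The helper names are hypothetical and the timestamps are sample values chosen for illustration.

```python
from datetime import datetime, timezone

def from_epoch_millis(value):
    """Interpret a number the way epoch_millis does: milliseconds since 1970-01-01 UTC."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)

def to_epoch_millis(dt):
    """Convert a timezone-aware datetime to milliseconds since the epoch."""
    return int(dt.timestamp() * 1000)

print(from_epoch_millis(1641886870000).isoformat())  # 2022-01-11T07:41:10+00:00
print(from_epoch_millis(1610350870).date())          # 1970-01-19 (a seconds value read as milliseconds)
print(to_epoch_millis(datetime(2020, 1, 1, tzinfo=timezone.utc)))  # 1577836800000
```

If your source data carries seconds, multiply by 1000 before indexing, or declare epoch_second in the field's format.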
PUT 192.168.199.170:9200/nba/_mapping
{
"properties": {
"name": {
"type": "text"
},
"team_name": {
"type": "text"
},
"position": {
"type": "text"
},
"play_year": {
"type": "long"
},
"jerse_no": {
"type": "keyword"
},
"title": {
"type": "text"
},
"date": {
"type": "date"
}
}
}
POST 192.168.199.170:9200/nba/_doc/4
{
"name": "蔡x坤",
"team_name": "勇⼠",
"position": "得分后卫",
"play_year": 10,
"jerse_no": "31",
"title": "打球最帅的明星",
"date": "2020-01-01"
}
POST 192.168.199.170:9200/nba/_doc/5
{
"name": "杨超越",
"team_name": "猴急",
"position": "得分后卫",
"play_year": 10,
"jerse_no": "32",
"title": "打球最可爱的明星",
"date": 1610350870
}
POST 192.168.199.170:9200/nba/_doc/6
{
"name": "吴亦凡",
"team_name": "湖⼈",
"position": "得分后卫",
"play_year": 10,
"jerse_no": "33",
"title": "最会说唱的明星",
"date": 1641886870000
}
POST 192.168.199.170:9200/nba/_doc/8
{
"name": "吴亦凡",
"team_name": "湖⼈",
"position": "得分后卫",
"play_year": 10,
"jerse_no": "33",
"title": "最会说唱的明星",
"date": "1641886870",
"array": [
"one",
"two"
],
"address": {
"region": "China",
"location": {
"province": "GuangDong",
"city": "GuangZhou"
}
}
}
How it is indexed
"address.region": "China",
"address.location.province": "GuangDong",
"address.location.city": "GuangZhou"
POST 192.168.199.170:9200/nba/_search
{
"query": {
"match": {
"address.region": "china"
}
}
}
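The dotted paths used above (address.region, address.location.city) come from flattening the nested object. A minimal Python sketch of that flattening follows; the function name flatten is hypothetical, and arrays are left untouched for simplicity.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict keyed by dotted paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))  # descend into inner objects
        else:
            flat[path] = value
    return flat

doc = {
    "address": {
        "region": "China",
        "location": {"province": "GuangDong", "city": "GuangZhou"},
    }
}
print(flatten(doc))
# {'address.region': 'China', 'address.location.province': 'GuangDong', 'address.location.city': 'GuangZhou'}
```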
The ip field type stores IPv4 and IPv6 addresses. In early ES versions, which supported only IPv4, it was essentially a long field.
POST 192.168.199.170:9200/nba/_mapping
{
"properties": {
"name": {
"type": "text"
},
"team_name": {
"type": "text"
},
"position": {
"type": "text"
},
"play_year": {
"type": "long"
},
"jerse_no": {
"type": "keyword"
},
"title": {
"type": "text"
},
"date": {
"type": "date"
},
"ip_addr": {
"type": "ip"
}
}
}
PUT 192.168.199.170:9200/nba/_doc/9
{
"name": "吴亦凡",
"team_name": "湖⼈",
"position": "得分后卫",
"play_year": 10,
"jerse_no": "33",
"title": "最会说唱的明星",
"ip_addr": "192.168.1.1"
}
POST 192.168.199.170:9200/nba/_search
{
"query": {
"term": {
"ip_addr": "192.168.0.0/16"
}
}
}
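The term query above uses CIDR notation, so it matches every address inside the 192.168.0.0/16 block. The same membership test can be reproduced with Python's standard ipaddress module. The helper name in_cidr is hypothetical.

```python
import ipaddress

def in_cidr(address, cidr):
    """Return True if the address falls inside the CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)

print(in_cidr("192.168.1.1", "192.168.0.0/16"))  # True
print(in_cidr("10.0.0.169", "192.168.0.0/16"))   # False
print(in_cidr("2001:db8::1", "2001:db8::/32"))   # True
```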
Installing and using Kibana, the visualization tool
Grant ownership
chown -R es:es781g /var/soft/kibana-7.8.1-linux-x86_64
kibana.yml
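server.port: 5601 # Kibana port
server.host: "10.0.0.169" # IP address of the host to bind to
elasticsearch.hosts: ["http://10.0.0.169:9200"] # address of the elasticsearch host
kibana.index: ".kibana" # uncomment this option
i18n.locale: "zh-CN" # Kibana's UI defaults to English; this switches it to Chinese
Go to the bin directory of the Kibana folder and run: sh kibana
ip:5601
The examples that follow make heavy use of this tool.
A step-by-step guide to bulk importing data
ES provides an API called bulk for batch operations.
Data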
{"index": {"_index": "book", "_type": "_doc", "_id": 1}}
{"name": "权力的游戏"}
{"index": {"_index": "book", "_type": "_doc", "_id": 2}}
{"name": "疯狂的石头"}
curl -X POST "192.168.199.170:9200/_bulk" -H 'Content-Type: application/json' --data-binary @test
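The bulk API expects newline-delimited JSON: one action line, then one source line, with every line (including the last) terminated by a newline. A small Python sketch for generating such a file body is shown below. The function name build_bulk_body is hypothetical, and the sketch only covers the index action.

```python
import json

def build_bulk_body(index, docs):
    """Build an NDJSON bulk body: an action line followed by a source line per document."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # the trailing newline is required

body = build_bulk_body("book", {1: {"name": "权力的游戏"}, 2: {"name": "疯狂的石头"}})
print(body, end="")
```

Writing body to a file and passing it with --data-binary keeps the newlines intact, which plain -d would strip.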
{"mappings":{"properties":{"birthDay":{"type":"date"},"birthDayStr": {"type":"keyword"},"age":{"type":"integer"},"code": {"type":"text"},"country":{"type":"text"},"countryEn": {"type":"text"},"displayAffiliation":{"type":"text"},"displayName": {"type":"text"},"displayNameEn":{"type":"text"},"draft": {"type":"long"},"heightValue":{"type":"float"},"jerseyNo": {"type":"text"},"playYear":{"type":"long"},"playerId": {"type":"keyword"},"position":{"type":"text"},"schoolType": {"type":"text"},"teamCity":{"type":"text"},"teamCityEn": {"type":"text"},"teamConference": {"type":"keyword"},"teamConferenceEn":{"type":"keyword"},"teamName": {"type":"keyword"},"teamNameEn":{"type":"keyword"},"weight": {"type":"text"}}}}
POST nba/_search
{
"query": {
"term": {
"jerseyNo": "23"
}
},
"from": 0,
"size": 20
}
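Exists query: finds documents that have a non-null value in a given field (find players whose team name is not empty)
Prefix query: finds documents containing a term with the given prefix (find players whose team name starts with Rock)
Wildcard query: supports wildcards, where * matches any sequence of characters and ? matches any single character (find the Rockets players)
Regexp query: matches with a regular expression (find the Rockets players)
Ids query: finds documents by ID (find the players with IDs 1 and 2)
Range query: finds documents whose value in a given field (date, number, or string) falls within a given range
Find players who have played in the NBA for between 2 and 10 years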
POST nba/_search
{
"query": {
"range": {
"playYear": {
"gte": 2,
"lte": 10
}
}
},
"from": 0,
"size": 20
}
Find players born between 1999 and 2020
POST nba/_search
{
"query": {
"range": {
"birthDay": {
"gte": "01/01/1999",
"lte": "2020",
"format": "dd/MM/yyyy||yyyy"
}
}
},
"from": 0,
"size": 20
}
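Type | Description
must | Must appear in matching documents
filter | Must appear in matching documents, but is not scored
must_not | Must not appear in matching documents
should | Should appear in matching documents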
POST nba/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"displayNameEn": "james"
}
}
]
}
},
"from": 0,
"size": 20
}
POST nba/_search
{
"query": {
"bool": {
"filter": [
{
"match": {
"displayNameEn": "james"
}
}
]
}
},
"from": 0,
"size": 20
}
POST nba/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"displayNameEn": "james"
}
}
],
"must_not": [
{
"term": {
"teamConferenceEn": {
"value": "Eastern"
}
}
}
]
}
},
"from": 0,
"size": 20
}
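Taken together: players named james who are definitely not in the Eastern conference
should: documents are returned even if this clause does not match; only the score differs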
POST nba/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"displayNameEn": "james"
}
}
],
"must_not": [
{
"term": {
"teamConferenceEn": {
"value": "Eastern"
}
}
}
],
"should": [
{
"range": {
"playYear": {
"gte": 11,
"lte": 20
}
}
}
]
}
},
"from": 0,
"size": 20
}
With minimum_should_match=1, the query becomes: Western-conference players named James who have played for 11 to 20 years
POST nba/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"displayNameEn": "james"
}
}
],
"must_not": [
{
"term": {
"teamConferenceEn": {
"value": "Eastern"
}
}
}
],
"should": [
{
"range": {
"playYear": {
"gte": 11,
"lte": 20
}
}
}
],
"minimum_should_match": 1
}
},
"from": 0,
"size": 20
}
minimum_should_match sets the minimum number of should clauses that must match. With minimum_should_match=1, at least one of the should clauses has to be satisfied.
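How must, must_not, should, and minimum_should_match combine can be modeled with a short Python sketch. It covers matching only and ignores scoring, so it shows which documents come back, not their order. The player records and all function names are hypothetical.

```python
def bool_match(doc, must=(), must_not=(), should=(), minimum_should_match=0):
    """Return True if doc satisfies the bool query (matching only, no scoring)."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched = sum(1 for clause in should if clause(doc))
    return matched >= minimum_should_match

def named_james(doc):
    return "james" in doc["displayNameEn"].lower().split()

def eastern(doc):
    return doc["teamConferenceEn"] == "Eastern"

def veteran(doc):
    return 11 <= doc["playYear"] <= 20

# Hypothetical sample players
players = [
    {"displayNameEn": "James A", "teamConferenceEn": "Western", "playYear": 16},
    {"displayNameEn": "James B", "teamConferenceEn": "Western", "playYear": 10},
    {"displayNameEn": "James C", "teamConferenceEn": "Eastern", "playYear": 15},
]

def run(minimum_should_match):
    return [p["displayNameEn"] for p in players
            if bool_match(p, must=[named_james], must_not=[eastern],
                          should=[veteran], minimum_should_match=minimum_should_match)]

print(run(0))  # ['James A', 'James B']
print(run(1))  # ['James A']
```

With minimum_should_match at 0 the should clause only influences ranking; raising it to 1 turns the clause into a requirement.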
POST nba/_search
{
"query": {
"match": {
"teamNameEn": "Rockets"
}
},
"sort": [
{
"playYear": {
"order": "desc"
}
}
],
"from": 0,
"size": 20
}
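Rockets players sorted by years played in descending order. Ties on years played (not age) are broken by height; the request below sorts height in ascending order, so use "order": "desc" to list taller players first.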
POST nba/_search
{
"query": {
"match": {
"teamNameEn": "Rockets"
}
},
"sort": [
{
"playYear": {
"order": "desc"
}
},{
"heightValue": {
"order": "asc"
}
}
],
"from": 0,
"size": 20
}
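A multi-key sort like the one above behaves the same way as sorting on a tuple in Python: the second key only matters when the first one ties. The sketch below mirrors the request (years played descending, then height ascending) on hypothetical data.

```python
# Hypothetical sample players
players = [
    {"name": "A", "playYear": 10, "heightValue": 1.96},
    {"name": "B", "playYear": 14, "heightValue": 1.98},
    {"name": "C", "playYear": 10, "heightValue": 2.06},
]

# Negating playYear gives descending order; heightValue stays ascending.
ranked = sorted(players, key=lambda p: (-p["playYear"], p["heightValue"]))
print([p["name"] for p in ranked])  # ['B', 'A', 'C']
```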
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"avgAge": {
"avg": {
"field": "age"
}
}
},
"size": 0
}
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"countPlayerYear": {
"value_count": {
"field": "playYear"
}
}
},
"size": 0
}
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
}
}
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"countAget": {
"cardinality": {
"field": "age"
}
}
},
"size": 0
}
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"statsAge": {
"stats": {
"field": "age"
}
}
},
"size": 0
}
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"extendStatsAge": {
"extended_stats": {
"field": "age"
}
}
},
"size": 0
}
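The metric aggregations used so far (avg, value_count, cardinality, stats, extended_stats) each reduce a set of field values to a few numbers. The Python sketch below computes the same quantities over a small list. The ages are hypothetical, the function name metric_aggs is made up, and the standard deviation is assumed to be the population form.

```python
import statistics

def metric_aggs(values):
    """Compute the numbers the metric aggregations report for a list of values."""
    return {
        "avg": sum(values) / len(values),
        "value_count": len(values),
        "cardinality": len(set(values)),  # exact here; ES approximates on large data sets
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "std_deviation": statistics.pstdev(values),  # assumption: population standard deviation
    }

ages = [22, 25, 25, 30]  # hypothetical ages of four players
result = metric_aggs(ages)
print(result["avg"], result["value_count"], result["cardinality"])  # 25.5 4 3
print(result["min"], result["max"], result["sum"])                  # 22 30 102
```

value_count counts every value, while cardinality counts distinct values, which is why they differ as soon as two players share an age.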
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"pecentAge": {
"percentiles": {
"field": "age"
}
}
},
"size": 0
}
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"pecentAge": {
"percentiles": {
"field": "age",
"percents": [
20,
50,
75
]
}
}
},
"size": 0
}
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"aggsAge": {
"terms": {
"field": "age",
"size": 10
}
}
},
"size": 0
}
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"aggsAge": {
"terms": {
"field": "age",
"size": 10,
"order": {
"_key": "desc"
}
}
}
},
"size": 0
}
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"aggsAge": {
"terms": {
"field": "age",
"size": 10,
"order": {
"_count": "desc"
}
}
}
},
"size": 0
}
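A terms aggregation groups documents into one bucket per distinct value and counts each bucket; the order option then decides how the buckets are listed. A rough Python equivalent using collections.Counter, on hypothetical ages:

```python
from collections import Counter

ages = [22, 22, 22, 25, 30, 30]  # hypothetical player ages
buckets = Counter(ages)          # one bucket per distinct age, with its document count

# Like "order": {"_key": "desc"}: largest age first
by_key_desc = sorted(buckets.items(), key=lambda kv: -kv[0])
# Like "order": {"_count": "desc"}: fullest bucket first (ties broken by key here)
by_count_desc = sorted(buckets.items(), key=lambda kv: (-kv[1], kv[0]))

print(by_key_desc)    # [(30, 2), (25, 1), (22, 3)]
print(by_count_desc)  # [(22, 3), (30, 2), (25, 1)]
```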
POST /nba/_search
{
"query": {
"term": {
"teamNameEn": {
"value": "Rockets"
}
}
},
"aggs": {
"avgAge": {
"avg": {
"field": "age"
}
}
},
"size": 0
}
POST /nba/_search
{
"aggs": {
"aggsTeamName": {
"terms": {
"field": "teamNameEn",
"include": [
"Lakers",
"Rockets",
"Warriors"
],
"exclude": [
"Warriors"
],
"size": 30,
"order": {
"avgAge": "desc"
}
},
"aggs": {
"avgAge": {
"avg": {
"field": "age"
}
}
}
}
},
"size": 0
}
POST /nba/_search
{
"aggs": {
"aggsTeamName": {
"terms": {
"field": "teamNameEn",
"include": "Lakers|Ro.*|Warriors.*",
"exclude": "Warriors",
"size": 30,
"order": {
"avgAge": "desc"
}
},
"aggs": {
"avgAge": {
"avg": {
"field": "age"
}
}
}
}
},
"size": 0
}
POST /nba/_search
{
"aggs": {
"ageRange": {
"range": {
"field": "age",
"ranges": [
{
"to": 20
},
{
"from": 20,
"to": 35
},
{
"from": 35
}
]
}
}
},
"size": 0
}
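In a range aggregation each bucket includes its from value and excludes its to value, so adjacent ranges that share a boundary do not double count. The Python sketch below models that rule. The ages, the range list, the bucket labels, and the function name range_buckets are all illustrative.

```python
def range_buckets(values, ranges):
    """Count values per range; 'from' is inclusive and 'to' is exclusive."""
    buckets = {}
    for r in ranges:
        low, high = r.get("from"), r.get("to")
        label = f"{'*' if low is None else low}-{'*' if high is None else high}"
        buckets[label] = sum(
            1 for v in values
            if (low is None or v >= low) and (high is None or v < high)
        )
    return buckets

ages = [19, 20, 27, 34, 35, 38]  # hypothetical ages
ranges = [{"to": 20}, {"from": 20, "to": 35}, {"from": 35}]
print(range_buckets(ages, ranges))  # {'*-20': 1, '20-35': 3, '35-*': 2}
```

The open-ended last range needs from rather than to; with to it would overlap the first two buckets instead of covering the oldest players.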
POST /nba/_search
{
"aggs": {
"birthDayRange": {
"date_range": {
"field": "birthDay",
"format": "MM-yyyy",
"ranges": [
{
"to": "01-1989"
},
{
"from": "01-1989",
"to": "01-1999"
},
{
"from": "01-1999",
"to": "01-2009"
},
{
"from": "01-2009"
}
]
}
}
},
"size": 0
}
POST /nba/_search
{
"aggs": {
"birthday_aggs": {
"date_histogram": {
"field": "birthDay",
"format": "yyyy",
"calendar_interval": "year"
}
}
},
"size": 0
}
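query_string query: if you know Lucene's query syntax, you can write a query string in that syntax directly. When ES receives the request, its query parser parses the string and builds the corresponding query.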
POST /nba/_search
{
"query": {
"query_string": {
"default_field": "displayNameEn",
"query": "james OR curry"
}
},
"size": 100
}
POST /nba/_search
{
"query": {
"query_string": {
"default_field": "displayNameEn",
"query": "james AND harden"
}
},
"size": 100
}
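As business requirements evolve during development, older business logic has to be updated or even rewritten. For ES, adapting to the new logic may mean changing an existing index, such as adjusting a field or even rebuilding the whole index. Doing so can disrupt the business and may even require downtime. ES provides index aliases to solve this. An index alias works like a shortcut or symbolic link: it can point to one or more indices and can be used in any API that expects an index name. Aliases give applications a great deal of flexibility.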
GET /nba/_alias
GET /_alias
POST /_aliases
{
"actions": [
{
"add": {
"index": "nba",
"alias": "nba_v1.0"
}
}
]
}
Method 1
POST /_aliases
{
"actions": [
{
"remove": {
"index": "nba",
"alias": "nba_v1.0"
}
}
]
}
Method 2
DELETE /nba/_alias/nba_v1.0
POST /_aliases
{
"actions": [
{
"remove": {
"index": "nba",
"alias": "nba_v1.0"
}
},
{
"add": {
"index": "nba",
"alias": "nba_v2.0"
}
}
]
}
POST /_aliases
{
"actions": [
{
"add": {
"index": "nba",
"alias": "nba_v2.0"
}
},{
"add": {
"index": "cba",
"alias": "cba_v2.0"
}
}
]
}
POST /_aliases
{
"actions": [
{
"add": {
"index": "nba",
"alias": "nba_v2.0"
}
},{
"add": {
"index": "nba",
"alias": "nba_v2.2"
}
}
]
}
GET /nba_v2.2
POST /nba_v2.0/_doc/566
{
"countryEn": "Croatia",
"teamName": "快船",
"birthDay": 858661200000,
"country": "克罗地亚",
"teamCityEn": "LA",
"code": "ivica_zubac",
"displayAffiliation": "Croatia",
"displayName": "伊维察 祖巴茨哥哥",
"schoolType": "",
"teamConference": "⻄部",
"teamConferenceEn": "Western",
"weight": "108.9 公⽄",
"teamCity": "洛杉矶",
"playYear": 3,
"jerseyNo": "40",
"teamNameEn": "Clippers",
"draft": 2016,
"displayNameEn": "Ivica Zubac",
"heightValue": 2.16,
"birthDayStr": "1997-03-18",
"position": "中锋",
"age": 22,
"playerId": "1627826"
}
POST /_aliases
{
"actions": [
{
"add": {
"index": "nba",
"alias": "national_player",
"is_write_index": true
}
},
{
"add": {
"index": "cba",
"alias": "national_player"
}
}
]
}
POST /national_player/_doc/566
{
"countryEn": "Croatia",
"teamName": "快船",
"birthDay": 858661200000,
"country": "克罗地亚",
"teamCityEn": "LA",
"code": "ivica_zubac",
"displayAffiliation": "Croatia",
"displayName": "伊维察 祖巴茨妹妹",
"schoolType": "",
"teamConference": "⻄部",
"teamConferenceEn": "Western",
"weight": "108.9 公⽄",
"teamCity": "洛杉矶",
"playYear": 3,
"jerseyNo": "40",
"teamNameEn": "Clippers",
"draft": 2016,
"displayNameEn": "Ivica Zubac",
"heightValue": 2.16,
"birthDayStr": "1997-03-18",
"position": "中锋",
"age": 22,
"playerId": "1627826"
}
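ElasticSearch is a real-time distributed search engine that provides search services to users. When we decide to store some kind of data, its structure has to be fully defined when the index is created, and from then on the index settings and many fixed configurations cannot be changed. When the data structure needs to change, the index has to be rebuilt, and the Elastic team provides a number of tools to help developers reindex.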
PUT /nba_20220810
{
"mappings": {
"properties": {
"age": {
"type": "integer"
},
"birthDay": {
"type": "date"
},
"birthDayStr": {
"type": "keyword"
},
"code": {
"type": "text"
},
"country": {
"type": "keyword"
},
"countryEn": {
"type": "keyword"
},
"displayAffiliation": {
"type": "text"
},
"displayName": {
"type": "text"
},
"displayNameEn": {
"type": "text"
},
"draft": {
"type": "long"
},
"heightValue": {
"type": "float"
},
"jerseyNo": {
"type": "keyword"
},
"playYear": {
"type": "long"
},
"playerId": {
"type": "keyword"
},
"position": {
"type": "text"
},
"schoolType": {
"type": "text"
},
"teamCity": {
"type": "text"
},
"teamCityEn": {
"type": "text"
},
"teamConference": {
"type": "keyword"
},
"teamConferenceEn": {
"type": "keyword"
},
"teamName": {
"type": "keyword"
},
"teamNameEn": {
"type": "keyword"
},
"weight": {
"type": "text"
}
}
}
}
POST /_reindex
{
"source": {
"index": "nba"
},
"dest": {
"index": "nba_20220810"
}
}
POST /_reindex?wait_for_completion=false
{
"source": {
"index": "nba"
},
"dest": {
"index": "nba_20220810"
}
}
POST /_aliases
{
"actions": [
{
"add": {
"index": "nba_20220810",
"alias": "nba_latest"
}
},
{
"remove": {
"index": "nba",
"alias": "nba_latest"
}
}
]
}
DELETE /nba
POST /nba_latest/_search
{
"query": {
"match": {
"displayNameEn": "james"
}
}
}
Ideally, new data would be searchable the moment it is added to an index, but that is not what actually happens.
We chain two requests: first add a document, then search immediately.
curl -X PUT 192.168.199.170:9200/star/_doc/888 -H 'Content-Type: application/json' -d '{ "displayName": "蔡徐坤" }'
curl -X GET localhost:9200/star/_doc/_search?pretty
Force a refresh
curl -X PUT 192.168.199.170:9200/star/_doc/666?refresh -H 'Content-Type: application/json' -d '{ "displayName": "杨超越" }'
curl -X GET localhost:9200/star/_doc/_search?pretty
Change the refresh interval (the default is 1s)
PUT /star/_settings
{
"index": {
"refresh_interval": "5s"
}
}
Disable refresh
PUT /star/_settings
{
"index": {
"refresh_interval": "-1"
}
}
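When a result set contains many hits, how can we spot the part we care about at a glance? For example, when we search for "Kobe", how do we highlight every occurrence of "Kobe" in the results?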
POST /nba_latest/_search
{
"query": {
"match": {
"displayNameEn": "james"
}
},
"highlight": {
"fields": {
"displayNameEn": {}
}
}
}
POST /nba_latest/_search
{
"query": {
"match": {
"displayNameEn": "james"
}
},
"highlight": {
"fields": {
"displayNameEn": {
"pre_tags": [
"
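Suggesters exist to give users a better search experience. They cover term correction and autocompletion.
Option | Description
text | The text to get suggestions for
field | The field to fetch candidate suggestions from
analyzer | The analyzer to use
size | Maximum number of suggestions returned per term
sort | How suggestions are sorted. Options: score (by score first, then document frequency, then the term itself); frequency (by document frequency first, then score, then the term itself)
suggest_mode | Controls which suggestions are offered. Options: missing (suggest only for terms that are not in the index; the default); popular (suggest only terms with a higher document frequency than the search term); always (suggest any matching term)
The term suggester analyzes the input text into terms and offers suggestions for each term.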
POST /nba_latest/_search
{
"suggest": {
"my-suggestion": {
"text": "jamse hardne",
"term": {
"suggest_mode": "missing",
"field": "displayNameEn"
}
}
}
}
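The idea behind a term suggester can be sketched with edit distance: for each input term that is missing from the index vocabulary, propose known terms within a small number of edits. The Python version below uses plain Levenshtein distance as a stand-in and is not ElasticSearch's actual algorithm. The vocabulary, the max_edits default of 2, and the function names are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum inserts, deletes, and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def suggest(text, vocabulary, max_edits=2):
    """Suggest close vocabulary terms for each input term, in the spirit of 'missing' mode."""
    suggestions = {}
    for token in text.lower().split():
        if token in vocabulary:
            suggestions[token] = []  # the term exists, so nothing is suggested
            continue
        scored = sorted((edit_distance(token, term), term) for term in vocabulary)
        suggestions[token] = [term for dist, term in scored if dist <= max_edits]
    return suggestions

vocabulary = {"james", "harden", "curry", "jordan"}  # hypothetical indexed terms
print(suggest("jamse hardne", vocabulary))  # {'jamse': ['james'], 'hardne': ['harden']}
```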
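The phrase suggester builds on the term suggester and also considers how the terms relate to one another, such as whether they occur together in the indexed text, how close together they are, and their frequencies.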
POST /nba_latest/_search
{
"suggest": {
"my-suggestion": {
"text": "jamse harden",
"phrase": {
"field": "displayNameEn"
}
}
}
}
Completion suggester (autocomplete). The target field must be mapped with type completion.
POST /nba_latest/_search
{
"suggest": {
"my-suggestion": {
"text": "Miam",
"completion": {
"field": "teamCityEn"
}
}
}
}
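Conceptually, a completion suggester answers "which stored values start with this prefix?" very quickly. The Python sketch below shows the same idea with a sorted list and binary search. It is a simplification and not the data structure ElasticSearch uses; the city list and the function name complete are hypothetical, and matching here is case sensitive.

```python
import bisect

def complete(prefix, sorted_terms, size=5):
    """Return up to `size` terms that start with prefix, using binary search on a sorted list."""
    start = bisect.bisect_left(sorted_terms, prefix)  # first term >= prefix
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # sorted order means no later term can match
        matches.append(term)
        if len(matches) == size:
            break
    return matches

cities = sorted(["Miami", "Milwaukee", "Minnesota", "Memphis"])  # hypothetical values
print(complete("Miam", cities))  # ['Miami']
print(complete("Mi", cities))    # ['Miami', 'Milwaukee', 'Minnesota']
```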