Elasticsearch
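Elasticsearch is an open-source search engine built on Apache Lucene. Lucene is arguably the most advanced, best-performing, and most feature-complete search engine library available, open source or proprietary, but it is also very complex. Elasticsearch hides that complexity behind a RESTful API and makes search much simpler. Elasticsearch is more than a search engine, though: it is also a document-oriented NoSQL store. It can be described as:
- A distributed, real-time document store in which every field can be indexed and searched
- A distributed, real-time analytics and search engine
- Able to scale out to clusters of hundreds of nodes and handle petabytes of structured or unstructured data
All of this is packaged into a single service that is called through a RESTful API, so it can be used from any programming language.
Basic components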
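- Index: a container for documents. In other words, an index is a collection of documents that share similar properties, comparable to a table. Index names must be lowercase. By default (in the 5.x release used here) an index has 5 primary shards, and each primary shard has one replica
- Type: a logical partition of an index whose meaning is entirely up to the user. One or more types can be defined within an index; generally, a type is a predefined set of documents that share the same fields
- Document: the atomic unit of indexing and search in Lucene. It is a container for one or more fields and is represented as JSON
- Mapping: before raw content is stored as a document it has to be analyzed, for example tokenized and filtered for certain words; the mapping defines how that analysis is carried out. ES mappings also provide other features, such as sorting on a field's contents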
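ES cluster components
- Cluster: an ES cluster is identified by its cluster name, 'elasticsearch' by default. A node uses this name to decide which cluster to join, and a node can belong to only one cluster.
- Node: a host running a single ES instance. It stores data and takes part in the cluster's indexing and search operations. A node is identified by its node name.
- Shard: an index is split into shards, its physical storage units, and each shard is itself a complete, independent index. When an index is created, ES splits it into 5 shards by default; the number can be set as needed at creation time but cannot be changed afterwards. There are two kinds of shard: primary shards and replicas. Replicas provide data redundancy and load balancing for queries; the number of replicas per primary shard is configurable and can be changed dynamically
When an ES cluster starts, each node looks for the other nodes of the same cluster on 9300/tcp by unicast (older releases also supported multicast) and communicates with them. The nodes elect one master node, which manages the state of the whole cluster and decides how shards are distributed across it. From the user's point of view, every node accepts and responds to all kinds of requests.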
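ES cluster status:
- green: all primary shards and replicas are available
- yellow: all primary shards are available, but not all replicas are
- red: not all primary shards are available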
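The three states can be read as a simple rule over shard availability. The sketch below is a toy model in Python, not ES code: the `Shard` record and the `cluster_status` function are names invented here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    """Hypothetical record describing one shard copy."""
    index: str
    number: int
    primary: bool
    assigned: bool  # True if the copy is allocated to a node and usable


def cluster_status(shards):
    """Return 'green', 'yellow' or 'red' following the rules listed above."""
    if any(s.primary and not s.assigned for s in shards):
        return "red"      # at least one primary shard is unavailable
    if any(not s.primary and not s.assigned for s in shards):
        return "yellow"   # all primaries are fine, but some replica is not
    return "green"        # every primary and every replica is available


shards = [
    Shard("students", 0, primary=True, assigned=True),
    Shard("students", 0, primary=False, assigned=False),
]
print(cluster_status(shards))  # yellow
```

A cluster with no shards at all falls through to green in this model.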
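Inverted index
The inverted index is a key concept in Lucene and a major reason ES can retrieve content so quickly. It comes from the practical need to look up records by the value of an attribute: each entry in the index holds an attribute value together with the locations of the records that have that value. Because the attribute value determines the record, rather than the record determining the attribute value, it is called an inverted index.
When a piece of data is stored, Lucene first tokenizes it, and each resulting token can be treated as an attribute of that data. The data is stored as pairs of token and document, and a search looks up the token to find the matching documents. Matches are also weighted: the better the match, the higher the document ranks in the results, much like the search engines we use every day. That is the basic idea of an inverted index.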
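The idea described above can be sketched in a few lines of Python. This is a deliberately simplified illustration: the tokenizer only lowercases and splits on whitespace, and the "score" is just the number of matching query tokens, which is far cruder than Lucene's real relevance scoring. The sample documents are made up.

```python
from collections import defaultdict


def tokenize(text):
    """Very rough stand-in for Lucene analysis: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each token to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many query tokens they contain (a crude score)."""
    scores = defaultdict(int)
    for token in tokenize(query):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    # Highest score first; ties are broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


docs = {
    1: "Zhang Guoxing studies English",
    2: "Jack studies Math",
    3: "Zhang likes Math",
}
index = build_inverted_index(docs)
print(sorted(index["math"]))        # [2, 3]
print(search(index, "zhang math"))  # [3, 1, 2]
```

Document 3 ranks first because it contains both query tokens, while documents 1 and 2 contain one each.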
Installing Elasticsearch
Elasticsearch is written in Java, so a Java runtime (JDK) must be installed first, either OpenJDK or Oracle JDK. The 5.x release used here requires at least Java 8 (JDK 1.8).
    ~]# yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel
    ~]# wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.0.rpm
    ~]# rpm -ivh elasticsearch-5.5.0.rpm

Elasticsearch is demanding on system resources, so some default system parameters need to be changed:

Problem 1:

    java.lang.UnsupportedOperationException: seccomp unavailable: CONFIG_SECCOMP not compiled into kernel, CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER are needed
        at org.elasticsearch.bootstrap.SystemCallFilter.linuxImpl(SystemCallFilter.java:363) ~[elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.SystemCallFilter.init(SystemCallFilter.java:638) ~[elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.JNANatives.tryInstallSystemCallFilter(JNANatives.java:215) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.Natives.tryInstallSystemCallFilter(Natives.java:99) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:111) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:351) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:123) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:114) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:67) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.cli.Command.main(Command.java:88) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:91) [elasticsearch-5.5.0.jar:5.5.0]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:84) [elasticsearch-5.5.0.jar:5.5.0]

This is only a warning. It goes away with a newer kernel and does not affect normal use.

Problem 2:

    max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
    ~]# vim /etc/security/limits.conf
    * soft nofile 65536
    * hard nofile 65536

Problem 3:

    max number of threads [1024] for user [elasticsearch] is too low, increase to at least [2048]
    ~]# vim /etc/security/limits.d/90-nproc.conf
    *          soft    nproc     2048
    *          hard    nproc     2048

Problem 4:

    max virtual memory areas vm.max_map_count [65530] likely too low, increase to at least [262144]
    ~]# vim /etc/sysctl.conf
    vm.max_map_count=655360
    ~]# sysctl -p

Problem 5:

    system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
    ~]# vim /etc/elasticsearch/elasticsearch.yml
    bootstrap.system_call_filter: false
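The limits behind problems 2 to 4 can be checked before Elasticsearch is started. The following is a minimal sketch, assuming a Linux host; the threshold values are taken from the bootstrap messages quoted above, while the helper names (`is_enough`, `current_limits`, `too_low`) are invented for this example. Note that `current_limits()` reports the limits of the process that runs it, so it would have to be run as the elasticsearch user to be meaningful.

```python
import resource

# Minimum values demanded by the bootstrap checks quoted above.
REQUIRED = {"nofile": 65536, "nproc": 2048, "max_map_count": 262144}


def is_enough(current, required):
    """True if a limit satisfies the requirement; RLIM_INFINITY means unlimited."""
    return current == resource.RLIM_INFINITY or current >= required


def current_limits():
    """Read the soft limits of the current process and vm.max_map_count (Linux)."""
    limits = {
        "nofile": resource.getrlimit(resource.RLIMIT_NOFILE)[0],
        "nproc": resource.getrlimit(resource.RLIMIT_NPROC)[0],
    }
    try:
        with open("/proc/sys/vm/max_map_count") as f:
            limits["max_map_count"] = int(f.read())
    except OSError:
        pass  # not readable here, so it is left out of the report
    return limits


def too_low(limits):
    """Return the names of the limits that are below the required values."""
    return sorted(
        name for name, value in limits.items()
        if not is_enough(value, REQUIRED[name])
    )


# The values reported in the error messages above all fail the check.
print(too_low({"nofile": 4096, "nproc": 1024, "max_map_count": 65530}))
# ['max_map_count', 'nofile', 'nproc']
```

On a real host, `too_low(current_limits())` would list whatever still needs raising.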
elasticsearch.yml configuration:

    ~]# cat /etc/elasticsearch/elasticsearch.yml
    cluster.name: MyES    # cluster name; nodes of the same cluster are recognized by sharing it
    node.name: node1      # node name
    #node.attr.rack: r1   # custom attribute attached to the node
    path.data: /data/elastic  # directory where data is stored
    path.logs: /data/elastic/log  # log directory
    network.host: 0.0.0.0    # IP address to bind to
    http.port: 9200       # port of the RESTful API
    transport.tcp.port: 9300  # port used for communication between cluster nodes
    discovery.zen.ping.unicast.hosts: ["10.211.55.48", "10.211.55.49"]  # hosts pinged by unicast to discover live nodes
    discovery.zen.minimum_master_nodes: 2    # when the cluster is partitioned and elects a new master, the election needs (total nodes / 2) + 1 votes, so a cluster should normally have an odd number of nodes; two are used here only as an experiment
    gateway.recover_after_nodes: 2    # when the cluster recovers or restarts, recovery begins only once at least this many nodes are up
    action.destructive_requires_name: true  # deleting an index requires its exact name
    ~]# service elasticsearch start
    ~]# tail -f /data/elastic/log/MyES.log
    [2017-05-16T12:27:12,599][WARN ][o.e.d.z.ZenDiscovery     ] [node2] not enough master nodes discovered during pinging (found [[Candidate{node={node2}{HYFYyQ31QmatkqJXoKsNCw}{ikvdDqarTHm-aWCX_89CcQ}{10.211.55.49}{10.211.55.49:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
    [2017-05-16T12:27:15,600][WARN ][o.e.d.z.ZenDiscovery     ] [node2] not enough master nodes discovered during pinging (found [[Candidate{node={node2}{HYFYyQ31QmatkqJXoKsNCw}{ikvdDqarTHm-aWCX_89CcQ}{10.211.55.49}{10.211.55.49:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
    [2017-05-16T12:27:18,441][WARN ][o.e.n.Node               ] [node2] timed out while waiting for initial discovery state - timeout: 30s
    [2017-05-16T12:27:18,460][INFO ][o.e.h.n.Netty4HttpServerTransport] [node2] publish_address {10.211.55.49:9200}, bound_addresses {[::]:9200}
    [2017-05-16T12:27:18,460][INFO ][o.e.n.Node               ] [node2] started
    [2017-05-16T12:27:18,602][WARN ][o.e.d.z.ZenDiscovery     ] [node2] not enough master nodes discovered during pinging (found [[Candidate{node={node2}{HYFYyQ31QmatkqJXoKsNCw}{ikvdDqarTHm-aWCX_89CcQ}{10.211.55.49}{10.211.55.49:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
    [2017-05-16T12:27:21,604][WARN ][o.e.d.z.ZenDiscovery     ] [node2] not enough master nodes discovered during pinging (found [[Candidate{node={node2}{HYFYyQ31QmatkqJXoKsNCw}{ikvdDqarTHm-aWCX_89CcQ}{10.211.55.49}{10.211.55.49:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
    [2017-05-16T12:27:24,606][WARN ][o.e.d.z.ZenDiscovery     ] [node2] not enough master nodes discovered during pinging (found [[Candidate{node={node2}{HYFYyQ31QmatkqJXoKsNCw}{ikvdDqarTHm-aWCX_89CcQ}{10.211.55.49}{10.211.55.49:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
    [2017-05-16T12:27:35,256][INFO ][o.e.c.s.ClusterService   ] [node2] new_master {node2}{HYFYyQ31QmatkqJXoKsNCw}{ikvdDqarTHm-aWCX_89CcQ}{10.211.55.49}{10.211.55.49:9300}, added {{node1}{HrlO474CRxK0XJv_0w4cvg}{PlIc8KhdRKKTmWTK5bRfhQ}{10.211.55.48}{10.211.55.48:9300},}, reason: zen-disco-elected-as-master ([1] nodes joined)[{node1}{HrlO474CRxK0XJv_0w4cvg}{PlIc8KhdRKKTmWTK5bRfhQ}{10.211.55.48}{10.211.55.48:9300}]
    [2017-05-16T12:27:35,373][INFO ][o.e.g.GatewayService     ] [node2] recovered [0] indices into cluster_state
    ~]# netstat -ntlp
    Active Internet connections (only servers)
    Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
    tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      2504/sshd
    tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      2654/master
    tcp        0      0 :::9200                     :::*                        LISTEN      14495/java
    tcp        0      0 :::9300                     :::*                        LISTEN      14495/java
    tcp        0      0 :::22                       :::*                        LISTEN      2504/sshd
    tcp        0      0 ::1:25                      :::*                        LISTEN      2654/master
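The quorum rule behind `discovery.zen.minimum_master_nodes` (total nodes / 2 + 1, as noted in the configuration) is easy to tabulate. The sketch below assumes that every node counted is master-eligible; the function names are made up for illustration.

```python
def minimum_master_nodes(master_eligible):
    """Quorum: a strict majority of the master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


def tolerated_failures(master_eligible):
    """How many master-eligible nodes can be lost while a quorum still exists."""
    return master_eligible - minimum_master_nodes(master_eligible)


for n in (2, 3, 4, 5):
    print(n, minimum_master_nodes(n), tolerated_failures(n))
# 2 2 0
# 3 2 1
# 4 3 1
# 5 3 2
```

The table shows why an odd node count is preferred: going from 3 to 4 nodes raises the quorum without tolerating any additional failure, and the two-node setup used here cannot lose a single node.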
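RESTful API
Four categories of API:
- (1) Check the health of the cluster, nodes, and indices, and retrieve their status
- (2) Manage the cluster, nodes, indices, and metadata
- (3) Perform CRUD operations
- (4) Perform advanced operations such as paging and filtering
ES access endpoint: TCP/9200

    curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'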
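The same request template can be assembled programmatically. Below is a small sketch using only the Python standard library; `build_request` is a helper name invented here, and the request is only constructed, not sent, so no running cluster is needed to try it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_request(verb, host, port, path, params=None, body=None, protocol="http"):
    """Build (but do not send) a request shaped like the curl template above."""
    url = f"{protocol}://{host}:{port}/{path.lstrip('/')}"
    if params:
        url += "?" + urlencode(params)
    # The body, if any, is sent as JSON.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(url, data=data, headers=headers, method=verb)


req = build_request("GET", "10.211.55.48", 9200, "_cluster/health", {"pretty": "true"})
print(req.get_method(), req.full_url)
# GET http://10.211.55.48:9200/_cluster/health?pretty=true
```

Passing the returned object to `urllib.request.urlopen` would actually send it.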
First, check the status of the cluster and its nodes:

    ~]# curl '10.211.55.49:9200/'
    {
      "name" : "node2",
      "cluster_name" : "MyES",
      "cluster_uuid" : "VvFCdamHRJWoX8NJIK76Qw",
      "version" : {
        "number" : "5.5.0",
        "build_hash" : "260387d",
        "build_date" : "2017-06-30T23:16:05.735Z",
        "build_snapshot" : false,
        "lucene_version" : "6.6.0"
      },
      "tagline" : "You Know, for Search"
    }
    ~]# curl '10.211.55.48:9200/'
    {
      "name" : "node1",
      "cluster_name" : "MyES",
      "cluster_uuid" : "VvFCdamHRJWoX8NJIK76Qw",
      "version" : {
        "number" : "5.5.0",
        "build_hash" : "260387d",
        "build_date" : "2017-06-30T23:16:05.735Z",
        "build_snapshot" : false,
        "lucene_version" : "6.6.0"
      },
      "tagline" : "You Know, for Search"
    }

Both nodes belong to the MyES cluster. As the tagline in the response says, "You Know, for Search": this is a cluster built for searching large volumes of data.

    ~]# curl -XGET "http://10.211.55.48:9200/_cluster/health?pretty"
    {
      "cluster_name" : "MyES",
      "status" : "green",
      "timed_out" : false,
      "number_of_nodes" : 2,
      "number_of_data_nodes" : 2,
      "active_primary_shards" : 0,
      "active_shards" : 0,
      "relocating_shards" : 0,
      "initializing_shards" : 0,
      "unassigned_shards" : 0,
      "delayed_unassigned_shards" : 0,
      "number_of_pending_tasks" : 0,
      "number_of_in_flight_fetch" : 0,
      "task_max_waiting_in_queue_millis" : 0,
      "active_shards_percent_as_number" : 100.0
    }

The cluster is green, which means all primary shards and replicas are available.

    ~]# curl -XGET "http://10.211.55.48:9200/_cluster/state?pretty"
    {
      "cluster_name" : "MyES",
      "version" : 2,
      "state_uuid" : "Twy26y7dTtqilrjUEmDalQ",
      "master_node" : "HYFYyQ31QmatkqJXoKsNCw",
      "blocks" : { },
      "nodes" : {
        "HrlO474CRxK0XJv_0w4cvg" : {
          "name" : "node1",
          "ephemeral_id" : "PlIc8KhdRKKTmWTK5bRfhQ",
          "transport_address" : "10.211.55.48:9300",
          "attributes" : { }
        },
        "HYFYyQ31QmatkqJXoKsNCw" : {
          "name" : "node2",
          "ephemeral_id" : "ikvdDqarTHm-aWCX_89CcQ",
          "transport_address" : "10.211.55.49:9300",
          "attributes" : { }
        }
      },
      "metadata" : {
        "cluster_uuid" : "VvFCdamHRJWoX8NJIK76Qw",
        "templates" : { },
        "indices" : { },
        "index-graveyard" : {
          "tombstones" : [ ]
        }
      },
      "routing_table" : {
        "indices" : { }
      },
      "routing_nodes" : {
        "unassigned" : [ ],
        "nodes" : {
          "HrlO474CRxK0XJv_0w4cvg" : [ ],
          "HYFYyQ31QmatkqJXoKsNCw" : [ ]
        }
      }
    }

This is the cluster state information.

    ~]# curl "10.211.55.49:9200/_nodes/node1/state?pretty"
    {
      "_nodes" : {
        "total" : 1,
        "successful" : 1,
        "failed" : 0
      },
      "cluster_name" : "MyES",
      "nodes" : {
        "HrlO474CRxK0XJv_0w4cvg" : {
          "name" : "node1",
          "transport_address" : "10.211.55.48:9300",
          "host" : "10.211.55.48",
          "ip" : "10.211.55.48",
          "version" : "5.5.0",
          "build_hash" : "260387d",
          "roles" : [
            "master",
            "data",
            "ingest"
          ]
        }
      }
    }

If raw JSON is hard to read, ES also provides the _cat API:

    ~]# curl -XGET "http://10.211.55.48:9200/_cat"
    =^.^=
    /_cat/allocation
    /_cat/shards
    /_cat/shards/{index}
    /_cat/master
    /_cat/nodes
    /_cat/tasks
    /_cat/indices
    /_cat/indices/{index}
    /_cat/segments
    /_cat/segments/{index}
    /_cat/count
    /_cat/count/{index}
    /_cat/recovery
    /_cat/recovery/{index}
    /_cat/health
    /_cat/pending_tasks
    /_cat/aliases
    /_cat/aliases/{alias}
    /_cat/thread_pool
    /_cat/thread_pool/{thread_pools}
    /_cat/plugins
    /_cat/fielddata
    /_cat/fielddata/{fields}
    /_cat/nodeattrs
    /_cat/repositories
    /_cat/snapshots/{repository}
    /_cat/templates

Called on its own, the _cat API lists the endpoints it offers.

    ~]# curl -XGET "http://10.211.55.48:9200/_cat/nodes?v"
    ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
    10.211.55.48            5          94   0    0.35    0.31     0.34 mdi       -      node1
    10.211.55.49            7          94   0    0.08    0.30     0.34 mdi       *      node2

The master column shows that node2 is the cluster's master node.
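Because `_cat` output with `?v` is a plain header row followed by whitespace-separated columns, it is easy to post-process in a script. The sketch below parses the `_cat/nodes?v` sample shown above; `parse_cat` is a name made up here, and it assumes that no column value contains spaces, which holds for this sample but is not guaranteed for every `_cat` endpoint.

```python
def parse_cat(text):
    """Turn the whitespace-aligned output of a `_cat/...?v` call into dicts."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    header = lines[0].split()
    # Each remaining line becomes one dict keyed by the header names.
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Sample copied from the _cat/nodes?v output above.
sample = """
ip           heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.211.55.48            5          94   0    0.35    0.31     0.34 mdi       -      node1
10.211.55.49            7          94   0    0.08    0.30     0.34 mdi       *      node2
"""

nodes = parse_cat(sample)
master = next(n["name"] for n in nodes if n["master"] == "*")
print(master)  # node2
```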
Plugin
Many ES features are provided through extensions, and some plugins are all but essential. Commonly used plugins include:
- marvel
- bigdesk
- head
- kopf
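These are all site plugins, which let you manage the ES cluster directly from a web page. Note that site plugins are no longer supported as of ES 5.0, so with the 5.5.0 install used here such tools have to be run as standalone applications instead.
So how is a plugin installed?
- Place the plugin directly in the plugins directory: /usr/share/elasticsearch/plugins
- Install it with elasticsearch-plugin: /usr/share/elasticsearch/bin/elasticsearch-plugin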
CRUD
- Create

    ~]# curl -XPUT "10.211.55.48:9200/students/class1/1?pretty" -d '
    {
      "name": "jusene",
      "age": 25,
      "class": "English"
    }'
    {
      "_index" : "students",
      "_type" : "class1",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      },
      "created" : true
    }
    ~]# curl -XPUT "10.211.55.48:9200/students/class2/1?pretty" -d '
    {
      "name": "jack",
      "age": 24,
      "class": "Math"
    }'
    {
      "_index" : "students",
      "_type" : "class2",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      },
      "created" : true
    }

- Read

    ~]# curl -XGET "10.211.55.48:9200/students/class1/1?pretty"
    {
      "_index" : "students",
      "_type" : "class1",
      "_id" : "1",
      "_version" : 1,
      "found" : true,
      "_source" : {
        "name" : "jusene",
        "age" : 25,
        "class" : "English"
      }
    }

- Update

    ~]# curl -XPOST "10.211.55.48:9200/students/class1/1/_update?pretty" -d '{"doc": {"age": 26}}'
    {
      "_index" : "students",
      "_type" : "class1",
      "_id" : "1",
      "_version" : 2,
      "result" : "updated",
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      }
    }
    ~]# curl -XGET "10.211.55.48:9200/students/class1/1?pretty"
    {
      "_index" : "students",
      "_type" : "class1",
      "_id" : "1",
      "_version" : 2,
      "found" : true,
      "_source" : {
        "name" : "jusene",
        "age" : 26,
        "class" : "English"
      }
    }

  Note: using PUT here would overwrite the whole document instead.

- Delete

    ~]# curl -XDELETE '10.211.55.48:9200/students/class1/1?pretty'
    {
      "found" : true,
      "_index" : "students",
      "_type" : "class1",
      "_id" : "1",
      "_version" : 3,
      "result" : "deleted",
      "_shards" : {
        "total" : 2,
        "successful" : 2,
        "failed" : 0
      }
    }
    ~]# curl -XGET "10.211.55.48:9200/students/class1/1?pretty"
    {
      "_index" : "students",
      "_type" : "class1",
      "_id" : "1",
      "found" : false
    }

  A type or a whole index is deleted the same way (ES 2.0 and later no longer allow deleting a type on its own, so on 5.x only the index-level delete is expected to work):

    ~]# curl -XDELETE '10.211.55.48:9200/students/class1?pretty'
    ~]# curl -XDELETE '10.211.55.48:9200/students?pretty'
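The difference between a partial update and a PUT, and the way `_version` increases with every write, can be mimicked with a tiny in-memory model. This is only a teaching sketch: `ToyIndex` is an invented class, it does a shallow merge only, and it says nothing about how ES actually stores or versions documents internally.

```python
class ToyIndex:
    """In-memory stand-in for one index; NOT how ES stores documents."""

    def __init__(self):
        self._docs = {}      # (type, id) -> source dict
        self._versions = {}  # (type, id) -> last version number

    def _bump(self, key):
        """Advance and return the version counter for one document."""
        self._versions[key] = self._versions.get(key, 0) + 1
        return self._versions[key]

    def put(self, doc_type, doc_id, source):
        """PUT: create the document, or replace it entirely if it exists."""
        key = (doc_type, doc_id)
        result = "updated" if key in self._docs else "created"
        self._docs[key] = dict(source)
        return {"_version": self._bump(key), "result": result}

    def update(self, doc_type, doc_id, partial):
        """POST .../_update with a "doc" body: merge fields into the source."""
        key = (doc_type, doc_id)
        self._docs[key].update(partial)
        return {"_version": self._bump(key), "result": "updated"}

    def get(self, doc_type, doc_id):
        key = (doc_type, doc_id)
        if key not in self._docs:
            return {"found": False}
        return {"found": True, "_version": self._versions[key],
                "_source": self._docs[key]}

    def delete(self, doc_type, doc_id):
        """DELETE: remove the document; the version still moves forward."""
        key = (doc_type, doc_id)
        found = self._docs.pop(key, None) is not None
        return {"found": found, "_version": self._bump(key)}


idx = ToyIndex()
print(idx.put("class1", "1", {"name": "jusene", "age": 25}))  # {'_version': 1, 'result': 'created'}
print(idx.update("class1", "1", {"age": 26}))                 # {'_version': 2, 'result': 'updated'}
print(idx.get("class1", "1")["_source"])                      # {'name': 'jusene', 'age': 26}
print(idx.put("class1", "1", {"age": 27}))                    # {'_version': 3, 'result': 'updated'}
print(idx.get("class1", "1")["_source"])                      # {'age': 27}
```

After the second `put`, the `name` field is gone, which is the overwrite behaviour the note above warns about.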
Querying data
Query API:
- Query DSL: a JSON-based language for building complex queries
  It supports many kinds of query, such as simple term, phrase, range, boolean, and fuzzy queries
- Multi-index, multi-type search
Multi-index, multi-type search

    /_search: all indices
    /INDEX_NAME/_search: a single index
    /INDEX1,INDEX2/_search: multiple indices
    /s*,t*/_search: all indices whose names start with s or t
    /students/class1/_search: a single type
    /students/class1,class2/_search: multiple types
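The path patterns above follow one rule: an optional comma-separated index list, then an optional comma-separated type list, then `_search`. A small helper makes that explicit. `search_path` is a name invented for this sketch; it relies on `_all` being accepted as a stand-in for every index when only types are given.

```python
def search_path(indices=None, types=None):
    """Build the `_search` path for any mix of indices and types."""
    parts = []
    if indices:
        parts.append(",".join(indices))
    if types:
        if not indices:
            # A type is addressed under an index; `_all` stands for every index.
            parts.append("_all")
        parts.append(",".join(types))
    parts.append("_search")
    return "/" + "/".join(parts)


print(search_path())                                    # /_search
print(search_path(["students"]))                        # /students/_search
print(search_path(["s*", "t*"]))                        # /s*,t*/_search
print(search_path(["students"], ["class1", "class2"]))  # /students/class1,class2/_search
```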
For every document, ES takes all the values of all its fields and builds a field named _all from them. If a query_string query does not name a field, the query runs against the _all field.
For example:

    GET /_search?q='zgx'
    GET /_search?q='zhang%20guoxing'
    GET /_search?q=name:'zgx'
    GET /_search?q=name:'zhang%20guoxing'

    ~]# curl "10.211.55.48:9200/_search?q='zgx'&pretty"
    {
      "took" : 74,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : 0.17225473,
        "hits" : [
          {
            "_index" : "students",
            "_type" : "class1",
            "_id" : "4",
            "_score" : 0.17225473,
            "_source" : {
              "name" : "zgx",
              "age" : 25,
              "class" : "English"
            }
          },
          {
            "_index" : "students",
            "_type" : "class1",
            "_id" : "6",
            "_score" : 0.17225473,
            "_source" : {
              "name" : "zhang guoxing",
              "age" : 25,
              "desc" : "zgx"
            }
          }
        ]
      }
    }

Each hit also carries a _score that rates how well it matches the search.

    ~]# curl "10.211.55.48:9200/_search?q='zhang%20guoxing'&pretty"
    {
      "took" : 12,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : 1.3097504,
        "hits" : [
          {
            "_index" : "students",
            "_type" : "class1",
            "_id" : "6",
            "_score" : 1.3097504,
            "_source" : {
              "name" : "zhang guoxing",
              "age" : 25,
              "desc" : "zgx"
            }
          },
          {
            "_index" : "students",
            "_type" : "class1",
            "_id" : "5",
            "_score" : 0.5753642,
            "_source" : {
              "name" : "zhang guoxing",
              "age" : 25,
              "class" : "English"
            }
          }
        ]
      }
    }

    ~]# curl "10.211.55.48:9200/_search?q=name:'zhang%20guoxing'&pretty"
    {
      "took" : 5,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : 0.6548752,
        "hits" : [
          {
            "_index" : "students",
            "_type" : "class1",
            "_id" : "6",
            "_score" : 0.6548752,
            "_source" : {
              "name" : "zhang guoxing",
              "age" : 25,
              "desc" : "zgx"
            }
          },
          {
            "_index" : "students",
            "_type" : "class1",
            "_id" : "5",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "zhang guoxing",
              "age" : 25,
              "class" : "English"
            }
          }
        ]
      }
    }

    ~]# curl "10.211.55.48:9200/_search?q=name:'zgx'&pretty"
    {
      "took" : 27,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 1,
        "max_score" : 0.80259144,
        "hits" : [
          {
            "_index" : "students",
            "_type" : "class1",
            "_id" : "4",
            "_score" : 0.80259144,
            "_source" : {
              "name" : "zgx",
              "age" : 25,
              "class" : "English"
            }
          }
        ]
      }
    }

The first two forms search the _all field.
The last two search one specific field (name).
Data types: string, number, boolean, dates
To see the mapping of a type:
    ~]# curl "10.211.55.48:9200/students/_mapping/class1?pretty"
    {
      "students" : {
        "mappings" : {
          "class1" : {
            "properties" : {
              "age" : {
                "type" : "long"
              },
              "class" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "desc" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              },
              "name" : {
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          }
        }
      }
    }

This shows how each field in the type is mapped.
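The data searched in ES broadly falls into two categories:
- exact values: raw, unprocessed values that are matched exactly at search time, much like a SQL statement
- full-text: data held in natural-language text; the search judges how well each document matches the query, that is, it evaluates the relevance of the document to the user's query. This is where ES is at its most powerful
To support full-text search, ES must first analyze the text and build an inverted index. The data in the inverted index also has to be normalized, for example lowercased. Different analyzers apply different rules, so analyzing text with different analyzers gives different search results.
This whole process is called analysis. In Lucene terms, analysis is the tokenization and normalization that go into building the inverted index. Analysis is carried out by an analyzer, which is made up of three components: character filters, a tokenizer, and token filters. Analyzers built into ES:
- Standard analyzer
- Simple analyzer
- Whitespace analyzer
- Language analyzer
Analyzers are used not only when an index is built but also when a query is built. If the analyzer used at query time differs from the one used at index time, the results may not be what you expect.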
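The three-stage structure of an analyzer (character filters, then a tokenizer, then token filters) can be sketched as a chain of plain functions. Everything below is illustrative: the component functions are simplified stand-ins written for this example and only loosely resemble the behaviour of the built-in ES analyzers.

```python
import re


def strip_html(text):
    """Character filter: remove anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


def whitespace_tokenizer(text):
    """Tokenizer: split on runs of whitespace only."""
    return text.split()


def word_tokenizer(text):
    """Tokenizer: keep runs of letters and digits, dropping punctuation."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [t.lower() for t in tokens]


def make_analyzer(char_filters, tokenizer, token_filters):
    """Chain the three kinds of component into one analyze(text) function."""
    def analyze(text):
        for char_filter in char_filters:
            text = char_filter(text)
        tokens = tokenizer(text)
        for token_filter in token_filters:
            tokens = token_filter(tokens)
        return tokens
    return analyze


normalizing = make_analyzer([strip_html], word_tokenizer, [lowercase])
whitespace_only = make_analyzer([], whitespace_tokenizer, [])

text = "<b>Zhang</b> Guoxing, English!"
print(normalizing(text))      # ['zhang', 'guoxing', 'english']
print(whitespace_only(text))  # ['<b>Zhang</b>', 'Guoxing,', 'English!']
```

The two analyzers produce different tokens from the same text, so a term indexed by one would not be found by a query analyzed with the other.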
Query DSL
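Query DSL requests are sent in the request body.
There are two classes:
- query dsl: used for full-text queries; results are judged by relevance
  Query execution is more complex, and results are not cached
- filter dsl: used for exact-value queries; each document is judged as a yes or a no
  Fast, and results are cached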
Filter DSL
- term filter: exactly matches documents that contain the given term

    ~]# curl "10.211.55.24:9200/students/_search?pretty" -d '{
      "query":{
        "term":{
          "name": "jusene"
        }
      }
    }'
    {
      "took" : 4,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 2,
        "max_score" : 0.6931472,
        "hits" : [
          {
            "_index" : "students",
            "_type" : "class1",
            "_id" : "1",
            "_score" : 0.6931472,
            "_source" : {
              "name" : "jusene",
              "age" : 25,
              "class" : "English"
            }
          },
          {
            "_index" : "students",
            "_type" : "class1",
            "_id" : "3",
            "_score" : 0.2876821,
            "_source" : {
              "name" : "jusene",
              "age" : 25,
              "class" : "English"
            }
          }
        ]
      }
    }

- terms filter: exactly matches any of several exact values

    ~]# curl "10.211.55.48:9200/students/_search?pretty" -d '{
      "query":{
        "terms":{
          "name":["jusene","zgx"]
        }
      }
    }'

- range filter: finds numbers or dates within a given range

    ~]# curl "10.211.55.48:9200/students/_search?pretty" -d '{
      "query":{
        "range":{
          "age":{
            "lt":25
          }
        }
      }
    }'

- exists filter: matches documents that have a value for the given field

    ~]# curl "10.211.55.48:9200/students/_search?pretty" -d '{
      "query":{
        "exists":{
          "field": "age"
        }
      }
    }'

- boolean filter
  Combines multiple filter clauses using boolean logic
  must: all of its clauses must match, i.e. AND
  must_not: none of its clauses may match, i.e. NOT
  should: at least one of its clauses must match, i.e. OR

    ~]# curl "10.211.55.48:9200/students/_search?pretty" -d '{
      "query":{
        "bool":{
          "must":{
            "term":{"age": 24}
          },
          "must_not":{
            "term":{"name":"zgx"}
          },
          "should":[
            {"term":{"class":"English"}},
            {"term":{"class":"Math"}}
          ]
        }
      }
    }'
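The AND / NOT / OR reading of must, must_not, and should can be tried out on plain Python dicts. The sketch below follows the description given in this section, where at least one should clause has to match; in a real ES bool query that also contains a must clause in query context, should clauses only influence the score, so treat this as a model of the filter semantics described here. The helper names and the sample documents (including "tom") are made up, and `term` here compares raw values rather than analyzed terms.

```python
def term(field, value):
    """Return a predicate that is true when doc[field] equals value exactly."""
    return lambda doc: doc.get(field) == value


def bool_filter(must=(), must_not=(), should=()):
    """Combine predicates as the must / must_not / should list describes."""
    def matches(doc):
        if not all(p(doc) for p in must):
            return False      # AND: every must clause has to match
        if any(p(doc) for p in must_not):
            return False      # NOT: no must_not clause may match
        # OR: with no should clauses there is nothing further to satisfy.
        return not should or any(p(doc) for p in should)
    return matches


docs = [
    {"name": "jack", "age": 24, "class": "Math"},
    {"name": "zgx", "age": 24, "class": "English"},
    {"name": "jusene", "age": 25, "class": "English"},
    {"name": "tom", "age": 24, "class": "Art"},
]
f = bool_filter(
    must=[term("age", 24)],
    must_not=[term("name", "zgx")],
    should=[term("class", "English"), term("class", "Math")],
)
print([d["name"] for d in docs if f(d)])  # ['jack']
```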
Query DSL
- match_all: matches all documents; if no query is specified, match_all is the default

    ~]# curl '10.211.55.48:9200/_search?pretty' -d '
    {
      "query": {"match_all": {}}
    }'

- match: runs full-text or exact-value queries on almost any field

    For a full-text query, the query string is analyzed first:
    ~]# curl "10.211.55.48:9200/_search?pretty" -d '{
      "query":{
        "match":{"name":"zgx"}
      }
    }'

    For an exact-value query, which searches for an exact value, a filter is recommended instead of a query:
    ~]# curl "10.211.55.48:9200/students/_search?pretty" -d '{
      "query":{
        "match":{"name":"zgx"}
      }
    }'

- multi_match: runs the same query on several fields

    ~]# curl "10.211.55.48:9200/_search?pretty" -d '{
      "query":{
        "multi_match":{
          "query":"zgx",
          "fields":["name","desc"]
        }
      }
    }'

- bool query: combines several queries using boolean logic. Unlike the bool filter, a query clause does not return yes or no but a computed relevance score, so a bool query combines the scores of its clauses

    ~]# curl "10.211.55.48:9200/students/_search?pretty" -d '{
      "query":{
        "bool":{
          "must":{
            "range":{"age":{"gte": 24}}
          },
          "must_not":{
            "match":{"name":"zgx"}
          },
          "should":[
            {"match":{"class":"English"}},
            {"match":{"class":"Math"}}
          ]
        }
      }
    }'

- wildcard query: shell-style wildcard matching

    ~]# curl "10.211.55.48:9200/students/class1/_search?pretty" -d '{
      "query":{
        "wildcard":{
          "name":"z*x"
        }
      }
    }'

- regexp query: regular-expression matching (it applies to text and keyword fields, so the name field is used here rather than the numeric age field)

    ~]# curl "10.211.55.48:9200/_search?pretty" -d '{
      "query":{
        "regexp":{
          "name":"z[a-z]+"
        }
      }
    }'

- prefix query: prefix matching (the prefix is not analyzed, so it has to match the indexed, lowercased term)

    ~]# curl "10.211.55.48:9200/_search?pretty" -d '{
      "query":{
        "prefix":{
          "class":"m"
        }
      }
    }'

- phrase match: matches a phrase

    ~]# curl "10.211.55.48:9200/_search?pretty" -d '{
      "query":{
        "match_phrase":{
          "name": "zhang guoxing"
        }
      }
    }'
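Wildcard, prefix, and regexp queries all work on individual indexed terms rather than on the original text. The sketch below imitates that on a hand-written list of terms using the Python standard library. It is only an approximation: the term list is hypothetical, Python's `fnmatch` also understands `[...]` character classes, and Python's regular-expression syntax differs from Lucene's. One behaviour it does mirror is that a regexp has to match the whole term, which is why `re.fullmatch` is used.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical indexed terms for the name field (already lowercased by analysis).
terms = ["jusene", "zgx", "zhang", "guoxing", "jack"]


def wildcard(pattern):
    """Shell-style match: * is any run of characters, ? is a single character."""
    return [t for t in terms if fnmatchcase(t, pattern)]


def prefix(value):
    """Terms that start with the given, unanalyzed prefix."""
    return [t for t in terms if t.startswith(value)]


def regexp(pattern):
    """The whole term must match, so fullmatch is used rather than search."""
    return [t for t in terms if re.fullmatch(pattern, t)]


print(wildcard("z*x"))    # ['zgx']
print(prefix("j"))        # ['jusene', 'jack']
print(prefix("J"))        # []
print(regexp("z[a-z]+"))  # ['zgx', 'zhang']
```

The empty result for the uppercase prefix shows why the case of the query value matters when the indexed terms have been lowercased.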
Compound queries
That is, using filter DSL and query DSL together. The filtered query found in older tutorials was removed in ES 5.0; on the 5.5.0 cluster used here the same combination is written as a bool query with a filter clause:

    ~]# curl "10.211.55.48:9200/_search?pretty" -d '{
      "query":{
        "bool":{
          "must":{
            "match":{
              "name":"jusene"
            }
          },
          "filter":{
            "range":{
              "age":{"gt":24}
            }
          }
        }
      }
    }'
Highlighted search

    ~]# curl "10.211.55.48:9200/_search?pretty" -d '{
      "query":{
        "match":{
          "name":"jusene"
        }
      },
      "highlight":{
        "fields":{
          "name":{}
        }
      }
    }'

The response then includes the text from the name field, with <em></em> tags marking the words that matched.
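What the highlighter does to the field text can be imitated in a few lines. This is a toy version written for illustration, not the ES highlighter: `highlight` is an invented function that wraps whole-word, case-insensitive occurrences of the query words in a pair of tags.

```python
import re


def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap every whole-word, case-insensitive occurrence of a query word in tags."""
    words = [re.escape(w) for w in query.split()]
    if not words:
        return text
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre + m.group(0) + post, text)


print(highlight("zhang guoxing", "guoxing"))  # zhang <em>guoxing</em>
print(highlight("jusene", "jusene"))          # <em>jusene</em>
```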
Validating DSL syntax

    ~]# curl "10.211.55.48:9200/students/_validate/query?pretty" -d '<BODY>'
References:
https://es.xiaoleilu.com/index.html
http://www.cnblogs.com/ghj1976/p/5293250.html