Mastering ElasticSearch: extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings
Main authors: | Kuć, Rafał; Rogoziński, Marek |
---|---|
Format: | Book |
Language: | English |
Published: | Birmingham: Packt Publishing, 2013 |
Subjects: | Application software; Client/server computing / Software |
Online access: | Table of contents; Blurb |
MARC
LEADER | 00000nam a2200000zc 4500 | ||
---|---|---|---|
001 | BV041721998 | ||
003 | DE-604 | ||
005 | 20160620 | ||
007 | t| | ||
008 | 140306s2013 xx a||| |||| 00||| eng d | ||
020 | |a 9781783281435 |c Print |9 978-1-78328-143-5 | ||
035 | |a (OCoLC)892561775 | ||
035 | |a (DE-599)BVBBV041721998 | ||
040 | |a DE-604 |b ger |e aacr | ||
041 | 0 | |a eng | |
049 | |a DE-11 |a DE-739 |a DE-523 | ||
084 | |a ST 252 |0 (DE-625)143627: |2 rvk | ||
084 | |a ST 253 |0 (DE-625)143628: |2 rvk | ||
100 | 1 | |a Kuć, Rafał |e Verfasser |4 aut | |
245 | 1 | 0 | |a Mastering ElasticSearch |b extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings |c Rafal Kuc ; Marek Rogozinski |
264 | 1 | |a Birmingham |b Packt Publishing |c 2013 | |
300 | |a X, 361 S. |b Ill. | ||
336 | |b txt |2 rdacontent | ||
337 | |b n |2 rdamedia | ||
338 | |b nc |2 rdacarrier | ||
500 | |a Includes index | ||
650 | 4 | |a Application software | |
650 | 4 | |a Client/server computing / Software | |
653 | |a Electronic books | ||
700 | 1 | |a Rogoziński, Marek |e Verfasser |4 aut | |
776 | 0 | 8 | |i Erscheint auch als |n Online-Ausgabe |z 978-1-78328-144-2 |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027168963&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA |3 Inhaltsverzeichnis |
856 | 4 | 2 | |m Digitalisierung UB Passau - ADAM Catalogue Enrichment |q application/pdf |u http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027168963&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |3 Klappentext |
943 | 1 | |a oai:aleph.bib-bvb.de:BVB01-027168963 |
Record in the search index
_version_ | 1819790815659032576 |
---|---|
adam_text | Table of Contents
Preface
Chapter 1: Introduction to ElasticSearch
Introducing Apache Lucene 8
Getting familiar with Lucene 8
Overall architecture 8
Analyzing your data 10
Indexing and querying 11
Lucene query language 11
Understanding the basics 12
Querying fields 13
Term modifiers 13
Handling special characters 14
Introducing ElasticSearch 15
Basic concepts 15
Index 15
Document 15
Mapping 16
Type 16
Node 16
Cluster 16
Shard 17
Replica 17
Gateway 17
Key concepts behind ElasticSearch architecture 17
Working of ElasticSearch 18
The boostrap process 18
Failure detection 19
Communicating with ElasticSearch 20
Summary 23
Chapter 2: Power User Query DSL 25
Default Apache Lucene scoring explained 26
When a document is matched 26
The TF/IDF scoring formula 27
The Lucene conceptual formula 27
The Lucene practical formula 28
The ElasticSearch point of view 29
Query rewrite explained 29
Prefix query as an example 29
Getting back to Apache Lucene 32
Query rewrite properties 33
Rescore 35
Understanding rescore 35
Example Data 35
Query 36
Structure of the rescore query 36
Rescore parameters 38
To sum up 39
Bulk Operations 39
MultiGet 39
MultiSearch 41
Sorting data 43
Sorting with multivalued fields 44
Sorting with multivalued geo fields 45
Sorting with nested objects 47
Update API 48
Simple field update 49
Conditional modifications using scripting 50
Creating and deleting documents using the Update API 50
Using filters to optimize your queries 51
Filters and caching 52
Not all filters are cached by default 53
Changing ElasticSearch caching behavior 54
Why bother naming the key for the cache? 55
When to change the ElasticSearch filter caching behavior 55
The terms lookup filter 55
How does it work? 58
Performance considerations 59
Loading terms from inner objects 59
Terms lookup filter cache settings 60
Filter and scopes in ElasticSearch faceting mechanism 60
Example data 61
Faceting and filtering 61
Filter as a part of the query 63
The Facet filter 65
Global scope 67
Summary 69
Chapter 3: Low-level Index Control 71
Altering Apache Lucene scoring 71
Available similarity models 72
Setting per-field similarity 73
Similarity model configuration 74
Choosing the default similarity model 75
Configuring the chosen similarity models 76
Configuring TF/IDF similarity 76
Configuring Okapi BM25 similarity 77
Configuring DFR similarity 77
Configuring IB similarity 78
Using codecs 78
Simple use cases 78
Let's see how it works 79
Available posting formats 81
Configuring the codec behavior 82
Default codec properties 83
Direct codec properties 83
Memory codec properties 83
Pulsing codec properties 83
Bloom filter-based codec properties 84
NRT, flush, refresh, and transaction log 85
Updating index and committing changes 86
Changing the default refresh time 86
The transaction log 87
The transaction log configuration 88
Near Real Time GET 89
Looking deeper into data handling 90
Input is not always analyzed 90
Example usage 94
Changing the analyzer during indexing 95
Changing the analyzer during searching 96
The pitfall and default analysis 97
Segment merging under control 97
Choosing the right merge policy 98
The tiered merge policy 99
The log byte size merge policy 99
The log doc merge policy 100
Merge policies configuration 100
The tiered merge policy 100
The log byte size merge policy 101
The log doc merge policy 102
Scheduling 103
The concurrent merge scheduler 103
The serial merge scheduler 104
Setting the desired merge scheduler 104
Summary 104
Chapter 4: Index Distribution Architecture 105
Choosing the right amount of shards and replicas 106
Sharding and over allocation 106
A positive example of over allocation 108
Multiple shards versus multiple indices 108
Replicas 108
Routing explained 109
Shards and data 109
Let's test routing 110
Indexing with routing 112
Indexing with routing 114
Querying 115
Aliases 117
Multiple routing values 118
Altering the default shard allocation behavior 119
Introducing ShardAllocator 119
The even_shard ShardAllocator 119
The balanced ShardAllocator 120
The custom ShardAllocator 121
Deciders 121
SameShardAllocationDecider 121
ShardsLimitAllocationDecider 122
FilterAllocationDecider 122
ReplicaAfterPrimaryActiveAllocationDecider 122
ClusterRebalanceAllocationDecider 122
ConcurrentRebalanceAllocationDecider 123
DisableAllocationDecider 123
AwarenessAllocationDecider 123
ThrottlingAllocationDecider 124
RebalanceOnlyWhenActiveAllocationDecider 124
DiskThresholdDecider 124
Adjusting shard allocation 125
Allocation awareness 126
Forcing allocation awareness 128
Filtering 128
But what those properties mean? 129
Runtime allocation updating 130
Index-level updates 130
Cluster-level updates 130
Defining total shards allowed per node 131
Inclusion 132
Requirements 133
Exclusion 134
Additional shard allocation properties 135
Query execution preference 136
Introducing the preference parameter 137
Using our knowledge 139
Assumptions 139
Data volume and queries specification 140
Configuration 142
Node-level configuration 143
Indices configuration 143
The directories layout 143
Gateway configuration 143
Recovery 144
Discovery 144
Logging slow queries 145
Logging garbage collector work 145
Memory setup 146
One more thing 146
Changes are coming 147
Reindexing 147
Routing 148
Multiple Indices 148
Summary 149
Chapter 5: ElasticSearch Administration 151
Choosing the right directory implementation - the store module 151
Store type 152
The simple file system store 152
The new IO filesystem store 152
The MMap filesystem store 153
The memory store 153
The default store type 154
Discovery configuration 155
Zen discovery 155
Multicast 156
Unicast 157
Minimum master nodes 157
Zen discovery fault detection 158
Amazon EC2 discovery 158
EC2 plugin's installation 159
Gateway and recovery configuration 161
Gateway recovery process 161
Configuration properties 162
Expectations on nodes 163
Local gateway 163
Backing up the local gateway 164
Recovery configuration 164
Cluster-level recovery configuration 165
Index-level recovery settings 166
Segments statistics 166
Introducing the segments API 167
The response 167
Visualizing segments information 170
Understanding ElasticSearch caching 170
The filter cache 171
Filter cache types 171
Index-level filter cache configuration 172
Node-level filter cache configuration 173
The field data cache 173
Index-level field data cache configuration 174
Node-level field data cache configuration 174
Filtering 175
Clearing the caches 180
Index, indices, and all caches clearing 181
Clearing specific caches 181
Clearing fields-related caches 182
Summary 182
Chapter 6: Fighting with Fire 183
Knowing the garbage collector 184
Java memory 184
The life cycle of Java object and garbage collections 185
Dealing with garbage collection problems 186
Turning on logging of garbage collection work 186
Using JStat 187
Creating memory dumps 189
More information on garbage collector work 189
Adjusting garbage collector work in ElasticSearch 190
Avoiding swapping on Unix-like systems 191
When it is too much for I/O - throttling explained 193
Controlling I/O throttling 193
Configuration 193
Throttling type 193
Maximum throughput per second 194
Node throttling defaults 194
Configuration example 194
Speeding up queries using warmers 196
Reason for using warmers 196
Manipulating warmers 197
Using the PUT Warmer API 197
Adding warmers during index creation 198
Adding warmers to templates 199
Retrieving warmers 199
Deleting warmers 200
Disabling warmers 200
Testing the warmers 201
Querying without warmers present 202
Querying with warmer present 203
Very hot threads 204
Hot Threads API usage clarification 205
Hot Threads API response 206
Real-life scenarios 207
Slower and slower performance 207
Heterogeneous environment and load imbalance 210
My server is under fire 212
Summary 213
Chapter 7: Improving the User Search Experience 215
Correcting user spelling mistakes 216
Test data 216
Getting into technical details 217
Suggesters 218
Using the suggest REST endpoint 218
Including suggestions requests in a query 221
The term suggester 224
The phrase suggester 227
Completion suggester 237
The logic behind completion suggester 238
Using completion suggester 238
Improving query relevance 243
The data 244
The quest for improving relevance 246
The standard query 247
The Multi match query 248
Phrases comes into play 250
Let's throw the garbage away 254
And now we boost 256
Making a misspelling-proof search 257
Drill downs with faceting 260
Summary 264
Chapter 8: ElasticSearch Java APIs 265
Introducing the ElasticSearch Java API 266
The code 267
Connecting to your cluster 268
Becoming the ElasticSearch node 268
Using the transport connection method 270
Choosing the right connection method 271
Anatomy of the API 272
CRUD operations 274
Fetching documents 274
Handling errors 276
Indexing documents 276
Updating documents 279
Deleting documents 282
Querying ElasticSearch 284
Preparing a query 284
Building queries 285
Using the match all documents query 287
The match query 287
Using the geo shape query 288
Paging 289
Sorting 290
Filtering 290
Faceting 292
Highlighting 292
Suggestions 293
Counting 294
Scrolling 295
Performing multiple actions 295
Bulk 296
The delete by query 296
Multi GET 296
Multi Search 297
Percolator 297
ElasticSearch 1.0 and higher 298
The explain API 299
Building JSON queries and documents 300
The administration API 302
The cluster administration API 302
The cluster and indices health API 302
The cluster state API 303
The update settings API 303
The reroute API 303
The nodes information API 304
The node statistics API 304
The nodes hot threads API 305
The nodes shutdown API 305
The search shards API 305
The Indices administration API 306
The index existence API 306
The Type existence API 306
The indices stats API 306
Index status 307
Segments information API 307
Creating an index API 307
Deleting an index 308
Closing an index 308
Opening an index 308
The Refresh API 308
The Flush API 309
The Optimize API 309
The put mapping API 309
The delete mapping API 310
The gateway snapshot API 310
The aliases API 310
The get aliases API 311
The aliases exists API 311
The clear cache API 311
The update settings API 312
The analyze API 312
The put template API 312
The delete template API 313
The validate query API 313
The put warmer API 314
The delete warmer API 314
Summary 314
Chapter 9: Developing ElasticSearch Plugins 315
Creating the Apache Maven project structure 316
Understanding the basics 316
Structure of the Maven Java project 317
The idea of POM 317
Running the build process 319
Introducing the assembly Maven plugin 319
Creating a custom river plugin 322
Implementation details 322
Implementing the URLChecker class 324
Implementing the JSONRiver class 327
Implementing the JSONRiverModule class 329
Implementing the JSONRiverPlugin class 329
Informing ElasticSearch about the JSONRiver plugin class 330
Testing our river 331
Building our river 331
Installing our river 331
Initializing our river 332
Checking if our JSON river works 333
Creating custom analysis plugin 333
Implementation details 334
Implementing TokenFilter 335
Implementing the TokenFilter factory 336
Implementing custom analyzer 337
Implementing analyzer provider 338
Implementing analysis binder 339
Implementing analyzer indices component 340
Implementing analyzer module 342
Implementing analyzer plugin 342
Informing ElasticSearch about our custom analyzer 343
Testing our custom analysis plugin 343
Building our custom analysis plugin 344
Installing the custom analysis plugin 344
Checking if our analysis plugin works 345
Summary 346
Index 347
Mastering ElasticSearch
ElasticSearch is a fast, distributed, and scalable search engine written in Java. It leverages Apache Lucene capabilities to provide a new level of control over how you index and search even the largest sets of data.
Mastering ElasticSearch covers the intermediate and advanced functionality of ElasticSearch. It explains how ElasticSearch works and guides you through its internals, such as caches, the Apache Lucene library, monitoring capabilities, and the Java API. In addition, you'll see the practical usage of ElasticSearch configuration parameters and the monitoring API, along with easy-to-use examples of how to extend ElasticSearch by writing your own plugins.
If you are looking for a book that will allow you to easily extend your basic knowledge of ElasticSearch, or you want to go deeper into the world of full-text search using ElasticSearch, then this book is for you.
Who this book is written for
Mastering ElasticSearch is aimed at intermediate users who want to extend their knowledge of ElasticSearch. The topics described in the book are covered in detail. Advanced users will also find this book useful, as the examples go deep into the internals where needed.
|
any_adam_object | 1 |
author | Kuć, Rafał Rogoziński, Marek |
author_facet | Kuć, Rafał Rogoziński, Marek |
author_role | aut aut |
author_sort | Kuć, Rafał |
author_variant | r k rk m r mr |
building | Verbundindex |
bvnumber | BV041721998 |
classification_rvk | ST 252 ST 253 |
ctrlnum | (OCoLC)892561775 (DE-599)BVBBV041721998 |
discipline | Informatik |
format | Book |
fullrecord | <?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01750nam a2200373zc 4500</leader><controlfield tag="001">BV041721998</controlfield><controlfield tag="003">DE-604</controlfield><controlfield tag="005">20160620 </controlfield><controlfield tag="007">t|</controlfield><controlfield tag="008">140306s2013 xx a||| |||| 00||| eng d</controlfield><datafield tag="020" ind1=" " ind2=" "><subfield code="a">9781783281435</subfield><subfield code="c">Print</subfield><subfield code="9">978-1-78328-143-5</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(OCoLC)892561775</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)BVBBV041721998</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-604</subfield><subfield code="b">ger</subfield><subfield code="e">aacr</subfield></datafield><datafield tag="041" ind1="0" ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="049" ind1=" " ind2=" "><subfield code="a">DE-11</subfield><subfield code="a">DE-739</subfield><subfield code="a">DE-523</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 252</subfield><subfield code="0">(DE-625)143627:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">ST 253</subfield><subfield code="0">(DE-625)143628:</subfield><subfield code="2">rvk</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Kuć, Rafał</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Mastering ElasticSearch</subfield><subfield code="b">extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings</subfield><subfield code="c">Rafal Kuc ; Marek Rogozinski</subfield></datafield><datafield tag="264" ind1=" 
" ind2="1"><subfield code="a">Birmingham</subfield><subfield code="b">Packt Publishing</subfield><subfield code="c">2013</subfield></datafield><datafield tag="300" ind1=" " ind2=" "><subfield code="a">X, 361 S.</subfield><subfield code="b">Ill.</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">Includes index</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Application software</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Client/server computing / Software</subfield></datafield><datafield tag="653" ind1=" " ind2=" "><subfield code="a">Electronic books</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Rogoziński, Marek</subfield><subfield code="e">Verfasser</subfield><subfield code="4">aut</subfield></datafield><datafield tag="776" ind1="0" ind2="8"><subfield code="i">Erscheint auch als</subfield><subfield code="n">Online-Ausgabe</subfield><subfield code="z">978-1-78328-144-2</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027168963&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Inhaltsverzeichnis</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="m">Digitalisierung UB Passau - ADAM Catalogue Enrichment</subfield><subfield 
code="q">application/pdf</subfield><subfield code="u">http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027168963&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA</subfield><subfield code="3">Klappentext</subfield></datafield><datafield tag="943" ind1="1" ind2=" "><subfield code="a">oai:aleph.bib-bvb.de:BVB01-027168963</subfield></datafield></record></collection> |
id | DE-604.BV041721998 |
illustrated | Illustrated |
indexdate | 2024-12-24T04:03:50Z |
institution | BVB |
isbn | 9781783281435 |
language | English |
oai_aleph_id | oai:aleph.bib-bvb.de:BVB01-027168963 |
oclc_num | 892561775 |
open_access_boolean | |
owner | DE-11 DE-739 DE-523 |
owner_facet | DE-11 DE-739 DE-523 |
physical | X, 361 S. Ill. |
publishDate | 2013 |
publishDateSearch | 2013 |
publishDateSort | 2013 |
publisher | Packt Publishing |
record_format | marc |
spellingShingle | Kuć, Rafał Rogoziński, Marek Mastering ElasticSearch extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings Application software Client/server computing / Software |
title | Mastering ElasticSearch extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings |
title_auth | Mastering ElasticSearch extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings |
title_exact_search | Mastering ElasticSearch extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings |
title_full | Mastering ElasticSearch extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings Rafal Kuc ; Marek Rogozinski |
title_fullStr | Mastering ElasticSearch extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings Rafal Kuc ; Marek Rogozinski |
title_full_unstemmed | Mastering ElasticSearch extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings Rafal Kuc ; Marek Rogozinski |
title_short | Mastering ElasticSearch |
title_sort | mastering elasticsearch extend your knowledge on elasticsearch and querying and data handling along with its internal workings |
title_sub | extend your knowledge on ElasticSearch, and querying and data handling, along with its internal workings |
topic | Application software Client/server computing / Software |
topic_facet | Application software Client/server computing / Software |
url | http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027168963&sequence=000003&line_number=0001&func_code=DB_RECORDS&service_type=MEDIA http://bvbr.bib-bvb.de:8991/F?func=service&doc_library=BVB01&local_base=BVB01&doc_number=027168963&sequence=000004&line_number=0002&func_code=DB_RECORDS&service_type=MEDIA |
work_keys_str_mv | AT kucrafał masteringelasticsearchextendyourknowledgeonelasticsearchandqueryinganddatahandlingalongwithitsinternalworkings AT rogozinskimarek masteringelasticsearchextendyourknowledgeonelasticsearchandqueryinganddatahandlingalongwithitsinternalworkings |