
ElasticSearch to RDBMS Glossary

In my early days of playing with ElasticSearch, I remember struggling with some of the basic terminology and concepts. Naturally, many of us try to equate ElasticSearch with what we already know from RDBMS. To that end, I thought I would post this simple glossary as a reference for others:

Node = DB Instance

One Database Instance
A node is simply one ElasticSearch instance (one Java process).
Think of it as a running instance of MySQL. Just as you can have more than one MySQL instance running per machine on different ports, you can have more than one ElasticSearch node running per machine on different ports.

Cluster = Database Cluster

1..N Nodes with the same Cluster Name.

Index = Database Schema

Similar to a Database, or Schema. Consider it a set of tables with some logical grouping. In ElasticSearch terms, an index is a collection of Documents, where a “Document” is similar to a row in a DB table.

Mapping Type = Database Table

ElasticSearch uses document definitions that act as tables. If you PUT (“Index”) a document in ElasticSearch, you will notice that it automatically tries to determine the property types. This is like inserting a JSON blob into MySQL and having MySQL work out the number of columns and the column types (int, string, datetime, etc.) as it creates the DB table for you on the fly.
Note: I’ve heard this referred to as “Type”, “Document Type”, and “Mapping Type”.
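
To make the "it guesses the column types for you" idea concrete, here is a toy sketch in Python. It is not ElasticSearch's actual detection logic; the function names (`guess_field_type`, `infer_mapping`), the type labels, and the single date format are all assumptions made up for illustration.

```python
import json
from datetime import datetime


def guess_field_type(value):
    """Return a rough type label for one JSON value (toy heuristic, not ES's real rules)."""
    # bool must be checked before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        # Assumption: only one date format (YYYY-MM-DD) is recognised in this sketch.
        try:
            datetime.strptime(value, "%Y-%m-%d")
            return "date"
        except ValueError:
            return "string"
    if isinstance(value, dict):
        return "object"
    return "unknown"


def infer_mapping(document_json):
    """Build a {field: type label} dict from a JSON document, like a table definition made on the fly."""
    document = json.loads(document_json)
    return {field: guess_field_type(value) for field, value in document.items()}


# Hypothetical document, invented for this example.
doc = '{"user": "kimchy", "age": 34, "score": 9.5, "active": true, "joined": "2014-03-01"}'
mapping = infer_mapping(doc)
print(mapping)
```

The resulting dictionary plays the role of the table definition: one entry per "column", with a type inferred from the first document that was indexed.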

Shard = Uhhh…

I don’t think this one has a DB equivalent, but it’s likely the most important concept listed here. A Shard is the smallest worker unit in your cluster. It is one running Lucene instance. Shards are distributed across all of the nodes in your cluster, and they are what makes ElasticSearch elastic, so to speak: they give both your data and your ES processes redundancy.
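
The following Python sketch shows one way to picture shards being spread across nodes for redundancy. It is a simplified round-robin model, not ElasticSearch's real allocator; the function names, node names, and shard counts are hypothetical, and it assumes each shard has a primary copy plus replica copies that are kept on different nodes.

```python
def allocate_shards(nodes, num_primaries, num_replicas):
    """Place every copy of each shard on a different node, round-robin (toy model)."""
    if num_replicas >= len(nodes):
        raise ValueError("need more nodes than replicas to keep copies apart")
    placement = {node: [] for node in nodes}
    for shard in range(num_primaries):
        for copy in range(num_replicas + 1):
            # Offsetting by the copy number puts each copy on a distinct node.
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            placement[node].append((shard, role))
    return placement


def shards_available_without(placement, lost_node):
    """Return the shard ids that still have at least one copy if lost_node goes down."""
    return {
        shard
        for node, copies in placement.items()
        if node != lost_node
        for shard, _role in copies
    }


# Hypothetical three-node cluster with 3 primary shards and 1 replica each.
nodes = ["node-1", "node-2", "node-3"]
placement = allocate_shards(nodes, num_primaries=3, num_replicas=1)
for node, copies in placement.items():
    print(node, copies)

# Losing any single node still leaves a copy of every shard.
print(shards_available_without(placement, "node-1"))
```

In this model, taking any one node away still leaves a copy of every shard on the remaining nodes, which is the redundancy described above.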
