ElasticSearch to RDBMS Glossary

In my early days of playing with ElasticSearch, I remember struggling with some of the basic terminology and concepts. Naturally, many of us try to equate ElasticSearch with what we already know from RDBMS. To that end, I am posting this simple glossary as a reference for others:

Node = DB Instance

One Database Instance
A node is simply one ElasticSearch instance (one Java process).
Consider this a running instance of MySQL. Just as you can run more than one MySQL instance per machine on different ports, you can run more than one ElasticSearch node per machine on different ports.

Cluster = Database Cluster

1..N Nodes with the same Cluster Name.
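
To make the node/cluster relationship concrete, here is a minimal sketch in plain Python. It is only a model, not a client API: the `Node` class, the `group_into_clusters` helper, and the cluster names are all made up for illustration. The only real-world detail assumed is that 9200 is the default HTTP port and that a second node on the same machine listens on another port.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One ElasticSearch process (hypothetical model, not a client API)."""
    host: str
    http_port: int
    cluster_name: str


def group_into_clusters(nodes):
    """Group nodes by cluster name: nodes sharing a name belong to one cluster."""
    clusters = defaultdict(list)
    for node in nodes:
        clusters[node.cluster_name].append(node)
    return dict(clusters)


nodes = [
    Node("localhost", 9200, "search-prod"),
    Node("localhost", 9201, "search-prod"),  # second node, same machine, different port
    Node("localhost", 9202, "search-test"),  # different cluster name, so a separate cluster
]
clusters = group_into_clusters(nodes)
```

In a real deployment the nodes also have to be able to find each other over the network; the shared cluster name is just the part that maps onto this glossary.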

Index = Database Schema

Similar to a database, or schema. Consider it a set of tables with some logical grouping. In ElasticSearch terms, an index is a collection of documents, where a document type is similar to a DB table and each individual document is similar to a row.
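
One way to see the analogy is in how a single document is addressed. In ElasticSearch versions that expose mapping types in the URL, a document lives at `/index/type/id`, which reads a lot like schema, table, and primary key. The sketch below just builds those two strings side by side; the function names and the `blog`/`post` values are hypothetical, and the SQL is only a rough equivalent.

```python
def document_path(index, mapping_type, doc_id):
    """Build the REST path for one document: /index/type/id."""
    return "/{}/{}/{}".format(index, mapping_type, doc_id)


def sql_equivalent(index, mapping_type, doc_id):
    """Roughly the same lookup expressed against an RDBMS (illustrative only)."""
    return "SELECT * FROM {}.{} WHERE id = '{}'".format(index, mapping_type, doc_id)


path = document_path("blog", "post", "42")
query = sql_equivalent("blog", "post", "42")
```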

Mapping Type = Database Table

ElasticSearch uses document definitions that act as tables. If you PUT (“index”) a document in ElasticSearch, you will notice that it automatically tries to determine the property types. This is like inserting a JSON blob into MySQL and having MySQL work out the number of columns and the column types (int, string, datetime, etc.) as it creates the DB table for you, on the fly.
Note: I’ve heard this referred to as “Type”, “Document Type”, and “Mapping Type”.
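
To get a feel for what "determining the property types" means, here is a toy imitation in Python. It is not ElasticSearch's actual logic: the rules are deliberately simplified, the type names are only loosely modeled on the ones ElasticSearch uses, and `infer_type` / `infer_mapping` are hypothetical helpers.

```python
from datetime import datetime


def infer_type(value):
    """Guess a column-like type for one JSON value (toy rules, not ES's real logic)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        try:
            datetime.fromisoformat(value)  # ISO-looking strings are treated as dates
            return "date"
        except ValueError:
            return "string"
    return "unknown"


def infer_mapping(document):
    """Build a field -> type mapping, like a table definition created on the fly."""
    return {field: infer_type(value) for field, value in document.items()}


doc = {
    "title": "Hello",
    "views": 10,
    "rating": 4.5,
    "published": True,
    "created": "2015-03-01T12:30:00",
}
mapping = infer_mapping(doc)
```

The resulting `mapping` plays the role of the table definition: one entry per "column", with a type guessed from the first value seen.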

Shard = Uhhh…

I don’t think this one has a DB equivalent, but it’s likely the most important concept listed here. A shard is the smallest unit of work in your cluster. It is one running Lucene instance. Shards are distributed across all of the nodes in your cluster, and they are what make ElasticSearch elastic, so to speak: spreading them over several nodes gives both your data and the ES process redundancy.
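
As a rough picture of how spreading shards over nodes buys redundancy, the sketch below places every shard on one node as a primary and puts a replica copy on a different node. This is a simple round-robin toy, not ElasticSearch's real allocation algorithm; `allocate_shards` and the node names are hypothetical.

```python
def allocate_shards(node_names, num_shards, num_replicas=1):
    """Place each copy of a shard on a different node, round-robin."""
    copies = num_replicas + 1  # one primary plus its replicas
    if copies > len(node_names):
        raise ValueError("need at least as many nodes as copies per shard")
    allocation = {name: [] for name in node_names}
    for shard in range(num_shards):
        for copy in range(copies):
            # Shift by the copy number so copies of one shard never share a node.
            node = node_names[(shard + copy) % len(node_names)]
            role = "primary" if copy == 0 else "replica"
            allocation[node].append((shard, role))
    return allocation


layout = allocate_shards(["node-a", "node-b", "node-c"], num_shards=3)
```

With three nodes and one replica per shard, losing any single node still leaves at least one copy of every shard on the remaining nodes.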
