Installing and Configuring Elasticsearch (2018.0.0)

This document provides reference information and examples relating to installation and configuration of the Elasticsearch search feature for the Akana API Platform developer portal.

API Platform Version: 8.1 and later. Updated for 2018.0.0.

Note: For information about secure configuration of Elasticsearch, see Configuring Elasticsearch with security.

For information about installing and configuring Elasticsearch in version 8.4x, see Installing and Configuring Elasticsearch (8.4x).

Table of Contents

Elasticsearch feature overview:
  1. About the Elasticsearch feature
  2. Elasticsearch version
  3. Planning your Elasticsearch feature
  4. Should I choose Transport Client or REST Client mode?
  5. Links to additional information about Elasticsearch
Installing and Configuring Elasticsearch:
  1. Installing Elasticsearch
  2. High-level steps for Elasticsearch configuration
  3. What changes do I need to make to the Elasticsearch YAML file?
  4. How do I configure Elasticsearch?
  5. How do I configure the number of nodes and shards?

Elasticsearch feature information:

About the Elasticsearch feature

Elasticsearch is a search engine based on Apache Lucene. It is robust and supports fast indexing and responsive updating. It is a widely used, scalable search solution that exchanges JSON messages over an HTTP interface and also offers a native Java API.

Elasticsearch is run in standalone mode. Your installation will need to have Elasticsearch installed on at least one server. Just as with a relational database, you'll need to provide the software and hardware required. You can get started with a trial license.

All containers running the Akana API Platform can use Elasticsearch. A cluster is recommended for redundancy.

There are two deployment options:

  • Transport Client: previously supported by the platform (not recommended for 2018.0.0).
  • REST Client: New in 2018.0.0.

For more information on these choices, see Should I choose Transport Client or REST Client mode? below.

Administration of Elasticsearch is done with the configuration wizard in the Akana Administration Console.

For more information on Elasticsearch terminology, refer to the Elasticsearch glossary: https://www.elastic.co/guide/en/elasticsearch/reference/current/glossary.html.

Note: In version 8.0, Elasticsearch embedded mode was offered as a short-term option to help customers to migrate from Compass to Elasticsearch. This was never the recommended option; standalone mode is better. In version 2018.0.0, only standalone mode is supported.

Elasticsearch version

Support for Elasticsearch was introduced in Akana API Platform version 8.0. The initial version, and all later versions prior to 2018.0.0, use Elasticsearch 1.5.

In version 2018.0.0, the Akana API Platform uses Elasticsearch 6.3.1. See Installing Elasticsearch.

Planning your Elasticsearch feature

As part of planning your installation, you'll need to make some decisions about how you want to set up the Elasticsearch feature:

  • Do you want one Elasticsearch server or several? Additional servers are recommended for failover.
  • Do you want dedicated Elasticsearch servers? You could also install Elasticsearch on one or more servers running the Akana API Platform.
  • Which deployment mode do you want to use, Transport Client or REST Client? See Should I choose Transport Client or REST Client mode? below.

Should I choose Transport Client or REST Client mode?

There are two options for Deployment Mode:

  • Transport Client: The client uses a TCP connection to communicate with the Elasticsearch server or cluster. Transport Client mode will be deprecated in a later version of Elasticsearch, and will be removed in Elasticsearch 8.0. It's best to use REST Client.
  • REST Client: The client communicates with the Elasticsearch server or cluster over HTTP, by accessing a URL. Introduced in a recent version of Elasticsearch. Recommended.
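
As a rough illustration of what REST Client mode means in practice, the Python sketch below asks an Elasticsearch node for its cluster health over HTTP. The _cluster/health endpoint is part of the Elasticsearch REST API; the helper names and the http://localhost:9200 address are assumptions for illustration only and are not part of the Akana API Platform.

```python
import json
from urllib.request import urlopen


def cluster_health_url(base_url):
    """Build the URL of the Elasticsearch cluster health endpoint."""
    return base_url.rstrip("/") + "/_cluster/health"


def fetch_cluster_health(base_url, timeout=5.0):
    """Fetch cluster health as a dict (requires a reachable Elasticsearch node)."""
    with urlopen(cluster_health_url(base_url), timeout=timeout) as response:
        return json.load(response)


# Example call, assuming a node is listening on localhost:9200:
# health = fetch_cluster_health("http://localhost:9200")
# print(health["cluster_name"], health["status"])
```

A quick check like this can confirm that the HTTP port is reachable before you enter the URL in the configuration wizard.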

Links to additional information about Elasticsearch

For more information about Elasticsearch, refer to the following:

Installing and configuring Elasticsearch:

Installing Elasticsearch

You'll need to download Elasticsearch version 6.3.1, and install it on the server or servers you'll be using.

Download file: https://www.elastic.co/downloads/past-releases/elasticsearch-6-3-1. Follow the instructions provided by Elasticsearch.

High-level steps for Elasticsearch configuration

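To use Elasticsearch on the Akana API Platform you'll need to:

  1. Install Elasticsearch. See Installing Elasticsearch.
  2. Update the elasticsearch.yml file. See What changes do I need to make to the Elasticsearch YAML file?
  3. Configure Elasticsearch in the Akana Administration Console. See How do I configure Elasticsearch?
  4. Optionally, adjust the number of nodes and shards. See How do I configure the number of nodes and shards?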

What changes do I need to make to the Elasticsearch YAML file?

You'll need to make some changes to one of the Elasticsearch configuration files, elasticsearch.yml, so that Elasticsearch will work for your Akana API Platform installation.

The elasticsearch.yml file is generally stored in the {elasticsearch_home}/config folder. It might already contain some default placeholder content, but not necessarily every setting described here.

As a starting point to model your changes, you can use the example in Sample Elasticsearch YAML file below.

Note: If you want security, you'll need to add some extra values, using the example in Sample Elasticsearch YAML file with security settings. If you don't want security, just set up the values listed below.

  • Cluster name: needed for both Transport Client and REST Client. For example:
    cluster.name: akana
  • Node name: needed for both Transport Client and REST Client, if you want to name your own node. For example:
    node.name: node-1
  • Transport TCP port: needed for Transport Client. For example:
    transport.tcp.port: 9300
  • Path to the directory where the data is stored: needed for both Transport Client and REST Client. For example:
    path.data: /vars/elasticsearch/data
  • Optional, if you don't want to bind to all interfaces: set the bind address to a specific IP. For example:
    network.host: 192.168.0.1
  • Pass an initial list of hostnames for all the nodes in the cluster, to provide a seed list of other nodes in the cluster that are likely to be live and contactable, as part of discovery. See Discovery Settings on the Elasticsearch website. For example:
    discovery.zen.ping.unicast.hosts: ["localhost", "[::1]"]
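
The list above can be turned into a quick sanity check. The following Python sketch reads the flat key: value lines of an elasticsearch.yml file and reports which of the settings named above are missing for a given deployment mode. It is a minimal sketch, not a full YAML parser: it assumes one top-level setting per line, and the function names and the "transport"/"rest" mode labels are hypothetical.

```python
# Settings the list above marks as needed, keyed by deployment mode.
REQUIRED_FOR_BOTH = ("cluster.name", "path.data")
REQUIRED_FOR_TRANSPORT = ("transport.tcp.port",)


def read_settings(yaml_text):
    """Collect flat `key: value` pairs, skipping blank lines and comments."""
    settings = {}
    for line in yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, separator, value = stripped.partition(":")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def missing_settings(yaml_text, mode):
    """Return required settings that are absent; mode is "transport" or "rest"."""
    required = list(REQUIRED_FOR_BOTH)
    if mode == "transport":
        required.extend(REQUIRED_FOR_TRANSPORT)
    present = read_settings(yaml_text)
    return [key for key in required if key not in present]
```

For example, passing the contents of a file that sets only cluster.name and path.data with mode "transport" would report transport.tcp.port as missing.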

For general information about the elasticsearch.yml file, see https://www.elastic.co/guide/en/elasticsearch/reference/current/settings.html.

Sample Elasticsearch YAML file
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: akana
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /vars/elasticsearch/data
#
# Path to log files:
#
path.logs: /vars/elasticsearch/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
#network.host: 192.168.0.1
network.host: 0.0.0.0

transport.tcp.port: 9300
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["localhost", "[::1]"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 1
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
# (This section is needed only if you want to configure security)

Note: If you want to use security with Elasticsearch, you'll need to set up additional values in the YAML file. See Sample Elasticsearch YAML file with security settings.
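
The sample file's comment on discovery.zen.minimum_master_nodes gives the formula (total number of master-eligible nodes / 2 + 1). The short Python sketch below works through that arithmetic using integer division; the function name is hypothetical.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum size: (master-eligible nodes / 2) + 1, using integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for nodes in (1, 2, 3, 5):
    print(nodes, "master-eligible node(s) ->", minimum_master_nodes(nodes))
```

A single-node setup therefore uses the value 1, as in the sample file, while a cluster with three master-eligible nodes would use 2.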

Default ports for Elasticsearch configuration

The default ports for Elasticsearch configuration are as follows:

  • HTTP: Default is 9200; default range is 9200–9299
  • TCP: Default is 9300; default range is 9300–9399

A range means that if the first port is busy, the next port in the range is tried, and so on.

Note to Administrators: When setting up your implementation, make sure that all ports in the range 9300–9305 are open, on all containers that the Elasticsearch feature is installed on.
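
The port-range behavior described above can be pictured with a small Python sketch: try each port in the range in order and take the first one that can be bound. This only illustrates the idea and is not how Elasticsearch itself is implemented; the function names are hypothetical.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(first, last, is_free=port_is_free):
    """Return the first free port in [first, last], or None if all are busy."""
    for port in range(first, last + 1):
        if is_free(port):
            return port
    return None


# Example: find the first free port in the default HTTP range.
# print(first_free_port(9200, 9299))
```

The is_free parameter exists so the search logic can be exercised without opening real sockets.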

How do I configure Elasticsearch?

In configuring the Elasticsearch feature for your Akana API Platform implementation, there are two deployment modes; follow the applicable set of steps as listed below.

For information on which mode to choose, see Should I choose Transport Client or REST Client mode?

Note: Configuration needs to be set up only once, for all containers, and can be done in any one container. The settings are stored in the database for the entire implementation.

Configure Elasticsearch Global Configuration: Transport Client
  1. In the Akana Administration Console, on the Configuration tab, under Configuration Actions, choose Configure Elasticsearch Global Configuration. The wizard opens.
  2. Make sure the Deployment Mode is set to Transport Client.
  3. Provide the name of the cluster for the Elasticsearch feature; for example, akana.
  4. In the ES Server URL field, provide the transport address for the Elasticsearch server (one or more, separated by commas), in the format: {hostname}:{port} (without the protocol). Examples:
    • localhost:9300
    • 10.12.121.116:9300
    • 10.12.121.116:9300,10.12.122.140:9300
  5. Click Finish and then click Close.
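
The {hostname}:{port} format used in step 4 can be checked before you paste a value into the wizard. The Python sketch below splits a comma-separated value into host and port pairs and rejects entries that include a protocol. It is a minimal sketch under the format stated above; the function name is hypothetical and the wizard's own validation may differ.

```python
def parse_transport_addresses(value):
    """Split "host:port,host:port" into a list of (host, port) tuples."""
    addresses = []
    for entry in value.split(","):
        entry = entry.strip()
        if "://" in entry:
            raise ValueError(f"remove the protocol from {entry!r}")
        # Split on the last colon so the port is always the final part.
        host, separator, port = entry.rpartition(":")
        if not separator or not host or not port.isdigit():
            raise ValueError(f"expected hostname:port, got {entry!r}")
        addresses.append((host, int(port)))
    return addresses


print(parse_transport_addresses("10.12.121.116:9300,10.12.122.140:9300"))
```
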
Configure Elasticsearch Global Configuration: REST Client
  1. In the Akana Administration Console, on the Configuration tab, under Configuration Actions, choose Configure Elasticsearch Global Configuration. The wizard opens.
  2. For Deployment Mode, choose REST Client.
  3. In the ES HTTP URLs field, provide the HTTP URLs for each container where Community Manager is running, as well as any container running scheduled jobs. Provide full URLs; use a comma separator between values. Examples:
    • http://localhost:9200
    • https://localhost:9200
    • http://localhost:9200,http://localhost:9250

    Note: If there are multiple URLs, the protocol must be the same for all. For example, you cannot mix HTTP and HTTPS.

  4. Click Finish and then click Close.
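
The rules in step 3 (full URLs, comma-separated, one protocol for all) can be checked with a few lines of Python before entering the value. This is a minimal sketch using the standard library's urlsplit; the function name is hypothetical and the wizard's own validation may differ.

```python
from urllib.parse import urlsplit


def parse_http_urls(value):
    """Split a comma-separated URL list and check that all share one protocol."""
    urls = [entry.strip() for entry in value.split(",") if entry.strip()]
    if not urls:
        raise ValueError("at least one URL is required")
    schemes = set()
    for url in urls:
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"expected a full http(s) URL, got {url!r}")
        schemes.add(parts.scheme)
    if len(schemes) > 1:
        raise ValueError("all URLs must use the same protocol")
    return urls


print(parse_http_urls("http://localhost:9200,http://localhost:9250"))
```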

How do I configure the number of nodes and shards?

New in version: 8.0.3

Note: This step is optional. The platform defaults are sufficient.

The platform includes configuration settings that you can use to manage your Elasticsearch setup. These are controlled by configuration settings in the Akana Administration Console.

In the Akana Administration Console, the configuration category is: com.akana.elasticsearch.

In the default configuration, shown below, there are two shards and one replica. Let's say there are two nodes in the cluster. One shard, approximately half the index, is stored on each node. The one replica includes a replica of each shard:

  • Node 1 has Shard 1 and the replica of Shard 2
  • Node 2 has Shard 2 and the replica of Shard 1

In this scenario, if one node goes down, the other node has the full search index. Additional nodes add additional safety.
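
The two-node scenario above can be reproduced with a small Python sketch. It places shards and their replicas on nodes in round-robin order and then checks whether every shard is still available after one node is lost. The placement rule is a simplification chosen for illustration, not Elasticsearch's actual allocation algorithm, and the function names are hypothetical.

```python
def layout_shards(shards, replicas, nodes):
    """Place each shard, then each of its replicas, on successive nodes."""
    if replicas >= nodes:
        # A replica must not share a node with its own shard.
        raise ValueError("this sketch needs more nodes than replicas")
    layout = {f"node-{n}": [] for n in range(1, nodes + 1)}
    for shard in range(shards):
        layout[f"node-{shard % nodes + 1}"].append(f"shard-{shard + 1}")
        for copy in range(1, replicas + 1):
            node = (shard + copy) % nodes + 1
            layout[f"node-{node}"].append(f"replica-of-shard-{shard + 1}")
    return layout


def survives_loss_of(layout, lost_node, shards):
    """Check that every shard number is still held by a remaining node."""
    remaining = set()
    for node, copies in layout.items():
        if node != lost_node:
            remaining.update(copy.rsplit("-", 1)[1] for copy in copies)
    return remaining == {str(n) for n in range(1, shards + 1)}


default_layout = layout_shards(shards=2, replicas=1, nodes=2)
print(default_layout)
print(survives_loss_of(default_layout, "node-1", shards=2))
```

With the default values of two shards and one replica on two nodes, each node ends up holding one shard plus the replica of the other, which matches the list above.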

There are two settings, as shown below.

elastic.config.index.number.of.replicas
    The number of replicas (additional copies) of the Elasticsearch index. Each replica includes a replica of each shard, so one replica might be distributed across multiple nodes, just as the index itself is split into shards which are distributed across nodes.
    Default: 1

elastic.config.index.number.of.shards
    The number of shards (splits) for the Elasticsearch index.
    Note: This is a one-time setting. An Elasticsearch index cannot be re-sharded; if you want to change the number of shards, after changing the setting you'll need to delete the /index folder that the search index is stored in and then rebuild the index.
    Default: 2

For additional information about configuration settings in the Akana Administration Console, see Admin Console Settings.
