Monitoring the Health of a Container: Akana System Health Tool

This technical note provides information about the Akana System Health tool and its associated API.

Supported Platforms: 8.2 and later

Table of Contents

  1. Summary
  2. Configuration
  3. Panels
  4. Settings
  5. System Health Tool API
  6. System Health Tool API: Links
  7. Thresholds
  8. Checking Health Statistics
  9. Custom Panel
  10. What's Next?

Summary

Before API Platform version 8.2, container health statistics were available only visually, through the Monitoring Tool (see Using the Admin Monitoring Tool). Because the statistics were exposed only in the GUI, there was no way to consume them programmatically for operational purposes. With the introduction of the System Health Tool in version 8.2, you can retrieve these health statistics by making a GET call to any Akana container. In addition, the System Health Tool lets you configure thresholds for any monitored attribute, so that you can detect values that fall outside normal or expected ranges.

This is a feature of the core platform and can be leveraged by all containers, including Policy Manager, Community Manager, and Envision. A primary use case for this tool is to check the health of the container to determine if it is ready to handle traffic.

The health statistics available are the same as those in the Monitoring Tool, which means that any standard OSGi Monitorable instance in the system can be tracked, including but not limited to:

  • Outgoing HTTP connection pool statistics
  • Incoming HTTP thread pools
  • Database connection pools
  • Container memory usage
  • Usage monitoring queues
  • JMS connections
  • Container configuration state
  • Container lifecycle

Configuration

All 8.2 containers have the System Health Tool installed when any of the Policy Manager (PM), Community Manager (CM), or Network Director (ND) features are installed. No additional configuration is required.

You can start exploring this feature through the Health tab in the Akana Administration Console:

[Figure: Health tab]

Panels

A panel is a grouping of various types of system health information. Each panel can have any combination of types of system health information that you want to monitor.

Depending on the features installed, the Health tab will include a number of pre-configured panels. You can expand them to view detailed monitoring information, as shown below.

[Figure: detailed monitoring information]

Settings

The slider in the top right corner lets you change the polling interval. In addition, each panel has a toggle to enable or disable authentication for requests made to the System Health Tool API through the links provided.

[Figure: settings]

System Health Tool API

You can access the API by using a GET method on the /admin/health context of any Akana container. For example, if the container's Akana Administration Console URL is http://acme.akana.com:9900/admin, the URL for the System Health Tool API is http://acme.akana.com:9900/admin/health.

A sample response is shown below. Note that the links in this sample use port 9905 rather than the 9900 of the example URL above.

System Health API: Response

{
  "links": [
    {
      "rel": "self",
      "href": "http://acme.akana.com:9905/admin/health/"
    },
    {
      "rel": "measurables",
      "href": "http://acme.akana.com:9905/admin/health/measurables"
    },
    {
      "rel": "available",
      "href": "http://acme.akana.com:9905/admin/health/available"
    }
  ],
  "measurables": [
    {
      "id": "akana.system.health",
      "name": "System Health",
      "path": "akana.system.health",
      "state": "NORMAL",
      "childCount": 2,
      "editable": false,
      "options": {
        "enableAuth": false,
        "links": [
          {
            "rel": "self",
            "href": "http://acme.akana.com:9905/admin/health/measurables/akana.system.health/configuration"
          }
        ]
      },
      "links": [
        {
          "rel": "self",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.system.health"
        },
        {
          "rel": "brief",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.system.health?brief=true"
        },
        {
          "rel": "children",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.system.health/children"
        },
        {
          "rel": "options",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.system.health/configuration"
        },
        {
          "rel": "values",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.system.health/values"
        }
      ]
    },
    {
      "id": "akana.service.container.readiness",
      "name": "Service Container Readiness",
      "path": "akana.service.container.readiness",
      "state": "NORMAL",
      "childCount": 6,
      "editable": false,
      "options": {
        "enableAuth": false,
        "links": [
          {
            "rel": "self",
            "href": "http://acme.akana.com:9905/admin/health/measurables/akana.service.container.readiness/configuration"
          }
        ]
      },
      "links": [
        {
          "rel": "self",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.service.container.readiness"
        },
        {
          "rel": "brief",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.service.container.readiness?brief=true"
        },
        {
          "rel": "children",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.service.container.readiness/children"
        },
        {
          "rel": "options",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.service.container.readiness/configuration"
        },
        {
          "rel": "values",
          "href": "http://acme.akana.com:9905/admin/health/measurables/akana.service.container.readiness/values"
        }
      ]
    }
  ]
}

This provides an overview of the available health monitors (named measurables), and a set of links that provide access to more detailed information. Each of the measurables in the response corresponds to a monitoring panel shown on the Akana Administration Console.
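
As a rough illustration of consuming this response, the following Python sketch fetches the overview and lists each measurable with its state. The function names fetch_health and summarize_measurables are illustrative and not part of the Akana product. The field names come from the sample response above, and the sketch assumes the endpoint is reachable without authentication.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url, timeout=5.0):
    """GET <base_url>/admin/health and return the decoded JSON body.

    Assumes authentication is disabled for the endpoint.
    """
    with urlopen(base_url.rstrip("/") + "/admin/health", timeout=timeout) as resp:
        return json.load(resp)


def summarize_measurables(payload):
    """Return (id, state, childCount) for each measurable in the overview."""
    return [
        (m["id"], m["state"], m["childCount"])
        for m in payload.get("measurables", [])
    ]
```

Calling summarize_measurables(fetch_health("http://acme.akana.com:9900")) against a live container would yield one tuple per panel.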

System Health Tool API: Links

You can follow each of the links in the response to view different health dimensions as well as the configuration information.

For example, to get detailed information on the akana.system.health category, perform an HTTP GET on its self link. In the sample response above, that link is http://acme.akana.com:9905/admin/health/measurables/akana.system.health.

Or, for example, if you want to view all the available configurable options, you can fetch http://acme.akana.com:9900/admin/health/available.
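
Following a link by its rel value, rather than hard-coding URLs, might look like the Python sketch below. The helper name find_link is hypothetical, and the measurable shown is a fragment of the sample response trimmed to two links.

```python
def find_link(links, rel):
    """Return the href of the first link whose rel matches, or None."""
    for link in links:
        if link.get("rel") == rel:
            return link.get("href")
    return None


# A fragment of the sample response, trimmed to two links.
measurable = {
    "id": "akana.system.health",
    "links": [
        {
            "rel": "self",
            "href": "http://acme.akana.com:9905/admin/health/measurables/akana.system.health",
        },
        {
            "rel": "children",
            "href": "http://acme.akana.com:9905/admin/health/measurables/akana.system.health/children",
        },
    ],
}

# URL to request next in order to list the child monitors of this panel.
children_url = find_link(measurable["links"], "children")
```

Returning None for a missing rel lets the caller decide how to handle a measurable that does not expose a given link.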

Thresholds

Each health monitor may have a set of threshold settings. These settings are used to indicate current health status using three values: NORMAL, WARNING, or FAILURE.

System default health monitors have the threshold ranges predefined. To view or modify the values, go to the Akana Administration Console and click the status icon, as shown below.

[Figure: warning threshold]

Thresholds are defined using Apache Commons JEXL (Java Expression Language) syntax. For more information, refer to the JEXL overview on the Apache Commons site. Currently, an expression can access only a single variable, named value, which holds the current value of the monitored property.

Checking Health Statistics

You can check the system health information by following the links provided in the health monitor's panel.

An example of how you can use this feature is to provide load balancers with an appropriate status based on the health of any aspect of container operation.

[Figure: links in the health monitor's panel]

Checking Health Statistics: Response

{
  "id": "akana.service.container.readiness",
  "name": "Service Container Readiness",
  "path": "akana.service.container.readiness",
  "state": "NORMAL",
  "childCount": 6,
  "editable": false,
  "options": {
    "enableAuth": true,
    "links": [
      {
        "rel": "self",
        "href": "http://localhost:9905/admin/health/measurables/akana.service.container.readiness/configuration"
      }
    ]
  },
  "links": [
    {
      "rel": "self",
      "href": "http://localhost:9905/admin/health/measurables/akana.service.container.readiness"
    },
    {
      "rel": "brief",
      "href": "http://localhost:9905/admin/health/measurables/akana.service.container.readiness?brief=true"
    },
    {
      "rel": "children",
      "href": "http://localhost:9905/admin/health/measurables/akana.service.container.readiness/children"
    },
    {
      "rel": "options",
      "href": "http://localhost:9905/admin/health/measurables/akana.service.container.readiness/configuration"
    }
  ]
}

Using query parameters to define the HTTP response status code

A health category is always in one of three states: NORMAL, WARNING, or FAILURE. You can use an optional set of query parameters to control the HTTP status code returned for each state. This avoids the need to parse the JSON response content, so that decisions can be made solely on the response code.

The following query parameters are valid for specifying response HTTP status when you run the GET operation on the health category:

  • normal-status
  • warning-status
  • failure-status

In the example below, the GET call to check container readiness sets a warning status of 503 and a failure status of 503. The call returns 503 while the health check is in the WARNING or FAILURE state, and 200 once it reaches the NORMAL state.

GET http://acme.akana.com:9900/admin/health/measurables/akana.service.container.readiness?brief=true&warning-status=503&failure-status=503
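
A request like the one above could be assembled programmatically along the following lines. This is a minimal Python sketch: build_check_url and is_ready are hypothetical helper names, and treating any 2xx code as ready is an assumption about how a caller such as a load balancer probe might interpret the result.

```python
from urllib.parse import urlencode


def build_check_url(base_url, measurable_id, normal_status=None,
                    warning_status=None, failure_status=None, brief=True):
    """Build a health-check URL with optional status-code overrides."""
    params = {}
    if brief:
        params["brief"] = "true"
    if normal_status is not None:
        params["normal-status"] = normal_status
    if warning_status is not None:
        params["warning-status"] = warning_status
    if failure_status is not None:
        params["failure-status"] = failure_status
    url = f"{base_url.rstrip('/')}/admin/health/measurables/{measurable_id}"
    return f"{url}?{urlencode(params)}" if params else url


def is_ready(status_code):
    """Treat any 2xx response as ready (an assumption of this sketch)."""
    return 200 <= status_code < 300
```

With warning_status=503 and failure_status=503, the function reproduces the URL shown above, and is_ready can then be applied to the status code of the response without parsing the body.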

Custom Panel

To display a custom set of system health statistics, you can create your own panel. To create a new panel, click the plus icon. Once the panel has been created, you can then add your own combination of health monitors.

The sample custom panel below is configured on the Policy Manager container and is called Health Monitoring for ACME. The failure threshold is set to anything below the currently provisioned number of PM APIs.

The threshold below defines a normal condition where there is greater than 20% of capacity remaining in the pool.

[Figure: normal threshold setting]

The threshold below defines a warning condition where between 5% and 20% of capacity remains in the pool.

[Figure: warning threshold setting]

The threshold below defines a failure condition when less than 5% of capacity remains in the pool.

[Figure: failure threshold setting]
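
The three conditions described above can be mirrored in a short Python sketch. It assumes the monitored value is the percentage of pool capacity remaining, and that the boundary values 5 and 20 fall in the warning range; neither detail is stated in the text. The JEXL expressions in the docstring are hypothetical equivalents, not the expressions configured in the product.

```python
def classify_remaining_capacity(value):
    """Map percent of pool capacity remaining to a health state.

    Hypothetical JEXL equivalents, with `value` as the monitored property:
      NORMAL:  value > 20
      WARNING: value >= 5 && value <= 20
      FAILURE: value < 5
    """
    if value > 20:
        return "NORMAL"
    if value >= 5:
        return "WARNING"
    return "FAILURE"
```

Checking the conditions from the healthiest state downward keeps each branch to a single comparison.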

To check the status of this custom panel, you would simply follow the links provided, which in this case might be http://acme.akana.com:9900/admin/health/measurables/health.monitoring.for.acme.

What's Next?

Container health data can be polled and collected into a data store. You can use the collected data to gain valuable operational insight.
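
One possible shape for such a collector is sketched below in Python, using SQLite as the data store. The table name health_samples, its columns, and the helper names are all assumptions made for illustration. The payload is expected to have the same structure as the overview response from /admin/health.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) a SQLite store for health samples."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS health_samples ("
        "ts REAL NOT NULL, measurable_id TEXT NOT NULL, state TEXT NOT NULL)"
    )
    return conn


def record_sample(conn, payload, ts=None):
    """Store one row per measurable found in an overview response."""
    ts = time.time() if ts is None else ts
    rows = [(ts, m["id"], m["state"]) for m in payload.get("measurables", [])]
    conn.executemany("INSERT INTO health_samples VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

A scheduler or simple loop could call record_sample at a fixed interval with each freshly fetched response, building a history of state changes per measurable.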
