Topology View, for monitoring the Grid

The Topology View shows a logical view of the hosts running in the grid and all nodes running on them. The page is intended to give information on all individual runtime artifacts: grid agents, the grid registry, grid routers, and application nodes. Technically, each runtime artifact except hosts corresponds to a JVM.

The top of the page displays:

  • A link to a page with application focus (in contrast to the runtime artifact focus of this page)

    The linked page lists all applications deployed in the grid. Each application name is a link to a page that has more detailed information and from which you can manage the application.

  • A link to the logging page

    From the logging page, you can change the log levels and view the content of the log files.

  • A link to pages showing more advanced details of the grid

    The advanced part contains management of the client connections, dispatchers, and proxies as well as advanced configuration options for the grid internal parts. The information displayed in this part is to be used for troubleshooting purposes and is not to be altered.

  • A link that shows whether the grid is set offline, and that opens a dialog where the grid’s state can be toggled

  • A stop button that opens a dialog from where you can stop the grid or applications in the grid

  • A button to reset the loggers

  • A plus sign that toggles the view between presenting the artifacts in a list and in boxes

For each runtime artifact, the page displays the following:

  • Type

    The type of runtime entity displayed on the row, for example, host, grid agent, registry, router, or node. For hosts, this column also includes a start button that allows you to start nodes (bindings) on that particular host.

  • Application

    Whether the artifact belongs to the grid itself or to an application. Displayed as either SYSTEM or the application’s name.

  • Name

    The name of the runtime entity. For nodes, this is the name of the binding used to start the node.

    The name is a link to a page with more information about the artifact. That page also displays a link to an Advanced page, which in turn has links to advanced information about the artifact such as properties, counters, threads, and proxies.

  • Status

    The general status of the entity. Normally this should be "OK", but it may also be one of the following statuses: "OFFLINE", "STOPPING", "STARTING", "NOT RESPONDING", "STALE", "FAILED", or "LIMITED" (applies to routers only).

    The number of errors and warnings in the log is also displayed.

    When a node is off-line, it takes no new requests but lets requests that are already being processed finish. For more information on the off-line status, see Putting Applications or Parts of the Grid in an Off-Line State.

    View the log for details. If errors or warnings are indicated, investigate the cause. When the problems have been taken care of, or have been deemed unimportant or irrelevant, it is possible, and recommended, to reset the loggers using the "Reset Loggers" button at the top of the page. This resets the error and warning counts to zero so that it is easier to see whether new errors appear. Only the counts are reset; no information is removed from the log files.

  • Stop button

    A stop button is displayed for all entities except hosts and grid agents. This is useful when an individual application node needs to be shut down, for example, because it is in an error state.

  • PID

    A unique ID for each runtime artifact. The ID consists of address, port, and process ID.

    Only the process ID is shown in the table, but hovering your mouse pointer over the process ID reveals the address and port.

  • Log

    A link that displays the log file belonging to this runtime artifact. This is the default place to look for information if a node is showing signs of having problems.

  • Up Time

    How long this runtime artifact has been running (time since it was last started). A short up time could indicate that a node has been automatically restarted due to some problem. A short up time can also have a natural explanation.

  • CPU%

    Approximate CPU usage. Different applications use the CPU in different ways, and what is normal depends on the application. This information can indicate whether an application is behaving differently from how it usually does.

  • Heap Usage

    Memory usage of the runtime artifact (JVM).

    Typically, an application should not be short of memory if it is to perform well (there are exceptions).

    If the heap of an individual node exceeds its threshold (85% is the default threshold, though some nodes may have higher or lower thresholds), the node is automatically taken offline by the Grid to protect it from out-of-memory conditions. During this time, the node does not respond to application calls, and the application may as a result appear to be malfunctioning or unresponsive.
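To make the heap threshold described for the Heap Usage column more concrete, the following is a minimal sketch of such a check using the standard JVM management API. It is not the Grid's own implementation: the class name HeapThresholdSketch, its method names, and the choice to treat "exceeds" as strictly greater than the threshold are assumptions made for illustration only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Illustrative sketch only; hypothetical names, not part of the Grid. */
final class HeapThresholdSketch {

    /** Default threshold mentioned in the text: 85% of the maximum heap. */
    static final double DEFAULT_THRESHOLD = 0.85;

    /** Returns used/max as a fraction, or -1 if the maximum heap size is undefined. */
    static double usageFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0; // MemoryUsage.getMax() returns -1 when no maximum is defined
        }
        return (double) usedBytes / (double) maxBytes;
    }

    /** True when heap usage is strictly above the given threshold (assumed semantics). */
    static boolean exceedsThreshold(long usedBytes, long maxBytes, double threshold) {
        double fraction = usageFraction(usedBytes, maxBytes);
        return fraction >= 0 && fraction > threshold;
    }

    public static void main(String[] args) {
        // Read the heap usage of the current JVM.
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        boolean over = exceedsThreshold(heap.getUsed(), heap.getMax(), DEFAULT_THRESHOLD);
        System.out.printf("heap used=%d max=%d overThreshold=%b%n",
                heap.getUsed(), heap.getMax(), over);
    }
}
```

A check like this only reports the condition. Taking a node offline and bringing it back, as the Grid does, is outside the scope of the sketch.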

You can expand the artifacts so that more information is displayed. To do so, click on the small plus icon to the far left of each artifact’s row in the list. When expanded, routers will list information about external addresses and ports used. Nodes will list individual application modules.

At the bottom of the page there are links to all applications. If the applications provide management pages of their own, a link to them is also displayed. These links are useful to get more information about a particular application.

The Applications Page

The Applications page is accessed from the Applications link at the top of the Topology View page.

This page gives access to the same information as the Topology View page, but the information is rearranged with a focus on applications instead of hosts. It presents a list of all deployed applications with some overview information about each application. The application name is a link to a page with more detailed information about the application. The detailed information is essentially the same as in the Topology View, but only the runtime artifacts related to the application are displayed.

This page also provides convenient ways to access the configuration of the application and its management page (if the application has one), and to start, stop, and manage the off-line state of the applications.

Advanced page for an artifact/node

For each artifact in the Topology View table, the name is a link to a page that displays more information about the artifact. That page also has two links: one for logging of the artifact and one to an Advanced page. The Advanced page presents detailed information about the node such as properties, counters, threads, and proxies.