Actions view

Usage

This view displays the journal of all service events with their status.

Typical usage:

Quick insight into unacknowledged errors affecting a specific perimeter

Example: ack=0 & status=err & responsible=me

Retrospective problem analysis

Example: service=some_service & begin>2010-01-10% & end<2010-01-12%

Problem patterns

Example: service=some_service & action=syncnodes & status!=ok

Learn which system commands are run by the nodeware

Example: log=%sg_persist%|%scu%

Trace who was alerted of a problem, and when
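
The filter examples above join field conditions with `&`, use `%` as a wildcard, and use `|` to separate alternative patterns. As a rough illustration of how such an expression could be read, the Python sketch below evaluates a filter against one action record held in a dict. It is not the collector's implementation: the function names, the record layout, and the exact operator semantics (plain string comparison for `<` and `>`, case-sensitive matching) are assumptions made for this example.

```python
import re

def like(pattern, value):
    """Match value against a pattern where % stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None

def match_condition(condition, record):
    """Evaluate one 'field<op>value' condition against a record (a dict)."""
    m = re.fullmatch(r"\s*(\w+)\s*(!=|=|<|>)\s*(.*?)\s*", condition)
    if m is None:
        raise ValueError("unsupported condition: %r" % condition)
    field, op, value = m.groups()
    actual = str(record.get(field, ""))
    # Assumption: ordering operators compare strings, ignoring a trailing %.
    if op == "<":
        return actual < value.rstrip("%")
    if op == ">":
        return actual > value.rstrip("%")
    # '|' separates alternative patterns for the same field.
    hit = any(like(alt, actual) for alt in value.split("|"))
    return hit if op == "=" else not hit

def match_filter(expression, record):
    """All '&'-separated conditions must hold for the record to match."""
    return all(match_condition(c, record) for c in expression.split("&"))

# Hypothetical record, for illustration only.
record = {"service": "some_service", "action": "syncnodes", "status": "err", "ack": "0"}
print(match_filter("service=some_service & action=syncnodes & status!=ok", record))  # True
```

String comparison is enough for the date examples only because the timestamps are assumed to be in `YYYY-MM-DD hh:mm:ss` form, which sorts chronologically.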

Screenshot

_images/collector.view.actions.1.png

Fields

Service

Name of the service the action applied to. Private collectors usually report short names; shared collectors usually report service names with domain names.

App

The application code is a way to group services dedicated to, or paid for by, a corporate entity or project. You can set up any application code you want in your services configuration file, using the app parameter.

Responsible

The responsibles are the people who receive alerts for a service. Hovering the mouse over the icon displays their names. No icon means the service has no responsible, which is an anomaly and causes alerts to be sent to the site’s administrator/manager.

Node

Node where the action was executed. The node name is a link to the asset view.

Action

The name of the executed action. An action usually aggregates a number of log lines, plus a header line with no log message. The default action view has an ‘empty’ log filter active, so only the action headers are shown, without their logs. You can drill down into a specific action by clicking on its pid.

Action

Description

start

start resources of type : ip, loop, disk group, zpool, fs, container, app

stop

stop resources of type : app, container, fs, zpool, disk group, loop, ip

startdisk

start resources of type : loop, disk group, zpool, fs

stopdisk

stop resources of type : fs, zpool, disk group, loop

startip

start resources of type : ip

stopip

stop resources of type : ip

startloop

start resources of type : loop

stoploop

stop resources of type : loop

startvg

start resources of type : disk group

stopvg

stop resources of type : disk group

mount

start resources of type : fs

umount

stop resources of type : fs

prstart

acquire SCSI persistent reservations on disks of the service (wrapped by startvg and startdisk)

prstop

release SCSI persistent reservations on disks of the service (wrapped by stopvg and stopdisk)

syncnodes

trigger hard-coded and user-defined file synchronization to secondary nodes. Optionally creates snapshots to send a coherent file set. No-op if run from a node not running the service.

syncdrp

trigger hard-coded and user-defined file synchronization to disaster recovery nodes. Optionally creates snapshots to send a coherent file set. No-op if run from a node not running the service.

print_status

print status of all service resources
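
The table above implies that each stop action handles the same resource types as its start counterpart, in reverse order. The Python sketch below only summarizes that relationship. The names `START_ORDER`, `START_ACTIONS`, `STOP_ACTIONS`, and `resource_order` are hypothetical and are not part of the nodeware.

```python
# Start order of resource types, as listed for the 'start' action.
START_ORDER = ["ip", "loop", "disk group", "zpool", "fs", "container", "app"]

# Resource types handled by each start action.
START_ACTIONS = {
    "start": set(START_ORDER),
    "startdisk": {"loop", "disk group", "zpool", "fs"},
    "startip": {"ip"},
    "startloop": {"loop"},
    "startvg": {"disk group"},
    "mount": {"fs"},
}

# Each stop action mapped to the start action it undoes.
STOP_ACTIONS = {
    "stop": "start",
    "stopdisk": "startdisk",
    "stopip": "startip",
    "stoploop": "startloop",
    "stopvg": "startvg",
    "umount": "mount",
}

def resource_order(action):
    """Return the resource types an action handles, in processing order."""
    if action in START_ACTIONS:
        return [t for t in START_ORDER if t in START_ACTIONS[action]]
    if action in STOP_ACTIONS:
        # A stop action walks the start order backwards.
        return list(reversed(resource_order(STOP_ACTIONS[action])))
    raise KeyError(action)

print(resource_order("stopdisk"))  # ['fs', 'zpool', 'disk group', 'loop']
```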

Status

Status

Description

ok

The action completed successfully.

warn

The action completed with some warnings. No acknowledgement needed.

err

The action completed with errors. Investigation and acknowledgement are needed.

err (strikethrough)

The action completed with errors, which a user has acknowledged.

empty

The collector has been informed that the action began but has not yet received the ending logs.
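
As a small illustration of the status values above, the following Python sketch separates actions that still need attention from those that are merely unfinished. The `status` and `ack` keys mirror the filter fields used in the Usage examples, but the function names and the dict representation of an action are assumptions.

```python
def needs_acknowledgement(action):
    """True for an action that ended in error and has not been acknowledged."""
    return action.get("status") == "err" and not int(action.get("ack", 0))

def is_pending(action):
    """True while the action has begun but no ending status was received."""
    return not action.get("status")
```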

Begin

Begin timestamp of the action.

End

End timestamp of the action.

Pid

Process identifier of the session handling the action on the node. Click to activate a pid filter with this value.
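
Since an action consists of one header line with no log message plus the log lines sharing its pid, filtering on a pid brings them back together. A minimal Python sketch of that grouping follows, assuming each journal line is a dict with hypothetical `pid` and `log` keys, and ignoring that the same pid value may occur on different nodes:

```python
from collections import defaultdict

def group_by_pid(lines):
    """Group journal lines by pid, separating the header from the log lines."""
    grouped = defaultdict(lambda: {"header": None, "logs": []})
    for line in lines:
        entry = grouped[line["pid"]]
        if line.get("log"):
            entry["logs"].append(line["log"])
        else:
            # The line with no log message is the action header.
            entry["header"] = line
    return dict(grouped)
```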

Log

The action log as displayed on the node's standard output.

Dashboard notifications

  • Service action error count