This view displays the journal of all services events with their status.
Quick insight of unacknowledged errors affecting a specific perimeter
ack=0 & status=err & responsible=me
Retrospective problem analysis
service=some_service & begin>2010-01-10% & end<2010-01-12%
service=some_service & action=syncnodes & status!=ok
Learn which system commands are run by the nodeware
Trace who was alerted of a problem, and when
Service name the action applied to. Private collectors usually report shortnames, shared collectors usually report service names with domain names.
The application code is a way to group services dedicated/paid by some corporate entity or project. You can setup any application code you want in your services configuration file, using the app parameter.
The responsibles are persons receiving alerts for a service. Hovering the mouse over the icon spawns the name of the responsibles. No icon means no responsible, which is an anomaly, and as such, cause alerts to be emitted to the site's administrator/manager.
Node where the action has be executed. The node name is a link to the asset view.
The executed action name. An action usually aggregates a number of log lines, plus a line with no log message as a header. The default action view has an 'empty' log filter active so you are presented only actions without their logs. You can drill down a specific action by clicking on its pid.
Action Description start start resources of type : ip, loop, disk group, zpool, fs, container, app stop stop resources of type : app, container, fs, zpool, disk group, loop, ip startdisk start resources of type : loop, disk group, zpool, fs stopdisk stop resources of type : fs, zpool, disk group, loop startip start resources of type : ip stopip stop resources of type : ip startloop start resources of type : loop stoploop stop resources of type : loop startvg start resources of type : disk group stopvg stop resources of type : disk group mount start resources of type : fs umount stop resources of type : fs prstart acquire scsi persistent reservations on disks of the service (wrapped by startvg and startdisk) prstop release scsi persistent reservations on disks of the service (wrapped by stopvg and stopdisk) syncnodes trigger hard-coded and user-defined file synchronization to secondary nodes. Optionally creates snapshots to send a coherent file set. No-op if run from a node not running the service. syncdrp trigger hard-coded and user-defined file synchronization to disaster recovery nodes. Optionally creates snapshots to send a coherent file set. No-op if run from a node not running the service. print_status print status of all service resources
Status Description ok The action completed succesfully. warn The action completed with some warnings. No acknowledgement needed. err The action completed with some error. Investigation and acknowledgement are needed. err (strikethrough) The action completed with some error. Acknowledged by a user. empty The collector has been informed of the action begining but has no yet received ending logs.
Begin timestamp of the action.
End timestamp of the action.
Process identifier of the session handling the action on the node. Click to active a pid filter with this value.
The action log as it is displayed on the node standart output.
- Service action error count