
Hyperion System Monitoring

In my KSCOPE12 presentation, Planning for and Managing Hyperion Infrastructure, I give a 100,000-foot view of monitoring.

In this article I bring monitoring down to a 20,000-foot view.

There are many products on the market to choose from for monitoring your Hyperion system(s).

In a nutshell, you need a solution with the following capabilities and characteristics:

  • Few or no false alarms
  • Ability to monitor: Database, Disk, Processor, Memory, and Network
  • Ability to extend base product capabilities with custom monitoring
  • Ability to test key parts of the system to prove it really is working

I narrowed the field down to five products that I believe have one or more of the following qualities: easy to implement, inexpensive, and/or purpose-built for Hyperion. You may note that I have excluded solutions from BMC, IBM, HP, and CA, whose products do not exhibit these qualities.

When implementing any new software, you should go through an evaluation cycle to understand how the tools compare in terms of capabilities and price.


The Problem

You understand Perl scripting and want a quick monitoring solution that is free, has no real infrastructure needs, and can be deployed in less than an hour.


The Solution

The Perl code below was a weekend project from several months back. It includes a feature (testing of Smart View logins) that you may wish to integrate with a more formal monitoring solution.

It is easy to implement and can run from your existing Hyperion deployment with no other software. It relies on Perl and will work with the Perl versions available on the Hyperion 11.1.2 line.

Modify the configuration file to suit your environment. This means changing the server names to match your environment, adding or removing services, adding or removing logs and keywords, and setting your system outage times. Finally, schedule the script to run on an interval (usually every 5 minutes).

NOTE: The mail server configuration is hard-coded in the script ($MAIL_SERVER and $BATCH_USER in email_alert), so be sure to update it.
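
Before scheduling the script, it can help to sanity-check the configuration file. Each non-comment line has the form TYPE,ID,TYPE_ATTRIB, and the script only runs an appcheck for IDs that also have a machine entry. The following is a minimal sketch of such a check, not part of the monitoring script itself; the validate_config function and the in-memory sample lines are hypothetical and only illustrate the idea.

```perl
use strict;
use warnings;

# Hypothetical sample lines; in practice these would be read from the .cfg file.
my @lines = (
    "#TYPE,ID,TYPE_ATTRIB\n",
    "port,FRWEB,8200\n",
    "machine,FRWEB,JAVA_APP_01\n",
    "appcheck,FRWEB,port\n",
    "appcheck,CALCWEB,port\n",
    "port,CALCWEB,8500\n",
);

# Returns a list of problems for appcheck IDs that lack a machine or port entry.
sub validate_config {
    my @config = @_;
    my (%port, %machine, %appcheck);
    foreach my $line (@config) {
        chomp(my $copy = $line);
        next if $copy =~ /^\s*(#|$)/;    # skip comments and blank lines
        my ($type, $id, $attrib) = split /,/, $copy, 3;
        next unless defined $attrib;     # skip malformed lines
        $port{$id}     = $attrib if $type eq 'port';
        $machine{$id}  = $attrib if $type eq 'machine';
        $appcheck{$id} = $attrib if $type eq 'appcheck';
    }
    my @problems;
    foreach my $id (sort keys %appcheck) {
        push @problems, "$id: no machine entry, so it will never be checked"
            unless exists $machine{$id};
        push @problems, "$id: port check requested but no port entry"
            if $appcheck{$id} eq 'port' && !exists $port{$id};
    }
    return @problems;
}

print "$_\n" for validate_config(@lines);
```

With the sample lines above, only CALCWEB is reported, because it has an appcheck and a port but no machine entry.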

Sample config file:

#Current code has some windows specifics e.g. the \\ notation.
#UPDATE THE SAMPLE BELOW: change JAVA_APP_01 to your main Java application server
#UPDATE THE SAMPLE BELOW: change ESS_APP_01 to your Essbase server
#TYPE,ID,TYPE_ATTRIB
#Down time cannot cross days.
down_time,ALL,SAT_2100_2330
down_time,ESB,ALL_0300_0430
credentials,user,password
email,OPS,notify1@myco.com
email,OPS,notify2@myco.com
#We can clean this up as we know which java.lang we really care about.
#You will probably ignore all java exceptions as there are a lot of "normal" ones
error_action,java.lang,notify
#ORA-01033: ORACLE initialization or shutdown in progress
#error_action,ORA-01033,notify
log,APSWEB,\\JAVA_APP_01\c$\Oracle\Middleware\user_projects\epmsystem1\diagnostics\logs\services\HyS9aps-sysout.log
log,CALCWEB,\\JAVA_APP_01\c$\Oracle\Middleware\user_projects\epmsystem1\diagnostics\logs\services\HyS9CALC-sysout.log
log,EASWEB,\\JAVA_APP_01\c$\Oracle\Middleware\user_projects\epmsystem1\diagnostics\logs\services\HyS9eas-sysout.log
log,SSWEB,\\JAVA_APP_01\c$\Oracle\Middleware\user_projects\epmsystem1\diagnostics\logs\services\HyS9FoundationServices-sysout.log
log,FRWEB,\\JAVA_APP_01\c$\Oracle\Middleware\user_projects\epmsystem1\diagnostics\logs\services\HyS9FRReports-sysout.log
log,PLNWEB,\\JAVA_APP_01\c$\Oracle\Middleware\user_projects\epmsystem1\diagnostics\logs\services\HyS9Planning-sysout.log
log,RAFWEB,\\JAVA_APP_01\c$\Oracle\Middleware\user_projects\epmsystem1\diagnostics\logs\services\HyS9RaFramework-sysout.log
log,RAFAGENT,\\JAVA_APP_01\c$\Oracle\Middleware\user_projects\epmsystem1\diagnostics\logs\services\HyS9RaFrameworkAgentOut.log
log,WAWEB,\\JAVA_APP_01\c$\Oracle\Middleware\user_projects\epmsystem1\diagnostics\logs\services\HyS9WebAnalysis-sysout.log
service,APSWEB,HyS9aps
service,CALCWEB,HyS9CALC
service,EASWEB,HyS9eas
service,EPMADS,HyS9EPMADataSynchronizer
service,EPMAWEB,HyS9EPMAWebTier
service,SSWEB,HyS9FoundationServices
service,ESB,opmn_EPM_epmsystem1
service,FRWEB,HyS9FRReports
service,PLNWEB,HyS9Planning
service,RAFWEB,HyS9RaFramework
service,RAFAGENT,HyS9RaFrameworkAgent
service,WAWEB,HyS9WebAnalysis
service,RMIREG,Hyperion RMI Registry
machine,ESB,ESS_APP_01
machine,RAFAGENT,JAVA_APP_01
machine,FRWEB,JAVA_APP_01
machine,PLNWEB,JAVA_APP_01
machine,EASWEB,JAVA_APP_01
machine,RMIREG,JAVA_APP_01
machine,APSWEB,JAVA_APP_01
machine,WAWEB,JAVA_APP_01
machine,SSWEB,JAVA_APP_01
machine,RAFWEB,JAVA_APP_01
port,ESB,1423
port,EPMADIM,5251
port,EPMAJNI,5255
port,RAFAGENT,6860
port,FRWEB,8200
port,PLNWEB,8300
port,CALCWEB,8500
port,EASWEB,10080
port,RMIREG,11333
port,APSWEB,13080
port,WAWEB,16000
port,EPMADS,19101
port,EPMAWEB,19091
port,SSWEB,28080
port,RAFWEB,45000
appcheck,RAFAGENT,port
appcheck,ESB,port
appcheck,RAFWEB,loginwks
appcheck,FRWEB,port
appcheck,PLNWEB,port
appcheck,CALCWEB,port
appcheck,EASWEB,port
appcheck,RMIREG,port
appcheck,APSWEB,port
appcheck,WAWEB,port
appcheck,EPMADS,port
appcheck,EPMAWEB,port
appcheck,SSWEB,port
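
The down_time entries above use the form DAY_HHMM_HHMM, where DAY is a three-letter weekday or ALL. The sketch below shows how such an entry can be evaluated and why a window cannot cross days: the current time is compared against start and end within a single day. It mirrors the logic of downtime_check in the script, but the in_window function name and its explicit weekday/time arguments are illustrative assumptions.

```perl
use strict;
use warnings;

my %day_hash = (SUN => 0, MON => 1, TUE => 2, WED => 3, THU => 4, FRI => 5, SAT => 6);

# Returns 1 if the given weekday (0 = Sunday) and time (HHMM as a number)
# fall inside a window written as DAY_HHMM_HHMM, otherwise 0.
sub in_window {
    my ($spec, $wday, $hourmin) = @_;
    my ($dow, $start, $end) = split /_/, $spec;
    $dow = uc $dow;
    return 0 unless $dow eq 'ALL' || (exists $day_hash{$dow} && $day_hash{$dow} == $wday);
    return ($hourmin >= $start && $hourmin <= $end) ? 1 : 0;
}

print in_window('SAT_2100_2330', 6, 2215), "\n";   # Saturday 22:15, inside the window
print in_window('SAT_2100_2330', 0, 2215), "\n";   # Sunday 22:15, wrong day
print in_window('ALL_0300_0430', 3, 315),  "\n";   # Wednesday 03:15, inside the window
# A window written across midnight never matches, because start > end:
print in_window('ALL_2300_0100', 3, 2330), "\n";
```

This prints 1, 0, 1, 0. Because the script stores one down_time value per ID, an outage that spans midnight needs a different approach than a single entry.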

Perl script:

use strict;
use Switch;
use File::stat;
use IO::Socket;
use LWP::UserAgent;
use HTTP::Request::Common;
use HTTP::Cookies;
use URI::Escape;
use Net::SMTP;

my $user_id;
my $password;
my @Machines=();
my @OPSEmailNotify=();
my %LoggedErrors;
my %ApplicationCheck;
my %Downtime;
my %ErrorAction;
my %Logs;
my %Ports;
my %Service;

#DB Outage window Saturdays from 9:00 PM - 11:30

my $service_message="";
my $service_timestamp="";
my %day_hash = ('SUN',0,'MON',1,'TUE',2,'WED',3,'THU',4,'FRI',5,'SAT',6);

my($filepath,$junk) = split(/\./,$0);
#Name the configuration file with same name as script. e.g. system_mon.pl and system_mon.cfg or mon.pl and mon.cfg
my $config_file=$filepath.".cfg";
print "Reading $config_file\n";
open (CONFIG_FILE,$config_file) || die "cannot open file $config_file\n";

my @lines = <CONFIG_FILE>;
close(CONFIG_FILE);

my $index;
my $type;
my $id;
my $type_attrib;
my $count = @lines;
print "Found $count lines in ".$config_file."\n";
for ($index = 0;$index< $count;$index++) {
if (!($lines[$index] =~m/^\#/)) {
($type,$id,$type_attrib)=split(/,/,$lines[$index]);
chomp $type_attrib;

switch ($type) {
 case "error_action" {$ErrorAction{ $id } = $type_attrib;}
 case "down_time" {$Downtime{ $id } = $type_attrib;}
 case "log" {$Logs{ $id } = $type_attrib;}
 case "service" {$Service{ $id } = $type_attrib;}
 case "port" {$Ports{ $id } = $type_attrib;}
 case "appcheck" {$ApplicationCheck{ $id } = $type_attrib;}
 case "machine" {push(@Machines,"$id,$type_attrib");}
 case "email" {if ($id eq "OPS") {push(@OPSEmailNotify,"$type_attrib");}}
 case "credentials" { $user_id = $id; $password = $type_attrib; }
 else { print "unknown type $type, ignoring\n"};

 }
 }
}

my $ERROR_FH;
my $error_file=$filepath.".err";

if (-f $error_file) {
 open ($ERROR_FH,"<", $error_file) || die "cannot open file $error_file\n";
 while (<$ERROR_FH>) {
 chomp $_;
 my($ts,$app,$log)=split(/_/);
 $LoggedErrors{ $app.$ts } = $log;
 }
 close $ERROR_FH;
}

open ($ERROR_FH,">", $error_file) || die "cannot open file $error_file\n";

#Loop through down-time and perform checks if we are not in down_time window
if (downtime_check("ALL")) { print("Down time window, no system checks will be run.\n"); }
else {
 application_check($ERROR_FH);
 monitor_logs($ERROR_FH);
}

close $ERROR_FH;

sub application_check {
my $FH = shift @_;
my $error_message="";
my $ts=localtime;
foreach (@Machines) {
my($service_id,$machine_name)=split(/,/);
if (downtime_check($service_id)==0) {
 switch ($ApplicationCheck{$service_id}) {
 case "port" { if (port_check($machine_name,$service_id)) {printf $FH "%s_%s_%s\n",$machine_name,$Ports{$service_id},$ts;
 if (!exists($LoggedErrors{$Ports{$service_id}.$machine_name})) { $error_message=$error_message."Machine Name: $machine_name Service:$Service{$service_id} Port:$Ports{$service_id} not available\n"}} }
 case "loginwks" {
 switch (logon_raf_smartview($machine_name,"IIS",$user_id,$password)) {
 case "1" { printf $FH "%s_%s_%s\n",$machine_name,"loginwks",$ts; if (!exists($LoggedErrors{"loginwks".$machine_name})) {$error_message=$error_message."logon_raf_smartview: $machine_name Invalid User or Password passed to logon_raf_smartview!\n"; }}
 case "2" { printf $FH "%s_%s_%s\n",$machine_name,"loginwks",$ts; if (!exists($LoggedErrors{"loginwks".$machine_name})) {$error_message=$error_message."logon_raf_smartview: $machine_name Service:".$Service{'SSWEB'}." FAILED\n";}}
 case "3" { printf $FH "%s_%s_%s\n",$machine_name,"loginwks",$ts; if (!exists($LoggedErrors{"loginwks".$machine_name})) {$error_message=$error_message."logon_raf_smartview: $machine_name Service:".$Service{'RAFWEB'}." FAILED\n";}}
 case "4" { printf $FH "%s_%s_%s\n",$machine_name,"loginwks",$ts; if (!exists($LoggedErrors{"loginwks".$machine_name})) {$error_message=$error_message."logon_raf_smartview: $machine_name Service:".$Service{'RAFAGENT'}." FAILED\n";}}
 else {}
 }
 }
 else { print "No application check found for type $service_id\n"};
 }

}
}
if (length($error_message)>0) { email_alert("Hyperion Log Alert",$error_message,\@OPSEmailNotify);}
}

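sub restart_service {
my $computer_name=shift @_;
my $service_name=shift @_;
# sc takes the target machine as \\name; quote the service name because some contain spaces
system("sc \\\\$computer_name stop \"$service_name\"");
sleep(30);
system("sc \\\\$computer_name start \"$service_name\"");
}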

sub port_check {
my $machine_name=shift @_;
my $service_id=shift @_;
my $sock = new IO::Socket::INET (
 PeerAddr => $machine_name,
 PeerPort => $Ports{$service_id},
 Proto => 'tcp',
 Timeout => '2',
 );
if ($sock) { close($sock); return 0; }
else { return 1;}
}

sub monitor_logs {
my $FH = shift @_;
my $ts;
my $error_message="";

print "Monitor Logs\n";

while ( my ($app_id, $log_path) = each(%Logs) ) {
if (-f $log_path) {
 open (FILE, "< $log_path") or die("Cannot open input file $log_path\n");
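 while (<FILE>) {
 if (substr($_, 0, 1) eq "<") { $service_message=$_; my $junk; ($ts,$junk)=split(/\>/);
 ($junk,$ts)=split(/\</,$ts); }
 # ASSUMPTION: the original loop body is truncated at this point. The keyword
 # matching below is a reconstruction based on the error_action config entries
 # and the way application_check records and suppresses repeated errors.
 foreach my $keyword (keys %ErrorAction) {
 if ($ErrorAction{$keyword} eq "notify" && index($_,$keyword)>=0) {
 printf $FH "%s_%s_%s\n",$ts,$app_id,$log_path;
 if (!exists($LoggedErrors{$app_id.$ts})) { $error_message=$error_message."$app_id: $_"; }
 }
 }
 }
 close(FILE);
}
}
if (length($error_message)>0) { email_alert("Hyperion Log Alert",$error_message,\@OPSEmailNotify);}
}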

sub logon_raf_smartview {
my $SERVER=shift;
my $WEBSERVER=shift;
my $USER=shift;
my $PASSWORD=shift;
my $CLIENT_VERSION="4.2.0.0.0";
my $SERVER_PORT;

if ($WEBSERVER !~ m/IIS/) { $SERVER_PORT=$SERVER.":19000"; }
else {$SERVER_PORT=$SERVER;}

my $userAgent = LWP::UserAgent->new(agent => 'HttpApp/1.0');

# Store Cookies
$userAgent->cookie_jar(
 HTTP::Cookies->new(
 file => 'mycookies.txt',
 autosave => 1
 )
 );

my $message = "<req_ConnectToProvider>".$CLIENT_VERSION."en_US";
my $response = $userAgent->request(POST 'http://'.$SERVER_PORT.'/workspace/SmartViewProviders',
Content_Type => 'text/xml',
Content => $message);

if (!$response->is_success || $response->as_string !~ m/Oracle Enterprise Performance Management System Workspace/) {
print("login_raf_smartview: Failed to receive workspace response from $SERVER_PORT, check Hyperion Foundation Services - Managed Server\n");
return 2;
}

my $message = "<req_GetProvisionedDataSources>";
my $response = $userAgent->request(POST 'http://'.$SERVER_PORT.'/workspace/SmartViewProviders',
Content_Type => 'text/xml',
Content => $message);

if (!$response->is_success || $response->as_string !~ m/User authentication needed/) {
print("login_raf_smartview: Failed to receive workspace authentication challenge from $SERVER_PORT, check Hyperion Foundation Services - Managed Server\n");
return 2;
}

my $message = "<req_GetProvisionedDataSources>".$USER."".$PASSWORD."";
my $response = $userAgent->request(POST 'http://'.$SERVER_PORT.'/workspace/SmartViewProviders',
Content_Type => 'text/xml',
Content => $message);

if ($response->is_success && $response->as_string =~ m/Invalid login/) {
print("login_raf_smartview: Invalid username or password passed to login_raf_smartview function in monitoring script\n");
return 1;
}

if (!$response->is_success || $response->as_string !~ m/\<sso\>/) {
print("login_raf_smartview: Failed to receive sso token from $SERVER_PORT, check Hyperion Foundation Services - Managed Server\n");
return 2;
}

my $sso_token = substr($response->as_string,index($response->as_string,"<sso>")+5,index($response->as_string,"</sso>")-index($response->as_string,"<sso>")-5);

$message="<req_GetProvisionedDataSources>".$sso_token."";
my $response = $userAgent->request(POST 'http://'.$SERVER_PORT.'/workspace/SmartViewProviders',
Content_Type => 'text/xml',
Content => $message);

if (!$response->is_success || $response->as_string !~ m/res_GetProvisionedDataSources/) {
print("login_raf_smartview: Failed to receive response to GetProvisionedDataSources request from $SERVER_PORT, check Hyperion Foundation Services - Managed Server\n");
return 2;
}

my $message = "rcp_version=1.4&sso_token=".uri_escape($sso_token)."&applicationtype=officeAddin&applicationversion=1.0.0&format=excel.2003&hycmnaddin18467=41&action=server";
my $response = $userAgent->request(POST 'http://'.$SERVER_PORT.'/raframework/browse/listXML',
Content_Type => 'application/x-www-form-urlencoded;charset=UTF-8',
Content => $message);

if (!$response->is_success && $response->as_string =~ m/Service Unavailable/) {
print("login_raf_smartview: Failed to connect to $SERVER_PORT. Check Hyperion Reporting and Analysis Framework Web Application\n");
return 3;
}

if ($response->is_success && $response->as_string =~ m/port 6800/) {
print("login_raf_smartview: Failed to connect. Server cannot connect to port 6800, check Hyperion Reporting Analysis Framework\n");
return 4;
}

print("login_raf_smartview: passed for $SERVER_PORT\n");
return 0;
}

sub downtime_check {
my $service_id=shift @_;

my $system_downtime=0;
my ($seconds,$minute,$hour,$day,$month,$year,$wday,$yday,$isdst)=localtime(time);
my $hourmin=$hour*100 + $minute;

if (exists($Downtime{$service_id})) {
 my($dow,$start_time,$end_time)=split(/_/,$Downtime{$service_id});
 if ( (uc($dow) eq "ALL" || $day_hash{uc($dow)} == $wday)&& $hourmin>=$start_time && $hourmin<=$end_time) { return 1;} }
return 0; }

sub email_alert {
my $SUBJ=shift;
my $MESSAGE=shift;
my $NOTIFY_USER_ARRAY=shift;
my $MAIL_SERVER='smtpl.myco.com';
my $BATCH_USER='Hyperion_Mon@myco.com';
my $mailto;
my $smtp = Net::SMTP->new($MAIL_SERVER) || die "cannot connect to mail server $MAIL_SERVER\n";
print $smtp->banner();
$smtp->mail($BATCH_USER);

print $smtp->code();
print $smtp->message();

$smtp->recipient(@$NOTIFY_USER_ARRAY);

print $smtp->code();
print $smtp->message();

$smtp->data();
foreach $mailto (@$NOTIFY_USER_ARRAY) {
print "Notifying $mailto \n";
$smtp->datasend("To: $mailto\n");
}
$smtp->datasend("Subject: $ENV{COMPUTERNAME} - $SUBJ\n\n");
$smtp->datasend("\n");
$smtp->datasend("$MESSAGE\n");
$smtp->dataend();

print $smtp->code();
print $smtp->message();

$smtp->quit;
}
Company:     Blue Stone International, http://www.bluestoneinternational.com