Troubleshoot Stack Monitoring
The topics in this section provide troubleshooting information to identify and address common issues that may occur while working with Stack Monitoring.
Troubleshoot General Issues
New permissions in resource-types are not propagated
This happens because IAM does not recompile a policy unless there is a change to the policy statement.
When new permissions are added to a resource-type, edit each existing policy that uses that resource-type by adding a blank space, then save the policy so that IAM recompiles it.
For more information, see New permissions in resource-types are not propagated.
Troubleshoot PeopleSoft
Discovery Job Behavior
The example logs show two Process Scheduler Domain work items: one successful and saved, and the other presenting a domain down error. Each detailed log is identified by its respective work item ID.
Discovery Error Messages
Database validation failed error
The output of a failed discovery job includes a Work Item (WI) ID. Using this ID, search throughout the entire log for additional details to determine the cause of the failed discovery.
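Searching a log by work item ID can be sketched as a simple line filter. This is a minimal illustration only: the function name, the ID values, and the sample log lines are hypothetical, and real Stack Monitoring logs will be formatted differently.

```python
def lines_for_work_item(log_text: str, work_item_id: str) -> list[str]:
    """Return every log line that mentions the given work item ID."""
    return [line for line in log_text.splitlines() if work_item_id in line]


# Hypothetical log excerpt, invented for illustration only.
sample_log = """\
INFO  work item wi-001 started
ERROR work item wi-002 database validation failed
INFO  work item wi-001 saved
"""

for line in lines_for_work_item(sample_log, "wi-002"):
    print(line)
```

Filtering on the ID rather than on the error text keeps all entries for one work item together, including the lines that precede the failure.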
Message Error | Troubleshooting |
---|---|
Fetchlet exception error | Validate the username and password entered in the Database Credentials section to ensure that the discovery job uses the correct credentials. |
Message that displays the host entered | Validate the host, then retry the discovery job. |
Error due to a connection failure with the PSFT Database, displaying the host name and address along with its port | This error is triggered when the Database Port is incorrect. Retry discovery with the correct port. |
Error message with a fetchlet exception in the log | Retry the discovery job and enter the correct Database Service Name in the PSFT Database section. |
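Before retrying a discovery that failed with a connection error, it can help to confirm that the database host and port accept TCP connections from the agent host. The following is a minimal sketch using Python's standard `socket` module. The function name is hypothetical, and a successful connection only shows that something is listening on that port, not that it is the PeopleSoft database listener.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unknown hosts.
        return False
```

For example, `port_is_reachable("db.example.com", 1521)` would return `False` for an unknown host or a wrong port. Both the host name and the port in that call are placeholders.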
Resource families validation failed error
PeopleSoft has the following resource families:
- Application Server Domain
- Process Scheduler Domain
- PeopleSoft Internet Architecture (PIA)
There can be several resources of each family in a discovery job. A discovery job is marked as successful if at least one resource of each family is successful. Therefore, a job can be successful even if some work items fail for some child resources.
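The success rule described above can be sketched as a small check over per-family results. This is an illustration of the rule only, not the service's implementation. The function names and the input shape are hypothetical, and treating a family with no resources in the job as failing is an assumption.

```python
FAMILIES = (
    "Application Server Domain",
    "Process Scheduler Domain",
    "PeopleSoft Internet Architecture (PIA)",
)


def failing_families(results: dict[str, list[bool]]) -> list[str]:
    """Return the families that have no successful resource.

    `results` maps a family name to one boolean per work item.
    Assumption: a family absent from `results` counts as failing.
    """
    return [family for family in FAMILIES if not any(results.get(family, []))]


def job_succeeded(results: dict[str, list[bool]]) -> bool:
    """A job succeeds when every family has at least one successful resource."""
    return not failing_families(results)


# Hypothetical job: one Application Server work item failed, but the family
# still passes because another resource in it succeeded. PIA has no success.
example = {
    "Application Server Domain": [True, False],
    "Process Scheduler Domain": [True],
    "PeopleSoft Internet Architecture (PIA)": [False],
}
print(job_succeeded(example))
print(failing_families(example))
```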
If this validation fails, the job shows the following logs:
General error message
This message indicates that the discovery job did not meet the requirement of having at least one successful resource in each family. It then provides a list of resource families, followed by a summary log for each failing family:
Summary of failing work items
This log example lists the failed work items for Application Server family resources. Use the listed work item IDs to find the remaining logs, which give more detail about the failures. Each work item can fail for a different reason, so refer to each work item ID in the logs for specifics. The following are possible errors for each work item, along with their solutions.
Message Error | Troubleshooting |
---|---|
Discovery failed for oracle_psft_appserv | This type of error appears when discovery is provided with invalid credentials. The message shown is for an Application Server Domain work item, but this error also applies to Process Scheduler Domain (oracle_psft_pcrs). To fix this error, enter the correct credentials for that resource. |
Domain down error | This error indicates that a domain is down for the resource that failed in discovery. To fix it, verify that the application is running in the PeopleSoft console, and turn the domain back on. This type of error can occur for Process Scheduler Domain and Application Server Domain. The beginning of the log shows which work item failed, along with its work item ID, to easily identify the failing resource. |
PIA domain error | This error occurs when there is a misconfiguration for a PIA domain (down status). |
Elasticsearch errors
If Elasticsearch is discovered together with PeopleSoft, the Elasticsearch work item determines the success or failure of the PeopleSoft discovery. If an error occurs while discovering Elasticsearch and the work item fails, then the PeopleSoft discovery job is not successful either.
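The dependency between the two discoveries can be expressed as a short rule. This is a sketch of the behavior described above, not the service's code. The function and parameter names are hypothetical.

```python
from typing import Optional


def discovery_succeeded(peoplesoft_ok: bool, elasticsearch_ok: Optional[bool] = None) -> bool:
    """Combine the PeopleSoft result with the optional Elasticsearch result.

    `elasticsearch_ok` is None when Elasticsearch is not part of the job;
    otherwise a failed Elasticsearch work item fails the whole job.
    """
    if elasticsearch_ok is None:
        return peoplesoft_ok
    return peoplesoft_ok and elasticsearch_ok
```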
The following is the message shown when an Elasticsearch error occurs. It provides a work item ID that can be used to find detailed logs about what is causing the failure.
General Error
Message Error | Troubleshooting |
---|---|
Failed to collect data, 500 SERVER ERROR | There was an error trying to connect to and collect data from the specified host. This error occurs when an invalid username is provided in the discovery. |
Failed to collect data, status 401 | Unauthorized access due to invalid credentials. Ensure that the correct password is entered when performing the discovery. |
FileNotFoundException | The TrustStore path provided is incorrect. This could be due to a mistyped value in the TrustStore path field, or because the file does not exist in the specified location. Also ensure that the file is accessible on the agent host. |
Password verification failed | The TrustStore password provided is incorrect. |
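For the FileNotFoundException case, the TrustStore path can be checked on the agent host before retrying discovery. The following is a minimal sketch with a hypothetical function name. It assumes the script runs on the agent host as the same operating system user as the agent, because otherwise the readability check does not reflect what the agent sees.

```python
import os
from typing import Optional


def truststore_path_problem(path: str) -> Optional[str]:
    """Return a description of the problem with `path`, or None if it looks usable."""
    if not os.path.isfile(path):
        return f"No file found at {path}"
    if not os.access(path, os.R_OK):
        return f"File at {path} is not readable by the current user"
    return None
```

This only validates the path. It does not open the TrustStore, so an incorrect TrustStore password is not detected by this check.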
Troubleshoot SOA
Monitoring SOA applications created from Marketplace images:
When a SOA application is provisioned using a Marketplace image, SOA-related metrics are not populated. The Marketplace image places the SOA and WebLogic configuration files in two separate locations. To populate the SOA metrics, copy the configuration files from the SOA location to the WebLogic directory.
Copy the files as indicated and restart WebLogic. SOA Infra metrics start appearing a few minutes after the WebLogic restart. The Marketplace image installs SOA Suite in a different location than the WebLogic stack.
Copy the following files:
From:
-rwxrwxr-x. 1 oracle oracle 21156 May 18 2011 server-scheduler_service.xml
-rwxrwxr-x. 1 oracle oracle 15788 May 18 2011 domain-scheduler_service.xml
-rwxrwxr-x. 1 oracle oracle 2929 Nov 11 2013 server-bea_alsb.xml
-rwxrwxr-x. 1 oracle oracle 242238 Feb 28 2016 server-oracle_soainfra.xml
-rwxrwxr-x. 1 oracle oracle 232504 Jul 10 2016 server-oracle_soainfra_partition.xml
-rwxrwxr-x. 1 oracle oracle 2992 Aug 15 2016 server-oracle_soa_composite-11.0.xml
-rwxrwxr-x. 1 oracle oracle 95241 Jan 16 2017 domain-oracle_soainfra.xml
To:
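The copy step can be scripted. The sketch below uses Python's standard `shutil` and `pathlib` modules with the file names listed above. The function name is hypothetical, and the source and target directories are deliberately left as parameters because they depend on the installation. The target directory is assumed to exist already.

```python
import shutil
from pathlib import Path

# File names taken from the listing above.
SOA_CONFIG_FILES = [
    "server-scheduler_service.xml",
    "domain-scheduler_service.xml",
    "server-bea_alsb.xml",
    "server-oracle_soainfra.xml",
    "server-oracle_soainfra_partition.xml",
    "server-oracle_soa_composite-11.0.xml",
    "domain-oracle_soainfra.xml",
]


def copy_soa_config_files(source_dir: str, target_dir: str) -> list[Path]:
    """Copy the SOA configuration files and return the destination paths."""
    source, target = Path(source_dir), Path(target_dir)
    # Check everything first so a missing file does not leave a partial copy.
    missing = [name for name in SOA_CONFIG_FILES if not (source / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing in {source}: {', '.join(missing)}")
    # copy2 also preserves file metadata such as modification times.
    return [Path(shutil.copy2(source / name, target / name)) for name in SOA_CONFIG_FILES]
```

A WebLogic restart is still required after the copy for the SOA Infra metrics to appear.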
Troubleshoot a Maintenance Window
Retry a Maintenance Window
For active Maintenance Windows, a retry can be performed only after an operation is marked as Partial Success.
Open the actions menu of the Maintenance Window to access the Retry option.
Updated topology
When a resource changes its topology, for example when a cluster adds or removes one or more of its servers, the Maintenance Window is not automatically updated. To update the resources included in the Maintenance Window after a topology change, edit the Maintenance Window according to the resource's new topology.
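One way to work out which edits a Maintenance Window needs after a topology change is to compare its current member list with the resource's new topology. This is a minimal sketch using set differences. The function name and the server names are hypothetical, and how the two lists are obtained is outside its scope.

```python
def topology_changes(window_members: set[str], current_members: set[str]) -> tuple[set[str], set[str]]:
    """Return (resources to add to the window, resources to remove from it)."""
    to_add = current_members - window_members
    to_remove = window_members - current_members
    return to_add, to_remove


# Hypothetical cluster: server2 was removed and server3 was added.
to_add, to_remove = topology_changes(
    window_members={"server1", "server2"},
    current_members={"server1", "server3"},
)
print(sorted(to_add), sorted(to_remove))
```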