
Top 10 WebLogic Performance Metrics to Proactively Monitor a Server Farm

My 10 Key Metrics
Now let me get into some of the key areas I personally monitor, with an explanation of why I monitor each one. Note that they are not in any specific order. The list below is only a partial list, meant as an example of the type of data you can use to tune, troubleshoot and learn about how your system performs.
#1: JVM – Percent of time in Garbage Collection
Time spent in GC is a key indicator of how the app uses memory
GC involves stop-the-world pauses, so it is very important to verify that the system is not spending too much time in this state. This metric is also helpful for validating configuration changes and for capacity management. Factors that affect it include the number of CPU cores, the amount of memory allocated, and so on.
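One way to derive this percentage, sketched here under assumptions: sample the JVM's cumulative GC time at two points and divide the difference by the wall-clock time between the samples. The function name and the sample numbers are hypothetical; how you read the cumulative GC time (JMX, a monitoring agent, GC logs) depends on your setup.

```python
def gc_time_percent(gc_ms_start, gc_ms_end, wall_ms_start, wall_ms_end):
    """Percent of wall-clock time spent in GC between two samples.

    All arguments are milliseconds. The gc_* values are cumulative GC
    time counters; the wall_* values are the times the samples were taken.
    """
    elapsed = wall_ms_end - wall_ms_start
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    gc_delta = gc_ms_end - gc_ms_start
    if gc_delta < 0:
        # A counter that goes backwards usually means the JVM restarted.
        raise ValueError("GC counter was reset between samples")
    return 100.0 * gc_delta / elapsed


# Hypothetical samples taken 60 seconds apart, with 1.5 s of GC in between.
print(gc_time_percent(12000, 13500, 0, 60000))  # prints 2.5
```

What counts as "too much" depends on the application, so any alert threshold on this value is something to calibrate per system.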

#2: Execute Thread Counts
Watch the number of threads, as they give an indication of how well your system runs
When WebLogic opens too many threads to service the load, performance decreases, because every thread consumes resources (CPU, memory). This metric can be used for monitoring and also for capacity planning.
#3: Work Manager Thread Usage
Work Manager threads ensure resources are properly assigned to applications
A Work Manager is used to limit resources or to ensure that the right application gets the resources. This is where you can validate that your Work Managers are not capped at an inappropriate level, and so on.
#4: JDBC
Determine whether you have proper sizing and isolate connection leaks in your app
Comparing Current Capacity with Current Capacity High allows you to validate that you have the correct amount of resources available to service the clients’ needs. It also helps determine whether you need to increase or decrease the pool size. Connection Delay Time, meanwhile, can be used to gauge DB responsiveness.
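The sizing decision can be sketched as a simple comparison of the pool's high-water mark against its configured maximum. This is only an illustration: the function name, the `max_capacity` input and the 50% low-water threshold are assumptions, not WebLogic settings.

```python
def pool_sizing_hint(capacity_high, max_capacity, low_water=0.5):
    """Suggest a pool size change from the observed high-water mark.

    capacity_high: highest pool capacity observed (Current Capacity High).
    max_capacity:  configured maximum pool size (assumed input).
    low_water:     hypothetical fraction below which the pool looks oversized.
    """
    if max_capacity <= 0:
        raise ValueError("max_capacity must be positive")
    if capacity_high >= max_capacity:
        # The pool has hit its ceiling at least once.
        return "consider increasing"
    if capacity_high < low_water * max_capacity:
        # The pool never grew past a fraction of its maximum.
        return "consider decreasing"
    return "ok"


print(pool_sizing_hint(50, 50))  # prints: consider increasing
print(pool_sizing_hint(10, 50))  # prints: consider decreasing
print(pool_sizing_hint(40, 50))  # prints: ok
```

A pool that keeps hitting its maximum can also point to a connection leak rather than undersizing, so it is worth checking the application before raising the limit.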
#5: Application Health and Applications deployed
Actively monitor your application health rather than relying only on the WebLogic Health Status
Validate that all deployments are deployed to the correct servers and are in an active state. I can’t count the number of times WebLogic said everything was Active while the server it was deployed to said “Failed.” You can also start monitoring active sessions on the servers. This is great data for capacity planning.
#6a: JMS Oldest Message Age
Old messages in a queue mean that your system can’t keep up with processing them
Normally it doesn’t matter how many messages are on the queue, but it is a good idea to pay close attention to how old the oldest message is. An old message is normally a key indication of issues, or shows that an excessive message dump on the queue is creating more load than the system has the capacity to keep up with. The consumer count in #6b helps verify this.
#6b: JMS Consumers
Make sure to monitor all key metrics for JMS
The consumer count shows how many consumers are on the queue. If there are no consumers, messages are not being processed.
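The two JMS checks (#6a and #6b) can be combined into one small alerting rule. The sketch below assumes you already collect the oldest message age and the consumer count for a queue; the function name and the 300-second default limit are hypothetical and should be tuned per queue.

```python
def jms_queue_alerts(oldest_age_s, consumers, max_age_s=300):
    """Return a list of alert strings for one queue (empty if healthy).

    oldest_age_s: age of the oldest message on the queue, in seconds.
    consumers:    number of consumers currently attached to the queue.
    max_age_s:    hypothetical age limit before we raise an alert.
    """
    alerts = []
    if consumers == 0:
        alerts.append("no consumers: messages are not being processed")
    if oldest_age_s > max_age_s:
        alerts.append(
            "oldest message is %d s old (limit %d s)" % (oldest_age_s, max_age_s)
        )
    return alerts


print(jms_queue_alerts(10, 2))   # prints []
print(jms_queue_alerts(900, 0))  # prints both alerts
```

Note that the rule ignores queue depth on purpose, in line with the point above that message age matters more than message count.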
#7: Cluster Server Alive Count
This metric lets you verify that all servers in the cluster are talking to each other and know about each other.
#8: Server Listen Address
Verify availability and responsiveness of the server listening port.
This metric tells you whether all servers are communicating properly with the Admin Server. If the listen address is not reported properly (<host>/<IP>), the managed server is not communicating with the Admin Server, and you lose the ability to monitor and troubleshoot it through the console. Normally this happens when the server is under extremely heavy load or, depending on your WebLogic version, because of a WebLogic bug.
#9: Server Running Time
Catch servers that keep restarting due to crashes by watching the total run time
This is a great metric to catch servers that crash and get restarted by Node Manager. :-)
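A minimal sketch of how restarts show up in this metric: if you sample the running time periodically, any sample that is lower than the previous one means the server was restarted in between. The function name and the sample values are hypothetical.

```python
def count_restarts(uptime_samples_s):
    """Count restarts in a series of periodic uptime samples (seconds).

    Uptime only grows while a server stays up, so a drop between two
    consecutive samples means the server restarted in that interval.
    """
    restarts = 0
    for previous, current in zip(uptime_samples_s, uptime_samples_s[1:]):
        if current < previous:
            restarts += 1
    return restarts


# Hypothetical samples taken once a minute: two drops, so two restarts.
print(count_restarts([100, 160, 20, 80, 5]))  # prints 2
```

A server that restarts more than once between two samples is still counted once, so the sampling interval limits how precise the count can be.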
#10: Monitoring Time
Keep an eye on the overhead of your own monitoring
You want to limit monitoring so that you leave the maximum resources available for the system. I’m often asked how to figure out the right balance of system monitoring. Every system is different so there is no magic number. A few key things to look at:
•  Are your servers properly sized? Is there enough CPU/memory available?
•  Use the monitoring you already have in place on the individual servers to validate whether or not you are placing undue load on those servers.
•  With the Dynatrace agents, you can compare response times against your monitoring intervals to see whether you are straining the system or causing any noticeable performance impact.
Those pieces of information can help you determine the appropriate amount of monitoring. In a future blog, maybe we can cover these in more detail.
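As a rough illustration of the response-time comparison, the sketch below compares average response times measured with and without monitoring active. The function name and the numbers are hypothetical, and a real comparison would need enough samples under similar load to be meaningful.

```python
from statistics import mean


def monitoring_overhead_percent(baseline_ms, monitored_ms):
    """Relative change in mean response time while monitoring is active.

    baseline_ms:  response times (ms) measured without monitoring.
    monitored_ms: response times (ms) measured with monitoring running.
    """
    base = mean(baseline_ms)
    if base <= 0:
        raise ValueError("baseline response times must be positive")
    return 100.0 * (mean(monitored_ms) - base) / base


# Hypothetical measurements: 100 ms baseline vs. 110 ms while monitored.
print(monitoring_overhead_percent([100, 100], [110, 110]))  # prints 10.0
```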