The ultimate guide to Performance Testing and Engineering using LoadRunner

Wednesday 30 July 2014

.NET application profiling in a nutshell

Popular .NET application profiling tools include:-

ANTS Memory and Performance Profiler
CLR Profiler
Microsoft Visual Studio 2008 Profiling Tools

The following KPIs (Key performance indicators) can be used to identify potential bottlenecks and problems in application code:-

1. High call count
Treat functions with very high call counts with suspicion and investigate them. Sometimes a high call count is valid, but it can also come from an error in event handling that strains resources with unintended processing. Use the call graph facility in your profiling tool to trace back to where the calls originate, then decide whether the behavior is intended. If you do find a problem, it can usually be optimized quickly.

2. Slowest function excluding child calls
Indicates the slowest functions where the body of the function itself takes the time. The figure excludes time spent calling other source code functions (child functions) but includes time spent calling .NET Framework functions. Identify the slowest functions excluding child calls and then, if line-level timings are available, look for the slowest code lines and determine whether they can be optimized. You will often see slow lines waiting for database and web service calls to return.

3. Slowest function including child calls
Indicates the slowest functions by total cost, which includes the time spent calling other source code functions (child functions). Use the call graph in your profiling tool to explore the slowest part of the call tree.
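The relationship between the two timings (including and excluding child calls) can be sketched as simple arithmetic. The function name exclusive_time_ms and its parameters are hypothetical, and the child list is assumed to contain only source code functions, since framework calls stay inside the function's own time as described above.

```c
#include <stddef.h>

/* Exclusive (self) time = inclusive time minus the inclusive time of
 * every direct child call made to other source code functions. */
double exclusive_time_ms(double inclusive_ms,
                         const double *child_inclusive_ms,
                         size_t child_count)
{
    double self_ms = inclusive_ms;
    size_t i;

    for (i = 0; i < child_count; i++) {
        self_ms -= child_inclusive_ms[i];
    }
    return self_ms;
}
```

A function with 500 ms inclusive time whose two children take 300 ms and 150 ms spends only 50 ms in its own body, so the children are the better place to look first.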

4. Functions with Wait time
Functions with high Wait time can indicate performance problems induced from other application layers such as DB, Web service etc. or problems with thread locking. Identify the application layer that has introduced the delay and cause of contention on that layer.

5. Functions Generating Network Activity
Network activity is expensive and should be analyzed carefully. Make sure that the network activity generated by the function is valid. Keep the number of network calls as low as possible and fetch more data in each call: every round trip pays the network latency again, so the more round trips a function makes, the more latency adds to the response time.
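As a rough illustration of why fewer, larger calls help, the sketch below estimates transfer time as latency per round trip plus payload time. It is a simplified model, not a measurement: transfer_time_ms and all of its parameters are hypothetical names, and effects such as connection setup and server processing are ignored.

```c
/* Simplified model:
 *   total time = round trips * latency + payload size / throughput
 * bytes_per_ms is the assumed usable throughput of the link. */
double transfer_time_ms(unsigned round_trips, double latency_ms,
                        double total_bytes, double bytes_per_ms)
{
    return round_trips * latency_ms + total_bytes / bytes_per_ms;
}
```

Under these assumptions, fetching 100,000 bytes over a 20 ms link at 1,000 bytes/ms costs about 2,100 ms in 100 small calls but about 120 ms in a single call.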

6. Functions generating disk activity
A function that generates disk activity needs further investigation, because it demands resources and is therefore a potential bottleneck.
Analyze any such function to make sure the disk activity is necessary, and look for alternatives that cut it down where possible.

7. Functions with high CPU utilization
The amount of CPU available in a system is fixed, so CPU-intensive functions should be analyzed for optimization.

Tuesday 29 July 2014

How to detect a memory leak

Memory leak detection

Finding memory leaks is all about identifying objects that are created but never garbage collected. Memory leaks always get worse so, in theory, the longer the application runs, the bigger the leak will get, and the easier it will be to find.

You can identify a memory leak by running a load test on the application for a prolonged duration, such as 10-12 hours, and monitoring memory utilization. That doesn't really help when profiling, though, because you need to be able to identify a leak quickly.

Profiling tools help leak detection by allowing you to take memory snapshots. A snapshot usually involves forcing a garbage collection and then recording all of the objects that are left behind in memory. Objects that repeatedly survive garbage collection should be investigated further.

If objects of the same type continually survive garbage collection and keep building up in memory, you need to investigate the references that are keeping those objects in memory. Tracking object references back to source code allows you to find the cause of the leak in your own code, which means you can fix it.

Some profilers track memory allocation by function calls, which allows you to see the functions that are potentially leaking memory. This can also be a highly effective technique for finding a memory leak.
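The snapshot comparison described above can be sketched as a simple check: if the number of live instances of one type keeps rising across successive post-GC snapshots, that type is a leak suspect. The function name grows_every_snapshot and the three-snapshot minimum are assumptions made for illustration; the counts would come from whatever your profiler reports.

```c
#include <stddef.h>

/* Returns 1 when the live-instance count of a type rose in every
 * successive snapshot (each taken after a forced garbage collection),
 * 0 otherwise. Fewer than 3 snapshots is treated as inconclusive. */
int grows_every_snapshot(const long *live_counts, size_t snapshot_count)
{
    size_t i;

    if (snapshot_count < 3) {
        return 0;
    }
    for (i = 1; i < snapshot_count; i++) {
        if (live_counts[i] <= live_counts[i - 1]) {
            return 0;   /* count dropped or stayed flat: not steady growth */
        }
    }
    return 1;
}
```

A positive result only marks a type for investigation; the references keeping those objects alive still have to be traced back to source code.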

How to Pinpoint Memory related performance bottlenecks

The basic symptoms of memory related performance problems are -

1. Memory leak

Memory usage slowly increases over time
Performance degrades
Application will freeze/crash requiring a restart
After a restart the application runs fine again, and then the same behavior repeats

2. Excessive memory footprint

Application is slow to load
After load, other applications run slower than expected

3. Inefficient allocation

Application performance suddenly degrades and then recovers quickly
% Time in GC Statistic in PerfMon is greater than 20–30%

Application Performance Bottlenecks

The following areas can be examined when tuning the performance of an application:-

1. Wall-clock (elapsed) vs. CPU time

A function may take a long time to execute, but use comparatively little CPU time because it is actually waiting for a database / web service call to return or for a thread synchronization lock to free up. Identifying wait time can help you identify where your application may benefit from asynchronous processing.
At the same time, a CPU-intensive function is usually a good candidate for optimization, because the CPU is a finite resource and a potential bottleneck.
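A minimal sketch of the distinction: given the elapsed (wall-clock) time and the CPU time a profiler reports for a function, the difference approximates the time spent waiting. The names wait_time_ms and wait_share are hypothetical, and the clamp to zero is an assumption to cover multi-threaded functions whose summed CPU time exceeds the elapsed time.

```c
/* Approximate wait time: elapsed time not accounted for by CPU work. */
double wait_time_ms(double wall_ms, double cpu_ms)
{
    return (wall_ms > cpu_ms) ? wall_ms - cpu_ms : 0.0;
}

/* Share of the elapsed time spent waiting, from 0.0 to 1.0. */
double wait_share(double wall_ms, double cpu_ms)
{
    if (wall_ms <= 0.0) {
        return 0.0;
    }
    return wait_time_ms(wall_ms, cpu_ms) / wall_ms;
}
```

A function with 1,000 ms elapsed and 250 ms of CPU time waits for about 75% of its duration, which makes it a candidate for asynchronous processing rather than CPU optimization.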

2. Resource bottlenecks

Resources such as disk space, network bandwidth, server availability, graphics cards, and shared threads can all create bottlenecks in an application. Identifying functions causing high levels of resource activity and contention is a key goal in profiling. This kind of activity, when scaled, could quickly become a problem and reduce the scalability of the application.

3. Call count

Function call count is the easiest statistic to look at first, because a non-trivial function with a high call count often indicates an immediate problem. It's always worth validating the origins of the high call count.

4. Memory profiling

The way you write your code directly impacts how and when the objects you create are allocated and destroyed. Get it right, and your application will use memory efficiently as needed, with minimal performance impact. Get it wrong, however, and your application could use more memory than necessary, which will cause the memory manager to work harder than it needs to, and this will directly impact performance.
Even worse than that, your application could just keep allocating memory until no more is left, causing the application or the machine to crash. This is the memory leak, which every developer fears.
Checking that an application doesn't have memory leaks, and that it uses memory efficiently, together with fixing any issues found, will improve its overall stability and performance.

5. Garbage collection

The Java/.NET memory management model ensures that any allocated objects which are no longer in use by the application will be reclaimed automatically. This relieves developers of the responsibility of having to free memory explicitly, which is something that was often omitted in native C/C++ applications, leading to memory leaks.

Thursday 10 July 2014

LoadRunner Controller Error :- Server xxx has shut down the connection prematurely

The error "Server xxx has shut down the connection prematurely" is often seen in the Controller during performance test execution.

1. Is this a performance issue with the web/app server?

No, it is not.

2. Is this related to Loadrunner?

Yes, it is.

3. Can we resolve it by simply adding web_set_sockets_option("IGNORE_PREMATURE_SHUTDOWN", "1"); to the scripts?

No, we cannot. Adding web_set_sockets_option("IGNORE_PREMATURE_SHUTDOWN", "1"); only suppresses the error in the Controller; the underlying issue may still persist.

4. What is the root cause of this error?

The error "Server xxx has shut down the connection prematurely" occurs because the server closed a live connection that was created by your LoadRunner script. The server can do so for several reasons; the most common is that the client does not read data from the server's socket within the configured timeout.

If you have access to the server logs, scan them and you may be able to identify the root cause. Below is a log entry captured from an Oracle WebLogic Server.

<BEA-000449> <Closing the socket, as no data read from it on ##.###.#.###:##,### during the configured idle timeout of 5 seconds.>

It clearly states that the client (your load generator machine) did not read data from the socket within the configured time limit, so the server closed the connection. This resulted in the error on the Controller.
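When scanning large server logs for this message, a small helper can pull out the configured idle timeout. The sketch below is plain C rather than a LoadRunner script, and it assumes the log line has the same wording as the entry shown above; idle_timeout_seconds is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Returns the idle timeout in seconds from a BEA-000449 log line,
 * or -1 when the line is not an idle-timeout socket close message. */
int idle_timeout_seconds(const char *log_line)
{
    static const char marker[] = "idle timeout of ";
    const char *p;
    int seconds;

    if (log_line == NULL || strstr(log_line, "BEA-000449") == NULL) {
        return -1;
    }
    p = strstr(log_line, marker);
    if (p == NULL) {
        return -1;
    }
    /* The number of seconds follows the marker text directly. */
    if (sscanf(p + strlen(marker), "%d", &seconds) != 1) {
        return -1;
    }
    return seconds;
}
```

Comparing this timeout with how long the load generator takes to read responses shows whether the client is falling behind.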

5. How to resolve the "Server xxx has shut down the connection prematurely" error:-

As stated above, the error is caused by a slow client (LG) that was not able to read the data from the server's socket in time. This can happen because you have loaded the load generator machine beyond its capacity, or because the shared bandwidth is too low for the number of Vusers.

Try distributing the load to multiple load generator machines to resolve the issue.
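As a rough planning aid, the number of load generators can be estimated with a ceiling division. The per-generator Vuser capacity is an assumption you have to measure for your own scripts and hardware, and load_generators_needed is a hypothetical helper, not a LoadRunner function.

```c
/* Ceiling division: smallest number of load generators that keeps
 * every machine at or below its measured Vuser capacity.
 * Returns 0 when the capacity is unknown (given as 0). */
unsigned load_generators_needed(unsigned total_vusers,
                                unsigned vusers_per_generator)
{
    if (vusers_per_generator == 0) {
        return 0;
    }
    return (total_vusers + vusers_per_generator - 1) / vusers_per_generator;
}
```

For example, 1,000 Vusers with a measured capacity of 300 Vusers per machine would need 4 load generators.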

Important performance monitors/counters for .net application

The following performance counters can act as general guidelines for different performance problems related to the .net application under test.

• Processor\% Processor Time
• Process\Working Set
• Memory\% Committed Bytes in Use
• Memory\Available MBytes
• Memory\Pages/sec
• Memory\Page Faults/sec
• PhysicalDisk\% Idle Time
• Network Interface\Output Queue Length
• .NET CLR Memory\% Time in GC
• .NET CLR Memory\# Gen 0,1,2 Collections
• .NET CLR Memory\# of Pinned Objects
• .NET CLR Memory\Large Object Heap Size
• .NET CLR LocksAndThreads\Contention Rate/sec
• .NET CLR Exceptions\# of Exceps Thrown / sec
• .NET CLR Jit\% Time in Jit
• ASP.NET\Requests Queued
• ASP.NET\Requests Rejected

Thursday 3 July 2014

Loadrunner Correlation : Dynamic left boundary and right boundary in web_reg_save_param

You can use the following pattern-matching attributes in your LoadRunner script to deal with dynamic left or right boundaries:-

LB/DIG - This attribute interprets the # sign as a wildcard for a single digit, e.g. "Error5##" matches "Error500", "Error501" and so on up to "Error599".
LB/ALNUM - This attribute interprets the ^ sign as a wildcard for a single US-ASCII alphanumeric character.
LB/ALNUMIC - This attribute interprets the ^ sign as a wildcard for a single US-ASCII alphanumeric character and ignores case, e.g. "Er^^r" matches "Error", "ErRor", "ErrOr", "ErROr", "Er12r", "Err1r" etc.
LB/ALNUMLC - This attribute interprets the ^ sign as a wildcard for a single US-ASCII alphanumeric character in lower case only. In this case "Er^^r" will not match "ErROr", "ErRor", "ErR1r" or any other string with an uppercase letter in a wildcard position.
LB/ALNUMUC - This attribute interprets the ^ sign as a wildcard for a single US-ASCII alphanumeric character in upper case only.
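To make the wildcard rules concrete, the sketch below emulates them in plain C. It is an illustration of the matching rules as described above, not LoadRunner's implementation: boundary_matches and the boundary_mode enum are hypothetical names, digits are assumed to satisfy the lower-case and upper-case variants, and literal characters are assumed to be compared exactly.

```c
#include <ctype.h>
#include <string.h>

enum boundary_mode { MODE_DIG, MODE_ALNUMIC, MODE_ALNUMLC, MODE_ALNUMUC };

/* Does character c satisfy the wildcard of the given mode? */
static int wildcard_accepts(enum boundary_mode mode, unsigned char c)
{
    switch (mode) {
    case MODE_DIG:     return isdigit(c) != 0;
    case MODE_ALNUMIC: return isalnum(c) != 0;
    case MODE_ALNUMLC: return isdigit(c) || islower(c);
    case MODE_ALNUMUC: return isdigit(c) || isupper(c);
    }
    return 0;
}

/* Returns 1 when text matches pattern, where '#' (DIG) or '^' (ALNUM
 * variants) stands for exactly one character and everything else must
 * be identical. */
int boundary_matches(enum boundary_mode mode,
                     const char *pattern, const char *text)
{
    char wildcard = (mode == MODE_DIG) ? '#' : '^';
    size_t i;

    if (strlen(pattern) != strlen(text)) {
        return 0;
    }
    for (i = 0; pattern[i] != '\0'; i++) {
        if (pattern[i] == wildcard) {
            if (!wildcard_accepts(mode, (unsigned char)text[i])) {
                return 0;
            }
        } else if (pattern[i] != text[i]) {
            return 0;
        }
    }
    return 1;
}
```

With this emulation, "Er^^r" accepts "ErROr" in the ignore-case mode but rejects it in the lower-case mode, which mirrors the examples listed above.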

Loadrunner : Error -26601: Decompression function (wgzMemDecompressBuffer) failed, return code=-5 (Z_BUF_ERROR), inSize=0, inUse=0, outUse=0

The decompression function error in loadrunner occurs because of insufficient buffer size set in the runtime settings. The Network buffer size is set to 12288 bytes, by default.

When a response requires more than the specified buffer size, we encounter the "Decompression function (wgzMemDecompressBuffer) failed" error.

Solution:- Increase the network buffer size in Runtime settings > Preferences > Options > General > Network Buffer Size

Tuesday 1 July 2014

"How to identify the dynamic values in loadrunner script and do correlation in Loadrunner for the dynamic values"

Correlation is one of the most important concepts in LoadRunner, especially when you are working with web protocols such as Web (HTTP/HTML), SAP Web or Oracle Web Applications.

Correlation in LoadRunner is used to deal with dynamic values in the script that change with each script execution. The built-in function web_reg_save_param can be used to implement correlation.

Below are the step by step instructions to do the correlation:-

Note:- It is advised to have 2 recordings of same business flow to compare and identify the dynamic values.

Step 1 : Identify the dynamic value :-

Replay the recorded script and see if the script fails somewhere

Note:- If script does not fail and the intended job is verified using the check points - there may not be any dynamic values or there is no correlation required.

Go to the failed request in script and compare it with the same request in second recording
Identify the dynamic values present in the request

Step 2 : Find the left boundary, right boundary and correct place in the script to put the web_reg_save_param function

Find the first occurrence of the dynamic value in the generation log (in the output window of LoadRunner). It must be present in a response header or response body for correlation to be possible. (Note:- If the dynamic value first appears in a request header / body, it was probably generated on the client side and cannot be correlated.)
Copy the left and right boundaries to capture the dynamic value (e.g. suppose the dynamic value DATA is present in the response in the format abcDATAxyz. Here abc is the LB and xyz is the RB to capture DATA). Always try to choose a unique combination of LB and RB, so that you do not have to work out the ordinal (Ord) of the match.
Note down the id no. from ****** Response Header/Body For Transaction With Id # ****** (search upward for the 'Transaction With Id' text)
Find the event generated for the previous response: ****** Add Event For Transaction With Id # ****** (search downward for the 'Add Event' text)
Note down the snapshot no. for reference (snapshot=##.inf)
Search for the snapshot no. ##.inf in the script

Step 3 : Put the web_reg_save_param function in script

Put the web_reg_save_param function in the script before the request identified in the previous step (identify it through the inf file number)
Use the LB and RB captured in the previous step - web_reg_save_param("Correlation_parameter", "LB=abc", "RB=xyz", LAST);

The dynamic value should be captured in Correlation_parameter. You can see it in the replay log by enabling the extended log in run-time settings > log: enable logging (checked), always send messages (checked), extended log (checked), parameter substitution (checked).
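To show what a boundary-based capture does conceptually, the sketch below extracts the text between a left and a right boundary in plain C, including the choice of the Nth match. It only emulates the idea and is not how web_reg_save_param is implemented; extract_between and its parameters are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Copies the text between the ord-th occurrence of lb and the next rb
 * into out. Returns 1 on success, 0 when there is no such match or the
 * output buffer is too small. ord starts at 1. */
int extract_between(const char *text, const char *lb, const char *rb,
                    int ord, char *out, size_t out_size)
{
    const char *start = text;
    const char *end;
    size_t len;
    int seen = 0;

    if (ord < 1 || out_size == 0 || lb[0] == '\0' || rb[0] == '\0') {
        return 0;
    }
    while ((start = strstr(start, lb)) != NULL) {
        start += strlen(lb);            /* value begins after the LB */
        end = strstr(start, rb);        /* value ends before the RB */
        if (end == NULL) {
            return 0;
        }
        if (++seen == ord) {
            len = (size_t)(end - start);
            if (len >= out_size) {
                return 0;
            }
            memcpy(out, start, len);
            out[len] = '\0';
            return 1;
        }
    }
    return 0;
}
```

With the response text abcDATAxyz, LB abc and RB xyz, the first match yields DATA. When the boundaries are not unique, the ordinal decides which occurrence is captured, which is why unique boundaries are easier to maintain.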

Important Notes about web_reg_save_param:-

If the Ord (ordinal) attribute is not specified in the function, it defaults to 1.
If the Search attribute is not specified in the function, the default scope is Search=All (headers and body). Specify Search=Headers if the dynamic value has to be captured from the response header, or Search=Body to restrict the search to the response body.
If the length of the captured string in Correlation_parameter is greater than 256 bytes, call web_set_max_html_param_len("9999"); at the beginning of the action.
If you cannot find an Add Event (corresponding request) for the response that contains the dynamic value, try the "RelFrameId=ALL" argument in web_reg_save_param, and put web_reg_save_param before whichever request comes first after that response.

How to deal with dynamic left and right boundaries in loadrunner correlation will be covered in next post...

Monday 30 June 2014

How to capture a substring or data from a string based on delimiters in LoadRunner

LoadRunner scripts can call the standard C function strtok, which captures data from a string by splitting it on the delimiters you specify -

The example below shows how all the words can be captured from the string "http://localhost/app/myapp:8080" by using the 2 delimiters / and :

extern char *strtok(char *string, const char *delimiters); // explicit declaration

char String_org[] = "http://localhost/app/myapp:8080"; // original string (strtok modifies it in place)
char delimiter[] = "/:"; // split on '/' and ':'
char *token;

token = (char *)strtok(String_org, delimiter); // capture the 1st substring based on the delimiters
if (!token) {
    lr_output_message("No tokens found in string!");
    return -1;
}

while (token != NULL) { // while valid tokens are returned
    lr_output_message("%s", token);
    token = (char *)strtok(NULL, delimiter); // get the next token
}

Output:

Starting iteration 1.

Starting action Action.

Action.c(15): http

Action.c(15): localhost

Action.c(15): app

Action.c(15): myapp

Action.c(15): 8080

Ending action Action.
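If you only need one of the tokens, for example the host name, the same strtok loop can be wrapped in a helper that works on a copy, because strtok overwrites the delimiters in the string it scans. This is a plain C sketch that runs outside LoadRunner; token_at and the 256-byte working buffer are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Copies the token at the given zero-based index into out.
 * Returns 1 on success, 0 when the index is out of range or a
 * buffer is too small. The input string is left unchanged. */
int token_at(const char *input, const char *delimiters, int index,
             char *out, size_t out_size)
{
    char copy[256];
    char *token;
    int i = 0;

    if (index < 0 || out_size == 0 || strlen(input) >= sizeof copy) {
        return 0;
    }
    strcpy(copy, input);    /* strtok writes '\0' over delimiters */

    for (token = strtok(copy, delimiters); token != NULL;
         token = strtok(NULL, delimiters), i++) {
        if (i == index) {
            if (strlen(token) >= out_size) {
                return 0;
            }
            strcpy(out, token);
            return 1;
        }
    }
    return 0;
}
```

For the string used above with the delimiters / and :, index 1 returns localhost and index 4 returns 8080.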