Sunday, June 21, 2009

LR-Basics

What is LoadRunner?

Software applications are becoming more advanced and complex; they are now capable of serving hundreds of thousands of users. With this complexity and volume comes the problem of managing them and keeping them working at any given point in time.

Also, almost every organization is moving into the era of Web 2.0 (or 3.0). This intricate network brings a lot of challenges for any company. With servers, routers, cables and applications all interlinked in a mesh-like structure, every single point becomes a candidate for a performance bottleneck. The best way to find and overcome performance problems is to use testing tools that are capable of simulating end-user behavior.

LoadRunner, the industry leader from HP, is our tool of study.

LoadRunner consists of:

  1. Virtual User Generator (VuGen): We can emulate real-world user behavior using VuGen, hence the name virtual user [Dictionary meaning: Existing or resulting in essence or effect though not in actual fact, form, or name]. This is the place where we record and write automated scripts.
  2. Controller: Here we run the scripts generated above. It controls the various load generators* and the scenarios** associated with them.
  3. Analysis: This gives the detailed results and presents them beautifully using reports, charts and graphs.

This was just a brief overview. We will talk in detail about the three parts of LR in the coming posts.

*Load generators: Machines used to generate load on the server.

**Scenarios: A scenario describes aspects like which scripts will run, the number of virtual users, and the association of load generators with scripts.
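To make the scenario footnote concrete, here is a minimal sketch in plain C of the information a scenario ties together. The struct, the function names and the sample group names are all hypothetical and made up for illustration; this is not how Controller stores scenarios.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one scenario group: which script runs,
 * on which load generator, and with how many virtual users. */
struct scenario_group {
    const char *script;
    const char *load_generator;
    int vusers;
};

/* Total number of virtual users across all groups in the scenario. */
int scenario_total_vusers(const struct scenario_group *groups, size_t count)
{
    int total = 0;
    size_t i;

    assert(groups != NULL || count == 0);
    for (i = 0; i < count; i++)
        total += groups[i].vusers;
    return total;
}

/* Number of virtual users assigned to one load generator machine. */
int scenario_vusers_on(const struct scenario_group *groups, size_t count,
                       const char *load_generator)
{
    int total = 0;
    size_t i;

    assert(load_generator != NULL);
    for (i = 0; i < count; i++)
        if (strcmp(groups[i].load_generator, load_generator) == 0)
            total += groups[i].vusers;
    return total;
}
```

With three groups of 50, 30 and 20 virtual users, where the first two share one load generator, the scenario holds 100 virtual users in total and 80 of them run on the shared machine.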

Points to note with VuGen and Controller


 

  • When a script is opened in Controller, its run-time settings also get copied from VuGen to Controller.
  • Any later changes made to the script or its run-time settings in VuGen are not reflected in Controller unless you refresh them.
  • To refresh in Controller, go to Design > highlight the scenario group that uses the script in question > click the Details button > click the Refresh button in the Group Information pop-up window. So the next time Controller asks you to load new script iteration settings, do the refresh.
  • While doing Save As:
    • The default directory in VuGen can be changed by opening the vugen.ini file located under C:\Program Files\HP\LoadRunner\config and setting LastScriptPath to the required path (as shown on the right).
    • The default directory in Controller can be changed by opening the wlrun.ini file located under C:\Program Files\HP\LoadRunner\config and setting M_ROOT to the required path.

Note that think time is ignored when a script is played back in VuGen, while in Controller it is played back as recorded.

LoadRunner — Correlation

If you simply record and playback a script in VuGen, you might encounter errors in your playback. Often, those errors are related to the session values which are sent by the server to the client to identify that particular session.

Why the errors? Well, session values change with every playback of the script.

To overcome this, we need a way to capture these dynamically generated session values and pass them on to any later part of the script that requires them. This method of identifying and setting dynamically generated values is known as correlation.

If you're new to load testing, don't confuse this term with parameterization, which you might have used in tools like QTP to pass varying values. A parameter is not a dynamic value captured from a server response; it is something for which the user has predefined data values available.

LoadRunner uses three functions to correlate scripts:

  1. web_reg_save_param
  2. web_create_html_param
  3. web_create_html_param_ex

All about web_url and web_link in LoadRunner

Points to note with web_url and web_link:

  • web_url is not a context-sensitive function, while web_link is. Context-sensitive functions describe your actions in terms of GUI objects (such as windows, lists, and buttons). See HTML vs URL recording mode.
  • If a web_url statement occurs before a context-sensitive statement like web_link, it must actually hit the server; otherwise your script will error out.
  • While recording, if you switch between actions, the first statement recorded in a given action will never be a context-sensitive statement.
  • The first argument of web_link, web_url, web_image or, in general, any web_* function does not affect script replay. For example, if your web_link statement was recorded as

    web_link("Hi There",
        "Text=Hello, ABC",
        LAST);

    Now, suppose you parameterize/correlate the first argument:

    web_link("{Welcome to LearnLoadRunner}",
        "Text=Hello, ABC",
        LAST);

    On executing the above script, you won't find the value of the parameter in the execution log; you will find the literal text {Welcome to LearnLoadRunner} itself. However, to show the correlated/parameterized data you can use lr_eval_string to evaluate the parameter.
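To see what "evaluating a parameter" means, here is a minimal standalone sketch in plain C that replaces {name} placeholders with stored values. It only models the idea behind lr_eval_string; it is not LoadRunner's implementation, and struct param, param_lookup and eval_params are hypothetical names used for illustration. Leaving unknown placeholders untouched is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair standing in for a script parameter. */
struct param {
    const char *name;
    const char *value;
};

/* Find the parameter whose name is exactly the len characters at name. */
static const char *param_lookup(const struct param *params, size_t count,
                                const char *name, size_t len)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (strlen(params[i].name) == len &&
            strncmp(params[i].name, name, len) == 0)
            return params[i].value;
    return NULL;
}

/* Copy text into out, replacing each known {name} with its value.
 * Unknown {name} placeholders are copied unchanged (an assumption of
 * this sketch). Returns 1 on success, 0 if out is too small. */
int eval_params(const char *text, const struct param *params, size_t count,
                char *out, size_t out_size)
{
    size_t used = 0;

    assert(text != NULL && out != NULL);
    while (*text != '\0') {
        const char *close = (*text == '{') ? strchr(text, '}') : NULL;
        const char *value = NULL;

        if (close != NULL)
            value = param_lookup(params, count, text + 1,
                                 (size_t)(close - text - 1));
        if (value != NULL) {
            size_t len = strlen(value);

            if (used + len >= out_size)
                return 0;
            memcpy(out + used, value, len);
            used += len;
            text = close + 1;       /* skip past the closing brace */
        } else {
            if (used + 1 >= out_size)
                return 0;
            out[used++] = *text++;  /* ordinary character, copy as-is */
        }
    }
    if (used >= out_size)
        return 0;
    out[used] = '\0';
    return 1;
}
```

The step name passed as the first argument of web_link is never run through this kind of substitution, which is why the braces show up verbatim in the log.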

HTML vs URL recording mode.

There are three recording modes/levels in LoadRunner: GUI-based, HTML-based and URL-based. For the uninitiated, the recording level determines how much and what kind of information is recorded during the recording process. As the title says, this post focuses on the HTML-based and URL-based recording levels only; we will touch upon the GUI-based mode in a later post.

  1. HTML-based mode records a script statement for every user action performed during recording (hmmm…sounds like QTP), while URL-based mode records each and every browser request to the server and every resource received from the server. Confused? OK: HTML-based mode records as you perform clicks and doesn't show you what is happening behind the scenes, while URL-based mode records each and every step and emulates JavaScript code.
  2. From point 1 above you can guess that HTML mode needs less correlation, while URL mode has much more complex correlation requirements.
  3. An HTML-mode script is smaller and more intuitive to read, as the statements are recorded inside functions corresponding to the user action performed. In URL-based mode, all statements get recorded as web_url().
  4. HTML mode is recommended for browser applications, while URL mode is recommended for non-browser applications.
  5. Lastly, don't get the impression that I am advocating HTML mode :). URL mode can be of real help when you want control over which resources are or are not downloaded, since you have each and every statement in front of you (point 1).

Difference between concurrent and simultaneous vuser


 

All the vusers in a particular scenario are called concurrent vusers. They may or may not perform the same tasks. Simultaneous vusers, on the other hand, have more to do with rendezvous points. When we set a rendezvous point, we instruct the system to wait until a certain number of vusers arrive so that they can all perform a particular task simultaneously. These vusers performing the same task at the same time are called simultaneous vusers.

For example, in a Yahoo Mail application: suppose a scenario consists of 100 vusers with 3 tasks: 1) Login, 2) Check the number of unread mails, 3) Logout. The vusers at 1) + 2) + 3) are called concurrent vusers, as they are part of the same scenario performing some task. But if we have set a rendezvous point so that, say, 25 vusers perform task 2) at the same time, these 25 vusers are termed simultaneous vusers.
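The waiting behaviour of a rendezvous point can be sketched as a simple counter. The following plain C is only a single-threaded model of the idea, with hypothetical names (struct rendezvous, rendezvous_arrive); in a real script you would call lr_rendezvous and let Controller do the coordination.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a rendezvous point. */
struct rendezvous {
    int required;   /* vusers that must arrive before anyone proceeds */
    int waiting;    /* vusers currently held at the point */
};

/* Register one arriving vuser. Returns 0 while the vuser has to wait,
 * or the number of vusers released together once enough have arrived. */
int rendezvous_arrive(struct rendezvous *point)
{
    int released;

    assert(point != NULL && point->required > 0);
    point->waiting++;
    if (point->waiting < point->required)
        return 0;

    released = point->waiting;
    point->waiting = 0;     /* everyone is released at the same time */
    return released;
}
```

With required set to 25, the first 24 arrivals get 0 back and the 25th arrival releases all 25 at once; those 25 are the simultaneous vusers, while all 100 vusers in the scenario remain concurrent vusers.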

What are memory leaks and page faults, and how do they affect LoadRunner performance?


 

What is a memory leak?

A memory leak is a particular type of unintentional memory consumption by a computer program where the program fails to release memory when no longer needed. This condition is normally the result of a bug in a program that prevents it from freeing up memory that it no longer needs. This term has the potential to be confusing, since memory is not physically lost from the computer. Rather, memory is allocated to a program, and that program subsequently loses the ability to access it due to program logic flaws.
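Here is a small C sketch of such a bug. The request handlers and the block counter are hypothetical and exist only to make the leak visible; real leaks are usually found with profilers or by watching memory usage over time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Number of blocks allocated through tracked_alloc and not yet freed. */
static long live_blocks = 0;

static void *tracked_alloc(size_t size)
{
    void *block = malloc(size);

    if (block != NULL)
        live_blocks++;
    return block;
}

static void tracked_free(void *block)
{
    if (block != NULL) {
        free(block);
        live_blocks--;
    }
}

long live_block_count(void)
{
    return live_blocks;
}

/* Leaky version: the copy is never freed, and once the function returns
 * the only pointer to it is gone, so the program can no longer release it. */
size_t handle_request_leaky(const char *payload)
{
    char *copy;

    assert(payload != NULL);
    copy = tracked_alloc(strlen(payload) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, payload);
    return strlen(copy);
}

/* Fixed version: same work, but the copy is released before returning. */
size_t handle_request_fixed(const char *payload)
{
    char *copy;
    size_t length;

    assert(payload != NULL);
    copy = tracked_alloc(strlen(payload) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, payload);
    length = strlen(copy);
    tracked_free(copy);
    return length;
}
```

Calling handle_request_fixed leaves live_block_count() unchanged, while every call to handle_request_leaky raises it by one.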

What is a page fault?

A page fault is an interrupt that occurs when a program requests data that is not currently in real memory. The interrupt triggers the operating system to fetch the data from virtual memory and load it into RAM.

An invalid page fault or page fault error occurs when the operating system cannot find the data in virtual memory. This usually happens when the virtual memory area, or the table that maps virtual addresses to real addresses, becomes corrupt.

Now the most important question comes up: how do they affect LoadRunner functioning?

As you might guess, a memory leak, if left unattended and not corrected, could prove to be fatal. Memory leaks can be found by running tests for a long duration (say about an hour) and continuously checking memory usage.

For a standalone Windows application, issues caused by memory leaks essentially depend on two variables: 1) frequency of usage and 2) size of the memory leak. If either one or both are very high, the computer could reach a point where no memory is available for other applications, causing it to crash. For a network-based application you also have to consider network traffic: if each network transaction causes a memory leak, then a high volume of network transactions could also prove dangerous.
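The interplay of leak size and frequency is simple arithmetic: time to exhaustion is roughly the free memory divided by (leak per transaction × transactions per second). A minimal C sketch, assuming a constant leak and a steady transaction rate (the function name and the figures in the note below are made up for illustration):

```c
#include <assert.h>

/* Rough estimate of how long memory lasts under a steady leak.
 * Returns the number of seconds, or -1.0 if nothing is leaking. */
double seconds_until_exhausted(double free_bytes,
                               double leak_bytes_per_txn,
                               double txn_per_second)
{
    double leak_rate;

    assert(free_bytes >= 0.0);
    leak_rate = leak_bytes_per_txn * txn_per_second;   /* bytes per second */
    if (leak_rate <= 0.0)
        return -1.0;
    return free_bytes / leak_rate;
}
```

For example, with 1 GB (1073741824 bytes) free, a leak of 1 KB (1024 bytes) per transaction and 100 transactions per second, the estimate is 10485.76 seconds, a little under three hours. Doubling either the leak size or the transaction rate halves that time.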

What is the difference between a process and a thread?


 

A process is defined as the virtual address space and the control information necessary for the execution of a program, while threads are a way for a program to split itself into two or more simultaneously running tasks. In general, a thread is contained inside a process, and different threads in the same process share some resources while different processes do not.

Source

In terms of LoadRunner, when we run Vusers as processes, LoadRunner creates one process called mmdrv.exe per Vuser. So if we have 10 Vusers, we will have 10 mmdrv.exe processes on our machine.

When we run Vusers as threads, LoadRunner creates one thread per Vuser. So if we have 10 Vusers, we will have 1 process with 10 threads running inside it, if the limit is 10 threads per process.

Running Vusers as threads is more memory efficient than running them as processes, for the obvious reason that fewer memory resources are used when we run them as threads. I read somewhere that running as a process has the advantage that the system becomes more stable. Now, how is that stability achieved?
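The process counts above follow from a one-line calculation. A minimal sketch in C, where the function name is hypothetical and threads_per_process stands for whatever per-process thread limit applies on your setup:

```c
#include <assert.h>

/* Number of driver processes needed for a given number of Vusers.
 * run_as_thread == 0: one process per Vuser.
 * run_as_thread != 0: Vusers are packed into processes, up to
 * threads_per_process threads in each. */
int driver_processes_needed(int vusers, int run_as_thread,
                            int threads_per_process)
{
    assert(vusers >= 0);
    if (!run_as_thread)
        return vusers;

    assert(threads_per_process > 0);
    /* Integer ceiling of vusers / threads_per_process. */
    return (vusers + threads_per_process - 1) / threads_per_process;
}
```

So 10 Vusers need 10 processes in process mode but only 1 in thread mode with a limit of 10, and 25 Vusers in thread mode with the same limit need 3 processes.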


 

RAM, Memory Usage, CPU Usage, Paging in terms of LoadRunner


 

You are going to encounter these terms again and again on your journey to become a LoadRunner expert. We will clarify their meaning first, and then see how they relate to LoadRunner.

Hard Disk vs RAM:

  1. The hard disk is used for long-term storage of work, while RAM is used to store your current work.
  2. The hard disk holds the original copy of a program permanently, while a temporary copy is put into RAM when you want to use the program, and that's the copy you use.
  3. When working on a file, the original file is left untouched on the hard disk until you do a "save", while the file you are modifying, plus all the changes you make, are kept in RAM. The "save" copies the new version of the file from RAM onto the hard disk (and usually replaces the original file).

Virtual Memory and Paging:

Virtual memory is an essential part of all operating systems. As we saw above, RAM holds all the programs currently running on your desktop. If you open a program when RAM is full, your OS will try to locate programs in RAM that are not currently in use. It will then transfer those programs to an area of the hard disk, thereby creating space in RAM for your new program to run. So effectively, though there was no space in RAM, your OS created memory space with the help of your hard disk. This memory is called virtual memory. The area of the hard disk where the RAM image is copied is known as the page file, and the process as paging.

You might ask why we can't eliminate either the hard disk or RAM, given the above scenario… here is a beautiful explanation from the source cited below.

The read/write speed of a hard drive is much slower than RAM, and the technology of a hard drive is not geared toward accessing small pieces of data at a time. If your system has to rely too heavily on virtual memory, you will notice a significant performance drop. The key is to have enough RAM to handle everything you tend to work on simultaneously; then, the only time you "feel" the slowness of virtual memory is when there's a slight pause when you're changing tasks. When that's the case, virtual memory is perfect.

When it is not the case, the operating system has to constantly swap information back and forth between RAM and the hard disk. This is called thrashing, and it can make your computer feel incredibly slow.

Full Explanation here
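To get a feel for thrashing, here is a toy simulation in C that counts page faults for a sequence of page accesses. It uses a simple first-in-first-out replacement policy, which is an assumption made for this sketch and not a description of how Windows actually manages its page file; count_page_faults and MAX_FRAMES are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 16

/* Count page faults for a sequence of page references, given a number
 * of RAM frames and FIFO replacement. Returns -1 for an unsupported
 * frame count. */
int count_page_faults(const int *refs, size_t count, int frames)
{
    int resident[MAX_FRAMES];
    int used = 0;           /* frames currently holding a page */
    int next_victim = 0;    /* index of the oldest resident page */
    int faults = 0;
    size_t i;
    int j;

    assert(refs != NULL || count == 0);
    if (frames < 1 || frames > MAX_FRAMES)
        return -1;

    for (i = 0; i < count; i++) {
        int hit = 0;

        for (j = 0; j < used; j++)
            if (resident[j] == refs[i])
                hit = 1;
        if (hit)
            continue;

        faults++;                            /* page must come from disk */
        if (used < frames) {
            resident[used++] = refs[i];      /* a free frame is available */
        } else {
            resident[next_victim] = refs[i]; /* evict the oldest page */
            next_victim = (next_victim + 1) % frames;
        }
    }
    return faults;
}
```

If a program cycles through pages 1, 2, 3, 4 three times, 4 frames give only 4 faults (the initial loads), while 3 frames give 12 faults, one for every single access. That second case is thrashing in miniature.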

CPU Usage:

It represents the percentage of time that a process has used the CPU since the last update. To find the current CPU usage:

Go to "Windows Task Manager" [Ctrl-Shift-Esc] > Performance; the top-left graph shows CPU usage, as shown below.



In terms of LoadRunner, you should ensure that CPU usage always stays below 80-85% on your load generator machines for efficient functioning.

Memory usage:

It is the current working set of processes in kilobytes. In the above figure, Commit Charge (K) represents memory usage. In terms of LoadRunner, you should ensure that the commit charge always stays below Physical Memory (RAM) on your load generator machines so that minimal paging is required.
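The two rules of thumb (CPU below roughly 80% and commit charge below physical RAM) can be written down as a tiny check. This is only a sketch: the function name is hypothetical, the 80% threshold is the lower end of the 80-85% range mentioned above, and you would have to feed it numbers read from Task Manager or your own monitoring.

```c
#include <assert.h>

/* Returns 1 if a load generator is within both limits, 0 otherwise. */
int load_generator_healthy(double cpu_percent,
                           unsigned long commit_charge_kb,
                           unsigned long physical_ram_kb)
{
    assert(cpu_percent >= 0.0 && cpu_percent <= 100.0);
    return cpu_percent < 80.0 && commit_charge_kb < physical_ram_kb;
}
```

A machine at 70% CPU with a commit charge of 1500000 K and 2000000 K of RAM passes; the same machine at 90% CPU, or with a commit charge above its RAM, does not.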

web_reg_save_param function explained


 

As explained in one of my previous posts, web_reg_save_param is THE most important function when you are working with LoadRunner. We will start with the syntax and then touch upon some examples to get a clear idea.

int web_reg_save_param (const char *mpszParamName, <List of Attributes>,LAST);

Find below the available attributes [<List Of Attributes>]. Note that the attribute value strings (e.g. Search=all) are not case sensitive.

NotFound: The handling method when a boundary is not found and an empty string is generated. "ERROR," the default, indicates that VuGen should issue an error when a boundary is not found. When set to "EMPTY," no error message is issued and script execution continues. Note that if Continue on Error is enabled for the script, then even when NOTFOUND is set to "ERROR," the script continues when the boundary is not found, but it writes an error message to the Extended log file.

LB: The left boundary of the parameter or the dynamic data. This parameter must be a non-empty, null-terminated character string. Boundary parameters are case sensitive; to ignore the case, add "/IC" after the boundary. Specify "/BIN" after the boundary to specify binary data.

RB: The right boundary of the parameter or the dynamic data. This parameter must be a non-empty, null-terminated character string. Boundary parameters are case sensitive; to ignore the case, add "/IC" after the boundary. Specify "/BIN" after the boundary to specify binary data.

RelFrameID: The hierarchy level of the HTML page relative to the requested URL.

Search: The scope of the search, i.e. where to search for the delimited data. The possible values are Headers (search only the headers), Body (search only Body data, not headers), or ALL (search Body and headers). The default value is ALL.

ORD: This optional parameter indicates the ordinal or occurrence number of the match. The default ordinal is 1. If you specify "All," it saves the parameter values in an array.

SaveOffset: The offset of a sub-string of the found value, to save to the parameter. The default is 0. The offset value must be non-negative.

SaveLen: The length of a sub-string of the found value, from the specified offset, to save to the parameter. The default is -1, indicating until the end of the string.

Convert: The conversion method to apply to the data:

  HTML_TO_URL: convert HTML-encoded data to a URL-encoded data format

  HTML_TO_TEXT: convert HTML-encoded data to plain text format
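SaveOffset and SaveLen are easiest to understand as a plain substring operation on the value that was found. The following standalone C sketch models that behaviour; save_substring is a hypothetical helper, not a LoadRunner function, and treating an offset beyond the end of the value as a failure is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the part of found that starts at save_offset and is save_len
 * characters long into out. A save_len of -1 means "until the end of
 * the string". Returns 1 on success, 0 on a bad offset or if out is
 * too small. */
int save_substring(const char *found, int save_offset, int save_len,
                   char *out, size_t out_size)
{
    size_t total, available, count;

    assert(found != NULL && out != NULL);
    total = strlen(found);
    if (save_offset < 0 || (size_t)save_offset > total)
        return 0;

    available = total - (size_t)save_offset;
    if (save_len < 0 || (size_t)save_len > available)
        count = available;              /* take everything that is left */
    else
        count = (size_t)save_len;

    if (count >= out_size)
        return 0;
    memcpy(out, found + save_offset, count);
    out[count] = '\0';
    return 1;
}
```

For a found value of 002-8481703-4784428, an offset of 4 with a length of 7 keeps 8481703, and an offset of 12 with the default length of -1 keeps 4784428.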

Examples:

The examples below are taken from the LoadRunner tutorial to give clarity on topic. We will see more examples in the coming posts.

Sample Correlation for Web Vusers
Suppose the script contains a dynamic session ID:

web_url("FirstTimeVisitors","URL=/exec/obidos/subst/help/first-time-visitors.html/002-8481703-4784428>Buy books for a penny ", "TargetFrame=","RecContentType=text/html","SupportFrames=0",LAST);

The dynamic id here is 002-8481703-4784428

You insert a web_reg_save_param statement before the above statement:

web_reg_save_param("user_access_number", "NOTFOUND=ERROR", "LB=first-time-visitors.html/", "RB=>Buy books for a penny", "ORD=6", LAST);

ORD=6 saves the sixth occurrence of the value in user_access_number. I think everything else is self-explanatory.

After implementing correlated statements, the modified script looks like this, where user_access_number is the name of the parameter representing the dynamic data.

web_url("FirstTimeVisitors","URL=/exec/obidos/subst/help/first-time-visitors.html/{user_access_number}>Buy books for a penny ",
"TargetFrame=","RecContentType=text/html","SupportFrames=0",LAST);

Note: Each correlation function retrieves dynamic data once, for the subsequent HTTP request. If another HTTP request at a later point in the script generates new dynamic data, you must insert another correlation function.

Also, as I wrote in my last post, don't confuse correlation with parameterization, which you might have used in tools like QTP to pass varying values. A parameter is not a dynamic value captured from a server response; it is something for which the user has predefined data values available.

Tips to identify the dynamic string boundaries:

  • Always analyze the location of the dynamic data within the HTML code itself, and not in the recorded script.
  • Identify the string that is immediately to the left of the dynamic data. This string defines the left boundary of the dynamic data.
  • Identify the string that is immediately to the right of the dynamic data. This string defines the right boundary of the dynamic data.
  • web_reg_save_param looks for the characters between (but not including) the specified boundaries and saves the information beginning one byte after the left boundary and ending one byte before the right boundary. web_reg_save_param does not support embedded boundary characters.

For example, if the input buffer is {a{b{c} and "{" is specified as a left boundary, and "}" as a right boundary, the first instance is c and there are no further instances—it found the right and left boundaries but it does not allow embedded boundaries, so "c" is the only valid match. By default, the maximum length of any boundary string is 256 characters.
Include a web_set_max_html_param_len function in your script to increase the maximum permitted length. For example, the following function increases the maximum length to 1024 characters: web_set_max_html_param_len("1024");
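The boundary search described in the tips can be modelled in a few lines of standalone C. This is only a sketch of the idea, written to agree with the {a{b{c} example above; it is not LoadRunner's implementation, and find_between is a hypothetical helper. The ord argument plays the role of ORD and is 1-based.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the ord-th value found between lb and rb in buf into out.
 * A value may not contain the left boundary itself (no embedded
 * boundaries). Returns 1 on success, 0 if there is no such match or
 * out is too small. */
int find_between(const char *buf, const char *lb, const char *rb,
                 int ord, char *out, size_t out_size)
{
    size_t lb_len, rb_len;
    const char *start;
    int seen = 0;

    assert(buf != NULL && lb != NULL && rb != NULL && out != NULL);
    lb_len = strlen(lb);
    rb_len = strlen(rb);
    if (lb_len == 0 || rb_len == 0 || ord < 1)
        return 0;

    start = strstr(buf, lb);
    while (start != NULL) {
        const char *value = start + lb_len;
        const char *end = strstr(value, rb);
        const char *inner;

        if (end == NULL)
            return 0;                   /* no right boundary left */

        /* Another left boundary before the right one: restart there. */
        inner = strstr(value, lb);
        if (inner != NULL && inner < end) {
            start = inner;
            continue;
        }

        if (++seen == ord) {
            size_t len = (size_t)(end - value);

            if (len >= out_size)
                return 0;
            memcpy(out, value, len);
            out[len] = '\0';
            return 1;
        }
        start = strstr(end + rb_len, lb);   /* look for the next match */
    }
    return 0;
}
```

Run against the buffer {a{b{c} with "{" and "}" as boundaries, the first match is c and there is no second match. Run against a response containing first-time-visitors.html/002-8481703-4784428>Buy books for a penny, with the boundaries from the sample above, it captures 002-8481703-4784428.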

Advantages of LoadRunner

Any performance testing tool (or for that matter any other automation tool) should be used on a case-to-case basis, depending upon the requirements, client budget etc. Since the topic of our blog is limited to LoadRunner, I would like to present some advantages and disadvantages of using LoadRunner.

Advantages:

  1. No need to install it on the server under test. It uses native monitors, e.g. perfmon for Windows or the rstatd daemon for Unix.
  2. Uses ANSI C as the default programming language¹ and supports other languages like Java and VB.
  3. Excellent monitoring and analysis interface where you can see reports in easy-to-understand colored charts and graphs.
  4. Supports most of the protocols².
  5. Makes correlation³ much easier. We will dig into correlation through a series of posts later.
  6. Nice GUI-generated script through one-click recording; of course, you would need to modify the script according to your needs.
  7. Excellent tutorials, exhaustive documentation and active tool support from HP.

Disadvantages:

The only disadvantage I can think of is the prohibitive cost associated with the tool, but that can be compensated for in the long run when you start getting a good ROI from the tool.

¹ The programming/scripting language is used to represent the captured protocol data and manipulate the data for play-back.

² A protocol is simply a language that your client uses to communicate with the system.

³ Correlation is a way to substitute values in dynamic data to enable successful playback.
