Sunday, July 12, 2009

LoadRunner Good Material

 

Soak Tests (Also Known as Endurance Testing)

Soak testing is running a system at high levels of load for prolonged periods of time.  A soak test would normally execute several times more transactions in an entire day (or night) than would be expected in a busy day, to identify any performance problems that appear after a large number of transactions have been executed.

Also, it is possible that a system may ‘stop’ working after a certain number of transactions have been processed due to memory leaks or other defects.  Soak tests provide an opportunity to identify such defects, whereas load tests and stress tests may not find such problems due to their relatively short duration.

Diagram: Typical usage profile that needs to be considered before commencing soak testing.

The above diagram shows activity for a certain type of site.  Each login results in an average session of 12 minutes' duration, with an average of eight business transactions per session.

A soak test would run for as long as possible, given the limitations of the testing situation.  For example, weekends are often an opportune time for a soak test.  Soak testing for this application would be at a level of 550 logins per hour, using typical activity for each login. 

The average number of logins in this example is 4,384 per day, but it would take only 8 hours at 550 logins per hour to run an entire day's activity through the system.

By starting a 60-hour soak test on Friday evening at 6 pm (to finish at 6 am on Monday morning), 33,000 logins would be put through the system, representing 7½ days of activity.  Only with such a test will it be possible to observe any degradation of performance under controlled conditions.
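The arithmetic behind these figures can be checked with a short sketch. The numbers are the ones quoted in the example above; the function and variable names are illustrative only.

```python
# Figures taken from the example in the text.
AVG_LOGINS_PER_DAY = 4384      # average logins on a normal day
SOAK_LOGINS_PER_HOUR = 550     # login rate applied during the soak test
SOAK_DURATION_HOURS = 60       # Friday 6 pm to Monday 6 am


def hours_to_replay_one_day(logins_per_day, logins_per_hour):
    """Hours of soak load needed to push one day's logins through the system."""
    return logins_per_day / logins_per_hour


def soak_totals(logins_per_hour, duration_hours, logins_per_day):
    """Return (total logins, equivalent days of normal activity)."""
    total_logins = logins_per_hour * duration_hours
    return total_logins, total_logins / logins_per_day


hours_per_day = hours_to_replay_one_day(AVG_LOGINS_PER_DAY, SOAK_LOGINS_PER_HOUR)
total, days = soak_totals(SOAK_LOGINS_PER_HOUR, SOAK_DURATION_HOURS, AVG_LOGINS_PER_DAY)

print(f"One day of logins replayed in {hours_per_day:.1f} hours")
print(f"{total:,} logins in {SOAK_DURATION_HOURS} hours = {days:.1f} days of activity")
```

Changing the three constants gives the equivalent figures for a different site or a different test window.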

Some typical problems identified during soak tests are listed below:

 

- Serious memory leaks that would eventually result in a memory crisis.
- Failure to close connections between tiers of a multi-tiered system under some circumstances, which could stall some or all modules of the system.
- Failure to close database cursors under some conditions, which would eventually result in the entire system stalling.
- Gradual degradation of response time of some functions as internal data structures become less efficient during a long test.

Apart from monitoring response time, it is also important to measure CPU usage and available memory.  If a server process needs to be available for the application to operate, it is often worthwhile to record its memory usage at the start and end of a soak test.   It is also important to monitor internal memory usages of facilities such as Java Virtual Machines, if applicable.
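If memory usage is sampled at intervals rather than only at the start and end, a simple trend line makes a leak easier to spot. The following is a minimal sketch: the readings and the 2,048 MB limit are hypothetical, and a straight-line fit is only an assumption about how a leak grows.

```python
def growth_per_hour(samples):
    """Least-squares slope of (hours_elapsed, memory_mb) samples, in MB per hour."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den


# Hypothetical readings for one server process:
# (hours since start of soak test, memory in use in MB)
readings = [(0, 512.0), (12, 536.0), (24, 560.0), (36, 584.0), (48, 608.0), (60, 632.0)]

slope = growth_per_hour(readings)
memory_limit_mb = 2048.0  # hypothetical limit for the process
hours_left = (memory_limit_mb - readings[-1][1]) / slope if slope > 0 else float("inf")

print(f"Memory grows by about {slope:.1f} MB per hour")
print(f"At this rate the limit is reached about {hours_left:.0f} hours after the test ends")
```

A slope close to zero suggests stable memory use; a steady positive slope over a long run is worth investigating.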

Long Session Soak Testing

When an application is used for long periods of time each day, the above approach should be modified, because the soak test driver is not logins and transactions per day, but the number of transactions each active user executes per day.

This type of situation occurs in internal systems, such as ERP and CRM systems, where users log in and stay logged in for many hours, executing a number of business transactions during that time.  A soak test for such a system should emulate multiple days of activity in a compacted time-frame rather than just pump multiple days' worth of transactions through the system.

Long session soak tests should run with realistic user concurrency, but the focus should be on the number of transactions processed.  VUGen scripts used in long session soak testing may need to be more sophisticated than short session scripts, as they must be capable of running a long series of business transactions over a prolonged period of time.

Test Duration

The duration of most soak tests is determined by the available time in the test lab.  There are many applications, however, that require extremely long soak tests.  Any application that must run uninterrupted for extended periods of time may need a soak test that covers all of the activity for a period agreed to by the stakeholders, such as a month.  Most systems have a regular maintenance window, and the time between such windows is usually a key driver for determining the scope of a soak test.

A classic example of a system that requires extensive soak testing is an air traffic control system.  A soak test for such a system may have a multi-week or even multi-month duration.

 

Failover Tests

Failover Tests verify redundancy mechanisms while the system is under load.  This is in contrast to Load Tests, which are conducted under anticipated load with no component failure during the course of a test.

For example, in a web environment, failover testing determines what will happen if multiple web servers are being used under peak anticipated load, and one of them dies. 

Does the load balancer react quickly enough?

Can the other web servers handle the sudden dumping of extra load? 

Failover testing allows technicians to address problems in advance, in the comfort of a testing situation, rather than in the heat of a production outage.  It also provides a baseline of failover capability so that a 'sick' server can be shut down with confidence, in the knowledge that the remaining infrastructure will cope with the surge of failover load.

Explanatory Diagrams:

The following is a configuration where failover testing would be required.

Diagram: Example failover configuration for a web system

This is just one of many failover configurations.  Some failover configurations can be quite complex, especially when there are redundant sites as well as redundant equipment and communications lines. 

In this type of configuration, when one of the application servers goes down, the two web servers that were configured to communicate with the failed application server cannot take load from the load balancer, and all of the load must be passed to the remaining two web servers.  See diagram below:

Diagram: web system after failover of application server

When such a failover event occurs, the web servers are under substantial stress, as they need to quickly accommodate the failed-over load, which will probably result in doubling the number of HTTP connections as well as application server connections in a very short amount of time.  The remaining application server will also be subjected to a severe increase in load and the overheads associated with catering for the increased load.
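The doubling can be illustrated with a rough calculation. This sketch assumes the load balancer spreads connections evenly across whichever web servers are in service; the figure of 2,000 concurrent connections is hypothetical.

```python
def connections_per_server(total_connections, servers_in_service):
    """Connections each server carries if load is spread evenly (an assumption)."""
    if servers_in_service < 1:
        raise ValueError("at least one server must remain in service")
    return total_connections / servers_in_service


TOTAL_CONNECTIONS = 2000  # hypothetical peak number of concurrent HTTP connections

before = connections_per_server(TOTAL_CONNECTIONS, 4)  # all four web servers in service
after = connections_per_server(TOTAL_CONNECTIONS, 2)   # two web servers left after failover

print(f"Per web server before failover: {before:.0f}")
print(f"Per web server after failover:  {after:.0f} ({after / before:.0f}x)")
```

The same calculation applies to the application tier, where one server is left carrying the load previously shared by two.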

It is crucial to the design of any meaningful failover testing that the failover design is understood, so that the implications of a failover event while under load can be scrutinized.

Fail-back Testing:

After verifying that a system can sustain a component outage, it is also important to verify that when the component is back up, it is available to take load again and can sustain the influx of activity when it comes back online.

 

Stress Tests

Stress Tests determine the load under which a system fails, and how it fails.  This is in contrast to Load Testing, which attempts to simulate anticipated load.  It is important to know in advance if a ‘stress’ situation will result in a catastrophic system failure, or if everything just “goes really slow”.  There are several varieties of Stress Tests, including spike, stepped and gradual ramp-up tests.  Catastrophic failures require restarting various infrastructure and contribute to downtime, a stressful environment for support staff and managers, as well as possible financial losses.  If a major performance bottleneck is reached, then the system performance will usually degrade to a point that is unsatisfactory, but performance should return to normal when the excessive load is removed.

Before conducting a Stress Test, it is usually advisable to conduct targeted infrastructure tests on each of the key components in the system.   A variation on targeted infrastructure tests would be to execute each one as a mini stress test.

The diagram below shows an unexpectedly high amount of demand on a typical web system.  Stress situations are not expected under normal circumstances.  

Diagram: Stress on system making it a candidate for Stress Testing

The following table lists possible situations for a variety of applications where stress situations may occur.

| Type of Application | Circumstances that could give rise to stress levels of activity |
|---|---|
| Online Banking | After an outage - when many clients have been waiting for access to the application to do their banking transactions. |
| Marketing / Sales Application | Very successful advertising campaign - or substantial error in advertising campaign that understates pricing details. |
| Various applications | Unexpected publicity - for example, in a news article in a national online newspaper. |

Focus of stress test.

In a stress event, it is most likely that many more connections will be requested per minute than under normal levels of expected peak activity.  In many stress situations, the actions of each connected user will not be typical of actions  observed under normal operating conditions.  This is partly due to the slow response and partly due to the root cause of the stress event. 

Let's take an example of a large holiday resort web site.  Normal activity will be characterized by browsing, room searches and bookings.  If a national online news service posted a sensational article about the resort and included a URL in the article, then the site may be subjected to a huge number of hits, but most of the visits would probably be a quick browse.  It is unlikely that many of the additional visitors would search for rooms and it would be even less likely that they would make bookings.  However, if instead of a news article, a national newspaper advertisement erroneously understated the price of accommodation, then there may well be an influx of visitors who clamour to book a room, only to find that the price did not match their expectations.

In both of the above situations, the normal traffic would be increased with traffic of a different usage profile.  So a stress test design would incorporate a Load Test as well as additional virtual users running a special series of 'stress' navigations and transactions. 

For the sake of simplicity, one can just increase the number of users using the business processes and functions coded in the Load Test. However, one must then keep in mind that a system failure with that type of activity may be different to the type of failure that may occur if a special series of 'stress' navigations were utilized for stress testing.

Stress test execution.

Typically, a stress test starts with a Load Test, and then additional activity is gradually increased until something breaks.  An alternative type of stress test is a Load Test with sudden bursts of additional activity.  The sudden bursts of activity generate substantial activity as sessions and connections are established, whereas a gradual ramp-up in activity pushes various values past fixed system limitations.

Diagram showing two types of Stress Tests - Gradual Rampup and Burst.

Ideally, stress tests should incorporate two runs, one with burst type activity and the other with gradual ramp-up as per the diagram above, to ensure that the system under test will not fail catastrophically under excessive load.  System reliability under severe load should not be negotiable and stress testing will identify reliability issues that arise under severe levels of load.
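The two load shapes can be described as simple functions of elapsed time. This is only a sketch of the idea, not a LoadRunner schedule; the base load of 500 users and the other parameters are hypothetical.

```python
def gradual_ramp_users(minute, base_users, extra_per_minute):
    """Load Test level plus a steadily growing number of additional users."""
    return base_users + extra_per_minute * minute


def burst_users(minute, base_users, burst_extra, burst_every, burst_length):
    """Load Test level plus a burst of extra users at the end of each period."""
    in_burst = minute % burst_every >= burst_every - burst_length
    return base_users + (burst_extra if in_burst else 0)


# Hypothetical parameters: 500 users of normal load, ramping by 20 users per
# minute, or bursting by 300 extra users for 2 minutes in every 10.
for minute in range(0, 21, 4):
    ramp = gradual_ramp_users(minute, 500, 20)
    burst = burst_users(minute, 500, 300, 10, 2)
    print(f"minute {minute:2d}: gradual ramp-up {ramp:4d} users, burst {burst:4d} users")
```

In the gradual run the test continues until something breaks; in the burst run the base load stays at the Load Test level between bursts.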

An alternative or supplemental stress test is commonly referred to as a spike test, where a single short burst of concurrent activity is applied to a system.  Such tests typically simulate extreme activity where a 'countdown' situation exists, for example a system that will not take orders for a new product until a particular date and time.  If demand is very strong, then many users will be poised to use the system the moment the countdown ends, creating a spike of concurrent requests and load.

 

 

Load Tests

Load Tests are end to end performance tests under anticipated production load.  The objective of such tests is to determine the response times for various time critical transactions and business processes and to ensure that they are within documented expectations (or Service Level Agreements - SLAs).  Load tests also measure the capability of an application to function correctly under load, by measuring transaction pass/fail/error rates.  An important variation of the load test is the Network Sensitivity Test, which incorporates WAN segments into a load test, as most applications are deployed beyond a single LAN.

Load Tests are major tests, requiring substantial input from the business, so that anticipated activity can be accurately simulated in a test environment.  If the project has a pilot in production then logs from the pilot can be used to generate ‘usage profiles’ that can be used as part of the testing process, and can even be used to ‘drive’ large portions of the Load Test.  

Load testing must be executed on “today’s” production size database, and optionally with a “projected” database.  If some database tables will be much larger in some months time, then Load testing should also be conducted against a projected database.  It is important that such tests are repeatable, and give the same results for identical runs.  They may need to be executed several times in the first year of wide scale deployment, to ensure that new releases and changes in database size do not push response times beyond prescribed SLAs. 

 

What is the purpose of a Load Test?

The purpose of any load test should be clearly understood and documented.  A load test usually fits into one of the following categories:

  1. Quantification of risk.  - Determine, through formal testing, the likelihood that system performance will meet the formal stated performance expectations of stakeholders, such as response time requirements under given levels of load.  This is a traditional Quality Assurance (QA) type test.  Note that load testing does not mitigate risk directly, but through identification and quantification of risk, presents tuning opportunities and an impetus for remediation that will mitigate risk.
  2. Determination of minimum configuration.  - Determine, through formal testing, the minimum configuration that will allow the system to meet the formal stated performance expectations of stakeholders - so that extraneous hardware, software and the associated cost of ownership can be minimized.  This is a Business Technology Optimization (BTO) type test.

 

What functions or business processes should be tested?

The following table describes the criteria for determining the business functions or processes to be included in a test.

| Basis for inclusion in Load Test | Comment |
|---|---|
| High frequency transactions | The most frequently used transactions have the potential to impact the performance of all of the other transactions if they are not efficient. |
| Mission Critical transactions | The more important transactions that facilitate the core objectives of the system should be included, as failure under load of these transactions has, by definition, the greatest impact. |
| Read Transactions | At least one READ ONLY transaction should be included, so that performance of such transactions can be differentiated from other more complex transactions. |
| Update Transactions | At least one update transaction should be included so that performance of such transactions can be differentiated from other transactions. |

 

Example of Load Test Configuration for a web system

The following diagram shows how a thorough load test could be set up using LoadRunner. 

Comprehensive Load Testing Configuration

The important thing to understand in executing such a load test is that the load is generated at a protocol level by the load generators, which run scripts developed with the VUGen tool.  Transaction times derived from the VUGen scripts do not include processing time on the client PC, such as rendering (drawing parts of the screen) or execution of client side scripts such as JavaScript.  The WinRunner PC(s) is utilized to measure end user experience response times.  Most load tests would not employ a WinRunner PC to measure actual response times from the client perspective, but doing so is highly recommended where complex and variable processing is performed on the desktop after data has been delivered to the client.

The LoadRunner controller is capable of displaying real-time graphs of response times as well as other measures such as CPU utilization on each of the components behind the firewall.  Internal measures from products such as Oracle and WebSphere are also available for monitoring during test execution.

After completion of a test, the Analysis engine can generate a number of graphs and correlations to help locate any performance bottlenecks. 

 

Simplified Load Test Configuration for a web system

Simplified Load Testing Configuration

In this simplified load test, the controller communicates directly to a load generator that can communicate directly to the load balancer.  No WinRunner PC is utilized to measure actual user experience.  The collection of statistics from various components is simplified as there is no firewall between the controller and the web components being measured.

Reporting on Response Time at various levels of load.

Expected output from a load test often includes a series of response time measures at various levels of load, e.g. 500 users, 750 users and 1,000 users.  It is important when determining the response time at any particular level of load that the system has run in a stable manner for a significant amount of time before taking measurements.

For example, a ramp-up to 500 users may take ten minutes, but another ten minutes may be required to let the system activity stabilize.  Taking measurements over the next ten minutes would then give a meaningful result.  The next measurement can be taken after ramping up to the next level and waiting a further ten minutes for stabilization and ten minutes for the measurement period and so on for each level of load requiring detailed response time measures.
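The resulting timetable can be laid out with a small helper. This sketch assumes that every ramp-up, stabilization and measurement period lasts ten minutes, as in the example above; the function name and the load levels are illustrative.

```python
def measurement_schedule(load_levels, ramp_min=10, settle_min=10, measure_min=10):
    """Return (users, measure_start_minute, measure_end_minute) for each load level."""
    schedule = []
    clock = 0  # minutes since the start of the test
    for users in load_levels:
        measure_start = clock + ramp_min + settle_min  # ramp up, then stabilize
        measure_end = measure_start + measure_min      # then measure
        schedule.append((users, measure_start, measure_end))
        clock = measure_end
    return schedule


schedule = measurement_schedule([500, 750, 1000])
for users, start, end in schedule:
    print(f"{users:5d} users: measure from minute {start} to minute {end}")
```

Only the response times recorded inside each measurement window would be reported for that level of load.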

 

 

 

 

Targeted Infrastructure Tests

Targeted Infrastructure Tests are isolated tests of each layer and/or component in an end to end application configuration.  They cover communications infrastructure, Load Balancers, Web Servers, Application Servers, Crypto cards, Citrix Servers, Databases and so on, allowing for identification of any performance issues that would fundamentally limit the overall ability of a system to deliver at a given performance level.

Each test can be quite simple.  For example, a test ensuring that 500 concurrent (idle) sessions can be maintained by Web Servers and related equipment should be executed prior to a full 500 user end to end performance test, as a configuration file somewhere in the system may limit the number of users to less than 500.  It is much easier to identify such a configuration issue in a Targeted Infrastructure Test than in a full end to end test.

The following diagram shows a simple conceptual decomposition of load to four different components in a typical web system.

Targeted infrastructure testing separately generates load on each component, and measures the response of each component under load.  The following diagram shows four different tests that could be conducted to simulate the load represented in the above diagram.

Different infrastructure tests require different protocols.  For example, VUGen™ supports a number of database protocols, such as DB2 CLI, Informix, MS SQL Server, Oracle and Sybase. 

Performance Tests

Performance Tests are tests that determine end to end timing (benchmarking) of various time critical business processes and transactions, while the system is under low load, but with a production sized database.  This sets a ‘best possible’ performance expectation under a given configuration of infrastructure.  It also highlights very early in the testing process whether changes need to be made before load testing is undertaken.  For example, a customer search may take 15 seconds in a full sized database if indexes had not been applied correctly, or if an SQL 'hint' was incorporated in a statement that had been optimized with a much smaller database.  Performance testing would highlight such a slow customer search transaction, which could be remediated prior to a full end to end load test.

It is 'best practice' to develop performance tests with an automated tool, such as WinRunner, so that response times from a user perspective can be measured in a repeatable manner with a high degree of precision.  The same test scripts can later be re-used in a load test and the results can be compared back to the original performance tests.

Repeatability

A key indicator of the quality of a performance test is repeatability.  Re-executing a performance test multiple times should give the same set of results each time.  If the results are not the same each time, then the differences in results from one run to the next cannot be attributed to changes in the application, configuration or environment.
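In practice, results from identical runs are never exactly equal, so some tolerance has to be agreed. One possible check is sketched below; the 5% tolerance and the sample figures are hypothetical and would need to be agreed for each project.

```python
from statistics import mean


def relative_spread(run_means):
    """(slowest - fastest) / average of the mean response times of identical runs."""
    return (max(run_means) - min(run_means)) / mean(run_means)


TOLERANCE = 0.05  # hypothetical: identical runs should agree to within 5%

# Hypothetical mean response times (seconds) of one transaction in three identical runs.
customer_search_runs = [1.20, 1.22, 1.18]

spread = relative_spread(customer_search_runs)
verdict = "repeatable" if spread <= TOLERANCE else "NOT repeatable"
print(f"Spread between runs: {spread:.1%} ({verdict})")
```

If the spread exceeds the tolerance, the test environment should be investigated before any differences between releases are interpreted.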

Performance Tests Precede Load Tests

The best time to execute performance tests is at the earliest opportunity after the content of a detailed load test plan has been determined.  Developing performance test scripts at such an early stage provides the opportunity to identify and remediate serious performance problems, and to correct unrealistic expectations, before load testing commences.

For example, management expectations of response time for a new web system that replaces a block mode terminal application are often articulated as 'sub second'.  However, a web system, in a single screen, may perform the business logic of several legacy transactions and may take 2 seconds.  Rather than waiting until the end of a load test cycle to inform the stakeholders that the test failed to meet their formally stated expectations, a little education up front may be in order.  Performance tests provide a means for this education. 

Another key benefit of performance testing early in the load testing process is the opportunity to fix serious performance problems before even commencing load testing. 

A common example is one or more missing indexes.  When performance testing of a "customer search" screen yields response times of more than ten seconds, there may well be a missing index, or poorly constructed SQL statement.  By raising such issues prior to commencing formal load testing, developers and DBAs can check that indexes have been set up properly.

Performance problems that relate to size of data transmissions also surface in performance tests when low bandwidth connections are used.  For example, some data, such as images and "terms and conditions" text are not optimized for transmission over slow links. 

Pre-requisites for Performance Testing

A performance test is not valid until the data in the system under test is realistic and the software and configuration are production like.  The following table lists pre-requisites for valid performance testing, along with tests that can be conducted before the pre-requisites are satisfied:

| Performance Test Pre-Requisites | Comment | Caveats on testing where pre-requisites are not satisfied |
|---|---|---|
| Production Like Environment | Performance tests need to be executed on the same specification equipment as production if the results are to have integrity. | Lightweight transactions that do not require significant processing can be tested, but only substantial deviations from expected transaction response times should be reported.  Low bandwidth performance testing of high bandwidth transactions where communications processing contributes to most of the response time can be tested. |
| Production Like Configuration | Configuration of each component needs to be production like.  For example: Database configuration and Operating System Configuration. | While system configuration will have less impact on performance testing than load testing, only substantial deviations from expected transaction response times should be reported. |
| Production Like Version | The version of software to be tested should closely resemble the version to be used in production. | Only major performance problems such as missing indexes and excessive communications should be reported with a version substantially different from the proposed production version. |
| Production Like Access | If clients will access the system over a WAN, dial-up modems, DSL, ISDN, etc. then testing should be conducted using each communication access method.  See Network Sensitivity Tests for more information on testing WAN access. | Only tests using production like access are valid. |
| Production Like Data | All relevant tables in the database need to be populated with a production like quantity with a realistic mix of data.  e.g. Having one million customers, 999,997 of which have the name "John Smith" would produce some very unrealistic responses to customer search transactions. | Low bandwidth performance testing of high bandwidth transactions where communications processing contributes to most of the response time can be tested. |

Documenting Response Time Expectations.

Rather than simply stating that all transactions must be 'sub second', a more comprehensive specification for response time needs to be defined and agreed to by relevant stakeholders.

One suggestion is to state an Average and a 90th Percentile response time for each group of transactions that are time critical.   In a set of 100 values that are sorted from best to worst, the 90th percentile simply means the 90th value in the list.
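This definition corresponds to the 'nearest rank' method of computing a percentile. A minimal sketch follows; the response times are made-up values, and other tools may interpolate between values and so report slightly different figures.

```python
import math
from statistics import mean


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: sort best to worst and take the value at rank
    ceil(pct% of N), counting from 1.  pct is a whole number such as 90."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up response times (seconds) for 100 executions: 0.51, 0.52, ... 1.50
response_times = [0.5 + i / 100 for i in range(1, 101)]

average = mean(response_times)
p90 = percentile_nearest_rank(response_times, 90)
print(f"Average: {average:.2f} s, 90th percentile: {p90:.2f} s")
```

With 100 sorted values the function returns the 90th value in the list, as described above.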


Executing Performance Tests.

Performance testing involves executing the same test case multiple times with data variations for each execution, and then collating response times and computing response time statistics to compare against the formal expectations.  Often, performance is different when the data used in the test case is different, as different numbers of rows are processed in the database, different processing and validation come into play, and so on.

By executing a test case many times with different data, a statistical measure of response time can be computed that can be directly compared against a formal stated expectation.

 

Network Sensitivity Tests

Network sensitivity tests are variations on Load Tests and Performance Tests that focus on Wide Area Network (WAN) limitations and network activity (e.g. traffic, latency, error rates).  Network sensitivity tests can be used to predict the impact of a given WAN segment or traffic profile on various applications that are bandwidth dependent.  Network issues often arise at low levels of concurrency over low bandwidth WAN segments.  Very 'chatty' applications can appear to be more prone to response time degradation under certain conditions than other applications that actually use more bandwidth.  For example, some applications may degrade to unacceptable levels of response time when a certain pattern of network traffic uses 50% of available bandwidth, while other applications are virtually unchanged in response time even with 85% of available bandwidth consumed elsewhere.

This is a particularly important test for deployment of a time critical application over a WAN.

Also, some front end systems such as web servers, need to work much harder with 'dirty' communications compared with the clean communications encountered on a high speed LAN in an isolated load and performance testing environment.

Why execute Network Sensitivity Tests

The three principal reasons for executing Network Sensitivity tests are as follows:

- Determine the impact on response time of a WAN link.  (Variation of a Performance Test)
- Determine the capacity of a system based on a given WAN link.  (Variation of a Load Test)
- Determine the impact on the system under test that is under 'dirty' communications load.  (Variation of a Load Test)

Execution of performance and load tests for analysis of network sensitivity requires the test system configuration to emulate a WAN.  Once a WAN link has been configured, performance and load tests conducted will become Network Sensitivity Tests.

There are two ways of configuring such tests.

- Use a simulated WAN and inject appropriate background traffic.

This can be achieved by putting back to back routers between a load generator and the system under test.   The routers can be configured to allow the required level of bandwidth, and instead of connecting to a real WAN, they connect directly through to each other.

Diagram of simple back to back router setup to conduct bandwidth testing.

 

When back to back routers are configured to be part of a test, they will basically limit the bandwidth.  If the test is to be more realistic, then additional traffic will need to be applied to the routers.

 

This can be achieved by a web server at one end of the link serving pages and another load generator generating requests.  It is important that the mix of traffic is realistic.  For example, a few continuous file transfers may impact response time in a different way to a large number of small transmissions.

Diagram of more realistic back to back router setup to conduct bandwidth testing and network sensitivity testing.

 

By forcing extra traffic over the simulated WAN link, the latency will increase and some packet loss may even occur.  While this is much more realistic than testing over a high speed LAN, it does not take into account many features of a congested WAN, such as out of sequence packets.

 

- Use the WAN emulation facility within LoadRunner.

The WAN emulation facility within LoadRunner supports a variety of WAN scenarios.   Each load generator can be assigned a number of WAN emulation parameters, such as error rates and latency.  WAN parameters can be set individually, or WAN link types can be selected from a list of pre-set configurations.  For detailed information on WAN emulation within LoadRunner follow this link - mercuryinteractive.com/products/LoadRunner/wan_emulation.html.

 

It is important to ensure that measured response times incorporate the impact of WAN effects both at an individual session, as part of a performance test, and under load as part of a load test, because a system under WAN affected load may work much harder than a system doing the same actions over a clean communications link.

Where is the WAN?

Another key consideration in network sensitivity tests is the logical location of a WAN segment.  A WAN segment is often between a client application and its server.  Some application configurations may have a WAN segment to a remote service that is accessed by an application server.  To execute a load test that determines the impact of such a WAN segment, or the point at which the WAN link saturates and becomes a bottleneck, one must test with a real WAN link, or a back to back router setup - as described above.  As the link becomes saturated, response time for transactions that utilize the WAN link will degrade.

Response Time Calculation Example.

A simplified formula for predicting response time is as follows:

Response Time = Transmission Time + Delays + Client Processing Time + Server Processing Time.

Where:

Transmission Time = Data to be transferred  divided by  Bandwidth.

Delays = Number of Turns multiplied by 'Round Trip' response time.

Client Processing Time = Time taken on the user's software to fulfil request.

Server Processing Time = Time taken on server computer to fulfil request.
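The formula translates directly into a few lines of code. The sketch below assumes data is given in kilobytes, bandwidth in kilobits per second (eight bits per byte) and all times in milliseconds; the example values are hypothetical.

```python
def estimated_response_time_s(data_kb, turns, bandwidth_kbps,
                              round_trip_ms, server_ms, client_ms):
    """Response Time = Transmission Time + Delays + Client + Server processing time."""
    transmission_s = (data_kb * 8) / bandwidth_kbps  # kilobytes -> kilobits, / Kbps
    delays_s = turns * round_trip_ms / 1000          # each turn costs one round trip
    processing_s = (client_ms + server_ms) / 1000
    return transmission_s + delays_s + processing_s


# Hypothetical transaction: 100 KB over a 512 Kbps link, 20 turns at 50 ms each,
# 300 ms of server processing and 200 ms of client processing.
total_s = estimated_response_time_s(
    data_kb=100, turns=20, bandwidth_kbps=512,
    round_trip_ms=50, server_ms=300, client_ms=200,
)
print(f"Estimated response time: {total_s:.2f} s")
```

Varying one argument at a time shows which parameter dominates: for a 'chatty' transaction the number of turns multiplied by the round trip time can outweigh the transmission time even on a slow link.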

 

This simplified model demonstrates how the various parameters affect response time.  Other parameters, such as error rates and lost packet rates, are not included.

Simple Response Time Calculator / Model

Inputs:

Data transfer for transaction (KB)

Number of Turns (or resources on web page, eg gifs)

Effective Bandwidth (Kbps)

Round Trip Time (ms)

Server Processing Time (ms)

Client Processing Time (ms)

Output:

Estimated Response Time (in seconds)
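The simplified response time formula can be sketched in a few lines of Python. This is only an illustration of the model described here, not the original calculator: the function name, the parameter names and the sample values are hypothetical, and the sketch assumes that 1 KB of data equals 8 kilobits on the wire.

```python
def estimate_response_time(data_kb, turns, bandwidth_kbps, round_trip_ms,
                           server_ms, client_ms):
    """Return the estimated response time in seconds (simplified model)."""
    # Transmission Time = data to be transferred / bandwidth.
    # Assumption: 1 KB = 8 kilobits, so KB * 8 / Kbps gives seconds.
    transmission_s = (data_kb * 8) / bandwidth_kbps
    # Delays = number of turns * round trip time.
    delays_s = turns * round_trip_ms / 1000.0
    # Client and server processing times are converted from ms to seconds.
    return transmission_s + delays_s + client_ms / 1000.0 + server_ms / 1000.0


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    seconds = estimate_response_time(
        data_kb=100, turns=10, bandwidth_kbps=512,
        round_trip_ms=50, server_ms=200, client_ms=100)
    print(f"Estimated response time: {seconds:.2f} s")
```

Varying one argument at a time shows which parameter dominates: on a high-latency link the number of turns usually matters more than the raw bandwidth.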

If you run ping from your command line, first with a small number of bytes and then with a moderate number of bytes, the two average round trip times can be used to estimate the actual bandwidth and latency from where you are to the site you pinged, for use in the model above.

Small number of bytes (eg 48):  ping -l 48 www.merc-int.com.au

Moderate number of bytes (eg 2048):  ping -l 2048 www.merc-int.com.au

For each run, record the number of bytes sent and the average round trip time (ms).
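One way to turn two ping results into bandwidth and latency figures is sketched below. The formula used by the original page is not given, so this is an assumption: the round trip time is modelled as a fixed latency plus the time to send the payload across the link once in each direction. The function name and the sample measurements are hypothetical.

```python
def estimate_link(small_bytes, small_ms, large_bytes, large_ms):
    """Estimate (bandwidth_kbps, latency_ms) from two average ping results.

    Assumed model: round trip time = latency + 2 * payload_bits / bandwidth,
    because the echo payload crosses the link once in each direction.
    """
    extra_bits = 2 * 8 * (large_bytes - small_bytes)
    extra_ms = large_ms - small_ms
    if extra_bits <= 0 or extra_ms <= 0:
        raise ValueError("the larger ping must carry more bytes and take longer")
    # Bits per millisecond is numerically the same as kilobits per second.
    bandwidth_kbps = extra_bits / extra_ms
    # Remove the transmission time of the small payload to leave pure latency.
    latency_ms = small_ms - (2 * 8 * small_bytes) / bandwidth_kbps
    return bandwidth_kbps, latency_ms


if __name__ == "__main__":
    # Hypothetical measurements: 48 bytes in 30 ms, 2048 bytes in 94 ms.
    bandwidth, latency = estimate_link(48, 30.0, 2048, 94.0)
    print(f"Effective bandwidth: {bandwidth:.0f} Kbps, latency: {latency:.1f} ms")
```

The result is an effective bandwidth for the whole path at the time of measurement, so it should be averaged over several runs before being used in the model.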

 

A final word on bandwidth congestion.

Care should be taken when considering the congestion of an existing network link while attempting to replicate that link in a test environment.  Take the example of a site that has four staff.  If one of those staff members spent all day downloading files from the web, using up all of the bandwidth, then analysis would show a link with high utilization.  If, however, three staff spent all day downloading files, the line utilization would be much the same, but the bandwidth available to the remaining staff member would be greatly reduced compared with the first scenario of only one person downloading files.

The effective available bandwidth takes this effect of excessive bandwidth demand into account and should be used in preference to the 'stated' bandwidth.
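The four-staff example can be put into numbers with a small sketch. It assumes, as an idealization, that a saturated link is shared equally between all active users; real links rarely divide this evenly. The function name and the 1024 Kbps link speed are hypothetical.

```python
def bandwidth_per_user(link_kbps, active_users):
    """Bandwidth each active user receives, assuming the link is shared equally."""
    if active_users < 1:
        raise ValueError("at least one active user is required")
    return link_kbps / active_users


if __name__ == "__main__":
    LINK_KBPS = 1024  # hypothetical 'stated' bandwidth
    # One heavy downloader plus the staff member we care about.
    print(bandwidth_per_user(LINK_KBPS, active_users=2))
    # Three heavy downloaders plus the staff member we care about.
    print(bandwidth_per_user(LINK_KBPS, active_users=4))
```

In both cases the link shows close to full utilization, yet under this assumption the remaining staff member's share halves between the first scenario and the second.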

 

Volume Tests

Volume Tests are often most appropriate to messaging, batch and conversion processing situations.  In a Volume Test, there is often no such measure as response time.  Instead, there is usually a concept of throughput.

A key to effective volume testing is the identification of the relevant capacity drivers.  A capacity driver is something that directly impacts on the total processing capacity.  For a messaging system, a capacity driver may well be the size of messages being processed. 

Volume Testing of Messaging Systems

Most messaging systems do not interrogate the body of the messages they are processing, so varying the content of the test messages may not affect total message throughput, whereas significantly changing the size of the messages may.  The message header, however, may include indicators that have a very significant impact on processing efficiency.  For example, a message flagged as not needing delivery under certain circumstances is much easier to deal with than a message flagged to be held for as long as necessary to deliver it, and which must not be lost.  In the former case, the message may be held in memory, but in the latter case, the message must be physically written to disk multiple times (a normal disk write, another write to a journal mechanism of some sort, plus possible mirroring writes and remote failover system writes).

Before conducting a meaningful test on a messaging system, the following must be known:

bullet

The capacity drivers for the messages (as discussed above).

bullet

The peak rate of messages that need to be processed, grouped by capacity driver.

bullet

The duration of peak message activity that needs to be replicated.

bullet

The required message processing rates.

A test can then be designed to measure the throughput of a messaging system as well as the internal messaging system metrics while that throughput rate is being processed.  Such measures would typically include CPU utilization and disk activity.

It is important that a test be run, at peak load, for a period of time equal to or greater than the expected production duration of peak load.  To run the test for less time would be like trying to test a freeway system with peak hour vehicular traffic, but limiting the test to five minutes.  The traffic would be absorbed into the system easily, and you would not be able to determine a realistic forecast of the peak hour capacity of the freeway.  You would intuitively know that a reasonable test of a freeway system must include the entire 'morning peak' and 'evening peak' traffic profiles, as both peaks are very different.  (Morning traffic generally converges on a city, whereas evening traffic is dispersed into the suburbs.)
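The checks described for a messaging volume test can be sketched as follows: throughput is calculated for each capacity driver and compared with the required rate, and the test is rejected if it ran for less time than the production peak. The class, the function, the driver names and the figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DriverResult:
    driver: str                 # capacity driver, e.g. "small, non-persistent"
    messages_processed: int     # messages completed during the test
    required_per_second: float  # required processing rate for this driver


def evaluate_volume_test(results, test_seconds, peak_seconds):
    """Return {driver: (achieved_per_second, meets_requirement)}."""
    # A test shorter than the production peak cannot show sustained capacity.
    if test_seconds < peak_seconds:
        raise ValueError("test must run at least as long as the production peak")
    report = {}
    for result in results:
        achieved = result.messages_processed / test_seconds
        report[result.driver] = (achieved, achieved >= result.required_per_second)
    return report


if __name__ == "__main__":
    results = [
        DriverResult("small, non-persistent", 360_000, 80.0),
        DriverResult("large, persistent", 18_000, 10.0),
    ]
    for driver, (rate, ok) in evaluate_volume_test(results, 3600, 3600).items():
        print(f"{driver}: {rate:.1f} msg/s, {'meets' if ok else 'below'} requirement")
```

CPU utilization and disk activity would be recorded alongside these figures while the throughput rate is being sustained.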

Volume Testing of Batch Processing Systems

Capacity drivers in batch processing systems are also critical as certain record types may require significant CPU processing, while other record types may invoke substantial database and disk activity.  Some batch processes also contain substantial aggregation processing, and the mix of transactions can significantly impact the processing requirements of the aggregation phase. 

In addition to the contents of any batch file, the total amount of processing effort may also depend on the size and makeup of the database that the batch process interacts with.  Also, some details in the database may be used to validate batch records, so the test database must 'match' test batch files.

Before conducting a meaningful test on a batch system, the following must be known:

bullet

The capacity drivers for the batch records (as discussed above).

bullet

The mix of batch records to be processed, grouped by capacity driver.

bullet

Peak expected batch sizes (check end of month, quarter & year batch sizes).

bullet

Similarity of production database and test database.

bullet

Performance Requirements (eg. records per second)

Batch runs can be analysed and the capacity drivers can be identified, so that large batches can be generated for validation of processing within batch windows.  Volume tests are also executed to ensure that the anticipated numbers of transactions are able to be processed and that they satisfy the stated performance requirements.
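A rough check of whether a generated batch fits its batch window can be sketched as below. It assumes that records are processed one after another and that a per-record cost has already been measured for each capacity driver in earlier runs; the record types, costs and window length are hypothetical.

```python
def batch_run_seconds(record_counts, seconds_per_record):
    """Total processing time if records are processed one after another."""
    return sum(count * seconds_per_record[record_type]
               for record_type, count in record_counts.items())


def fits_batch_window(record_counts, seconds_per_record, window_hours):
    """True when the estimated run time fits inside the batch window."""
    return batch_run_seconds(record_counts, seconds_per_record) <= window_hours * 3600


if __name__ == "__main__":
    # Hypothetical end-of-month batch, grouped by capacity driver.
    counts = {"simple update": 1_000_000, "aggregation": 50_000}
    cost = {"simple update": 0.002, "aggregation": 0.1}  # seconds per record
    print(f"Estimated run time: {batch_run_seconds(counts, cost):.0f} s")
    print("Fits 2 hour window:", fits_batch_window(counts, cost, window_hours=2))
```

An estimate like this only sizes the test; the volume test itself still has to confirm the figures against a database of production-like size and makeup.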

Sociability (sensitivity) Tests

Sensitivity analysis testing can determine the impact of activities in one system on another related system.  Such testing uses a mathematical approach to estimate that impact.  For example, web enabling a customer 'order status' facility may affect the performance of telemarketing screens that interrogate the same tables in the same database.  Web enabling can be more successful than anticipated and result in many more enquiries than originally envisioned, which loads the IT systems with more work than had been planned.

Tuning Cycle Tests

A series of test cycles can be executed with a primary purpose of identifying tuning opportunities.  Tests can be refined and re-targeted 'on the fly' to allow technology support staff to make configuration changes so that the impact of those changes can be immediately measured.

Protocol Tests

Protocol tests involve the mechanisms used in an application, rather than the applications themselves.  For example, a protocol test of a web server will involve a number of HTTP interactions that would typically occur if a web browser were to interact with a web server, but the test would not be done using a web browser.  LoadRunner is usually used to drive load into a system using VUGen at a protocol level, so that a small number of computers (Load Generators) can be used to simulate many thousands of users.
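The idea of driving a web server at the protocol level, without a browser, can be illustrated with the Python standard library. This is not a VUGen script and says nothing about how LoadRunner works internally: it simply issues raw HTTP requests and times them. The stand-in server, the handler, the function name and the paths are hypothetical and exist only so that the sketch runs on its own.

```python
import http.client
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in web server so the sketch runs without external systems."""

    def do_GET(self):
        body = b"order status: shipped"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def run_protocol_test(paths):
    """Issue HTTP GETs directly (no browser); return (path, status, seconds)."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    timings = []
    try:
        for path in paths:
            conn = http.client.HTTPConnection(
                "127.0.0.1", server.server_port, timeout=5)
            start = time.perf_counter()
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # include the body transfer in the timing
            timings.append((path, response.status, time.perf_counter() - start))
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return timings


if __name__ == "__main__":
    for path, status, seconds in run_protocol_test(["/status", "/orders"]):
        print(f"GET {path}: HTTP {status} in {seconds * 1000:.1f} ms")
```

Because no page is rendered, the timings cover only the protocol exchange, which is why one machine can stand in for a large number of users.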

Thick Client Application Tests

A Thick Client (also referred to as a fat client) is a purpose-built piece of software that has been developed to work as a client with a server.  It often has substantial business logic embedded within it, beyond the simple validation that can be achieved through a web browser.  A thick client can often be very efficient with the amount of data transferred between it and its server, but is also often sensitive to poor communications links.  Testing tools such as WinRunner can be used to drive a Thick Client, so that response time can be measured under a variety of circumstances within a testing regime.

Developing a load test based on thick client activity usually requires significantly more effort for the coding stage of testing, as VUGen must be used to simulate the protocol between the client and the server.  That protocol may be database connection based, COM/DCOM based, a proprietary communications protocol or even a combination of protocols.

 

Thin Client Application Tests

An internet browser that is used to run an application is said to be a thin client.  But even thin clients can consume substantial amounts of CPU time on the computer that they are running on.  This is particularly the case with complex web pages that utilize many recently introduced features to liven up a web page.  Rendering a page after hitting a SUBMIT button may take several seconds even though the server may have responded to the request in less than one second.  Testing tools such as WinRunner can be used to drive a Thin Client, so that response time can be measured from a user's perspective, rather than at a protocol level.