Application Development Chronicles: Test Specification techniques

Thursday, 4 October 2007

Test Specification techniques

Context

I assembled some information about techniques you can use to specify the test cases to see whether the code contains errors while assuring that the test cases covered all the code and possible conditions.

Generally the techniques are divided in two groups ; Black-box and white-box testing techniques where black-box techniques treat the unit only from an API standpoint and the white-box techniques require full understanding of the code.

It is important to note that some formal techniques can take a lot of effort to so I suggest a more pragmatic approach based on a combination of techniques.

There are different kinds of testing but for the discussion of this text we will use the following taxonomy.

Unit testing is a process of testing the individual subprograms (classes), subroutines or procedures (methods) in a program. Most of the times these “units” are tightly coupled with other modules. Sometimes the latter are stubbed out to assure that we only test the unit under test and not the secondary units. Unit integration testing concerns the inter-operation between the different units a developer has made.
Component testing or subsystem testing (unit integration) verifies a particular portion of the application, for example the data access layer code. This for example requires the integration of a database server.
System test is a higher level test that verifies non-functional characteristics at the entry point of a system. For example performance, stress, volume, etc.
Functional tests test the system much like a client would use the application. Things like the order of input , user error messages , etc

Some of the techniques discussed are more fit for a particular type of testing than others . I will concentrate on the techniques you could use in developer testing (unit and unit integration testing).

Test psychology

Is testing something we do to show that a piece program is working as expected or that it contains no errors or to find (all) errors? Or everything mentioned?

This might seem like subtle differences in semantics to some readers but it is actually quite strong because it gives you a different starting point while specifying test cases. Testing should have something destructive in nature. You’re going to write test cases to break the code; At least that should be your mindset while testing.

Testing has the purpose of increasing the reliability of your code. So actually when a test case finds an error , the test case should be mentally regarded as a success because it found an error and we can fix it before it was put through acceptance testing or worse, in production. Of course the meaning of success is double-faced because once we fix the code and execute the test case again we should not find the error anymore. The test-case passed successfully, meaning this time processing the correct result.
So saying that my piece of code is working as expected so not the ideal mindset because you can more easily write test case that will demonstrate that. That being said these “positive attitude“ test cased should nevertheless be included into your collection test cases to assure you code coverage objective.

Proving that your program contains no errors is very difficult to state because the psychological burden to achieve this is too big for a programmer. It is better to uncover as much as errors as possible. By increasing the test case you have, you’re more likely to find more errors. The more errors you find, an of course fix, the more confident you will be of your code. This will reduce the stress you’re feeling.

Typical Software defects and their causes

This is a huge area but ranging from wrong error messages to unable to free unused memory or missing database fields because of incomplete specifications. I will narrow the list to some typical coding errors and their sources.

Error against numeric boundaries
Wrong number of loops
Wrong values for constants
Wrong operation order
Overflows
Incorrect operator used in comparison
precision loss due to rounding or truncation
Unhandled case in logic
Failure to initialize a loop control variable
Failure to (re)initialize
Assuming one event always finishes before another
Required resource not available
Doesn't free unused memory
Device unavailable
Unexpected end of file
Wrong data types used

Test strategies

So how should we go about finding all errors in a program? We could try it by using every possible input condition as a test case. This would lead quickly to huge amounts of test cases with usually negligible added value. So what is out test-case design strategy?

One important testing strategy is black-box testing. Your goal is to be completely unconcerned about the internal behaviour and structure of the program. Instead, concentrate on finding circumstances in which the program does not behave according to its specifications. In this approach, test cases are derived solely from the specifications (i.e., without taking advantage of knowledge of the internal structure of the program).

Another testing strategy, white-box testing, forces you to examine the internal structure of the program. This strategy derives test data from an examination of the program’s logic.
In finding test cases, the following paragraphs sum up several techniques you can use to find appropriate test cases. It is not a “cook-book –like” description. It merely tries to serve as a reminders when you faced with coming up with test cases.

It neither a mutually exclusive set of techniques. As a matter a fact, the recommended procedure for example unit testing is to develop test cases using the black-box methods and then develop supplementary test cases as necessary with white-box methods.

Sources for test case specification

Depending on the type test some sources are more appropriate than others .

Equivalence classes

Exhaustive-input test of a program is impossible. Hence, in testing a program, you are limited to trying a small subset of all possible inputs. Of course, then, you want to select the right subset, the subset with the highest probability of finding the most errors. Instead of randomly selecting a subset, you can structure your effort with technique called equivalence partioning.The equivalence classes are identified by taking each input condition (a sentence or phrase in the specification, or a parameter of a method) and partitioning it into two or more groups. An equivalence class consists of a set of data that is treated the same by the module or that should produce the same result. Two types of equivalence classes can be identified: valid equivalence classes represent valid inputs to the program, and invalid equivalence classes represent all other possible states of the condition (i.e., erroneous input values).These equivalence classes will form the basis for your test cases and test case data

Here are some examples

Boundary value analysis

Boundary conditions are those situations directly on, above, and beneath the edges of input equivalence classes and output equivalence classes. Boundary value testing focuses on the boundaries simply because that is where so many defects hide. For example using > (greater than) in a condition instead of >= (greater than or equal as). "Below" and "above" are relative terms and depend on the data value's units. So rather than selecting any element in an equivalence class as being representative, boundary-value analysis requires that one or more elements be selected such that each edge of the equivalence class is the subject of a test.Another attention point is that rather than just focusing attention on the input conditions (input space), test cases are also derived by considering the result space (output equivalence classes). For example for determining the percentage of discount for a particular order, you could try to come up with a test case that would possible generate a discount more than the foreseen maximum discount. These boundary values will be the test case data values.

Here are some examples.

Decision table testing

In equivalence Class and Boundary Value testing we considered the testing of individual variables that took on values within specified ranges. Sometimes value of one variable constrains the acceptable values of another. In this case, certain defects cannot be discovered by testing them individually.This technique explores the combinations of input circumstances to come up with test cases.The testing of input combinations is not a simple task because the number of combinations can grow quicklyIf you have no systematic way of selecting a subset of a combination of input conditions, you’ll probably select an arbitrary subset of conditions, which could lead to an ineffective test.Decision tables represent actions that have to be taken based on a set of conditions. These are actually the rules. Each rule defines a unique combination of conditions that result in the execution ("firing") of the actions associated with that rule (If this and that is true than do this). If the system under test has complex business rules, presenting the system behaviour represented in this complete and compact form enables you to create test cases directly from the decision table.Create at least one test case for each rule. If the rule's conditions are binary, a single test for each combination is probably sufficient. On the other hand, if a condition is a range of values, consider testing at both the low and high end of the range. In this way we merge the ideas of Boundary Value testing with Decision Table testing.

Consider following psuedo-code

This exercise shows us that we need at least 18 different test case data sets to test all the logic.
You can merge this with explicit boundary checks.
Qty = 0, Qty = 9, qty= 10, qty = 11, qty = 14, qty = 15, qty = 16
Km = 0, km =49, km= 50, km = 51, km =99, km = 100, km=101
Of course combinations between them are also possible. You could even try a test with negative quantity to see if the code can handle that situation.

This all depends if you can look into the code to see how the decision tree is actually implemented. Some situations will result in the same outcome. You could decide to collapse the table. But be careful with collapsed tables. While this is an excellent idea to make the table more comprehensible it is dangerous from a testing standpoint. It is always possible that the table was collapsed incorrectly or the code was written improperly. Try to use, if possible, the un-collapsed table as the basis for your test case design.

Control flow testing

Modules of code are converted to graphs, the paths through the graphs are analyzed, and test cases are created from that analysis. This techniques requires are lot of preparation. Maybe usefull for safety -critical applications.

Depth of test specification

Some factors that come into play of determining the appropriate amount of testing effort (test case, time, depth , etc)
Cost of failure
Testability of the design
Life cycle of the application ; new or maintenance

Programming language

Sometimes the data types available in your programming language can help you reduce the number of test cases. Suppose you have a method in VB.NET that accepts an integer value that, according to the specs, must between 0 and 100. In case of the boundary values we can actually help reduce the number of test case because in VB.NET we have the possibility to use an unsigned integer data type (UInteger). It simply can not hold negative numbers so consequently we don’t have to specify a test that gives a negative number to the method.Enumerations can also help.

Design-by contract

In the design-by-contract approach, modules (called "methods" in the object-oriented paradigm, but "module" is a more generic term) are defined in terms of pre-conditions and post-conditions. Post-conditions define what a module promises to do (compute a value, open a file, print a report, update a database record, change the state of the system, etc.). Pre-conditions define what that module requires so that it can meet its post-conditions.More the code must use a mechanism to assert that these pre – and post conditions are met. Usually these assertions are placed inside the method. In this case the module is designed to accept any input. If the normal preconditions are met, the module will achieve its normal post conditions. If the normal pre-conditions are not met, the module will notify the caller by returning an error code or throwing an exception (depending on the programming language used). This notification is actually another one of the module's post conditions.Based on this approach we could define defensive testing: an approach that tests under both normal and abnormal pre-conditions. On the other hand if we trust the assertion mechanism we could go about and create test cases only for the situations in which the pre-conditions are met.
There are programming environments where the specification of pre-and post conditions are integrated directly in the language or language execution environment.
Should we create test cases for invalid input?

Difficult test conditions

Often modules have code that is executed only in exceptional circumstances—low memory, full disk, unreadable files, lost connections, Database locking, high network load etc. You may find it difficult or even impossible to simulate these circumstances and thus code that deals with these problems will remain untested. Still you would like to know how your code and more over your error handling code reacts on these situations.It is recommend to write that test case down and make the remark that the situation has not been tested.There are tools on the market that can help you simulate some of these conditions and check how your program reacts. For example DevPartner Fault Simulator 2.0. For example using DevPartner Fault Simulator you could observe how the application would handle the failures generated by a simulated network failure instead of you physically disconnecting a cable from the network,

Code coverage analysis

While the previous sections described techniques to specify test cases before or during programming, a code coverage analysis tool investigates your code during the execution of the current set of test cases (assuming you did some upfront test case design) if all code regions have been covered by the sets of test. It can point you the certain code path that haven’t been covered by the current set of test cases. It gives you the opportunity to specify new test case that will execute that part of your code. It can help you to eliminate gaps in your test suite.It is important to note that 100% code coverage does not equal to the fact that all errors are removed from your piece of code. It merely states that all code paths have been gone trough with your current test case suite. You still can miss out on errors simply because you’ve used a wrong comparisons operator ( > instead of >=) . So the previous techniques are still valuable. Code coverage gives you additional information that you can use. It just one of the methods to maximize the ability to find the errors in your code.Microsoft research is currently working on a tool that augments code coverage analysis with the creation of missing test case. Pex performs a systematic program analysis. It records detailed execution traces of existing test cases. Pex learns the program behaviour from the execution traces, and a constraint solver produces new test cases with different behaviour. The result is a test suite with maximal code coverage

Concluding remarks

Writing effective test cases is hard work and not always that easy as it seems to maximize error detection.
First of all, specifying test case requires a “destructive” mindset. You should find pleasure in finding errors. This is the right attitude to find as many errors as possible during your programming effort. So this implies that we write imperfect code. This is not always easy to accept for some people but nevertheless a good starting point to begin specifying test case data.

I summed up several techniques that can help you in finding the right set of test cases to find as many as errors as possible. The most important advice we can give you is to use the different methods in combination. Because only when you look at the problem from different angles, you can reasonably find the test cases that will filter out the bulk of the error conditions within the restrictions of budget and resources.

As you gradually become more familiar with the techniques you will not only come up with effective test cases but you will also think about testing while you’re coding. This will make you think how you can avoid an error condition in your code. Mind though you still will have to write out the test case even if you know your code can handle the situation.

Maybe while looking for test cases will give you more insight into the specifications that lead up to the code you’re investigating, making it more clear what you’re actually programming and how it fits in the whole picture. Maybe you will detect things that were omitted from the specification, either by accident or because the writer felt them to be obvious.

Continuing with this logic of thinking, Test-driven development puts testing completely on the fore-ground. You actually write test before writing any code. In that respect, testing becomes part of your (micro) design process. But that's for another post.

Sources and further reading

Test-driven development by example, Kent Beck, Addison-Wesley 2003
The Art of Software Testing, Second Edition, Glenford J. Myers ,John Wiley & Sons 2004
A Practitioner's Guide to Software Test Design, Lee Copeland ,Artech House 2004
BUG TAXONOMY AND STATISTICS, Boris Beizer , http://inet.uni2.dk/~vinter/bugtaxst.doc
Developing and using defect classification schemes, Freimut, B.,http://publica.fraunhofer.de/eprints/N-8167.pdf
Pex ; Dynamic Analysis and Test Generation for .NET, http://research.microsoft.com/Pex/
Code Coverage Analysis, Steve Cornett, http://www.bullseye.com/coverage.html
How Do You Know When You Are Done Testing? Richard Bender http://www.benderrbt.com/Bender-How%20Do%20You%20Know%20When%20You%20Are%20Done%20Testing.pdf
Don't be fooled by the coverage report, Andrew Glover ,http://www-128.ibm.com/developerworks/java/library/j-cq01316/index.html