csc 510-001, (1877)
fall 2024, software engineering
Tim Menzies, timm@ieee.org, com sci, nc state


home :: syllabus :: corpus :: groups :: moodle :: license

Testing

Concepts to know

Quotes

Why test?

V-Diagram

Fault: as an incorrect step, process, or data definition in a program.

Test-drive development

Large Test Suites: Test Case Prioriization

Regression testing:

But for very large test suites, cannot retest everything. What to do? Test selectively:

Black-box testing

Smart black box testing: Metamorphic Testing

Smart black box testing: Diversity Sampling

(ipo '(2 2 2 7 10)) ; ==>
((2 2 1 1 1) ; e.g. (true true true and first value of rest)
 (2 1 2 2 2) (1 2 2 3 3) (1 1 1 4 4)
 (2 2 2 7 5) (2 2 2 6 6) (2 2 2 5 7)
 (2 2 2 4 8) (1 1 2 1 9) (1 1 1 7 10)
 (1 1 1 6 5) (1 1 1 5 6) (2 1 1 3 3)
 (1 2 1 2 2) (2 2 1 7 9) (1 1 1 7 8)
 (1 1 1 7 7) (0 0 0 7 6)  ; note "0" means "don't care"
 (2 2 2 7 4) (0 0 0 7 3) (0 0 0 7 2)
 (1 1 2 7 1) (2 2 2 6 10) (0 0 0 6 9)
 (0 0 0 6 8) (0 0 0 6 7) (0 0 0 6 4)
 (0 0 0 6 3) (0 0 0 6 2) (0 0 0 6 1)
 (0 0 0 5 10) (0 0 0 5 9) (0 0 0 5 8)
 (0 0 0 5 5) (0 0 0 5 4) (0 0 0 5 3)
 (0 0 0 5 2) (0 0 0 5 1) (0 0 0 4 10)
 (0 0 0 4 9) (0 0 0 4 7) (0 0 0 4 6)
 (0 0 0 4 5) (0 0 0 4 3) (0 0 0 4 2)
 (0 0 0 4 1) (0 0 0 3 10) (0 0 0 3 9)
 (0 0 0 3 8) (0 0 0 3 7) (0 0 0 3 6)
 (0 0 0 3 5) (0 0 0 3 4) (0 0 0 3 2)
 (0 0 0 3 1) (0 0 0 2 10) (0 0 0 2 9)
 (0 0 0 2 8) (0 0 0 2 7) (0 0 0 2 6)
 (0 0 0 2 5) (0 0 0 2 4) (0 0 0 2 3)
 (0 0 0 2 1) (0 0 0 1 10) (0 0 0 1 8)
 (0 0 0 1 7) (0 0 0 1 6) (0 0 0 1 5)
 (0 0 0 1 4) (0 0 0 1 3) (0 0 0 1 2))

Smart black box testing: Fault Localization

For more on test failure localization, , see Jones JA, Harrold MJ, Stasko J. Visualization of test information to assist fault localization. Proceedings of the 24th International Conference on Software Engineering. ACM, Orlando, Florida, 2002; 467–477.

Smart black box testing: Doodling

Smart black box testing: Fuzzing

US_PHONE_GRAMMAR = {
     "<start>": ["<phone-number>"],
     "<phone-number>": ["(<area>)<exchange>-<line>"],
     "<area>": ["<lead-digit><digit><digit>"],
     "<exchange>": ["<lead-digit><digit><digit>"],
     "<line>": ["<digit><digit><digit><digit>"],
     "<lead-digit>": ["2", "3", "4", "5", "6", "7", "8", "9"],
     "<digit>": ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
}

[simple_grammar_fuzzer(US_PHONE_GRAMMAR) for i in range(5)]
['(692)449-5179',
 '(519)230-7422',
 '(613)761-0853',
 '(979)881-3858',
 '(810)914-5475']
def mutate(s):
    """Return s with a random mutation applied"""
    mutators = [
        delete_random_character,
        insert_random_character,
        flip_random_character
    ]
    mutator = random.choice(mutators)
    # print(mutator)
    return mutator(s)

for i in range(10):
    print(repr(mutate("A quick brown fox")))


'A qzuick brown fox'
' quick brown fox'
'A quick Brown fox'
'A qMuick brown fox'
'A qu_ick brown fox'
'A quick bXrown fox'
'A quick brown fx'
'A quick!brown fox'
'A! quick brown fox'
'A quick brownfox'

Whitebox Testing

White box: we can open up the code and look inside:

Symbolic execution: - Find the abstract syntax tree of the code - e.g. python3’s ctree package

import ctree


def f(a):
    for x in range(10):
        a[x] += x


tree1 = ctree.get_ast(f)
ctree.ipython_show_ast(tree1)

Applications of symbolic execution:

Formal methods

Express English requirements as check-able logic, then use logic to reason about it

Other examples:

(From One-Click Formal Methods:

Why not widely used?

Recent experience at Amazon:

“Testing” and Product Lines and Formal Methods

A feature model is a “product line”; i.e. a description of a space of products.

Question: what are the different products we can pull from the following?

Now that was a small feature model. Suppose we are talking about something really big like a formal model of the LINUX kernel with 4000 variables and 300,000 contrast. Q: How to reason over that space? A: use a theorem prover. e.g. Pycosat.

The following example comes from the excellent documentation at the Python Picostat Github page

Let us consider the following clauses, represented using the DIMACS cnf <http://en.wikipedia.org/wiki/Conjunctive_normal_form>_ format::

    p cnf 5 3
    1 -5 4 0
    -1 5 3 4 0
    -3 -4 0

Here, we have 5 variables and 3 clauses, the first clause being

x1 or not x5 or x4

Note that the variable x2` is not used in any of the clauses, which means that for each solution with x2 = True, we must also have a solution with x2 = False. In Python, each clause is most conveniently represented as a list of integers. Naturally, it makes sense to represent each solution also as a list of integers, where the sign corresponds to the Boolean value (+ for True and - for False) and the absolute value corresponds to i-th variable::

    >>> import pycosat
    >>> cnf = [[1, -5, 4], [-1, 5, 3, 4], [-3, -4]]
    >>> pycosat.solve(cnf)
    [1, -2, -3, -4, 5]

This solution translates to: x1=x5=True, x2=x3=x4=False

To find all solutions, use itersolve::

    >>> for sol in pycosat.itersolve(cnf):
    ...     print sol
    ...
    [1, -2, -3, -4, 5]
    [1, -2, -3, 4, -5]
    [1, -2, -3, 4, 5]
    ...
    >>> len(list(pycosat.itersolve(cnf)))
    18

In this example, there are a total of 18 possible solutions, which had to be an even number because x2 was left unspecified in the clauses.

The fact that itersolve returns an iterator, makes it very elegant and efficient for many types of operations. For example, using the itertools module from the standard library, here is how one would construct a list of (up to) 3 solutions::

    >>> import itertools
    >>> list(itertools.islice(pycosat.itersolve(cnf), 3))
    [[1, -2, -3, -4, 5],
     [1, -2, -3, 4, -5],
     [1, -2, -3, 4, 5]]

Example

Feature Models and Product Lines: Software installation as a formal methods problem

Lets represent software dependencies in a logical framework:

If we run Picosat over these formulae then:

Variants:

Important note: in practice, except for trivally small problems, no one writes DIMACS manually.