CSCE 431 Lecture 16


Thursday, March 20, 2014


Testing

Untested systems will not work

Why?

  • Requirements not correct (customer's fault)
  • Misunderstood requirements (programmer's fault)
  • Coding errors (programmer's fault)
  • Miscommunication (everyone's fault)


"Program testing can be used to show the presence of bugs, but never to show their absence!" (Edsger W. Dijkstra, 1970)

It's impractical or impossible to exhaustively test all possible executions of a program.

Choose tests wisely

Increasing System Reliability

Fault avoidance:

  • Detect faults statically, without relying on the execution of any system models
  • Includes:
    • development methodologies
    • configuration management
    • verification

Fault detection:

  • Debugging, testing
  • Controlled (and uncontrolled) experiments during development process to identify erroneous states and their underlying faults before system release


Fault tolerance

  • Assume that the system can be released with faults, and that the resulting failures can be dealt with
  • For example, redundant subsystems with majority voting (multiple implementations perform the same task, and a moderator service compares their answers for discrepancies)
  • For extreme approach, see Martin Rinard: Acceptability-Oriented Computing, Failure-Oblivious Computing


Avoidance and Detection

Static Analysis

  • hand-execution (read source code)
  • walk-through (informal presentation to others)
  • code inspection (formal presentation to others)
  • automated "linting" tools can check for syntactic errors, semantic errors, and departures from coding standards

Dynamic Analysis

  • Black-box testing (test input/output behavior)
  • White-box testing (test internal logic of subsystem or class)

Data-structure-based testing (data types determine test cases)

Terminology

test component
part of system isolated for testing
test case
set of inputs and expected results that exercises a test component (with the purpose of causing failures or detecting faults)
edge cases are good examples of test cases
test stub
partial implementation of a component on which a test component depends
returning a hard-coded value
test driver
partial implementation of a component that depends on a test component
fault
design or coding mistake that may cause abnormal behavior
design mistakes arise when, for example, the requirements were misunderstood
erroneous state
manifestation of a fault during execution. Caused by one or more faults and can lead to a failure
failure
deviation between observed and specified behavior

When the exact meaning is not important, "faults", "failures", and "erroneous states" are all called "bugs".
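
To make the distinction concrete, here is a minimal sketch (the Stats class and its off-by-one bug are invented for illustration):

  // A FAULT (coding mistake), the ERRONEOUS STATE it causes, and the resulting FAILURE.
  public class Stats {
      // FAULT: the loop starts at 1, so values[0] is never added.
      public static double average(int[] values) {
          int sum = 0;
          for (int i = 1; i < values.length; i++) {  // should start at i = 0
              sum += values[i];                      // ERRONEOUS STATE: sum is too small
          }
          return (double) sum / values.length;
      }

      public static void main(String[] args) {
          // FAILURE: observed output deviates from the specified behavior.
          System.out.println(average(new int[]{1, 2, 3}));  // prints 1.666..., expected 2.0
      }
  }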


Categories of Testing

Unit Testing

Goal: verify that a component or subsystem is correctly implemented and carries out the intended functionality

individual test components are tested in isolation

choose input data based on knowledge of the source code (white-box); a sketch follows the list:

  • well within acceptable input range
  • well outside acceptable input range
  • at or near the boundary
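
A minimal JUnit 4 sketch of these three categories (the grading function and its 0..100 range with a passing boundary at 60 are invented for illustration):

  import static org.junit.Assert.*;
  import org.junit.Test;

  public class GraderTest {
      // Hypothetical component under test: valid scores are 0..100, passing starts at 60.
      static boolean isPassing(int score) {
          if (score < 0 || score > 100) throw new IllegalArgumentException("score out of range");
          return score >= 60;
      }

      @Test
      public void wellWithinAcceptableRange() { assertTrue(isPassing(85)); }

      @Test(expected = IllegalArgumentException.class)
      public void wellOutsideAcceptableRange() { isPassing(1000); }

      @Test
      public void atOrNearTheBoundary() {
          assertFalse(isPassing(59));  // just below the boundary
          assertTrue(isPassing(60));   // at the boundary
          assertTrue(isPassing(61));   // just above the boundary
      }
  }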

Usually performed by the programmer developing the module

Purchased components should be unit-tested too

Integration Testing

Goal: test interfaces between subsystems

Run collections of subsystems / components together

This should eventually encompass the entire system

Usually carried out by developers

Integration testing can start early:

  • stubs can serve as placeholders for components that have not yet been implemented
  • fits with the agile development ethos (see the sketch below)
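
For instance (all names invented for illustration), a hard-coded stub can stand in for a subsystem that does not exist yet, so the interface between the two subsystems can be exercised early:

  // Assumed interface between two subsystems.
  interface InventoryService {
      boolean inStock(String sku);
  }

  // Test stub: placeholder for the real InventoryService, which is not yet implemented.
  class InventoryStub implements InventoryService {
      public boolean inStock(String sku) { return true; }  // hard-coded answer
  }

  // Component under test; it depends only on the InventoryService interface.
  class OrderService {
      private final InventoryService inventory;
      OrderService(InventoryService inventory) { this.inventory = inventory; }
      boolean place(String sku) { return inventory.inStock(sku); }
  }

  public class OrderIntegrationTest {
      @org.junit.Test
      public void orderSucceedsWhenItemIsInStock() {
          OrderService orders = new OrderService(new InventoryStub());
          org.junit.Assert.assertTrue(orders.place("SKU-1"));
      }
  }

When the real InventoryService is implemented, the same test can run against it unchanged.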


System testing

Goal: Determine if system meets its requirements (functional and non-functional)

Entire system is tested: hardware and software working together (black-box)


Robust testing: the science of selecting test cases to maximize coverage

Carried out by developers, but often by a separate, dedicated testing group.


Reliability Testing

Run the system with the same data repeatedly

Helps find

  • timing problems
  • undesired consequences of changes (regression testing)

Fully automated test suites are used to run regression tests repeatedly


Stress Testing

determines how much load the system can handle and how it performs under

  • more than maximum anticipated loads
  • no load at all
  • load fluctuating from very high to very low

exceptional situations

  • longer than anticipated run times
  • loss of hardware devices (disk errors, sensor failures)
  • exceeding physical resource limits (memory, disk space)
  • Backup/Restore


Acceptance Testing

Goal: Enable the customer to decide whether to accept a product

Users evaluate the system delivered by the developers

Carried out by or with client

Acceptance testing is all about validation; all other tests are concerned with verification.


Verification vs. Validation

validation
"Are you building the right thing?"
Code could be bug-free but fail validation
verification
"Are you building it right?"


Categorization by Intent

Regression testing

  • retest previously tested element after changes
  • Goal is to assess whether changes have re-introduced faults

Mutation testing

  • Introduce faults to assess test quality.
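
A minimal sketch of the idea (the max function and the seeded fault are invented; real mutation-testing tools generate such mutants automatically):

  public class MutationSketch {
      static int max(int a, int b) { return a > b ? a : b; }     // original code
      static int mutant(int a, int b) { return a < b ? a : b; }  // seeded fault: > flipped to <

      @org.junit.Test
      public void maxOfDistinctValues() {
          // Passes on the original; rerun against the mutant it fails (the mutant
          // returns 2), so this test "kills" the mutant. A suite that kills few
          // mutants is probably too weak to catch real faults either.
          org.junit.Assert.assertEquals(3, max(3, 2));
      }
  }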


Categorization by Process Phase

  • Unit testing driven by implementation
  • Integration and System testing driven by subsystem integration
  • Acceptance testing driven by deployment
  • Regression testing driven by maintenance

Partition Testing

Cannot test all possible input data

For each test, partition the input data into equivalence classes such that the test either

  • fails for all elements of an equivalence class, or
  • succeeds for all elements of an equivalence class

In theory, this means that only one input from each equivalence class suffices, but this is rarely the case, so choose (sketch below):

  • Clearly good values
  • Clearly bad values
  • Values just within boundary
  • Values just outside boundary
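
A sketch with one representative per category (the withdrawal rule of 1..500 is invented for illustration):

  import static org.junit.Assert.*;
  import org.junit.Test;

  public class WithdrawPartitionTest {
      // Hypothetical spec: withdrawals of 1..500 are accepted; everything else is rejected.
      static boolean withdraw(int amount) { return amount >= 1 && amount <= 500; }

      @Test public void clearlyGoodValue()      { assertTrue(withdraw(250)); }
      @Test public void clearlyBadValue()       { assertFalse(withdraw(-40)); }
      @Test public void justWithinBoundaries()  { assertTrue(withdraw(1));
                                                  assertTrue(withdraw(500)); }
      @Test public void justOutsideBoundaries() { assertFalse(withdraw(0));
                                                  assertFalse(withdraw(501)); }
  }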


Applicable to all levels of testing as a black-box strategy (based only on input space, not implementation)

No rigorous basis for measuring effectiveness


Choosing Values

each choice (EC)
for every equivalence class, at least one test case must use a value from that class
all combinations (AC)
for every combination of equivalence classes, at least one test case must use a set of values from that combination
more extensive, but might be unrealistic (e.g. testing a compiler)


Example

Date-related program:

  • Month has 28, 29, 30, or 31 days
  • Year could be leap, standard non-leap, special non-leap (every 100 years), special leap (every 400 years)
  • month-to-month transition (28th day + 5 days)
  • year-to-year transition (December 31st + 1 day)
  • time zone / date line location changes

With the AC strategy, some combinations do not make sense, and others collapse into one test: the Dec 31 to Jan 1 case tests both the month-to-month and the year-to-year transition.
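
These transition and leap-year classes map directly to test inputs; a sketch using java.time (the expected dates follow from the calendar rules above):

  import java.time.LocalDate;

  public class DateTransitions {
      public static void main(String[] args) {
          // month-to-month transition: 28th day + 5 days
          System.out.println(LocalDate.of(2014, 3, 28).plusDays(5));   // 2014-04-02

          // year-to-year transition: December 31st + 1 day
          System.out.println(LocalDate.of(2014, 12, 31).plusDays(1));  // 2015-01-01

          // special leap year (every 400 years) vs. special non-leap year (every 100 years)
          System.out.println(LocalDate.of(2000, 2, 28).plusDays(1));   // 2000-02-29
          System.out.println(LocalDate.of(1900, 2, 28).plusDays(1));   // 1900-03-01
      }
  }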


Test Automation

Testing is time-consuming

It should be automated as much as possible. At a minimum, regression tests should be run repeatedly and automatically.

Many tools exist to help (e.g. XUnit)

  1. Generation (difficult to automate)
    • selection of test data
    • test driver code
  2. Execution (easy to automate): run test code; recover from failed test cases
  3. Evaluation: classify pass / no pass
  4. Test quality estimation: coverage; feedback to data generator
  5. Management: save all tests for regression testing
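
A sketch of automating execution and evaluation with JUnit 4 (GraderTest is the hypothetical test class from the unit-testing section above):

  import org.junit.runner.JUnitCore;
  import org.junit.runner.Result;
  import org.junit.runner.notification.Failure;

  public class RunAllTests {
      public static void main(String[] args) {
          // Execution: run the test class programmatically.
          Result result = JUnitCore.runClasses(GraderTest.class);
          // Evaluation: classify pass / no pass and report failures.
          for (Failure failure : result.getFailures()) {
              System.out.println(failure);
          }
          System.out.println(result.wasSuccessful() ? "PASS" : "FAIL");
      }
  }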


What To Unit Test?

  • Mission critical: test or die
  • Complex systems: test or suffer
  • Non-trivial stuff: test or waste time
  • Everything trivial: testing is a waste of time


Coverage Metrics

Critical view: Java's getters and setters are usually trivial and not worth testing, but leaving them untested results in a low code-coverage metric (e.g. < 50%)