CSCE 431 Lecture 16
Thursday, March 20, 2014
Testing
Untested systems will not work
Why?
- Requirements not correct (customer's fault)
- Misunderstood requirements (programmer's fault)
- Coding errors (programmer's fault)
- Miscommunication (everyone's fault)
"Program testing can be used to show the presence of bugs, but never to show their absence!" (Edsger W. Dijkstra, 1970)
It's impractical or impossible to exhaustively test all possible executions of a program.
Choose tests wisely
Increasing System Reliability
Fault avoidance:
- Detect faults statically, without relying on executing the system or its models
- includes
- development methodologies
- configuration management
- verification
Fault detection:
- Debugging, testing
- Controlled (and uncontrolled) experiments during development process to identify erroneous states and their underlying faults before system release
Fault tolerance
- Assume that the system can be released with faults, and that the resulting failures can be dealt with
- For example, redundant subsystems with majority wins (multiple implementations perform the same task, and a moderator service compares their answers for discrepancies; see the sketch below)
- For extreme approach, see Martin Rinard: Acceptability-Oriented Computing, Failure-Oblivious Computing
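A minimal Java sketch of the majority-wins idea (all class and method names hypothetical): redundant implementations compute the same result, and a moderator compares their answers and reports discrepancies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical moderator: runs redundant implementations of the same
// computation and returns the majority answer, flagging discrepancies.
public class Moderator {
    private final List<IntUnaryOperator> implementations;

    public Moderator(List<IntUnaryOperator> implementations) {
        this.implementations = implementations;
    }

    public int vote(int input) {
        int[] answers = implementations.stream()
                .mapToInt(impl -> impl.applyAsInt(input))
                .toArray();
        // Majority wins: pick the answer that occurs most often.
        int best = answers[0], bestCount = 0;
        for (int candidate : answers) {
            int count = 0;
            for (int a : answers) if (a == candidate) count++;
            if (count > bestCount) { best = candidate; bestCount = count; }
        }
        if (bestCount < answers.length) {
            System.err.println("Discrepancy for input " + input
                    + ": " + Arrays.toString(answers));
        }
        return best;
    }
}
```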
Avoidance and Detection
Static Analysis
- hand-execution (read source code)
- walk-through (informal presentation to others)
- code inspection (formal presentation to others)
- automated "linting" tools can check for syntactic errors, semantic errors, and departure from coding standards
Dynamic Analysis
- Black-box testing (test input/output behavior)
- White-box testing (test internal logic of subsystem or class)
- Data-structure-based testing (data types determine test cases)
Terminology
- test component
- part of system isolated for testing
- test case
- set of inputs and expected results that exercises a test component (with the purpose of causing failures or detecting faults)
- edge cases are good examples of test cases
- test stub
- partial implementation of a component on which a test component depends
- e.g., returning a hard-coded value
- test driver
- partial implementation of a component that depends on a test component (stubs and drivers are sketched below)
- fault
- design or coding mistake that may cause abnormal behavior
- design mistakes arise when the requirements were misunderstood
- erroneous state
- manifestation of a fault during execution. Caused by one or more faults and can lead to a failure
- failure
- deviation between observed and specified behavior
When the exact meaning is not important, faults, failures, and erroneous states are collectively called "bugs".
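A minimal Java sketch of the stub and driver terminology (all names hypothetical): `WeatherReport` is the test component; it depends on `Sensor`, which is replaced by a stub returning a hard-coded value; the driver exercises the test component and checks the result.

```java
// Interface the test component depends on.
interface Sensor {
    double readCelsius();
}

// Test component: the part of the system isolated for testing.
class WeatherReport {
    private final Sensor sensor;
    WeatherReport(Sensor sensor) { this.sensor = sensor; }
    String summary() { return sensor.readCelsius() > 30.0 ? "hot" : "mild"; }
}

// Test stub: partial implementation of a component the test
// component depends on; simply returns a hard-coded value.
class SensorStub implements Sensor {
    public double readCelsius() { return 35.0; }
}

// Test driver: partial implementation of a component that depends
// on the test component; feeds it input and checks the output.
public class WeatherReportDriver {
    public static void main(String[] args) {
        WeatherReport report = new WeatherReport(new SensorStub());
        String result = report.summary();
        System.out.println("hot".equals(result) ? "PASS" : "FAIL: " + result);
    }
}
```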
Categories of Testing
Unit Testing
Goal: verify that a component or subsystem is correctly implemented and carries out the intended functionality
individual test components are tested in isolation
choose input data based on knowing the source code (white-box)
- well within acceptable input range
- well outside acceptable input range
- at or near the boundary (see the sketch below)
Usually performed by programmer developing the module
Purchased components should be unit-tested too
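A minimal JUnit 4 sketch of white-box value selection for a hypothetical validator that accepts scores 0–100, with values well inside, well outside, and at or near the boundary:

```java
import org.junit.Test;
import static org.junit.Assert.*;

// Hypothetical component under test, inlined here: accepts scores 0-100.
public class ScoreValidatorTest {
    static boolean isValid(int score) {
        return score >= 0 && score <= 100;
    }

    @Test public void wellWithinRange()  { assertTrue(isValid(50));    }
    @Test public void wellOutsideRange() { assertFalse(isValid(1000)); }
    @Test public void atLowerBoundary()  { assertTrue(isValid(0));     }
    @Test public void justBelowLower()   { assertFalse(isValid(-1));   }
    @Test public void atUpperBoundary()  { assertTrue(isValid(100));   }
    @Test public void justAboveUpper()   { assertFalse(isValid(101));  }
}
```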
Integration Testing
Goal: test interfaces between subsystems
Run collections of subsystems / components together
This should eventually encompass the entire system
Usually carried out by developers
Integration testing can start early:
- stubs can be placeholders for components that have not yet been implemented (cf. the stub sketch under Terminology)
- fits with agile development ethos
System testing
Goal: Determine if system meets its requirements (functional and non-functional)
Entire system is tested: hardware and software working together (black-box)
Robust testing: the discipline of selecting test cases to maximize coverage
Carried out by developers, though often by a separate, dedicated testing group.
Reliability Testing
Run with same data repeatedly
Helps find
- timing problems
- undesired consequences of changes (regression testing)
Fully automated test suites are used to run regression tests repeatedly
Stress Testing
Determines how much load the system can handle and how it performs under (a driver sketch follows below):
- more than maximum anticipated loads
- no load at all
- load fluctuating from very high to very low
Exceptional situations:
- longer than anticipated run times
- loss of hardware devices (disk errors, sensor failure)
- exceeding physical resource limits (memory, disk space)
- Backup/Restore
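A minimal Java sketch of a load driver (all names hypothetical): submit far more concurrent requests than the anticipated maximum and count failures.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stress driver: hammers a service with ten times the
// anticipated maximum load and reports how many requests failed.
public class StressDriver {
    public static void main(String[] args) throws InterruptedException {
        final int anticipatedMax = 100;
        final int stressLoad = anticipatedMax * 10;  // well past the limit
        AtomicInteger failures = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(64);
        for (int i = 0; i < stressLoad; i++) {
            pool.submit(() -> {
                try {
                    handleRequest();                 // stand-in for the service call
                } catch (RuntimeException e) {
                    failures.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("failures under stress: " + failures.get());
    }

    static void handleRequest() { /* the real system call would go here */ }
}
```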
Acceptance Testing
Goal: Enable the customer to decide whether to accept a product
Users evaluate the system delivered by developers
Carried out by or with client
All about validation; all other tests are concerned with verification.
Verification vs. Validation
- validation
- "Are you building the right thing?"
- Code could be bug-free but fail validation
- verification
- "Are you building it right?"
Categorization by Intent
Regression testing
- retest previously tested elements after changes
- Goal is to assess whether changes have re-introduced faults
Mutation testing
- Introduce faults to assess test quality.
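A hand-made illustration of mutation testing in Java (real tools generate mutants automatically): introduce a small fault and check whether the test suite kills the mutant.

```java
public class MutationExample {
    // Original: correct boundary check for a 0-100 score.
    static boolean isValid(int score) { return score >= 0 && score <= 100; }

    // Mutant: '<=' mutated to '<'.
    static boolean isValidMutant(int score) { return score >= 0 && score < 100; }

    public static void main(String[] args) {
        // An interior value cannot tell the two apart: the mutant survives,
        // exposing a gap in a suite that only uses such values.
        System.out.println(isValid(50) == isValidMutant(50));   // true: survives
        // A boundary value kills the mutant, so a suite with boundary
        // cases is of higher quality.
        System.out.println(isValid(100) == isValidMutant(100)); // false: killed
    }
}
```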
Categorization by Process Phase
- Unit testing driven by implementation
- Integration and System testing driven by subsystem integration
- Acceptance testing driven by deployment
- Regression testing driven by maintenance
Partition Testing
Cannot test all possible input data
For each test, partition the input data into equivalence classes such that, within each class, either
- the test fails for all elements of the class, or
- the test succeeds for all elements of the class
In theory, one input from each equivalence class would then suffice, but such clean partitions are rare in practice. Typical choices (illustrated in the sketch below):
- Clearly good values
- Clearly bad values
- Values just within boundary
- Values just outside boundary
Applicable to all levels of testing as a black-box strategy (based only on input space, not implementation)
No rigorous basis for measuring effectiveness
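A minimal black-box sketch in JUnit 4 (component hypothetical): a withdraw check accepting amounts in (0, balance] has three equivalence classes (non-positive: reject; within balance: accept; above balance: reject); test one representative per class plus the boundary values.

```java
import org.junit.Test;
import static org.junit.Assert.*;

// Hypothetical component, inlined: accepts withdrawal amounts in (0, balance].
public class WithdrawPartitionTest {
    static boolean withdraw(int balance, int amount) {
        return amount > 0 && amount <= balance;
    }

    @Test public void clearlyBadNonPositive() { assertFalse(withdraw(100, -5));  }
    @Test public void clearlyGoodWithin()     { assertTrue(withdraw(100, 40));   }
    @Test public void clearlyBadAbove()       { assertFalse(withdraw(100, 500)); }
    @Test public void justWithinBoundary()    { assertTrue(withdraw(100, 100));  }
    @Test public void justOutsideBoundary()   { assertFalse(withdraw(100, 101)); }
}
```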
Choosing Values
- each choice (EC)
- for every equivalence class, at least one test case must use a value from that class
- all combinations (AC)
- for every combination of equivalence classes, at least one test case must use a set of values from that combination
- more extensive, but might be unrealistic (e.g. testing a compiler)
Example
Date-related program:
- Month has 28, 29, 30, or 31 days
- Year could be leap, standard non-leap, special non-leap (every 100 years), special leap (every 400 years)
- month-to-month transition (28th day + 5 days)
- year-to-year transition (December 31st + 1 day)
- time zone / date line location changes
Under the AC strategy, some combinations do not make sense, and some cases overlap: testing the Dec 31 to Jan 1 transition covers both the month-to-month and year-to-year classes.
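A sketch of the EC-versus-AC trade-off for the date example (enum and class names hypothetical): EC needs only one case per class, while AC enumerates every month-length/year-type combination and drops the impossible ones.

```java
import java.util.ArrayList;
import java.util.List;

// EC vs. AC test-case selection for the month-length / year-type classes.
public class DateTestCases {
    enum MonthLength { DAYS_28, DAYS_29, DAYS_30, DAYS_31 }
    enum YearType { LEAP, NON_LEAP, SPECIAL_NON_LEAP, SPECIAL_LEAP }

    public static void main(String[] args) {
        // EC: one case per class -> max(4, 4) = 4 test cases suffice.
        // AC: up to 4 * 4 = 16 combinations, minus the impossible ones
        // (a 29-day February occurs only in leap years, a 28-day one
        // only in non-leap years).
        List<String> acCases = new ArrayList<>();
        for (MonthLength m : MonthLength.values())
            for (YearType y : YearType.values()) {
                boolean leap = (y == YearType.LEAP || y == YearType.SPECIAL_LEAP);
                boolean impossible = (m == MonthLength.DAYS_29 && !leap)
                                  || (m == MonthLength.DAYS_28 && leap);
                if (!impossible) acCases.add(m + " x " + y);
            }
        acCases.forEach(System.out::println);  // the AC test matrix
    }
}
```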
Test Automation
Testing is time-consuming
Testing should be automated as much as possible. At a minimum, regression tests should be run repeatedly and automatically.
Many tools exist to help (e.g. XUnit; see the runner sketch below)
- Generation: difficult to automate
  - selection of test data
  - test driver code
- Execution (run test code; recover from failed test cases): easy to automate
- Evaluation (classify pass / no pass)
- Test quality estimation (coverage; feedback to data generator)
- Management (save all tests for regression testing)
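A minimal sketch of automated execution and evaluation with JUnit 4's `JUnitCore`, reusing the hypothetical test classes from the sketches above: a failing test case does not stop the run, and results are classified pass/no pass at the end.

```java
import org.junit.runner.JUnitCore;
import org.junit.runner.Result;
import org.junit.runner.notification.Failure;

// Hypothetical regression runner: executes saved test classes
// programmatically and classifies the results.
public class RegressionRunner {
    public static void main(String[] args) {
        Result result = JUnitCore.runClasses(ScoreValidatorTest.class,
                                             WithdrawPartitionTest.class);
        for (Failure f : result.getFailures())
            System.out.println("FAIL: " + f);
        System.out.println(result.wasSuccessful()
                ? "all tests pass"
                : result.getFailureCount() + " failures");
    }
}
```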
What To Unit Test?
- Mission Critical: test or die
- Complex systems: test or suffer
- non-trivial stuff: test or waste time
- everything trivial: testing is a waste of time
Coverage Metrics
Critical view: Java's getters and setters are usually trivial, but leaving them untested yields a low code coverage metric (e.g. < 50%) even when the non-trivial code is well tested.
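For instance, an untested accessor pair like the following still counts against statement coverage, even though a test for it would exercise no real logic:

```java
// Trivial getter/setter: left untested, these lines still appear
// in the coverage denominator and drag the metric down.
public class Point {
    private int x;
    public int getX() { return x; }
    public void setX(int x) { this.x = x; }
}
```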