Regression tests are a collection of extensions executed to test
various parts of the system and a control command that drives those
extensions. These tests should be run after each major modification
to the system or the compiler to ensure that things known (or thought)
to work still do. Each component of the system should be accompanied
by a set of regression tests that verify its
functionality. Additionally, when a bug is fixed, a simple
test that breaks without the fix verifies that it stays fixed.
Each regression test is installed into the system as a shell command
and can be executed independently of other tests or all regression
tests can be executed together. Functionally, a regression test is a
collection of procedures (test cases) that return a BOOLEAN value
indicating whether the test case succeeded or not. The regression
control is also a shell command that allows changing the mode of
execution of regression tests. It is a statically linked extension,
hence it is always available.
Shell commands
Before you run the tests:
- load the regression tests you want to run; to load all of them, execute
        the script ~/spin/user/scripts/regress.rc
- type help to see which regression tests are installed; for each
        installed test, help prints a line of the form regress `name of
        the test' - `some information'
If you want to execute all or some of the tests:
- regress -all - to run all the currently loaded tests
- regress `test name' - to run a particular test
- regress -clean - to remove all tests, e.g., to be able to
reload them
The amount of information printed by regression tests can be regulated
- default (no flag specified) - a line is printed saying which test or
        test case is executed and whether it succeeded or not; output
        from the test extensions is suppressed.
- quiet mode (-quiet flag) - no output is printed except for error
messages, output from the test extensions is suppressed.
- verbose mode (-verbose flag) - a line is printed before and after
        each test case so you can see which tests are run; all output
        from test extensions is visible.
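Putting the commands above together, a typical session might look like
this (the spin> prompt is illustrative; the commands and flags are the
ones described above):

```
spin> script ~/spin/user/scripts/regress.rc   # load all regression tests
spin> regress -all                            # run every loaded test
spin> regress hello                           # run only the "hello" test
spin> regress -verbose hello                  # rerun it with full output
spin> regress -clean                          # uninstall all tests
```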
Interpreting results
A regression test has failed if a line >>> `test-name' => ERROR is printed.
There will also be a line that says which case failed. Run the
test in verbose mode to see exactly how the test failed.
Existing tests
- hello - simply prints "hello world" (pardy)
- dyntypes - tests dynamically loaded types (pardy)
- disp-type - tests dynamic typing of procedures done by the
dispatcher (pardy)
- disp-guardeval - verify correctness of guard and handler order
evaluation (pardy)
- disp-args - verify that arguments are passed correctly (pardy)
- disp-noinvoke - tests exception raised when no handler is
invoked (pardy)
- disp-closure - tests passing closures to guards and
handlers (pardy)
- disp-cancel - tests cancelation (pardy)
- disp-impose - tests imposed guards (pardy)
- disp-asynch - tests asynchronous event raising (pardy)
- disp-except - tests passing exceptions from handlers and guards
to the caller (pardy)
- disp-save - tests passing callee-saved registers to
handlers (pardy) (as used by Strand.Stop handlers).
- view - tests the VIEW expression (whsieh)
- bitfield - tests the fixes for bitfield bugs (pardy)
- machbasic - tests basic Mach-style allocation and
deallocation (savage)
Writing a regression test
To write a regression test you need:
- a name,
- a help string,
- a list of test cases (a procedure and a string describing it),
- start-up and clean-up code,
and, of course, an implementation of all those procedures that tests
something important.
A regression test with a name "test" is expected to be installed as a
command "regress test", and is supposed to execute all of its test
cases, clearly indicating which ones passed and which did not.
To ease writing tests and to provide uniformity, a generic
RegressionTest is provided. All you need to do is give it those
four things through the right interface. A template, regress.tmpl, takes
care of putting everything together: just put
"RegressionModule(YourTestModuleName)" in your m3makefile and you have a
regression test.
Here's an example of a simple test:
- Hello.i3
- The interface
- Hello.m3
- The implementation
- m3makefile
- The m3makefile.
- hello.conf
- The downloadable config file.
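In outline, the implementation module of such a test might look like the
sketch below. This is illustrative only: the exact procedure names and
types the RegressionTest generic expects are defined in RegressionTest.ig
and may differ from what is shown here.

```modula-3
(* Hello.m3 -- an illustrative sketch, not the actual example file. *)
MODULE Hello;
IMPORT IO;

(* One test case: a procedure that returns TRUE on success.  The
   generic pairs it with a string describing the case. *)
PROCEDURE SayHello (): BOOLEAN =
  BEGIN
    IO.Put("hello world\n");
    RETURN TRUE;
  END SayHello;

(* Start-up code, run before any test case. *)
PROCEDURE Start (): BOOLEAN =
  BEGIN
    RETURN TRUE;
  END Start;

(* Clean-up code, run after all test cases. *)
PROCEDURE End (): BOOLEAN =
  BEGIN
    RETURN TRUE;
  END End;

BEGIN
END Hello.
```

With the module in place, the m3makefile only needs the line
"RegressionModule(Hello)"; regress.tmpl instantiates the generic with the
test's name, help string, and list of cases.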
How does it work?
The RegressionTest generic takes the name of the test and the help
string and installs the test as the appropriate command. It provides the
right guards and the Run procedure, which iterates over all test cases
and prints messages. The start-up (Start) procedure is executed before
any test case and allows the test to initialize itself. The clean-up
(End) procedure is executed after all test cases and can be used, for
example, to deallocate memory, uninstall handlers, etc. All test
cases and the cleanup procedure are executed even if one of the cases
fails or raises an exception.
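The control flow described above can be sketched roughly as follows.
This is a paraphrase of the behavior, not the actual RegressionTest.mg
source; "cases" stands for the registered (procedure, description)
pairs, whatever their real representation is.

```modula-3
(* Sketch of Run's behavior: every case executes even if an earlier
   one fails or raises an exception, and End always runs afterwards. *)
PROCEDURE Run (): BOOLEAN =
  VAR ok := TRUE;
  BEGIN
    Start();                            (* initialize the test *)
    FOR i := FIRST(cases) TO LAST(cases) DO
      TRY
        IF NOT cases[i].proc() THEN ok := FALSE; END;
      EXCEPT
      ELSE
        ok := FALSE;                    (* an exception counts as failure *)
      END;
    END;
    End();                              (* clean-up runs regardless *)
    RETURN ok;
  END Run;
```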
Text-based regression tests
There is an alternative implementation of regression tests. The
primary one (described above) forces the writer of a test to check
internally whether a test has failed or not. To ease the task of
programming regression tests, text-based regressions are available.
These tests can still check internally for inconsistencies in results
and return a result that indicates success or failure. They can,
however, delegate this responsibility to the generic by providing
it with a text that is the expected output from the test. The generic
matches the actual output against the expected one and, if they
differ, reports a failure. All other options are identical with
regular regression tests.
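For instance, a text-based test might supply its expected output like
this. The names are illustrative; the real interface is defined in
TextRegressionTest.ig.

```modula-3
(* Illustrative only: the generic compares what the test prints
   against this expected text and reports failure on a mismatch. *)
CONST ExpectedOutput = "hello world\n";

PROCEDURE SayHello (): BOOLEAN =
  BEGIN
    IO.Put("hello world\n");   (* must match ExpectedOutput *)
    RETURN TRUE;
  END SayHello;
```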
Some important files
Five files are involved in implementing regression tests:
- RegressionTest.mg.
- Generic implementation of regular regressions.
- RegressionTest.ig.
- Generic interface of regular regressions.
- TextRegressionTest.mg.
- Generic implementation of text-based regressions.
- TextRegressionTest.ig.
- Generic interface of text-based regressions.
- Regress.m3.
- Regression control implementation.
- Regress.i3.
- Regression control interface.
- regress.tmpl.
- The quake directives that instantiate your generic from the
RegressionModule directive.
Conventions
Please follow these rules. They will make creating,
maintaining, and running regression tests easier.
- Always use the provided generics.
- All regression tests are placed in the spindle/regress directory.
- All tests for a given subsystem (e.g., for strands) should
        go into a subdirectory of the regress directory with that subsystem's
        name (spindle/regress/spindle in this case).
- Use the name of your test as the name of the conf file for that test.
- Provide proper Makefiles so that your test can be built using gmake
        together with the other tests (see
this file
for an example).
- Add your regression tests to the regress.rc script.
- Use the IO interface to print to the screen.
- As much as possible, a regression test should be self-contained.
        It should be possible to just load the test and run it.
- It has to be possible to run a regression test many times without
rebooting or reloading the test. Use startup and cleanup functions
to make sure that the system is in the right state.
Suggested usage
The tests can make our lives easier if we use them right. We will learn
what is "right" as we use them, but for now here are a couple of
suggestions from the author of this page. None of these rules contain
words like "always" or "never". Everyone has to judge for themselves, but it's
good to remember that sometimes people other than the programmer who
wrote a piece of code have to believe that it works.
Submitting new tests
What should be tested and how is sometimes hard to decide, but two major
guidelines are proposed:
- cover functionality -
Each major system service should be covered by a set of tests that
exercise each piece of functionality, stress test the service,
and exercise the interaction of the service with other
components of the system.
- make bugs harder to repeat -
If a bug is found it might be profitable to submit a simple
regression test that shows that the bug is removed. This,
of course, should not be done for each trivial bug but may be
considered if the bug was hard to detect or has
some "generality" to it (as in, it would be easy to make a similar
one). Another advantage is that such regression tests simplify
the lives of the mergermeisters.
Also, it is better to submit too many tests than too few. If the
number of tests ever becomes a problem, we can always prioritize them;
if a test is missing, someone may spend hours trying to figure out why
something doesn't work.
Testing
"Run them all, run them often"
- It is imperative to run the tests before submitting a tag and
during the merge after merging in a new tag.
- It is suggested to run regression tests after each major modification
to the system, libraries, or the compiler to make sure that
these changes do not break some unrelated part of the system.
- The tests may be very useful in hunting down bugs. Executing
        the tests could pinpoint where the problem is or where it is
        unlikely to be. They will be even more useful when we can
        statically load them and execute them automatically (see the
        TODO list).
Running tests
- Start by running all the regression tests in the normal mode
("regress -all").
- Repeat the tests several times; some bugs show up only if a test
        is repeated, or a test can contain a bug that prevents it
        from being run many times (there will be automatic support for
        repeating tests).
- If any problem is detected, run the offending test or tests by
hand in a verbose mode (e.g., "regress -verbose hello") to try to
find hints on the problem in the output.
- If you cannot figure out why the test fails, get in touch with the
        writer of the test, who is more likely to find the source
        of the problem.
- Running the tests in the quiet mode is not recommended unless
they have been run successfully in the normal mode before.
        The reason to avoid quiet mode is that if, for some reason, the
        regression tests were broken, one could not see whether they
        actually ran anything.
- For the same reason, it might be good to run at least some tests
in the verbose mode to see that they actually get executed.
ToDo
I am still working on the regression test control to provide more flexible
execution of tests and easier interpretation of results.
- More tests (Uncle Pardy wants YOU to write them!),
- Enable and use static extensions. Some tests should be available
even if parts of the system (e.g., tftp) do not work so we should
be able to build a kernel with some extensions statically linked in,
- Automatic running of the tests even if some fundamental components of
        the system (e.g., the shell or threads) do not work. This requires that
we can statically link spindles. An example of when this would be
useful is if the dispatcher or memory system made threads
unusable. In this case, the dispatcher tests should be run before
the threads are started, for example, first thing after run-time
initialization.
- Testing options (e.g., number of times tests should be repeated),
- Uninstalling separate tests,
- Running separate test cases,
- Parameterizing tests (???),
- Priorities (???).