I've been fascinated with software development since I got my first C64 thirty years ago, and I still like the emerging possibilities the technology offers us every day. I work as a developer and software architect in internal and customer projects focusing on Java and Oracle.

MagicTest - an Automated Visual Approach for Testing

12.18.2009

Automated tests have become widely accepted in software development. Numerous tools like TestNG or JUnit provide great support for writing and running them efficiently. New methods like TDD or BDD strive to integrate testing even more into the development process.

The only thing that hinders even wider acceptance is the fact that writing tests takes additional time. I do not want to dive into the philosophical discussion of whether writing tests pays off in the long run, but it remains a fact that instead of writing just the single method needed in the production code, the developer has to write at least one additional test method.

So the obvious goal for a test framework must be to reduce the overhead needed for writing and maintaining tests – and to make sure that for the additional time spent, we get the most out of it.

In this article, I will introduce a new approach to testing. It will not replace, but extend the traditional approach, so it can easily be integrated with already available tests and tools.

The new approach offers the following advantages:

  • Creating tests is easier because the expected result does not have to be included in the test source, but is checked using a visual, but automated approach

  • Maintaining tests is easier because all needed information is visible at a glance and changes can easily be made without the need to change Java code

  • Documenting tests is done automatically without additional effort in a way that can even be shown to your manager or customer

You think this sounds too good to be true? Let's have a look at a first example.

A first example

In our first example, we want to check that a simple static string concatenation method works as expected. To test this method, we write the following test method:


@Test
public static void concat() {
    StaticMethod.concat("s1", "s2");
    StaticMethod.concat(null, "s2");
    StaticMethod.concat("s1", null);
}
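
The implementation of the method under test is not shown in the article. For reference, a minimal hypothetical implementation - consistent with the error-condition example further below, where concat(null, "s2") is expected to throw an IllegalArgumentException - could look like this:

public class StaticMethod {

    // Hypothetical implementation, for illustration only
    public static String concat(String s1, String s2) {
        if (s1 == null || s2 == null) {
            throw new IllegalArgumentException("Arguments must not be null");
        }
        return s1 + s2;
    }
}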

If we run this test method using MagicTest, we get the following output:

This looks promising, but how does this work?

The visual approach

To make this work, we have to change the definition when a test is considered successful.

Currently, your test methods are made up of various assertions or calls that can throw exceptions. When you run your tests, a test is considered successful if it completed without throwing any exception or if it threw an exception that was expected.

If we want to test our string concatenation method with the traditional approach, we end up with test code like this:


assertEquals(StaticMethod.concat("s1", "s2"), "s1s2");

As we can see, we have to include the expected result in the test code so it can be compared using an assertion.

To avoid the need to include the expected result in the test code, we specify that our test methods should output the relevant data which is then compared by the test framework against a stored reference output. If the actual and the reference output differ, the test has failed.

So the extended definition is the following:

A test is considered successful if it completed without throwing any exception - or if it threw an exception that was expected - and if the actual output is equal to the stored reference output.

In fact we simply adopt the way a user would do testing without using a test framework: He would test his code visually by looking at the output of println() statements or by inspecting values interactively in the debugger.

So we can say that MagicTest automates the visual test approach.
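
Conceptually, the framework's job is to compare the collected output with a stored reference and to let the developer promote the actual output to the new reference. A minimal sketch of this idea in plain Java (not the actual MagicTest implementation; class, method, and file names are made up for illustration):

import java.nio.file.Files;
import java.nio.file.Path;

public class ReferenceCheck {

    // A test passes only if its collected output equals the stored reference output
    public static boolean outputMatches(String actualOutput, Path referenceFile) throws Exception {
        if (!Files.exists(referenceFile)) {
            return false;   // no reference output yet: the test fails
        }
        String referenceOutput = new String(Files.readAllBytes(referenceFile), "UTF-8");
        return actualOutput.equals(referenceOutput);
    }

    // Called when the developer confirms the actual output as the new reference output
    public static void saveAsReference(String actualOutput, Path referenceFile) throws Exception {
        Files.write(referenceFile, actualOutput.getBytes("UTF-8"));
    }
}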

Actual and reference output

We have already seen that when a test is run, its output is collected (referred to as the actual output).

But how do actual and reference output come into existence? Let's look at the typical life cycle:

  • A new test is run for the first time. As there is no reference output yet, the comparison of the output and hence the test will fail.

  • The developer will now examine the collected actual output.

  • If the actual output contains erroneous data, the test program must be corrected first and run again.

  • If the actual output contains what is expected, he will confirm the result and save the actual output as reference output.

  • If the test is then run again, the comparison of the output will succeed and the test is considered successful.

  • If the test is run after some changes to the method under test, the actual output may change. If the actual output changes and is therefore different from the reference output, the test is considered failed.

  • The developer must now again compare the two outputs and decide whether the new actual output should be saved as new reference output or whether the test program must be adjusted to produce the old reference output again.

As we have seen, there is a new concept of comparing actual and reference output and saving the actual output of a test as new reference output. For these steps, we will need support from our test framework.

To make this approach work, the output created by a test method must be stable, i.e. the output should not contain volatile data like the current time.
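
If the code under test does produce volatile data, it is usually easy to normalize it before it becomes part of the output. A small hypothetical helper (not part of MagicTest) could mask timestamps, for example:

import java.util.regex.Pattern;

public class OutputNormalizer {

    private static final Pattern TIMESTAMP =
            Pattern.compile("\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}");

    // Replace volatile timestamps so the output stays stable between runs
    public static String normalize(String output) {
        return TIMESTAMP.matcher(output).replaceAll("<timestamp>");
    }
}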

Implementation

It should now be clear how MagicTest works conceptually. But it may still be unclear how the simple call concat("s1", "s2") in our test program creates the necessary output for comparison.

To generate the needed output without having to manually code these statements, we instrument the byte code before the test method is executed: Each call to the concat() method is instrumented so that parameters, return value, and thrown exception are automatically traced.

Looking at the byte code level, a call to the method under test will roughly look like this:

try {
    printParameters("s1, s2");
    String result = StaticMethod.concat("s1", "s2");
    printResult(result);
} catch (Throwable t) {
    printError(t);
}

The data traced out by the print methods is then collected and visualized as an HTML report.

The fact that each call to the method under test is properly documented allows us to use conditions or loops in our tests if the need arises: it is still clear and documented what has been tested, even if the test method becomes quite complex.
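
For example, a test method might loop over a whole set of inputs; because every single call is traced, the report still documents each combination individually (a hypothetical sketch):

@Test
public static void concatMany() {
    String[] inputs = { "a", "b", "", null };
    // Every call is instrumented, so the parameters, result, or exception
    // of each iteration appears as a separate entry in the report.
    for (String s1 : inputs) {
        for (String s2 : inputs) {
            StaticMethod.concat(s1, s2);
        }
    }
}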

Testing error conditions

The pseudo byte-code has shown that each call to the method under test is surrounded by a try-catch-block. This makes testing error conditions a breeze.

Using the traditional approach, testing error conditions has always been cumbersome. Consider the following two possibilities:


@Test
public static void concatErr1() {
    try {
        StaticMethod.concat(null, "s2");
        fail("should fail");
    } catch (IllegalArgumentException e) {
        // expected result
    }
}

@Test(expectedExceptions = { IllegalArgumentException.class })
public static void concatErr2() {
    StaticMethod.concat(null, "s2");
}


Neither of them looks appealing: either you end up with a lot of boilerplate code in your test method or with a lot of test methods in your test class.

Using the visual approach, we can test error conditions like normal method calls. So most of the time, there will be no need to have more than one test method for a method under test. If you still want to test different behaviors with different test methods, you are free to do this.

The fact that exceptions are automatically caught offers us another advantage: if the execution of a single call to the method under test fails, this failure gets documented, but the execution of the rest of the test method will normally continue.

This makes correcting failed tests really fast and convenient, as we always have all relevant information at hand. With the traditional approach, execution stops after the first error, so you have to guess whether subsequent calls might fail as well. And if your guess is wrong, you have to run the test again and again.

Creating tests

Having heard about actual and reference output, it is clear that the test report shown with the first example comes from a run with already saved reference output.

If we run this test for the first time, the report will look like this:

 

As you can see, the report shows the actual output ("[act]") beside the expected reference output ("[ref]"). Because we do not yet have a reference output, these lines remain empty and the test fails.

We can now check the actual output of the whole test method at a glance and then save it as the new reference output with a single click on the save link using the Eclipse plug-in.

After having saved the new reference output, the test report will automatically be reloaded and you will see that the test is now considered successful.

Maintaining tests

The advantage of the visual approach for maintaining tests becomes even more apparent if our tests must be adapted due to changes in the method under test.

Let's assume that we have a function for building file paths. As there are many different cases to consider, we have quite a number of test cases. Then it is decided that we should use file URIs instead of local paths, so every returned file path must now start with "file://".

With the traditional approach, we must now incorporate this change in every single call in the test code.

With the visual approach, the test report will show that all test methods have failed, but it will also reveal that the failure is only due to the missing "file://" prefix. So we can adapt all test methods at once with a single click on the save link – without the need to change any test source code.
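
As a hypothetical illustration, such a test method might look like the sketch below; after the change, every line of its reference output merely gains the "file://" prefix, which the diff view in the report makes immediately obvious:

@Test
public static void buildPath() {
    // PathBuilder is a made-up class for illustration; each call used to return
    // e.g. "/tmp/a.txt" and now returns "file:///tmp/a.txt"
    PathBuilder.buildPath("/tmp", "a.txt");
    PathBuilder.buildPath("/tmp/", "b.txt");
    PathBuilder.buildPath("", "c.txt");
}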

Of course it will still happen that you must also make changes to the test code after changing the methods under test, but nevertheless maintaining tests will be much faster.

This effortlessness in handling output will motivate developers to really check all relevant data in a test. So if you have to maintain a sorted list and add a new entry, you can easily dump the whole list to check that everything is correct. Nobody would do that using assertions, as it is just too painful.

With the visual approach this becomes feasible, and MagicTest offers support for this kind of operation with formatters and returners. Your tests gain accuracy as more data is written out and compared, because it is likely that more errors will be caught.
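
A hypothetical sketch of such a formatter for a list of strings (modeled on the @Formatter annotation shown later in the comments):

import java.util.List;

// Hypothetical formatter: renders the whole list so the report
// shows its complete, sorted content after each operation
@Formatter
public static String format(List<String> list) {
    return list.toString();
}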

Testing theory says that you should test one behavior with just a single test, but this is often difficult to achieve in practice and adds additional cost when writing the tests. While this repeated testing really becomes a problem with the traditional approach, where you have to change each test method manually, the visual approach helps you make all changes easily and quickly.

Published at DZone with permission of its author, Thomas Mauch.

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Amish Gandhi replied on Fri, 2009/12/18 - 8:06am

Sounds like a good util, where can I download this from?

Artur Biesiadowski replied on Fri, 2009/12/18 - 8:10am

I understand that if you at some point decide to delete one line in the middle of the method (because it is no longer relevant, for example), everything falls apart?

Thomas Mauch replied on Fri, 2009/12/18 - 10:08am in response to: Amish Gandhi

I am currently preparing the web site with software downloads and additional documentation. I hope to be ready next week. It will be available under http://www.magicwerk.org/magictest.

 

Thomas Mauch replied on Fri, 2009/12/18 - 11:21am in response to: Artur Biesiadowski

It depends on what you mean by "everything falls apart".

If you remove a call to the method under test from the test method, the test will fail as the actual output no longer matches the reference output. However - as shown in the second screenshot - the test report will show you a diff view of the actual and the reference output, so it should be quite easy to detect that the only problem is the call that you just deleted. And to confirm that this change is correct, you just click on the "save" link for the test method - and everything is fine again.

As I tried to point out in the article, the main advantage of the visual approach is that maintaining tests after changes to the method under test - but also to the test method itself - is really fast.

Dan Wiebe replied on Sun, 2010/03/28 - 9:02am

You're talking about characterization tests--specifically, automatically-generated characterization tests.

I have some experience with automatically-generated characterization tests, although mine came from Verde rather than from MagicTest.

My observation is that they're very much better than no tests, but that they're significantly inferior to real hand-written unit tests, and very much inferior to the kind of tests that come out of test-driven development.

I have found two problems with them: one stems from automatically generating the tests, and one stems from writing the tests after writing the code.

You know how when you go to register an account on a website, they make you type your proposed password in twice?  And if the two representations are different, they pop an error and make you do it again?  They do that because if you have to say the same thing twice, and what you say the first time matches what you say the second time, the probability that you said what you meant to say is much higher than it is if you only have to say it once.

The major benefit of hand-written tests is similar to this.  For each new bit of functionality, you have to instruct the computer twice: you have to describe what you want to happen in a test, and you have to describe the same thing over again in code.  Not only do you have to say the same thing twice the way you do to create a password, you have to say it twice in different ways.  You have to add twice as much information, in other words.  If you accidentally say different things, your test will fail and you'll have to fix it.  This doesn't ensure that you've said what you meant to say the way mathematical proof does, but it increases the likelihood by a large factor.  I don't know what the factor is, but from what I've seen I suspect it has two digits.

With automatically-generated tests, you lose that factor, because you only say what you have to say once, and the computer generates tests that say the same thing automatically.  It's like having an "onblur" event handler on the first password field that automatically copies its value into the second password field for you.  If what you said is wrong, then your error gets propagated automatically and silently.  There's no additional information content to catch you.  It's said that computers are the first invention that allows people to make mistakes bigger and faster than beer and hand grenades.  You're ensuring that the code doesn't change, but you're not ensuring that the code is right.  Generally, during development, you want those assurances to be reversed.

The second problem has to do with tests that come into existence after the code they test.  The problem there is that you have no assurance that the code is actually tested, only that it's covered.

The two are different.  Code has to be covered to be tested, but it can certainly be covered without being tested.

The only way I know of to absolutely ensure that a snippet of code is tested is to observe that when it was written, A) it fixed a failing test, and B) no smaller snippet of code could possibly have fixed that test.  If your tests are green, and you write another test and they're still green, you have no proof that the new test actually tests anything; there could be some condition hiding in your code that makes the test evaluate to "assertTrue (true)," or your test might somehow not be being run, or the section of it that does the actual testing work might be being skipped.  If you don't see a test fail before it succeeds, you're setting yourself up for embarrassment.  And of course if your tests are green and you add production code and they're still green, you've pretty much proven that the new production code isn't tested.

Automatically-generated tests will always be inferior to hand-written tests for both of the reasons above; so if you're going to use them, you need some significant benefit to offset their inferiority. 

Your contention, if I understand it correctly, is that MagicTest tests are faster and easier to write than normal hand-written JUnit or TestNG tests.  I don't disagree, but it seems to me that this benefit doesn't really get you to where you want to be.  It might take me longer to hand-write JUnit tests than it takes you to produce MagicTest tests (but don't be too sure: I'm pretty good at JUnit tests by now), but my tests will ensure that my code is correct, while yours will only stick you with a maintenance problem as your development proceeds.  Based on what I've seen in the real world, I think once we're both a month or two into the project, you'll be struggling with defects in the framework you've accumulated and be adding much less business functionality than you were able to at first, while I'll be using my own confidently-correct framework to significantly accelerate the addition of business functionality.

The benefit of Verde that's intended to offset the fact that the tests it automatically generates are of low quality compared to hand-written tests is that a human doesn't have to write anything but a small set of pointcuts.  You write the pointcuts designating the parts of the application you want to test, you run the application through its paces, and Verde automatically generates thousands of JUnit tests (one for every advised method call) to characterize it.  And you only do this with legacy applications where the whole point is that the characterized functionality isn't supposed to change, rather than during development of new applications where functionality is constantly changing all the time.  The offsetting benefit is that while real developers may be able to write much better tests than Verde can write, Verde can be used to write hundreds or thousands of low-quality tests in a much shorter time than even a large team of developers could.

And still, I'd rather use hand-written tests than Verde characterization tests if there's any realistic way I can do so.

Thomas Mauch replied on Mon, 2010/03/29 - 3:24am in response to: Dan Wiebe

Dan, thanks for your interesting input.

In my opinion, however, MagicTest does not do characterization tests - even if the generated output may look similar to characterization tests.

Let me prove this:

  • MagicTest does not automatically generate tests for you - it is still the developer who has to write the tests manually
  • MagicTest does automatically catch and compare the generated test output, but it is still the developer who has to confirm that the output matches the expectation
  • With the traditional approach, the developer writes assertEquals(StaticMethod.concat("s1", "s2"), "s1s2") and then runs the test to make sure everything is ok
  • Using MagicTest, the developer writes StaticMethod.concat("s1", "s2"), then runs the test, checks whether the output is correct, and confirms the result. Note that if you run a test for the first time, it will fail because there is no reference output yet, so the developer has to make an explicit choice whether the output is correct or not.

I agree with you, however, that the chance that a wrong output is accidentally confirmed may be higher compared to the approach where you have to code the expected result in advance - but as you don't have to code it twice, it makes development faster.

You can have a look at the screencast to see the use of MagicTest for a simple example.

Dan Wiebe replied on Sat, 2010/04/03 - 10:39pm

Okay, here's a somewhat more realistic test, maintaining your theme of concatenation.

@Test
public void shouldConcatenateStatusIfRecordAlreadyExists () {
    String key = "key";
    Record record = new Record ();
    record.setKey (key);
    record.setStatus ("first");
    Dao dao = new StaticMockDao ();
    int recordId = dao.save (record);
    Service service = new DefaultService (dao);

    service.recordStatus (key, "second");

    record = dao.get (recordId);
    assertEquals ("first, second", record.getStatus ());
}

 How can MagicTest benefit me here?

Thanks.

Thomas Mauch replied on Thu, 2010/04/08 - 9:25am

Dan,

sorry for the late reply.

You're testing the DefaultService.recordStatus() method, which changes the record object referenced by the key parameter passed to the recordStatus() function.

This indirect reference makes it hard to benefit from MagicTest. Let me first show what we could do if the situation were slightly different.

If the signature were recordStatus(Record, String), you could define a formatter for the Record class like this:

@Formatter
public static String format(Record record) {
    return record.getStatus();
}

and then benefit from the automatic tracing of parameters and result by using this annotation:

@Trace(result=Trace.ALL_PARAMS1)

So MagicTest would catch the parameters record="first", status="second" and the result "first, second" and display this in the HTML report.

To make use of a formatter in your example, you would have to store the record in an instance variable so the formatter could access it as well.

The other approach would be to create the test output manually:

String status0 = "first";
String status1 = "second";
record.setStatus(status0);
service.recordStatus(key, status1);
String result = record.getStatus();

Report.printStep("recordStatus", "record=" + status0 + ", status=" + status1, result);

With both approaches, there is no real benefit from using MagicTest for your simple example.

The visual approach featured by MagicTest will however help you if you must check the behavior of the recordStatus() method in a lot of different test cases to achieve good test coverage (as is the case in the example in the screencast). If this were the case, the effort of setting up the formatter etc. would pay off.

 
