Asynchronous Testing of Swing Applications

03.02.2009

Fabrizio Giudici is a Senior Java Architect with long Java experience in the industrial field. He runs Tidalwave, his own consultancy company, and has contributed to Java success stories in a number of fields, including Formula One. Fabrizio often appears as a speaker at international Java conferences such as JavaOne and Devoxx and is a member of JUG Milano and the NetBeans Dream Team.
I'm going to tell you about some experiences I had with specific issues in testing Swing applications (they arose during blueMarine development, but this article focuses on Swing, not on the NetBeans Platform). I'd like to hear your opinion about a solution I've found and, before I write more code around it, whether you know of an existing framework that works in a similar way.

As many have said many times, testing is one of the most important activities for the success of a project. This is probably even more true for desktop applications than for regular web applications, since they have more sophisticated behaviours and richer interactivity with the user. Indeed, many Web 2.0 applications are getting rich in interactivity too, but most of them don't have sophisticated asynchronous models, either because of limitations of the technology or because they are kept simple on purpose. After all, the web is still modelled on network transactions, even when you're using AJAX; this means that you still have plenty of probing points for integration testing in the communication channel between the client and the server.

Consider instead a Swing-based desktop application with the following behaviour:

  1. you have a file system explorer and you select folders
  2. upon each selection, the application scans the folders (potentially recursively), searching for the media they contain
  3. when the scan has completed, some thumbnails appear in a viewer.

Of course, this must be done with background threads so the application stays responsive; that is, the user is able to keep on selecting other folders while the previous scan is not completed yet, in which case the still-running thread is cancelled.
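To make the scenario concrete, here is a minimal sketch of how such a cancellable background scan could be structured with Java 6's SwingWorker; scanForMedia() and showThumbnails() are hypothetical placeholders of mine, not blueMarine's actual code:

import java.io.File;
import java.util.List;
import javax.swing.SwingWorker;

public class FolderScanController
{
    private SwingWorker<List<File>, Void> currentScan;

    // Must be called on the EDT, e.g. from the explorer's selection listener.
    public void onFolderSelected(final File folder, final boolean recursive)
    {
        if (currentScan != null)
        {
            currentScan.cancel(true); // interrupt a still-running scan
        }

        currentScan = new SwingWorker<List<File>, Void>()
        {
            @Override
            protected List<File> doInBackground() throws Exception
            {
                return scanForMedia(folder, recursive); // runs on a worker thread
            }

            @Override
            protected void done() // runs on the EDT
            {
                if (!isCancelled())
                {
                    try
                    {
                        showThumbnails(get());
                    }
                    catch (Exception e)
                    {
                        e.printStackTrace(); // real code would log and report this
                    }
                }
            }
        };

        currentScan.execute();
    }

    // Hypothetical placeholders for the real scanning and rendering code.
    private List<File> scanForMedia(File folder, boolean recursive) throws Exception
    {
        throw new UnsupportedOperationException("not implemented in this sketch");
    }

    private void showThumbnails(List<File> files)
    {
    }
}

Note the "EDT -> worker thread -> EDT" round trip that ends in done(): it is precisely this asynchronous completion path that makes it hard for a test to observe when the work is finished, as we'll see below.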

Now, let's suppose we want to write a high-level integration test for this scenario. The probing point is just the user interface: you perform a selection and want to assert that "some time later" another part of the UI has been updated. We could think of this test code sketch:

package it.tidalwave.bluemarine.filesystemexplorer.test;

public class FolderSelectionTest extends AutomatedTest
{
    private static final int THUMBNAIL_SELECTION_TIMEOUT = 4000;

    private FileSystemExplorerTestHelper f;
    private ThumbnailViewerTestHelper t;

    ...

    @Before
    public void prepare()
    {
        f = new FileSystemExplorerTestHelper();
        t = new ThumbnailViewerTestHelper();
    }

    @After
    public void shutDown()
    {
        f.dispose();
        t.dispose();
    }

    @Test
    public void run()
        throws Exception
    {
        activate(f.fileSystemExplorerTopComponent);
        f.resetSelection();

        selectNode(f.upperExplorer, f.view.findUpperNodeByPath(BaseTestSet.getPath()));
        select(f.cbSubfolders, true);
        // WAIT FOR THE COMPUTATION TO COMPLETE

        assertActivated(f.fileSystemExplorerTopComponent.getClass().getName());
        assertOpened(t.thumbnailViewerTopComponent.getClass().getName());
        ThumbnailViewerTestHelper.assertShownFiles(t.thumbnailListView, BaseTestSet.getPath(), BaseTestSet.getAllFiles());
        ...
    }

    ...
}

This is actual code from blueMarine's tests, and I think it's pretty readable (at least if you know the basic concepts of the NetBeans Platform). It activates a TopComponent, resets the folder selection, selects a node in an explorer, selects a checkbox (cbSubfolders is a checkbox that enables recursive scanning), and then asserts that the TopComponent for selecting files is still active (i.e. it has the focus), that the thumbnail viewer TopComponent is opened, and that its view is populated with all the files of the test set. The TestHelpers are facility classes that cooperate with parts of the UI classes, offering references to the relevant UI components as well as specific assertion methods.

The various methods in the test body are statically imported from a utility class and perform the proper UI manipulation on the EDT (Event Dispatch Thread) - this is a common solution in many Swing-related testing frameworks.
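For example, the select() helper used above might be implemented roughly like this (a sketch under my own assumptions, not blueMarine's actual code):

import java.awt.EventQueue;
import javax.swing.JCheckBox;

public final class UIManipulation
{
    private UIManipulation()
    {
    }

    // Sets the checkbox state on the EDT and blocks until the change has been applied.
    // Must be called from a non-EDT thread (i.e. the test runner).
    public static void select(final JCheckBox checkBox, final boolean selected)
        throws Exception
    {
        EventQueue.invokeAndWait(new Runnable()
        {
            public void run()
            {
                if (checkBox.isSelected() != selected)
                {
                    checkBox.doClick(); // fires the same events as a real user click
                }
            }
        });
    }
}

invokeAndWait() serializes the manipulation with the other pending UI events and blocks the test until the click has been processed; what it cannot do is wait for the background work that the click triggers.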

blueMarine at the moment doesn't use any of the existing frameworks, such as Abbot and Costello or Jemmy/Jelly. This is for historical reasons (the initial tests were written quite a few years ago, before the conversion to the NetBeans Platform, even though most of them have been "lost" in the conversion), and my tests will eventually converge on the standard framework used by NetBeans. But this is not relevant to the problem I'm talking about.

The tough point is that "WAIT FOR THE COMPUTATION TO COMPLETE". How do you implement it? A simple solution would be a delay of an appropriate length, but this is a really bad idea: first, it unnecessarily slows down the tests; second, it prevents you from measuring performance while testing; third, you'll soon discover that, sooner or later, a specific run of the test randomly takes much longer than expected (e.g. because the operating system is swapping memory) and the test fails.

This had been troubling me until today, especially considering that I've set up this kind of test to be executed by users (I call these tests "Acceptance Tests"). This is a very important point for an open source project that wants to take advantage of its community, and specifically for finding problems in a context that you can't reproduce (for instance, on the computer of a specific troubled user). A subtle point is that my users aren't computer engineers, but end users: you can't ask them to download and compile code and run tests with Ant. In fact, I made the tests available as plugins that can be installed into blueMarine by means of an update center. Nor can you expect users to get into technical details so they can understand whether a test failed because of a spike or because of a real bug. What I expect is that users just press a couple of buttons and either tell me "all tests passed" or report a failure by just sending a log file (something that could even happen automatically).

Add to this another point: once you've added an option for repeating the test suite a potentially high number of times, you are able to run load tests (for instance, to find problems that only appear in the long run, such as memory leaks).

[Figure: The Automated Test pane in blueMarine]

This means that you can't tolerate false positives in tests, and that the wait must be both as short as possible and effective.

As I've said, until today my Acceptance Tests were plagued by spikes and were not suitable to be executed by non-technical people. I've spent days and days trying to define a good way to guess whether a certain asynchronous process has completed, but trouble arose even in a scenario as simple as the three points I introduced at the beginning of this article.

The idea I pursued until yesterday was a very small facility for detecting events. Look at the following variation of the code, and in particular the use of the Waitable object:

@Test
public void run()
    throws Exception
{
    activate(f.fileSystemExplorerTopComponent);
    f.resetSelection();

    selectNode(f.upperExplorer, f.view.findUpperNodeByPath(BaseTestSet.getPath()));
    final Waitable selectionChanged = t.thumbnailViewSelectionChanged();
    select(f.cbSubfolders, true);
    selectionChanged.waitForOccurrences(2, DEFAULT_TIMEOUT);

    assertActivated(f.fileSystemExplorerTopComponent.getClass().getName());
    assertOpened(t.thumbnailViewerTopComponent.getClass().getName());
    ThumbnailViewerTestHelper.assertShownFiles(t.thumbnailListView, BaseTestSet.getPath(), BaseTestSet.getAllFiles());
    ...
}

The Waitable encapsulates the logic for detecting that a certain event has been triggered. TestHelpers provide factory methods for different Waitables, representing a number of interesting events that you might want to wait for. In this specific case, the selectionChanged Waitable listens for changes in the view component that renders the thumbnails. The timeout here is quite large and is only useful to prevent the test from lingering indefinitely in case something goes wrong.

So far, so straightforward. The hard part I faced is that blueMarine might generate multiple events even in simple cases. For instance, every selection usually ends up in two events: first, all views are notified with a special empty result that stands for "Please wait, search in progress" (and at the same time immediately clears the results of a previous search), later followed by the real result. That's why I added a parameter for specifying that you might want to wait for multiple occurrences of the same event (2 in this case). Each kind of event has a progressive counter; when you invoke the Waitable factory method, the current value of the counter is copied into the Waitable (e.g. 10), and when waitForOccurrences(2, ...) is called it waits for the counter to reach 10 + 2 = 12. So you're able to wait for a certain number of occurrences of an event starting from a known point.
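For the sake of illustration, the counter mechanism just described could be implemented more or less as follows (my minimal sketch, not necessarily the real blueMarine code):

public class EventCounter
{
    private int count;

    // Called by a listener whenever the interesting event fires.
    public synchronized void eventOccurred()
    {
        count++;
        notifyAll();
    }

    public synchronized int getCount()
    {
        return count;
    }

    // Factory method: the returned Waitable remembers the count at creation time.
    public synchronized Waitable createWaitable()
    {
        return new Waitable(this, count);
    }

    public static class Waitable
    {
        private final EventCounter counter;
        private final int baseCount;

        Waitable(final EventCounter counter, final int baseCount)
        {
            this.counter = counter;
            this.baseCount = baseCount;
        }

        // Blocks until the counter has advanced by the given number of occurrences,
        // failing if the timeout expires first.
        public void waitForOccurrences(final int occurrences, final long timeoutMillis)
            throws InterruptedException
        {
            final long deadline = System.currentTimeMillis() + timeoutMillis;

            synchronized (counter)
            {
                while (counter.getCount() < baseCount + occurrences)
                {
                    final long remaining = deadline - System.currentTimeMillis();

                    if (remaining <= 0)
                    {
                        throw new AssertionError("Timed out waiting for event");
                    }

                    counter.wait(remaining);
                }
            }
        }
    }
}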

But this solution was not enough. Consider the above sample: both the selectNode(...) and select(f.cbSubfolders, true) operations *might* trigger a scan, if they change the current state of the UI (i.e. the specified node was not already selected, or the checkbox was not already checked); they won't otherwise. Since a test must be reproducible, at the very beginning of the test you are sure about what the initial state is; you aren't a few steps after the beginning. Please bear in mind that these are NOT unit tests, which are usually pretty simple, but integration tests: they mimic a real user interaction with the application, and they can be complex and pretty long.

Having to consider preconditions every time I coded a wait for the completion of a process proved extremely frustrating, as even minor changes caused the code to break. Furthermore, the pervasive asynchronicity of the application makes things even worse: sometimes a scan triggered by previous calls starts with a great delay, possibly *after* the Waitable is created. This of course breaks the counter mechanism. Furthermore, sometimes pending operations are cancelled because they are replaced by others; other times they are allowed to finish before they are cancelled. Summing up, the number of occurrences of any event can change in very complex and unexpected ways. Strange things usually happen during spikes, for instance when the computer is busy with other tasks, thus in very unpredictable scenarios.

I've spent a lot of time trying to write more effective syncing points: to this end I added more and more probes in the code and eventually used more complex conditionals (for instance: ignore events when this and that happen). Not only did this prove insufficient for more complex sequences, it started polluting the implementation code of the application in an unbearable way. Furthermore, this approach coupled the test code more and more with the implementation, making tests even more fragile.

When I'm spending too long solving a problem and I'm not satisfied with the elegance of the solution, I think it's high time to stop and think of something different. So I reverted all the latest changes and looked in a different direction. When you feel you're getting into a complexity trap, a good approach is to try to think as close to the real world as you can. What's happening in my scenario? Well:

1a. I press a button
1b. a sequence of things happens
1c. a result is made available
2a. I press another button
2b. another sequence of things happens
2c. another result is made available

and the 1* and 2* sequences can possibly be intermixed like this:

1a. I press a button
1b. a sequence of things happens
2a. I press another button
2b. another sequence of things happens
1c. a result is made available
2c. another result is made available

Still, looking at the above pseudo-code, we are able to track what's happening thanks to the 1* and 2* identifiers. That is, we have "tagged" the different operation sequences. Since every operation is carried out by a sequence of threads, why don't we just tag the threads? Bingo.

Look at the following code:

@Test
public void run()
    throws Exception
{
    activate(f.fileSystemExplorerTopComponent);
    f.resetSelection();

    assertActivated(f.fileSystemExplorerTopComponent.getClass().getName());
    select(f.cbSubfolders, false);
    final Tag tag1 = selectNode(f.upperExplorer, f.view.findUpperNodeByPath(BaseTestSet.getPath()));
    t.thumbnailSelectionChanged().waitForNextOccurrence(tag1, THUMBNAIL_SELECTION_TIMEOUT);
    delay(200); // Give time for the AWT thread to work so changes go to the UI

    assertActivated(f.fileSystemExplorerTopComponent.getClass().getName());
    assertOpened(t.thumbnailViewerTopComponent.getClass().getName());
    ...
}

Now the point is that every method that initiates an interaction with the UI creates a new instance of Tag. The tag is attached to the EDT (by means of a ThreadLocal) before calling into Swing, and is propagated to related threads (typically the EDT starts a background thread by means of a java.util.concurrent.Executor and, when things are ready, the result is passed back to the EDT for refreshing the UI). This means that instead of waiting for countable event occurrences, we wait for tagged event occurrences; in the last code example, waitForNextOccurrence(tag1, ...) blocks until the specified event happens in a properly tagged thread.
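As a rough sketch (the Tag class and its propagation API here are my assumptions, not the real blueMarine code), attaching a tag to the current thread and propagating it through an Executor could look like this:

import java.util.concurrent.Executor;

public final class Tag
{
    private static final ThreadLocal<Tag> CURRENT = new ThreadLocal<Tag>();

    // Creates a new tag and attaches it to the calling thread (e.g. the EDT).
    public static Tag create()
    {
        final Tag tag = new Tag();
        CURRENT.set(tag);
        return tag;
    }

    public static Tag getCurrent()
    {
        return CURRENT.get();
    }

    public static void setCurrent(final Tag tag)
    {
        CURRENT.set(tag);
    }

    // Wraps an Executor so that submitted tasks inherit the tag of the submitting thread.
    public static Executor propagating(final Executor delegate)
    {
        return new Executor()
        {
            public void execute(final Runnable task)
            {
                final Tag tag = CURRENT.get(); // captured on the submitting thread

                delegate.execute(new Runnable()
                {
                    public void run()
                    {
                        CURRENT.set(tag); // restored on the worker thread

                        try
                        {
                            task.run();
                        }
                        finally
                        {
                            CURRENT.remove();
                        }
                    }
                });
            }
        };
    }
}

A Waitable then accepts an event occurrence only if the tag carried by the notifying thread matches the one it is waiting for.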

The tricky part is how to propagate the tag from thread to thread, but in the end I made it work without too many hassles. The most annoying part is the custom code required in place of EventQueue.invokeLater(); but the AWT event queue is customizable, so I could probably make tagging work with the standard method calls.
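Indeed, the system event queue can be replaced via EventQueue.push(). Here is an untested sketch of how posts from invokeLater() could carry the tag, reusing the hypothetical Tag class above:

import java.awt.AWTEvent;
import java.awt.EventQueue;
import java.awt.Toolkit;
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

public class TaggingEventQueue extends EventQueue
{
    private final Map<AWTEvent, Tag> tags =
        Collections.synchronizedMap(new WeakHashMap<AWTEvent, Tag>());

    // Replaces the system event queue with a tagging one.
    public static void install()
    {
        Toolkit.getDefaultToolkit().getSystemEventQueue().push(new TaggingEventQueue());
    }

    @Override
    public void postEvent(AWTEvent event)
    {
        final Tag tag = Tag.getCurrent(); // captured on the posting thread

        if (tag != null)
        {
            tags.put(event, tag);
        }

        super.postEvent(event);
    }

    @Override
    protected void dispatchEvent(AWTEvent event)
    {
        final Tag tag = tags.remove(event);
        final Tag previous = Tag.getCurrent();
        Tag.setCurrent(tag); // make the tag visible to listeners running on the EDT

        try
        {
            super.dispatchEvent(event);
        }
        finally
        {
            Tag.setCurrent(previous);
        }
    }
}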

Enough for today. First, I'd like to know what you think about this; second, whether you know of an existing framework that works in this way. If one doesn't exist, I'll show you the underlying code next time.


Comments

Tim Lavers replied on Tue, 2009/03/03 - 6:39pm

Hi Fabrizio,

This sounds like an interesting problem... almost too interesting! Anyhow, we've faced similar problems and have a technique that works. There are two ingredients in our solution. The first is a rock-solid, thread-safe way of making assertions about the state of the user interface (for example, in your case you might want a method that tests that a certain number of thumbnails are showing, or that a progress bar is showing, or whatever). The second ingredient is a waiting framework that periodically checks whether or not the user interface is in the desired state, according to the first ingredient. Then your test can wait for the state to be as expected, with an assertion failure if it times out. This way, you never have to make guesses about how long an operation will take.

Building a framework to safely read the state of your user interface is not very hard once you know how. It's a matter of naming your Swing components and reading all component state in the EDT. We show you how in our book Swing Extreme Testing. The timing framework is much simpler and is also described in our book. I honestly think that reading the book, at least chapters 6 to 10, will solve your problem (and then you can download the source code).
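The waiting ingredient described above might look something like this sketch (an illustration of the technique, not the book's actual code):

import java.awt.EventQueue;
import java.util.concurrent.Callable;

public final class WaitFor
{
    private WaitFor()
    {
    }

    // Polls the condition (evaluated on the EDT) until it holds or the timeout expires.
    public static void condition(final Callable<Boolean> condition, final long timeoutMillis)
        throws Exception
    {
        final long deadline = System.currentTimeMillis() + timeoutMillis;

        while (System.currentTimeMillis() < deadline)
        {
            final boolean[] holds = new boolean[1];

            EventQueue.invokeAndWait(new Runnable()
            {
                public void run()
                {
                    try
                    {
                        holds[0] = condition.call(); // read UI state safely on the EDT
                    }
                    catch (Exception e)
                    {
                        throw new RuntimeException(e);
                    }
                }
            });

            if (holds[0])
            {
                return;
            }

            Thread.sleep(50); // poll interval
        }

        throw new AssertionError("Condition not satisfied within " + timeoutMillis + " ms");
    }
}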

Fabrizio Giudici replied on Wed, 2009/03/04 - 6:16am

Hi Tim, I'll have a look at the book. In any case, I fully agree with you on the two ingredients: I don't have problems in those areas (finding components and asserting on Swing components); it's that in a real-world application the UI is likely to be updated frequently, and the tricky part is finding out which update matches the testing stimulus you submitted. As far as I've read so far, books and tutorials cover simpler cases.

* Update: I've bought the PDF version of your book and, quickly browsing it, I found on page 187 "The Unit Test for waitForNamedThreadToFinish()". While I agree on the point of giving names to threads, which I usually do at least for more understandable logs, this can't be a solution for every case. Sometimes the thread is not controlled by you but is created by a third-party library; above all, in the case I've described in my post you have the sequence "EDT -> Thread -> EDT" (which is typical), so the completion of the computing sequence happens in the EDT and you can't just wait for a named thread. With tags the thing seems to work.

Jean-Francois P... replied on Wed, 2009/03/04 - 9:42am

Why reinvent the wheel? Can't any of the existing solutions out there for Swing testing (e.g. Abbot, FEST) fit the bill?

I have used Abbot extensively in very complex scenarios (my tests launched two instances of the same GUI in separate JVMs and checked the impact of an action on one GUI over the other, all running asynchronously) without big problems.

As for FEST, I use it currently and like it a lot, but my current test scenarios are quite simple. I'm not sure it would fit your requirements, but it might well (its author is particularly active in implementing enhancements and bug fixes in case your situation isn't handled well).

Just my 2 cents

Tim Lavers replied on Wed, 2009/03/04 - 4:21pm in response to: Fabrizio Giudici

Hi Fabrizio, the naming-threads point would not be relevant in your case. (We do this because we have a lot of reports and so on that are created on the server. The client code just starts a thread and then asynchronously gets updated with the report. In our tests we like to check that if the user cancels the report generation, then the thread gets cleaned up on the server. Having named threads makes this much easier, and is helpful for logging, as you point out.)

Hopefully some of the code in the book will help with finding named components and also with finding components in a particular state (for example, a JLabel or button showing certain text, a progress bar at a certain level, a particular image showing, etc.). I can assure you that the techniques really work for real-world apps: we've been using them on our extremely complex software for about eight years now.

By the way, the user interface for blueMarine looks great, as do the photos.

Tim Lavers replied on Wed, 2009/03/04 - 4:29pm in response to: Jean-Francois Poilpret

The 'wheels' we looked at were square. Actually, to be fair, some were octagonal. Ours is circular, and comes with tyres!

Fabrizio Giudici replied on Thu, 2009/03/05 - 5:30am

Hmm... it seems I didn't explain myself properly. Looking, for example, at the FEST description, I see that the typical features of existing Swing-testing frameworks are:

  1. Simulation of user interaction with a GUI (e.g. mouse and keyboard input)
  2. Reliable GUI component lookup (by type, by name or custom search criteria)
  3. Support for all Swing components included in the JDK
  4. Compact and powerful API for creation and maintenance of functional GUI tests
  5. Supports Applet testing
  6. Ability to embed screenshots of failed GUI tests in HTML test reports
  7. Can be used with either TestNG or JUnit
  8. Supports testing violations of Swing's threading rules

I agree that it's nonsense to rewrite code for the above features - the fact that blueMarine has specific code for those is just a legacy, and it will disappear in the near future. Since blueMarine is a NetBeans Platform application, I'll use the specific support that the Platform provides.

My critical point is - still quoting the FEST documentation - "testing long-duration tasks":

The following are the typical steps to complete such a scenario:

  1. User launches the application
  2. A login window appears
  3. User enters her username and password and clicks the "Login" button
  4. User is authenticated and authorized successfully
  5. The main window of the application is displayed

The "tricky" part here is step 4. Authentication/authorization can take some time (depending on network traffic, etc.) and we need to wait for the main window to appear in order to continue our test. It is possible to test this scenario with FEST:

loginDialog.textBox("username").enterText("yvonne");
loginDialog.textBox("password").enterText("welcome");
loginDialog.button("login").click();

// now the interesting part, we need to wait till the main window is shown.
FrameFixture mainFrame = findFrame("main").using(loginDialog.robot);

Paraphrasing the above example, I've got scenarios where the "main" frame would appear by itself, as a consequence of other actions started in the past; thus its mere appearance is not a good signal that I can proceed with the test. I must be sure that the appearance of "main" is strictly a consequence of the click on the "login" button. Unfortunately there are no distinguishing properties among the many appearances of "main" that I could use to discriminate them.

When I looked at FEST, I found the "using(loginDialog.robot)" promising, because it sounded like the "tagging" idea I have in mind; maybe it means that FEST will ignore any appearance of a "main" frame that has not been triggered by the "login" button click?

But the following explanation:

This is necessary because, in a given test, only one instance of Robot can be running, to prevent GUI tests from blocking each other on the screen. In other words, in a test class you can use one and only one instance of Robot.

seems to tell me that it's another matter; for sure that "one and only one instance" conflicts with my needs, as my test could launch multiple activities at the same time - still paraphrasing the previous example, think of this:

  1. click on a "login" button
  2. click again on a "login" button
  3. wait for a "main" appearance as consequence of the first click on "login"
  4. wait for another "main" appearance as a consequence of the second click on "login"


Of course this doesn't make much sense for a login window, but I'm just reusing the same example for clarity.

Does FEST or any other Swing testing framework support this scenario?


Michael Bushe replied on Wed, 2009/03/11 - 5:42pm

I know I solved this once upon a time using Abbot test scripts; I'll try to dig it up and post it. Meanwhile, I have a workaround that may be more work than you want to do, but I think it strikes at the heart of the problem.

The problem is that the UI presentation or the UI "business logic" ("Request Folder Open", "Login") is coded against the UI implementation (click on this or that widget, open a window). Thus the test of your application's presentation logic has to depend on listeners in the UI implementation.

A hidden jewel of using an EventBus is that it decouples tests from the underlying UI implementation (node click, window open).

If the implementation (the ActionListener) published UI presentation events like so:

... returnFromServerLogin(LoginResult loginResult) {
    EventBus.publish(new LoginEvent(loginResult.status));
}

or

... actionPerformed() {
    EventBus.publish(new RequestFolderOpenEvent(((MyTreeNode) event.getTarget()).getPictureFolderPath()));
}

... then the implementation will essentially start translating GUI events into a UI presentation workflow.

Such events allow your tests to be written in terms of presentation logic:

final LoginEvent[] result = new LoginEvent[1]; // final array so the inner class can write to it
EventBus.subscribe(LoginEvent.class, new EventSubscriber<LoginEvent>() {
    public void onEvent(LoginEvent event) {
        result[0] = event;
    }
});
// ...populate the text boxes and click the button (even cooler - do it by publishing their values!)
while (result[0] == null) {
    // ...sleep a while
}
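For what it's worth, a java.util.concurrent.CountDownLatch would avoid the busy-wait loop in the sketch above; here is a variation under the same assumptions (hypothetical LoginEvent, generic EventSubscriber):

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final CountDownLatch latch = new CountDownLatch(1);
final LoginEvent[] result = new LoginEvent[1];
final EventSubscriber<LoginEvent> subscriber = new EventSubscriber<LoginEvent>() {
    public void onEvent(LoginEvent event) {
        result[0] = event;
        latch.countDown(); // wake up the waiting test
    }
};
EventBus.subscribe(LoginEvent.class, subscriber); // keep a strong reference in case subscribers are held weakly
// ...populate the text boxes and click the button...
if (!latch.await(5000, TimeUnit.MILLISECONDS)) {
    throw new AssertionError("No LoginEvent received within the timeout");
}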

