What’s happening is that some Swing methods are being called off the EDT by
various classes in the VASSAL.chat hierarchy. Sometime between beta5 and beta6,
I noticed that a lot of communication with the server was taking place on the
EDT. Passing messages back and forth with the server can sometimes take a few
seconds, so whenever that happened the GUI would block. In order to stop the
GUI from hanging during long waits, I moved these requests into background
threads. Unfortunately, there are more Swing method calls in VASSAL.chat than
I realized, and some of what I moved into background threads is itself calling
Swing methods—this is what’s leading to the mysterious
“ArrayIndexOutOfBoundsException: No such child: 3” messages.
I’m uncertain how to tackle this problem, since I think sorting out threading
in VASSAL.chat will be rather a large job. How will VASSAL.chat change when
we switch to the new Jabber-based server? I don’t want to put a lot of effort
into untangling this if we’re going to chuck the code I’d be fixing a few
months from now.
I once worked on a project where the powers-that-be used some
different thread-handling class that was able to detect when Swing
calls were not being made on the EDT. It allowed one to track down
those locations. Unfortunately, I don’t recall exactly what was being
done there.
at the head of each Swing method. That way, we could turn on assertions if
we wanted to go bug hunting, and there’s zero performance impact. (I don’t
know why Sun hasn’t done that, honestly. One person and a Perl script could
make the changes in a few minutes.)
I know some more about this problem now: The most common exception, and the
only one I know how to cause right now, is the result of
SimpleStatusControlsInitializer.uninitializeControls() being called off of
the EDT.
I think that potentially any class which implements ChatControlsInitializer
could have its initializeControls() or uninitializeControls() methods called
off the EDT. An ugly kludge we could apply would be to wrap all calls to
those methods in an invokeLater() (or maybe invokeAndWait()). I suspect that
would eliminate most of the symptoms of the problem—but it won’t leave
the VASSAL.chat code looking any nicer… Eventually we’ll have to clean it
up.
Switching to Jabber will probably not have an impact on this issue.
What you’re probably running into is the DelegateClient, which queries the web site and dynamically builds a client when connecting. Different clients have different GUI controls, so the GUI gets rearranged depending on which client gets built. The right solution is probably to push the logic of spawning a background thread down one level into the lower-level client implementations. Since you’re waiting for the result of a network operation to know what to do with the GUI, there’s no getting around having to wrap the GUI operations in a SwingUtilities.invokeXXX when the operation completes. An imperfect but simpler solution is to be asynchronous on send() operations, but not on connect/disconnect. Doing connect/disconnect on the EDT makes the GUI operations safe, and you can safely put send() into a background thread because it doesn’t do any GUI operations.
Hmm. Ok. I don’t understand the architecture of the chat package—I haven’t
been able to glean much from it in the time I’ve spent with the code, and
there’s almost no documentation for it. Would you be able to describe how it
all fits together? (I also don’t see where the server code is—I was
very surprised to find that the server code is under VASSAL.chat.)
Will clients other than the Jabber one still be in use once the server
changes? If so, why?
That would be fine if the client stuff were already running on an independent
background thread—but it wasn’t before. The changes I made cause some
requests to be executed in the background by a SwingWorker, so the client
still isn’t really operating on it’s own long-lived thread. I think it would
be better if it were that way, since then it would be more straightforward to
isolate the GUI from the communications code.
Making connection and disconnection asynchronous was the nearly the whole
purpose of the changes I made, as those were the two operations which were
noticeably blocking the GUI.
On the client side, the server is abstracted by the ChatServer interface. It’s inherently asynchronous: methods like setConnected() and setRoom() can return immediately, but the fact that the connection or room change was successful is communicated by a subsequent PropertyChangeEvent. So the room GUI registers as PropertyChangeListener to the ROOM property and changes the view when it get an event, and the connect/disconnect button is enabled/disabled in response to events on the CONNECTED property. Incoming game events are handled the same way.
There are three ChatServer implementations: the current server, the peer-to-peer direct connection, and the new Jabber server. I’d like to keep the current server implementation around at least for the initial release. That way, in case there’s a problem with Jabber we can switch over to the old server transparently without users having to download an update. I expect P2P to stick around indefinitely.
Let me take a look at this. It might be that the only thing we need to do is wrap the (un)initializeControls() calls in a SwingUtilitities.invokeLater().
This seems ok for the moment, though I’m not convinced that we’ve eliminated
all off-EDT Swing calls in VASSAL.chat. In the long-term, I’d like to do some
refactoring here (and elsewhere) which makes it clear that Swing methods are
being called from the correct thread.
Other than the ones we introduced to fix this issue, there weren’t any non-EDT calls to begin with. The only exception is the socket reader waiting for messages from the server, which get decoded into a Command and executed on the EDT.
Knock yourself out, but I find that refactorings whose purpose is to improve code “clarity” or “readability” are a waste of time. More than likely, the next person to come across your refactored code will find it just as confusing as you found the original code.
When I say that I want to improve clarity, what I’m implying is that I think
there are a ton of burried problems which would become clear in so doing. This
would not be merely a readability exercise.