threading problems in VASSAL.chat, Bug 2475197

uckelman · January 6, 2009, 10:20pm

I understand the cause of the frequently-reported Bug 2475197 now:

sourceforge.net/tracker2/?func= … tid=594231

What’s happening is that some Swing methods are being called off the EDT by
various classes in the VASSAL.chat hierarchy. Sometime between beta5 and beta6,
I noticed that a lot of communication with the server was taking place on the
EDT. Passing messages back and forth with the server can sometimes take a few
seconds, so whenever that happened the GUI would block. In order to stop the
GUI from hanging during long waits, I moved these requests into background
threads. Unfortunately, there are more Swing method calls in VASSAL.chat than
I realized, and some of what I moved into background threads is itself calling
Swing methods—this is what’s leading to the mysterious
“ArrayIndexOutOfBoundsException: No such child: 3” messages.

I’m uncertain how to tackle this problem, since I think sorting out threading
in VASSAL.chat will be rather a large job. How will VASSAL.chat change when
we switch to the new Jabber-based server? I don’t want to put a lot of effort
into untangling this if we’re going to chuck the code I’d be fixing a few
months from now.

tar · January 6, 2009, 10:45pm

I once worked on a project where the powers-that-be used some
different thread-handling class that was able to detect when Swing
calls were not being made on the EDT. It allowed one to track down
those locations. Unfortunately, I don’t recall exactly what was being
done there.

However, I did find this interesting article:

thejavacodemonkey.blogspot.com/2 … ns-of.html

I don’t know if that counts as being too much trouble to track down or
not.

On Jan 6, 2009, at 2:20 PM, uckelman wrote:

…

Messages mailing list
Messages@forums.vassalengine.org
forums.vassalengine.org/mailman/ … engine.org

Post generated using Mail2Forum (mail2forum.com)

uckelman · January 6, 2009, 11:24pm

Thus spake Thomas Russ:

That looks interesting. I’d settle for having an

assert(SwingUtilities.isEventDispatchThread());

at the head of each Swing method. That way, we could turn on assertions if
we wanted to go bug hunting, and there’s zero performance impact. (I don’t
know why Sun hasn’t done that, honestly. One person and a Perl script could
make the changes in a few minutes.)

–
J.

Messages mailing list
Messages@forums.vassalengine.org
forums.vassalengine.org/mailman/ … engine.org

Post generated using Mail2Forum (mail2forum.com)

uckelman · January 6, 2009, 11:30pm

Thus spake “uckelman”:

I understand the cause of the frequently-reported Bug 2475197 now:

sourceforge.net/tracker2/?func= … 90612&atid
=594231

What’s happening is that some Swing methods are being called off the EDT by
various classes in the VASSAL.chat hierarchy. Sometime between beta5 and beta
6,
I noticed that a lot of communication with the server was taking place on the
EDT. Passing messages back and forth with the server can sometimes take a few
seconds, so whenever that happened the GUI would block. In order to stop the
GUI from hanging during long waits, I moved these requests into background
threads. Unfortunately, there are more Swing method calls in VASSAL.chat than
I realized, and some of what I moved into background threads is itself callin
g
Swing methods—this is what’s leading to the mysterious
“ArrayIndexOutOfBoundsException: No such child: 3” messages.

I’m uncertain how to tackle this problem, since I think sorting out threading
in VASSAL.chat will be rather a large job. How will VASSAL.chat change when
we switch to the new Jabber-based server? I don’t want to put a lot of effort
into untangling this if we’re going to chuck the code I’d be fixing a few
months from now.

I know some more about this problem now: The most common exception, and the
only one I know how to cause right now, is the result of
SimpleStatusControlsInitializer.uninitializeControls() being called off of
the EDT.

I think that potentially any class which implements ChatControlsInitializer
could have its initializeControls() or uninitializeControls() methods called
off the EDT. An ugly kludge we could apply would be to wrap all calls to
those methods in an invokeLater() (or maybe invokeAndWait()). I suspect that
would eliminate most of the symptoms of the problem—but it won’t leave
the VASSAL.chat code looking any nicer… Eventually we’ll have to clean it
up.

–
J.

Messages mailing list
Messages@forums.vassalengine.org
forums.vassalengine.org/mailman/ … engine.org

Post generated using Mail2Forum (mail2forum.com)

rk1 · January 7, 2009, 12:32am

Switching to Jabber will probably not have an impact on this issue.

What you’re probably running into is the DelegateClient, which queries the web site and dynamically builds a client when connecting. Different clients have different GUI controls, so the GUI gets rearranged depending on which client gets built. The right solution is probably to push the logic of spawning a background thread down one level into the lower-level client implementations. Since you’re waiting for the result of a network operation to know what to do with the GUI, there’s no getting around having to wrap the GUI operations in a SwingUtilities.invokeXXX when the operation completes. An imperfect but simpler solution is to be asynchronous on send() operations, but not on connect/disconnect. Doing connect/disconnect on the EDT makes the GUI operations safe, and you can safely put send() into a background thread because it doesn’t do any GUI operations.

Let me know if you’d like me to code up this fix.

rk

Post generated using Mail2Forum (mail2forum.com)

rk1 · January 7, 2009, 5:56am

The actual class name is DynamicClient.

rk

Post generated using Mail2Forum (mail2forum.com)

uckelman · January 7, 2009, 11:47am

Thus spake “Rodney Kinney”:

Hmm. Ok. I don’t understand the architecture of the chat package—I haven’t
been able to glean much from it in the time I’ve spent with the code, and
there’s almost no documentation for it. Would you be able to describe how it
all fits together? (I also don’t see where the server code is—I was
very surprised to find that the server code is under VASSAL.chat.)

Will clients other than the Jabber one still be in use once the server
changes? If so, why?

That would be fine if the client stuff were already running on an independent
background thread—but it wasn’t before. The changes I made cause some
requests to be executed in the background by a SwingWorker, so the client
still isn’t really operating on it’s own long-lived thread. I think it would
be better if it were that way, since then it would be more straightforward to
isolate the GUI from the communications code.

Making connection and disconnection asynchronous was the nearly the whole
purpose of the changes I made, as those were the two operations which were
noticeably blocking the GUI.

–
J.

Messages mailing list
Messages@forums.vassalengine.org
forums.vassalengine.org/mailman/ … engine.org

Post generated using Mail2Forum (mail2forum.com)

rk1 · January 7, 2009, 8:38pm

On the client side, the server is abstracted by the ChatServer interface. It’s inherently asynchronous: methods like setConnected() and setRoom() can return immediately, but the fact that the connection or room change was successful is communicated by a subsequent PropertyChangeEvent. So the room GUI registers as PropertyChangeListener to the ROOM property and changes the view when it get an event, and the connect/disconnect button is enabled/disabled in response to events on the CONNECTED property. Incoming game events are handled the same way.

There are three ChatServer implementations: the current server, the peer-to-peer direct connection, and the new Jabber server. I’d like to keep the current server implementation around at least for the initial release. That way, in case there’s a problem with Jabber we can switch over to the old server transparently without users having to download an update. I expect P2P to stick around indefinitely.

Let me take a look at this. It might be that the only thing we need to do is wrap the (un)initializeControls() calls in a SwingUtilitities.invokeLater().

rk

Post generated using Mail2Forum (mail2forum.com)

uckelman · January 7, 2009, 8:53pm

Thus spake “Rodney Kinney”:

Do you find that to be a misleading name? This is probably the source of
some of my confusion. Would you mind if it were called ChatClient instead?

It’s inherently asynchronous: methods like setConnected() and setRoom() can
return immediately, but the fact that the connection or room change was
successful is communicated by a subsequent PropertyChangeEvent. So the room
GUI registers as PropertyChangeListener to the ROOM property and changes the
view when it get an event, and the connect/disconnect button is
enabled/disabled in response to events on the CONNECTED property. Incoming
game events are handled the same way.

There are three ChatServer implementations: the current server, the
peer-to-peer direct connection, and the new Jabber server. I’d like to keep
the current server implementation around at least for the initial release.
That way, in case there’s a problem with Jabber we can switch over to the
old server transparently without users having to download an update. I
expect P2P to stick around indefinitely.

So where does the actual server code live?

Please do.

Would you like me to wait with beta7 for you to do that?

–
J.

Messages mailing list
Messages@forums.vassalengine.org
forums.vassalengine.org/mailman/ … engine.org

Post generated using Mail2Forum (mail2forum.com)

Brent_Easton · January 7, 2009, 9:03pm

The Server Main is in VASSAL.chat.node.Server.

B.

Messages mailing list
Messages@forums.vassalengine.org
forums.vassalengine.org/mailman/ … engine.org

Post generated using Mail2Forum (mail2forum.com)

rk1 · January 8, 2009, 1:15am

Yes, let’s hold up beta7 for this fix, since it’s causing automated bug reports.

rk

Post generated using Mail2Forum (mail2forum.com)

rk1 · January 8, 2009, 5:36am

I checked a fix in to the trunk.

rk

Post generated using Mail2Forum (mail2forum.com)

uckelman · January 8, 2009, 12:27pm

Thus spake “Rodney Kinney”:

This seems ok for the moment, though I’m not convinced that we’ve eliminated
all off-EDT Swing calls in VASSAL.chat. In the long-term, I’d like to do some
refactoring here (and elsewhere) which makes it clear that Swing methods are
being called from the correct thread.

–
J.

Messages mailing list
Messages@forums.vassalengine.org
forums.vassalengine.org/mailman/ … engine.org

Post generated using Mail2Forum (mail2forum.com)

rk1 · January 8, 2009, 8:34pm

Other than the ones we introduced to fix this issue, there weren’t any non-EDT calls to begin with. The only exception is the socket reader waiting for messages from the server, which get decoded into a Command and executed on the EDT.

Knock yourself out, but I find that refactorings whose purpose is to improve code “clarity” or “readability” are a waste of time. More than likely, the next person to come across your refactored code will find it just as confusing as you found the original code.

rk

Post generated using Mail2Forum (mail2forum.com)

uckelman · January 8, 2009, 8:45pm

Thus spake “Rodney Kinney”:

When I say that I want to improve clarity, what I’m implying is that I think
there are a ton of burried problems which would become clear in so doing. This
would not be merely a readability exercise.

–
J.

Messages mailing list
Messages@forums.vassalengine.org
forums.vassalengine.org/mailman/ … engine.org

Post generated using Mail2Forum (mail2forum.com)