Sluggish online play

marktb1961 · December 5, 2021, 12:21pm

Is it to be expected that Vassal could get very slow in a multiplayer game with a module that, with fewer players, gives those same players a good level of responsiveness?

The main issue we experienced was that some players - especially as the game went on - found that selecting and dragging a piece took far too long. Others were more or less ok throughout.

Things that didn’t seem to make much difference included Max Heap Size and whether centre on opponents’ moves was turned on. Disconnecting/reconnecting, and even saving the game and restarting, didn’t seem to help either.

Is there anything particular to online games that can cause extended delays? Is it possible for one of the players to become a bottleneck for others being able to move pieces, e.g. if they are on a slow machine or network?

uckelman · December 5, 2021, 8:06pm

How many players are we talking about here?

Only commands are sent over the wire, i.e., things which would be written to the log, if you’re writing a log. Selection and deselection aren’t sent; drags don’t result in a command until the drop happens.

There’s no reason I can think of that the pre-drop part of a drag would be affected just by the number of players connected to a live game.
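To illustrate the distinction, here is a toy Java model of "only the drop becomes a command". It is a sketch for illustration only: `ToySession`, its methods and the `MOVE` string format are all hypothetical names and are not Vassal's actual classes or wire format.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not Vassal classes or its wire format.
class ToySession {
    // Commands queued for the other players (and for a log, if one is being written).
    private final List<String> outbound = new ArrayList<>();
    private String selected;   // local UI state, never transmitted
    private int dragX;         // local UI state, never transmitted
    private int dragY;         // local UI state, never transmitted

    void select(String piece) { selected = piece; }      // nothing is sent
    void deselect() { selected = null; }                 // nothing is sent
    void dragTo(int x, int y) { dragX = x; dragY = y; }  // nothing is sent mid-drag

    // Only the drop produces a command for the other players.
    void drop() {
        if (selected != null) {
            outbound.add("MOVE " + selected + " " + dragX + "," + dragY);
        }
    }

    List<String> outbound() { return outbound; }

    public static void main(String[] args) {
        ToySession s = new ToySession();
        s.select("infantry-1");
        s.dragTo(10, 10);
        s.dragTo(40, 25);
        s.drop();
        System.out.println(s.outbound());
    }
}
```

In this model the selection and every intermediate drag position stay on the local machine, so the number of connected players cannot affect them; only the single command created at drop time is shared.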

marktb1961 · December 5, 2021, 8:21pm

“Only” 8 players and one observer; 2 or 3 were writing a log.
People mainly on Windows, with a Mac and a Linux machine.
All on v3.6.0 or 3.6.1. Observer/log writer on v3.5.8.

But thanks, you confirmed pretty much what I thought.

uckelman · December 5, 2021, 8:23pm

I doubt it contributed to the problem you’re seeing, but it’s asking for other problems if the users are running different major versions.

Did dragging slow down for everyone?

marktb1961 · December 5, 2021, 8:51pm

Some noticed a slowdown a lot more than others. I didn’t really notice at all. I opened the game room and preset file but I had to reconnect part way through due to a network glitch. Responsiveness was still ok for me afterwards.

Syncing back in during the game seemed to take longer than at the start too.

Point taken about mixed versions. I was running the old instance purely for logging but maybe even that isn’t a great idea.

marktb1961 · December 5, 2021, 9:22pm

I think my Vassal log file got over-written when I restarted, but in what remains there are several repeats of this error block, which I haven’t seen in v3.5.8 logs:

2021-12-04 16:52:20,162 [90933-ProcessLauncher-2] WARN  VASSAL.tools.logging.LoggedOutputStream - 2021-12-04 16:52:20.162 java[90935:21673463] Bad JNI lookup accessibilityHitTest
2021-12-04 16:52:20,162 [90933-ProcessLauncher-2] WARN  VASSAL.tools.logging.LoggedOutputStream - 2021-12-04 16:52:20.162 java[90935:21673463] (
	0   libawt_lwawt.dylib                  0x0000000105761d39 -[JavaComponentAccessibility accessibilityHitTest:withEnv:] + 153
	1   libawt_lwawt.dylib                  0x000000010570dd93 -[AWTView accessibilityHitTest:] + 179
	2   AppKit                              0x00007fff235fed71 -[NSWindow(NSWindowAccessibility) accessibilityHitTest:] + 309
	3   AppKit                              0x00007fff231a2d0c -[NSApplication(NSApplicationAccessibility) accessibilityHitTest:] + 342
	4   AppKit                              0x00007fff23173bf3 CopyElementAtPosition + 150
	5   HIServices                          0x00007fff257e7a2b _AXXMIGCopyElementAtPosition + 336
	6   HIServices                          0x00007fff25808708 _XCopyElementAtPosition + 369
	7   HIServices                          0x00007fff257c693c mshMIGPerform + 182
	8   CoreFoundation                      0x00007fff20533a44 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE1_PERFORM_FUNCTION__ + 41
	9   CoreFoundation                      0x00007fff20533925 __CFRunLoopDoSource1 + 619
	10  CoreFoundation                      0x00007fff20531faf __CFRunLoopRun + 2400
	11  CoreFoundation                      0x00007fff20530f8c CFRunLoopRunSpecific + 563
	12  HIToolbox                           0x00007fff28778a83 RunCurrentEventLoopInMode + 292
	13  HIToolbox                           0x00007fff287786b6 ReceiveNextEventCommon + 284
	14  HIToolbox                           0x00007fff28778583 _BlockUntilNextEventMatchingListInModeWithFilter + 70
	15  AppKit                              0x00007fff22d3a172 _DPSNextEvent + 864
	16  AppKit                              0x00007fff22d38945 -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 1364
	17  libosxapp.dylib                     0x000000010592956a -[NSApplicationAWT nextEventMatchingMask:untilDate:inMode:dequeue:] + 122
	18  AppKit                              0x00007fff22d2ac69 -[NSApplication run] + 586
	19  libosxapp.dylib                     0x0000000105929339 +[NSApplicationAWT runAWTLoopWithApp:] + 185
	20  libawt_lwawt.dylib                  0x000000010576a1b9 +[AWTStarter starter:headless:] + 505
	21  libosxapp.dylib                     0x000000010592b00f +[ThreadUtilities invokeBlockCopy:] + 15
	22  Foundation                          0x00007fff212e2b81 __NSThreadPerformPerform + 204
	23  CoreFoundation                      0x00007fff205332bc __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
	24  CoreFoundation                      0x00007fff20533224 __CFRunLoopDoSource0 + 180
	25  CoreFoundation                      0x00007fff20532fa4 __CFRunLoopDoSources0 + 242
	26  CoreFoundation                      0x00007fff205319cc __CFRunLoopRun + 893
	27  CoreFoundation                      0x00007fff20530f8c CFRunLoopRunSpecific + 563
	28  libjli.dylib                        0x0000000103476d32 CreateExecutionEnvironment + 402
	29  libjli.dylib                        0x0000000103472615 JLI_Launch + 1493
	30  java                                0x0000000103467c0e main + 414
	31  libdyld.dylib                       0x00007fff20456f3d start + 1
	32  ???                                 0x000000000000000f 0x0 + 15

uckelman · December 5, 2021, 9:34pm

That looks like the result of a bug in something upstream from us, either Java or MacOS.

marktb1961 · December 6, 2021, 9:29am

Looks like a MacOS Java 17 regression that is fixed in 17.0.2. See if that’s your reading of it too.

Anyway, I can’t say I saw any impact of it during usage. I’ll look again when you next ship a bundled Java update.

marktb1961 · December 6, 2021, 11:38am

@uckelman on the original problem, I have asked the players to send me log files if they still have them.
On PCs, which directory should they normally be looking in? And is the Vassal log file of the form errorLog-3.6.x on all systems (3.6.0, 3.6.1, etc.)?

uckelman · December 6, 2021, 11:58am

Yes, it looks like it’s fixed in 17.0.2. I don’t see a released 17.0.2 yet, though. We’re bundling 17.0.1+12 right now. I upgrade the bundled Java when a new release is available, so we’ll ship 17.0.2 in the next Vassal release after 17.0.2 is out.

Error log locations: Error Logs - Vassal

Yes, the error log should be named for the version in use.

marktb1961 · December 6, 2021, 8:56pm

Thinking about this some more, I’m wondering if it might have been synchronisation from other players’ machines causing the performance issue.

No one seems to have a VASSAL log unfortunately, though based on my log, I don’t think it would add anything. Likewise, the players’ machines were reasonable even for those who had very bad performance - a good spec, available disk space etc.

Here is a typical report: “Dragging and dropping gave no visual feedback and it took several seconds after releasing the mouse button before the unit appeared (and if I moved the mouse in the meantime, the unit appeared at the new location). Rolling dice, opening other windows (my hand, card piles) did not seem affected.”

This makes me think that it might be related to a particular part of the module (the main map or, more likely, particular types of piece).

Could contention arise between the local select/drag and a state change on the selected piece from another instance (Restrict Commands or Calculated Properties, perhaps), or maybe changes to other unrelated pieces?

Is there any VASSAL or system logging that could be enabled to get a better handle on this?
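For what it’s worth, one generic way to get numbers on this kind of lag, independent of whatever logging VASSAL itself offers, is to time how long a no-op task waits in a queue before it runs. The Java sketch below is only that: `QueueLagProbe` and `measureLagMillis` are hypothetical names, not part of VASSAL, and it assumes (as the AWT frames in the stack trace above suggest) that the UI work runs on the AWT/Swing event thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of VASSAL.
class QueueLagProbe {
    // Returns how many milliseconds a no-op task waited before the executor
    // ran it, or -1 if it did not run within timeoutMs (or we were interrupted).
    static long measureLagMillis(Executor executor, long timeoutMs) {
        CountDownLatch done = new CountDownLatch(1);
        long[] ranAt = new long[1];
        long submitted = System.nanoTime();
        executor.execute(() -> {
            ranAt[0] = System.nanoTime();
            done.countDown();
        });
        try {
            if (!done.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return -1; // the queue was busy for longer than the timeout
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return TimeUnit.NANOSECONDS.toMillis(ranAt[0] - submitted);
    }
}
```

In a Swing application the executor could be `SwingUtilities::invokeLater`, with the probe called periodically from a background thread (not from the event thread itself, or it would block waiting on its own task). Consistently large values while a player is dragging would point at the event thread being busy locally rather than at the network. It has to run inside the same JVM as the UI, so it would only help someone able to run a custom build.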