About bug 12554

Flint1b · May 29, 2020, 8:26am

I could reproduce the bug, did two runs and the move/undo command pairs looked like this (2nd run in brackets, where the values were different):

* move command
  - MovePiece
    - oldPosition: java.awt.Point[x=75,y=316]
    - newPosition: java.awt.Point[x=185,y=309] [2nd: 195/299]
    - seq
      - 0: NullCommand
        - seq
          - 0: ChangePropertyCommand
            - property
              - name: Blockade
              - oldValue: 0
              - newValue: 1
            - seq
              - 0: NullCommand
                - seq
                  - 0: MovePiece
                    - oldPosition: java.awt.Point[x=185,y=309] [2nd: 195/299]
                    - newPosition: java.awt.Point[x=189,y=316]
                    - seq
                      - 0: AddPiece
                        - state: Map;189;316;1590737932607
          - 1: DisplayText
            - msg: Union Naval Blockade Level set to 1.

* undo command
  - DisplayText
    - msg: * UNDO: * Union Naval Blockade Level set to 1.
    - seq
      - 0: RemovePiece
        - target: TheStack
        - seq
          - 0: MovePiece
            - oldPosition: java.awt.Point[x=189,y=316] [2nd: 195/299]
            - newPosition: java.awt.Point[x=185,y=309] [2nd: 75/316]
          - 1: ChangePropertyCommand
            - name: Blockade
            - oldValue: 1
            - newValue: 0
      - 1: MovePiece
        - oldPosition: java.awt.Point[x=185,y=309] [2nd: 195/299]
        - newPosition: java.awt.Point[x=75,y=316]

First questions that come to mind:

The oldPosition (75/316) is the same for both runs, the newPosition is different as I am obviously not able to drop the counter at the exact same point, but what is this third position, the (189/316) that is the same for both runs? Is it a predefined position on the map, the one that the counter is supposed to snap to?
Why is the tree structure of the initial MoveCommand so weird, why are MovePiece and AddPiece children of the ChangePropertyCommand, shouldn’t they be siblings if anything?

And the Undo command is even more weird, its structure is nothing like the Move command, it is missing all these NullCommands, and the positions are even more messed up, between the Move and Undo commands, and between both runs - why is (189/316) which seems to be a fixed position on the map present in the 1st run’s Undo but not present in the 2nd run’s Undo? Why does the 2nd test run have the same old/new position values for both MovePiece commands inside the Undo command, while the 1st test run has different ones? Is this possibly the source of the bug, that the positions get messed up when the Undo command is created?

Flint1b · May 29, 2020, 12:09pm

Another question, the initial Move command is never properly executed with Command.execute(), instead it is built inside the PieceMover.movePieces(), which only calls Command.execute() on the DisplayText command.

The Undo OTOH works as I would expect it, it builds the tree of commands first, then calls Command.execute() on the root command, which in turn calls execute() recursively on all the commands inside the tree.

Is this intended to work that way? I would expect both actions to work synchronous to each other, i.e. either both properly call Command.execute() on the root command, or both run through PieceMover.movePieces().

The recursive processing inside Command.execute() looks more correct though, maybe the PieceMover.movePieces() is how it used to work before this recursive processing, and this particular case was forgotten when the recursive idiom was implemented?

Or is this legacy code and there is no one who knows how it is supposed to work

Cattlesquat · May 29, 2020, 5:31pm

So a couple things I know:

The test-module simulates a case where a piece receives a second move command while processing the keystroke-after-move from a map. So the positions you are seeing in the move commands are… (a) position where piece started, (b) position where you dragged it to, and (c) position where module moved it again instead.
PieceMover.movePieces() is one of the methods that generate commands (note that it returns a Command), as opposed to the ones that process them. The commands themselves are very simple, but the command generators are quite complex, especially the ones that deal with stacks of units. The commands are supposed to contain their own undos, and to a certain extent they do, but here we’ve encountered some complexity.
Worth noting that this behavior doesn’t happen if you add a “does not stack” trait (Immobilized class) to the counter – which causes pieces to be treated in a different way and doesn’t run as much complex stacking code.
The exotic behavior you’re seeing in the tree of commands has to do with the way the command strings are being generated/appended-to on the fly. You get a much more straightforward command string on the way back, because it just tries to reverse the meaningful things it did before. As near as I can see, though, none of the funny tree structure is causing any problems – when I’ve traced through I see the right commands being executed in the right order, at least to the extent you trust the set of commands being generated (more on that below)
It looks like what’s getting sent in the command package is MovePiece-MovePiece-AddPiece, and so it moves the piece twice and then “adds it” at the final location (and I think the “adding” is supposed to make sure it has a real “stack”, a little virtual extra piece node, to point at it at its final location). So when it tries to reverse it, we get instead… RemovePiece-MovePiece-MovePiece. And so after “removing” the piece from its current stack, it moves it twice… but then does not “add” it at its destination!
So I’m not 100% on the function/necessity of the adding and removing, but it’s easy to see that the command and the “undo”, even though they both nominally move pieces, are NOT SYMMETRICAL. And that seems bad. Apparently it’s close enough to work (or at least “work” in the sense of not creating an immediately obvious symptom – historical reports of difficult-to-reproduce undo problems are manifold however) if we undo only a single move, but it visibly fails on the double move.
What I imagine is likely to be a solution is to make sure the original command string is symmetrical so that its undo will be symetrical as well. So a single-move of a piece, instead of generating “MovePiece-AddPiece” should generate “RemovePiece-MovePiece-AddPiece”. And that way its undo will generate “RemovePiece-MovePiece-AddPiece”! And so then a double-move ought to be, not “Move-Move-Add”, but rather “Remove-Move-Add-Remove-Move-Add”, which would reverse to precisely the same sequence (just with the coordinates reversed) and ought to undo it.
But assuming I’m right on my theory above, that means having to go through all the various StackMetrics and PieceMover and SendToLocation (and other?) code that generate command strings to move stacked pieces, and making them write the command strings in this more robust order. The fact that it’s currently generating “Move-Move-Add” which isn’t even consistent in the one direction, gives one an idea that this might be challenging.

I’m hopefully not THE expert on this (please let me not be THE expert on this!) but I’ve certainly been doing some research on it and that’s what I’ve got so far.

Brian

p.s. VERY happy to have you take a peek!

Flint1b · May 29, 2020, 7:11pm

I get the feeling that after we have solved this, we will be the experts on command processing

Anyways, thanks a lot for your input and for the test-module, it makes it really easy to reproduce the bug. My initial idea seems to be correct, a single move by the user becomes two separate moves in the engine, this makes sense, the moved piece snaps into a predefined location when the user drops it somewhere near that location.

The two parts not being symmetrical also makes some sense as the initial move command tree has a lot of NullActions, these are not significant and are probably not recorded in the command log and don’t have to be part of the undo command tree.

What is really suspicious right now is this PieceMover.movePieces() thing, yes it has a lot of code to build the initial move command, but it doesn’t just build the command, it actually calls execute() on one of the commands from the command tree. Also very suspicious, after movePieces() returns the command tree, no one calls execute() on the root command of that tree, unlike in the undo part where the undo command is first created, then execute() is called on its root command. It sounds like movePieces() should NOT be calling execute() on any of the commands, it should only build the whole tree and return it, and also sounds like the caller of movePieces() should be calling execute() on the root of the tree, which no one does at the moment. If these assumptions are correct then the fact that the piece moves in the first place could be just a lucky coincidence which happens due to some side-effect. Maybe it doesn’t even finish the move properly but only appears to have finished moving while behind the curtain it is still stuck somewhere down the middle. Either way, there is definitely something wrong in that area, the MoveCommands should be processed equally, regardless of whether they are part of the initial move or part of the undo. Right now they are not processed equally at all.

One other thing that I’m afraid of, most test runs gave me the same results, same command structure, same undo. But some of the test runs, especially those where I had breakpoints that stopped the execution in the middle of the move, resulted in a completely different command structure, even though my input was always the same. In one test run I even had the move command become two completely separate commands, and the undo was also possible two times instead of just one. This smells like threading problems, race conditions. But first things first, the “regular” situation that happened in most of the test runs is already broken, that one should be fixed first, after that we can still look at these rare special cases.

One quick question though, the piece that is moved in this test module is an object of class “Stack”, which has an object of class “ReportState” as its child. I understand what a stack is, but what is a ReportState, is this some special kind of counter? Probably a very dumb question but please bear with me, I only learned half a year ago what a wargame and a Combat Result Table is, and when I see you guys write about doing things in the Vassal Editor I understand next to nothing.

Cattlesquat · May 29, 2020, 8:54pm

To your question, probably the best thing is for me to give you a quick tour – from your main vassal launcher, find the “Undo Bug” module, right click on it, and select “Edit Module.”

When the editor window opens up, open up the “Game Piece Prototype Definitions”. Within that, double click the “Blockade Marker”, which is the prototype for the little piece we’re tracking in this.

So then what you see listed are the piece’s “traits”. Some of them (e.g. Layer & the shadow prototype) just have to do with the visuals.

When the piece gets dragged to a new location, the first Trigger it counters is the “Detect Movement” trait. You can open that up and see that it then fires two more triggers, “SetBlockade” and “ReportOnMap”.

The SetBlockade one invokes the Set Global Property trait down below, which is what causes the change-global-property part of the command to get added. But SetBlockade also causes the “UpdateValue” trigger to fire, which in turn fires a “MoveMarker”. “MoveMarker” is the keystroke for that “Send To Location” trait up at the top, and that causes the piece to be moved immediately to the same location (just now to the “center” of that location).

Take a minute or two and click around between the traits, just to get the feel for it. Each time a “keystroke” comes through (“keystroke” being a slightly legacy and confusing name for these little triggers that fire), imagine it searching through all the traits from top to bottom to see if any of them are activated by this keystroke (so if the keystroke “MovedOnMap” is activated, as it is each time the player drags the piece somewhere, then it goes through the list and checks if anything in here activates on MovedOnMap, etc). And then some of the trigger traits cause other ones to fire off in a chain reaction, etc. Kind of like a primitive scripting language.

The processing of all these keystrokes within the piece, caroming around in there, all happens as a result of the keystroke-on-move thing toward the end of PieceMover.movePieces() – that’s what’s sending the “MovedOnMap” keystroke. And see at this point it already has a Command started, consisting of the its initial idea of what MovePiece stuff etc, is necessary to get the piece to move from the old spot to the place the player dragged it. And so now it appends any additional parts of the command that are generated by the processing of the MovedOnMap keystroke – so in this case the change to the global property and the SendToLocation second move.

Now as far as “Stacks” go, for pieces that have the stacking property enabled (this piece does), they go in what I think are in little linked listy things. On top will be a “piece” that represents the stack itself - it’s just an empty piece that doesn’t do anything else - and it then points to the first “actual piece” in the stack. In this case this little piece isn’t really stacking with anything so we’re not seeing multiple pieces in stacks and stuff, but that’s part of what the complexity of the PieceMover is all about – because we might be pulling a piece out of a stack with other pieces, or it might be alone. And then where we’re going to we might be needing to start a brand new stack just for this piece, or we might be needing to “merge” into an existing stack in the target location. And so it needs to generate “chunks of a command” in order to cause all that to happen.

Hopefully that’s a useful tour! You’ll certainly want to spend a little time getting familiar with what’s going on “up on top”. FWIW, I opened my very first VASSAL module back this most recent November

Brian

shilinski · May 29, 2020, 9:18pm

Just for my information, but what exactly is bug 12554? I assume it has something to do with Undo? I recently made a vmod so I could play the game Barrage solo, and I can produce an undo error so predictably that I am about to eliminate the Undo button as a precaution. And excuse my ignorance, but how do I look up bugs?

This reminds me of long ago when I worked at NASA, and I told the hardware engineer about some bugs I’d found. “Those aren’t bugs,” he explained. “Those are features.”

Cattlesquat · May 29, 2020, 9:42pm

Scroll to the (very) top of this page, and on the (very) top bar you will see “Tracker”. That takes you to the Bugzilla tracker for the project. You can then “Browse” bugs or search for things, etc, etc. It’s a typical bug-tracking format for the software industry, etc.

Meanwhile you can get directly to bug 12554 here: vassalengine.org/tracker/sho … i?id=12554

It is indeed an UNDO bug. Good to know you too have a reliable Undo bug repro – could I ask you to please describe it in detail? Does it by any chance involve situations where a piece the player has dragged somewhere ends up being moved a second time during the processing of the move?

Brian

Brent_Easton · May 29, 2020, 11:09pm

Large parts of the Vassal core is this way. You touch it and every thing breaks.

The general process of transmitting actions in Vassal is:

Do some stuff and encapsulate it into Commands.
Call GameModule.getGameModule().sendAndLog(commands) to send the commands to a log file and/or other players connected to the same online session.
The players at the other end (online or replaying log file) read the Commands, decode them and call the execute() method to make them happen on their client.

The assumption is that (3) makes the same things happen that (1) did in the first place.

The problem is in section 1 with the way the Commands are built that record the ‘doing of stuff’.

Ideally, the generating client will build a command that does the stuff, then call execute() at their end, then call sendAndLog() to send the exact same Command to the other end.

However, what you find happening in a lot of the very old code is this:

Do some stuff.
Build a Command that represents that stuff but don’t call execute() because we have already done the stuff.
Call sendAndLog() to send the Command off.

You can see there is a problem if the Command built in (2) doesn’t actually do all the stuff that got done in (1). This can happen where the old code in section (1) is changed, but the commands built in section (2) aren’t updated to match. These problems are usually pretty obvious - online clients and log file replayers don’t see the same thing that the source client saw.

Undo is handled by each Command individually. When a Command is built, it builds it’s own Undo command that is supposed to exactly reverse its own action. When you click Undo, then the Undo Command from the last logged Command is retrieved, its execute() method is called and then it is sent to sendAndLog() to let everyone else see it.

Potential problems with Undo are:

The Undo Command built by a Command doesn’t actually exactly Undo the Commands original stuff.
In a Chained command sequence, All of the sub-command undo Commands need to be executed and it all has to happen in the exact reverse order that they where issued in.

NullCommand are just NOP’s added in by calls that need to return a Command, but have nothing to do. They are also often created as the first Command of Sequence to ensure there is at least one Command in the List. A NullCommand has no Undo.

Hope this helps.

Brent_Easton · May 29, 2020, 11:38pm

Brian,

The AddPiece Command is very low level, it isn’t doing very much, it is just adding the piece to the GameState and setting it’s state, which includes it’s position BUT NOT its previous location, this saved in Properties in the piece, but is only encoded in the MovePiece undo command that moved it there.

I think you will find that the MovePiece Commands do this anyway, so I am not sure why that extra AddPiece at the end is even needed.

I would concentrate on the MovePiece commands as their Undo Command is the one that is apparently failing.

Cattlesquat · May 29, 2020, 11:39pm

Here we definitely have “Potential Problem #1”. It is actually happening in exactly the reverse order (I stepped through and watched it), it’s just that reverse-order turns out not to be symmetrical because an insufficient command string was built in the first place.

I’m just aghast at what it might potentially take to make it work right (see also: “You touch it and everything breaks.”)

Brent_Easton · May 29, 2020, 11:48pm

Welcome to my world

Brent_Easton · May 30, 2020, 12:24am

There are some other approached to this problem.

This problem is being caused by a ‘Apply after Move’ key causing a drag and dropped unit to move again before the first move has completed.

Are we aware of any other Undo problems that are NOT caused by this? I suspect not.

Apply After Move was a relatively late addition and the drag’n’drop moves appear to be fine if no ‘Apply After Move’ is involved (Rule number 1 - Touch it and everything breaks).

It would be an interesting experiment to see if you can reproduce the same bug using Send To Location trait’s. I suspect not as the coding is far simpler, it creates a single set of Commands that are executed to perform the move. And, importantly, the Apply After Move Command is added after the Move has completed.

Rather than trying to fix the commands created by the PieceMover, which works fine if no Apply After Move key occurs while it is running, it may be easier to defer the ‘Apply After move’ key execution until after the move completes.

shilinski · May 30, 2020, 12:54am

My version of the Undo problem is different. No dragging and dropping seems to be involved. No stacking either. It involves sending pieces from zones to regions.

A player window has 6 zones. Each zone is numbered ‘1’ to ‘6’. In each zone there is a corresponding region, “r1” to “r6.” I press on button on the toolbar, which sends a global key command to all pieces in the window. The command causes every piece in a zone to go to a single defined region in the next zone.

Specifically, each piece in a zone uses send-to a region = “r” + CurrentZone + 1. So a piece in zone ‘1’ goes to region ‘r2’ in zone 2. I suppose I could try zone to zone, but I haven’t. Mine are all zone-to-region.

(The 6 zones are pie slices in a cylinder. I divided the “pie” into 6 slices. The effect is all pieces on the pie advance to their next zone in a clockwise fashion. Don’t worry about what happens after 6.)

If I put a single piece in a zone, nothing goes wrong with undo’s that I have detected.
But suppose I put a piece in zone 1 and another in zone 2. I advance once. I undo. It works, but now everything goes wrong.
If I advance them again via the GKC, they both end up stacked in zone 3. The piece in zone 1 skips to 3. I put a key command in the pieces so if I right clicked them, they would report their current zone. The piece in zone 1 said “I am in zone 1,” but it still went to zone 3. This seems to mean that when the GKC calculates the next destination, it’s not using CurrentZone as it is suppose to, or CurrentZone is not what the piece said it was at the instant of calculation.
And to make things worse, whatever the undo poisoned in the red player’s window also migrated to the other player window. I got the same behavior in those windows without doing an undo.

This problem is guaranteed reproducible in my case.

shilinski · May 30, 2020, 1:00am

Oh, and be careful about stacking send-to’s in a trigger trait. Things can go awry there too with pieces not going where intended, and you might wrongly conclude it’s involved with the undo problem.

Brent_Easton · May 30, 2020, 1:13am

Can you send me a log file showing the issue and I will look into this one while the others keep working on the Move after Apply case.
Cheers.

Brent_Easton · May 30, 2020, 2:44am

Did you mean stacking Send-to’s or stacking ‘Move Fixed Distance’?

I couldn’t find any problem with stacked Send-to’s in a Trigger, but Stacked ‘Move Fixed Distance’ commands in a Trigger are totally borked.

shilinski · May 30, 2020, 4:04am

I don’t remember. About a year ago, a guy had problems. He was simulating an airplane. He had these discrete commands to “move 3 north”, “move 2 East”, etc, and he wanted to put them together in triggers to simulate complex maneuvers. I discovered when in a trigger, vassal did not update the piece’s state after each discrete move. The plane never ended up where he expected. I can’t believe I didn’t try send-to’s, and I remember we never found a solution using his approach. Maybe I can work up a test module.

I sent you some private messages about my other problem, but maybe you were asking about this one.

shilinski · May 30, 2020, 4:27am

I just checked out stacked send-to’s in a trigger by offsetting CurrentX and CurrentY–and it worked correctly. It must have been “move fixed distance.” My error. I wish I remembered what I was thinking back then.

Cattlesquat · May 30, 2020, 5:00am

Hi Brent,

By this:

Do you mean “execute” the piece move’s move command and then generate (and execute) a second separate command with the apply-after-move stuff?

One side effect of this would be that move commands would undo in two steps, right? First undo would undo apply-after-move and second click of undo would undo the actual primary move? Because undo reversed the (single) most recent command chain.

Or were you thinking of a different approach?

Brian

Brent_Easton · May 30, 2020, 5:59am

Not as a second separate command. You are correct, this would require 2 Undo’s to undo. Either appended last to the original command stream, or create a new Command with the results of the Move appended first and the Apply after Move second.