CPU profiling

Flint1b · July 30, 2020, 11:52pm

What does this mean for the overall CPU time, for PR 75, what is a Trait stack, what is a “outermost piece”… I don’t know the internals well enough, I’m afraid this is beyond me.

uckelman · July 31, 2020, 12:09am

Thus spake Flint1b:

What does this mean for the overall CPU time, for PR 75, what is a Trait
stack, what is a “outermost piece”… I don’t know the internals well
enough, I’m afraid this is beyond me.

Each trait class except for BasicPiece wraps another trait class, which
is the “inner” piece. (That’s Decorator.inner, literally.)

So if you have, say, a piece that can has two sides and can be marked
moved, you might have this:

MovementMarkable → Embellishment → BasicPiece

The instance of MM is the “outermost”. It has some properties, but
delegates inward to Embellishment any properties its asked about but
doesn’t have itself, and Embellishment does the same thing. If you
get to the bottom and BasicPiece doesn’t have the property, then nothing
does and a null gets returned.

Ask Brian how many traits he has on some of the pieces in his modules. It’s
more than three.

This is a totally inappropriate design once you start getting traits more
than a few deep. Just do do a simple thing like get the value of a string,
you may have to go through all the layers one at a time, and that can be
horribly slow. This is the main reason why operations on pieces take
ages sometimes.

The solution to this problem is what I was talking about in the post
linked below:

This all orthogonal to PR 75. I’ll see about merging that on Friday.

–
J.

Flint1b · July 31, 2020, 12:34am

Oh my, that’s some good design.

Would it be possible to just put these traits with their properties into a list?

Something like this:

interface TraitProperty<T> {
  String getName();
  T getValue();
}

interface Trait {
  String getName();
  List<TraitProperty<?>> getTraitProperties();
}

interface GamePiece {
  List<Trait> getTraits();
  Object getTraitPropertyValue(String traitName, String traitPropertyName);
}

Or with classes or enums instead of String traitName…

Why was the Decorator pattern chosen anyways, it’s good for hiding what’s inside from the outside world, why do traits need to hide the other traits underneath them?

Brent_Easton · July 31, 2020, 1:11am

Decorator was chosen because the original author had read the Design Pattern book and thought it sounded like a good idea.

Conceptually, Decorator is actually an excellent match, because each trait potentially draws something on top of and hides or modifies the drawing of Decorators below it.

Unfortunately, instead of just using this for the visual drawing pipeline of the pieces, everything non-drawing related also got sucked into the Decorator design including property handling, status information (like Does Not Stack), Command & Control (Triggers), Reporting (Report State) etc.

In the early days, this wasn’t really a problem as the number of traits per counter was relatively small. As more and more functionality was crammed into Decorators, the efficiency problem has been getting steadily worse, regardless of the increase in CPU power over that time.

There is no shortcut to fix this that won’t completely break every module currently in existence. In my book, this is the fundamental, unfixable flaw. In one fell swoop it locks the core behavior into an unsustainable, inefficient design that also tightly binds the Model, View and Control into an untangle-able knot.

Unfortunately, flattening out the Decorator pattern to a List is not really going to help performance as the entire flow of control of all information will still have to pass through every trait. Joel’s mod is an attempt to short-circuit trait stack search for get/set property calls be essentially indexing direct links to the traits by property name.

The best we can do is remedial work in two main ways:

Short-circuit the trait stack traversals where possible. (Joel’s property work)
Locate and optimise expensive operations (PieceCloner fix).
Try and prevent unnecessary Decorator traversals in the first place.

BTW, I thought Joel’s and you results on PR#75 where comparable. Both showed a 40-50% improvement.

uckelman · July 31, 2020, 1:12am

Thus spake Flint1b:

Would it be possible to just put these traits with their properties into
a list?

O(n) access is not the best you can do for this. A map giving O(1) access
would be better.

Go reread the post I linked to further up this thread. It is possible and
I did some of the design work for it about 10 years ago.

Something like this:

Code:

interface TraitProperty {
String getName();
T getValue();
}

interface Trait {
String getName();
List<TraitProperty<?>> getTraitProperties();
}

interface GamePiece {
List getTraits();
Object getTraitPropertyValue(String traitName, String
traitPropertyName);
}

You’ll find something similar to this in the VASSAL.property package
already.

Or with classes or enums instead of String traitName…

Enums unfortunately don’t work so well for this because custom traits
will need to define more properties.

Why was the Decorator pattern chosen anyways, it’s good for hiding
what’s inside from the outside world, why do traits need to hide the
other traits underneath them?

It’s not to hide the traits. The purpose of it is to build up the pieces
in a particular order, in layers. Order matters. E.g., a rotation trait
rotates what’s inner with respect to it, so moving it in the order rotates
different stuff.

This is ok-ish as a design concept for module designers for thinking about
how pieces are assembled, but it’s not a good idea to have the code directly
mirror that. It was really easy for Rodney to implement initially, and I
imagine that he never expected anyone to have traits more than a few deep.

–
J.

Brent_Easton · July 31, 2020, 1:19am

If BasicPiece doesn’t have the property, then it passed to the Current Zone to resolve as a zone property, then to the Current Map to resolve as a Map level property, then to the GameModule to resolve as a top-level Global Property.

So,

Global Properties are resolved on GamePieces by first searching the full trait stack. There’s a real low-hanging fruit. The only advantage of searching through the trait stack for them is if a trait ‘over-rides’ a Global Property name for some reason. This would normally be a bug.
There are a large number of properties satisfied by BasicPiece that will greatly benefit from Joel’s approach.
BasicPiece also has a scratch-pad HashMap for property storage. Any SetProperty call that reaches BasicPiece stores into this HashMap. These property values are ephemeral and are not encoded in the GameState. This HashMap is used by several traits for temporary storage and also by the map-level GUI to store some state. The SELECTED property used to be stored here, but years ago I did some work to store it in the top-most Decorator as it is essentially a ‘global’ variable for the GamePiece. You will see some special code handling for this in Decorator that may need to be reworked into the new scheme.
There are some properties that have no ‘home’, but are built up piece by piece from all traits. PieceName, visibleState

Rgds.

Flint1b · July 31, 2020, 9:57am

Well you know the internals better than I do, and figured out the improvements. Let’s get working. Can I help somehow?

uckelman · July 31, 2020, 10:41am

Let me type up a description of what needs to be done later today. This is a task where a similar change will be applied to each class which implements GamePiece, but it’s entirely internal to each of those classes and won’t affect their behavior—so the bulk of the changes can be done in parallel and we could split them up and not step on each others’ toes.

uckelman · July 31, 2020, 8:39pm

Here’s a sketch of what we need to do for flattening GamePiece property access:

// We're turning get/set for each property into lambdas, so we need a
// single-method interface for each 
interface PropertyGetter {
  Object get(Object key);
};

interface PropertySetter {
  void set(Object key, Object value);
}

// Add to GamePiece:

  java.util.Map<String, PropertyGetter> getPropertyGetter();
  java.util.Map<String, PropertySetter> getPropertySetter();

// Implementation for Decorator

  // It would be great if this could be a constant static map 
  // with the accessors and lambdas having an argument for the piece;
  // that way, we could have one of these per class rather than one per
  // instance. Unfortunately, if we're going to use the accessors for
  // direct access, we need with them the trait which they will access,
  // so one per instance looks necessary.
  //
  // It would also be great if we could have ONE map rather than two,
  // but unfortunately there are some properties where the get and set
  // are not at the same level. Which may be necessary, or may be something
  // we can rectify.

  // The contents of lgmap are extracted from the old getProperty();
  // "l" is for "local". It's proably useful to compare the contents of this
  // map with what's in Decorator.getProperty() now, as that's the
  // source of it.
  private final java.util.Map<Object, PropertyGetter> lgmap = java.util.Map.ofEntries(
    java.util.Map.entry(
      Properties.KEY_COMMANDS,
      k -> getKeyCommands();
    ),
    java.util.Map.entry(
      Properties.INNER,
      k -> piece
    ),
    java.util.Map.entry( 
      Properties.OUTER,
      k -> dec
    ),
    java.util.Map.entry(
      Properties.VISIBLE_STATE,
      k -> myGetState() + piece.getProperty(Properties.VISIBLE_STATE),
    ),
    java.util.Map.entry(
      Properties.SELECTED,
      k -> selected
    )
  );

  // Isn't it annoying that the best name for one of the items we deal with
  // is "map" when the interface is also called "Map"?

  private final PropertyGetter default_getter = k -> piece.getProperty(k);

  // The contents of lsmap are extracted from the old setProperty()
  private final java.util.Map<Object, PropertySetter> lsmap = java.util.Map.ofEntries(
    java.util.Map.entry(
      Properties.INNER,
      (k, v) -> setInner(v)
    ),
    java.util.Map.entry(
      Properties.OUTER,
      (k ,v) -> dec = (Decorator) v
    ),
    java.util.Map.entry(
      Properties.SELECTED,
      (k, v) -> {
        setSelected(v instanceof Boolean ? v : false);
        piece.setProperty(Properties.SELECTED, v);
      }
    )
  );

  private final PropertySetter default_setter = (k, v) -> piece.setProperty(k, v);

  // getPropertyGetters(), getPropertySetters() should return a modifiable
  // map (i.e., NOT lgmap, lsmap which are unmodifiable) so that we can add to the
  // map from the bottom up:

  public java.util.Map<Object, PropertyGetters> getPropertyGetters() {
    java.util.Map<Object, PropertyGetters> m = piece.getPropertyGetters();
    m.putAll(lgmap);
    return m;
  }

  public java.util.Map<Object, PropertySetters> getPropertySetters() {
    java.util.Map<Object, PropertySetters> m = piece.getPropertySetters();
    m.putAll(lsmap);
    return m;
  }

  // Anything which is terminal (I think that's only BasicPiece, all the
  // other GamePieces should be Decorators?) would implement it like this:
  
  public java.util.Map<Object, PropertyGetter> getPropertyGetters() {
    return new HashMap<>(lgmap);
  }

  public java.util.Map<Object, PropertySetter> getPropertySetters() {
    return new HashMap<>(lsmap);
  }

  // This will work for property access, but won't do any flattening:
  // It might still be faster, though, since the lookups at each stage
  // are O(1) instead of O(n)---that depends on what the constant is...

  private java.util.Map<Object, PropertyGetter> gmap = getPropertyGetters(); 
  private java.util.Map<Object, PropertySetter> smap = getPropertySetters(); 

  public Object getProperty(Object key) {
    return gmap.getOrDefault(key, default_getter).get(key);
  }

  public void setProperty(Object key, Object val) {
    smap.getOrDefault(key, default_setter).set(key, val); 
  }

  // So... for flattening property access. 
  //
  // The outermost trait must use the maps returned by getPropertyGetters()
  // and getPropertySetters() to get and set properties. That's the point
  // of all of this and how we achieve flat property access.
  //
  // We _could_ store the result of getPropertyGetters() and 
  // getPropertySetters() at each level, but that is potentially massively
  // expensive in memory---it's quadratic in the number of properties in a
  // trait stack. Hence, it would be better not to do that. This is the tricky
  // bit, and I'm not entirely sure of the best way.
  // 
  // The following bit of code would work so long as we could be sure it ran
  // only after the whole trait stack was constructed:
  
  if (this == getOutermost(this)) {
    // we are the top of the stack, resolve properties using the flat maps
    gmap = getPropertyGetters();
    smap = getPropertySetters(); 
  }
  else {
    // we are an inner trait, resolve properties using our local map
    gmap = lgmap;
    smap = lsmap; 
  }

  // As I said, I'm not certain where precisely this should go, but there
  // will be a suitable place somewhere, or perhaps a slightly different
  // method---e.g., if we had a new pass-through trait that did nothing but
  // served as the top-level wrapper, that class could do the job. I don't
  // know what will be simplest.

It should be possible to convert all the getProperty() and setProperty() methods in the classes which implement GamePiece fairly straightforwardly and without much risk of breakage.

Flint1b · July 31, 2020, 9:16pm

I think I understand this, roughly.

Do you want to start a draft PR and implement the main structure and I’ll help by filling in the implementation in the various trait classes?

About the naming, can we use regular Java syntax i.e. camelCase, getterMap setterMap localGetterMap localSetterMap instead of gmap smap lgmap lsmap and C-style “variable_name”?

uckelman · July 31, 2020, 10:16pm

Thus spake Flint1b:

I think I understand this, roughly.

Do you want to start a draft PR and implement the main structure and
I’ll help by filling in the implementation in the various trait classes?

I’d like to get thoughts (Brent’s especially) on the simplest way we can
handle making the flattened maps available to the outermost GamePiece.
That’s the tricky part, and there may be pitfalls or ways of doing it I’m
not considering.

About the naming, can we use regular Java syntax i.e. camelCase,
getterMap setterMap localGetterMap localSetterMap instead of gmap smap
lgmap lsmap and C-style “variable_name”?

Yeah, fine.

All of the local changes—extracting the lambdas for the local maps and
using those in getProperty() and setProperty()—could go ahead now, on
the assumption that we’ll find a good place to set the outermost maps.

–
J.

Brent_Easton · August 1, 2020, 3:07am

I am still digesting the design. My concerns relate to how the ‘exceptions’ to a standard trait property will be handled. e.g. Global/Map/Zone properties, scratch-pad properties in BasicPiece and the ‘assembled’ properties pieceName and visibleState. Also need to look carefully at Prototype Definitions which are fragments of a unit, and unit definitions with unexpanded prototype definitions. I don’t expect problems with these, but need to be checked.

When I was working on the SELECT property, the hardest part was working out how and where to store it, so it ends up stored in every Decorator.

I really like the idea of implementing a top-hat trait that we place on top of each GamePiece when it is created so that every single unit has one. There are very few places where unit GamePieces (as opposed to Stacks or Decks) get created (PieceCloner may be it). I think it would be quite simple to add the top-hat at this point, if the top trait is not already a top-hat. Once created, they will just get saved in the log file like a normal trait. This would make implementation of this scheme, and any further optimisations vastly easier. It think it would be a mistake to try and implement something this complex inside the Decorator class.

We could pre-process the unit at this point and do other optimisations like setting a direct link to the Top-Hat and BasicPiece inside each Decorator, instead of the insane Decorator.getOutermost(this) and Decorator.getInnermost(this) loops.

Flint1b · August 1, 2020, 8:56am

In a module, is it possible to write a custom class for a trait that has all the properties that a gamepiece needs, instead of building a trait stack out of the builtin trait classes?

uckelman · August 1, 2020, 11:14am

Thus spake Flint1b:

In a module, is it possible to write a custom class for a trait that has
all the properties that a gamepiece needs, instead of building a trait
stack out of the builtin trait classes?

That depends on what you mean by “all the properties that a gamepiece
needs”.

If you mean a custom class for one particular kind of piece in your module,
sure, but it’s common to have dozens of different kinds of pieces so you’d
end up with a class for each and it would be a huge effort to start and a
maintenance nightmare thereafter.

If you mean a custsom class that can handle any kind of piece, sure, it’s
computable, I don’t see why not. It’s what I intend to build for V4. The
problems you’d encounter in V3 with doing that are that you’d be
replicating essentially all of the VASSAL.counter package inside that
class if you wanted to maintain current behaivor, and you’d also have to
do something about the module editor to make the custom class usable
there.

In both cases you wouldn’t pick up changes we make to VASSAL.counter
upstream. I think it would be a mess, and wouldn’t recommend it.

–
J.

uckelman · August 1, 2020, 11:17am

Thus spake Brent Easton:

I really like the idea of implementing a top-hat trait that we place on
top of each GamePiece when it is created so that every single unit has
one. There are very few places where unit GamePieces (as opposed to
Stacks or Decks) get created (PieceCloner may be it). I think it would
be quite simple to add the top-hat at this point, if the top trait is
not already a top-hat. Once created, they will just get saved in the log
file like a normal trait. This would make implementation of this scheme,
and any further optimisations vastly easier. It think it would be a
mistake to try and implement something this complex inside the Decorator
class.

If a top-hat trait is always present and has no properties of its own,
we could just peel it off when saving and never write it out.

–
J.

Flint1b · August 1, 2020, 11:45am

What if…

We refactor the guts of each trait class into a kind of abstract trait scheme, and have the current trait classes reference these abstract traits to stay backwards-compatible,

then add a new kind of trait class, a “universal trait”, which has a list of these abstract traits, and the module designer can add the traits that he wants to that umbrella trait, just like wrapping traits inside each other but now in an ordered list?

uckelman · August 1, 2020, 12:09pm

Thus spake Flint1b:

[This message has been edited.]

What if…

We refactor the guts of each trait class into a kind of abstract trait
scheme, and have the current trait classes reference these abstract
traits to stay backwards-compatible,

then add a new kind of trait class, a “universal trait”, which has a
list of these abstract traits, and the module designer can add the
traits that he wants to that umbrella trait, just like wrapping traits
inside each other but now in an ordered list?

I suspected this was where you were headed.

It would be far, far more work than adding a top-level wrapper trait
and you’d still have to traverse the traits in the same cases you have
to traverse the traits now.

–
J.

Flint1b · August 1, 2020, 1:39pm

That’s true, just the traversal would be inside a single object, there wouldn’t be several trait objects involved. And it would traverse in an iterative manner over a list and not in a kind-of recursive manner, this would save a lot of overhead. You’re C++ developer I don’t have to tell you how expensive it is to create objects and stackframes for function calls

And another question. I understand that currently traits are traversed piece by piece, in a single thread. I think there is a lot of potential for parallel processing here, if it was possible to collect all pieces in a List first, then do a List.parallelstream(). As I already said in another thread, modern Java applications use all CPU cores to get things done, while Vassal only uses 1 core.

uckelman · August 1, 2020, 2:11pm

Brent, where are pieces created when a saved game or log is loaded? PieceCloner can’t be used for that because there won’t be pieces to clone yet, right?

uckelman · August 1, 2020, 2:47pm

Thus spake Flint1b:

That’s true, just the traversal would be inside a single object, there
wouldn’t be several trait objects involved. And it would traverse in an
iterative manner over a list and not in a kind-of recursive manner, this
would save a lot of overhead. You’re C++ developer I don’t have to tell
you how expensive it is to create objects and stackframes for function
calls

There’s more to it than just property access. Traits delegate inward pretty
much everything they don’t handle locally. If you’re planning to have all
the traversal happen over the list, then you’re looking at rewriting all the
traits.

Based on profiling, property access is by far the most expensive thing
in the trait stacks and flattening it isn’t so complicated or risky. It’s
low-hanging fruit, and we may find that things are fast enough after
dealing with it.

I want to try this, as it’s something I’ve had on the table for a decade
now and looks like it will be a small effort for potentially a huge gain.
Rewriting all the traits, on the other hand, looks to me like throwing
good time after bad. I’d rather spend the time I could put into rewriting
traits into V4.

And another question. I understand that currently traits are traversed
piece by piece, in a single thread. I think there is a lot of potential
for parallel processing here, if it was possible to collect all pieces
in a List first, then do a List.parallelstream(). As I already said in
another thread, modern Java applications use all CPU cores to get things
done, while Vassal only uses 1 core.

Well, it depends heavily on what it is you’re doing. Some operations could
be parallelized, some not. If you have a group of pieces which all need to
be flipped to their next face, you could do that in parallel (though the
operations for that are going to be so short I’d expect the threading
machinery to more than eat up all your gains). If you have a group of pieces
which all need to have some global key command executed on them which then
trigers five other things etc.—good luck figuring out programatically if
that can be parallelized.

Furthermore, modern Java applications also don’t have a GUI for the most
part, and virtually every operation you could do on game state will affect
the GUI, so it will be rough going without model-view separation. Again,
I think it’s not worth the effort on the current codebase.

After the chatter issue is merged, I’d like to put out a 3.3.3 beta. We
can aim for having flattened property access in 3.3.4 if it works out.

–
J.