Friday Fun #11

Welcome to Friday Fun, the blog where I poke my head up from underground, let you all know I’m still alive, and hopefully entertain you with some bits about my project progress and game development experience.

Lua Performance

Unsurprisingly, all of the script interpretation now going on in Dehoarder 2 takes its toll on performance, both in frame update time and in memory usage. Moonsharp may be the start of my answer to scripting, but it is not the end. The engine that Moonsharp supplies will need to be tuned to our needs in Dehoarder 2. Luckily it is open source, so that is possible. Undaunted by the fact that I didn’t write it, I rolled up my sleeves and dove in to see where gains could be made.

One of the things that I noticed right away is that each script thread being created was consuming over 2MB of memory before any script code was ever run! Mind, these are not complex scripts that I am running. Some investigation under the covers revealed that Moonsharp creates an execution stack and a data stack for each thread, and the size of each of these stacks is hard-coded at 131,072 (2^17) entries (131,072 entries * 8 bytes * 2 stacks = 2MB). A quick test revealed that I was using a bit under 4,000 data stack entries during the Database Building phase, but far fewer than ten for Story Events, and far fewer than ten execution stack entries in all cases. So one of my first customizations to Moonsharp was to make these stack sizes injectable when creating a script. Since the 2MB hit doesn’t really matter during the Database Building phase, I left the stack sizes for those scripts alone, but for Story Events I defaulted to 16 entries for each stack, which comes to 256 bytes allocated for the stacks, a memory savings of 99.988%! I don’t really anticipate any Story Event scripts blowing past those limits, but just in case, I allow the default stack sizes to be overridden per script by technical settings in the definition of the Story Event.
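For anyone who wants to check the stack math, here is a quick sketch in Python (not the game’s C#). It assumes 8 bytes per stack entry, as in the figures above; the function and constant names are made up for illustration.

```python
# Back-of-the-envelope check of the per-thread stack memory figures.
# Assumption: every stack entry costs 8 bytes, as in the arithmetic above.
ENTRY_BYTES = 8
STACKS_PER_THREAD = 2  # one execution stack plus one data stack


def stack_bytes(entries_per_stack: int) -> int:
    """Bytes reserved for both stacks of a single script thread."""
    return entries_per_stack * ENTRY_BYTES * STACKS_PER_THREAD


default_bytes = stack_bytes(2 ** 17)  # the hard-coded default size
story_event_bytes = stack_bytes(16)   # the reduced Story Event size
savings_pct = 100 * (1 - story_event_bytes / default_bytes)

print(default_bytes, story_event_bytes, f"{savings_pct:.3f}%")
# prints: 2097152 256 99.988%
```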

I was also able to optimize the memory layout of Moonsharp’s DynValue type, which is the core data type for holding values in Lua. This was an obvious target for optimization, as hundreds or even thousands of instances of it are created when running Lua scripts. My deep knowledge of C# and .NET really shone through here, and I was able to decrease the heap size of a DynValue from 112 bytes all the way down to 40 bytes (corrected from my original figure of 32) through a combination of techniques and tricks:

First, the .NET CLR still does not do anything to pack the memory layout of a class; it lays it out in the order you define it in the code, but it will align things on two, four, and eight byte boundaries to make things faster and more convenient for the CPU. So you can end up with significant gaps in the memory layout of the class. To optimize for this, you will generally want to count bytes and ensure that eight-byte values fall on eight-byte boundaries, four-byte values fall on four-byte boundaries, and so forth. Sorting fields in code from largest to smallest byte size generally works well to accomplish this.
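The effect of field order on padding is easy to see in a small sketch. This one uses Python’s ctypes, which follows the platform C compiler’s alignment rules rather than the CLR’s, so treat it as an illustration of the principle only. The field names are hypothetical and are not DynValue’s real members.

```python
import ctypes


class Unsorted(ctypes.Structure):
    # Fields declared in an arbitrary order; padding is inserted so that
    # each field starts on a multiple of its own alignment.
    _fields_ = [
        ("flag", ctypes.c_uint8),     # 1 byte, followed by padding
        ("number", ctypes.c_double),  # 8 bytes
        ("kind", ctypes.c_int32),     # 4 bytes, followed by padding
        ("handle", ctypes.c_uint64),  # 8 bytes
        ("small", ctypes.c_int16),    # 2 bytes, plus tail padding
    ]


class Sorted(ctypes.Structure):
    # The same fields, ordered from largest to smallest.
    _fields_ = [
        ("number", ctypes.c_double),
        ("handle", ctypes.c_uint64),
        ("kind", ctypes.c_int32),
        ("small", ctypes.c_int16),
        ("flag", ctypes.c_uint8),
    ]


# On a typical 64-bit platform this reports 40 bytes versus 24 bytes,
# even though both structures carry the same 23 bytes of actual data.
print(ctypes.sizeof(Unsorted), ctypes.sizeof(Sorted))
print(Unsorted.number.offset, Sorted.number.offset)
```

The 23 bytes of real data never change; only the gaps between fields do, which is why counting bytes and reordering pays off.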

The other big memory optimization comes from knowing that a DynValue will hold either an 8-byte double precision floating point number or an 8-byte reference to an object, but never both at the same time. Knowing this, we can take control of the layout of the class members (dangerous!) and overlap the number and object reference fields (very dangerous!). It’s easy to get in trouble with this if you don’t know what you are doing (or even if you do), which is why it is often not done. I added a few guard clauses to prevent unsafe access to these overlapped fields, because things can get crashy if you try to interpret a number as an object reference. (This turned out to be too dangerous after all, and I should have realized it. While this trick may work with numeric types, it will never work with object references no matter how careful you are, because the garbage collector will invariably access the invalid object reference, which WILL crash you.)
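To show what overlapping two fields means, here is a sketch using a ctypes union in Python. It is an analogy, not the Moonsharp change itself: in C# the usual tool for this is StructLayout(LayoutKind.Explicit) with FieldOffset, and the "handle" below is just a plain integer standing in for an object reference. All names are hypothetical. Because nothing here is traced by a garbage collector, the sketch can only show the misinterpretation problem, not the crash.

```python
import ctypes


class Payload(ctypes.Union):
    # Both members start at offset 0 and share the same 8 bytes.
    _fields_ = [
        ("number", ctypes.c_double),
        ("handle", ctypes.c_uint64),  # stand-in for an object reference
    ]


class Value(ctypes.Structure):
    _fields_ = [("payload", Payload), ("is_number", ctypes.c_bool)]


def set_number(v: Value, x: float) -> None:
    v.payload.number = x
    v.is_number = True


def get_handle(v: Value) -> int:
    # Guard clause: refuse to reinterpret a number's bits as a handle.
    if v.is_number:
        raise TypeError("value holds a number, not a handle")
    return v.payload.handle


v = Value()
set_number(v, 1.0)

# Bypassing the guard shows why it exists: the "handle" is just the
# IEEE 754 bit pattern of 1.0, which points nowhere meaningful.
raw_bits = v.payload.handle
print(ctypes.sizeof(Payload), hex(raw_bits))
# prints: 8 0x3ff0000000000000
```

Guard clauses like get_handle only protect your own code paths. A garbage collector scanning the object does not go through them, which matches the failure described above.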

A third optimization was to get rid of some unused state. DynValue keeps a non-guaranteed-unique ID around for debugging display purposes. Since this value would be of limited worth even if we were using the debugging module, I’ve sidelined it for now.

Finally, I was able to squeeze out a couple more bytes by specifying that the DataType enumeration be based on a 16-bit integer rather than the default 32-bit integer.
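One caveat worth illustrating: shrinking a type tag only pays off when the freed bytes are not simply swallowed by padding. The sketch below again uses Python’s ctypes as a stand-in for native layout rules, with hypothetical field names that are not DynValue’s actual members.

```python
import ctypes


class WideTag(ctypes.Structure):
    _fields_ = [
        ("number", ctypes.c_double),    # 8 bytes
        ("data_type", ctypes.c_int32),  # 4-byte type tag
        ("hash", ctypes.c_int32),       # 4 bytes
        ("flags", ctypes.c_int16),      # 2 bytes, then tail padding
    ]


class NarrowTag(ctypes.Structure):
    _fields_ = [
        ("number", ctypes.c_double),    # 8 bytes
        ("hash", ctypes.c_int32),       # 4 bytes
        ("data_type", ctypes.c_int16),  # 2-byte type tag
        ("flags", ctypes.c_int16),      # 2 bytes, fills the gap exactly
    ]


class LoneWide(ctypes.Structure):
    _fields_ = [("number", ctypes.c_double), ("data_type", ctypes.c_int32)]


class LoneNarrow(ctypes.Structure):
    _fields_ = [("number", ctypes.c_double), ("data_type", ctypes.c_int16)]


# The narrow tag helps when another small field can share its slot...
print(ctypes.sizeof(WideTag), ctypes.sizeof(NarrowTag))
# ...but changes nothing when padding absorbs the difference anyway.
print(ctypes.sizeof(LoneWide), ctypes.sizeof(LoneNarrow))
```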

Performance also caused me to alter the Dehoarder 2 API a bit. Initially I had intended for all of the available functions of the game and all object factories to be available as globals, just to keep things as simple as possible. However, injecting dozens of globals into every script context was just taking too much time. So now just the game object, a factory object and a handful of other objects are exposed to Story Event scripts as globals. I also figured out how to implement my Wait() function, which does the all-important coroutine yield, in C# rather than Lua, so that it no longer has to be compiled and injected into every script. That saves a millisecond or so with every script loaded.

I also got pre-compiled Lua working, which about triples the performance of loading Lua scripts. For now, knowing the facility is there and ready to be used is good enough; I will need to do a little work to integrate script pre-compilation into my build/debug workflow.

Many of these optimizations are usually only worthwhile when you actually encounter performance issues, as in my case. Running a script 41 times to update 41 objects was initially taking over two seconds and allocating over 80MB, obviously a problem. Through these optimizations and others, I was able to bring the same workload down to 4.2 milliseconds and 115.6KB of allocation, and I still have more room for improvement.

At some point I would like to contribute these improvements back, but for now I must focus on survival and getting this game out the door. I have more ideas, too, like a gamedev-oriented Lua variant that provides 64-bit integer and 32-bit floating point numeric types instead of 64-bit floating point types. That would be a better fit for game development, both in compatibility and in performance, and it would avoid all of that casting of numbers between types, which really does not sit well with me.

Gardens and Plants are now Furnishings

Of course, I only found out that our Lua performance needed to be improved because I had started consolidating entities in the game’s engine and pushing the special cases out to the Lua code. Gardens and Plants are a prime example of where that effort is needed.

In the early design of Dehoarder 2, a more naive approach was taken where objects with special behavior would become their own entity type. However, this results in base code that is a real pain to extend, and can never really be closed to modifications. Learning from my earlier design, I am consolidating into as few major entities as possible. I have the Junk entity to represent a physics-based interactive object and the Furnishing entity to represent a statically placed interactive object. I have a Vermin entity to describe the vermin, and a Prop entity to describe non-interactive objects. I also had Garden and Plant entities, but I realized that these were just special cases of Furnishing entities that could have their special behavior provided in script.

The Dehoarder 2 API consequently gained a few more features, as did gameplay.

There is now a standard way for furnishings to subscribe to periodic changes in game time using the onTimeAdvancedInterval and onTimeAdvancedEventId properties. With the way that game time works, you can subscribe for an update, say, every 5 game minutes, and you will get notified on the first frame on which the game time passes that interval. When you are notified, the target object has a LastUpdate counter variable available that you can use to determine exactly how much time has passed (which may be more than the configured interval, because some actions can cause a large amount of game time to pass all at once).
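The notification rule can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not the Dehoarder 2 implementation: GameClock, Subscription, subscribe and advance are hypothetical names, and the elapsed time is handed straight to the callback rather than read from a LastUpdate variable on the target object.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Subscription:
    interval: float                    # game minutes between notifications
    callback: Callable[[float], None]  # receives the elapsed game minutes
    last_update: float                 # game time of the last notification


class GameClock:
    def __init__(self) -> None:
        self.now = 0.0
        self.subscriptions: List[Subscription] = []

    def subscribe(self, interval: float, callback: Callable[[float], None]) -> None:
        self.subscriptions.append(Subscription(interval, callback, self.now))

    def advance(self, minutes: float) -> None:
        """Advance game time, as a frame or a time-skipping action would."""
        self.now += minutes
        for sub in self.subscriptions:
            elapsed = self.now - sub.last_update
            if elapsed >= sub.interval:
                # Notify once, reporting how much time really passed.
                sub.last_update = self.now
                sub.callback(elapsed)


clock = GameClock()
seen: List[float] = []
clock.subscribe(5.0, seen.append)

clock.advance(2.0)   # 2 minutes since subscribing: no notification
clock.advance(2.0)   # 4 minutes: still nothing
clock.advance(2.0)   # 6 minutes: interval passed, notified with 6.0
clock.advance(60.0)  # a big time skip: one notification with 60.0
print(seen)          # prints: [6.0, 60.0]
```

Note that the big time skip produces a single notification carrying the full elapsed time rather than twelve separate ones, which is why the script needs to check how much time actually passed.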

The code side of the generalized tool interface is now starting to shape up, with tools now being able to launch projectiles that can communicate which object they came from (available as the “source” global in the script handling the collision event). This facilitates having multiple individual tool items such as the fertilizer sprayer with their own finite capacity, rather than have a global game-level resource capacity. This is only proper, because now the player will need to deal with taking delivery of full containers and treating empty containers as trash.

All that remains to complete this transition is to port the visual updates to the plants based on their current health and growth state.

Glitch Gallery

Finally, since I’ve been talking largely about non-visual stuff so far, I wanted to leave you this week with a couple of examples of interesting glitches I encountered. We’ll start off with the plates. Smh the plates.

Then we had this little gem, where all of the furnishings in the house loaded up and displayed, but nothing else did:

Conclusion and Calls To Action

That wraps another Friday Fun blog. If you are enjoying what you are reading, if you want to see me succeed, then please wishlist Dehoarder 2, support me on Patreon, check out Prepare For Warp (currently on sale) and tell your favorite streamers or other influencers about me.