Skip to content

Conversation

@dpiers
Copy link

@dpiers dpiers commented Dec 3, 2025

Fixes #817

Root Cause

Spawning more than 16 different NPC types causes out-of-bounds array access. The bgNumAnimEvents counter increments without checking against MAX_ANIM_FILES, so when it reaches 16 it overflows the bgAllEvents[MAX_ANIM_FILES] array.

Fix

Increase MAX_ANIM_FILES from 16 to 64 in both MP and SP, matching JK2's existing limit.

Memory impact: ~900 KB additional static allocation (trivial on modern systems).

@dpiers dpiers requested a review from a team as a code owner December 3, 2025 03:29
@SomaZ
Copy link
Member

SomaZ commented Dec 3, 2025

Hm, while I don't have an issue with increasing limits, this doesn't fix the underlying issue of the out of bounds read, but just masks it. Claiming that the limit was chosen because of QVM limits feels like a wild guess. Also makes me wonder if it will be a problem later on when we get QMVs back into the codebase as theres already a pending PR that implements those again.

@taysta
Copy link
Contributor

taysta commented Dec 3, 2025

Were you able to trace where the overflow was occurring?

@dpiers dpiers force-pushed the max-anim-files-fix branch from ef8269c to ba3230e Compare December 3, 2025 19:40
@dpiers
Copy link
Author

dpiers commented Dec 3, 2025

Crash was reproduced, traced and fix tested with spawning >16 distinct NPCs.

In practice, vanilla MP would need ~15 NPC slots max (+1 for player). 64 should be plenty even for heavily modded servers, and the bounds check ensures we degrade gracefully.

I updated the branch to add bounds checking in BG_ParseAnimationEvtFile - if the limit is exceeded, it now logs a warning and falls back to index 0 (humanoid) instead of crashing.

SP does not have this issue because G_ParseAnimFileSet does boundary checking (and presumably does not exceed the limit without mods).

@Daggolin
Copy link
Member

Daggolin commented Dec 3, 2025

The limit of 16 was originally chosen to conserve memory when game code ran in QVMs. OpenJK compiles to native code, so this constraint no longer applies.

I would like to hear the reasoning behind the claim that the limit was chosen, because of the QVMs. I know several servers and clients that have the limit increased to 64 in their QVM mods and I have run a server with the limit raised to 1024 inside of a QVM. None of this should have relevance for QVMs. Yes, QVMs are limited regarding memory, but that refers to not being able to perform dynamic allocations - fixed size buffers like the ones used here work just fine and static buffers can be several hundred MBs, as long as the engine / host supports it.

Increase MAX_ANIM_FILES from 16 to 64 in both MP and SP, matching JK2's existing limit.

I assume when you are talking about JK2's limit you are refering to JK2 SP, not MP?

In practice, vanilla MP would need ~15 NPC slots max (+1 for player). 64 should be plenty even for heavily modded servers, and the bounds check ensures we degrade gracefully.

It probably depends on the map and what users are doing, but even using the vanilla assets you can spawn enough NPCs to exceed the 64 slots limit. Maybe we should seperate the MAX_ANIM_FILES define into 2 defines: one for event file parsing and one for the actual animation file parsing. I think MAX_ANIM_FILES set to 64 makes sense for the data parsed in BG_ParseAnimationFile, but for the data parsed in BG_ParseAnimationEvtFile it probably makes sense to pick a higher limit.

I updated the branch to add bounds checking in BG_ParseAnimationEvtFile - if the limit is exceeded, it now logs a warning and falls back to index 0 (humanoid) instead of crashing.

That roughly matches the behavior of the original 2013 code release (there was a bounds check in the 2013 code release, this bounds check missing from OpenJK is a regression due to the modbase merge), but I still prefer the original. The original also made sure the humanoid index is parsed when falling back to it (rather than just returning the index without a check). The original code:

    if ( animFileIndex < 0 || animFileIndex >= MAX_ANIM_FILES )
    {//WTF??!!
        return 0;
    }

    if ( eventFileIndex < 0 || eventFileIndex >= MAX_ANIM_FILES )
    {//WTF??!!
        forcedIndex = 0;
    }
    else
    {
        forcedIndex = eventFileIndex;
    }

	if (bg_animParseIncluding <= 0)
	{ //if we should be parsing an included file, skip this part
		if ( bgAllEvents[forcedIndex].eventsParsed )
		{//already cached this on
			return forcedIndex;
		}
	}

@dpiers dpiers force-pushed the max-anim-files-fix branch from ba3230e to d79c44a Compare December 4, 2025 19:01
@dpiers
Copy link
Author

dpiers commented Dec 4, 2025

Updated to split the limits:

  • MAX_ANIM_FILES = 64 for animation skeletons (bgAllAnims[])
  • MAX_ANIM_EVENT_FILES = 128 for event files (bgAllEvents[])

This addresses the concern about event files outnumbering anim files since multiple models can share a skeleton while having unique animevents files. The arrays and counters are independent, so separate limits make sense. 128 is arbitrary - let me know if anyone has a better idea.

Added bounds checking in BG_ParseAnimationEvtFile (bg_panimate.c:2162-2166) that logs a warning and falls back to index 0 if exceeded, similar to the check before the regression. Also added bounds checking in BG_AnimsetAlloc (bg_panimate.c:1713-1717) to protect bgAllAnims[] - returns NULL if limit is hit, which callers already handle.

To clarify earlier questions:

  • The QVM comment was speculation, not a claim
  • "JK2's limit" = JK2 SP (codeJK2/game/anims.h:1432)

Fixes JACoders#817 - spawning more than 16 different NPC types causes a crash
due to out-of-bounds array access in bgAllEvents[].

Split limits into MAX_ANIM_FILES (64) for animation skeletons and
MAX_ANIM_EVENT_FILES (128) for event files, since multiple models can
share a skeleton while having unique animevents.cfg files.

Add bounds checking in BG_ParseAnimationEvtFile and BG_AnimsetAlloc
to prevent out-of-bounds access if limits are exceeded.
@dpiers dpiers force-pushed the max-anim-files-fix branch from d79c44a to 4fe24b3 Compare December 4, 2025 19:17
@Daggolin
Copy link
Member

Daggolin commented Dec 5, 2025

128 is arbitrary - let me know if anyone has a better idea.

When I split the 2 defines on my module codebase I initially wanted to chose the arbitrary value 256, but eventually decided on MAX_MODELS+1 (513). As each model/skin for NPCs must be stored in a CS_MODEL configstring, the MAX_MODELS value of 512 should be providing the upper limit of different NPCs (less in reality if other entities register model configstrings as well) unless the server tracks configstring use and repurposes unused ones. As the default cgame module also loads the animevents for _humanoid I added one more slot. So MAX_MODELS+1 is the value I chose on my codebase and I would recommend the same for OpenJK. My local test client has 1750 NPC definitions by the way (some may have duplicate names), so it's possible to reach the limit.

Also added bounds checking in BG_AnimsetAlloc (bg_panimate.c:1713-1717) to protect bgAllAnims[] - returns NULL if limit is hit, which callers already handle.

Why did you add this check? I haven't checked all the code paths, but a quick look suggests that this may be introducing out of bounds access in some places.

The QVM comment was speculation, not a claim

A speculation presented like facts without any indication that it's just a speculation is a claim, is it not? If you're just guessing, please do not present it like facts. It can confuse reviewers, users and it might end up in LLM training, replicating the claim to future users.

@dpiers
Copy link
Author

dpiers commented Dec 6, 2025

Also added bounds checking in BG_AnimsetAlloc (bg_panimate.c:1713-1717) to protect bgAllAnims[] - returns NULL if limit is hit, which callers already handle.

Why did you add this check? I haven't checked all the code paths, but a quick look suggests that this may be introducing out of bounds access in some places.

BG_ParseAnimationFile is the only caller for BG_AnimsetAlloc, and already does null checks to see if the call fails. Without this fix, an assert will fire but OOB access would still occur if we exceed MAX_ANIM_FILES.

@Daggolin
Copy link
Member

Daggolin commented Dec 7, 2025

Also added bounds checking in BG_AnimsetAlloc (bg_panimate.c:1713-1717) to protect bgAllAnims[] - returns NULL if limit is hit, which callers already handle.

Why did you add this check? I haven't checked all the code paths, but a quick look suggests that this may be introducing out of bounds access in some places.

BG_ParseAnimationFile is the only caller for BG_AnimsetAlloc, and already does null checks to see if the call fails. Without this fix, an assert will fire but OOB access would still occur if we exceed MAX_ANIM_FILES.

Yes, BG_ParseAnimationFile is the only caller for BG_AnimsetAlloc, but the checks in BG_ParseAnimationFile were essentially dead code that would never trigger and their behavior of returning -1 as index would likely lead to out of bounds access in other places. So your new check that relies on the behavior of the formerly unused checks being correct is actually going to shift the OOB access to other places rather than fixing it.

I would suggest dropping with a Com_Error here. Changing it to return NULL could be done at a later time after reviewing all uses of BG_ParseAnimationFile and stored return codes across the codebase to ensure the rest of the code handles it correctly rather than performing OOB access (currently there seems to be places that might perform OOB access if -1 is returned). However, please do not give this task to some LLM, these changes should be done by people who are familiar with the codebase. Reviewing LLM output for that is likely going to waste more time than it would for someone to do it manually, so let's use Com_Error for now, as the increased limit is already going to allow more non-humanoid NPCs and vehicles to be used than before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Client crash when connecting to server that spawned too many different types of npcs

4 participants