AICH hashset save can fail spuriously after hashing because `known2.met` lock wait times out
Summary
Current main gives CAICHRecoveryHashSet::SaveHashSet() only five seconds to acquire the
global known2.met mutex. If the lock is still held, the function returns false and the
just-calculated hashset is treated as a save failure.
eMuleAI removes that timeout and waits for the mutex instead. For this specific path that looks like the right stock-preserving fix: the expensive hashing work is already finished, the call is serialized through a single shared file, and a false negative save is worse than a short wait.
Current Mainline Status
Done in main via commit 8a5a33c (BUG-032 remove AICH hashset save timeout).
The landed fix is intentionally narrow: CAICHRecoveryHashSet::SaveHashSet() now waits for
the known2.met mutex normally instead of treating a 5-second wait as save failure. The
file format, write path, and caller behavior were otherwise left unchanged.
Evidence In Current Tree
- `srchybrid/SHAHashSet.cpp`
  - `CAICHRecoveryHashSet::SaveHashSet()` does: `CSingleLock lockKnown2Met(&m_mutKnown2File); if (!lockKnown2Met.Lock(SEC2MS(5))) return false;`
- `srchybrid/KnownFile.cpp`
  - both `CreateFromFile()` and `CreateAICHHashSetOnly()` call `SaveHashSet()` after building the recovery hashset
- `analysis\emuleai\srchybrid\SHAHashSet.cpp`
  - replaces the five-second timeout with an unconditional wait
  - the inline rationale is explicit: timing out here was causing "Failed to save AICH Hashset" after successful hashing
Why This Looks Real
This is a classic false-failure race:
- hashing already consumed the expensive I/O and CPU work
- another thread can still hold the `known2.met` mutex for legitimate reasons
- the caller gives up after five seconds and reports failure even though the state is otherwise recoverable by simply waiting a bit longer
This is especially poor during busy startup or concurrent shared-file hashing.
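The race can be reproduced in miniature. The following is a minimal sketch, not eMule code: it assumes `std::timed_mutex` as a portable stand-in for the MFC mutex, scales the five-second timeout down to milliseconds so it runs quickly, and every name in it (`g_known2Mutex`, `SaveWithTimeout`, `SaveBlocking`, `SaveWhileContended`, `RunDemo`) is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

typedef std::chrono::milliseconds Ms;

// Hypothetical stand-in for the known2.met mutex (the real code uses MFC types).
std::timed_mutex g_known2Mutex;

// Pre-fix shape: give up if the mutex is not acquired within `timeout`.
bool SaveWithTimeout(Ms timeout) {
    std::unique_lock<std::timed_mutex> lock(g_known2Mutex, timeout);
    if (!lock.owns_lock())
        return false;  // surfaces as a save failure although hashing succeeded
    // ... the hashset would be written here ...
    return true;
}

// Post-fix shape: wait for the mutex for as long as it takes.
bool SaveBlocking() {
    std::lock_guard<std::timed_mutex> lock(g_known2Mutex);
    // ... the hashset would be written here ...
    return true;
}

// Runs one save attempt while another thread legitimately holds the mutex.
template <typename SaveFn>
bool SaveWhileContended(Ms holdFor, SaveFn save) {
    std::promise<void> held;
    std::thread holder([&] {
        std::lock_guard<std::timed_mutex> lock(g_known2Mutex);
        held.set_value();                    // signal: the mutex is now owned
        std::this_thread::sleep_for(holdFor);
    });
    held.get_future().wait();                // start saving only once the holder owns it
    bool ok = save();
    holder.join();
    return ok;
}

// The holder keeps the mutex longer than the timeout, so only the blocking save succeeds.
void RunDemo() {
    assert(!SaveWhileContended(Ms(300), [] { return SaveWithTimeout(Ms(50)); }));
    assert(SaveWhileContended(Ms(300), [] { return SaveBlocking(); }));
}
```

In both runs the holder releases the mutex on its own after a bounded time. The only difference is whether the saver is still waiting when that happens.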
Likely Fix Shape
Keep the fix narrow:
- remove the five-second timeout in `CAICHRecoveryHashSet::SaveHashSet()`
- wait for the mutex normally
- keep the existing file-format, save path, and long-path-safe open logic unchanged
Do not blend this with the broader eMuleAI AICH tree semantics changes.
Validation Target
- run concurrent shared-file hashing / AICH generation so `SaveHashSet()` calls contend
- verify the hashset save no longer fails spuriously on timeout
- verify shutdown or close behavior still completes cleanly when hashing is in flight
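The contention check could be approximated outside the client with a small harness along these lines. This is a sketch under stated assumptions, not a test from the eMule tree: `RunContendedSaves`, `ContentionResult` and `kWaitForever` are hypothetical names, the simulated write is just a sleep, and a `std::timed_mutex` stands in for the `known2.met` mutex. It does not cover the shutdown case, which needs the real application.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

typedef std::chrono::milliseconds Ms;

// Sentinel meaning "wait for the mutex without a timeout" (post-fix behaviour).
const Ms kWaitForever = Ms::max();

struct ContentionResult {
    int saved;
    int failed;
};

// Starts `workers` threads that all try to save at the same moment.
// Each save holds the shared mutex for `writeTime` to simulate the file write.
ContentionResult RunContendedSaves(int workers, Ms writeTime, Ms lockTimeout) {
    std::timed_mutex known2Mutex;            // stand-in for the known2.met mutex
    std::atomic<bool> start(false);
    std::atomic<int> saved(0), failed(0);

    auto save = [&]() -> bool {
        std::unique_lock<std::timed_mutex> lock(known2Mutex, std::defer_lock);
        if (lockTimeout == kWaitForever) {
            lock.lock();                     // post-fix: plain blocking wait
        } else if (!lock.try_lock_for(lockTimeout)) {
            return false;                    // pre-fix: timeout counts as save failure
        }
        std::this_thread::sleep_for(writeTime);
        return true;
    };

    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&] {
            while (!start.load())            // hold every worker at a common start gate
                std::this_thread::yield();
            if (save()) ++saved; else ++failed;
        });
    }
    start.store(true);
    for (auto& t : pool)
        t.join();

    ContentionResult result = { saved.load(), failed.load() };
    return result;
}
```

Calling `RunContendedSaves(8, Ms(10), kWaitForever)` should report eight saves and no failures. Passing a finite timeout shorter than the total serialized write time, for example `Ms(30)` with eight workers writing for 20 ms each, models the pre-fix path and would typically report failures, which makes the harness usable as a regression check.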