Part-file hash layout drift — hash tree can mutate during concurrent hashing
Historical reference only:
stale-v0.72a-experimental-clean and analysis\stale-v0.72a-experimental-clean are retired reference sources, not active branch targets or current baselines. Use them only as provenance or idea-extraction sources; landed status is determined against main. See Historical References.
Summary
CPartFile tracks a hash count and chunk/gap layout that can change while background hashing
is in progress. If the layout drifts (chunks added/removed, file state changes) between when
the hasher reads it and when it writes results back, the hash tree becomes inconsistent with
the actual file layout. Three separate manifestations were found:
- 626c868: Part-file hash layout drift. The hasher reads a layout snapshot, but PartFile.cpp writes back the hash result without verifying that the layout is still the same. Guard missing.
- f2720dc: Known-file progress owner drift. The CKnownFile progress owner can change mid-hash, causing the progress update to be applied to a stale owner pointer.
- f059749: File progress UI post asserts. PostMessage to the main window carries a file progress pointer that may already have been freed by the time the message is processed, leaving a stale pointer in the message payload.
Location
- srchybrid/PartFile.cpp: hash write-back and layout consistency check
- srchybrid/KnownFile.cpp: progress owner tracking
- Seam headers in experimental: PartFileHashSeams.h
Experimental Reference Implementation
Status in stale-v0.72a-experimental-clean: All three facets fixed in experimental
commits 626c868, f2720dc, f059749 (CPP_024, 2026 timeframe).
The approach:
- Add a layout generation counter to CPartFile that increments on every structural change
(gap list, chunk count, file state transitions)
- The hasher snapshots the generation before starting; on write-back, it re-checks the counter
and discards the result if the layout changed
- For progress posting: use weak owner tracking or check object liveness before posting;
alternatively marshal via QueueDisplayUpdate pattern instead of raw PostMessage
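The generation-counter guard described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the experimental or main branch: PartFileSketch, OnLayoutChanged, SnapshotGeneration, TryCommitHashes and HashCount are hypothetical names, the hashes are placeholder integers, and a plain mutex stands in for whatever synchronization CPartFile actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the hashing-relevant state of CPartFile.
class PartFileSketch {
public:
    // Call on every structural change (gap list, chunk count, file state).
    void OnLayoutChanged() {
        std::lock_guard<std::mutex> lock(m_mutex);
        ++m_layoutGeneration;
    }

    // The hasher captures this value before it starts reading the file.
    std::uint64_t SnapshotGeneration() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_layoutGeneration;
    }

    // Write-back guard: commit only if the layout has not changed since the
    // snapshot. Returns false (and leaves the stored hashes untouched) when
    // the result is stale, so the caller can requeue hashing.
    bool TryCommitHashes(std::uint64_t capturedGeneration,
                         std::vector<std::uint32_t> hashes) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (capturedGeneration != m_layoutGeneration)
            return false;
        m_hashes = std::move(hashes);
        return true;
    }

    std::size_t HashCount() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_hashes.size();
    }

private:
    mutable std::mutex m_mutex;
    std::uint64_t m_layoutGeneration = 0;
    std::vector<std::uint32_t> m_hashes;  // placeholder for real part hashes
};

// Single-threaded walk-through of the stale case and the fresh case.
inline void DemoStaleWriteBackIsDiscarded() {
    PartFileSketch file;
    const std::uint64_t captured = file.SnapshotGeneration();
    file.OnLayoutChanged();                             // layout drifts mid-hash
    assert(!file.TryCommitHashes(captured, {1u, 2u}));  // stale result dropped
    assert(file.TryCommitHashes(file.SnapshotGeneration(), {1u, 2u, 3u}));
    assert(file.HashCount() == 3);
}
```

Comparing and committing under the same lock is the point of the sketch: a check followed by an unlocked write would reopen the window in which the layout can drift.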
Files changed in experimental:
- srchybrid/PartFile.cpp (+22 lines with guard)
- srchybrid/PartFileHashSeams.h (+12 lines seam interface)
Main Branch Port Status
Status: Done in main
The staged branch port landed all three stabilization facets:
- generation-based part-file hash write-back guard
- progress-owner drift protection via immutable hash/size snapshots
- queued main-thread part-file progress updates instead of raw pointer PostMessage payloads
Landed commits:
- 826ea4d BUG-018 harden part file hash drift handling
- 0aca93c BUG-018 update part file hash regressions
- 0a85dc0 BUG-018 update part file hashing tracking
Current main implementation notes:
- CPartFile now tracks a hash-layout generation that changes on structural gap/file-size mutations and direct gap-list resets.
- Hash worker results carry the captured generation snapshot; PartFileHashFinished discards stale results and requeues hashing when the file is still complete.
- Hash/import/copy worker progress no longer posts raw CPartFile* payloads. Worker threads send UM_PARTFILE_PROGRESS_UPDATE requests keyed by file hash + file size, and the UI resolves the live part file before applying the update.
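The idea of keying progress updates by file hash + file size, rather than by a raw pointer, can be sketched like this. It is a portable illustration only: the Win32 message transport is left out, and FileHash, ProgressUpdate, LivePartFile and PartFileRegistry are hypothetical names that do not come from the main-branch code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical identity types: a 16-byte file hash plus the file size.
using FileHash = std::array<std::uint8_t, 16>;
using Key = std::pair<FileHash, std::uint64_t>;

// What a worker thread would send: identity by value, no pointer.
struct ProgressUpdate {
    FileHash hash;
    std::uint64_t size;
    int percent;
};

// Hypothetical stand-in for the live UI-side part file.
struct LivePartFile {
    int progressPercent = 0;
};

// Owned by the UI thread in this sketch; pointers are non-owning and must be
// removed before the file they refer to is destroyed.
class PartFileRegistry {
public:
    void Add(const FileHash& hash, std::uint64_t size, LivePartFile* file) {
        m_files[Key{hash, size}] = file;
    }

    void Remove(const FileHash& hash, std::uint64_t size) {
        m_files.erase(Key{hash, size});
    }

    // Resolve the live file first; drop the update if it no longer exists.
    bool Apply(const ProgressUpdate& update) {
        const auto it = m_files.find(Key{update.hash, update.size});
        if (it == m_files.end())
            return false;
        it->second->progressPercent = update.percent;
        return true;
    }

private:
    std::map<Key, LivePartFile*> m_files;
};

// Walk-through: an update for a removed file is dropped, not dereferenced.
inline void DemoProgressForRemovedFileIsDropped() {
    PartFileRegistry registry;
    LivePartFile file;
    FileHash hash{};
    hash[0] = 0xAB;
    registry.Add(hash, 1024, &file);
    assert(registry.Apply({hash, 1024, 40}));
    registry.Remove(hash, 1024);
    assert(!registry.Apply({hash, 1024, 80}));
    assert(file.progressPercent == 40);
}
```

Because the update carries only values, a message that arrives after the file has gone resolves to nothing and is ignored, which is the property a raw-pointer payload cannot provide.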
Current main verification:
- app Debug x64 build passed
- targeted doctest runs passed for:
  - Part-file hash seam rejects worker results whose theoretical hash layout drifted
  - Known-file progress seam posts only for matching known-file owners
  - Known-file progress seam accepts zero-length owners and rejects stale owners
Relationship to Existing Items
- BUG-003 (AICH / metadata paths incomplete): related but distinct — BUG-003 is about FIXME markers for large-file AICH; BUG-018 is about concurrent mutation during hashing.
- BUG-019 (AICH sync thread concurrency): BUG-019 covers the AICH sync thread itself; BUG-018 covers the part-file hasher thread.
Severity
Rare in practice, because it requires a file state change during active hashing. When it does occur, it can silently corrupt AICH trees, and the corruption only shows up as hash verification failures at download completion.