
Part-file hash layout drift — hash tree can mutate during concurrent hashing

Historical reference only: stale-v0.72a-experimental-clean and analysis\stale-v0.72a-experimental-clean are retired reference sources, not active branch targets or current baselines. Use them only as provenance or idea-extraction sources; landed status is determined against main. See Historical References.

Summary

CPartFile tracks a hash count and chunk/gap layout that can change while background hashing is in progress. If the layout drifts (chunks added/removed, file state changes) between when the hasher reads it and when it writes results back, the hash tree becomes inconsistent with the actual file layout. Three separate manifestations were found:

  1. 626c868: Part-file hash layout drift. The hasher reads a layout snapshot, but PartFile.cpp writes the hash result back without verifying that the layout is still the same. The write-back guard is missing.

  2. f2720dc: Known-file progress owner drift. The CKnownFile progress owner can change mid-hash, so the progress update is applied through a stale owner pointer.

  3. f059749: File progress UI post asserts. The worker calls PostMessage to the main window with a file progress pointer in the message payload. The object may already have been freed by the time the message is processed, so the UI receives a stale pointer.

Location

  • srchybrid/PartFile.cpp — hash write-back and layout consistency check
  • srchybrid/KnownFile.cpp — progress owner tracking
  • Seam headers in experimental: PartFileHashSeams.h

Experimental Reference Implementation

Status in stale-v0.72a-experimental-clean: All three facets fixed in experimental commits 626c868, f2720dc, f059749 (CPP_024, 2026 timeframe).

The approach:

  • Add a layout generation counter to CPartFile that increments on every structural change (gap list, chunk count, file state transitions).
  • The hasher snapshots the generation before starting; on write-back, it re-checks the counter and discards the result if the layout changed.
  • For progress posting: use weak owner tracking or check object liveness before posting; alternatively, marshal via the QueueDisplayUpdate pattern instead of raw PostMessage.
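The generation-counter guard described above can be sketched roughly as follows. This is a minimal illustration, not the code from the experimental or main branches: `PartFileSketch`, `BumpLayoutGeneration`, `SnapshotLayoutGeneration`, and `TryCommitHash` are hypothetical names, and the sketch assumes that structural mutations and the write-back run on the same thread (or under the same lock), so the check and the commit cannot be interleaved with a mutation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for CPartFile; all names here are illustrative.
class PartFileSketch {
public:
    // Call from every structural mutation (gap list, chunk count, state change).
    void BumpLayoutGeneration() {
        m_layoutGeneration.fetch_add(1, std::memory_order_acq_rel);
    }

    // The hasher captures this value before it starts reading the file.
    std::uint64_t SnapshotLayoutGeneration() const {
        return m_layoutGeneration.load(std::memory_order_acquire);
    }

    // Write-back guard: accept the result only if the layout is unchanged
    // since the snapshot was taken; otherwise discard it as stale.
    bool TryCommitHash(std::uint64_t snapshot, const std::string& hashResult) {
        if (snapshot != SnapshotLayoutGeneration())
            return false;  // layout drifted while hashing
        m_committedHash = hashResult;
        return true;
    }

    const std::optional<std::string>& CommittedHash() const {
        return m_committedHash;
    }

private:
    std::atomic<std::uint64_t> m_layoutGeneration{0};
    std::optional<std::string> m_committedHash;
};

// Usage: a result computed against an old layout is rejected.
inline void DemoStaleResultIsDiscarded() {
    PartFileSketch file;
    const std::uint64_t snapshot = file.SnapshotLayoutGeneration();
    file.BumpLayoutGeneration();  // simulated gap-list change mid-hash
    assert(!file.TryCommitHash(snapshot, "stale-hash"));
    assert(!file.CommittedHash().has_value());
}
```

A caller that sees `TryCommitHash` return `false` would typically requeue the hash job against the new layout rather than treat the rejection as an error.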

Files changed in experimental:

  • srchybrid/PartFile.cpp (+22 lines with guard)
  • srchybrid/PartFileHashSeams.h (+12 lines seam interface)

Main Branch Port Status

Status: Done in main

The staged branch port landed all three stabilization facets:

  • generation-based part-file hash write-back guard
  • progress-owner drift protection via immutable hash/size snapshots
  • queued main-thread part-file progress updates instead of raw pointer PostMessage payloads

Landed commits:

  • 826ea4d BUG-018 harden part file hash drift handling
  • 0aca93c BUG-018 update part file hash regressions
  • 0a85dc0 BUG-018 update part file hashing tracking

Current main implementation notes:

  • CPartFile now tracks a hash-layout generation that changes on structural gap/file-size mutations and direct gap-list resets.
  • Hash worker results carry the captured generation snapshot; PartFileHashFinished discards stale results and requeues hashing when the file is still complete.
  • Hash/import/copy worker progress no longer posts raw CPartFile* payloads. Worker threads send UM_PARTFILE_PROGRESS_UPDATE requests keyed by file hash + file size, and the UI resolves the live part file before applying the update.
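To illustrate the last point, here is a minimal sketch of a progress request keyed by file hash and file size and resolved on the UI side. The types `ProgressKey`, `ProgressRequest`, `LivePartFile`, and `UiFileRegistry` are hypothetical and are not the structures used in main; the 16-byte hash width is an assumption, and the sketch does not model the actual `UM_PARTFILE_PROGRESS_UPDATE` message transport.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>

// Assumed 16-byte file hash; hypothetical type alias.
using FileHash = std::array<std::uint8_t, 16>;

// Immutable identity captured by the worker instead of a raw file pointer.
struct ProgressKey {
    FileHash hash;
    std::uint64_t size;
    bool operator<(const ProgressKey& other) const {
        return std::tie(hash, size) < std::tie(other.hash, other.size);
    }
};

// What a worker thread would queue for the UI thread.
struct ProgressRequest {
    ProgressKey key;
    int percent;
};

// Hypothetical UI-side view of a part file that is still alive.
struct LivePartFile {
    int shownPercent = 0;
};

class UiFileRegistry {
public:
    void Add(const ProgressKey& key) { m_files[key] = LivePartFile{}; }
    void Remove(const ProgressKey& key) { m_files.erase(key); }

    // Runs on the UI thread: resolve the key to a live file and apply the
    // update, or drop the request if the file no longer exists.
    bool Apply(const ProgressRequest& request) {
        auto it = m_files.find(request.key);
        if (it == m_files.end())
            return false;  // owner is gone; nothing is dereferenced
        it->second.shownPercent = request.percent;
        return true;
    }

    const LivePartFile* Find(const ProgressKey& key) const {
        auto it = m_files.find(key);
        return it == m_files.end() ? nullptr : &it->second;
    }

private:
    std::map<ProgressKey, LivePartFile> m_files;
};
```

Because the request carries only value data, a file that is deleted between the post and the handler results in a dropped update rather than a dangling-pointer access.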

Current main verification:

  • app Debug x64 build passed
  • targeted doctest runs passed for:
      • Part-file hash seam rejects worker results whose theoretical hash layout drifted
      • Known-file progress seam posts only for matching known-file owners
      • Known-file progress seam accepts zero-length owners and rejects stale owners

Relationship to Existing Items

  • BUG-003 (AICH / metadata paths incomplete): related but distinct — BUG-003 is about FIXME markers for large-file AICH; BUG-018 is about concurrent mutation during hashing.
  • BUG-019 (AICH sync thread concurrency): BUG-019 covers the AICH sync thread itself; BUG-018 covers the part-file hasher thread.

Severity

Rare in practice (requires a file state change during active hashing), but when it occurs it can silently corrupt AICH trees, and the corruption surfaces only as hash verification failures at download completion.