Same-hash KnownFile replacement can unshare or mis-track equivalent files
Summary
CKnownFileList::SafeAddKFile() no longer resolves every same-MD4 collision by
destructively replacing the old entry with the new one. Current main preserves live
shared/download owners, adopts an incoming live owner only over an inactive known-file
record, and keeps inactive collisions non-destructive.
The shared-file side also has a persisted duplicate-path cache that prevents the most visible startup duplicate-path regression.
Fixed Mainline Scope
The first implementation slice landed on main in commits 4c974a3,
c495525, and d7aa382:
- persist duplicate shared paths in a `shareddups.dat` sidecar
- reuse remembered duplicate shared paths at startup when path, size, date, and canonical MD4 still match
- include the duplicate-path cache in the shared startup-cache save and purge flow
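The startup reuse rule in the list above can be sketched as a single predicate. This is a minimal illustration, not the code from the commits: `RememberedDuplicate`, `ScannedFile`, and `CanReuseRememberedDuplicate` are hypothetical names, the on-disk layout of `shareddups.dat` is not modeled, and the sketch assumes exact string comparison of paths and that the caller supplies the current canonical MD4.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical types for illustration only; not the real eMule structures.
using Md4Digest = std::array<std::uint8_t, 16>;

// One remembered duplicate path, as it might be loaded from the sidecar.
struct RememberedDuplicate {
    std::wstring path;
    std::uint64_t size = 0;
    std::int64_t modified = 0;      // file date as an opaque timestamp
    Md4Digest canonicalHash{};      // MD4 of the canonical shared entry
};

// What a startup directory scan reports before any hashing happens.
struct ScannedFile {
    std::wstring path;
    std::uint64_t size = 0;
    std::int64_t modified = 0;
};

// Reuse is allowed only when all four properties still match.
// Any mismatch means the file has to be treated as new and rehashed.
inline bool CanReuseRememberedDuplicate(const RememberedDuplicate& remembered,
                                        const ScannedFile& scanned,
                                        const Md4Digest& currentCanonicalHash)
{
    return remembered.path == scanned.path
        && remembered.size == scanned.size
        && remembered.modified == scanned.modified
        && remembered.canonicalHash == currentCanonicalHash;
}
```

The point of the predicate is that a remembered duplicate is only a shortcut: if any of the four values has drifted, the safe fallback is to ignore the cache entry.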
The core collision slice landed on main in commit 05eabec with parity-test
coverage in emulebb-build-tests commit b0ab2e8:
- add a seam-level `ResolveKnownFileCollision(...)` policy
- keep existing live shared or downloading known-file entries on same-MD4 collision
- adopt incoming shared/downloading entries only when the existing known-file entry is inactive
- merge compatible statistics into the retained owner
- update equivalent-path spelling without removing the authoritative live owner
- teach shared-file hashing and completed-download handoff to cleanly unwind when the known-file list rejects a duplicate live owner
The remaining replacement branch now applies only when an incoming live owner replaces an inactive known-file entry.
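The collision policy described above can be sketched as a small decision function. Everything here is a simplified model under stated assumptions: `OwnerState`, `CollisionOutcome`, `KnownEntry`, `ResolveCollisionSketch`, and `MergeStatistics` are hypothetical names, not the signature of the real `ResolveKnownFileCollision(...)`. The choice to merge statistics by summing counters is also an assumption, since the source only says that compatible statistics are merged.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a known-file entry.
enum class OwnerState { Inactive, Shared, Downloading };

enum class CollisionOutcome {
    KeepExisting,    // existing live owner stays authoritative
    AdoptIncoming,   // incoming live owner replaces an inactive record
    LeaveUntouched   // both inactive: nothing is destroyed
};

struct KnownEntry {
    OwnerState state = OwnerState::Inactive;
    std::uint64_t transferred = 0;
    std::uint32_t requests = 0;
    std::uint32_t accepted = 0;
};

inline bool IsLive(const KnownEntry& entry)
{
    return entry.state != OwnerState::Inactive;
}

// Decide what happens when two entries carry the same MD4.
inline CollisionOutcome ResolveCollisionSketch(const KnownEntry& existing,
                                               const KnownEntry& incoming)
{
    if (IsLive(existing))
        return CollisionOutcome::KeepExisting;
    if (IsLive(incoming))
        return CollisionOutcome::AdoptIncoming;
    return CollisionOutcome::LeaveUntouched;
}

// Assumption: "compatible statistics" are plain counters that can be summed.
inline void MergeStatistics(KnownEntry& retained, const KnownEntry& other)
{
    retained.transferred += other.transferred;
    retained.requests += other.requests;
    retained.accepted += other.accepted;
}
```

In this model the only destructive branch is `AdoptIncoming`, which matches the statement that replacement now applies only when an incoming live owner replaces an inactive entry. A caller that receives `KeepExisting` is the one that has to unwind its own hashing or download-completion state.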
Why This Matters
Representative low-drama failure cases:
- a shared file is moved between shared directories
- the same hashed file exists in two shared directories
- startup or reload rediscovers an equivalent file before share state fully settles
- a previously downloaded/shared file is reintroduced through another path
Those are exactly the kinds of cases that can create unshared files, mismatched GUI state, or accidental loss of the authoritative shared instance.
Comparison Notes
- `analysis\emuleai` goes much further with duplicate-path/history tracking and shows that the problem space is real
- the focused Xtreme mod archive still carries the historical warning on this surface, which suggests the logical flaw has been known for a long time
The branch does not need eMuleAI's full duplicate-history feature set to justify fixing the core destructive replacement behavior.
Later Option
If the narrow MD4-only fix still leaves too much ambiguity, a local strong-hash sidecar is a viable later stabilization option:
- keep MD4 as the protocol and `known.met` identity
- add a separate local cache keyed by path / size / mtime with a strong content hash such as BLAKE3
- use that sidecar only to distinguish true same-content rediscovery from local same-MD4 ambiguity before merging or replacing `KnownFile` state
That should be treated as a local persistence/consolidation aid, not as a first-pass protocol change.
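A rough shape for such a sidecar, assuming it is an in-memory map that some other component persists, might look like the sketch below. None of these names exist in the codebase: `SidecarKey`, `StrongHashSidecar`, `Rediscovery`, and `ClassifyRediscovery` are hypothetical, and the 32-byte digest is treated as an opaque value computed elsewhere (for example by a BLAKE3 library), so no hashing API is assumed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <tuple>

// Opaque strong digest; producing it is out of scope for this sketch.
using StrongDigest = std::array<std::uint8_t, 32>;

// Cache key: path / size / mtime, as suggested in the list above.
struct SidecarKey {
    std::string path;
    std::uint64_t size = 0;
    std::int64_t mtime = 0;

    bool operator<(const SidecarKey& other) const
    {
        return std::tie(path, size, mtime)
             < std::tie(other.path, other.size, other.mtime);
    }
};

// Hypothetical local cache; persistence is intentionally not modeled.
class StrongHashSidecar {
public:
    void Remember(const SidecarKey& key, const StrongDigest& digest)
    {
        m_entries[key] = digest;
    }

    std::optional<StrongDigest> Lookup(const SidecarKey& key) const
    {
        const auto it = m_entries.find(key);
        if (it == m_entries.end())
            return std::nullopt;
        return it->second;
    }

private:
    std::map<SidecarKey, StrongDigest> m_entries;
};

enum class Rediscovery { SameContent, Ambiguous, Unknown };

// Classify two same-MD4 files before any KnownFile state is merged or replaced.
inline Rediscovery ClassifyRediscovery(const StrongHashSidecar& sidecar,
                                       const SidecarKey& existing,
                                       const SidecarKey& incoming)
{
    const auto a = sidecar.Lookup(existing);
    const auto b = sidecar.Lookup(incoming);
    if (!a || !b)
        return Rediscovery::Unknown;   // no strong evidence either way
    return (*a == *b) ? Rediscovery::SameContent : Rediscovery::Ambiguous;
}
```

Because the key includes size and mtime, a modified file simply misses the cache and yields `Unknown`, which keeps the sidecar advisory: the MD4-based policy still decides what happens when there is no strong evidence.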
Validation
- `python scripts\build_emule_tests.py --workspace-root EMULE_WORKSPACE_ROOT\workspaces\v0.72a --app-root EMULE_WORKSPACE_ROOT\workspaces\v0.72a\app\eMule-main --build-output-mode ErrorsOnly --run -- --test-case=*Known-file*`: 18 passed
- full parity test run built cleanly but still hit the pre-existing environment-sensitive long-path current-directory case in `other_functions.tests.cpp`
- `python -m emule_workspace build app --workspace-root EMULE_WORKSPACE_ROOT --workspace-name v0.72a --config Debug --platform x64 --build-output-mode ErrorsOnly --variant main`: passed, including CFG verification
- `python scripts\shared-files-ui-e2e.py --workspace-root EMULE_WORKSPACE_ROOT\workspaces\v0.72a --app-root EMULE_WORKSPACE_ROOT\workspaces\v0.72a\app\eMule-main --configuration Debug --scenario duplicate-startup-reuse`: passed