Concrete scenario
What this looks like in practice
A hospital research group shares a de-identified cohort. Another team filters rows, joins an external registry, and ships three derivative datasets to separate model vendors. When a regulator asks which version each vendor trained on, the shared drive has folders but no cryptographically linked trail between manifest, transform, and downstream receipt.
Problem
What breaks today
Data pipelines fork, merge, and get reused across teams. Months later, a folder name or slide deck is not enough. Whoever is asking needs to know which dataset version moved, which transformation changed it, and where the trail stopped.
Mechanism
How ZK-SNAP responds
Each accountable pipeline stage can emit a ZK-SNAP receipt binding dataset manifests, transformation claims, and artifact roots. Evidence Graph indexes those receipt facts for discovery and forensic traversal; it correlates trails without becoming the validity authority.
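As a rough illustration of that binding, here is a minimal Python sketch. It is not the ZK-SNAP receipt format or API: the `Receipt` fields, `emit_receipt`, and `build_index` are hypothetical names, SHA-256 over canonical JSON stands in for whatever commitment scheme a real profile specifies, and an HMAC tag stands in for a real signature or proof.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, asdict, replace


def canonical(obj) -> bytes:
    """Deterministic JSON encoding so equal content always hashes the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(obj) -> str:
    """Hex SHA-256 of the canonical encoding (assumed commitment scheme)."""
    return hashlib.sha256(canonical(obj)).hexdigest()


@dataclass(frozen=True)
class Receipt:
    # Hypothetical field layout, for illustration only.
    stage: str
    profile: str          # e.g. "dataset-audit"
    input_manifest: str   # digest of the manifest the stage consumed
    transform_claim: str  # digest of the declared transformation
    artifact_root: str    # digest of the manifest the stage produced
    tag: str = ""         # HMAC stand-in for a real signature


def emit_receipt(key, stage, profile, input_manifest, transform, output_manifest):
    """Bind the three digests into one claim and tag it."""
    unsigned = Receipt(
        stage,
        profile,
        digest(input_manifest),
        digest(transform),
        digest(output_manifest),
    )
    body = asdict(unsigned)
    del body["tag"]  # the tag covers every field except itself
    tag = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return replace(unsigned, tag=tag)


def build_index(receipts):
    """Toy correlation index: consumed root -> stages that consumed it.

    It only helps locate related receipts; it says nothing about validity.
    """
    index = {}
    for r in receipts:
        index.setdefault(r.input_manifest, []).append(r.stage)
    return index
```

In the hospital scenario, looking up the filtered-and-joined dataset's root in such an index would return the three vendor export stages, provided each of them emitted a receipt. Whether any of those receipts is valid is still decided by checking the receipt itself.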
Verifiable outcome
What a verifier can check
- Manifest and artifact roots in the claim recompute from disclosed material or committed openings.
- Each stage receipt verifies independently; missing stages are visible gaps, not silent merges.
- Evidence Graph correlation resolves related receipts and anchors without overriding signature checks.
- Profiles declare whether a receipt is dataset-audit, training-run, or evaluation scoped.
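
The checks above can be sketched in Python as follows. This is an assumption-laden illustration, not the ZK-SNAP verifier: receipts are plain dicts with hypothetical keys (`stage`, `profile`, `input_manifest`, `artifact_root`, `tag`), roots are SHA-256 digests of canonical JSON, and an HMAC tag stands in for a real signature check.

```python
import hashlib
import hmac
import json

# Scopes named in the list above; the string spellings are assumed.
SCOPES = {"dataset-audit", "training-run", "evaluation"}


def canonical(obj) -> bytes:
    """Deterministic JSON encoding so recomputed roots are comparable."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def digest(obj) -> str:
    return hashlib.sha256(canonical(obj)).hexdigest()


def sign(key: bytes, body: dict) -> dict:
    """Helper for the demo: attach an HMAC tag (stand-in for a signature)."""
    tag = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return {**body, "tag": tag}


def verify_stage(key, receipt, disclosed_input=None, disclosed_output=None) -> bool:
    """Verify one receipt on its own, without consulting any other receipt."""
    body = {k: v for k, v in receipt.items() if k != "tag"}
    expected = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, receipt.get("tag", "")):
        return False
    if body.get("profile") not in SCOPES:
        return False
    # Recompute roots only for material that was actually disclosed.
    if disclosed_input is not None and digest(disclosed_input) != body["input_manifest"]:
        return False
    if disclosed_output is not None and digest(disclosed_output) != body["artifact_root"]:
        return False
    return True


def find_gaps(receipts, source_roots):
    """Stages whose input was neither a declared source nor receipt-backed."""
    known = {r["artifact_root"] for r in receipts} | set(source_roots)
    return [r["stage"] for r in receipts if r["input_manifest"] not in known]
```

A stage listed by `find_gaps` has a valid-looking receipt but consumed a root that no presented receipt produced. That is a visible break in the trail rather than something the verifier papers over.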
Scope boundary
What a receipt does not replace
Receipts document what was signed at each stage. They do not by themselves prove regulatory compliance or data quality, and they show that every intermediate copy was receipt-backed only if operators adopt that profile end-to-end.