Prior anonymization methods either redact people or edit them frame by frame, which produces unrealistic results. ReGenHuman is the first anonymization method to produce temporally consistent videos that are realistic and anonymous by construction.
The same clip, anonymized eight ways. Pixel and face methods leave the body identifiable; prior full-body methods flicker and distort from frame to frame. Ours stays realistic and consistent with the original video. We provide two versions of our model: StructHuman, which preserves the background and text, and StructAll, which anonymizes them.
The video must be fully private and not be re-identifiable.
The video must be realistic – it should be temporally and geometrically consistent.
The video must remain recognizable for downstream tasks and be contextually consistent with the original input video.
Human-centric video is everywhere, and it is almost always identifiable. To share or train on it safely, we need to anonymize the people in it. However, blurring and redacting detected humans destroys realism and negatively affects downstream usability, while existing generative anonymization methods operate frame by frame, resulting in temporal flickering and identity drift.
Our insight is simple: regenerate, don't edit. Instead of hiding or editing the original video directly (which compromises privacy), we throw the original human pixels away entirely and regenerate them from identity-free structural cues (pose, depth, and segmentation) using a video diffusion model that denoises the whole clip at once. Because no original human pixel ever reaches the output, privacy is guaranteed by construction, and because the clip is generated jointly, the result is temporally consistent.
We composite identity-free pose, depth, and segmentation into a structural conditioning video, then a video diffusion model (Wan2.1-VACE + a LoRA) regenerates the human from it jointly across the whole clip, so the output is temporally consistent.
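The compositing step can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the released implementation: the function names, the grayscale list-of-lists frame format, the rule that pose overrides depth inside the mask, and the `keep_background` switch that loosely mimics the StructHuman / StructAll distinction. The one property taken from the text is that the original pixels inside the human mask are never read.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame, values in [0, 1] (assumed format)


def build_conditioning_frame(
    frame: Frame,
    depth: Frame,
    pose: Frame,
    mask: List[List[int]],
    keep_background: bool = True,
) -> Frame:
    """Composite structural cues into one conditioning frame (hypothetical helper).

    Inside the human mask the original pixel is never read: the output is the
    depth value, overwritten by the pose rendering wherever a skeleton is drawn.
    Outside the mask the original background is kept (StructHuman-like) or
    replaced by depth (StructAll-like). Both behaviors are assumptions.
    """
    out: Frame = []
    for y, mask_row in enumerate(mask):
        out_row = []
        for x, is_human in enumerate(mask_row):
            if is_human:
                value = pose[y][x] if pose[y][x] > 0 else depth[y][x]
            elif keep_background:
                value = frame[y][x]
            else:
                value = depth[y][x]
            out_row.append(value)
        out.append(out_row)
    return out


def build_conditioning_video(frames, depths, poses, masks, keep_background=True):
    """Apply the per-frame composite to every frame of a clip."""
    return [
        build_conditioning_frame(f, d, p, m, keep_background)
        for f, d, p, m in zip(frames, depths, poses, masks)
    ]
```

In this sketch the conditioning clip would then be handed to the video diffusion model as a whole, which is where the joint, temporally consistent generation described above takes place.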
ReGenHuman (★) shows the strongest anonymization and best generation quality, while maintaining good downstream utility.
Per-axis numbers below. In every table, orange marks the best and blue the second-best per column.
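| Method | Appearance leakage: ID Sim. ↓ | Appearance leakage: Text Sim. ↓ | Body re-ID (Standard): R-1 ↓ | Body re-ID (Standard): R-5 ↓ | Body re-ID (Standard): mAP ↓ | Body re-ID (Clothes-Changing): R-1 ↓ | Body re-ID (Clothes-Changing): R-5 ↓ | Body re-ID (Clothes-Changing): mAP ↓ |
|---|---|---|---|---|---|---|---|---|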
| Classical | ||||||||
| EgoBlur (face only) | 0.0014 | 0.9263 | 100.0 | 100.0 | 99.9 | 86.6 | 91.2 | 87.2 |
| Gaussian blur | 0.0148 | 0.7694 | 50.0 | 60.8 | 53.3 | 40.0 | 51.6 | 43.3 |
| Pixelation | 0.0039 | 0.7549 | 100.0 | 100.0 | 99.8 | 84.8 | 90.6 | 85.9 |
| Blackout | 0.0039 | 0.7349 | 7.6 | 18.4 | 14.2 | 17.4 | 29.6 | 20.7 |
| Face-only, generative | ||||||||
| Face Anonymization Made Simple | 0.0884 | 0.8396 | 100.0 | 100.0 | 100.0 | 90.8 | 92.0 | 90.1 |
| DeepPrivacy2 (face) | 0.0603 | 0.9166 | 79.1 | 87.3 | 79.0 | 69.4 | 79.2 | 70.4 |
| Body-only, generative | ||||||||
| DeepPrivacy2 (body) | 0.0309 | 0.7404 | 7.0 | 13.3 | 11.3 | 29.0 | 40.0 | 33.7 |
| FADM | 0.0728 | 0.7961 | 96.2 | 98.7 | 96.7 | 86.4 | 90.4 | 85.9 |
| Ours (Wan2.1-VACE conditioning) | ||||||||
| StructHuman | 0.0092 | 0.3593 | 3.2 | 3.2 | 4.6 | 1.4 | 2.0 | 2.5 |
| StructAll | 0.0063 | 0.0753 | — | — | — | — | — | — |
Appearance-leakage and body re-identification metrics on HOIGen-1M. For a fair re-ID comparison, we report only StructHuman, which preserves the background.
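For readers unfamiliar with the re-ID columns, the sketch below shows how Rank-k accuracy and mAP are conventionally computed from a query-by-gallery similarity matrix. The evaluation protocol is an assumption here (anonymized clips as queries, original identities as the gallery), as is the function name `rank_k_and_map`; the page does not specify either. Lower values mean the re-ID model fails to match an anonymized person to their original identity.

```python
from typing import Dict, List, Sequence, Tuple


def rank_k_and_map(
    similarity: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
    ks: Sequence[int] = (1, 5),
) -> Tuple[Dict[int, float], float]:
    """Return ({k: Rank-k accuracy in %}, mAP in %).

    similarity[q][g] is the re-ID model's score between query q and gallery
    item g (higher means more similar).
    """
    hits = {k: 0 for k in ks}
    average_precisions: List[float] = []
    for q, scores in enumerate(similarity):
        # Rank gallery items from most to least similar.
        order = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        for k in ks:
            hits[k] += any(matches[:k])  # hit if a true match is in the top k
        # Average precision: mean of precision@rank over the true matches.
        found, precision_sum = 0, 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                found += 1
                precision_sum += found / rank
        # Simplification: a query with no true match contributes AP = 0.
        average_precisions.append(precision_sum / found if found else 0.0)
    n = len(similarity)
    rank_k = {k: 100.0 * hits[k] / n for k in ks}
    mean_ap = 100.0 * sum(average_precisions) / n
    return rank_k, mean_ap
```

Under this reading, a row such as EgoBlur's 100.0 Rank-1 means every face-blurred query still retrieves its own identity first, which is why face-only methods leave the body identifiable.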
| Method | Subject Consist. | Backgr. Consist. | Motion Smooth. | Temporal Flicker. | Overall Consist. | Human Anatomy | Human Identity | Realism |
|---|---|---|---|---|---|---|---|---|
| Classical | ||||||||
| EgoBlur (face only) | 92.18 | 92.54 | 96.92 | 95.10 | 12.43 | 83.70 | 13.20 | 84.00 |
| Gaussian blur | 92.02 | 92.96 | 97.41 | 95.80 | 13.62 | 75.30 | 14.49 | 75.04 |
| Pixelation | 93.01 | 93.32 | 96.44 | 94.98 | 13.14 | 70.15 | 6.93 | 60.92 |
| Blackout | 92.07 | 92.42 | 97.43 | 95.96 | 11.79 | 81.13 | 7.69 | 69.76 |
| Face-only, generative | ||||||||
| Face Anonymization Made Simple | 90.71 | 90.18 | 96.61 | 94.75 | 12.16 | 74.31 | 8.23 | 87.86 |
| DeepPrivacy2 (face) | 91.67 | 91.26 | 96.77 | 94.88 | 11.99 | 87.34 | 11.86 | 88.45 |
| Body-only, generative | ||||||||
| DeepPrivacy2 (body) | 89.16 | 88.76 | 95.78 | 94.03 | 12.06 | 76.17 | 10.72 | 83.74 |
| FADM | 88.19 | 88.48 | 95.86 | 94.11 | 12.02 | 83.55 | 10.88 | 88.00 |
| Ours (Wan2.1-VACE conditioning) | ||||||||
| StructHuman | 92.23 | 91.52 | 97.12 | 95.05 | 16.36 | 89.74 | 19.30 | 89.49 |
| StructAll | 92.88 | 92.35 | 96.98 | 94.91 | 16.11 | 86.85 | 17.35 | 89.65 |
VBench / VBench-2.0-aligned quality metrics on HOIGen-1M (all reported as percentages).
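To give a sense of what a flicker-style score measures, here is a simplified frame-difference proxy: the mean absolute difference between consecutive frames, mapped so that a perfectly static clip scores 100. This is an illustrative assumption and not the VBench implementation; the function name and the flat 0 to 255 pixel-list format are hypothetical.

```python
from typing import Sequence


def temporal_flicker_score(frames: Sequence[Sequence[float]]) -> float:
    """Return a 0-100 score; higher means less frame-to-frame change.

    Each frame is a flat sequence of pixel intensities in [0, 255].
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    frame_maes = []
    for prev, cur in zip(frames, frames[1:]):
        # Mean absolute pixel difference between consecutive frames.
        mae = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        frame_maes.append(mae)
    mean_mae = sum(frame_maes) / len(frame_maes)
    return 100.0 * (255.0 - mean_mae) / 255.0
```

A proxy like this rewards any low-motion output, which is consistent with blur and blackout baselines scoring well on this column in the table while scoring poorly on realism.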
| Method | HOIGen (zero-shot): Acc. ↑ | HOIGen (zero-shot): Δ | NExT-QA (zero-shot): Acc. ↑ | NExT-QA (zero-shot): Δ | MedVideoCap (zero-shot): Acc. ↑ | MedVideoCap (zero-shot): Δ |
|---|---|---|---|---|---|---|
| Original | ||||||
| Original (upper bound) | 86.40 | — | 78.31 | — | 90.47 | — |
| Classical | ||||||
| EgoBlur | 86.90 | +0.50 | 77.56 | -0.75 | 90.60 | +0.13 |
| Gaussian blur | 81.33 | -5.07 | 72.93 | -5.38 | 85.27 | -5.20 |
| Pixelation | 78.20 | -8.20 | 73.76 | -4.55 | 85.13 | -5.34 |
| Blackout | 73.47 | -12.93 | 73.00 | -5.31 | 74.80 | -15.67 |
| Face-only, generative | ||||||
| Face Anonymization Made Simple | 86.13 | -0.27 | 77.97 | -0.34 | 89.60 | -0.87 |
| DeepPrivacy2 (face) | 85.60 | -0.80 | 77.80 | -0.51 | 89.80 | -0.67 |
| Body-only, generative | ||||||
| DeepPrivacy2 (body) | 73.10 | -13.30 | 73.39 | -4.92 | 78.07 | -12.40 |
| FADM | 83.33 | -3.07 | 76.49 | -1.82 | 86.30 | -4.17 |
| Ours (Wan2.1-VACE conditioning) | ||||||
| StructHuman | 82.37 | -4.03 | 75.37 | -2.94 | 85.20 | -5.27 |
| StructAll | 81.80 | -4.60 | 75.85 | -2.46 | 84.67 | -5.80 |
Zero-shot Video Question Answering accuracy with a frozen Qwen3-VL-8B on anonymized clips. Δ is the difference in accuracy, in percentage points, from the un-anonymized upper bound (top row).
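As a worked check of the Δ columns, the snippet below recomputes the StructHuman deltas from the accuracies in the table. The dictionary and function names are illustrative only; the numbers are copied from the rows above.

```python
# Accuracies (%) copied from the table above.
UPPER_BOUND = {"HOIGen": 86.40, "NExT-QA": 78.31, "MedVideoCap": 90.47}
STRUCT_HUMAN = {"HOIGen": 82.37, "NExT-QA": 75.37, "MedVideoCap": 85.20}


def accuracy_delta(acc: dict, upper_bound: dict) -> dict:
    """Per-dataset accuracy minus the un-anonymized accuracy, in points."""
    return {name: round(acc[name] - upper_bound[name], 2) for name in acc}


deltas = accuracy_delta(STRUCT_HUMAN, UPPER_BOUND)
# deltas == {"HOIGen": -4.03, "NExT-QA": -2.94, "MedVideoCap": -5.27}
```

These match the Δ entries in the StructHuman row, confirming that Δ is a plain difference rather than a relative percentage change.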
@article{regenhuman2026,
title = {ReGenHuman: Re-Generating Human Appearances for Realistic
Full-Body Video Anonymization},
author = {Sun, Adam and Barkataki, Eshaan and Milstein, Arnold
and Wetzstein, Gordon and Adeli, Ehsan},
journal = {Advances in Neural Information Processing Systems},
year = {2026},
}