ReGenHuman: Re-Generating Human Appearances for Realistic Full-Body Video Anonymization

Prior anonymization methods either redact people or edit the video frame by frame, which produces unrealistic results. ReGenHuman is the first anonymization method that produces temporally consistent videos that are realistic and anonymous by construction.

Adam Sun*,  Eshaan Barkataki,  Arnold Milstein,  Gordon Wetzstein,  Ehsan Adeli*
Stanford University
*Corresponding author
Video panels: Ours · StructHuman; Ours · StructAll

The same clip, anonymized eight ways. Pixel and face methods leave the body identifiable; prior full-body methods flicker and distort from frame to frame. Ours stays realistic and consistent with the original video. We provide two versions of our model: StructHuman, which preserves the background and text, and StructAll, which anonymizes them.

A Good Anonymizer Must Satisfy Three Axes

🔒

Privacy

The video must be fully private and not be re-identifiable.

🎬

Quality

The video must be realistic – it should be temporally and geometrically consistent.

🧠

Utility

The video must remain recognizable for downstream tasks and be contextually consistent with the original input video.

The Idea

Human-centric video is everywhere, and almost always identifiable. To share or train on it safely, we need to anonymize the people in it. However, blurring and redacting detected humans destroys realism and negatively affects downstream usability, while existing generative anonymization methods generate frame by frame, resulting in temporal flickering and identity drift.

Our insight is simple: regenerate, don't edit. Instead of hiding or editing the people in the original video directly (which compromises privacy), we throw the original human pixels away entirely and regenerate the humans from identity-free structural cues (pose, depth, and segmentation) using a video diffusion model that denoises the whole clip at once. Because no original human pixel ever reaches the output, privacy is guaranteed by construction, and because the clip is generated jointly, the result is temporally consistent.

How It Works

We composite identity-free pose, depth, and segmentation into a structural conditioning video, then a video diffusion model (Wan2.1-VACE + a LoRA) regenerates the human from it jointly across the whole clip, so the output is temporally consistent.
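The compositing step can be illustrated with a minimal sketch. It assumes the pose, depth, and segmentation cues have already been rendered into a single per-frame structure map, that frames are small grayscale grids, and that StructHuman keeps the original background while StructAll replaces every pixel with structure. The names `composite_frame`, `composite_clip`, and `keep_background` are hypothetical and are not taken from the ReGenHuman code.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame as rows of pixel values (assumption)
Mask = List[List[bool]]    # True where a pixel belongs to a detected human


def composite_frame(original: Frame, structure: Frame, human_mask: Mask,
                    keep_background: bool) -> Frame:
    """Build one conditioning frame.

    Human pixels always come from the structure map, never from the original.
    Background pixels come from the original only when keep_background is True
    (the assumed StructHuman setting); otherwise they also come from the
    structure map (the assumed StructAll setting).
    """
    out: Frame = []
    for orig_row, struct_row, mask_row in zip(original, structure, human_mask):
        row = []
        for orig_px, struct_px, is_human in zip(orig_row, struct_row, mask_row):
            if is_human or not keep_background:
                row.append(struct_px)
            else:
                row.append(orig_px)
        out.append(row)
    return out


def composite_clip(originals: List[Frame], structures: List[Frame],
                   masks: List[Mask], keep_background: bool) -> List[Frame]:
    """Apply the same compositing rule to every frame of a clip."""
    return [composite_frame(o, s, m, keep_background)
            for o, s, m in zip(originals, structures, masks)]
```

Under these assumptions, no value from `original` inside the human mask can appear in the conditioning video, which is the property that the "by construction" privacy argument relies on. The diffusion model that consumes this conditioning is outside the scope of the sketch.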

Figure: ReGenHuman pipeline architecture. Video panels: fully anonymous structural conditioning (StructAll); regenerated human (Ours).

Interactive Comparison

Pick a method to compare against, then drag the slider across the clips. Everything left of the divider is the comparison method; everything right is our method. The slider moves across all four clips at once.

Four sequences from the HOIGen-1M test set. Drag anywhere on a clip to move the divider on all four.

Quantitative Results

ReGenHuman (★) shows the strongest anonymization and best generation quality, while maintaining good downstream utility.

Privacy-quality-utility tradeoff: ReGenHuman vs. baselines

Per-axis numbers below. In every table, orange marks the best and blue the second-best per column.

Privacy, lower is better

                                Appearance leakage      Body re-ID: Standard    Body re-ID: Clothes-Changing
Method                          ID Sim. ↓  Text Sim. ↓  R-1 ↓   R-5 ↓   mAP ↓   R-1 ↓   R-5 ↓   mAP ↓
Classical
EgoBlur (face only)             0.0014     0.9263       100.0   100.0   99.9    86.6    91.2    87.2
Gaussian blur                   0.0148     0.7694       50.0    60.8    53.3    40.0    51.6    43.3
Pixelation                      0.0039     0.7549       100.0   100.0   99.8    84.8    90.6    85.9
Blackout                        0.0039     0.7349       7.6     18.4    14.2    17.4    29.6    20.7
Face-only, generative
Face Anonymization Made Simple  0.0884     0.8396       100.0   100.0   100.0   90.8    92.0    90.1
DeepPrivacy2 (face)             0.0603     0.9166       79.1    87.3    79.0    69.4    79.2    70.4
Body-only, generative
DeepPrivacy2 (body)             0.0309     0.7404       7.0     13.3    11.3    29.0    40.0    33.7
FADM                            0.0728     0.7961       96.2    98.7    96.7    86.4    90.4    85.9
Ours (Wan2.1-VACE conditioning)
StructHuman                     0.0092     0.3593       3.2     3.2     4.6     1.4     2.0     2.5
StructAll                       0.0063     0.0753       n/a     n/a     n/a     n/a     n/a     n/a

Appearance-leakage and body re-identification metrics on HOIGen-1M. For a fair re-ID comparison, we report only StructHuman, which preserves the background.
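For readers unfamiliar with the re-ID columns, the sketch below shows how Rank-k (R-1, R-5) and mAP are commonly computed from a query-by-gallery similarity matrix. It assumes anonymized clips serve as queries against a gallery of original identities, and it omits details such as camera filtering; the exact protocol behind the table is not described on this page, and the function name `reid_metrics` is hypothetical.

```python
from typing import Dict, List, Sequence


def reid_metrics(similarity: List[List[float]],
                 query_ids: Sequence[str],
                 gallery_ids: Sequence[str],
                 ks: Sequence[int] = (1, 5)) -> Dict[str, float]:
    """Return Rank-k and mAP as percentages.

    similarity[q][g] is the similarity between query q and gallery item g.
    For an anonymizer, lower values mean the re-ID model fails more often.
    """
    hits = {k: 0 for k in ks}
    ap_sum = 0.0
    valid_queries = 0
    for sims, query_id in zip(similarity, query_ids):
        # Rank gallery items from most to least similar.
        ranking = sorted(range(len(sims)), key=lambda g: sims[g], reverse=True)
        matches = [gallery_ids[g] == query_id for g in ranking]
        if not any(matches):
            continue  # skip queries with no true match in the gallery
        valid_queries += 1
        for k in ks:
            hits[k] += any(matches[:k])
        # Average precision: mean of precision at each true-match position.
        found = 0
        precision_sum = 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                found += 1
                precision_sum += found / rank
        ap_sum += precision_sum / found
    if valid_queries == 0:
        raise ValueError("no query has a matching gallery identity")
    metrics = {f"R-{k}": 100.0 * hits[k] / valid_queries for k in ks}
    metrics["mAP"] = 100.0 * ap_sum / valid_queries
    return metrics
```

A method that leaves the body visible lets the re-ID model rank the true identity first, pushing R-1 toward 100; a method that removes identity cues pushes all three numbers toward chance.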

Quality, higher is better

Method                          Subject   Backgr.   Motion    Temporal  Overall   Human     Human     Realism
                                Consist.  Consist.  Smooth.   Flicker.  Consist.  Anatomy   Identity
Classical
EgoBlur (face only)             92.18     92.54     96.92     95.10     12.43     83.70     13.20     84.00
Gaussian blur                   92.02     92.96     97.41     95.80     13.62     75.30     14.49     75.04
Pixelation                      93.01     93.32     96.44     94.98     13.14     70.15     6.93      60.92
Blackout                        92.07     92.42     97.43     95.96     11.79     81.13     7.69      69.76
Face-only, generative
Face Anonymization Made Simple  90.71     90.18     96.61     94.75     12.16     74.31     8.23      87.86
DeepPrivacy2 (face)             91.67     91.26     96.77     94.88     11.99     87.34     11.86     88.45
Body-only, generative
DeepPrivacy2 (body)             89.16     88.76     95.78     94.03     12.06     76.17     10.72     83.74
FADM                            88.19     88.48     95.86     94.11     12.02     83.55     10.88     88.00
Ours (Wan2.1-VACE conditioning)
StructHuman                     92.23     91.52     97.12     95.05     16.36     89.74     19.30     89.49
StructAll                       92.88     92.35     96.98     94.91     16.11     86.85     17.35     89.65

VBench / VBench-2.0-aligned quality metrics on HOIGen-1M (all reported as percentages).
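To give intuition for the Temporal Flicker. column, here is a simplified proxy, not the VBench implementation: it averages the absolute pixel difference between consecutive frames and maps it to a percentage so that higher means less flicker. The function name `flicker_score`, the grayscale frame format, and the 0 to 255 value range are assumptions made for illustration.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame as rows of pixel values (assumption)


def flicker_score(frames: List[Frame], max_value: float = 255.0) -> float:
    """Return a 0-100 score; 100 means consecutive frames are identical."""
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    total = 0.0
    count = 0
    # Accumulate absolute differences over every pair of adjacent frames.
    for prev, curr in zip(frames, frames[1:]):
        for prev_row, curr_row in zip(prev, curr):
            for a, b in zip(prev_row, curr_row):
                total += abs(a - b)
                count += 1
    mean_abs_diff = total / count
    return 100.0 * (max_value - mean_abs_diff) / max_value
```

A proxy like this rewards any static video, which is why it is only meaningful alongside the motion and consistency columns rather than on its own.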

Utility, downstream VQA, higher is better

                                HOIGen (zero-shot)  NExT-QA (zero-shot) MedVideoCap (zero-shot)
Method                          Acc. ↑    Δ         Acc. ↑    Δ         Acc. ↑    Δ
Original
Original (upper bound)          86.40     n/a       78.31     n/a       90.47     n/a
Classical
EgoBlur                         86.90     +0.50     77.56     -0.75     90.60     +0.13
Gaussian blur                   81.33     -5.07     72.93     -5.38     85.27     -5.20
Pixelation                      78.20     -8.20     73.76     -4.55     85.13     -5.34
Blackout                        73.47     -12.93    73.00     -5.31     74.80     -15.67
Face-only, generative
Face Anonymization Made Simple  86.13     -0.27     77.97     -0.34     89.60     -0.87
DeepPrivacy2 (face)             85.60     -0.80     77.80     -0.51     89.80     -0.67
Body-only, generative
DeepPrivacy2 (body)             73.10     -13.30    73.39     -4.92     78.07     -12.40
FADM                            83.33     -3.07     76.49     -1.82     86.30     -4.17
Ours (Wan2.1-VACE conditioning)
StructHuman                     82.37     -4.03     75.37     -2.94     85.20     -5.27
StructAll                       81.80     -4.60     75.85     -2.46     84.67     -5.80

Zero-shot Video Question Answering accuracy with a frozen Qwen3-VL-8B on anonymized clips. Δ is relative to the un-anonymized upper bound (top row).
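The Δ columns are consistent with a plain difference in percentage points between the anonymized accuracy and the original accuracy on the same dataset. The short sketch below reproduces a few table entries; the function name `accuracy_delta` is hypothetical.

```python
def accuracy_delta(anonymized_acc: float, original_acc: float) -> float:
    """Difference in percentage points, rounded to two decimals as in the table."""
    return round(anonymized_acc - original_acc, 2)


# Values taken from the utility table above.
struct_human_hoigen = accuracy_delta(82.37, 86.40)    # -4.03
egoblur_nextqa = accuracy_delta(77.56, 78.31)         # -0.75
blackout_medvideocap = accuracy_delta(74.80, 90.47)   # -15.67
```

A negative Δ means the frozen VQA model answers fewer questions correctly on the anonymized clips than on the originals.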

BibTeX

@article{regenhuman2026,
  title     = {ReGenHuman: Re-Generating Human Appearances for Realistic
               Full-Body Video Anonymization},
  author    = {Sun, Adam and Barkataki, Eshaan and Milstein, Arnold
               and Wetzstein, Gordon and Adeli, Ehsan},
  journal   = {Advances in Neural Information Processing Systems},
  year      = {2026},
}