ReGenHuman: Re-Generating Human Appearances for Realistic Full-Body Video Anonymization

Prior anonymization methods either redact people or edit videos frame by frame, which produces unrealistic results. ReGenHuman is the first anonymization method that produces temporally consistent videos that are realistic and anonymous by construction.

Adam Sun^*, Eshaan Barkataki, Arnold Milstein, Gordon Wetzstein, Ehsan Adeli^*

Stanford University

^*Corresponding author


Video panels: Original, EgoBlur, Face Anonymization Made Simple, DeepPrivacy2 (face), DeepPrivacy2 (body), FADM, Ours · StructHuman, Ours · StructAll.

The same clip in eight versions: the original and seven anonymized outputs. Pixel-level and face-only methods leave the body identifiable; prior full-body methods flicker and distort from frame to frame. Ours stays realistic and consistent with the original video. We provide two versions of our model: StructHuman, which preserves the background and text, and StructAll, which anonymizes them.

A Good Anonymizer Must Perform Well on Three Axes

🔒

Privacy

The video must be fully private: the people in it must not be re-identifiable.

🎬

Quality

The video must be realistic – it should be temporally and geometrically consistent.

🧠

Utility

The video must remain recognizable for downstream tasks and be contextually consistent with the original input video.

The Idea

Human-centric video is everywhere, and it is almost always identifiable. To share or train on it safely, we need to anonymize the people in it. However, blurring or redacting detected humans destroys realism and hurts downstream usability, while existing generative anonymization methods generate frame by frame, which causes temporal flickering and identity drift.

Our insight is simple: regenerate, don't edit. Instead of hiding or editing the original human pixels directly (which compromises privacy), we throw them away entirely and regenerate the people from identity-free structural cues (pose, depth, and segmentation) using a video diffusion model that denoises the whole clip at once. Because no original human pixel ever reaches the output, privacy is guaranteed by construction, and because the clip is generated jointly, the result is temporally consistent.

How It Works

We composite identity-free pose, depth, and segmentation into a structural conditioning video, then a video diffusion model (Wan2.1-VACE + a LoRA) regenerates the human from it jointly across the whole clip, so the output is temporally consistent.

Figure: fully anonymous structural conditioning (StructAll), left, and the regenerated human (Ours), right.
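As a rough illustration of the compositing step described above, the sketch below builds a conditioning video from per-frame depth, segmentation, and a rendered pose skeleton. The function name `composite_conditioning`, the array layouts, the layering order, and the `keep_background` switch (standing in for the StructHuman versus StructAll difference) are all assumptions made for this example; the actual compositing used by ReGenHuman is not specified on this page.

```python
import numpy as np

def composite_conditioning(frames, depth, seg, pose, keep_background=True):
    """Build a structural conditioning video (illustrative sketch only).

    frames: (T, H, W, 3) uint8, original RGB video
    depth:  (T, H, W) float32 in [0, 1], per-frame depth estimate
    seg:    (T, H, W) bool, True where a human is present
    pose:   (T, H, W, 3) uint8, rendered skeleton, zeros where no limb is drawn
    keep_background: True approximates a StructHuman-style input (original
        background kept); False approximates a StructAll-style input
        (structure everywhere). Both readings are assumptions.
    """
    # Render depth as a 3-channel grayscale image.
    depth_rgb = np.repeat((depth * 255).astype(np.uint8)[..., None], 3, axis=-1)
    # Draw the skeleton on top of the depth wherever a limb pixel exists.
    limb = pose.any(axis=-1, keepdims=True)
    structure = np.where(limb, pose, depth_rgb)
    if not keep_background:
        return structure
    # Original pixels survive only outside the human mask.
    return np.where(seg[..., None], structure, frames)
```

Inside the human mask the result depends only on depth and pose, so no original human pixel can pass through this step into the conditioning video.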

Interactive Comparison

Four sequences from the HOIGen-1M test set, each shown as a split view: the selected comparison method is left of the divider and our method is right of it.

Quantitative Results

ReGenHuman (★) shows the strongest anonymization and the best generation quality, while maintaining good downstream utility.

Privacy-quality-utility tradeoff: ReGenHuman vs. baselines

Per-axis numbers below. In every table, orange marks the best and blue the second-best per column.

Privacy, lower is better

Method	Leakage: ID Sim. ↓	Leakage: Text Sim. ↓	Re-ID Standard: R-1 ↓	Re-ID Standard: R-5 ↓	Re-ID Standard: mAP ↓	Re-ID Clothes-Changing: R-1 ↓	Re-ID Clothes-Changing: R-5 ↓	Re-ID Clothes-Changing: mAP ↓
Classical
EgoBlur (face only)	0.0014	0.9263	100.0	100.0	99.9	86.6	91.2	87.2
Gaussian blur	0.0148	0.7694	50.0	60.8	53.3	40.0	51.6	43.3
Pixelation	0.0039	0.7549	100.0	100.0	99.8	84.8	90.6	85.9
Blackout	0.0039	0.7349	7.6	18.4	14.2	17.4	29.6	20.7
Face-only, generative
Face Anonymization Made Simple	0.0884	0.8396	100.0	100.0	100.0	90.8	92.0	90.1
DeepPrivacy2 (face)	0.0603	0.9166	79.1	87.3	79.0	69.4	79.2	70.4
Body-only, generative
DeepPrivacy2 (body)	0.0309	0.7404	7.0	13.3	11.3	29.0	40.0	33.7
FADM	0.0728	0.7961	96.2	98.7	96.7	86.4	90.4	85.9
Ours (Wan2.1-VACE conditioning)
StructHuman	0.0092	0.3593	3.2	3.2	4.6	1.4	2.0	2.5
StructAll	0.0063	0.0753	—	—	—	—	—	—

Appearance-leakage and body re-identification metrics on HOIGen-1M. To keep the re-ID comparison fair, we report re-ID only for StructHuman, which preserves the background.
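For readers unfamiliar with the re-ID columns, the sketch below shows one common way to compute Rank-k accuracy (R-1, R-5) and mAP from embedding vectors using cosine-similarity retrieval. The function name `reid_metrics`, the choice of cosine similarity, and the split into query and gallery sets are assumptions for illustration; the re-ID models and evaluation protocol behind the table are not described on this page.

```python
import numpy as np

def reid_metrics(query_feats, query_ids, gallery_feats, gallery_ids, ks=(1, 5)):
    """Rank-k accuracy and mAP, both in percent (illustrative sketch only).

    query_feats:   (Q, D) float array of query embeddings
    query_ids:     (Q,) int array of query identities
    gallery_feats: (G, D) float array of gallery embeddings
    gallery_ids:   (G,) int array of gallery identities
    """
    # Cosine similarity between every query and every gallery item.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sim = q @ g.T
    # Gallery indices sorted from best to worst match, per query.
    order = np.argsort(-sim, axis=1)
    matches = gallery_ids[order] == query_ids[:, None]
    # Drop queries whose identity never appears in the gallery.
    matches = matches[matches.any(axis=1)]
    # Rank-k: a correct identity appears among the top k results.
    rank_k = {k: 100.0 * float(matches[:, :k].any(axis=1).mean()) for k in ks}
    # Average precision: mean of the precision at each correct hit.
    aps = []
    for row in matches:
        hits = np.flatnonzero(row)
        aps.append((np.arange(1, hits.size + 1) / (hits + 1)).mean())
    return rank_k, 100.0 * float(np.mean(aps))
```

Under this reading, lower values mean that the re-ID model fails more often to retrieve the correct identity, which is the desired outcome for an anonymizer.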

Quality, higher is better

Method	Subject Consist.	Backgr. Consist.	Motion Smooth.	Temporal Flicker.	Overall Consist.	Human Anatomy	Human Identity	Realism
Classical
EgoBlur (face only)	92.18	92.54	96.92	95.10	12.43	83.70	13.20	84.00
Gaussian blur	92.02	92.96	97.41	95.80	13.62	75.30	14.49	75.04
Pixelation	93.01	93.32	96.44	94.98	13.14	70.15	6.93	60.92
Blackout	92.07	92.42	97.43	95.96	11.79	81.13	7.69	69.76
Face-only, generative
Face Anonymization Made Simple	90.71	90.18	96.61	94.75	12.16	74.31	8.23	87.86
DeepPrivacy2 (face)	91.67	91.26	96.77	94.88	11.99	87.34	11.86	88.45
Body-only, generative
DeepPrivacy2 (body)	89.16	88.76	95.78	94.03	12.06	76.17	10.72	83.74
FADM	88.19	88.48	95.86	94.11	12.02	83.55	10.88	88.00
Ours (Wan2.1-VACE conditioning)
StructHuman	92.23	91.52	97.12	95.05	16.36	89.74	19.30	89.49
StructAll	92.88	92.35	96.98	94.91	16.11	86.85	17.35	89.65

VBench / VBench-2.0-aligned quality metrics on HOIGen-1M (all reported as percentages).
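To give a feel for what a temporal-flicker score measures, the sketch below computes a simple frame-difference score: the mean absolute difference between consecutive frames, rescaled so that 100 means no change at all. This is a simplified proxy written for illustration, not the VBench implementation, and the function name `flicker_score` is hypothetical.

```python
import numpy as np

def flicker_score(frames):
    """Frame-difference flicker proxy in [0, 100]; higher means less flicker.

    frames: (T, H, W, 3) uint8 video with at least two frames.
    """
    f = frames.astype(np.float32)
    # Mean absolute difference between each frame and the next one.
    mae = float(np.abs(f[1:] - f[:-1]).mean())
    # Rescale so that identical frames give 100 and maximal change gives 0.
    return 100.0 * (255.0 - mae) / 255.0
```

A score like this rewards any static video, which is one reason blurred or blacked-out baselines can still score highly on smoothness and flicker columns.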

Utility, downstream VQA, higher is better

Method	HOIGen (zero-shot): Acc. ↑	HOIGen (zero-shot): Δ	NExT-QA (zero-shot): Acc. ↑	NExT-QA (zero-shot): Δ	MedVideoCap (zero-shot): Acc. ↑	MedVideoCap (zero-shot): Δ
Original
Original (upper bound)	86.40	—	78.31	—	90.47	—
Classical
EgoBlur	86.90	+0.50	77.56	-0.75	90.60	+0.13
Gaussian blur	81.33	-5.07	72.93	-5.38	85.27	-5.20
Pixelation	78.20	-8.20	73.76	-4.55	85.13	-5.34
Blackout	73.47	-12.93	73.00	-5.31	74.80	-15.67
Face-only, generative
Face Anonymization Made Simple	86.13	-0.27	77.97	-0.34	89.60	-0.87
DeepPrivacy2 (face)	85.60	-0.80	77.80	-0.51	89.80	-0.67
Body-only, generative
DeepPrivacy2 (body)	73.10	-13.30	73.39	-4.92	78.07	-12.40
FADM	83.33	-3.07	76.49	-1.82	86.30	-4.17
Ours (Wan2.1-VACE conditioning)
StructHuman	82.37	-4.03	75.37	-2.94	85.20	-5.27
StructAll	81.80	-4.60	75.85	-2.46	84.67	-5.80

Zero-shot Video Question Answering accuracy with a frozen Qwen3-VL-8B on anonymized clips. Δ is the accuracy difference, in percentage points, from the un-anonymized upper bound (top row).
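The Δ columns can be reproduced directly from the accuracy columns. The short sketch below does this for three HOIGen rows taken from the table; the helper name `utility_delta` is hypothetical.

```python
def utility_delta(acc_anonymized, acc_original):
    """Accuracy change in percentage points relative to the original video."""
    return round(acc_anonymized - acc_original, 2)

original = 86.40  # HOIGen accuracy on the un-anonymized clips (top row)
hoigen_acc = {"EgoBlur": 86.90, "StructHuman": 82.37, "StructAll": 81.80}

for name, acc in hoigen_acc.items():
    # Prints +0.50, -4.03 and -4.60, matching the HOIGen Δ column.
    print(f"{name}: {utility_delta(acc, original):+.2f}")
```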

BibTeX

@article{regenhuman2026,
  title     = {ReGenHuman: Re-Generating Human Appearances for Realistic
               Full-Body Video Anonymization},
  author    = {Sun, Adam and Barkataki, Eshaan and Milstein, Arnold
               and Wetzstein, Gordon and Adeli, Ehsan},
  journal   = {arXiv},
  year      = {2026},
}