Performance Notes

What we measured

We benchmarked the end-to-end SAM -> MatAnyone -> output video pipeline with the local validation script:

powershell
uv run --python .\.venv\Scripts\python.exe matanyone2-runtime cli --input media\bookcat.mp4 --device cpu --performance_profile fast --cpu_threads 8 --sam_model_type vit_h --frame_limit 241 --video_target_fps 0 --output_fps 12 --positive_point 280,180 --negative_point 30,30 --negative_point 530,30 --output_dir results

Test asset:

  • Input: media/bookcat.mp4
  • Resolution: 560x560
  • Frames: 241
  • Prompt points: 280,180 (positive), 30,30 and 530,30 (negative)

Generated comparison artifacts live under:

  • results/bookcat-profile-exp/
  • results/bookcat-profile-exp-gpu/
  • results/bookcat-profile-exp-compare.json
  • media/bookcat-comparison-frame-120.png
  • media/bookcat-quality-preview.webp
  • media/bookcat-balanced-preview.webp
  • media/bookcat-fast-preview.webp

Current ad-hoc runs created through the shared CLI go into a timestamped folder such as results/bookcat_1773163828_6577592/, which also includes debug artifacts and metadata.json. The historical benchmark folders above are still useful when you want stable paths for reporting or scripted comparisons.
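For scripted comparisons against ad-hoc runs, the newest timestamped folder can be located programmatically. The following is a minimal sketch, not part of the project: the helper name `latest_run_metadata` is hypothetical, it assumes run folders start with `bookcat_` as in the example path, and it treats folder modification time as a proxy for recency. The keys inside metadata.json are not documented here, so the sketch returns the parsed content unchanged.

```python
import json
from pathlib import Path


def latest_run_metadata(results_dir="results", prefix="bookcat_"):
    """Return (run_dir, metadata) for the newest run folder, or None if there is none."""
    root = Path(results_dir)
    # Only consider folders that actually contain a metadata.json file.
    runs = [p for p in root.glob(prefix + "*") if (p / "metadata.json").is_file()]
    if not runs:
        return None
    # Assumption: the folder's modification time identifies the most recent run.
    newest = max(runs, key=lambda p: p.stat().st_mtime)
    with open(newest / "metadata.json", encoding="utf-8") as fh:
        return newest, json.load(fh)
```

Because the prefix ends in an underscore, the historical `bookcat-profile-exp` folders are not picked up by this lookup.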

Representative visual comparison:

  • results/bookcat-profile-exp/comparison_frame_120.png

Representative comparison frame

Compact animated previews:

| quality | balanced | fast |
| --- | --- | --- |
| Quality profile preview | Balanced profile preview | Fast profile preview |

Speed results

The timings below include mask generation, matting, and output writing.

| Profile | CPU time | GPU time | GPU speedup vs CPU |
| --- | --- | --- | --- |
| quality | 227.76s | 72.84s | 3.13x |
| balanced | 225.62s | 78.72s | 2.87x |
| fast | 211.33s | 71.43s | 2.96x |
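The speedup column is the CPU time divided by the GPU time for the same profile. The short sketch below recomputes it from the timings in the table and also expresses each profile's time saving relative to quality. The function names are illustrative and are not part of the benchmark tooling.

```python
# Timings in seconds, copied from the speed results table.
timings = {
    "quality": {"cpu": 227.76, "gpu": 72.84},
    "balanced": {"cpu": 225.62, "gpu": 78.72},
    "fast": {"cpu": 211.33, "gpu": 71.43},
}


def gpu_speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU run finished than the CPU run."""
    return cpu_seconds / gpu_seconds


def saving_vs_reference(seconds, reference_seconds):
    """Percentage of time saved relative to a reference run (negative means slower)."""
    return 100.0 * (reference_seconds - seconds) / reference_seconds


for profile, t in timings.items():
    speedup = gpu_speedup(t["cpu"], t["gpu"])
    cpu_saving = saving_vs_reference(t["cpu"], timings["quality"]["cpu"])
    gpu_saving = saving_vs_reference(t["gpu"], timings["quality"]["gpu"])
    print(f"{profile:<9} {speedup:.2f}x  cpu {cpu_saving:+.1f}%  gpu {gpu_saving:+.1f}%")
```

The same arithmetic shows that fast saves about 7.2% over quality on CPU and about 1.9% on GPU.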

Quality deltas

quality was used as the reference output.

| Profile | Foreground MAE | Alpha MAE | Foreground PSNR | Alpha PSNR |
| --- | --- | --- | --- | --- |
| balanced | 0.87 | 0.63 | 33.82 dB | 30.31 dB |
| fast | 1.48 | 1.92 | 29.04 dB | 24.14 dB |
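For readers unfamiliar with the two metrics, here is a minimal sketch of how MAE and PSNR are conventionally computed between a reference frame and a candidate frame. It is not the comparison script used for the table: it assumes 8-bit pixel values (peak 255) supplied as flat sequences, and the function names are hypothetical. The value scale and any per-frame averaging behind the reported numbers are not specified in this document.

```python
import math


def _check(reference, candidate):
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")


def mae(reference, candidate):
    """Mean absolute error between two flat sequences of pixel values."""
    _check(reference, candidate)
    return sum(abs(r - c) for r, c in zip(reference, candidate)) / len(reference)


def psnr(reference, candidate, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    _check(reference, candidate)
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf
    # Assumption: `peak` is the maximum representable pixel value (255 for 8-bit).
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example with four pixels; lower MAE and higher PSNR both mean "closer".
reference = [0, 128, 255, 255]
candidate = [0, 130, 250, 255]
print(mae(reference, candidate), psnr(reference, candidate))
```

PSNR depends on squared error, so a few large local errors lower it much more than they raise MAE, which is why the two columns can rank differences differently.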

Practical takeaways

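  • On this 560x560 sample, balanced stayed visually close to quality (foreground PSNR 33.82 dB, alpha PSNR 30.31 dB).
  • fast was the quickest profile on both devices, but the gain over quality was modest (about 7% on CPU, about 2% on GPU) because the source video was already small.
  • GPU was between 2.87x and 3.13x faster than CPU for the full application path on this sample.
  • The quality tradeoff shows up in the alpha matte before it shows up in the composited foreground: moving from balanced to fast costs 6.17 dB of alpha PSNR but 4.78 dB of foreground PSNR.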

Re-running the experiment

CPU:

powershell
uv run --python .\.venv\Scripts\python.exe matanyone2-runtime cli --input media\bookcat.mp4 --device cpu --performance_profile balanced --cpu_threads 8 --sam_model_type vit_h --frame_limit 241 --video_target_fps 0 --output_fps 12 --positive_point 280,180 --negative_point 30,30 --negative_point 530,30 --output_dir results

GPU:

powershell
uv run --python .\.venv\Scripts\python.exe matanyone2-runtime cli --input media\bookcat.mp4 --device cuda --performance_profile balanced --sam_model_type vit_h --frame_limit 241 --video_target_fps 0 --output_fps 12 --positive_point 280,180 --negative_point 30,30 --negative_point 530,30 --output_dir results
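To repeat the runs across every profile and device, the two commands above can be generated rather than retyped. The sketch below is an illustrative wrapper, not a script shipped with the project: `build_command` and `time_run` are hypothetical names, and the only flags used are the ones that appear in the commands above. It mirrors the documented commands by passing `--cpu_threads` only for CPU runs.

```python
import subprocess
import time


def build_command(profile, device, cpu_threads=8):
    """Build the CLI argument list for one profile/device combination."""
    cmd = [
        "uv", "run", "--python", r".\.venv\Scripts\python.exe",
        "matanyone2-runtime", "cli",
        "--input", r"media\bookcat.mp4",
        "--device", device,
        "--performance_profile", profile,
    ]
    if device == "cpu":
        # The documented GPU command does not pass --cpu_threads.
        cmd += ["--cpu_threads", str(cpu_threads)]
    cmd += [
        "--sam_model_type", "vit_h",
        "--frame_limit", "241",
        "--video_target_fps", "0",
        "--output_fps", "12",
        "--positive_point", "280,180",
        "--negative_point", "30,30",
        "--negative_point", "530,30",
        "--output_dir", "results",
    ]
    return cmd


def time_run(cmd):
    """Run one command and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start
```

Calling `time_run(build_command(profile, device))` for each of quality, balanced, and fast on cpu and cuda gives whole-process timings. These include interpreter startup and model loading, so they may not match how the figures in this document were measured.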

Animated WebP preview generation:

powershell
uv run --project D:\Prj\video-background-remover video-background-remover D:\Prj\MatAnyone\results\bookcat-profile-exp\balanced\bookcat_foreground.mp4 D:\Prj\MatAnyone\media\bookcat-balanced-preview.webp --matanyone --alpha-video D:\Prj\MatAnyone\results\bookcat-profile-exp\balanced\bookcat_alpha.mp4 --animated webp --webp-fps 6 --max-frames 96 --size 280x280

The --size 280x280 setting downscales the preview to half the resolution of the original 560x560 sample.