Is this video real or AI-generated? How to check

Published 9 June 2026 · FactHeck editorial team

Check if this is AI — paste any TikTok, Instagram Reel, or YouTube URL into FactHeck and get an AI-detection result in seconds.

Is this video real or AI-generated? The most reliable way to check is to look at what happens between frames, not just within them. Current AI video generators — Sora, Kling, Runway — often produce single frames that look convincing in isolation. Their weakness is temporal: objects morph between frames, backgrounds warp during camera movement, and motion defies physics across time. FactHeck's detector analyses consecutive frame windows alongside per-frame visual artefacts and an overall plausibility assessment. The result is probabilistic — useful evidence, not proof — and this page explains exactly what the pipeline does and where it falls short.

How FactHeck analyses video for AI generation

When you submit a TikTok, Instagram Reel, or YouTube video, FactHeck runs three analysis dimensions:

Visual artefacts. Sampled keyframes are sent to Claude (Sonnet) with a structured prompt covering eight artefact categories: anatomy and hands, garbled text, texture inconsistencies, geometry, lighting and shadows, background anomalies, AI style indicators, and compositing seams. Each category is scored for severity.
Temporal consistency. This is the dimension that most reliably separates real from AI video. FactHeck extracts dense keyframes and groups them into consecutive windows (typically 3 frames approximately 0.5 seconds apart), then passes those windows to Gemini (Flash) to look for object morphing, appearance and disappearance of elements, background warping, physics violations, body and face changes between frames, texture flickering, and abrupt lighting shifts. Normal camera shake and motion blur are explicitly excluded, so that ordinary handheld footage is not counted as an AI tell.
Scenario plausibility. Separately, the pipeline asks whether the depicted scene could realistically exist. Impossible physics, pristine dreamlike environments, absent safety equipment in dangerous scenarios, no bystander reaction to emergencies, and stacked implausibilities are each scored. Three or more implausibilities together are a high-confidence indicator of AI generation.

Each dimension returns a verdict ("Likely AI", "Partially AI / AI Edited", or "Likely Real") and a confidence level. These are combined into the final result shown on your check page.
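
As a rough illustration of the windowing step in the temporal-consistency dimension, the Python sketch below groups sample timestamps into consecutive windows of three frames spaced 0.5 seconds apart. The function name, the non-overlapping stride, and the decision to drop a trailing partial window are assumptions made for this example, not a description of FactHeck's actual extraction code.

```python
from typing import List, Tuple


def frame_windows(
    duration_s: float,
    spacing_s: float = 0.5,  # "approximately 0.5 seconds apart"
    window_size: int = 3,    # "typically 3 frames" per window
) -> List[Tuple[float, ...]]:
    """Group sample timestamps into consecutive, non-overlapping windows."""
    if duration_s < 0 or spacing_s <= 0 or window_size < 1:
        raise ValueError("need duration >= 0, spacing > 0, window_size >= 1")

    # Timestamps at 0, spacing, 2 * spacing, ... that fit inside the clip.
    count = int(duration_s / spacing_s) + 1
    stamps = [round(i * spacing_s, 3) for i in range(count)]

    # Advance one full window at a time. A trailing partial window is
    # dropped (an assumption of this sketch).
    last_start = len(stamps) - window_size
    return [
        tuple(stamps[i:i + window_size])
        for i in range(0, last_start + 1, window_size)
    ]
```

For a 3-second clip, frame_windows(3.0) yields two windows, (0.0, 0.5, 1.0) and (1.5, 2.0, 2.5). Each window would then be passed to the temporal model as a unit, so that it compares frames against their neighbours rather than judging them one at a time.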

Why temporal consistency is the key signal

Pressure from still-image AI detectors has pushed generators to produce better-looking individual frames. The temporal dimension is harder to fake, because hundreds of frames must each be photorealistic and also stay physically consistent with their neighbours. Common temporal tells in current AI video:

A person's hand changes shape or finger count between frames 0.5 seconds apart
A background element (tree, sign, person) appears or vanishes with no explanation
The background warps during a pan in a way that doesn't match real-world parallax
A falling object accelerates at the wrong rate or bounces with impossible energy
Clothing texture or a person's face subtly changes between consecutive frames

Common scenario red flags in AI video

Many of the most-shared AI videos on TikTok and Instagram follow recognisable patterns that the plausibility dimension is designed to catch:

People falling from dramatic heights (cliff waterslides, roof edges, canyon platforms)
Impossible structures built into natural landscapes with no engineering rationale
Apparent serious injury or death that is still live on a major platform — real footage of this is routinely removed by content moderators
No bystander reaction to a horrifying event; calm, cinematic observation of something that would cause panic
Pristine fantasy locations with zero mundane detail (no signage, no litter, no other visitors)

The presence of three or more of these signals in the same video is a high-confidence indicator of AI generation, regardless of how convincing individual frames look.
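
The three-or-more rule is simple enough to sketch in Python. The flag labels and function name below are hypothetical names invented for this example. Only the threshold of three comes from the text above.

```python
from typing import Iterable

# Hypothetical labels for the five red flags listed above.
RED_FLAGS = frozenset({
    "fall_from_dramatic_height",
    "impossible_structure",
    "graphic_harm_still_live",
    "no_bystander_reaction",
    "pristine_location",
})

STACKING_THRESHOLD = 3  # "three or more of these signals"


def is_stacked_implausibility(observed: Iterable[str]) -> bool:
    """Return True when at least three distinct red flags are present."""
    flags = set(observed)  # duplicates of the same flag count once
    unknown = flags - RED_FLAGS
    if unknown:
        raise ValueError(f"unknown red flags: {sorted(unknown)}")
    return len(flags) >= STACKING_THRESHOLD
```

Counting distinct flags, rather than repeated sightings of the same flag, reflects the idea that it is the combination of different implausibilities that matters.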

Honest accuracy caveats

Platform re-compression strips per-frame artefacts. TikTok, Instagram, and YouTube re-encode every video they host, and that compression can wash out fine-grained visual tells. The temporal-consistency dimension is more robust because it looks at changes between frames rather than within a single frame, but a heavily compressed clip can still return a lower-confidence result.
Newer AI video models produce fewer tells. The temporal-consistency analysis is currently the most reliable dimension, but the best current models already produce more stable temporal output than first-generation tools. Expect this arms race to continue.
Unusual real footage can trigger false positives. Shaky handheld video, extreme slow motion, and heavily colour-graded footage can resemble AI artefacts or implausible scenarios. Use context and source alongside any detector result.
No audio-specific deepfake detection. FactHeck checks whether the audio is consistent with the visuals (for example, whether a synthetic-sounding voice matches the visible speaker), but does not run a dedicated audio-deepfake classifier. If your concern is specifically about synthetic speech, a specialist audio tool will give you a more focused signal.

For further background on the state of AI video detection, see MIT Technology Review's coverage of deepfake detection research and First Draft's guide to AI-generated video verification.

How to use the result

A high-confidence "Likely AI" result with multiple dimensions flagged — especially temporal inconsistency — is a strong reason for caution. Don't share until you have investigated further.
A low-confidence or "Partially AI / AI Edited" result is inconclusive. One weak signal is not enough to be certain either way.
A "Likely Real" result does not certify authenticity. The video could still be real footage that is misleadingly captioned, taken out of context, or from a generator the pipeline hasn't been optimised against.
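
The guidance above can be summarised as a small decision function. This is a sketch only: the confidence labels ("high", "medium", "low"), the function name, and the conservative fallback for cases the guidance does not cover are assumptions, not part of FactHeck's output format.

```python
VERDICTS = {"Likely AI", "Partially AI / AI Edited", "Likely Real"}


def how_to_read(verdict: str, confidence: str, flagged_dimensions: int) -> str:
    """Map a result to the reading suggested in the guidance above."""
    if verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {verdict!r}")

    # Low confidence, or a partial verdict, is inconclusive either way.
    if confidence == "low" or verdict == "Partially AI / AI Edited":
        return "inconclusive"

    # "Likely Real" never certifies authenticity: captions and context
    # can still mislead.
    if verdict == "Likely Real":
        return "not certified authentic"

    # High-confidence "Likely AI" with multiple dimensions flagged.
    if confidence == "high" and flagged_dimensions >= 2:
        return "strong reason for caution"

    # Cases the guidance does not cover fall back to the cautious reading
    # (an assumption of this sketch).
    return "inconclusive"
```

For example, how_to_read("Likely AI", "high", 2) returns "strong reason for caution", while the same verdict at low confidence returns "inconclusive".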

Frequently asked questions

How does FactHeck detect AI-generated video?

FactHeck runs three analysis dimensions on submitted video. First, a visual-artefacts scan examines sampled keyframes for malformed anatomy, garbled text, lighting inconsistencies, and compositing seams. Second, a temporal-consistency analysis (powered by Gemini) examines windows of consecutive frames for object morphing, background warping, and physics violations that only appear across time — the primary tell of current AI video generators. Third, a scenario-plausibility dimension assesses whether the depicted scene could realistically exist. All three dimensions are combined into a final verdict.

Why is temporal consistency analysis important for video?

Current AI video generators (such as Sora, Kling, and Runway) often produce individual frames that look convincing in isolation. Their weakness shows up across time: objects change shape between frames, backgrounds warp during camera movement, and motion does not follow realistic physics. Analysing windows of consecutive frames, rather than single frames, is therefore the most reliable way to detect AI-generated video shared on social media.

Can platform compression affect the video AI detection result?

Yes. TikTok, Instagram, and YouTube all re-encode uploaded video, which can smooth out per-frame artefacts that detectors rely on. The temporal-consistency dimension is somewhat more robust to compression because it looks at changes between frames rather than within a single frame, but a heavily compressed clip may still return a lower-confidence result. Treat a low-confidence verdict as inconclusive.

Is the AI video detection result proof that a video is fake?

No. The result is a probabilistic assessment. AI detection is an active research area: generators improve constantly, and legitimate footage can occasionally trigger false positives — especially shakily filmed or heavily edited real video. Use the verdict as one signal alongside the source, context, and any corroborating evidence before drawing conclusions.

Does FactHeck check audio for AI generation?

FactHeck analyses whether the audio transcript is consistent with the video visuals (for example, checking whether a synthetic-sounding voice matches the speaker visible on screen). It does not run a dedicated audio-only deepfake classifier. If your concern is specifically about synthetic speech, a dedicated audio-deepfake tool will give you a more focused signal.

Ready to check a video? Run FactHeck's AI detector — paste any TikTok, Instagram, or YouTube URL for an instant AI-detection result.