What is the difference between AI Detection and Phishing Detection?

Phishing detection looks for fake URLs. AI Detection (Deepfake Detection) looks for 'fake content'—pixels or audio that were generated mathematically rather than captured by a lens.

Does MojoDocs use Python on the backend?

No. Our detection engine is 100% Client-Side. We use C++ and Rust compiled to WebAssembly (WASM). No server-side Python is involved in the actual scanning process.

How do you detect 'White on White' AI artifacts using ELA?

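ELA does not rely on visible contrast. It compares compression error levels rather than colors, so a pasted or generated region can stand out even against a uniform white background. We use different 'Enhanced ELA' modes for various backgrounds, but the mathematical difference in compression noise remains consistent regardless of color.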

Can I run this tool on a Chromium mobile browser?

Yes. Because the engine is compiled to WASM, it runs on Chrome for Android, and it also runs on Safari for iOS. Performance depends on your device's RAM and CPU cores.

Is 'rPPG' pulse detection affected by makeup?

Heavy opaque makeup can reduce the signal-to-noise ratio of rPPG. However, our engine uses 'Multichannel Signal Processing' to look for pulse signatures in multiple skin areas (forehead, cheeks) to compensate.

What happens to the images after the scan?

They are instantly flushed from the browser's volatile memory (RAM). We do not save them to local storage or IndexedDB unless you explicitly click 'Download Report'.

Does 'Fast Fourier Transform' (FFT) work on compressed social media videos?

Yes, though the 'cutoff' frequencies are lower. We adjust our 'Synthetic Frequency Fingerprint' based on the detected compression level of the source file.

Why is local detection safer than cloud detection?

It eliminates the 'Transit Risk'. Your biometric data (your face) never leaves your device, so it is never exposed to the internet. If a cloud service gets hacked, there is no copy of your data on it to steal.

Do you use CNNs or Transformers for detection?

We use a hybrid: Convolutional Neural Networks (CNNs) for spatial artifact detection and a lightweight Vision Transformer (ViT) for understanding global image context.

How do I update the detection engine?

MojoDocs is a PWA. The next time you refresh the page, the Service Worker will automatically fetch the latest WASM models from our CDN. You are always running the latest 'Shield'.

Inside the Machine: The Engineering Behind Local-First Deepfake Detection (2026)

How do you detect AI with AI without ever leaving the browser? A deep technical dive into Error Level Analysis (ELA), Frequency Spikes, and Bio-Signal verification using WebAssembly and WebGPU.

Detection isn't about looking at the face; it's about looking at the 'Digital Substrate'—the mathematical noise left by the generator.

WebAssembly (WASM) allows us to run C++ and Rust-based forensic libraries at near-native speed inside Chrome and Safari.

Error Level Analysis (ELA) works by exploiting the 'Generation Resave' loss in JPEG/WebP compression.

rPPG (Remote Photoplethysmography) can detect the micro-pulse in a human face, which deepfakes often lack or render as static noise.

Most people think Deepfake Detection is a game of "Spot the Glitch." They look for an extra finger or a weird blink. But for an engineer, these are the 'Easy Fakes'. The real battle is happening in the Frequency Domain and Compression Histograms. In 2026, detecting a sophisticated AI imposter is a high-stakes math problem.

At MojoDocs, we faced a unique challenge: How do we build a world-class forensic tool that runs entirely on a user's device? Standard AI detection typically relies on Python servers backed by data-center GPUs such as the NVIDIA A100. Our goal was to run it inside the browser's sandbox using WebAssembly (WASM) and WebGPU.

This technical deep dive explains the four pillars of our detection engine: ELA, Fourier Transforms, rPPG, and in-browser neural network (CNN) inference.

Pillar 1: Error Level Analysis (ELA) & Compression Forensics

JPEG and WebP are "Lossy" formats. Every time you save an image, the quality drops slightly, an effect known as generation loss. JPEG in particular compresses the image in 8x8 pixel blocks, which leaves characteristic block artifacts. When a scammer uses a tool like DeepFaceLab or Faceswap, they generate a fake face and "paste" it onto a real image. They then save the result.

The Mathematical Flaw: The "New" face has a different 'Compression Age' than the original background. While the eye can't see the difference, a simple algorithm can. ELA works by resaving the image at a known quality (say 90%) and calculating the Difference Map between the uploaded image and the resaved one. Modified areas will appear as "Hotspots" in the output because they react differently to the new compression cycle.
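The difference-map step can be sketched in a few lines. This is a minimal illustration written in Python for readability, not the engine's WASM code. The `toy_resave` function is a hypothetical stand-in for a real JPEG/WebP resave (it simply snaps pixels to a coarse grid), and the names `ela_map` and `scale` are assumptions made for this example.

```python
from typing import Callable, List

Image = List[List[int]]  # grayscale rows, pixel values 0..255


def toy_resave(img: Image, step: int = 16) -> Image:
    """Hypothetical stand-in for a lossy resave: snap pixels to multiples of `step`."""
    return [[min(255, round(p / step) * step) for p in row] for row in img]


def ela_map(img: Image, resave: Callable[[Image], Image], scale: int = 8) -> Image:
    """Amplified absolute difference between an image and its resaved copy."""
    resaved = resave(img)
    return [
        [min(255, abs(a - b) * scale) for a, b in zip(row, row2)]
        for row, row2 in zip(img, resaved)
    ]


# Background that has already been through the toy codec once.
raw = [[(x * 7 + y * 13) % 256 for x in range(6)] for y in range(6)]
edited = toy_resave(raw)

# Paste a 2x2 patch that has never been through the codec.
patch = {(2, 2): 37, (2, 3): 101, (3, 2): 203, (3, 3): 59}
for (y, x), value in patch.items():
    edited[y][x] = value

heat = ela_map(edited, toy_resave)
hotspots = [(y, x) for y, row in enumerate(heat) for x, v in enumerate(row) if v > 0]
print(hotspots)
```

Pixels that already went through the codec barely change on a second pass, so only the pasted patch lights up. With a real codec the background is noisy rather than exactly zero, and the map is judged by contrast between regions rather than by a strict zero test.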

Pillar 2: Frequency Domain Analysis (The Fourier Trap)

Generative models like GANs (Generative Adversarial Networks) or Diffusion Models build images using an "Up-sampling" layer. This process leaves behind Periodic Artifacts—mathematical heartbeats that are completely invisible to humans but look like "Spikes" in the frequency domain.

By applying a Fast Fourier Transform (FFT), our engine converts the image from 'spatial pixels' to 'frequency magnitudes'. A natural photo has a smooth distribution of frequencies. A deepfake often shows rhythmic "Bright Spots" at high frequencies, which appear toward the corners of a centered FFT plot. We use a lightweight Random Forest Classifier to scan these plots for synthetic signatures in under 100ms.
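To make the idea of a frequency "spike" concrete, here is a small sketch under stated assumptions. It uses a naive 2D discrete Fourier transform (same result as an FFT, only slower) on an 8x8 toy image, and it models the up-sampling artifact as a period-2 checkerboard added to a smooth gradient. The function names are hypothetical, and the Random Forest stage is not reproduced.

```python
import cmath
from typing import List, Tuple


def spectrum(img: List[List[float]]) -> List[List[float]]:
    """Magnitude of the naive 2D DFT of a square image (O(N^4), fine for tiny N)."""
    n = len(img)
    mag = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0j
            for y in range(n):
                for x in range(n):
                    acc += img[y][x] * cmath.exp(-2j * cmath.pi * (u * y + v * x) / n)
            mag[u][v] = abs(acc)
    return mag


def peak_bin(mag: List[List[float]]) -> Tuple[int, int]:
    """Strongest frequency bin, ignoring the DC term at (0, 0)."""
    n = len(mag)
    bins = [(u, v) for u in range(n) for v in range(n) if (u, v) != (0, 0)]
    return max(bins, key=lambda b: mag[b[0]][b[1]])


N = 8
# Smooth gradient as a stand-in for natural image content.
natural = [[float(x + y) for x in range(N)] for y in range(N)]
# Same content plus a period-2 checkerboard, mimicking an up-sampling artifact.
fake = [[natural[y][x] + 20.0 * (-1) ** (x + y) for x in range(N)] for y in range(N)]

natural_mag = spectrum(natural)
fake_mag = spectrum(fake)
print(peak_bin(fake_mag), fake_mag[N // 2][N // 2], natural_mag[N // 2][N // 2])
```

In this unshifted layout the highest frequency sits at bin (N/2, N/2); after the usual centering shift that bin moves to a corner of the plot, which is where the bright spot appears.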

Pillar 3: rPPG – The "Pulse" of Reality

This is the most advanced part of our detector. Remote Photoplethysmography (rPPG) is a technology that detects the human heartbeat by measuring tiny color changes in the face as blood flows through the skin. Even if a deepfake looks perfect, the "Pulse" is often missing or is "Static."

Engineering the Pulse Detector

Our engine samples the video at 30 fps and isolates the Green Channel (which carries the strongest heart-rate signal). We then apply a Band-pass Filter (typically between 0.7Hz and 4Hz, i.e. 42 to 240 BPM) to extract the pulse signal. If the 'Pulse Spectrum' shows a clear, periodic peak (around 60-100 BPM for a resting adult), it's a strong indicator of biological origin. If it's flat or erratic noise? You are likely looking at a fake.
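The pulse search can be sketched as follows. This is an illustration under assumptions, not the production detector: the input is a synthetic per-frame green-channel mean (a 1.2 Hz pulse plus slow lighting drift), and the band-pass is applied in the frequency domain by only inspecting DFT bins between 0.7 Hz and 4 Hz. The function name `pulse_peak` and the peak-to-average ratio are hypothetical choices for this example.

```python
import cmath
import math
from typing import List, Tuple


def pulse_peak(green: List[float], fps: float,
               low_hz: float = 0.7, high_hz: float = 4.0) -> Tuple[float, float]:
    """Return (BPM of the strongest in-band peak, peak / average in-band magnitude)."""
    n = len(green)
    mean = sum(green) / n
    centered = [g - mean for g in green]  # remove the constant skin-tone level
    best_hz, best_mag, mags = 0.0, 0.0, []
    for k in range(1, n // 2):
        hz = k * fps / n
        if hz < low_hz or hz > high_hz:
            continue  # outside the plausible heart-rate band
        acc = sum(c * cmath.exp(-2j * cmath.pi * k * i / n) for i, c in enumerate(centered))
        mag = abs(acc)
        mags.append(mag)
        if mag > best_mag:
            best_hz, best_mag = hz, mag
    avg = sum(mags) / len(mags) if mags else 0.0
    ratio = best_mag / avg if avg > 0 else 0.0
    return best_hz * 60.0, ratio


FPS = 30.0
# Ten seconds of synthetic data: slow lighting drift (0.1 Hz) plus a 1.2 Hz pulse.
green = [
    120.0
    + 3.0 * math.sin(2 * math.pi * 0.1 * (i / FPS))
    + 0.5 * math.sin(2 * math.pi * 1.2 * (i / FPS))
    for i in range(300)
]
bpm, ratio = pulse_peak(green, FPS)
print(round(bpm), ratio > 5)
```

A high ratio means one frequency dominates the band, which is what a real pulse looks like; a ratio near 1 corresponds to the flat or erratic spectrum described above. The threshold of 5 is an arbitrary value for the demo.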

Pillar 4: Running Inference in the Browser (WASM & WebGPU)

Running a 400MB Neural Network in a browser is impractical for most users. We had to optimize. We use TensorFlow.js with the WASM backend.

Quantization: We shrank the model from 32-bit floats to 8-bit integers. This reduced the size by 4x with only a 1% drop in accuracy.
SIMD Optimization: We use Single Instruction, Multiple Data (SIMD) in WebAssembly to process 16 pixels at once, making the 'Face Mesh' calculation instant even on a mobile phone.
WebGPU: On modern machines, we offload the matrix multiplications to the user's graphics card directly from the browser, bypassing the CPU bottleneck.
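The 4x size reduction from quantization follows directly from the storage width. A minimal sketch, assuming simple symmetric per-tensor quantization (the article does not say which scheme the engine uses, and the weights below are made up):

```python
from array import array
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to int8 using one shared scale (symmetric, per-tensor)."""
    scale = max(abs(w) for w in weights) / 127.0
    if scale == 0.0:
        return [0] * len(weights), 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]


weights = [0.82, -1.27, 0.003, 0.5, -0.64]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

float_bytes = len(array("f", weights).tobytes())  # 4 bytes per weight
int8_bytes = len(array("b", q).tobytes())         # 1 byte per weight
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, float_bytes, int8_bytes, max_error)
```

Each weight moves by at most half a quantization step, which is why accuracy usually degrades only slightly while storage drops from four bytes to one per weight.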

Part 5: The "Ensemble" Strategy

No single method is foolproof. Scammers use 'Deblurring' to hide ELA artifacts. They use 'Noise Injection' to hide Frequency spikes. That's why MojoDocs uses an Ensemble Approach. We weight the results from all four pillars to give you a single "Fidelity Score."

Technique	Catches	Accuracy (v2.0)
ELA Forensics	Face Swaps / Edits	88%
FFT Frequency	GAN / Diffusion Fakes	92%
rPPG Pulse	Pre-recorded Video Fakes	94%
CNN Mesh Audit	Geometry irregularities	86%
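One simple way to combine the four signals is a weighted average. The sketch below is an assumption, not the actual MojoDocs formula: it takes each technique's score as a value between 0 and 1 (higher meaning more likely authentic), reuses the accuracies from the table as weights, and renormalizes when a technique is unavailable (for example, rPPG on a still image).

```python
from typing import Dict

# Hypothetical weights: the per-technique accuracies from the table above.
WEIGHTS: Dict[str, float] = {"ela": 0.88, "fft": 0.92, "rppg": 0.94, "cnn": 0.86}


def fidelity_score(scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of the available detector scores, each in [0, 1]."""
    used = {name: weights[name] for name in scores if name in weights}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no known detector produced a score")
    return sum(scores[name] * w for name, w in used.items()) / total


# A still image has no pulse signal, so rPPG is simply left out.
still_image = {"ela": 0.9, "fft": 0.8, "cnn": 0.7}
print(round(fidelity_score(still_image), 3))
```

Renormalizing over the detectors that actually ran keeps the score on the same 0 to 1 scale for both images and videos.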

Conclusion: The "Zero-Knowledge" Security Paradigm

The engineering of MojoDocs is rooted in Privacy First. In the old world, security meant "Send your data to the expert (server)." In the 2026 world, security means "The expert (code) comes to your data."

By keeping the forensics local, we eliminate the 'Honey Pot' risk—no central database of scanned faces means no one can hack us to steal your identity. We are building the future of Self-Sovereign Identity Verification. If you are a developer, we invite you to explore our WASM implementation and join the movement against digital deception.
