
CSAM Text Detection

Detect CSAM indicators and sextortion threats in text with tiered severity classification. Protect children by identifying grooming, exploitation, sextortion, and illegal content.

Accuracy: 99.9%
Avg. Speed: 150ms
Per Request: $0.0030
API Name: bynn-csam-text

CSAM Text Detection

The CSAM Text Detection model analyzes text content to identify indicators of child sexual abuse material (CSAM) and sextortion threats. This model uses a tiered severity classification system to help platforms comply with legal reporting requirements and protect children from exploitation, grooming, and sextortion schemes.

The Challenge

Online platforms face legal and ethical obligations to detect, report, and remove CSAM content and sextortion threats. Text-based CSAM indicators and sextortion language can appear in captions, messages, comments, and file names. Sextortion—where perpetrators coerce victims by threatening to share intimate images—has become an epidemic targeting minors. Manual review at scale is impossible, and keyword-based approaches miss sophisticated evasion techniques. Platforms need AI-powered detection that understands context and severity while minimizing false positives that burden review teams.

Model Overview

The CSAM Text Detection model performs multi-tier classification to identify text containing CSAM indicators and sextortion threats. The model assigns content to severity tiers (1-5) based on the nature and explicitness of the material described, including coercive language patterns typical of sextortion schemes, enabling appropriate escalation and reporting workflows.

Achieving 99.9% accuracy, this model helps platforms meet legal obligations under NCMEC reporting requirements and international child protection laws.

How It Works

The model employs advanced natural language understanding to analyze CSAM indicators and sextortion threats:

  • Contextual analysis: Understands meaning beyond simple keyword matching
  • Severity assessment: Assigns appropriate tier based on content nature
  • Sextortion detection: Identifies coercion, threats, and blackmail patterns targeting minors
  • Evasion detection: Recognizes obfuscation techniques and coded language
  • Multi-language support: Detects indicators across multiple languages

Response Structure

The API returns a structured response containing:

  • label: Classification result ("safe" or a tier label)
  • tier: Severity tier (0 for safe, 1-5 for increasing severity)
  • is_csam: Boolean flag indicating CSAM detection
  • tier probabilities: Probability scores for each class (tier1_probability through tier5_probability, plus safe_probability)
  • confidence: Overall classification confidence score
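The fields above can be consumed directly from the response's data object. The following is a minimal sketch of interpreting a response, assuming the documented field names; the helper function and example values are illustrative only, not part of the API:

```python
# Sketch: pick the most probable class from a bynn-csam-text response.
# Field names follow the documented response structure; top_tier() is
# a hypothetical helper, not an API function.

def top_tier(data: dict) -> tuple:
    """Return (class_name, probability) for the most probable class."""
    probs = {"safe": data["safe_probability"]}
    for t in range(1, 6):
        probs["tier%d" % t] = data["tier%d_probability" % t]
    label = max(probs, key=probs.get)
    return label, probs[label]

example = {
    "label": "safe", "tier": 0, "is_csam": False,
    "tier1_probability": 0.02, "tier2_probability": 0.01,
    "tier3_probability": 0.01, "tier4_probability": 0.01,
    "tier5_probability": 0.01, "safe_probability": 0.94,
    "confidence": 0.94,
}
print(top_tier(example))  # ('safe', 0.94)
```

In practice the returned label and tier fields should already agree with the highest probability; recomputing the argmax as above is mainly useful for sanity checks or custom thresholds.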

Tier Classification

Tier     | Severity                     | Recommended Action
Safe (0) | No CSAM indicators detected  | No action required
Tier 1   | Lowest severity indicators   | Flag for review
Tier 2   | Low severity indicators      | Priority review
Tier 3   | Moderate severity indicators | Immediate review, consider reporting
Tier 4   | High severity indicators     | Immediate removal, mandatory reporting
Tier 5   | Highest severity indicators  | Immediate removal, urgent NCMEC report
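A triage workflow built on this table can be sketched as a simple tier-to-action mapping. The action names below are illustrative placeholders, not API values; real workflows depend on your compliance processes:

```python
# Sketch: severity-based triage routing for the tier table above.
# Action identifiers are hypothetical; substitute your own workflow names.

ACTIONS = {
    0: "no_action",                           # safe
    1: "flag_for_review",                     # lowest severity
    2: "priority_review",                     # low severity
    3: "immediate_review_consider_reporting", # moderate severity
    4: "remove_and_report",                   # high severity, mandatory reporting
    5: "remove_and_report_urgent",            # highest severity, urgent NCMEC report
}

def route(tier: int) -> str:
    """Map a severity tier to the recommended action."""
    if tier not in ACTIONS:
        raise ValueError("unknown tier: %r" % tier)
    return ACTIONS[tier]
```

Keeping the mapping in one table makes it easy to audit and to adjust escalation policy without touching the rest of the pipeline.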

Performance Metrics

Metric                  | Value
Classification Accuracy | 99.9%
Average Response Time   | 150ms
Max File Size           | 1MB
Supported Formats       | TXT, JSON

Use Cases

  • Content Moderation: Scan user-generated content for CSAM indicators before publication
  • Message Screening: Monitor private messages on platforms for illegal content
  • Sextortion Prevention: Detect coercive messages, threats, and blackmail attempts targeting minors
  • File Name Analysis: Detect CSAM indicators in uploaded file names and metadata
  • Search Query Monitoring: Identify suspicious search patterns
  • Compliance Automation: Streamline NCMEC reporting workflows with severity-based triage

Legal & Compliance Requirements

CRITICAL: Platforms have legal obligations regarding CSAM and sextortion detection and reporting.

  • NCMEC Reporting: US law requires electronic service providers to report CSAM and sextortion involving minors to NCMEC within specific timeframes
  • Sextortion Laws: Many jurisdictions have specific laws criminalizing sextortion, particularly when targeting minors
  • Content Preservation: Preserve detected content for law enforcement as required by law
  • User Account Actions: Suspend accounts associated with CSAM or sextortion violations
  • International Laws: Comply with local child protection laws in all operating jurisdictions

Implementation Guidelines

  • Human Review: All flagged content should be reviewed by trained trust & safety specialists
  • Escalation Procedures: Establish clear escalation paths based on tier severity
  • Documentation: Maintain detailed logs for compliance audits and law enforcement requests
  • Staff Wellbeing: Provide mental health support for content reviewers exposed to harmful material

Access Restrictions

This model is restricted to Business plan subscribers and above due to the sensitive nature of CSAM and sextortion detection. Organizations must agree to acceptable use policies and demonstrate legitimate trust & safety use cases.

API Reference

Version: 2601 (Jan 3, 2026)
Avg. Processing: 150ms
Per Request: $0.003
Required Plan: trial

Input Parameters

Classifies text for CSAM indicators using tiered severity levels

text (string, required)
Text content to analyze for CSAM indicators.
Example: "Sample text to analyze"

Response Fields

CSAM text classification with tiered severity probabilities

label (string)
Primary classification label.
Example: "safe"

tier (integer)
Severity tier (0 for safe, 1-5 for increasing severity).
Example: 0

is_csam (boolean)
True if content is classified as CSAM.
Example: false

tier1_probability (float)
Probability of tier 1 classification (0.0-1.0).
Example: 0.02

tier2_probability (float)
Probability of tier 2 classification (0.0-1.0).
Example: 0.01

tier3_probability (float)
Probability of tier 3 classification (0.0-1.0).
Example: 0.01

tier4_probability (float)
Probability of tier 4 classification (0.0-1.0).
Example: 0.01

tier5_probability (float)
Probability of tier 5 classification (0.0-1.0).
Example: 0.01

safe_probability (float)
Probability of safe classification (0.0-1.0).
Example: 0.94

confidence (float)
Classification confidence (0.0-1.0).
Example: 0.94

Complete Example

Request

{
  "model": "bynn-csam-text",
  "content": "Sample text to analyze for safety"
}

Response

{
  "success": true,
  "data": {
    "label": "safe",
    "tier": 0,
    "is_csam": false,
    "tier1_probability": 0.02,
    "tier2_probability": 0.01,
    "tier3_probability": 0.01,
    "tier4_probability": 0.01,
    "tier5_probability": 0.01,
    "safe_probability": 0.94,
    "confidence": 0.94
  }
}

Additional Information

Rate Limiting
If we throttle your request, you will receive a 429 HTTP error code along with an error message. Retry with an exponential back-off strategy: wait 4 seconds before the first retry, then 8 seconds, then 16 seconds, and so on.
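The back-off schedule above can be sketched as a small retry wrapper. RateLimitError and call_with_backoff are hypothetical names introduced for illustration; in a real client you would raise on an HTTP 429 response and pass in the function that performs the request:

```python
# Sketch: exponential back-off retry (4s, 8s, 16s, ...) for 429 throttling.
# RateLimitError and call_with_backoff are illustrative, not part of the API.
import time

class RateLimitError(Exception):
    """Raised when the API responds with HTTP 429."""

def call_with_backoff(fn, max_attempts=5, base_delay=4.0, sleep=time.sleep):
    """Call fn(), retrying on rate limiting with a doubling delay."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            sleep(base_delay * (2 ** attempt))
```

The sleep parameter is injectable so the schedule can be unit-tested without real waits; production code can leave it at the default.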
Supported Formats
txt, json
Maximum File Size
1MB
Tags: csam, sextortion, safety, child-protection, illegal-content, moderation, trust-safety

Ready to get started?

Integrate CSAM Text Detection into your application today with our easy-to-use API.