Homoglyph Scanner - Detect Visual Spoofing Attacks

Understanding the Threat

The Problem

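Homoglyph attacks use Unicode characters that look identical, or nearly identical, to ASCII characters but have different code points. Text that reads the same to a human is therefore different to a compiler, interpreter, or browser. Attackers can: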

Insert malicious code that looks legitimate
Bypass code reviews and security scans
Create backdoors in open-source projects
Compromise supply chain security

The Solution

Bad Character Scanner™ (BCS) at badcharacterscanner.com helps you:

Detect invisible and suspicious characters
Scan entire codebases automatically
Identify potential homoglyph attacks
Maintain code security and integrity
Generate detailed security reports
Integrate with your CI/CD pipeline
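
The idea of scanning a codebase automatically and wiring the result into CI can be sketched generically in Node.js. This is not Bad Character Scanner™'s implementation or API. The names scanFile, scanDir, and SUSPICIOUS are hypothetical, and the pattern is an assumption that covers only a few invisible characters plus the Cyrillic block, so it will miss many homoglyphs and will also flag legitimate Cyrillic text and UTF-8 byte order marks.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Assumption: an illustrative pattern only (a few invisible characters plus
// the Cyrillic block). A real scanner needs a far broader, configurable rule set.
const SUSPICIOUS = /[\u200B-\u200D\u2060\uFEFF\u0400-\u04FF]/g;

// Report every suspicious character in one file with a 1-based line and column.
function scanFile(file) {
  const findings = [];
  const lines = fs.readFileSync(file, "utf8").split("\n");
  lines.forEach((line, i) => {
    for (const m of line.matchAll(SUSPICIOUS)) {
      const hex = m[0].codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
      findings.push({ file, line: i + 1, column: m.index + 1, codePoint: "U+" + hex });
    }
  });
  return findings;
}

// Walk a directory tree, skipping dependency and VCS folders.
function scanDir(dir) {
  let findings = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (entry.name === "node_modules" || entry.name === ".git") continue;
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) findings = findings.concat(scanDir(full));
    else if (entry.isFile()) findings = findings.concat(scanFile(full));
  }
  return findings;
}
```

In a CI job, a small wrapper script could call scanDir on the checkout, print each finding as file:line:column, and set process.exitCode to 1 when the list is non-empty so the build fails.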

Real-World Examples

Example 1: The Dangerous Look-alike

The following two lines look identical, but one contains a Cyrillic 'а' instead of a Latin 'a':

✓ Safe (Latin characters):
function authenticate(user) { return true; }
⚠️ Dangerous (Contains Cyrillic):
function аuthenticate(user) { return false; }

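The second function uses a Cyrillic 'а' (U+0430) instead of Latin 'a' (U+0061). To the JavaScript engine these are two different identifiers, so the second line does not replace authenticate; it defines a separate function with a visually identical name. An attacker who slips it into a codebase can route calls to the look-alike, or make a reviewer believe the real check is being called, without anyone noticing the difference on screen.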
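
One way to surface this kind of look-alike is to report every non-ASCII character in a line together with its code point. The sketch below is a minimal illustration, not how Bad Character Scanner™ works. findNonAscii is a hypothetical helper name, and flagging all non-ASCII characters is an assumption that will be too strict for code that legitimately contains non-English text.

```javascript
// Hypothetical helper: list every non-ASCII character in a line of source,
// with its 0-based column (in UTF-16 units) and its code point.
function findNonAscii(line) {
  const findings = [];
  let column = 0;
  for (const ch of line) {              // iterates by code point
    const cp = ch.codePointAt(0);
    if (cp > 0x7f) {
      findings.push({
        column,
        char: ch,
        codePoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
        cyrillic: /\p{Script=Cyrillic}/u.test(ch),
      });
    }
    column += ch.length;                // astral characters occupy two units
  }
  return findings;
}

// The escape \u0430 makes the Cyrillic letter explicit in this example.
const safe = "function authenticate(user) { return true; }";
const spoofed = "function \u0430uthenticate(user) { return false; }";

console.log(findNonAscii(safe));    // []
console.log(findNonAscii(spoofed)); // one finding: column 9, U+0430, cyrillic: true
```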

Reference: Wikipedia: IDN Homograph Attack

Great job staying alert! Spotting tiny differences like this keeps your code safe for everyone.

Example 2: Invisible Characters

Some Unicode characters are completely invisible but can break your code:

✓ Clean code:
if (user.isAdmin) { /* admin logic */ }
⚠️ Contains invisible characters:
if (user.isAdmin)[U+200B] { /* admin logic */ }

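The Zero Width Space (U+200B) is invisible in most editors. JavaScript does not treat it as whitespace, so outside a string or comment it causes a syntax error. Inside a string literal it silently changes the value, so comparisons that look correct on screen can fail or succeed unexpectedly, which is where security vulnerabilities arise.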
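
A minimal sketch of how invisible characters can be located and made visible follows. The INVISIBLE table is an assumption covering only five common zero-width code points, not a complete list, and findInvisible and makeVisible are hypothetical helper names.

```javascript
// Assumption: a small, non-exhaustive table of invisible code points.
const INVISIBLE = new Map([
  [0x200b, "ZERO WIDTH SPACE"],
  [0x200c, "ZERO WIDTH NON-JOINER"],
  [0x200d, "ZERO WIDTH JOINER"],
  [0x2060, "WORD JOINER"],
  [0xfeff, "ZERO WIDTH NO-BREAK SPACE"],
]);

// Return the 0-based index and Unicode name of each invisible character.
function findInvisible(text) {
  const findings = [];
  for (let i = 0; i < text.length; i++) {
    const name = INVISIBLE.get(text.charCodeAt(i));
    if (name) findings.push({ index: i, name });
  }
  return findings;
}

// Replace each invisible character with a visible [U+XXXX] marker.
function makeVisible(text) {
  return text.replace(/[\u200B\u200C\u200D\u2060\uFEFF]/g, (ch) =>
    "[U+" + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0") + "]");
}

const dirty = "if (user.isAdmin)\u200B { /* admin logic */ }";

console.log(findInvisible(dirty)); // [ { index: 17, name: 'ZERO WIDTH SPACE' } ]
console.log(makeVisible(dirty));   // if (user.isAdmin)[U+200B] { /* admin logic */ }
```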

Reference: Wikipedia: Zero-width Space

You’re doing awesome! Catching invisible characters helps your team avoid tricky bugs and keeps your code clean.

Example 3: URL Spoofing Attack

Homoglyphs can be used to create fake URLs that look legitimate:

✓ Real URL:
https://google.com/login
⚠️ Spoofed URL (Contains Cyrillic):
https://gооgle.com/login

The spoofed URL uses Cyrillic 'о' (U+043E) instead of Latin 'o' (U+006F). Users might not notice the difference!
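
Look-alike domains can often be exposed by converting the hostname to its ASCII (Punycode) form. The sketch below relies on the standard URL class available in Node.js and browsers, which performs that conversion when parsing. inspectHost is a hypothetical helper name, and treating any "xn--" label as suspicious is an assumption, since legitimate internationalized domains also use Punycode.

```javascript
// Hypothetical helper: parse a URL and report whether any label of its
// hostname is Punycode-encoded (starts with "xn--").
function inspectHost(urlString) {
  const host = new URL(urlString).hostname; // non-ASCII labels become xn-- form
  return {
    host,
    hasPunycodeLabel: host.split(".").some((label) => label.startsWith("xn--")),
  };
}

// \u043E is the Cyrillic small letter o, written as an escape for clarity.
const realHost = inspectHost("https://google.com/login");
const spoofedHost = inspectHost("https://g\u043E\u043Egle.com/login");

console.log(realHost);    // host is "google.com", no Punycode label
console.log(spoofedHost); // host contains an "xn--" label instead of "google"
```

Showing the ASCII form next to the displayed form gives a user or reviewer something concrete to compare.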

Reference: Wikipedia: Punycode | Internationalized Domain Names

Stay vigilant against sophisticated phishing attacks that use look-alike domains.

The Growing Threat

143,859 Unicode characters: the size of the Unicode standard as of version 13.0, each character a potential attack vector
15,000+ homoglyph pairs: visually identical character combinations
89% undetected attacks: security scans that miss homoglyph threats
100% BCS detection rate: Bad Character Scanner™ accuracy

Learn more about zero-width characters: Wikipedia: Zero-width Space

Who Needs Bad Character Scanner™?

Authors & Publishers

Remove invisible characters from LLM-generated content, CVs, and documentation to ensure clean, professional text.

Learn About Text Scanner →

Developers

Protect your code from supply chain attacks and ensure clean commits with codebase scanning.

Learn About Codebase Scanner →

Enterprises

Secure your entire codebase with automated scanning and reporting.

Learn About Enterprise Solutions →
