Critical Bug Report: Romanian Diacritics & BOM Encoding Issues Blocking File Editing

Report ID: QDR-ENC-2025-001
Severity: CRITICAL
Priority: HIGH
Date: December 15, 2025
Reporter: Production User (TeAjutEu Flutter Project)


Executive Summary

Qoder AI cannot edit files containing Romanian diacritics (ă, â, î, ș, ț) or other non-ASCII characters due to encoding handling limitations. This critical issue forces users to switch to competing AI coding assistants (Cursor AI, Google Gemini, etc.) for localization tasks, despite Qoder’s superior coding logic and architectural understanding.

This severely impacts:

  • Multi-language projects (user’s current project: 56 languages)

  • Localization workflows (ARB files with Romanian, German, French, etc.)

  • User productivity (blocked workflows, context switching between tools)

  • Qoder’s competitive advantage (users abandon Qoder for inferior tools)


Problem Description

1. Root Cause: Encoding Detection & Preservation Failure

What happens:

  • When Qoder attempts to edit files with Romanian diacritics, the search_replace or edit_file tools fail silently or corrupt the file

  • Diacritics are replaced with mojibake characters (�, �, ?, etc.)

  • BOM (Byte Order Mark U+FEFF) is sometimes injected during file writes

  • User must manually intervene using external tools or other AI assistants

Technical root cause:

plaintext
Qoder's file I/O operations do NOT:
1. Detect existing file encoding (UTF-8, UTF-8 with BOM, etc.)
2. Preserve original encoding during read-modify-write cycles
3. Provide explicit encoding control in tool parameters
4. Validate character integrity after writes

2. When The Problem Occurs

Scenario A: Editing ARB Localization Files

json
// lib/l10n/intl_ro.arb
{
  "welcomeMessage": "Bună ziua! Vă mulțumim că folosiți aplicația.",
  "requestHelp": "Trimite cerere de ajutor"
}

User task: Add new translation key
Qoder action: Attempts search_replace on intl_ro.arb
Result: :cross_mark: File corrupted, diacritics become Bună ziua! Vă mul�umim
User forced to: Switch to Gemini with manually crafted prompt


Scenario B: Editing Dart Files with Romanian Comments

dart
// Verifică dacă utilizatorul a acceptat termenii și condițiile
bool validateTerms(User user) {
  // ...
}

User task: Modify comment text
Qoder action: Tool rejects edit or corrupts comment
Result: :cross_mark: Build fails or comment becomes unreadable
User forced to: Manual PowerShell BOM removal script


Scenario C: BOM-Infected Files from Other AI Tools

Situation:

  • User edits file with Cursor AI / Copilot / other tools

  • Those tools inject UTF-8 BOM (U+FEFF) at file start

  • Flutter compiler fails with BOM present

  • User needs to clean BOM before continuing

Current Qoder limitation:

  • :cross_mark: Cannot detect BOM presence

  • :cross_mark: Cannot strip BOM during read operation

  • :cross_mark: Cannot write UTF-8 without BOM explicitly

User’s current manual workaround:

powershell
# User must run this manually for EVERY infected file
$content = Get-Content -Path "file.dart" -Raw
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText("file.dart", $content, $utf8NoBom)

3. Impact on Multi-Language Projects

User’s real-world scenario:

  • Project with 56 supported languages

  • Each language has separate ARB file: intl_ro.arb, intl_de.arb, intl_fr.arb, etc.

  • Massive translation updates needed across all files

  • Languages with diacritics: Romanian, German, French, Spanish, Polish, Czech, Turkish, Vietnamese, etc.

Current Qoder blockers:

  • :cross_mark: Cannot batch-edit multiple ARB files with non-ASCII characters

  • :cross_mark: Cannot add translations containing ă, ö, ü, ñ, č, ş, etc.

  • :cross_mark: Cannot fix BOM-infected files automatically

  • :cross_mark: User must create Gemini prompts for every edit

Productivity loss:

plaintext
Single translation key addition across 56 languages:
- With proper Qoder support: 2 minutes (1 tool call)
- Current workaround: 45+ minutes (manual editing or Gemini prompts)

Weekly localization sprint (200 keys):
- Potential with Qoder: 2-3 hours
- Current reality: 2-3 DAYS (context switching, manual fixes)

Evidence: User Workaround Pattern

The “Gemini Prompt” Workaround

When Qoder cannot edit Romanian text, user follows this pattern:

  1. User identifies Qoder limitation

  2. User explicitly requests: “Make me a prompt for Gemini to edit this”

  3. Qoder generates detailed prompt for competitor tool

  4. User switches to Gemini/Claude to execute edit

  5. User returns to Qoder for remaining work

Files documenting this pattern:

  • PROMPT_FIX_REPAIRER_INVITATION_DIALOG.md

  • PROMPT_FIX_TRACKING_PANEL_TIMESTAMP.md

  • PROMPT_GEMINI_FIX_REMAINING_ERRORS.md

  • PROMPT_GEMINI_REPAIRERS_BRANCH_FIXES.md

User memory excerpt:

“Fallback Strategy: Provide Gemini Prompts When Editing Is Not Possible - When unable to perform direct edits, provide a detailed, ready-to-use prompt for Gemini that the user can copy and execute independently.”


Business Impact

Why This Makes Users Abandon Qoder

User’s explicit feedback:

“Qoder’s logic is far superior to other tools, but I’m FORCED to use Cursor AI / Gemini for Romanian text edits. If Qoder solved this, I wouldn’t need other tools at all.”

Competitive analysis:

Feature Qoder Cursor AI Gemini Claude
Architectural understanding :star::star::star::star::star: :star::star::star: :star::star::star: :star::star::star::star:
Codebase-wide reasoning :star::star::star::star::star: :star::star::star: :star::star: :star::star::star:
Unicode/diacritic handling :cross_mark: :white_check_mark: :white_check_mark: :white_check_mark:
BOM detection/removal :cross_mark: :white_check_mark: :white_check_mark: :white_check_mark:
Multi-file batch editing :star::star::star::star: :star::star::star: :star::star::star: :star::star::star:

Result: Users choose inferior tools for 30-40% of their workflow due to this single blocker.


Proposed Solutions

Solution 1: Add Explicit Encoding Control to File Tools :star: RECOMMENDED

Modify search_replace and edit_file tools:

typescript
interface FileEditParams {
  file_path: string;
  replacements: Replacement[];
  encoding?: 'utf-8' | 'utf-8-bom' | 'utf-16' | 'auto'; // NEW
  preserve_encoding?: boolean; // NEW (default: true)
  strip_bom?: boolean; // NEW (default: false)
}

Implementation:

  1. Read phase: Detect file encoding using BOM sniffing or confidence heuristics

  2. Process phase: Perform text operations in UTF-8 internally

  3. Write phase:

    • If preserve_encoding=true: Write with original encoding

    • If strip_bom=true: Force UTF-8 without BOM

    • Validate character integrity (no mojibake)

    Benefits:

  • :white_check_mark: Backward compatible (default preserves current behavior)

  • :white_check_mark: Explicit control when needed (strip_bom=true for Flutter projects)

  • :white_check_mark: Handles all diacritic-heavy languages (Romanian, German, Polish, etc.)


Solution 2: Add BOM Detection & Cleanup Tool

New tool: fix_file_encoding

typescript
interface FixEncodingParams {
  file_paths: string[];
  target_encoding: 'utf-8-no-bom' | 'utf-8-bom' | 'auto';
  validate_only?: boolean; // Report issues without fixing
}

interface EncodingReport {
  file_path: string;
  current_encoding: string;
  has_bom: boolean;
  has_mojibake: boolean;
  recommended_action: string;
}

Use cases:

  • Batch-clean BOM from files infected by other tools

  • Pre-flight validation before Flutter builds

  • Audit entire lib/l10n/ directory for encoding issues


Solution 3: Transparent UTF-8 Normalization (Deep Fix)

Architectural change:

plaintext
Current flow:
User request → Qoder → Tool (fragile encoding) → File system

Proposed flow:
User request → Qoder → Encoding layer (normalize) → Tool → Encoding layer (restore) → File system

Encoding layer responsibilities:

  1. Read: Detect encoding, strip BOM if unwanted, convert to UTF-8

  2. Edit: All operations in UTF-8 with full Unicode support

  3. Write: Apply target encoding policy (project-specific or user preference)

Project-level encoding policy:

json
// .qoder/config.json
{
  "encoding": {
    "default": "utf-8",
    "preserve_bom": false,
    "patterns": {
      "lib/**/*.dart": { "encoding": "utf-8", "bom": false },
      "lib/l10n/*.arb": { "encoding": "utf-8", "bom": false },
      "docs/**/*.md": { "encoding": "utf-8", "bom": "preserve" }
    }
  }
}

Testing Requirements

Test Suite for Encoding Handling

Test 1: Romanian Diacritics Preservation

dart
// Input file: test_ro.arb
{"key": "Bună ziua, mulțumim că folosiți aplicația"}

// Operation: Add new key via search_replace
// Expected: All diacritics preserved (ă, î, ș, ț, â)
// Validation: Character-by-character comparison

Test 2: BOM Detection & Removal

dart
// Input: File with UTF-8 BOM (EF BB BF)
// Operation: Edit file with strip_bom=true
// Expected: BOM removed, content preserved
// Validation: Hex dump shows no BOM, Flutter build succeeds

Test 3: Multi-Language Batch Edit

dart
// Input: 56 ARB files (ro, de, fr, es, pl, etc.)
// Operation: Add same key to all files
// Expected: All diacritics preserved in all languages
// Validation: No mojibake, no encoding corruption

Test 4: Mixed-Encoding Project

dart
// Input: Project with UTF-8 (no BOM) + UTF-8 (BOM) files
// Operation: Edit both file types
// Expected: Original encodings preserved unless explicitly changed

Success Metrics

KPIs After Fix Implementation

  1. User Retention:

    • Target: 95% of localization tasks completed within Qoder

    • Current: ~60% (40% exported to Gemini/Cursor)

  2. Productivity:

    • Target: 56-language translation update in under 3 hours

    • Current: 2-3 days with context switching

  3. Tool Switching:

    • Target: <5% of sessions require external tools

    • Current: ~35% of sessions blocked by encoding issues

  4. User Satisfaction:

    • Target: Zero “create Gemini prompt” requests for encoding issues

    • Current: 25 requests per day


Priority Justification

Why This Should Be Priority #1

  1. Affects majority of non-English projects

    • German: ä, ö, ü, ß

    • French: é, è, ê, ç, à

    • Spanish: ñ, á, í, ó

    • Polish: ą, ć, ę, ł, ń, ś, ź, ż

    • Turkish: ğ, ı, ş, ç, ö, ü

    • Czech, Romanian, Vietnamese, etc.

  2. Localization is a massive use case

    • Mobile apps: avg 10-30 languages

    • Enterprise apps: 50-100+ languages

    • Every user hits this eventually

  3. Qoder loses competitive advantage

    • Users acknowledge Qoder’s superior logic

    • Still forced to use inferior tools for 30-40% of work

    • Easy win: Fix this, users stay 100% in Qoder

  4. Low implementation complexity vs. high impact

    • Encoding layer: ~500-800 lines of code

    • Tool parameter additions: ~100 lines

    • Impact: Unlocks entire international user base


Qoder’s architectural understanding is unmatched. But when I need to edit (56 integrated languages), I’m blocked. I have to create Gemini prompts. If Qoder just handled UTF-8 properly, I’d never leave. The logic is already superior—this encoding issue is the ONLY thing holding it back.


Recommended Action Plan

Phase 1: Quick Win (Week 1-2)

  • Add strip_bom parameter to search_replace tool

  • Implement UTF-8 without BOM write path

  • Test with Romanian ARB files

Phase 2: Comprehensive Fix (Week 3-4)

  • Add encoding detection on file read

  • Implement fix_file_encoding tool

  • Create encoding validation test suite

Phase 3: Architectural Enhancement (Week 5-8)

  • Build transparent encoding normalization layer

  • Add project-level encoding policies

  • Document best practices for i18n projects


Appendix: Technical References

User’s Current Workaround Scripts

  • remove_bom_from_all_files.ps1

  • repair_arb_no_bom.ps1

  • POWERSHELL_NO_BOM_RULE.md

  • ANTI_BOM_WORKFLOW.md

Flutter Build Failure Pattern

plaintext
Error: Compilation failed due to BOM character (U+FEFF) at file start
File: lib/features/emergency/emergency_chat_screen.dart:1:1

User’s Explicit Memory Entry

“User Preference: No BOM in Files - User strongly prefers that no BOM is included in any generated source file, as it leads to build failures in the Flutter project.”


Submitted by: Qoder Production User
Project: TeAjutEu (Flutter, 56 languages, 23+ modules)
Contact for clarification: Via task continuation

REQUEST: Please prioritize this fix. Qoder has the best logic—don’t let encoding be the reason users switch to competitors.


End of Report

Great! We’ll take a look at the issue :saluting_face: