Appendix D

Grok Image Generation Warnings

xAI Aurora MIME mismatch — root cause, impact, mitigations, and safe workflows.

KNOWN ISSUE: xAI Grok Aurora returns JPEG bytes regardless of requested format. If mismatched bytes enter a Claude conversation history, the session becomes unrecoverable. Current code mitigates this — read on for safe workflows.

The Problem

The xAI Grok image generation API (Aurora) returns JPEG bytes regardless of the format you request. When these bytes are passed through MCP tool results with a declared media_type: "image/png", the Claude API rejects the request:

Error message
invalid_request_error: The image was specified using the image/png media type,
but the image appears to be a image/jpeg image
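
To make the mismatch concrete, the sketch below builds an image content block that declares image/png while its payload begins with the JPEG signature. The block shape follows the Claude Messages API base64 image format; the four sample bytes and the declaredVsActual helper are illustrative assumptions, not Plan Forge code.

```javascript
// Illustrative only: four bytes standing in for a real JPEG payload (assumption).
const jpegBytes = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);

// An image block that declares PNG but carries JPEG data.
const imageBlock = {
  type: "image",
  source: {
    type: "base64",
    media_type: "image/png", // the format that was requested
    data: jpegBytes.toString("base64"), // the bytes that actually came back
  },
};

// Hypothetical helper: compare the declared type with the leading bytes.
function declaredVsActual(block) {
  const head = Buffer.from(block.source.data, "base64").subarray(0, 3);
  const isJpeg = head[0] === 0xff && head[1] === 0xd8 && head[2] === 0xff;
  return {
    declared: block.source.media_type,
    actual: isJpeg ? "image/jpeg" : "unknown",
  };
}

console.log(declaredVsActual(imageBlock));
```

A block like this is what triggers the 400 response quoted above: the declared type and the leading bytes disagree.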

Why Sessions Lock Up

  1. The image tool generates an image — bytes land in the MCP tool result
  2. If raw base64 is included in the response, Claude adds it to conversation history
  3. Claude's API validates MIME types on every subsequent request (the entire message history is re-sent)
  4. Once a mismatched image enters the history, every future message fails with the same 400 error
  5. The session cannot be recovered — you must start a new conversation
This only affects conversations where raw base64 image data enters the message history. The current Plan Forge MCP implementation returns text-only responses (file path + metadata), so this crash should not occur during normal use.
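
The lock-up described in the steps above can be simulated with a small sketch. It assumes a simplified history shape (messages holding text and image blocks) and uses a hypothetical validateHistory function as a stand-in for the server-side check; the real validation logic is not public and may differ.

```javascript
// Hypothetical stand-in for server-side validation (assumption): sniff the
// leading bytes of a base64 payload and name the format they indicate.
function sniff(base64) {
  const b = Buffer.from(base64, "base64");
  if (b[0] === 0xff && b[1] === 0xd8 && b[2] === 0xff) return "image/jpeg";
  if (b[0] === 0x89 && b[1] === 0x50 && b[2] === 0x4e && b[3] === 0x47) return "image/png";
  return "unknown";
}

// Every request re-sends the full history, so every image is checked again.
function validateHistory(history) {
  for (const message of history) {
    for (const block of message.content) {
      if (block.type !== "image") continue;
      const actual = sniff(block.source.data);
      if (actual !== block.source.media_type) {
        return { ok: false, status: 400, declared: block.source.media_type, actual };
      }
    }
  }
  return { ok: true, status: 200 };
}

const jpegData = Buffer.from([0xff, 0xd8, 0xff, 0xe0]).toString("base64");
const history = [{ role: "user", content: [{ type: "text", text: "Draw a forge" }] }];
console.log(validateHistory(history).status); // history is still clean here

// A result with mislabeled bytes enters the history...
history.push({
  role: "user",
  content: [{ type: "image", source: { type: "base64", media_type: "image/png", data: jpegData } }],
});
// ...and every later turn fails, no matter what the new message says.
history.push({ role: "user", content: [{ type: "text", text: "Never mind, keep writing" }] });
console.log(validateHistory(history).status);
```

Because the bad block is never removed from the history, no later message can make the request valid again, which is why the only remedy is a new conversation.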

Current Mitigations

The generateImage() function in orchestrator.mjs has four layers of defense:

| Defense | What It Does | Code Location |
| --- | --- | --- |
| Magic byte detection | Inspects first bytes to determine actual format (JPEG = 0xFF 0xD8 0xFF, PNG = 0x89 0x50 0x4E 0x47) | detectImageFormat() |
| Format conversion | Uses sharp to convert to requested format when actual ≠ requested | convertImageFormat() |
| Text-only MCP response | Tool returns type: "text" with JSON payload (file path, metadata), never raw base64 | server.mjs handler |
| Truncated base64 | Only first 100 chars of base64 included for diagnostics, never full bytes | generateImage() return |
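
As a rough illustration of how the detection, text-only response, and truncation layers fit together, here is a minimal sketch. It is not the code in orchestrator.mjs or server.mjs: the function names detectFormatSketch and buildTextResult and the payload field names are assumptions chosen for this example. Only the byte signatures, the 100-character limit, and the text-only result shape come from the table above.

```javascript
// Sketch of magic-byte detection using the signatures listed in the table.
// Named detectFormatSketch to make clear it is not the project's detectImageFormat().
function detectFormatSketch(bytes) {
  const jpeg = [0xff, 0xd8, 0xff];
  const png = [0x89, 0x50, 0x4e, 0x47];
  const startsWith = (sig) =>
    bytes.length >= sig.length && sig.every((value, i) => bytes[i] === value);
  if (startsWith(jpeg)) return "jpeg";
  if (startsWith(png)) return "png";
  return "unknown";
}

// Sketch of a text-only tool result: path and metadata as JSON, plus at most
// 100 characters of base64 for diagnostics. Field names are assumptions.
// `bytes` is expected to be a Node Buffer.
function buildTextResult(outputPath, bytes, requestedFormat) {
  const payload = {
    outputPath,
    requestedFormat,
    detectedFormat: detectFormatSketch(bytes),
    sizeBytes: bytes.length,
    base64Preview: bytes.toString("base64").slice(0, 100),
  };
  return { content: [{ type: "text", text: JSON.stringify(payload) }] };
}

// Fake image: a JPEG signature followed by padding (illustrative data only).
const fakeJpeg = Buffer.concat([Buffer.from([0xff, 0xd8, 0xff, 0xe0]), Buffer.alloc(500)]);
const result = buildTextResult("ch1-hero.jpg", fakeJpeg, "png");
console.log(result.content[0].type, JSON.parse(result.content[0].text).detectedFormat);
```

The point of the design is that the result contains no image block at all, so there is nothing in the conversation history for MIME validation to reject.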

Safe Workflows

For Chapter Art and Illustrations

  1. Always specify outputPath — image saves to disk, not returned inline
  2. Prefer .jpg extension — matches what Grok actually returns (no conversion needed)
  3. If you need PNG, ensure sharp is installed: cd pforge-mcp && npm install sharp
  4. Never generate images in a long-running session — use the REST API or a standalone script
  5. Batch image generation — generate all art in one dedicated session, separate from writing
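
The extension advice in the list above can be expressed as a small pre-flight check. This is a sketch under assumptions: isSharpAvailable and chooseOutputPath are hypothetical helpers written for this example, and the fallback to .jpg mirrors the behavior this appendix describes when sharp is missing rather than reproducing the project's code.

```javascript
// Hypothetical helper: resolves to true only if the sharp package can be loaded.
async function isSharpAvailable() {
  try {
    await import("sharp");
    return true;
  } catch {
    return false;
  }
}

// Hypothetical helper: keep the requested extension only when it is already
// JPEG or when conversion is possible; otherwise fall back to .jpg, which
// matches the bytes Grok returns.
function chooseOutputPath(requestedPath, sharpAvailable) {
  const wantsJpeg = /\.jpe?g$/i.test(requestedPath);
  if (wantsJpeg || sharpAvailable) return requestedPath;
  return requestedPath.replace(/\.[^./\\]+$/, "") + ".jpg";
}

isSharpAvailable().then((available) => {
  console.log(chooseOutputPath("docs/manual/assets/chapter-heroes/ch1-hero.png", available));
});
```

Running a check like this before a batch makes it obvious up front whether PNG output is achievable on the current machine.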

Standalone Script (Recommended)

REST API (server must be running)
curl -X POST http://localhost:3100/api/image/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "dark fantasy forge workshop panoramic, amber firelight",
    "outputPath": "docs/manual/assets/chapter-heroes/ch1-hero.jpg"
  }'
One-shot Node script (no server needed)
node -e "
  import('./pforge-mcp/orchestrator.mjs').then(m =>
    m.generateImage('dark fantasy forge workshop, amber firelight', {
      outputPath: 'docs/manual/assets/chapter-heroes/ch1-hero.jpg',
      model: 'grok-imagine-image'
    }).then(r => console.log(JSON.stringify(r, null, 2)))
  )
"

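For batch generation, the curl call above can be wrapped in a short Node script. The endpoint and the prompt and outputPath body fields are taken from that curl example. The response format is not documented in this appendix, so the sketch returns it as raw text, and the job list and helper names are hypothetical.

```javascript
// Endpoint and body fields come from the curl example; everything else here
// (helper names, job list, result shape) is an assumption for illustration.
const ENDPOINT = "http://localhost:3100/api/image/generate";

// Build the fetch options for one image job.
function buildRequest(job) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt: job.prompt, outputPath: job.outputPath }),
  };
}

// Run jobs one at a time so a failure is easy to attribute to a single image.
// fetchImpl defaults to the global fetch available in Node 18 and later.
async function generateAll(jobs, fetchImpl = fetch) {
  const results = [];
  for (const job of jobs) {
    const response = await fetchImpl(ENDPOINT, buildRequest(job));
    results.push({
      outputPath: job.outputPath,
      status: response.status,
      body: await response.text(),
    });
  }
  return results;
}

// Hypothetical job list; call generateAll(jobs) while the server is running.
const jobs = [
  {
    prompt: "dark fantasy forge workshop panoramic, amber firelight",
    outputPath: "docs/manual/assets/chapter-heroes/ch1-hero.jpg",
  },
];
```

With the server running, `generateAll(jobs).then(console.log)` would process the list sequentially. Every job uses a .jpg path, so no conversion step is involved.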
Pipeline Test Results

Tested 2026-04-07:

| Test | Result | Details |
| --- | --- | --- |
| JPG direct (.jpg output) | ✅ PASS | Grok returns JPEG, saved as .jpg with no conversion. 41 KB. |
| PNG conversion (.png output) | ✅ PASS | Grok returns JPEG, sharp converts to PNG. 312 KB. |
| MIME detection | ✅ PASS | detectImageFormat() correctly identified JPEG bytes. |
| MCP tool response | ✅ SAFE | Returns text-only JSON, never raw base64. |
| Session recovery | ⚠️ MITIGATED | Crash only occurs if raw base64 with wrong MIME enters history. Current code prevents this. |

If a Session Crashes

  1. Start a new conversation — the current session cannot be recovered
  2. Don't retry the same tool call in the new session — it will produce the same crash if the root cause persists
  3. Use the REST API instead of the MCP tool for image generation
  4. Check sharp: run cd pforge-mcp && npm ls sharp — if not installed, format conversion won't work and the extension gets corrected to .jpg instead
Best practice: Use .jpg for all generated images. It matches Grok's native output format — no conversion, no risk, fastest save.
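
When wrapping API calls in your own tooling, it can help to recognize this specific failure so the user is told to start a new conversation rather than retry. The sketch below matches the error text quoted earlier in this appendix. The helper name parseMimeMismatch is hypothetical, and the exact wording of the API error could change, so treat the pattern as an assumption.

```javascript
// Pattern built from the error text quoted earlier in this appendix.
const MIME_MISMATCH =
  /The image was specified using the (\S+) media type,\s+but the image appears to be an? (\S+) image/;

// Hypothetical helper: returns the declared and actual types, or null when
// the message is some other error.
function parseMimeMismatch(errorMessage) {
  const match = MIME_MISMATCH.exec(errorMessage);
  return match ? { declared: match[1], actual: match[2] } : null;
}

const sample =
  "invalid_request_error: The image was specified using the image/png media type, " +
  "but the image appears to be a image/jpeg image";

const parsed = parseMimeMismatch(sample);
if (parsed) {
  console.log(
    `History contains ${parsed.actual} bytes labeled ${parsed.declared}; start a new conversation.`
  );
}
```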

📄 Source: pforge-mcp/orchestrator.mjs (detectImageFormat(), convertImageFormat(), generateImage())