Appendix D
Grok Image Generation Warnings
xAI Aurora MIME mismatch — root cause, impact, mitigations, and safe workflows.
KNOWN ISSUE: xAI Grok Aurora returns JPEG bytes regardless of requested format. If mismatched bytes enter a Claude conversation history, the session becomes unrecoverable. Current code mitigates this — read on for safe workflows.
The Problem
The xAI Grok image generation API (Aurora) returns JPEG bytes regardless of the format you request. When these bytes are passed through MCP tool results with a declared media_type: "image/png", the Claude API rejects the request:
Error message
invalid_request_error: The image was specified using the image/png media type,
but the image appears to be a image/jpeg image
Why Sessions Lock Up
- The image tool generates an image — bytes land in the MCP tool result
- If raw base64 is included in the response, Claude adds it to conversation history
- Claude's API validates MIME types on every subsequent request (the entire message history is re-sent)
- Once a mismatched image enters the history, every future message fails with the same 400 error
- The session cannot be recovered — you must start a new conversation
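The failure mode above can be illustrated with a small pre-flight check that walks a message history and flags image blocks whose declared media type disagrees with their bytes. This is a minimal sketch, not part of Plan Forge: `findMismatchedImages` and `sniffBase64MediaType` are hypothetical names, and the message shape assumed here is the Claude Messages API content-block format with base64 image sources.

```javascript
// Hypothetical pre-flight check; illustrative only, not Plan Forge code.
// Base64 prefixes of the magic bytes:
//   JPEG (FF D8 FF)                -> "/9j/"
//   PNG  (89 50 4E 47 0D 0A 1A 0A) -> "iVBORw0KGg"
const BASE64_SIGNATURES = [
  { mediaType: "image/jpeg", prefix: "/9j/" },
  { mediaType: "image/png", prefix: "iVBORw0KGg" },
];

// Returns the media type implied by the data's leading bytes, or null if unrecognized.
function sniffBase64MediaType(data) {
  const hit = BASE64_SIGNATURES.find((s) => data.startsWith(s.prefix));
  return hit ? hit.mediaType : null;
}

// Returns one entry per image block whose declared type disagrees with its bytes.
function findMismatchedImages(messages) {
  const mismatches = [];
  const visit = (blocks, messageIndex) => {
    if (!Array.isArray(blocks)) return; // plain-string content holds no image blocks
    for (const block of blocks) {
      if (block.type === "image" && block.source && block.source.type === "base64") {
        const actual = sniffBase64MediaType(block.source.data);
        if (actual && actual !== block.source.media_type) {
          mismatches.push({ messageIndex, declared: block.source.media_type, actual });
        }
      } else if (block.type === "tool_result") {
        visit(block.content, messageIndex); // tool results can nest image blocks
      }
    }
  };
  messages.forEach((message, index) => visit(message.content, index));
  return mismatches;
}
```

Because the whole history is re-sent on every request, a single entry in the returned list is enough to make every later request fail, which is why the check would have to run before the message is appended rather than after.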
This only affects conversations where raw base64 image data enters the message history. The current Plan Forge MCP implementation returns text-only responses (file path + metadata), so this crash should not occur during normal use.
Current Mitigations
The image pipeline has four layers of defense, spread across generateImage() in orchestrator.mjs and the MCP handler in server.mjs:
| Defense | What It Does | Code Location |
|---|---|---|
| Magic byte detection | Inspects first bytes to determine actual format (JPEG = 0xFF 0xD8 0xFF, PNG = 0x89 0x50 0x4E 0x47) | detectImageFormat() |
| Format conversion | Uses sharp to convert to requested format when actual ≠ requested | convertImageFormat() |
| Text-only MCP response | Tool returns type: "text" with JSON payload (file path, metadata), never raw base64 | server.mjs handler |
| Truncated base64 | Only first 100 chars of base64 included for diagnostics, never full bytes | generateImage() return |
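As a rough illustration of three of these layers (magic byte detection, the text-only response, and the truncated base64 preview), the following sketch may help. It is not the code from orchestrator.mjs or server.mjs: `sniffImageFormat`, `buildTextOnlyResult`, and the payload field names are assumptions made for this example. The only parts taken from the table are the magic byte values, the `type: "text"` result, and the 100-character preview limit.

```javascript
// Hypothetical helpers; names and payload fields are illustrative assumptions.

// Identifies the real format from the leading magic bytes.
function sniffImageFormat(bytes) {
  const startsWith = (signature) =>
    bytes.length >= signature.length && signature.every((b, i) => bytes[i] === b);
  if (startsWith([0xff, 0xd8, 0xff])) return "jpeg";
  if (startsWith([0x89, 0x50, 0x4e, 0x47])) return "png";
  return "unknown";
}

// Builds a tool result that carries only text: a JSON payload with the file
// path, metadata, and a short base64 preview for diagnostics.
function buildTextOnlyResult(bytes, outputPath, requestedFormat) {
  const payload = {
    outputPath,
    requestedFormat,
    actualFormat: sniffImageFormat(bytes),
    sizeBytes: bytes.length,
    // First 100 characters only; the full image never enters the response.
    base64Preview: Buffer.from(bytes).toString("base64").slice(0, 100),
  };
  return { content: [{ type: "text", text: JSON.stringify(payload) }] };
}
```

Since the result contains no image content block, there is no declared media type for the API to validate, so a JPEG saved under a PNG request cannot trigger the 400 error through this path.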
Safe Workflows
For Chapter Art and Illustrations
- Always specify outputPath: the image saves to disk and is not returned inline
- Prefer the .jpg extension: it matches what Grok actually returns (no conversion needed)
- If you need PNG, ensure sharp is installed: cd pforge-mcp && npm install sharp
- Never generate images in a long-running session; use the REST API or a standalone script
- Batch image generation: generate all art in one dedicated session, separate from writing
Standalone Script (Recommended)
REST API (server must be running)
curl -X POST http://localhost:3100/api/image/generate \
-H "Content-Type: application/json" \
-d '{
"prompt": "dark fantasy forge workshop panoramic, amber firelight",
"outputPath": "docs/manual/assets/chapter-heroes/ch1-hero.jpg"
}'
One-shot Node script (no server needed)
node -e "
import('./pforge-mcp/orchestrator.mjs').then(m =>
m.generateImage('dark fantasy forge workshop, amber firelight', {
outputPath: 'docs/manual/assets/chapter-heroes/ch1-hero.jpg',
model: 'grok-imagine-image'
}).then(r => console.log(JSON.stringify(r, null, 2)))
)
"
Pipeline Test Results
Tested 2026-04-07:
| Test | Result | Details |
|---|---|---|
| JPG direct (.jpg output) | ✅ PASS | Grok returns JPEG, saved as .jpg with no conversion. 41 KB. |
| PNG conversion (.png output) | ✅ PASS | Grok returns JPEG, sharp converts to PNG. 312 KB. |
| MIME detection | ✅ PASS | detectImageFormat() correctly identified JPEG bytes. |
| MCP tool response | ✅ SAFE | Returns text-only JSON, never raw base64. |
| Session recovery | ⚠️ MITIGATED | Crash only occurs if raw base64 with wrong MIME enters history. Current code prevents this. |
If a Session Crashes
- Start a new conversation — the current session cannot be recovered
- Don't retry the same tool call in the new session — it will produce the same crash if the root cause persists
- Use the REST API instead of the MCP tool for image generation
- Check sharp: run cd pforge-mcp && npm ls sharp. If it is not installed, format conversion won't work and the extension is corrected to .jpg instead
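The extension fallback mentioned in the last item can be sketched as follows. This is one possible way to implement it, not the logic in generateImage(): `resolveOutputPath` and its `canConvert` flag (standing in for "sharp is installed") are hypothetical.

```javascript
// Hypothetical helper; illustrative only, not the orchestrator.mjs implementation.
const EXTENSIONS_BY_FORMAT = { jpeg: [".jpg", ".jpeg"], png: [".png"] };

// Returns the path the bytes should be saved under. When conversion is not
// possible, the extension is rewritten to match the actual image format.
function resolveOutputPath(outputPath, actualFormat, canConvert) {
  const accepted = EXTENSIONS_BY_FORMAT[actualFormat];
  if (canConvert || !accepted) return outputPath; // converted later, or format unknown

  const lastSeparator = Math.max(outputPath.lastIndexOf("/"), outputPath.lastIndexOf("\\"));
  const lastDot = outputPath.lastIndexOf(".");
  const hasExtension = lastDot > lastSeparator + 1; // skip dots in directory names and dotfiles
  const currentExtension = hasExtension ? outputPath.slice(lastDot).toLowerCase() : "";
  if (accepted.includes(currentExtension)) return outputPath; // already consistent

  const stem = hasExtension ? outputPath.slice(0, lastDot) : outputPath;
  return stem + accepted[0];
}

// JPEG bytes, PNG requested, no sharp available: the file is saved as .jpg.
console.log(resolveOutputPath("docs/manual/assets/chapter-heroes/ch1-hero.png", "jpeg", false));
```

Keeping the extension consistent with the bytes means any later tool that infers a media type from the file name will declare the type the image really has.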
Best practice: Use .jpg for all generated images. It matches Grok's native output format: no conversion, no risk, fastest save.
📄 Source: pforge-mcp/orchestrator.mjs — detectImageFormat(), convertImageFormat(), generateImage()