Introduction: The OpenAI GPT-Image Matrix
OpenAI’s GPT-Image line spans two generations—1.5 and 2.0—each with a text-to-image and an image-to-image workflow. On FuseAI Tools that yields four concrete modes: fast 1.5 routes for iteration, and v2 routes for long prompts, social aspect ratios, and 1K/2K/4K output.
The usual question is not “which brand” but which generation and which input type: do you draft on 1.5 or deliver on 2.0, and do you start from text alone or from reference stills?
GPT-Image hub: /home/gpt-image
I. Snapshot: Four Models at a Glance
| Model / route | Core function | Prompt limit | Output control | Reference images | Positioning |
|---|---|---|---|---|---|
| 1.5 text-to-image | Text → image | 3,000 chars | quality: medium / high | ❌ | Fast text-to-image |
| 1.5 image-to-image | Image + text → image | 3,000 chars | quality: medium / high | ✅ (1 image) | Quick edits |
| 2 text-to-image | Text → image | 20,000 chars | resolution: 1K / 2K / 4K | ❌ | Pro text-to-image |
| 2 image-to-image | Images + text → image | 20,000 chars | resolution: 1K / 2K / 4K | ✅ (up to 16) | Pro multi-reference edit |
II. Four Models Deep Dive
Model 1: GPT-Image 1.5 text-to-image
Create stills from language with lightweight controls—ideal for concept boards and social drafts.
{
"model": "gpt-image-1.5-text-to-image",
"prompt": "Describe the image you want",
"aspectRatio": "1:1",
"quality": "medium"
}
Parameters:
| Field | Options | Notes |
|---|---|---|
| aspectRatio | 1:1, 2:3, 3:2 | Three classic ratios; no 9:16 |
| quality | medium, high | medium = balanced; high = slower, more detail |
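The limits above can be checked client-side before a request is sent. The following is a minimal sketch, not part of any FuseAI Tools SDK: the helper name `validate_v15_text_payload` is hypothetical, the field names come from the payload above, and the 3,000-character limit comes from the snapshot table. It also assumes the limit counts Python string length, which may differ from how the service counts characters.

```python
# Hypothetical client-side check; limits are taken from the tables in this article.
V15_ASPECT_RATIOS = {"1:1", "2:3", "3:2"}
V15_QUALITIES = {"medium", "high"}
V15_PROMPT_LIMIT = 3000  # characters; assumes len() matches the service's count


def validate_v15_text_payload(payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    prompt = payload.get("prompt", "")
    if not prompt:
        problems.append("prompt is required")
    elif len(prompt) > V15_PROMPT_LIMIT:
        problems.append(f"prompt exceeds {V15_PROMPT_LIMIT} characters")
    if payload.get("aspectRatio") not in V15_ASPECT_RATIOS:
        problems.append("aspectRatio must be one of 1:1, 2:3, 3:2")
    if payload.get("quality") not in V15_QUALITIES:
        problems.append("quality must be medium or high")
    return problems
```

Returning a list rather than raising on the first error lets a form show every problem at once.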
Best for: fast concepts, simple social tiles, low-friction experimentation.
Route: /home/gpt-image/text-to-image
Model 2: GPT-Image 1.5 image-to-image
Edit from a single reference still plus prompt—wardrobe swaps, background changes, light style shifts.
{
"model": "gpt-image-1.5-image-to-image",
"inputUrls": ["https://example.com/reference.jpg"],
"prompt": "Change the outfit to a red dress; keep pose and expression",
"aspectRatio": "3:2",
"quality": "high"
}
Note: FuseAI Tools accepts one reference upload on the 1.5 image-to-image tab (JPG/PNG/WEBP, max 10MB).
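A pre-upload check for the format and size limits in the note might look like the sketch below. The function name is hypothetical, and two details are assumptions rather than documented behavior: `.jpeg` is treated as JPG, and "10MB" is read as 10 × 1024 × 1024 bytes.

```python
from pathlib import Path
from typing import Optional

# Assumption: ".jpeg" counts as JPG alongside the formats listed in the note.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
# Assumption: "10MB" means 10 * 1024 * 1024 bytes.
MAX_BYTES = 10 * 1024 * 1024


def reference_file_problem(filename: str, size_bytes: int) -> Optional[str]:
    """Return a description of the problem, or None if the file looks acceptable."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return f"unsupported format: {ext or 'no extension'}"
    if size_bytes > MAX_BYTES:
        return "file is larger than 10MB"
    return None
```

This inspects only the file name and size; it does not verify that the file contents match the extension.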
Best for: product touch-ups, portrait edits, single-image guided variations.
Route: /home/gpt-image/image-to-image
Model 3: GPT-Image 2 text-to-image
Professional text-to-image with prompts of up to 20,000 characters and explicit resolution tiers (1K, 2K, 4K).
{
"model": "gpt-image-2-text-to-image",
"prompt": "Cinematic cyberpunk city at night, neon rain reflections",
"aspectRatio": "16:9",
"resolution": "4K"
}
vs 1.5 text-to-image:
| Contrast | 1.5 text-to-image | 2 text-to-image |
|---|---|---|
| Prompt limit | 3,000 chars | 20,000 chars |
| Output knob | quality | resolution (1K/2K/4K) |
| Aspect ratios | 1:1, 2:3, 3:2 | auto, 1:1, 9:16, 16:9, 4:3, 3:4 |
| 4K | ❌ | ✅ (with rules below) |
4K rules on FuseAI Tools:
- 1:1 cannot use 4K.
- auto only allows 1K resolution.
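The two rules above, combined with the v2 ratio and resolution lists from the comparison table, reduce to a small compatibility check. This is a sketch under those stated rules only; `v2_combo_allowed` is a hypothetical helper, and the service may enforce additional constraints not listed here.

```python
V2_ASPECT_RATIOS = {"auto", "1:1", "9:16", "16:9", "4:3", "3:4"}
V2_RESOLUTIONS = {"1K", "2K", "4K"}


def v2_combo_allowed(aspect_ratio: str, resolution: str) -> bool:
    """Apply the FuseAI Tools rules stated above to a ratio/resolution pair."""
    if aspect_ratio not in V2_ASPECT_RATIOS or resolution not in V2_RESOLUTIONS:
        return False
    if aspect_ratio == "auto":
        # auto only allows 1K
        return resolution == "1K"
    if aspect_ratio == "1:1" and resolution == "4K":
        # 1:1 cannot use 4K
        return False
    return True
```

For example, `v2_combo_allowed("16:9", "4K")` is `True`, while `v2_combo_allowed("auto", "2K")` and `v2_combo_allowed("1:1", "4K")` are both `False`.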
Best for: posters, hero banners, print-minded stills, prompt-heavy art direction.
Route: /home/gpt-image/v2-text-to-image
Model 4: GPT-Image 2 image-to-image
High-resolution editing with up to sixteen reference URLs and the same v2 aspect-ratio and resolution stack.
{
"model": "gpt-image-2-image-to-image",
"prompt": "Blend wardrobe references into one editorial look, soft daylight",
"inputUrls": [
"https://example.com/ref-1.jpg",
"https://example.com/ref-2.jpg"
],
"aspectRatio": "9:16",
"resolution": "2K"
}
vs 1.5 image-to-image: v2 adds 20,000-character prompts, resolution tiers, social ratios, and multi-image input (up to 16). Use 1.5 when you only need a single quick edit; use v2 for complex fusion or 4K delivery.
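The difference in reference limits between the two image-to-image models can be guarded with a check like the one below. It is a hypothetical helper, not a FuseAI Tools API; the limits of 1 and 16 come from this article, while the requirement of at least one URL is an assumption.

```python
# Reference-image limits as described in this article.
MAX_REFERENCES = {
    "gpt-image-1.5-image-to-image": 1,
    "gpt-image-2-image-to-image": 16,
}


def check_reference_count(payload: dict) -> None:
    """Raise ValueError if inputUrls does not fit the model's reference limit."""
    model = payload["model"]
    if model not in MAX_REFERENCES:
        raise ValueError(f"{model} is not an image-to-image model")
    count = len(payload.get("inputUrls", []))
    limit = MAX_REFERENCES[model]
    if count < 1:
        # Assumption: image-to-image needs at least one reference.
        raise ValueError("at least one reference URL is required")
    if count > limit:
        raise ValueError(f"{model} accepts at most {limit} reference image(s), got {count}")
```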
Best for: multi-reference styling, campaign composites, high-res retouch pipelines.
Route: /home/gpt-image/v2-image-to-image
III. Cross-Model Comparison
3.1 Capability matrix
| Feature | 1.5 T2I | 1.5 I2I | 2 T2I | 2 I2I |
|---|---|---|---|---|
| Text-to-image | ✅ | ❌ | ✅ | ❌ |
| Image-to-image | ❌ | ✅ | ❌ | ✅ |
| 20,000-char prompt | ❌ | ❌ | ✅ | ✅ |
| 4K output | ❌ | ❌ | ✅ | ✅ |
| Multi-reference (16) | ❌ | ❌ | ❌ | ✅ |
| auto aspect ratio | ❌ | ❌ | ✅ | ✅ |
| quality param | ✅ | ✅ | ❌ | ❌ |
| resolution param | ❌ | ❌ | ✅ | ✅ |
3.2 Aspect ratio coverage
| Ratio | 1.5 series | 2.0 series |
|---|---|---|
| 1:1 | ✅ | ✅ |
| 2:3 / 3:2 | ✅ | ❌ |
| 16:9 / 9:16 | ❌ | ✅ |
| 4:3 / 3:4 | ❌ | ✅ |
| auto | ❌ | ✅ (1K only) |
Takeaway: 1.5 favors classic photo ratios (2:3, 3:2). v2 targets social and cinematic frames (9:16, 16:9) plus optional auto sizing.
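When the target frame is already known, the coverage table can be turned into a lookup that answers which series can produce it. A small sketch, with the series labels `"1.5"` and `"2"` chosen here for illustration:

```python
# Aspect-ratio coverage as listed in the table above.
ASPECT_RATIOS = {
    "1.5": {"1:1", "2:3", "3:2"},
    "2": {"auto", "1:1", "9:16", "16:9", "4:3", "3:4"},
}


def series_supporting(ratio: str) -> list:
    """Return the series labels whose ratio list includes the given ratio."""
    return [series for series, ratios in ASPECT_RATIOS.items() if ratio in ratios]
```

Only 1:1 is shared by both generations, so any other ratio effectively picks the series for you.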
IV. Selection Decision Tree
What is your starting point?
|
|-- Text only (no reference image)
|   |-- Fast draft, simple brief -> 1.5 text-to-image
|   `-- Long brief, 4K, 9:16 or 16:9 -> 2 text-to-image
|
`-- Have reference image(s)
    |-- One image, quick edit -> 1.5 image-to-image
    `-- Many refs or 4K delivery -> 2 image-to-image
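The same tree can be written as a function. This is an illustrative sketch: `choose_model` and its parameter names are hypothetical. It extends the tree slightly by also routing a single-reference job to v2 when it needs a long prompt or a v2-only ratio, since the capability matrix shows 1.5 cannot handle those.

```python
def choose_model(
    reference_count: int,
    long_prompt: bool = False,      # brief longer than 3,000 characters
    needs_4k: bool = False,
    needs_v2_ratio: bool = False,   # 9:16, 16:9, 4:3, 3:4, or auto
) -> str:
    """Pick one of the four model identifiers used in this article."""
    if not 0 <= reference_count <= 16:
        raise ValueError("reference_count must be between 0 and 16")
    needs_v2 = long_prompt or needs_4k or needs_v2_ratio
    if reference_count == 0:
        # Text only
        return "gpt-image-2-text-to-image" if needs_v2 else "gpt-image-1.5-text-to-image"
    if reference_count > 1 or needs_v2:
        # Many references, or a v2-only requirement
        return "gpt-image-2-image-to-image"
    # One image, quick edit
    return "gpt-image-1.5-image-to-image"
```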
V. Practical Payload Examples
1.5 text-to-image
{
"model": "gpt-image-1.5-text-to-image",
"prompt": "An orange cat napping in warm sunlight, cozy home mood",
"aspectRatio": "1:1",
"quality": "high"
}
1.5 image-to-image
{
"model": "gpt-image-1.5-image-to-image",
"inputUrls": ["https://example.com/portrait.jpg"],
"prompt": "Swap the outfit to a red dress; keep pose and expression",
"aspectRatio": "3:2",
"quality": "medium"
}
2 text-to-image (4K banner)
{
"model": "gpt-image-2-text-to-image",
"prompt": "Rain-soaked cyberpunk avenue, neon reflections, film still",
"aspectRatio": "16:9",
"resolution": "4K"
}
Reminder: 4K is not available with 1:1, and auto supports only 1K.
2 image-to-image (multi-reference)
{
"model": "gpt-image-2-image-to-image",
"inputUrls": ["https://example.com/a.jpg", "https://example.com/b.jpg"],
"prompt": "Merge styling cues from both references into one editorial frame",
"aspectRatio": "9:16",
"resolution": "2K"
}
VI. Final Recommendations
| Use case | Model | Why |
|---|---|---|
| Quick concept still | 1.5 text-to-image | Simple quality toggle, low friction |
| Single-photo edit | 1.5 image-to-image | One reference, fast turnaround |
| Vertical social / hero banner | 2 text-to-image | 9:16 and 16:9, with resolution up to 4K |
| Multi-image fusion | 2 image-to-image | Up to 16 inputUrls |
| Long structured prompt | 2.0 series | 20,000-character ceiling |
Quick playbook:
- Daily drafts → 1.5 text-to-image or 1.5 image-to-image.
- Final delivery → 2 text-to-image or 2 image-to-image.
1.5 and 2.0 are complementary: iterate cheaply on 1.5, then promote winners to v2 for resolution and ratio control. Open every mode from /home/gpt-image.
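Promoting a draft means more than swapping the model name: v2 replaces `quality` with `resolution`, and the 2:3 and 3:2 ratios are not available on v2, so a new ratio has to be chosen. The sketch below shows one way to do that conversion. `promote_to_v2` is a hypothetical helper, and the `2K` default is an arbitrary choice for illustration.

```python
V2_MODEL_FOR = {
    "gpt-image-1.5-text-to-image": "gpt-image-2-text-to-image",
    "gpt-image-1.5-image-to-image": "gpt-image-2-image-to-image",
}
V2_ASPECT_RATIOS = {"auto", "1:1", "9:16", "16:9", "4:3", "3:4"}


def promote_to_v2(draft: dict, aspect_ratio: str, resolution: str = "2K") -> dict:
    """Build a v2 payload from a 1.5 draft, dropping the 1.5-only quality field."""
    if draft["model"] not in V2_MODEL_FOR:
        raise ValueError(f"not a 1.5 model: {draft['model']}")
    if aspect_ratio not in V2_ASPECT_RATIOS:
        raise ValueError(f"{aspect_ratio} is not a v2 aspect ratio")
    promoted = {
        "model": V2_MODEL_FOR[draft["model"]],
        "prompt": draft["prompt"],
        "aspectRatio": aspect_ratio,  # chosen by the caller; 2:3 and 3:2 do not carry over
        "resolution": resolution,
    }
    if "inputUrls" in draft:
        promoted["inputUrls"] = list(draft["inputUrls"])
    return promoted
```

The helper does not check the 4K rules (no 4K with 1:1, and auto limited to 1K), so those still need to be verified before submitting.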
