Image Generation
Generate images using AI models that support multimodal output through the OpenAI-compatible API. This feature allows you to create images alongside text responses using models like Google's Gemini 2.5 Flash Image.
To enable image generation, include the `modalities` parameter in your request:
- `modalities` (array): Array of strings specifying the desired output modalities. Use `["text", "image"]` for both text and image generation, or `["image"]` for image-only generation.
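A minimal request sketch in Python using the `requests` library; the endpoint URL, `API_KEY` environment variable, and model slug below are placeholders for your provider's OpenAI-compatible setup, not confirmed values:

```python
import os
import requests

# Placeholder endpoint and model slug; substitute your provider's
# OpenAI-compatible chat completions URL and an image-capable model.
API_URL = "https://your-provider.example/v1/chat/completions"
MODEL = "google/gemini-2.5-flash-image"

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    json={
        "model": MODEL,
        "messages": [
            {"role": "user", "content": "Generate an image of a sunset over mountains."}
        ],
        # Request both text and image output; use ["image"] for image-only.
        "modalities": ["text", "image"],
    },
)
response.raise_for_status()
print(response.json())
```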
When image generation is enabled, the response separates text content from generated images:
- `content`: Contains the text description as a string
- `images`: Array of generated images, each with:
  - `type`: Always `"image_url"`
  - `image_url.url`: Base64-encoded data URI of the generated image
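Continuing the request sketch above, here is a hedged example of reading a non-streaming response using the field names just described; it assumes the data URI payload is base64-encoded image bytes (for example PNG):

```python
import base64

data = response.json()
message = data["choices"][0]["message"]

# Text portion of the reply.
print(message.get("content", ""))

# Each entry carries a data URI such as "data:image/png;base64,<payload>".
for i, image in enumerate(message.get("images", [])):
    data_uri = image["image_url"]["url"]
    header, _, payload = data_uri.partition(",")
    with open(f"generated_{i}.png", "wb") as f:
        f.write(base64.b64decode(payload))
```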
For streaming requests, images are delivered in delta chunks alongside text content. When processing streaming responses, check each delta for both text content and images, as in the sketch below:
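A sketch of consuming the stream, assuming standard OpenAI-style server-sent events (`data: ` lines terminated by `[DONE]`) and images surfaced on `delta.images`; the URL and model slug remain placeholders:

```python
import json
import os
import requests

API_URL = "https://your-provider.example/v1/chat/completions"  # placeholder

with requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    json={
        "model": "google/gemini-2.5-flash-image",  # illustrative slug
        "messages": [{"role": "user", "content": "Draw a lighthouse at dusk."}],
        "modalities": ["text", "image"],
        "stream": True,
    },
    stream=True,
) as response:
    response.raise_for_status()
    images = []
    for line in response.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        chunk = json.loads(payload)
        if not chunk.get("choices"):
            continue
        delta = chunk["choices"][0].get("delta", {})
        # Text arrives incrementally; images arrive as complete data URIs.
        if delta.get("content"):
            print(delta["content"], end="", flush=True)
        if delta.get("images"):
            images.extend(img["image_url"]["url"] for img in delta["images"])
```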
Image generation support: Image generation is currently supported by Google's Gemini 2.5 Flash Image model. Generated images are returned as base64-encoded data URIs in the response. For more details, see the Image Generation documentation.