Request parameters for /prompt

This topic describes the request parameters for the /prompt passthrough endpoint.

The following table lists the api/v1/prompt endpoint parameters, along with details for further model configuration:

Table 1. Prompt Request

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | false | Name of the model to use; if none is provided, defaults to CLAUDE |
| version | string | false | Specific ID/version of the model to use; model is required when this field is set |
| safetyGuardrail | boolean | false | Enables the base guardrails from AWS |
| prompt | string | true | Prompt to provide to the LLM |
| encoded_image | string | false | A Base64-encoded image |
| config | ModelConfig | false | Configuration to customize model parameters |
| document | object | false | Upload the document as part of the prompt to the LLM; see: Document support for LLM passthrough endpoints |
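As a sketch, a request body using the parameters above might be assembled as follows. The `build_prompt_request` helper is illustrative, not part of the API; it only enforces the constraints stated in Table 1 (prompt is required, and version needs model to be set).

```python
# Sketch: assemble an api/v1/prompt request body from the Table 1 parameters.
# Only "prompt" is required; unset optional parameters are omitted entirely
# rather than sent as nulls.

def build_prompt_request(prompt, model=None, version=None,
                         safety_guardrail=None, encoded_image=None,
                         config=None, document=None):
    """Return a dict suitable for POSTing to api/v1/prompt as JSON."""
    if not prompt:
        raise ValueError("prompt is required")
    if version is not None and model is None:
        # Per Table 1, version selects a specific model id,
        # so model must also be provided.
        raise ValueError("model is required when version is provided")

    body = {"prompt": prompt}
    optional = {
        "model": model,
        "version": version,
        "safetyGuardrail": safety_guardrail,
        "encoded_image": encoded_image,
        "config": config,
        "document": document,
    }
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

For example, `build_prompt_request("Summarize this text", model="CLAUDE")` yields a body containing only `prompt` and `model`.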
Table 2. Configuration

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| max_response | int | false | Maximum number of tokens to generate in the response. Note: The default value is 1024. |
| temperature | float | false | Sampling temperature to use, between 0 and 2. |
| top_p | float | false | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. See: temperature vs top_p |
| stop_sequence | string[] | false | Up to 4 sequences where the API will stop generating further tokens. |
| frequency_penalty | float | false | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. |
| presence_penalty | float | false | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. See: frequency_penalty vs presence_penalty |
| reasoning | object | false | enabled: flag for enabling extended thinking. budget_tokens: maximum number of tokens used for reasoning. |
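The ranges in the configuration table can be checked before sending a request. This sketch is illustrative (the `validate_config` helper is not part of the API); the bounds come directly from Table 2.

```python
# Sketch: validate a ModelConfig dict against the constraints in Table 2.
# All fields are optional, so each check only runs when the field is present.

def validate_config(config):
    """Raise ValueError if a Table 2 constraint is violated; else return config."""
    if not 0 <= config.get("temperature", 1.0) <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if len(config.get("stop_sequence", [])) > 4:
        raise ValueError("stop_sequence allows at most 4 sequences")
    for key in ("frequency_penalty", "presence_penalty"):
        if not -2.0 <= config.get(key, 0.0) <= 2.0:
            raise ValueError(f"{key} must be between -2.0 and 2.0")
    return config

config = validate_config({
    "max_response": 1024,  # the documented default
    "temperature": 0.7,
    "top_p": 0.9,
    "stop_sequence": ["END"],
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "reasoning": {"enabled": True, "budget_tokens": 2048},
})
```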

Prompt request example

{
  "model": "string",
  "version": "string",
  "prompt": "string",
  "encoded_image": "data:image/png;base64,<Base64 encoding>",
  "config": {
    "max_response": integer,
    "temperature": float,
    "top_p": float,
    "stop_sequence": [
      string
    ],
    "frequency_penalty": float,
    "presence_penalty": float
  }
}
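The encoded_image field expects a Base64 data URI, as shown in the example above. One way to produce that value from raw image bytes (the `to_encoded_image` helper is illustrative; in practice the bytes would come from reading an image file):

```python
import base64

# Sketch: build the data-URI value expected by encoded_image
# from raw PNG bytes.
def to_encoded_image(png_bytes):
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{b64}"

# The 8-byte PNG file signature, used here as stand-in image data.
encoded = to_encoded_image(b"\x89PNG\r\n\x1a\n")
```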