Azure OpenAI

Replace the Azure OpenAI endpoint https://{server}.openai.azure.com/openai/deployments/{model}/... with https://llmfoundry.straive.com/azure/openai/deployments/{model}/.... The rest of the path and the query string (including api-version) stay the same.
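If your client builds the URL dynamically, the substitution can be sketched in Python as below (the server name my-server is a placeholder; only the host and prefix change):

```python
# Sketch: rewrite an Azure OpenAI URL to its LLM Foundry equivalent.
from urllib.parse import urlsplit

def to_llmfoundry(url: str) -> str:
    """Swap the Azure host for the LLM Foundry proxy, keeping path + query."""
    parts = urlsplit(url)
    query = "?" + parts.query if parts.query else ""
    return "https://llmfoundry.straive.com/azure" + parts.path + query

azure_url = (
    "https://my-server.openai.azure.com/openai/deployments/gpt-4o-mini"
    "/chat/completions?api-version=2024-05-01-preview"
)
print(to_llmfoundry(azure_url))
```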

ONLY these models are supported:

  • gpt-4o-mini
  • gpt-4 (points to gpt-4-turbo-2024-04-09)
  • gpt-4-vision-preview
  • text-embedding-3-large
  • text-embedding-3-small
  • text-embedding-ada-002
  • gpt-35-turbo (points to gpt-35-turbo-0125)
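If client code needs to know which snapshot an alias resolves to, the two aliases above can be kept as a small lookup table (a convenience sketch for your own code, not part of the API):

```python
# Alias -> underlying model snapshot, per the list above.
AZURE_ALIASES = {
    "gpt-4": "gpt-4-turbo-2024-04-09",
    "gpt-35-turbo": "gpt-35-turbo-0125",
}

def resolve(model: str) -> str:
    """Return the snapshot an alias points to, or the name unchanged."""
    return AZURE_ALIASES.get(model, model)

print(resolve("gpt-4"))        # gpt-4-turbo-2024-04-09
print(resolve("gpt-4o-mini"))  # gpt-4o-mini (no alias; passed through)
```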

Curl

curl -X POST "https://llmfoundry.straive.com/azure/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-05-01-preview" \
  -H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role":"user","content":"What is 2 + 2?"}]}'

curl -X POST "https://llmfoundry.straive.com/azure/openai/deployments/gpt-4-vision-preview/chat/completions?api-version=2024-05-01-preview" \
  -H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":[{"type":"image_url","image_url":{"url":"https://upload.wikimedia.org/wikipedia/en/7/7d/Lenna_%28test_image%29.png"}},{"type":"text","text":"Who is this?"}]}]}'

curl -X POST "https://llmfoundry.straive.com/azure/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-05-01-preview" \
  -H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
  -H "Content-Type: application/json" \
  -d '{"input": ["Alpha", "Beta", "Gamma"], "encoding_format": "base64"}'
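With "encoding_format": "base64", each embedding in the response is a base64 string of little-endian float32 values rather than a JSON array of numbers. A stdlib-only decoding sketch (the round-trip below uses a toy vector, not a real API response):

```python
# Sketch: decode a base64-encoded embedding into a list of floats.
import base64
import struct

def decode_embedding(b64: str) -> list[float]:
    """Unpack little-endian float32 values from a base64 string."""
    raw = base64.b64decode(b64)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))

# Round-trip a toy vector to show the format (not a real response):
vec = [0.25, -1.5, 3.0]
encoded = base64.b64encode(struct.pack("<3f", *vec)).decode()
print(decode_embedding(encoded))  # [0.25, -1.5, 3.0]
```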

Python requests

import os
import requests  # Or replace requests with httpx

response = requests.post(
    "https://llmfoundry.straive.com/azure/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-05-01-preview",
    headers={"Authorization": f"Bearer {os.environ['LLMFOUNDRY_TOKEN']}:my-test-project"},
    json={"messages": [{"role": "user", "content": "What is 2 + 2?"}]},
)
print(response.json())
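The JSON body follows the standard chat-completion shape: the assistant's reply sits under choices[0].message.content. A parsing sketch against an illustrative payload (the values below are made up, not a captured response):

```python
# Illustrative payload mirroring the chat-completion response shape.
data = {
    "choices": [
        {
            "message": {"role": "assistant", "content": "2 + 2 = 4"},
            "finish_reason": "stop",
        }
    ]
}
answer = data["choices"][0]["message"]["content"]
print(answer)  # 2 + 2 = 4
```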

JavaScript

const token = process.env.LLMFOUNDRY_TOKEN;
const response = await fetch("https://llmfoundry.straive.com/azure/openai/deployments/gpt-4o-mini/chat/completions?api-version=2024-05-01-preview", {
  method: "POST",
  headers: { "Content-Type": "application/json", Authorization: `Bearer ${token}:my-test-project` },
  // If the user is already logged into LLM Foundry, use `credentials: "include"` to send **THEIR** API token instead of the `Authorization` header.
  // credentials: "include",
  body: JSON.stringify({ messages: [{ role: "user", content: "What is 2 + 2?" }] }),
});
console.log(await response.json());

Python OpenAI

import os
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=f'{os.environ.get("LLMFOUNDRY_TOKEN")}:my-test-project',
    api_version="2024-05-01-preview",
    azure_endpoint="https://llmfoundry.straive.com/azure",
)
# Rest of your code is the same
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
)
print(response.choices[0].message.content)

LangChain

import os
from langchain_openai import AzureChatOpenAI

os.environ["OPENAI_API_VERSION"] = "2024-05-01-preview"
os.environ["AZURE_OPENAI_API_KEY"] = f'{os.environ.get("LLMFOUNDRY_TOKEN")}:my-test-project'
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://llmfoundry.straive.com/azure"

# Chat (AzureChatOpenAI reads the AZURE_OPENAI_* variables; ChatOpenAI would ignore them)
chat_model = AzureChatOpenAI(azure_deployment="gpt-4o-mini")
print(chat_model.invoke("What is 2 + 2?").content)