Azure AI

To route requests through LLM Foundry, replace the Azure AI serverless base URL https://{endpoint}.{location}.inference.ai.azure.com/ with https://llmfoundry.straive.com/azureai/{model}/..., keeping the rest of the request path unchanged.
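The rewrite above only swaps the host prefix for the proxy base URL plus a model name. A minimal sketch of that mapping, assuming the request path after the host stays the same (the helper name `to_llmfoundry_url` and the example endpoint are illustrative, not part of the API):

```python
from urllib.parse import urlsplit

def to_llmfoundry_url(azure_url: str, model: str) -> str:
    """Rewrite an Azure AI serverless endpoint URL to the LLM Foundry proxy.

    Maps https://{endpoint}.{location}.inference.ai.azure.com/<path>
    to   https://llmfoundry.straive.com/azureai/{model}/<path>
    """
    path = urlsplit(azure_url).path  # keep the original request path as-is
    return f"https://llmfoundry.straive.com/azureai/{model}{path}"

print(to_llmfoundry_url(
    "https://myendpoint.eastus2.inference.ai.azure.com/v1/chat/completions",
    "phi-3-mini-4k",
))
# → https://llmfoundry.straive.com/azureai/phi-3-mini-4k/v1/chat/completions
```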

ONLY the following models are supported; all are served serverless from eastus2:

  • llama-3-70b
  • llama-3-8b
  • phi-3-mini-4k
  • phi-3-medium-128k

Curl

curl -X POST "https://llmfoundry.straive.com/azureai/phi-3-mini-4k/v1/chat/completions" \
  -H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What is 2 + 2?"}]}'

Python requests

import os
import requests  # Or replace requests with httpx

response = requests.post(
    "https://llmfoundry.straive.com/azureai/phi-3-mini-4k/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['LLMFOUNDRY_TOKEN']}:my-test-project"},
    json={"messages": [{"role": "user", "content": "What is 2 + 2?"}]},
    timeout=30,  # avoid hanging indefinitely on network issues
)
response.raise_for_status()  # surface 4xx/5xx errors instead of parsing an error body
print(response.json())
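The endpoint returns an OpenAI-style chat completions payload, so the reply text lives under `choices[0].message.content`. A minimal sketch of pulling it out, using a hypothetical sample response (field values here are illustrative; verify the shape against your actual responses):

```python
# Sample payload in the OpenAI-style chat completions schema
# (illustrative values, not a real API response).
sample = {
    "choices": [
        {
            "message": {"role": "assistant", "content": "2 + 2 = 4"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 7, "total_tokens": 16},
}

def extract_reply(payload: dict) -> str:
    """Return the assistant's text from a chat completions response."""
    return payload["choices"][0]["message"]["content"]

print(extract_reply(sample))  # → 2 + 2 = 4
```

In the requests example above, this would be `extract_reply(response.json())` instead of printing the whole JSON body.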