Azure AI
Replace https://{endpoint}.{location}.inference.ai.azure.com/ with https://llmfoundry.straive.com/azureai/{model}/...
ONLY these models are supported; they are served serverless from eastus2:
llama-3-70b
llama-3-8b
phi-3-mini-4k
phi-3-medium-128k
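The endpoint rewrite above can be sketched as a small helper. This is an illustrative sketch, not part of the service; the function name and default path are assumptions, and the path suffix follows the chat-completions examples below.

```python
def foundry_url(model: str, path: str = "v1/chat/completions") -> str:
    """Build the LLM Foundry proxy URL for a supported serverless model.

    `foundry_url` is a hypothetical helper; only the URL pattern comes
    from the docs above.
    """
    return f"https://llmfoundry.straive.com/azureai/{model}/{path}"

print(foundry_url("phi-3-mini-4k"))
# https://llmfoundry.straive.com/azureai/phi-3-mini-4k/v1/chat/completions
```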
Curl
curl -X POST "https://llmfoundry.straive.com/azureai/phi-3-mini-4k/v1/chat/completions" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json" \
-d '{"messages": [{"role":"user","content":"What is 2 + 2"}]}'
Python requests
import os
import requests # Or replace requests with httpx
response = requests.post(
    "https://llmfoundry.straive.com/azureai/phi-3-mini-4k/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['LLMFOUNDRY_TOKEN']}:my-test-project"},
    json={"messages": [{"role": "user", "content": "What is 2 + 2?"}]},
)
print(response.json())
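Assuming the proxy returns the OpenAI-style chat-completions schema (which the v1/chat/completions path suggests, but the docs above do not state outright), the assistant's reply can be pulled out of the parsed JSON like this; the sample payload here is illustrative, not a real response.

```python
# Illustrative response shape, assuming the OpenAI-compatible
# chat-completions schema; an actual response carries more fields.
data = {
    "choices": [
        {"message": {"role": "assistant", "content": "2 + 2 = 4"}}
    ]
}

# The first choice's message content holds the model's reply.
answer = data["choices"][0]["message"]["content"]
print(answer)  # 2 + 2 = 4
```

In the requests example above, `data` would come from `response.json()`.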