Vertex AI
Replace https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT}/locations/{LOCATION}/publishers/{PUBLISHER}/models/{MODEL}:{ACTION}
with https://llmfoundry.straive.com/vertexai/{PUBLISHER}/models/{MODEL}:{ACTION}
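The mapping can be sketched as a small helper. Assumptions: foundry_url is a hypothetical name for illustration, and the PROJECT and LOCATION parts of the original URL are presumably handled on the proxy side, since they do not appear in the proxy URL.

```python
def foundry_url(publisher: str, model: str, action: str) -> str:
    # PROJECT and LOCATION from the Vertex AI URL are dropped here;
    # the proxy presumably supplies them server-side.
    return (
        "https://llmfoundry.straive.com/vertexai/"
        f"{publisher}/models/{model}:{action}"
    )

print(foundry_url("google", "gemini-1.5-flash", "generateContent"))
# https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent
```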
Pick the latest model from the Google Model Garden, such as:
- claude-3-haiku@20240307
- claude-3-5-sonnet-v2@20241022
- claude-3-opus
- gemini-1.5-flash-8b
- gemini-1.5-flash
- gemini-1.5-pro
- multimodalembedding@001
- ... etc.
Curl
curl -X POST "https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json; charset=utf-8" \
-d '{"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}}'
curl -X POST "https://llmfoundry.straive.com/vertexai/anthropic/models/claude-3-haiku@20240307:rawPredict" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json; charset=utf-8" \
-d '{"anthropic_version": "vertex-2023-10-16", "max_tokens": 256, "messages": [{"role": "user","content": [{"type": "text", "text": "What is 2 + 2?"}]}]}'
curl -X POST "https://llmfoundry.straive.com/vertexai/meta/models/llama-3.2-90b-vision-instruct-maas:generateContent" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json; charset=utf-8" \
-d '{"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}}'
Python requests
import os
import requests  # or replace requests with httpx

response = requests.post(
    "https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent",
    headers={"Authorization": f"Bearer {os.environ['LLMFOUNDRY_TOKEN']}:my-test-project"},
    json={"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}},
)
print(response.json())
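Whichever client you use, a generateContent response nests the generated text under candidates. A minimal sketch of extracting it, assuming the standard Gemini response shape (gemini_text is a hypothetical helper name, and data is a trimmed example, not a full response):

```python
def gemini_text(data: dict) -> str:
    # First candidate, first text part; real responses may have
    # multiple candidates and parts.
    return data["candidates"][0]["content"]["parts"][0]["text"]

# Trimmed example of the response shape (real responses also carry
# usageMetadata, safetyRatings, finishReason, etc.).
data = {"candidates": [{"content": {"role": "model", "parts": [{"text": "2 + 2 = 4"}]}}]}
print(gemini_text(data))  # 2 + 2 = 4
```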