Vertex AI
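Replace the Vertex AI endpoint https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT}/locations/{LOCATION}/publishers/{PUBLISHER}/models/{MODEL}:{ACTION}
with https://llmfoundry.straive.com/vertexai/{PUBLISHER}/models/{MODEL}:{ACTION}. The project and location segments are dropped; only the publisher, model and action are kept.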
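The rewrite rule above can be sketched as a small helper. This is only an illustration: the function name `to_llmfoundry_url` and the regular expression are assumptions, not part of LLM Foundry, and the helper looks only at the `/publishers/.../models/...:ACTION` tail of the URL.

```python
import re

LLMFOUNDRY_BASE = "https://llmfoundry.straive.com/vertexai"

# Matches the trailing /publishers/{PUBLISHER}/models/{MODEL}:{ACTION} part.
# Model names may contain "@" (e.g. claude-3-haiku@20240307), so only "/" and ":" are excluded.
_TAIL = re.compile(
    r"/publishers/(?P<publisher>[^/]+)/models/(?P<model>[^/:]+):(?P<action>[A-Za-z]+)$"
)


def to_llmfoundry_url(vertex_url: str) -> str:
    """Rewrite a Vertex AI model URL to its LLM Foundry equivalent (hypothetical helper)."""
    match = _TAIL.search(vertex_url)
    if match is None:
        raise ValueError(f"Not a Vertex AI publisher model URL: {vertex_url!r}")
    return (
        f"{LLMFOUNDRY_BASE}/{match['publisher']}"
        f"/models/{match['model']}:{match['action']}"
    )
```

Calling it on a URL ending in `/publishers/google/models/gemini-1.5-flash:generateContent` should return `https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent`.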
Pick the latest model from the Vertex AI Model Garden, such as:
- claude-3-haiku@20240307
- claude-3-5-sonnet-v2@20241022
- claude-3-opus
- gemini-1.5-flash-8b
- gemini-1.5-flash
- gemini-1.5-pro
- multimodalembedding@001
- ... etc.
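Each model has to be paired with its publisher and action in the URL. As a rough sketch, the pairings used by the curl examples in this section can be captured in a lookup table. The names `ROUTES` and `model_url` are hypothetical, and the table covers only the pairings those examples show; models such as multimodalembedding@001 are deliberately left out because this section does not say which action they use.

```python
BASE_URL = "https://llmfoundry.straive.com/vertexai"

# (model-name prefix, publisher, action), taken from the curl examples in this section.
ROUTES = [
    ("gemini-", "google", "generateContent"),
    ("claude-", "anthropic", "rawPredict"),
    ("llama-", "meta", "generateContent"),
]


def model_url(model: str) -> str:
    """Build the LLM Foundry URL for a model name (hypothetical helper)."""
    for prefix, publisher, action in ROUTES:
        if model.startswith(prefix):
            return f"{BASE_URL}/{publisher}/models/{model}:{action}"
    # Unknown families are rejected rather than guessed.
    raise ValueError(f"No known publisher/action for model {model!r}")
```

For example, `model_url("claude-3-haiku@20240307")` yields the same URL as the Claude curl call in this section.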
Curl
curl -X POST "https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent" \
  -H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d '{"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}}'
curl -X POST "https://llmfoundry.straive.com/vertexai/anthropic/models/claude-3-haiku@20240307:rawPredict" \
  -H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d '{"anthropic_version": "vertex-2023-10-16", "max_tokens": 256, "messages": [{"role": "user","content": [{"type": "text", "text": "What is 2 + 2?"}]}]}'
curl -X POST "https://llmfoundry.straive.com/vertexai/meta/models/llama-3.2-90b-vision-instruct-maas:generateContent" \
  -H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d '{"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}}'
Python requests
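import os
import requests  # Or replace requests with httpx

# The token and project name are sent together as "<token>:<project>"
response = requests.post(
    "https://llmfoundry.straive.com/vertexai/google/models/gemini-1.0-pro:generateContent",
    headers={"Authorization": f"Bearer {os.environ['LLMFOUNDRY_TOKEN']}:my-test-project"},
    json={"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}},
    timeout=60,  # Avoid hanging indefinitely on a stalled connection
)
response.raise_for_status()  # Raise on 4xx/5xx responses
print(response.json())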
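The printed JSON is the full response body, so most callers will want only the generated text. The sketch below assumes LLM Foundry returns the upstream response body unchanged: a `candidates` list for Gemini `generateContent` calls and a `content` list of blocks for Claude `rawPredict` calls. The helper names `gemini_text` and `claude_text` are hypothetical.

```python
def gemini_text(body: dict) -> str:
    """Join the text parts of the first candidate in a generateContent response."""
    parts = body["candidates"][0]["content"]["parts"]
    # Non-text parts carry no "text" key and contribute nothing.
    return "".join(part.get("text", "") for part in parts)


def claude_text(body: dict) -> str:
    """Join the text blocks of an Anthropic Messages response."""
    return "".join(
        block["text"] for block in body["content"] if block.get("type") == "text"
    )
```

With the request above, `print(gemini_text(response.json()))` would print just the model's answer instead of the whole payload.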