Vertex AI
Replace https://{LOCATION}.googleapis.com/v1/projects/{PROJECT}/locations/{LOCATION}/publishers/{PUBLISHER}/models/{MODEL}:{ACTION}
with https://llmfoundry.straive.com/vertexai/{PUBLISHER}/models/{MODEL}:{ACTION}.
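Since the swap is purely mechanical, a small helper can rewrite any Vertex AI publisher-model endpoint into its LLM Foundry equivalent. This is a sketch, not part of LLM Foundry itself; the function name and regex are illustrative, and the host pattern follows the URL template above:

```python
import re

def to_llmfoundry(vertex_url: str) -> str:
    """Rewrite a Vertex AI publisher-model endpoint to the LLM Foundry proxy."""
    pattern = (
        r"https://[\w.-]+\.googleapis\.com/v1"
        r"/projects/[^/]+/locations/[^/]+"
        r"/publishers/([^/]+)/models/([^/]+)"
    )
    m = re.match(pattern, vertex_url)
    if not m:
        raise ValueError(f"not a Vertex AI model endpoint: {vertex_url}")
    publisher, model_action = m.group(1), m.group(2)
    return f"https://llmfoundry.straive.com/vertexai/{publisher}/models/{model_action}"
```

The `{MODEL}:{ACTION}` suffix (e.g. `gemini-1.5-flash:generateContent`) is carried over unchanged.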
Pick the latest model from the Google Model Garden, such as:
- claude-3-haiku@20240307
- claude-3-5-sonnet-v2@20241022
- claude-3-opus
- gemini-1.5-flash-8b
- gemini-1.5-flash
- gemini-1.5-pro
- multimodalembedding@001
- ... etc.
Curl
curl -X POST "https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json; charset=utf-8" \
-d '{"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}}'
curl -X POST "https://llmfoundry.straive.com/vertexai/anthropic/models/claude-3-haiku@20240307:rawPredict" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json; charset=utf-8" \
-d '{"anthropic_version": "vertex-2023-10-16", "max_tokens": 256, "messages": [{"role": "user","content": [{"type": "text", "text": "What is 2 + 2?"}]}]}'
curl -X POST "https://llmfoundry.straive.com/vertexai/meta/models/llama-3.2-90b-vision-instruct-maas:generateContent" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json; charset=utf-8" \
-d '{"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}}'
Python requests
import os
import requests # Or replace requests with httpx
response = requests.post(
    "https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent",
    headers={"Authorization": f"Bearer {os.environ['LLMFOUNDRY_TOKEN']}:my-test-project"},
    json={"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}},
)
print(response.json())
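Rather than printing the whole JSON body, you will usually want just the generated text. Assuming the standard Gemini generateContent response shape (a `candidates` list whose first entry holds `content.parts`), it can be pulled out like this; the sample payload is illustrative, trimmed to the fields used:

```python
def first_text(body: dict) -> str:
    """Extract the first text part from a generateContent response body."""
    return body["candidates"][0]["content"]["parts"][0]["text"]

# Illustrative response body; a real one carries extra fields such as usageMetadata.
sample = {"candidates": [{"content": {"role": "model", "parts": [{"text": "2 + 2 = 4"}]}}]}
print(first_text(sample))  # 2 + 2 = 4
```

In the example above you would call `first_text(response.json())`.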