Vertex AI
Replace https://{LOCATION}-aiplatform.googleapis.com/v1/projects/{PROJECT}/locations/{LOCATION}/publishers/{PUBLISHER}/models/{MODEL}:{ACTION}
with https://llmfoundry.straive.com/vertexai/{PUBLISHER}/models/{MODEL}:{ACTION}
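The mapping can be sketched as a small helper. Assumptions: foundry_url is a hypothetical name for illustration, and the PROJECT and LOCATION parts of the original URL are presumably handled on the proxy side, since they do not appear in the proxy URL.

```python
def foundry_url(publisher: str, model: str, action: str) -> str:
    # PROJECT and LOCATION from the Vertex AI URL are dropped here;
    # the proxy presumably supplies them server-side.
    return (
        "https://llmfoundry.straive.com/vertexai/"
        f"{publisher}/models/{model}:{action}"
    )

print(foundry_url("google", "gemini-1.5-flash", "generateContent"))
# https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent
```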
Pick the latest model from the Google Model Garden, such as:
- claude-3-haiku@20240307
- claude-3-5-sonnet-v2@20241022
- claude-3-opus
- gemini-1.5-flash-8b
- gemini-1.5-flash
- gemini-1.5-pro
- multimodalembedding@001
- ... etc.
Curl
curl -X POST "https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json; charset=utf-8" \
-d '{"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}}'
curl -X POST "https://llmfoundry.straive.com/vertexai/anthropic/models/claude-3-haiku@20240307:rawPredict" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json; charset=utf-8" \
-d '{"anthropic_version": "vertex-2023-10-16", "max_tokens": 256, "messages": [{"role": "user","content": [{"type": "text", "text": "What is 2 + 2?"}]}]}'
curl -X POST "https://llmfoundry.straive.com/vertexai/meta/models/llama-3.2-90b-vision-instruct-maas:generateContent" \
-H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
-H "Content-Type: application/json; charset=utf-8" \
-d '{"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}}'
Python requests
import os
import requests  # or replace requests with httpx

response = requests.post(
    "https://llmfoundry.straive.com/vertexai/google/models/gemini-1.5-flash:generateContent",
    headers={"Authorization": f"Bearer {os.environ['LLMFOUNDRY_TOKEN']}:my-test-project"},
    json={"contents": {"role": "user", "parts": {"text": "What is 2 + 2?"}}},
)
print(response.json())
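Whichever client you use, a generateContent response nests the generated text under candidates. A minimal sketch of extracting it, assuming the standard Gemini response shape (gemini_text is a hypothetical helper name, and data is a trimmed example, not a full response):

```python
def gemini_text(data: dict) -> str:
    # First candidate, first text part; real responses may have
    # multiple candidates and parts.
    return data["candidates"][0]["content"]["parts"][0]["text"]

# Trimmed example of the response shape (real responses also carry
# usageMetadata, safetyRatings, finishReason, etc.).
data = {"candidates": [{"content": {"role": "model", "parts": [{"text": "2 + 2 = 4"}]}}]}
print(gemini_text(data))  # 2 + 2 = 4
```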