Using LLM Foundry in Production

You can use LLM Foundry in production, in both internal and client projects.

Is my data confidential?

If the data is sensitive, use the safe, private models hosted on the Straive tenant in Google and Azure. We have their written assurance that data is not stored, not transmitted outside our tenant, and not used to train models.

LLM Foundry itself just routes requests to these models and does not have any models or intelligence of its own.

What is the uptime?

LLM Foundry itself has an uptime of ~99.5%. That means it could be down ~7 minutes/day on average, for maintenance or bugs. Check that your client is OK with this.
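The downtime figure follows directly from the uptime percentage. A minimal sketch of the arithmetic, in case you need to restate it per month for a client (the function name is illustrative, and a 30-day month is an assumption):

```python
MINUTES_PER_DAY = 24 * 60


def downtime_minutes_per_day(uptime_percent: float) -> float:
    """Average downtime per day implied by an uptime percentage."""
    return (100 - uptime_percent) / 100 * MINUTES_PER_DAY


per_day = downtime_minutes_per_day(99.5)
per_month_hours = per_day * 30 / 60  # assumes a 30-day month
print(f"{per_day:.1f} minutes/day, {per_month_hours:.1f} hours/month")
```

This is an average. Actual downtime may come as one longer outage rather than a few minutes every day.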

LLM Foundry routes requests to Google, Azure, OpenAI, etc., whose models have their own rate limits and downtime. Handle these gracefully in your application.
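One common way to handle rate limits and short outages gracefully is to retry with exponential backoff. The sketch below is not LLM Foundry's API. `TransientError`, `call_with_retries`, and `request_fn` are hypothetical names, and your application would need to map whatever rate-limit or downtime error it actually receives onto `TransientError`.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a rate-limit or temporary-downtime error."""


def call_with_retries(request_fn, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call request_fn, retrying on TransientError with exponential backoff.

    request_fn is any zero-argument callable that performs the request.
    The final failure is re-raised so the caller can fall back or report it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except TransientError:
            if attempt == max_attempts:
                raise
            # Delay doubles each attempt: 1s, 2s, 4s, ... capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add random jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

When the retries are exhausted, show the user a clear message or queue the work for later instead of failing silently.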

Which token should I use?

Don't use your personal token -- the one you see at https://llmfoundry.straive.com/code -- in projects. That charges the cost to you, not the project. Instead:

Create a Google Group (it's free) and add your team members to it. Name your group descriptively, like client-name-project-name@straive.com.
Send the group's email address to s.anand@gramener.com, who will create a new token for it.

How do I track project usage?

Go to https://llmfoundry.straive.com/usage and search for the email ID used for your token. That will show you the usage.

You can use the filters on the page to see usage by date, by model, etc.

How is it billed?

For projects with a usage of $10 or more per month, the project usage report is sent to the Finance team.

They will add the cost against your project code.

How much will it cost?

To estimate cost, guess the volume of usage and multiply it by the model cost.

For example, to correct the spelling and grammar of emails from 100 customer service agents, each sending say 100 emails per day of about 200 words, you would need to process about 100 x 100 x 200 = 2,000,000 words per day. That's ~2.7M tokens. (A token is about 3/4 of a word.)

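So, on GPT-4o-mini, priced at $0.15 per million input tokens and $0.60 per million output tokens, this might cost $0.15 x 2.7 for the input and $0.60 x 2.7 for the output (the corrected emails are about as long as the originals), which is about $2 per day, or $60 per month.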

Double the cost to allow for errors, development, validation, etc. Then add 18% GST to this.
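The estimate above can be written as a small calculator to try other volumes or models. This is a sketch under stated assumptions: roughly 4 tokens for every 3 words, output about as long as the input, a 30-day month, and prices quoted per million tokens. The function and parameter names are illustrative.

```python
def estimate_monthly_cost(
    agents: int,
    emails_per_agent_per_day: int,
    words_per_email: int,
    input_price_per_million: float,
    output_price_per_million: float,
    tokens_per_word: float = 4 / 3,   # assumption: a token is ~3/4 of a word
    days_per_month: int = 30,         # assumption
    buffer_multiplier: float = 2.0,   # doubling for errors, development, validation
    gst_rate: float = 0.18,
) -> dict:
    """Return daily, monthly, buffered and GST-inclusive cost estimates in USD."""
    words_per_day = agents * emails_per_agent_per_day * words_per_email
    million_tokens_per_day = words_per_day * tokens_per_word / 1_000_000
    # Assumption: output is about the same number of tokens as input.
    daily = million_tokens_per_day * (input_price_per_million + output_price_per_million)
    monthly = daily * days_per_month
    buffered = monthly * buffer_multiplier
    return {
        "daily": daily,
        "monthly": monthly,
        "buffered": buffered,
        "with_gst": buffered * (1 + gst_rate),
    }


estimate = estimate_monthly_cost(100, 100, 200, 0.15, 0.60)
for name, value in estimate.items():
    print(f"{name}: ${value:.2f}")
```

For the example above, this gives about $2 per day and $60 per month, about $120 after doubling, and roughly $142 per month including GST.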