Azure Form Recognizer

Access Azure Form Recognizer by sending a POST request to the /azureformrecognizer/analyze endpoint. The JSON body names a model and carries the document as a base64 data URI.

The following models are supported:

  • prebuilt-read
  • prebuilt-layout
  • prebuilt-document
  • prebuilt-businessCard
  • prebuilt-contract
  • prebuilt-healthInsuranceCard.us
  • prebuilt-idDocument
  • prebuilt-invoice
  • prebuilt-receipt
  • prebuilt-marriageCertificate.us
  • prebuilt-creditCard
  • prebuilt-check.us
  • prebuilt-payStub.us
  • prebuilt-bankStatement
  • prebuilt-mortgage.us.1003
  • prebuilt-mortgage.us.1004
  • prebuilt-mortgage.us.1005
  • prebuilt-mortgage.us.1008
  • prebuilt-mortgage.us.closingDisclosure
  • prebuilt-tax.us
  • prebuilt-tax.us.w2
  • prebuilt-tax.us.1098
  • prebuilt-tax.us.1098E
  • prebuilt-tax.us.1098T
  • prebuilt-tax.us.1099 (variations)
  • prebuilt-tax.us.1040 (variations)
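
The list above can be turned into a quick client-side check before a request is sent. The following is a minimal sketch, not part of LLM Foundry: the helper name is_listed_model is hypothetical, and matching the 1099 and 1040 variations by prefix is an assumption, since this page does not spell those variations out. The server remains the authority on which models it accepts.

```python
# Model IDs copied from the list above.
FIXED_MODELS = {
    "prebuilt-read", "prebuilt-layout", "prebuilt-document",
    "prebuilt-businessCard", "prebuilt-contract",
    "prebuilt-healthInsuranceCard.us", "prebuilt-idDocument",
    "prebuilt-invoice", "prebuilt-receipt",
    "prebuilt-marriageCertificate.us", "prebuilt-creditCard",
    "prebuilt-check.us", "prebuilt-payStub.us", "prebuilt-bankStatement",
    "prebuilt-mortgage.us.1003", "prebuilt-mortgage.us.1004",
    "prebuilt-mortgage.us.1005", "prebuilt-mortgage.us.1008",
    "prebuilt-mortgage.us.closingDisclosure",
    "prebuilt-tax.us", "prebuilt-tax.us.w2", "prebuilt-tax.us.1098",
    "prebuilt-tax.us.1098E", "prebuilt-tax.us.1098T",
}

# Assumption: every 1099 / 1040 variation starts with one of these prefixes.
VARIATION_PREFIXES = ("prebuilt-tax.us.1099", "prebuilt-tax.us.1040")


def is_listed_model(model: str) -> bool:
    """Return True if `model` matches an entry in the list above (hypothetical helper)."""
    return model in FIXED_MODELS or model.startswith(VARIATION_PREFIXES)
```

Calling is_listed_model("prebuilt-layout") before building the request catches typos in the model name locally instead of after a round trip.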

Curl

curl -X POST https://llmfoundry.straive.com/azureformrecognizer/analyze \
  -H "Authorization: Bearer $LLMFOUNDRY_TOKEN:my-test-project" \
  -H "Content-Type: application/json" \
  -d "{\"model\": \"prebuilt-layout\", \"document\": \"data:application/pdf;base64,$(base64 -w 0 -i input.pdf)\"}"

Python requests

import base64
import os
import requests  # Or replace requests with httpx

# Read the PDF and base64-encode it for the data URI.
with open("input.pdf", "rb") as pdf_file:
    pdf_base64 = base64.b64encode(pdf_file.read()).decode("utf-8")
# Send the model name and the encoded document as a JSON body.
response = requests.post(
    "https://llmfoundry.straive.com/azureformrecognizer/analyze",
    headers={"Authorization": f"Bearer {os.environ['LLMFOUNDRY_TOKEN']}:my-test-project"},
    json={"model": "prebuilt-layout", "document": f"data:application/pdf;base64,{pdf_base64}"}
)
print(response.json())
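
The request body in the example above has two fields: model, and document as a data:application/pdf;base64,... URI. If several PDFs are sent, that construction could be factored into a small function. This is a sketch only: build_analyze_payload is a hypothetical name, and it covers only the PDF case shown on this page.

```python
import base64


def build_analyze_payload(pdf_bytes: bytes, model: str = "prebuilt-layout") -> dict:
    """Build the JSON body used in the examples above (hypothetical helper)."""
    # base64 output is pure ASCII, so decoding it as ASCII is safe.
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "model": model,
        "document": f"data:application/pdf;base64,{encoded}",
    }
```

The returned dict can be passed directly as the json= argument of requests.post in the example above.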

JavaScript

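import fs from "node:fs"; // Required for fs.promises.readFile; run as an ES module so top-level await works

const token = process.env.LLMFOUNDRY_TOKEN;
// readFile already returns a Buffer, so it can be base64-encoded directly.
const pdf_base64 = (await fs.promises.readFile("input.pdf")).toString("base64");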
const response = await fetch("https://llmfoundry.straive.com/azureformrecognizer/analyze", {
  method: "POST",
  headers: { "Content-Type": "application/json", Authorization: `Bearer ${token}:my-test-project` },
  // In a browser, if the user is already logged into LLM Foundry, you can omit the `Authorization` header:
  // `credentials: "include"` then sends THEIR API token instead.
  credentials: "include",
  body: JSON.stringify({
    model: "prebuilt-layout",
    document: `data:application/pdf;base64,${pdf_base64}`,
  }),
});
console.log(await response.json());