Budget enforcement for AI inference workloads. Three lines of code. Results in 60 seconds.
Get running in under 60 seconds
pip install lutflow

from lutflow import Client
import openai
client = Client(tenant_id="acme", budget_usd=10.00)
wrapped = client.wrap(openai.OpenAI())
response = wrapped.chat.completions.create(model="gpt-4o", messages=[{"role": "user", "content": "Hello!"}])

No API key required for local mode
The SDK works entirely offline. Add an API key later to enable cloud reporting and the full kill path.
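To make offline budget tracking concrete, here is a minimal sketch of local cost accounting. It is not lutflow's implementation: the `LocalBudget` class, the `PRICE_PER_1K_TOKENS` table, the model name, and the prices are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per 1,000 tokens (made-up numbers, not real rates).
PRICE_PER_1K_TOKENS = {"example-model": {"input": 0.005, "output": 0.015}}


@dataclass
class LocalBudget:
    """Tracks spend in memory, with no network calls (illustrative only)."""

    budget_usd: float
    spent_usd: float = 0.0

    def record(self, model: str, input_tokens: int, output_tokens: int) -> float:
        """Add the cost of one request and return the running total."""
        price = PRICE_PER_1K_TOKENS[model]
        cost = (input_tokens / 1000) * price["input"]
        cost += (output_tokens / 1000) * price["output"]
        self.spent_usd += cost
        return self.spent_usd

    @property
    def exceeded(self) -> bool:
        """True once accumulated spend is above the budget."""
        return self.spent_usd > self.budget_usd


budget = LocalBudget(budget_usd=10.00)
budget.record("example-model", input_tokens=2000, output_tokens=1000)
print(f"spent=${budget.spent_usd:.4f} exceeded={budget.exceeded}")
```

Because everything lives in process memory, a tracker like this needs no API key, but it also loses its state when the process exits.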
Supports chat completions, embeddings, and streaming.
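Streaming is the trickiest of these to meter, because cost accrues chunk by chunk while the caller is already consuming output. One way a wrapper could handle this is a pass-through generator. The sketch below is an assumption about the general technique, not lutflow's code; `metered_stream`, `cost_of`, and the per-character price are hypothetical.

```python
from typing import Callable, Iterable, Iterator


def metered_stream(
    chunks: Iterable[str],
    cost_of: Callable[[str], float],
    budget_usd: float,
) -> Iterator[str]:
    """Yield chunks unchanged while tracking spend; raise once over budget."""
    spent = 0.0
    for chunk in chunks:
        spent += cost_of(chunk)
        if spent > budget_usd:
            raise RuntimeError(
                f"budget exceeded mid-stream: ${spent:.4f} > ${budget_usd:.4f}"
            )
        yield chunk


# Toy input and a toy cost model: a flat, made-up price per character.
fake_stream = ["Hel", "lo", "!"]
received = list(
    metered_stream(fake_stream, cost_of=lambda c: 0.001 * len(c), budget_usd=1.00)
)
print("".join(received))
```

The caller iterates over the result exactly as it would over the raw stream; chunks delivered before the limit is crossed have already been yielded by the time the error is raised.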
RAISE_ERROR   Raises BudgetExceededError (default)
WARN_ONLY     Logs warning, continues execution
CALLBACK      Calls your custom function
SELF_KILL     Sends SIGKILL to the process
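The four strategies above amount to a dispatch on what happens once spend crosses the limit. The following is a rough, self-contained sketch of that dispatch. It does not import lutflow: `Strategy`, `BudgetExceeded`, and `handle_exceeded` are hypothetical stand-ins for `BudgetStrategy`, `BudgetExceededError`, and whatever the SDK does internally.

```python
import logging
import os
import signal
from enum import Enum
from typing import Callable, Optional


class Strategy(Enum):  # hypothetical stand-in for BudgetStrategy
    RAISE_ERROR = "raise_error"
    WARN_ONLY = "warn_only"
    CALLBACK = "callback"
    SELF_KILL = "self_kill"


class BudgetExceeded(Exception):  # hypothetical stand-in for BudgetExceededError
    pass


def handle_exceeded(
    strategy: Strategy,
    spent: float,
    budget: float,
    callback: Optional[Callable[[float, float], None]] = None,
) -> None:
    """React to an exceeded budget according to the chosen strategy."""
    if strategy is Strategy.RAISE_ERROR:
        raise BudgetExceeded(f"spent ${spent:.2f} of ${budget:.2f}")
    if strategy is Strategy.WARN_ONLY:
        logging.warning("Budget exceeded: $%.2f / $%.2f", spent, budget)
    elif strategy is Strategy.CALLBACK:
        if callback is None:
            raise ValueError("CALLBACK strategy requires a callback")
        callback(spent, budget)
    elif strategy is Strategy.SELF_KILL:
        # SIGKILL cannot be caught, so no cleanup code runs (POSIX only).
        os.kill(os.getpid(), signal.SIGKILL)


# Usage: collect alerts in a list instead of sending them anywhere.
alerts = []
handle_exceeded(
    Strategy.CALLBACK, 10.50, 10.00, callback=lambda s, b: alerts.append((s, b))
)
print(alerts)
```

Note the trade-off in the last branch: because a process cannot trap SIGKILL, `finally` blocks and `atexit` handlers are skipped and buffered output may be lost, so that option suits runaway jobs better than services that need a graceful shutdown.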
from lutflow import Client, BudgetStrategy
# Option 1: Raise error (default)
client = Client(
    tenant_id="acme",
    budget_usd=10.00,
    on_budget_exceeded=BudgetStrategy.RAISE_ERROR,
)

# Option 2: Warning only
client = Client(
    tenant_id="acme",
    budget_usd=10.00,
    on_budget_exceeded=BudgetStrategy.WARN_ONLY,
)

# Option 3: Custom callback
def my_handler(spent: float, budget: float):
    # send_slack_alert is a placeholder for your own notification helper
    send_slack_alert(f"Budget exceeded: ${spent:.2f} / ${budget:.2f}")

client = Client(
    tenant_id="acme",
    budget_usd=10.00,
    on_budget_exceeded=BudgetStrategy.CALLBACK,
    on_exceeded_callback=my_handler,
)

# Install
pip install lutflow
# GPU price lookup
lutflow lookup --gpu nvidia-l4
# Model recommendation
lutflow recommend --task text-classification --budget 0.50
# Live dashboard (simulated)
lutflow watch

pip install lutflow              Core + CLI
pip install lutflow[openai]      With OpenAI wrapper
pip install lutflow[anthropic]   With Anthropic wrapper
pip install lutflow[gemini]      With Google Gemini wrapper
pip install lutflow[all]         All providers + Kafka transport