The fastest AI inference client for Python, built with Rust for unmatched speed, efficiency, and scalability.
When standard Python HTTP clients became the bottleneck in LLM inference pipelines, Bhumi was forged from Rust and Python — a hybrid engine that speaks to every major LLM provider. Where other clients wait on network and serialization overhead, Bhumi's Rust-powered core tears through API calls with up to 60% less memory and 2-3x the throughput.
```python
import asyncio

from bhumi import AsyncClient

async def main():
    # Create a client bound to the OpenAI provider
    client = AsyncClient(provider="openai")

    # Send a chat request and await the completed response
    response = await client.chat(
        model="gpt-4",
        messages=[
            {"role": "user", "content": "Hello, world"}
        ]
    )
    print(response.content)

asyncio.run(main())
```