The fastest AI inference client for Python, built with Rust for unmatched speed, efficiency, and scalability.
When standard Python HTTP clients became the bottleneck in LLM inference pipelines, Bhumi was forged from Rust and Python — a hybrid engine that speaks to every major LLM provider. Where other clients wait on network and serialization overhead, Bhumi's Rust-powered core tears through API calls with up to 60% less memory and 2-3x the throughput.
```python
import asyncio

from bhumi import AsyncClient

async def main():
    # Create a client bound to the OpenAI provider
    client = AsyncClient(provider="openai")

    # Send a chat request and await the completed response
    response = await client.chat(
        model="gpt-4",
        messages=[
            {"role": "user", "content": "Hello, world"}
        ]
    )
    print(response.content)

asyncio.run(main())
```