AI Client/Python · Rust

Bhumi

The fastest AI inference client for Python, built with Rust for unmatched speed, efficiency, and scalability.

Origin

When standard Python HTTP clients became the bottleneck in LLM inference pipelines, Bhumi was forged from Rust and Python — a hybrid engine that speaks to every major LLM provider at speed. Where others wait on network overhead, Bhumi's Rust-powered core tears through API calls with 60% less memory and 2-3x the throughput of standard Python clients.

Attributes

  • 2-3x faster
  • 60% less memory
  • Python + Rust

Capabilities

  • Async-first architecture built for concurrent LLM calls
  • Rust-powered HTTP core bypasses Python's GIL limitations
  • Multi-provider support: OpenAI, Anthropic, Gemini, Groq, SambaNova
  • Intelligent connection pooling and request batching
  • Streaming response support with zero-copy buffers
  • Automatic retries with exponential backoff
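The retry behavior described above can be sketched in plain Python. This is an illustrative standalone helper, not Bhumi's internal implementation; the `max_retries` and `base_delay` parameters are assumptions for the sketch:

```python
import asyncio
import random

async def with_retries(call, max_retries=3, base_delay=0.5):
    """Retry an async call with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except ConnectionError:
            if attempt == max_retries:
                raise
            # Delay doubles each attempt (0.5s, 1s, 2s, ...) plus jitter
            # so concurrent clients don't retry in lockstep.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            await asyncio.sleep(delay)

# Example: a flaky call that fails twice, then succeeds.
attempts = {"n": 0}

async def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = asyncio.run(with_retries(flaky, base_delay=0.01))
print(result)  # ok
```

Exponential backoff with jitter is the standard way to recover from transient provider errors without hammering the API during an outage.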

Invoke

python
import asyncio

from bhumi import AsyncClient

async def main():
    client = AsyncClient(provider="openai")

    # Send a single chat completion request
    response = await client.chat(
        model="gpt-4",
        messages=[
            {"role": "user", "content": "Hello, world"}
        ]
    )

    print(response.content)

asyncio.run(main())
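Streaming responses are typically consumed as an async iterator, printing chunks as they arrive instead of waiting for the full reply. The sketch below stubs a token generator to show the consumption pattern; the `stream_chat` name is illustrative, not Bhumi's documented API:

```python
import asyncio

# Stub standing in for a streaming chat call; each item is a text chunk.
async def stream_chat(prompt):
    for chunk in ["Hello", ", ", "world"]:
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield chunk

async def main():
    pieces = []
    async for chunk in stream_chat("Hello, world"):
        pieces.append(chunk)
        # In practice: print(chunk, end="", flush=True) for live output
    return "".join(pieces)

result = asyncio.run(main())
print(result)  # Hello, world
```

The same `async for` loop works for any provider, since the client normalizes each backend's wire format into a uniform chunk stream.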