# Mistral Client

The `MistralClient` provides a simple interface for interacting with Mistral AI's language models. It supports both synchronous and asynchronous requests, multi-turn conversations, and multimodal input with the Pixtral model.
## Quick Start

```python
from maticlib.llm.mistral import MistralClient

# Initialize the client
client = MistralClient(api_key="YOUR_MISTRAL_API_KEY")

# Make a request
response = client.complete("What is the best French cheese?")
print(response.content)
```
## Class: MistralClient

### Constructor Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `"mistral-large-latest"` | The Mistral model to use |
| `api_key` | `str` | `None` | Mistral API key (or use the `MISTRAL_API_KEY` env var) |
| `verbose` | `bool` | `True` | Enable detailed logging |
| `return_raw` | `bool` | `False` | Return raw JSON instead of a Pydantic model |
### Available Models

- `mistral-large-latest` - Most capable model (recommended)
- `mistral-medium-latest` - Balanced performance and cost
- `mistral-small-latest` - Fast and economical
- `pixtral-12b-latest` - Multimodal model (text + images)
## Methods

### complete()

Make a synchronous completion request.

```python
def complete(input: Union[str, List]) -> Union[MistralResponse, Dict[str, Any]]
```

**Example:**

```python
response = client.complete("Write a poem about coding")
print(response.content)
print(f"Tokens used: {response.total_tokens}")
```
### async_complete()

Make an asynchronous completion request.

```python
async def async_complete(input: Union[str, List]) -> Union[MistralResponse, Dict[str, Any]]
```

**Example:**

```python
import asyncio

async def main():
    response = await client.async_complete("Tell me about France")
    print(response.content)

asyncio.run(main())
```
### get_text_response()

Helper method to extract the text content from a response.

```python
def get_text_response(response: Union[MistralResponse, Dict]) -> str
```
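**Example** (a minimal sketch, assuming the helper is called on the client instance, as its listing under `MistralClient`'s methods implies):

```python
response = client.complete("Summarize Mistral AI in one sentence")

# Accepts either the Pydantic model or the raw dict, per the signature above
text = client.get_text_response(response)
print(text)
```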
## Response Model

### MistralResponse

The Pydantic model returned by default.

**Attributes:**

- `content` (str) - Extracted text response
- `content_parts` (List[ContentPart]) - Multimodal content parts
- `finish_reason` (str) - Completion status
- `prompt_tokens` (int) - Input token count
- `completion_tokens` (int) - Output token count
- `total_tokens` (int) - Total tokens used
- `response_id` (str) - Unique response identifier
- `model_version` (str) - Model used
- `raw_response` (dict) - Original API response
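**Example** (reading several of the documented fields):

```python
response = client.complete("Explain tokenization in two sentences")

print(response.content)
print(response.finish_reason)
print(f"{response.prompt_tokens} in / {response.completion_tokens} out "
      f"= {response.total_tokens} total tokens")
print(response.model_version)
```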
## Usage Examples

### Different Models

```python
from maticlib.llm.mistral import MistralClient

# Large model for complex tasks
large_client = MistralClient(
    model="mistral-large-latest",
    api_key="YOUR_KEY"
)

# Small model for simple tasks
small_client = MistralClient(
    model="mistral-small-latest",
    api_key="YOUR_KEY"
)

response = small_client.complete("Hello!")
print(response.content)
```
### Multi-turn Conversations

```python
from maticlib.messages import HumanMessage, AIMessage

conversation = [
    HumanMessage("Bonjour!"),
    AIMessage("Hello! How can I help you today?"),
    HumanMessage("Tell me about Paris")
]

response = client.complete(conversation)
print(response.content)
```
### Using Dictionaries

```python
messages = [
    {"role": "user", "content": "What is Mistral AI?"},
    {"role": "assistant", "content": "Mistral AI is..."},
    {"role": "user", "content": "Tell me more"}
]

response = client.complete(messages)
print(response.content)
```
### Multimodal with Pixtral

```python
client = MistralClient(
    model="pixtral-12b-latest",
    api_key="YOUR_KEY"
)

# Note: image support requires base64 encoding.
# This is a placeholder example.
response = client.complete("Describe this image")
print(response.content)
```
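As a sketch of what a real image request could look like: the snippet below assumes `complete()` forwards message dicts to Mistral's chat API unchanged, so a Pixtral message can carry an `image_url` content part holding a base64 data URI (the format the Mistral API documents). Whether `MistralClient` accepts this message shape is an assumption here, not a documented guarantee.

```python
import base64

# Assumption: message dicts are passed through to the Mistral chat API,
# which accepts image_url content parts containing base64 data URIs.
with open("photo.jpg", "rb") as f:
    b64_image = base64.b64encode(f.read()).decode("utf-8")

messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image"},
        {"type": "image_url", "image_url": f"data:image/jpeg;base64,{b64_image}"}
    ]
}]

response = client.complete(messages)
print(response.content)
```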
### Async Concurrent Requests

```python
import asyncio
from maticlib.llm.mistral import MistralClient

async def process_multiple():
    client = MistralClient(api_key="YOUR_KEY")

    tasks = [
        client.async_complete("Write a haiku about AI"),
        client.async_complete("Explain machine learning"),
        client.async_complete("What is a neural network?")
    ]

    responses = await asyncio.gather(*tasks)

    for i, response in enumerate(responses, 1):
        print(f"\nResponse {i}:")
        print(response.content)

asyncio.run(process_multiple())
```
## Error Handling

```python
import httpx

from maticlib.llm.mistral import MistralClient

try:
    client = MistralClient(api_key="YOUR_KEY")
    response = client.complete("Your prompt")
    print(response.content)
except ValueError as e:
    print(f"Configuration error: {e}")
except httpx.HTTPStatusError as e:
    print(f"API error: {e.response.status_code}")
except Exception as e:
    print(f"Unexpected error: {e}")
```
## Environment Variables

```bash
# Set the API key
export MISTRAL_API_KEY="your-api-key"
```

```python
# Use the client without passing a key
from maticlib.llm.mistral import MistralClient

client = MistralClient()  # Automatically uses MISTRAL_API_KEY
```
## Best Practices

- Use environment variables for API keys
- Choose the appropriate model based on task complexity
- Use `mistral-small-latest` for simple tasks to reduce costs
- Enable verbose mode during development
- Use async methods for concurrent requests
- Implement retry logic with exponential backoff (see Rate Limits below)
- Monitor token usage for cost optimization
- Cache responses when appropriate (see the sketch after this list)
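A minimal caching sketch for the last point, assuming identical prompt strings may reuse a prior answer; `cached_complete` is a hypothetical helper, not part of the library:

```python
from functools import lru_cache

from maticlib.llm.mistral import MistralClient

client = MistralClient(api_key="YOUR_KEY")

# Hypothetical helper: caches by exact prompt string, so it is only
# appropriate for repeatable prompts where a reused answer is acceptable.
@lru_cache(maxsize=256)
def cached_complete(prompt: str) -> str:
    return client.complete(prompt).content

print(cached_complete("What is Mistral AI?"))  # calls the API
print(cached_complete("What is Mistral AI?"))  # served from the cache
```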
## Model Comparison

| Model | Best For | Speed | Cost |
|---|---|---|---|
| `mistral-large-latest` | Complex reasoning, analysis | Medium | Higher |
| `mistral-medium-latest` | Balanced tasks | Fast | Medium |
| `mistral-small-latest` | Simple tasks, high volume | Very fast | Lower |
| `pixtral-12b-latest` | Image understanding | Fast | Medium |
## Rate Limits

The Mistral API enforces rate limits that depend on your subscription tier. Implement retry logic with exponential backoff:

```python
import time

import httpx

def complete_with_retry(client, prompt, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.complete(prompt)
        except httpx.HTTPStatusError as e:
            if e.response.status_code == 429:
                if attempt == max_retries - 1:
                    raise
                wait_time = 2 ** attempt  # exponential backoff: 1s, 2s, 4s, ...
                print(f"Rate limited. Waiting {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise
```
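For example:

```python
client = MistralClient(api_key="YOUR_KEY")
response = complete_with_retry(client, "Summarize the history of Provence")
print(response.content)
```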