Customize Timeouts and Rate Limits
Configure timeouts and retries directly on your LLM client when using the collections API with llm_factory.
OpenAI Client Configuration
from openai import AsyncOpenAI
from ragas.llms import llm_factory
from ragas.metrics.collections import Faithfulness
# Configure timeout and retries on the client
client = AsyncOpenAI(
timeout=60.0, # 60 second timeout
max_retries=5, # Retry up to 5 times on failures
)
llm = llm_factory("gpt-4o-mini", client=client)
# Use with metrics
scorer = Faithfulness(llm=llm)
result = scorer.score(
user_input="When was the first super bowl?",
response="The first superbowl was held on Jan 15, 1967",
retrieved_contexts=[
"The First AFL–NFL World Championship Game was an American football game played on January 15, 1967, at the Los Angeles Memorial Coliseum in Los Angeles."
]
)
Available Options
| Parameter | Default | Description |
|---|---|---|
timeout |
600.0 | Request timeout in seconds |
max_retries |
2 | Number of retry attempts for failed requests |
Fine-Grained Timeout Control
For more control over different timeout types:
import httpx
from openai import AsyncOpenAI
client = AsyncOpenAI(
timeout=httpx.Timeout(
60.0, # Total timeout
connect=5.0, # Connection timeout
read=30.0, # Read timeout
write=10.0, # Write timeout
),
max_retries=3,
)
Provider Documentation
Each LLM provider has its own client configuration options. Refer to your provider's SDK documentation:
Legacy Metrics API
The following examples use the legacy metrics API pattern with RunConfig. For new projects, we recommend using the collections-based API with client-level configuration as shown above.
Deprecation Timeline
This API will be deprecated in version 0.4 and removed in version 1.0. Please migrate to the collections-based API.
RunConfig Parameters
from ragas.run_config import RunConfig
run_config = RunConfig(
timeout=180, # Max seconds per operation (default: 180)
max_retries=10, # Retry attempts (default: 10)
max_wait=60, # Max seconds between retries (default: 60)
max_workers=16, # Concurrent workers (default: 16)
log_tenacity=False, # Log retry attempts (default: False)
seed=42, # Random seed (default: 42)
)
Usage with Evaluate
from langchain_openai import ChatOpenAI
from ragas.llms import LangchainLLMWrapper
from ragas import EvaluationDataset, SingleTurnSample, evaluate
from ragas.metrics import Faithfulness
from ragas.run_config import RunConfig
# Legacy LLM setup
llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-4o"))
# Configure run settings
run_config = RunConfig(max_workers=64, timeout=60)
# Use with evaluate
results = evaluate(
dataset=eval_dataset,
metrics=[Faithfulness(llm=llm)],
run_config=run_config,
)