LLMs
BaseRagasLLM
dataclass
BaseRagasLLM(run_config: RunConfig = RunConfig(), multiple_completion_supported: bool = False, cache: Optional[CacheInterface] = None)
Bases: ABC
get_temperature
is_finished
abstractmethod
generate
async
generate(prompt: PromptValue, n: int = 1, temperature: Optional[float] = 0.01, stop: Optional[List[str]] = None, callbacks: Callbacks = None) -> LLMResult
Generate text using the given event loop.
Source code in src/ragas/llms/base.py
InstructorBaseRagasLLM
Bases: ABC
Base class for LLMs using the Instructor library pattern.
generate
abstractmethod
Generate a response using the configured LLM.
For async clients, this will run the async method in the appropriate event loop.
Source code in src/ragas/llms/base.py
agenerate
abstractmethod
async
Asynchronously generate a response using the configured LLM.
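The relationship between `generate` and `agenerate` described above can be sketched with the standard library alone. This is a minimal illustration, not the Ragas implementation: `SketchBaseLLM` and `EchoLLM` are hypothetical stand-ins, prompts are plain strings, and the synchronous path assumes no event loop is already running in the calling thread.

```python
import asyncio
from abc import ABC, abstractmethod


class SketchBaseLLM(ABC):
    """Hypothetical stand-in for a generate/agenerate pair (not the Ragas class)."""

    @abstractmethod
    def generate(self, prompt: str) -> str:
        ...

    @abstractmethod
    async def agenerate(self, prompt: str) -> str:
        ...


class EchoLLM(SketchBaseLLM):
    """Toy subclass whose 'model' simply upper-cases the prompt."""

    async def agenerate(self, prompt: str) -> str:
        await asyncio.sleep(0)  # placeholder for a real network call
        return prompt.upper()

    def generate(self, prompt: str) -> str:
        # Assumption: no event loop is running in this thread.
        # asyncio.run raises RuntimeError if one already is.
        return asyncio.run(self.agenerate(prompt))


llm = EchoLLM()
sync_result = llm.generate("hello")
async_result = asyncio.run(llm.agenerate("hello"))
```

Inside a notebook or an async web handler a loop is usually already running, so a real implementation needs a different bridging strategy than the plain `asyncio.run` call used here.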
InstructorLLM
InstructorLLM(client: Any, model: str, provider: str, model_args: Optional[InstructorModelArgs] = None, cache: Optional[CacheInterface] = None, **kwargs)
Bases: InstructorBaseRagasLLM
LLM wrapper using the Instructor library for structured outputs.
Source code in src/ragas/llms/base.py
generate
Generate a response using the configured LLM.
For async clients, this will run the async method in the appropriate event loop.
Source code in src/ragas/llms/base.py
agenerate
async
Asynchronously generate a response using the configured LLM.
Source code in src/ragas/llms/base.py
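To illustrate what the optional `cache` argument is for, here is a small sketch of response caching. It assumes a cache object exposing `get`, `set`, and `has_key`; `InMemoryCache`, `make_key`, `fake_llm_call`, and `cached_generate` are hypothetical names for this example and do not come from Ragas, so check `ragas.cache` for the actual `CacheInterface` contract.

```python
import hashlib
import json
from typing import Any, Dict, List


class InMemoryCache:
    """Hypothetical cache with get/set/has_key; it does not subclass the Ragas interface."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def get(self, key: str) -> Any:
        return self._store.get(key)

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value

    def has_key(self, key: str) -> bool:
        return key in self._store


def make_key(model: str, prompt: str) -> str:
    """Derive a stable key from the request parameters."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


call_log: List[str] = []  # records every prompt that reaches the fake backend


def fake_llm_call(prompt: str) -> str:
    """Stand-in for an expensive LLM request."""
    call_log.append(prompt)
    return f"answer to: {prompt}"


def cached_generate(cache: InMemoryCache, model: str, prompt: str) -> str:
    """Return a cached answer when available, otherwise call the backend and store it."""
    key = make_key(model, prompt)
    if cache.has_key(key):
        return cache.get(key)
    result = fake_llm_call(prompt)
    cache.set(key, result)
    return result


cache = InMemoryCache()
first = cached_generate(cache, "gpt-4o-mini", "What is RAG?")
second = cached_generate(cache, "gpt-4o-mini", "What is RAG?")
```

The second call is served from the cache, so the backend is contacted only once for the repeated prompt.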
HaystackLLMWrapper
HaystackLLMWrapper(haystack_generator: Union[AzureOpenAIGenerator, HuggingFaceAPIGenerator, HuggingFaceLocalGenerator, OpenAIGenerator], run_config: Optional[RunConfig] = None, cache: Optional[CacheInterface] = None)
Bases: BaseRagasLLM
A wrapper class for using Haystack LLM generators within the Ragas framework.
This class integrates Haystack's LLM components (e.g., OpenAIGenerator,
HuggingFaceAPIGenerator, etc.) into Ragas, enabling both synchronous and
asynchronous text generation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| haystack_generator | AzureOpenAIGenerator \| HuggingFaceAPIGenerator \| HuggingFaceLocalGenerator \| OpenAIGenerator | An instance of a Haystack generator. | required |
| run_config | RunConfig | Configuration object to manage LLM execution settings, by default None. | None |
| cache | CacheInterface | A cache instance for storing results, by default None. | None |
Source code in src/ragas/llms/haystack_wrapper.py
LiteLLMStructuredLLM
LiteLLMStructuredLLM(client: Any, model: str, provider: str, cache: Optional[CacheInterface] = None, system_prompt: Optional[str] = None, **kwargs)
Bases: InstructorBaseRagasLLM
LLM wrapper using LiteLLM for structured outputs.
Works with all 100+ LiteLLM-supported providers including Gemini, Ollama, vLLM, Groq, and many others.
The LiteLLM client should be initialized with structured output support.
Args:
    client: LiteLLM client instance
    model: Model name (e.g., "gemini-2.0-flash")
    provider: Provider name
    cache: Optional cache backend for caching LLM responses
    system_prompt: Optional system prompt to prepend to all messages
    **kwargs: Additional model arguments (temperature, max_tokens, etc.)
Source code in src/ragas/llms/litellm_llm.py
generate
Generate a response using the configured LLM.
For async clients, this will run the async method in the appropriate event loop.
Args:
    prompt: Input prompt
    response_model: Pydantic model for structured output
Returns:
    Instance of response_model with generated data
Source code in src/ragas/llms/litellm_llm.py
agenerate
async
Asynchronously generate a response using the configured LLM.
Args:
    prompt: Input prompt
    response_model: Pydantic model for structured output
Returns:
    Instance of response_model with generated data
Source code in src/ragas/llms/litellm_llm.py
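The "instance of response_model" contract can be pictured as parsing the model's raw reply into a typed object and rejecting incomplete replies. The sketch below is only an analogy: it uses a standard-library dataclass in place of a Pydantic model, and `Verdict` and `parse_structured` are hypothetical names, not part of Ragas or LiteLLM.

```python
import json
from dataclasses import dataclass, fields
from typing import Type, TypeVar

T = TypeVar("T")


@dataclass
class Verdict:
    """Hypothetical response model; with Ragas this would be a Pydantic BaseModel."""

    label: str
    score: float


def parse_structured(raw: str, response_model: Type[T]) -> T:
    """Turn raw JSON text into an instance of response_model, rejecting missing fields."""
    data = json.loads(raw)
    expected = {f.name for f in fields(response_model)}
    missing = expected - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    # Extra keys in the reply are ignored; only declared fields are passed on.
    return response_model(**{name: data[name] for name in expected})


raw_reply = '{"label": "faithful", "score": 0.9}'
verdict = parse_structured(raw_reply, Verdict)
```

A Pydantic model would additionally validate and coerce field types, which this dataclass stand-in does not attempt.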
OCIGenAIWrapper
OCIGenAIWrapper(model_id: str, compartment_id: str, config: Optional[Dict[str, Any]] = None, endpoint_id: Optional[str] = None, run_config: Optional[RunConfig] = None, cache: Optional[Any] = None, default_system_prompt: Optional[str] = None, client: Optional[Any] = None)
Bases: BaseRagasLLM
OCI Gen AI LLM wrapper for Ragas.
This wrapper provides direct integration with Oracle Cloud Infrastructure Generative AI services without requiring LangChain or LlamaIndex.
Args:
    model_id: The OCI model ID to use for generation
    compartment_id: The OCI compartment ID
    config: OCI configuration dictionary (optional, uses default if not provided)
    endpoint_id: Optional endpoint ID for the model
    run_config: Ragas run configuration
    cache: Optional cache backend
Source code in src/ragas/llms/oci_genai_wrapper.py
generate_text
generate_text(prompt: PromptValue, n: int = 1, temperature: Optional[float] = 0.01, stop: Optional[List[str]] = None, callbacks: Optional[Any] = None) -> LLMResult
Generate text using OCI Gen AI.
Source code in src/ragas/llms/oci_genai_wrapper.py
agenerate_text
async
agenerate_text(prompt: PromptValue, n: int = 1, temperature: Optional[float] = 0.01, stop: Optional[List[str]] = None, callbacks: Optional[Any] = None) -> LLMResult
Generate text asynchronously using OCI Gen AI.
Source code in src/ragas/llms/oci_genai_wrapper.py
is_finished
Check if the LLM response is finished/complete.
Source code in src/ragas/llms/oci_genai_wrapper.py
llm_factory
llm_factory(model: str, provider: str = 'openai', client: Optional[Any] = None, adapter: str = 'auto', cache: Optional[CacheInterface] = None, **kwargs: Any) -> InstructorBaseRagasLLM
Create an LLM instance for structured output generation with automatic adapter selection.
Supports multiple LLM providers and structured output backends with unified interface for both sync and async operations. Returns instances with .generate() and .agenerate() methods that accept Pydantic models for structured outputs.
Auto-detects the best adapter for your provider:
- Google Gemini → uses LiteLLM adapter
- Other providers → uses Instructor adapter (default)
- Explicit control available via adapter parameter
Args:
    model: Model name (e.g., "gpt-4o", "claude-3-sonnet", "gemini-2.0-flash").
    provider: LLM provider (default: "openai"). Examples: openai, anthropic, google, groq, mistral, etc.
    client: Pre-initialized client instance (required). For OpenAI, can be OpenAI(...) or AsyncOpenAI(...).
    adapter: Structured output adapter to use (default: "auto").
        - "auto": Auto-detect based on provider/client (recommended)
        - "instructor": Use Instructor library
        - "litellm": Use LiteLLM (supports 100+ providers)
    cache: Optional cache backend for caching LLM responses. Pass DiskCacheBackend() for persistent caching across runs. Saves costs and speeds up repeated evaluations by 60x.
    **kwargs: Additional model arguments (temperature, max_tokens, top_p, etc).
Returns: InstructorBaseRagasLLM: Instance with generate() and agenerate() methods.
Raises: ValueError: If client is missing, provider is unsupported, model is invalid, or adapter initialization fails.
Examples:
    # The snippets below assume `prompt` and a Pydantic model
    # `ResponseModel` are defined elsewhere.
    from openai import OpenAI

    # Basic usage
    client = OpenAI(api_key="...")
    llm = llm_factory("gpt-4o-mini", client=client)
    response = llm.generate(prompt, ResponseModel)

    # With caching (recommended for experiments)
    from ragas.cache import DiskCacheBackend
    cache = DiskCacheBackend()
    llm = llm_factory("gpt-4o-mini", client=client, cache=cache)

    # Anthropic
    from anthropic import Anthropic
    client = Anthropic(api_key="...")
    llm = llm_factory("claude-3-sonnet", provider="anthropic", client=client)

    # Google Gemini (auto-detects litellm adapter)
    from litellm import OpenAI as LiteLLMClient
    client = LiteLLMClient(api_key="...", model="gemini-2.0-flash")
    llm = llm_factory("gemini-2.0-flash", client=client)

    # Explicit adapter selection
    llm = llm_factory("gemini-2.0-flash", client=client, adapter="litellm")

    # Async (run inside a coroutine)
    from openai import AsyncOpenAI
    client = AsyncOpenAI(api_key="...")
    llm = llm_factory("gpt-4o-mini", client=client)
    response = await llm.agenerate(prompt, ResponseModel)
Source code in src/ragas/llms/base.py
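The adapter rules listed for `llm_factory` (Gemini goes to LiteLLM, everything else to Instructor, with an explicit override) can be written out as a small decision function. This is a sketch of the documented behaviour only: `pick_adapter` is a hypothetical helper, not a Ragas function, and treating both "google" and "gemini" as Gemini provider names is an assumption.

```python
def pick_adapter(provider: str, adapter: str = "auto") -> str:
    """Hypothetical helper mirroring the documented rules, not the Ragas implementation."""
    if adapter != "auto":
        # Explicit control: the caller's choice wins, provided it is a known adapter.
        if adapter not in ("instructor", "litellm"):
            raise ValueError(f"unknown adapter: {adapter!r}")
        return adapter
    # Assumption: Gemini is identified by the provider names "google" or "gemini".
    if provider.lower() in ("google", "gemini"):
        return "litellm"
    # Documented default for all other providers.
    return "instructor"


default_choice = pick_adapter("openai")
gemini_choice = pick_adapter("google")
forced_choice = pick_adapter("google", adapter="instructor")
```

The real factory is also described as considering the client when auto-detecting, which this provider-only sketch leaves out.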
oci_genai_factory
oci_genai_factory(model_id: str, compartment_id: str, config: Optional[Dict[str, Any]] = None, endpoint_id: Optional[str] = None, run_config: Optional[RunConfig] = None, cache: Optional[Any] = None, default_system_prompt: Optional[str] = None, client: Optional[Any] = None) -> OCIGenAIWrapper
Factory function to create an OCI Gen AI LLM instance.
Args:
    model_id: The OCI model ID to use for generation
    compartment_id: The OCI compartment ID
    config: OCI configuration dictionary (optional)
    endpoint_id: Optional endpoint ID for the model
    run_config: Ragas run configuration
    **kwargs: Additional arguments passed to OCIGenAIWrapper
Returns: OCIGenAIWrapper: An instance of the OCI Gen AI LLM wrapper
Examples:
    # Basic usage with default config
    llm = oci_genai_factory(
        model_id="cohere.command",
        compartment_id="ocid1.compartment.oc1..example"
    )

    # With custom config
    llm = oci_genai_factory(
        model_id="cohere.command",
        compartment_id="ocid1.compartment.oc1..example",
        config={"user": "user_ocid", "key_file": "~/.oci/private_key.pem"}
    )