Graph

NodeType

Bases: str, Enum

Enumeration of node types in the knowledge graph.

Currently supported node types are UNKNOWN, DOCUMENT, and CHUNK.
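
Because NodeType derives from both str and Enum, its members behave like plain strings. The sketch below illustrates that with a stand-in enum. The class name `SketchNodeType` and the member values are assumptions made for illustration, since this page lists only the member names.

```python
import json
from enum import Enum


class SketchNodeType(str, Enum):
    # Hypothetical stand-in: member values are assumed, not taken from the library.
    UNKNOWN = "unknown"
    DOCUMENT = "document"
    CHUNK = "chunk"


# A str-based enum member compares equal to its plain string value.
is_document = SketchNodeType.DOCUMENT == "document"

# A member can be recovered from its string value, e.g. after reading JSON.
parsed = SketchNodeType("chunk")

# Members serialize as ordinary JSON strings without a custom encoder.
encoded = json.dumps({"type": SketchNodeType.CHUNK})
```

This string behavior is what lets a node's type be written to and read back from a JSON file without extra conversion code.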

Node

Bases: BaseModel

Represents a node in the knowledge graph.

Attributes:

Name	Type	Description
`id`	`UUID`	Unique identifier for the node.
`properties`	`dict`	Dictionary of properties associated with the node.
`type`	`NodeType`	Type of the node.

add_property

add_property(key: str, value: Any)

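Adds a property to the node. The key is lowercased before it is stored, so keys that differ only in case refer to the same property.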

Raises:

Type	Description
`ValueError`	If a property with the same key (compared case-insensitively) already exists.

Source code in src/ragas/testset/graph.py

def add_property(self, key: str, value: t.Any):
    """
    Adds a property to the node.

    Raises
    ------
    ValueError
        If the property already exists.
    """
    if key.lower() in self.properties:
        raise ValueError(f"Property {key} already exists")
    self.properties[key.lower()] = value

get_property

get_property(key: str) -> Optional[Any]

Retrieves a property value by key. Returns None if the property does not exist.

Notes

The key is case-insensitive.

Source code in src/ragas/testset/graph.py

def get_property(self, key: str) -> t.Optional[t.Any]:
    """
    Retrieves a property value by key.

    Notes
    -----
    The key is case-insensitive.
    """
    return self.properties.get(key.lower(), None)
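
The case handling of the two methods above can be sketched with a small stand-in class. `SketchNode` is hypothetical and uses a standard-library dataclass rather than the library's BaseModel. Only the two method bodies are copied from the listings above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchNode:
    """Hypothetical stand-in that mirrors add_property and get_property."""

    id: uuid.UUID = field(default_factory=uuid.uuid4)
    properties: Dict[str, Any] = field(default_factory=dict)

    def add_property(self, key: str, value: Any) -> None:
        if key.lower() in self.properties:
            raise ValueError(f"Property {key} already exists")
        self.properties[key.lower()] = value

    def get_property(self, key: str) -> Optional[Any]:
        return self.properties.get(key.lower(), None)


node = SketchNode()
node.add_property("Title", "Intro")

# Lookups ignore case because the key is lowercased on both write and read.
title = node.get_property("TITLE")

# A missing key yields None instead of raising.
missing = node.get_property("summary")

# Adding the same key again, in any casing, raises ValueError.
try:
    node.add_property("title", "Other")
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# Keys placed directly in the dict are not lowercased, so a mixed-case key
# supplied at construction time is not found by get_property.
preset = SketchNode(properties={"Title": "Intro"})
preset_lookup = preset.get_property("Title")
```

The last case suggests passing lowercase keys when filling `properties` directly, or going through `add_property` so that the key is normalized.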

Relationship

Bases: BaseModel

Represents a relationship between two nodes in a knowledge graph.

Attributes:

Name	Type	Description
`id`	`(UUID, optional)`	Unique identifier for the relationship. Defaults to a new UUID.
`type`	`str`	The type of the relationship.
`source`	`Node`	The source node of the relationship.
`target`	`Node`	The target node of the relationship.
`bidirectional`	`(bool, optional)`	Whether the relationship is bidirectional. Defaults to False.
`properties`	`(dict, optional)`	Dictionary of properties associated with the relationship. Defaults to an empty dict.

get_property

get_property(key: str) -> Optional[Any]

Retrieves a property value by key. The key is case-insensitive. Returns None if the property does not exist.

Source code in src/ragas/testset/graph.py

def get_property(self, key: str) -> t.Optional[t.Any]:
    """
    Retrieves a property value by key. The key is case-insensitive.
    """
    return self.properties.get(key.lower(), None)

KnowledgeGraph `dataclass`

KnowledgeGraph(nodes: List[Node] = list(), relationships: List[Relationship] = list())

Represents a knowledge graph containing nodes and relationships.

Attributes:

Name	Type	Description
`nodes`	`List[Node]`	List of nodes in the knowledge graph.
`relationships`	`List[Relationship]`	List of relationships in the knowledge graph.

add

add(item: Union[Node, Relationship])

Adds a node or relationship to the knowledge graph.

Raises:

Type	Description
`ValueError`	If the item type is not Node or Relationship.

Source code in src/ragas/testset/graph.py

def add(self, item: t.Union[Node, Relationship]):
    """
    Adds a node or relationship to the knowledge graph.

    Raises
    ------
    ValueError
        If the item type is not Node or Relationship.
    """
    if isinstance(item, Node):
        self._add_node(item)
    elif isinstance(item, Relationship):
        self._add_relationship(item)
    else:
        raise ValueError(f"Invalid item type: {type(item)}")

save

save(path: Union[str, Path])

Saves the knowledge graph to a JSON file.

Source code in src/ragas/testset/graph.py

def save(self, path: t.Union[str, Path]):
    """Saves the knowledge graph to a JSON file."""
    if isinstance(path, str):
        path = Path(path)

    data = {
        "nodes": [node.model_dump() for node in self.nodes],
        "relationships": [rel.model_dump() for rel in self.relationships],
    }
    with open(path, "w") as f:
        json.dump(data, f, cls=UUIDEncoder, indent=2)

load `classmethod`

load(path: Union[str, Path]) -> KnowledgeGraph

Loads a knowledge graph from a JSON file at the given path, such as one written by save.

Source code in src/ragas/testset/graph.py

@classmethod
def load(cls, path: t.Union[str, Path]) -> "KnowledgeGraph":
    """Loads a knowledge graph from a path."""
    if isinstance(path, str):
        path = Path(path)

    with open(path, "r") as f:
        data = json.load(f)

    nodes = [Node(**node_data) for node_data in data["nodes"]]
    relationships = [Relationship(**rel_data) for rel_data in data["relationships"]]

    kg = cls()
    kg.nodes.extend(nodes)
    kg.relationships.extend(relationships)
    return kg
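
The save and load listings reference a `UUIDEncoder` that is not shown on this page. The sketch below shows one plausible way such an encoder could work and how a UUID survives a JSON round trip. `SketchUUIDEncoder` and the sample data are assumptions for illustration, not the library's implementation.

```python
import json
import tempfile
import uuid
from pathlib import Path


class SketchUUIDEncoder(json.JSONEncoder):
    """Hypothetical encoder that writes UUID objects as strings."""

    def default(self, o):
        if isinstance(o, uuid.UUID):
            return str(o)
        # Defer to the base class, which raises TypeError for unknown types.
        return super().default(o)


node_id = uuid.uuid4()
data = {
    "nodes": [{"id": node_id, "properties": {"title": "Intro"}, "type": "document"}],
    "relationships": [],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "kg.json"

    # Same shape as save: dump the dict with the custom encoder.
    with open(path, "w") as f:
        json.dump(data, f, cls=SketchUUIDEncoder, indent=2)

    # Same shape as load: read the plain JSON back.
    with open(path, "r") as f:
        loaded = json.load(f)

# After loading, the id is a string and has to be parsed back into a UUID.
restored_id = uuid.UUID(loaded["nodes"][0]["id"])
```

In the load listing, that conversion is left to the `Node(**node_data)` and `Relationship(**rel_data)` constructors, which are expected to validate the string ids against their UUID fields.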

find_clusters

find_clusters(relationship_condition: Callable[[Relationship], bool] = lambda _: True) -> List[Set[Node]]

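Finds clusters of connected nodes in the knowledge graph, following only the relationships that satisfy a condition. Clusters that contain a single node are omitted from the result.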

Parameters:

Name	Type	Description	Default
`relationship_condition`	`Callable[[Relationship], bool]`	A function that takes a Relationship and returns True if the relationship should be followed when building clusters. By default every relationship is accepted.	`lambda _: True`

Returns:

Type	Description
`List[Set[Node]]`	A list of sets, where each set contains nodes that form a cluster.

Source code in src/ragas/testset/graph.py

def find_clusters(
    self, relationship_condition: t.Callable[[Relationship], bool] = lambda _: True
) -> t.List[t.Set[Node]]:
    """
    Finds clusters of nodes in the knowledge graph based on a relationship condition.

    Parameters
    ----------
    relationship_condition : Callable[[Relationship], bool], optional
        A function that takes a Relationship and returns a boolean, by default lambda _: True

    Returns
    -------
    List[Set[Node]]
        A list of sets, where each set contains nodes that form a cluster.
    """
    clusters = []
    visited = set()

    relationships = [
        rel for rel in self.relationships if relationship_condition(rel)
    ]

    def dfs(node: Node, cluster: t.Set[Node]):
        visited.add(node)
        cluster.add(node)
        for rel in relationships:
            if rel.source == node and rel.target not in visited:
                dfs(rel.target, cluster)
            # if the relationship is bidirectional, we need to check the reverse
            elif (
                rel.bidirectional
                and rel.target == node
                and rel.source not in visited
            ):
                dfs(rel.source, cluster)

    for node in self.nodes:
        if node not in visited:
            cluster = set()
            dfs(node, cluster)
            if len(cluster) > 1:
                clusters.append(cluster)

    return clusters
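
The traversal above can be tried on a toy graph. The sketch below copies the same logic into a free function, `sketch_find_clusters`, and uses frozen dataclasses (`SketchNode`, `SketchRel`) as hashable stand-ins for Node and Relationship. These names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass(frozen=True)
class SketchNode:
    name: str


@dataclass(frozen=True)
class SketchRel:
    source: SketchNode
    target: SketchNode
    type: str = "child"
    bidirectional: bool = False


def sketch_find_clusters(
    nodes: List[SketchNode],
    relationships: List[SketchRel],
    condition: Callable[[SketchRel], bool] = lambda _: True,
) -> List[Set[SketchNode]]:
    clusters: List[Set[SketchNode]] = []
    visited: Set[SketchNode] = set()
    rels = [r for r in relationships if condition(r)]

    def dfs(node: SketchNode, cluster: Set[SketchNode]) -> None:
        visited.add(node)
        cluster.add(node)
        for rel in rels:
            if rel.source == node and rel.target not in visited:
                dfs(rel.target, cluster)
            elif rel.bidirectional and rel.target == node and rel.source not in visited:
                dfs(rel.source, cluster)

    for node in nodes:
        if node not in visited:
            cluster: Set[SketchNode] = set()
            dfs(node, cluster)
            if len(cluster) > 1:  # single-node clusters are dropped
                clusters.append(cluster)
    return clusters


a, b, c, d, e = (SketchNode(n) for n in "abcde")
nodes = [a, b, c, d, e]
rels = [
    SketchRel(a, b, type="child"),                        # one-way: a -> b
    SketchRel(d, c, type="similar", bidirectional=True),  # two-way: d <-> c
]

# All relationships: {a, b} and {c, d}; the isolated node e is omitted.
all_clusters = sketch_find_clusters(nodes, rels)

# Filtering by type keeps only the cluster joined by "child" relationships.
child_only = sketch_find_clusters(nodes, rels, lambda r: r.type == "child")

# One-way relationships are followed only from source to target, so visiting
# the target first leaves both nodes as singletons and no cluster is reported.
target_first = sketch_find_clusters([b, a], [SketchRel(a, b)])
```

With this traversal, the result for one-way relationships can depend on the order of the node list, as the last call shows. Marking a relationship as bidirectional avoids that dependence for the pair it connects.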
