LLM Generated PyReason Rules
Introduction
In this tutorial, we use a Large Language Model (Claude) to generate a valid PyReason rule for a simple knowledge graph. We then validate the generated rule with PyReason’s rule parser and run inference to show it fires on the graph.
Note
Find the full, executable code here
Knowledge Graph
We build a small academic knowledge graph with three types of nodes: students, majors, and departments.
Two predicates connect them: major_in links a student to a major, and in_department links a major to a department.
import networkx as nx

g = nx.DiGraph()

# Student -> Major
g.add_edge('alice', 'math', major_in=1)
g.add_edge('bob', 'math', major_in=1)
g.add_edge('mary', 'cs', major_in=1)

# Major -> Department
g.add_edge('math', 'math_dept', in_department=1)
g.add_edge('cs', 'cs_dept', in_department=1)
The Prompt
The prompt describes a specific reasoning goal: deriving which department a student belongs to. The head predicate name is fixed to ensure consistent, comparable output across LLMs.
PROMPT = """\
You are generating a rule for a PyReason knowledge graph.
### Task
Write a single PyReason rule that derives which department a student belongs to,
given that a student is enrolled in a major and that major belongs to a department.
### Available predicates
- major_in(Student, Major) - student is enrolled in a major
- in_department(Major, Department) - major belongs to a department
### PyReason rule syntax
head_predicate(X,Y) <-N body_predicate_1(X,Z),body_predicate_2(Z,Y)
- N is the delta: use 0 for immediate firing
- Variables are single uppercase letters (X,Y,Z)
- Head predicate name must be: student_in_dept
### Output format
Output the rule on a single line. No explanation, no markdown, no punctuation.
### Example (Different predicates, shows syntax only)
grandparent(X,Y)<-0 parent(X,Z),parent(Z,Y)
"""
Generating the Rule
We call the Anthropic API to send the prompt to Claude, then strip surrounding whitespace from the response text to obtain the rule string.
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=256,
    messages=[{"role": "user", "content": PROMPT}],
)
rule_str = response.content[0].text.strip()
A typical response looks like:
student_in_dept(X,Y)<-0 major_in(X,Z),in_department(Z,Y)
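Models occasionally ignore the output-format instruction and wrap the rule in a Markdown fence or inline backticks. The following is a minimal defensive sketch, assuming that only the first non-empty, non-fence line of the reply matters. `clean_rule` is a hypothetical helper introduced here for illustration; it is not part of PyReason or the Anthropic SDK.

```python
FENCE = "`" * 3  # the Markdown code-fence marker, built without typing it literally


def clean_rule(raw: str) -> str:
    """Return the first non-empty, non-fence line of an LLM reply, without backticks."""
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and fence markers (with or without a language tag)
        if not line or line.startswith(FENCE):
            continue
        # Drop inline backticks that may surround the rule
        return line.strip("`").strip()
    return ""
```

With such a helper, `rule_str = clean_rule(response.content[0].text)` could replace the plain `.strip()` call; a well-formed reply passes through unchanged.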
Validating the Rule
The rule is passed through pr.Rule() to confirm it is syntactically valid
before loading it into the reasoner. If invalid, the script exits immediately.
import sys

import pyreason as pr

try:
    # Parsing raises an exception if the rule string is malformed
    pr.Rule(rule_str)
    print(f"[VALID] {rule_str}")
except Exception as e:
    sys.exit(f"[INVALID] {rule_str}\nError: {e}")
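Syntactic validity does not guarantee that the model followed the prompt's constraints, such as the fixed head predicate name. The sketch below adds a lightweight check for that. It assumes rules of the two-argument shape used in this tutorial; the regular expression is a simplification and not PyReason's full rule grammar, and `check_rule_shape` is a hypothetical helper name.

```python
import re

# One atom with two single-letter uppercase variables, e.g. major_in(X,Z).
# Simplified assumption: this covers only the rule shape used in this tutorial.
ATOM = r"(\w+)\(\s*[A-Z]\s*,\s*[A-Z]\s*\)"
RULE_RE = re.compile(rf"^{ATOM}\s*<-\s*(\d+)\s+(.+)$")


def check_rule_shape(rule: str, head: str, allowed_body: set[str]) -> bool:
    """Check the head predicate name and that body predicates are all allowed."""
    m = RULE_RE.match(rule.strip())
    if m is None:
        return False
    head_name, _delta, body = m.groups()
    # Collect every predicate name that appears in the body
    body_names = re.findall(r"(\w+)\(", body)
    return head_name == head and bool(body_names) and set(body_names) <= allowed_body
```

For example, `check_rule_shape(rule_str, "student_in_dept", {"major_in", "in_department"})` returns `True` for the typical response shown earlier and `False` if the model renames the head predicate.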
Running Inference
Load the valid rule into PyReason with infer_edges=True so that new edges
are created when the rule fires between currently unconnected nodes.
pr.settings.verbose = False
pr.load_graph(g)
pr.add_rule(pr.Rule(rule_str, name="student_in_dept_rule", infer_edges=True))

interpretation = pr.reason(timesteps=2)

print("\nInferred student-department relationships:")
for df in pr.filter_and_sort_edges(interpretation, ["student_in_dept"]):
    if not df.empty:
        print(df.to_string(index=False))
Expected output:
Inferred student-department relationships:
component student_in_dept
0 (alice, math_dept) [1.0, 1.0]
1 (bob, math_dept) [1.0, 1.0]
2 (mary, cs_dept) [1.0, 1.0]
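As a cross-check that does not depend on PyReason, the same student-department pairs can be derived with a plain join over the two edge lists. This is only an illustrative sketch of what the rule computes; the edge lists are copied by hand from the graph above, and `derive_student_in_dept` is a hypothetical helper name.

```python
# Edges copied from the knowledge graph built earlier in this tutorial
major_in = [("alice", "math"), ("bob", "math"), ("mary", "cs")]
in_department = [("math", "math_dept"), ("cs", "cs_dept")]


def derive_student_in_dept(major_in, in_department):
    """Join major_in(X,Z) with in_department(Z,Y) on the shared major Z."""
    depts_of_major = {}
    for major, dept in in_department:
        depts_of_major.setdefault(major, []).append(dept)
    return sorted(
        (student, dept)
        for student, major in major_in
        for dept in depts_of_major.get(major, [])
    )


expected = derive_student_in_dept(major_in, in_department)
print(expected)
# [('alice', 'math_dept'), ('bob', 'math_dept'), ('mary', 'cs_dept')]
```

The three pairs match the components listed in the expected output.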
Cross-LLM Consistency
The same prompt was tested against Claude, GPT-4, and Gemini through their web interfaces. All three produced the same valid rule:
Claude: student_in_dept(X,Y)<-0 major_in(X,Z),in_department(Z,Y)
GPT-4: student_in_dept(X,Y)<-0 major_in(X,Z),in_department(Z,Y)
Gemini: student_in_dept(X,Y)<-0 major_in(X,Z),in_department(Z,Y)
In this test, the well-constrained prompt produced the same valid PyReason rule from all three LLMs, which suggests that tightly specified prompts yield consistent output across models.
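A comparison like this can also be automated once the responses are collected. The sketch below compares rule strings after removing whitespace, so that cosmetic spacing differences do not count as disagreement. It is a purely textual check under that assumption: two rules that differ only in variable names would still be reported as different. The `responses` dictionary simply restates the three rules listed above.

```python
import re


def comparison_key(rule: str) -> str:
    """Remove all whitespace so that spacing differences are ignored."""
    return re.sub(r"\s+", "", rule)


# Rules as reported above, keyed by the model that produced them
responses = {
    "Claude": "student_in_dept(X,Y)<-0 major_in(X,Z),in_department(Z,Y)",
    "GPT-4": "student_in_dept(X,Y)<-0 major_in(X,Z),in_department(Z,Y)",
    "Gemini": "student_in_dept(X,Y)<-0 major_in(X,Z),in_department(Z,Y)",
}

distinct = {comparison_key(rule) for rule in responses.values()}
all_identical = len(distinct) == 1
print(f"All models agree: {all_identical}")
```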