Natural Language to PyReason Tutorial
Welcome to the Natural Language to PyReason tutorial! In this document we outline a pipeline that converts a plain English paragraph into PyReason facts and rules using a Large Language Model (LLM). If you want to combine the flexibility of natural language input with the precision of symbolic reasoning, you’re in the right place!
Note
Find the full, executable code here
The goal of this tutorial is to take a paragraph like this:
Carlos and Emma are both lawyers. All lawyers who win cases regularly tend to
build a strong reputation. Emma wins cases regularly. Anyone with a strong
reputation is likely to attract high-profile clients. Carlos does not win
cases regularly.
And automatically convert it into PyReason facts and rules, then validate that they parse correctly with PyReason.
The pipeline needs three things:
An LLM (we use Ollama with a local model)
A two-stage prompt design (one prompt for extraction, one for conversion)
A parser that turns the LLM’s text output back into structured facts and rules
Setup
We use Ollama to run an LLM locally. Different models
have different trade-offs between speed and stability — we discuss this at the
end of the tutorial. For this tutorial, we use qwen3:14b.
Install Ollama from ollama.com
Pull the model in terminal:
ollama pull qwen3:14b
Install the Python dependencies:
pip install ollama pyreason
Step 1: Extract Facts and Rules in English
The first prompt asks the LLM to read the paragraph and produce structured English. No PyReason syntax appears yet — this step is purely about language understanding.
PROMPT_EXTRACT = f"""Read the paragraph below and extract two things.
FACTS: specific statements about a named person, place, or thing.
- Extract EVERY fact mentioned, including type/category facts like "A is a student". Do not skip any.
- Include negative facts too (e.g. "John does not study regularly")
- Only extract what is explicitly stated, do not assume or invent
RULES: general IF-THEN patterns that apply to any person or thing.
- These are generalizations, not about one specific person
Output exactly this format, no extra text:
Facts:
- <fact 1>
- <fact 2>
Rules:
- <rule 1>
- <rule 2>
Paragraph: {{paragraph}}
Facts:
"""
Three design choices in this prompt deserve attention:
Extract EVERY fact including type facts. Without this instruction, the LLM tends to drop background facts like “Carlos is a lawyer” because they feel like context, not assertions. But these facts are exactly what later rules need to fire.
Negation must be preserved. LLMs default to extracting positive statements and silently drop “X does not Y” sentences. The bullet point about negation prevents this.
The prompt ends with Facts: as a completion primer. The LLM continues writing from where the prompt ended, jumping directly into the structured output rather than producing chatty preamble like “Sure, here are the facts I extracted…”.
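As a rough sketch of how the extraction prompt might be sent to the local model: the tutorial's prompt is an f-string with doubled braces, so it evaluates to a template with a literal {paragraph} placeholder that str.format can fill. The helper names (ollama_generate, extract_english), the abbreviated PROMPT_TEMPLATE, and the injectable generate argument are illustrative assumptions, not the tutorial's actual code; ollama.generate is the Ollama Python client call.

```python
from typing import Callable

MODEL = "qwen3:14b"

# Abbreviated stand-in for PROMPT_EXTRACT (hypothetical); the real prompt is longer.
PROMPT_TEMPLATE = (
    "Read the paragraph below and extract two things.\n"
    "Paragraph: {paragraph}\n"
    "Facts:\n"
)


def ollama_generate(prompt: str) -> str:
    """Send one prompt to the local Ollama server and return the raw text."""
    import ollama  # imported lazily so extract_english can be tested without a server

    response = ollama.generate(model=MODEL, prompt=prompt)
    return response["response"]


def extract_english(
    paragraph: str, generate: Callable[[str], str] = ollama_generate
) -> str:
    """Fill the template, call the model, and return text that starts with 'Facts:'."""
    prompt = PROMPT_TEMPLATE.format(paragraph=paragraph)
    text = generate(prompt).strip()
    # The prompt already ends with "Facts:", so the model may continue without
    # repeating the header; restore it so later parsing sees both sections.
    if not text.lower().startswith("facts:"):
        text = "Facts:\n" + text
    return text
```

Passing the model call in as an argument keeps the prompt-handling logic testable with a stub, which is useful because LLM output varies from run to run.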
For our example paragraph, the LLM produces:
Facts:
- Carlos is a lawyer
- Emma is a lawyer
- Emma wins cases regularly
- Carlos does not win cases regularly
Rules:
- All lawyers who win cases regularly tend to build a strong reputation
- Anyone with a strong reputation is likely to attract high-profile clients
Step 2: Convert to PyReason Syntax
The second prompt takes the English output from Step 1 and converts it into PyReason syntax. This is the harder step because PyReason’s syntax rarely appears in LLM training data, so the prompt has to teach it.
PROMPT_CONVERT = f"""Convert the facts and rules below into PyReason syntax.
FACT syntax and examples:
predicate(node):[l,u]
predicate(node):[1,1] ([1,1] means completely true) e.g. student(alice):[1,1] (Alice is a student)
predicate(node):[0,0] ([0,0] means completely false) e.g. student(marie):[0,0] (Marie is not a student)
predicate(node):[0.8,1] ([0.8,1] means likely, "tend to", "usually") e.g. doctor(bob):[0.8,1] (Bob is likely a doctor)
predicate(node1,node2):[1,1] e.g. enrolled_in(ryan,cs):[1,1] (Ryan enrolled in cs major)
RULE syntax:
head(X):[bound] <- condition1(X), condition2(X,Y)
Use variables X, Y (never specific names):
example: grandparent(X,Y) <- parent(X,Z), parent(Z,Y)
Include ALL conditions from the English rule, even if they seem redundant.
Constraints:
1. Predicate names: lowercase_with_underscores, no spaces, no capital letters.
2. Rule's head name describes WHAT, bound [l,u] describes HOW CERTAIN.
Don't use uncertain name for rule's head.
3. Facts use specific names (john, mary); lowercase is preferred for specific names.
Rules use variables (X, Y).
4. Negation in facts: use [0,0] on the SAME predicate, never invent a new predicate.
e.g. "Alice is not a student" -> student(alice):[0,0] NOT not_student(alice):[1,1]
5. Rules with one condition use only X: good_grade(X) <- study_hard(X)
Only introduce Y or Z when two different entities are involved.
### No markdown, no code blocks, no comments. Output ONLY the two sections.
Facts and rules to convert:
{{english_output}}
Facts:
<fact 1>
<fact 2>
Rules:
<rule 1>
<rule 2>
"""
The prompt enforces five constraints. The most important conceptually is
constraint 2: predicate names describe what a thing is, and bounds
describe how certain we are about it. Mixing these — for example writing
likely_to_graduate(X):[0.8,1] — breaks the chain of reasoning, because
later rules will not be able to reference the predicate by a consistent name.
Constraint 4 is also subtle. Negation in PyReason is expressed by setting the
bound to [0,0] on the same predicate, not by introducing a new
not_<predicate> predicate. This matters because PyReason treats predicate
names as opaque symbols — a rule that needs to detect “not a student” must
check student(X):[0,0], not not_student(X):[1,1].
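Constraints 2 and 4 can also be checked mechanically after the fact. The sketch below flags predicate names that begin with a negation or certainty word. The prefix list and the function name are assumptions chosen for illustration; they are not part of PyReason, and a real project would likely tune the list.

```python
import re

# Prefixes that suggest negation or certainty leaked into a predicate name.
# Illustrative assumption only; extend or shrink to suit your data.
SUSPECT_PREFIXES = ("not_", "likely_", "probably_", "unlikely_")

# A predicate name is an identifier immediately followed by "(".
PREDICATE_RE = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def suspicious_predicates(lines):
    """Return (line, predicate) pairs whose predicate name starts with a suspect prefix."""
    flagged = []
    for line in lines:
        for name in PREDICATE_RE.findall(line):
            if name.lower().startswith(SUSPECT_PREFIXES):
                flagged.append((line, name))
    return flagged
```

Run over the converted facts and rules, this would report not_student(alice):[1,1] and likely_to_graduate(X):[0.8,1] while leaving student(alice):[0,0] alone.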
The end of the prompt includes Facts: and Rules: template scaffolding
with <fact 1> / <rule 1> placeholders. This acts as a strong format
anchor — the LLM sees the exact shape of the expected output and fills in the
slots, which reduces format drift.
For our example, the LLM produces:
Facts:
lawyer(carlos):[1,1]
lawyer(emma):[1,1]
wins_cases_regularly(emma):[1,1]
wins_cases_regularly(carlos):[0,0]
Rules:
strong_reputation(X):[0.8,1] <- lawyer(X), wins_cases_regularly(X)
attract_high_profile_clients(X):[0.8,1] <- strong_reputation(X)
Notice how:
“Carlos does not win cases regularly” became wins_cases_regularly(carlos):[0,0], the same predicate with a falsified bound.
“tend to build a strong reputation” became strong_reputation(X):[0.8,1], so uncertainty lives in the bound, not in the predicate name.
The two rules chain: rule 1’s head strong_reputation appears verbatim in rule 2’s body.
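The pipeline also needs a parser that turns this text back into lists of facts and rules. A minimal sketch, assuming the output keeps the Facts: and Rules: headers with one item per line, might look like the following. The function name parse_sections is hypothetical, and skipping markdown fence lines is a defensive assumption for models that wrap their output in code blocks.

```python
FENCE = "`" * 3  # markdown code-fence marker that some models add around output


def parse_sections(text: str):
    """Split LLM output into (facts, rules), two lists of strings."""
    facts, rules = [], []
    current = None  # list currently being filled; None until a header is seen
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(FENCE):
            continue  # skip blank lines and stray markdown fences
        header = line.lower().rstrip(":")
        if header == "facts":
            current = facts
        elif header == "rules":
            current = rules
        elif current is not None:
            # Drop a leading bullet marker so Step 1 style "- item" lines also work.
            current.append(line.lstrip("-* ").strip())
    return facts, rules
```

Lines that appear before the first header are ignored, which discards any preamble the model emits despite the format instructions.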
Step 3: Validate with PyReason
After parsing the LLM output into a list of facts and rules, we check each
rule by constructing a pr.Rule object. If PyReason’s parser rejects it,
we catch the error and report which rule failed.
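import pyreason as pr

# `rules` is the list of rule strings parsed from the Step 2 output.
print("Validating rules...")
for rule in rules:
    try:
        pr.Rule(rule)  # raises if PyReason cannot parse the rule text
        print(f"Rule passed {rule}")
    except Exception as e:
        print(f"ERROR {rule}\n{e}")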
For our example, both rules pass:
Validating rules...
Rule passed strong_reputation(X):[0.8,1] <- lawyer(X), wins_cases_regularly(X)
Rule passed attract_high_profile_clients(X):[0.8,1] <- strong_reputation(X)
This validation step catches LLM mistakes early. If the LLM produced malformed
syntax (an unbalanced parenthesis, a missing arrow, an invalid bound), the
pr.Rule() constructor raises an exception and we see exactly which rule
failed.
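The loop above covers rules only. Facts could get a similar early check; the sketch below is a lightweight pre-check written against the fact format taught in the Step 2 prompt, predicate(node):[l,u] or predicate(node1,node2):[l,u]. It is an assumption-laden stand-in, not PyReason's own parser, and it may be stricter or looser than what PyReason accepts.

```python
import re

FACT_RE = re.compile(
    r"^\s*([a-z_][a-z0-9_]*)"                                 # lowercase predicate name
    r"\(\s*([A-Za-z0-9_]+)\s*(?:,\s*([A-Za-z0-9_]+)\s*)?\)"   # one or two node names
    r"\s*:\s*\[\s*([0-9.]+)\s*,\s*([0-9.]+)\s*\]\s*$"         # bound [l,u]
)


def check_fact(fact: str):
    """Return None if the fact matches the prompt's format, else an error message."""
    match = FACT_RE.match(fact)
    if match is None:
        return "does not match predicate(node):[l,u]"
    try:
        lower, upper = float(match.group(4)), float(match.group(5))
    except ValueError:
        return "bound is not numeric"
    if not 0.0 <= lower <= upper <= 1.0:
        return "bound must satisfy 0 <= l <= u <= 1"
    return None
```

A check like this rejects a corrupted line such as lawyeremma:[][1,1] before it reaches the reasoner, and it also catches inverted bounds that are syntactically valid but meaningless.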
Testing on Other Paragraphs
The same pipeline can be applied to paragraphs from other domains. Here are several test cases we used during development:
Medical
Tom and Lisa are both nurses. All nurses who work night shifts tend to
experience fatigue. Lisa works night shifts. Anyone who experiences
fatigue is likely to make errors. Tom does not work night shifts.
Animals
Rex and Bella are both dogs. All dogs that exercise daily tend to stay
healthy. Bella exercises daily. Any dog that stays healthy is likely to
live long. Rex does not exercise daily.
Relational (Edge Rule)
Alice and Bob are colleagues. All colleagues who share projects tend to
collaborate well. Alice and Bob share a project. Anyone who collaborates
well is likely to get promoted. Carol and Dave are colleagues but do not
share any projects.
The relational case is the most challenging because it involves two-entity
relationships, which the LLM must encode as edge facts like
colleague(alice,bob) and edge rules like
collaborate_well(X,Y) <- colleague(X,Y), share_project(X,Y).
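A downstream step that builds a graph would need to know which facts mention one node and which mention two. The following sketch separates them by counting arguments. The function name split_by_arity is hypothetical, and the example facts in the usage note are hand-written illustrations, not recorded model output.

```python
import re

# Capture the predicate name and the raw argument list of the first atom on a line.
ARGS_RE = re.compile(r"^\s*(\w+)\s*\(([^()]*)\)")


def split_by_arity(facts):
    """Separate facts into node facts (one argument) and edge facts (two arguments)."""
    node_facts, edge_facts = [], []
    for fact in facts:
        match = ARGS_RE.match(fact)
        if match is None:
            continue  # malformed line; leave it to the validation step
        args = [a.strip() for a in match.group(2).split(",") if a.strip()]
        if len(args) == 1:
            node_facts.append(fact)
        elif len(args) == 2:
            edge_facts.append(fact)
    return node_facts, edge_facts
```

For instance, given colleague(alice,bob):[1,1] and lawyer(emma):[1,1], the first would land in the edge list and the second in the node list.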
Notes on Model Choice
We tested several local Ollama models with the prompts above:
llama3.1:8b is fast but unstable. It often drops facts or invents predicates.
mistral-nemo:12b is better than llama3.1, but suffers from naming drift (works_night_shift vs works_night_shifts) and occasionally wraps output in markdown code blocks.
qwen2.5:14b is significantly more stable, but in rare cases produces corrupted syntax like lawyeremma:[][1,1] when handling compound subjects.
qwen3:14b was the most consistent in our tests. Recommended.
Smaller models are accessible but less reliable; larger models are reliable but require better hardware. The prompt design above reduces these failures but cannot fully eliminate them.
What’s Next
This pipeline is intentionally minimal to illustrate the core idea — taking natural language and turning it into PyReason syntax that parses correctly. Practical extensions could include:
Loading the validated facts and rules into a graph and running pr.reason() to derive new conclusions.
A retry loop that catches syntax errors and asks the LLM to fix its own output.
A post-inference step that translates PyReason’s results back into natural English.
Validation that detects predicate-naming drift across rules (e.g. rule 1’s head uses strong_reputation but rule 2’s body uses has_strong_reputation) and prompts the LLM to unify the names.
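As one possible starting point for the naming-drift check, the sketch below collects every predicate defined by a fact or a rule head, then reports body predicates that nothing defines, along with the closest defined name from the standard library's difflib. The function name find_naming_drift is hypothetical, and the approach assumes every body predicate should be defined by some fact or rule head, which will not hold if predicates also come from graph attributes.

```python
import difflib
import re

# A predicate name is an identifier immediately followed by "(".
PREDICATE_RE = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def find_naming_drift(facts, rules):
    """Return (rule, undefined_predicate, closest_defined_name_or_None) tuples."""
    defined = set()
    for fact in facts:
        defined.update(PREDICATE_RE.findall(fact)[:1])

    # Split each rule into head and body around the "<-" arrow.
    split_rules = [rule.partition("<-") for rule in rules]
    for head, _, _ in split_rules:
        defined.update(PREDICATE_RE.findall(head)[:1])

    known = sorted(defined)  # sorted for deterministic suggestions
    report = []
    for rule, (_, _, body) in zip(rules, split_rules):
        for name in PREDICATE_RE.findall(body):
            if name not in defined:
                close = difflib.get_close_matches(name, known, n=1)
                report.append((rule, name, close[0] if close else None))
    return report
```

Each reported tuple could then be folded into a follow-up prompt asking the LLM to rename the undefined predicate to the suggested one.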