| ============================================ | |
| Problem Package: Placeholder-Free Drafting | |
| ============================================ | |
| This document captures everything another engineer / model needs to understand | |
| and complete the "placeholder-free legal drafting" task: | |
| 1. Problem statement | |
| 2. Current file structure (relevant subset) | |
| 3. Key existing code (verbatim) | |
| 4. Chain-of-thought / proposed solution | |
| 5. Reference code patch to implement the fix | |
| -------------------------------------------------- | |
| 1. Problem statement | |
| -------------------------------------------------- | |
| We have a FastAPI + LangChain agent that performs three steps each user turn: | |
| 1. Conversational chain gathers data and decides actions. | |
| 2. Document drafter chain produces a legal draft (JSON with fields | |
| `draft` and `is_drafted`). | |
| 3. Placeholder-checker chain scans the draft for unresolved tokens such as | |
| `[DATE]`, `[NAME]`, etc. | |
| Issue | |
| ----- | |
| • The drafter still returns drafts that contain placeholders. | |
| • The checker detects them, but there is **no feedback loop**: the agent neither | |
| retries nor asks the user for the missing details. | |
| • Users end up with documents containing `[Your Address]`, etc. | |
| Goal | |
| ---- | |
| Design a **simple, production-grade mechanism** that guarantees placeholder-free | |
| documents while keeping chains modular. | |
| -------------------------------------------------- | |
| 2. File structure (excerpt) | |
| -------------------------------------------------- | |
| agree_upon/ | |
| ├── api/ | |
| ├── agent/ | |
| │   ├── agent_runner.py                  < orchestrator (snippet patched below) | |
| │   └── chains/ | |
| │       ├── document_drafter_chain.py    < full file below | |
| │       ├── conversational_legal_chain.py | |
| │       └── placeholder_checker.py | |
| └── probleam_statement.txt               < you are here | |
| -------------------------------------------------- | |
| 3. Key existing code | |
| -------------------------------------------------- | |
| A) agent/chains/document_drafter_chain.py (current full content) | |
| ---------------------------------------------------------------- | |
| ```python | |
| """ | |
| agent/chains/document_drafter_chain.py | |
| ────────────────────────────────────── | |
| Generates or edits the legal draft *purely* from AgentState + instruction. | |
| INPUT keys: | |
|   • document_type: str | |
|   • filled_fields_json: str (JSON object of {field: value}) | |
|   • current_draft: str (may be empty) | |
|   • instruction: str (e.g. "create fresh draft" | "swap Party A …") | |
| OUTPUT (strict JSON): | |
|   { | |
|     "draft": "<complete draft>", | |
|     "is_drafted": true | |
|   } | |
| """ | |
| import json | |
| import logging | |
| from typing import Dict | |
| from langchain.chains import LLMChain | |
| from langchain.prompts import PromptTemplate | |
| from pydantic import BaseModel, Field | |
| from agent.llm import llm_chain | |
| logger = logging.getLogger("agent.drafter") | |
| logger.setLevel(logging.DEBUG) | |
| # ──────────────────────────── | |
| # JSON schema & parser | |
| # ──────────────────────────── | |
| from agent.utils import safe_parse_json_block, _iterate_json_candidates | |
| class SimpleDraftOutputParser: | |
|     def get_format_instructions(self) -> str: | |
|         return 'Return ONLY a JSON object with keys "draft" and "is_drafted".' | |
|     def parse(self, text: str) -> dict: | |
|         # Try all balanced-brace JSON blocks | |
|         for candidate in _iterate_json_candidates(text): | |
|             parsed = safe_parse_json_block(candidate) | |
|             if parsed and 'draft' in parsed and 'is_drafted' in parsed: | |
|                 return parsed | |
|         raise ValueError(f"No valid draft JSON found in output:\n{text}") | |
| parser = SimpleDraftOutputParser() | |
| # ──────────────────────────── | |
| # Prompt | |
| # ──────────────────────────── | |
| _DRAFTER_PROMPT = PromptTemplate( | |
|     input_variables=["document_type", "filled_fields_json", | |
|                      "current_draft", "instruction"], | |
|     template=r""" | |
| You are a veteran legal drafter. | |
| ──────────────────────────────────────────── | |
| Document type: {document_type} | |
| Field values (JSON): {filled_fields_json} | |
| Existing draft: <<<START>>> | |
| {current_draft} | |
| <<<END>>> | |
| Instruction: {instruction} | |
| ──────────────────────────────────────────── | |
| TASK: | |
| • Produce a *clean*, professional draft in plain text. | |
| • No placeholders like [DATE]; substitute actual field values. | |
| • NO boilerplate like "Here is your draft". | |
| • Return only compact JSON exactly: | |
| {{ | |
|   "draft": "<the complete draft here>", | |
|   "is_drafted": true | |
| }} | |
| No markdown, no commentary. | |
| {format_instructions} | |
| """ | |
| ) | |
| def get_document_drafter_chain() -> LLMChain: | |
|     return LLMChain( | |
|         llm=llm_chain, | |
|         prompt=_DRAFTER_PROMPT.partial( | |
|             format_instructions=parser.get_format_instructions() | |
|         ), | |
|         output_key="text",  # parsed later by agent_runner | |
|         verbose=True, | |
|     ) | |
| ``` | |
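The parser above relies on `_iterate_json_candidates` and `safe_parse_json_block` from `agent.utils`, which this package does not show. As a rough illustration of what "try all balanced-brace JSON blocks" could mean, here is a minimal stand-alone sketch. The function names `iterate_json_candidates` and `parse_draft_output` are hypothetical stand-ins, not the project's actual helpers, and the real implementations may differ.

```python
import json
from typing import Iterator, Optional


def iterate_json_candidates(text: str) -> Iterator[str]:
    """For each '{' in text, yield the balanced {...} substring starting there.

    Hypothetical stand-in for agent.utils._iterate_json_candidates.
    Braces inside double-quoted strings are ignored when counting depth.
    """
    for start, ch in enumerate(text):
        if ch != "{":
            continue
        depth = 0
        in_string = False
        escaped = False
        for end in range(start, len(text)):
            c = text[end]
            if in_string:
                if escaped:
                    escaped = False
                elif c == "\\":
                    escaped = True
                elif c == '"':
                    in_string = False
                continue
            if c == '"':
                in_string = True
            elif c == "{":
                depth += 1
            elif c == "}":
                depth -= 1
                if depth == 0:
                    yield text[start:end + 1]
                    break


def parse_draft_output(text: str) -> Optional[dict]:
    """Return the first candidate that is a dict with both required keys."""
    for candidate in iterate_json_candidates(text):
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue  # not valid JSON, try the next balanced block
        if isinstance(parsed, dict) and "draft" in parsed and "is_drafted" in parsed:
            return parsed
    return None
```

Scanning from every opening brace is quadratic in the worst case, which is usually acceptable for a single LLM response; tracking string state matters because legal drafts can contain literal braces.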
| B) agent/chains/conversational_legal_chain.py (current full content) | |
| --------------------------------------------------------------------- | |
| ```python | |
| """ | |
| agent/chains/conversational_legal_chain.py | |
| ────────────────────────────────────────── | |
| Natural conversation chain that: | |
|   • Learns/updates `document_type` | |
|   • Collects & validates `needed_fields` | |
|   • Replies naturally to the user | |
|   • Emits a *strict* JSON command set for the runner: | |
|     { | |
|       "actions": ["update_document_type", "update_needed_values", "update_document"], | |
|       "user_reply": "<string>", | |
|       "update_document_type": "<DOC_TYPE|NONE>", | |
|       "update_needed_values": { "<field>": "<value>", ... } | {}, | |
|       "update_document_instruction": "<string|NONE>" | |
|     } | |
| """ | |
| from typing import Dict, Any | |
| from langchain.chains import LLMChain | |
| from langchain.prompts import PromptTemplate | |
| from agent.llm import llm_chain  # ← your ChatOpenAI wrapper | |
| _CLS_PROMPT = PromptTemplate( | |
|     input_variables=["history", "user_input", "state"], | |
|     template=r""" | |
| Act as an *empathetic legal-AI assistant*. | |
| Goal: build an accurate AgentState (shown below) and help the user. | |
| Never violate the guard-rules afterwards. | |
| ──────────────────────────────────────────── | |
| Current AgentState (read-only summary) | |
| {state} | |
| ──────────────────────────────────────────── | |
| Conversation history | |
| {history} | |
| ──────────────────────────────────────────── | |
| Latest user message | |
| {user_input} | |
| ──────────────────────────────────────────── | |
| MUST: | |
| 1. Carry on a *natural* dialogue. Explain, clarify, warn politely. | |
| 2. Decide which of the following *atomic* actions you must take **this turn**: | |
|    • `update_document_type`: we just learned / corrected the doc type | |
|    • `update_needed_values`: we have one or more field values to add | |
|    • `update_document`: ready to draft / revise the draft | |
|    You MAY output multiple actions at once. | |
| 3. Reply to the user in `user_reply`. | |
| 4. Return **ONLY** a valid compact JSON, no markdown, no commentary. | |
| Schema: | |
| {{ | |
|   "actions": [...],  // see list above | |
|   "user_reply": "<your reply>", | |
|   "update_document_type": "<type|NONE>", | |
|   "update_needed_values": {{ "<field>": "<value>", ... }}, | |
|   "update_document_instruction": "<instruction|NONE>" | |
| }} | |
| Guard-rules: | |
| • If AgentState.is_drafted is true, changing `document_type` is **forbidden**. | |
|   Instead, offer to refine the existing draft. | |
| • Politely warn & double-check if the user's input seems wrong or contradictory. | |
| """ | |
| ) | |
| def get_conversational_legal_chain(memory) -> LLMChain: | |
|     """Factory: returns an LLMChain with SQL-buffer memory attached.""" | |
|     return LLMChain( | |
|         llm=llm_chain, | |
|         prompt=_CLS_PROMPT, | |
|         memory=memory, | |
|         output_key="text",  # runner pulls .text, then parses JSON | |
|         verbose=True, | |
|     ) | |
| ``` | |
| C) agent/chains/placeholder_checker.py (current full content) | |
| ------------------------------------------------------------- | |
| ```python | |
| """ | |
| Detect unresolved placeholders in a draft. | |
| Changes | |
|   • Returns structured list via PydanticOutputParser | |
| """ | |
| import logging | |
| from typing import List | |
| from pydantic import BaseModel, Field | |
| from langchain.prompts import PromptTemplate | |
| from langchain.chains import LLMChain | |
| from langchain.output_parsers import PydanticOutputParser | |
| from agent.llm import llm_chain | |
| from agent.prompts import PLACEHOLDER_CHECKER_PROMPT | |
| logger = logging.getLogger("agent.placeholder_checker") | |
| logger.setLevel(logging.DEBUG) | |
| # ──────────────────────────── | |
| # JSON schema | |
| # ──────────────────────────── | |
| class PlaceholderOutput(BaseModel): | |
|     placeholders: List[str] = Field( | |
|         ..., description="List of unresolved placeholder tokens" | |
|     ) | |
| parser = PydanticOutputParser(pydantic_object=PlaceholderOutput) | |
| # ──────────────────────────── | |
| # Prompt | |
| # ──────────────────────────── | |
| prompt = PromptTemplate( | |
|     input_variables=["draft"], | |
|     template=""" | |
| You are reviewing a legal draft for unresolved placeholders such as | |
| [DATE], [NAME], [ADDRESS], [PLACEHOLDER], etc. | |
| Draft: | |
| {draft} | |
| {placeholder_checker_prompt} | |
| Return ONLY a JSON list, e.g. ["[DATE]", "[NAME]"] or []. | |
| """ | |
| ) | |
| # ──────────────────────────── | |
| # Chain | |
| # ──────────────────────────── | |
| def get_placeholder_checker_chain() -> LLMChain: | |
|     return LLMChain( | |
|         llm=llm_chain, | |
|         prompt=prompt.partial(placeholder_checker_prompt=PLACEHOLDER_CHECKER_PROMPT), | |
|         output_key="text",  # raw text; we'll parse separately | |
|         verbose=True, | |
|     ) | |
| ``` | |
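One detail worth noting in the checker above: the prompt asks for a bare JSON list, while `PlaceholderOutput` describes an object with a `placeholders` key, so a strict Pydantic parse of the raw text could fail. Because the chain returns raw text that is "parsed separately", a tolerant parser on the runner side is one way to bridge the two. The sketch below is an assumption about how that could look; `parse_placeholder_list` is a hypothetical name and is not part of the existing code.

```python
import json
from typing import List


def parse_placeholder_list(raw: str) -> List[str]:
    """Accept either ["[DATE]"] or {"placeholders": ["[DATE]"]}.

    Hypothetical helper: normalises both shapes to a plain list of strings.
    """
    data = json.loads(raw.strip())
    if isinstance(data, dict):
        # Object form, matching the PlaceholderOutput schema
        data = data.get("placeholders", [])
    if not isinstance(data, list):
        raise ValueError(f"Expected a JSON list of placeholders, got: {raw!r}")
    return [str(item) for item in data]
```

An empty list then means the checker found nothing unresolved, regardless of which shape the model chose to emit.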
| D) agent/agent_runner.py (patched excerpt: placeholder gate) | |
| ------------------------------------------------------------- | |
| ```python | |
| import datetime | |
| import json | |
| import re | |
| PLACEHOLDER_RE = re.compile(r"\[[^\[\]\n]{1,40}\]")  # e.g. [DATE] | |
| MAX_RETRIES = 2 | |
| _TODAY = datetime.date.today() | |
| AUTO_FILL = { | |
|     # "%-d" is platform-specific (it fails on Windows), so format the day manually | |
|     "[DATE]": f"{_TODAY:%B} {_TODAY.day}, {_TODAY:%Y}", | |
| } | |
| def _find_placeholders(text: str) -> list[str]: | |
|     # de-duplicate while preserving first-seen order | |
|     return list(dict.fromkeys(PLACEHOLDER_RE.findall(text))) | |
| # --- excerpt from inside the per-turn handler; the drafter has already run, | |
| # --- so state.draft is populated | |
|     auto_retry_count = 0 | |
|     while True: | |
|         # optional quick auto-fill | |
|         for ph, val in AUTO_FILL.items(): | |
|             state.draft = state.draft.replace(ph, val) | |
|         missing = _find_placeholders(state.draft) | |
|         if not missing: | |
|             break  # draft is clean | |
|         if auto_retry_count < MAX_RETRIES: | |
|             auto_retry_count += 1 | |
|             instr = ( | |
|                 "Replace the unresolved placeholders " | |
|                 + ", ".join(missing) | |
|                 + " with concrete values based on filled fields and context." | |
|             ) | |
|             drafter_raw = invoke_with_retry(drafter, { | |
|                 "document_type": state.document_type, | |
|                 "filled_fields_json": json.dumps(state.needed_fields), | |
|                 "current_draft": state.draft, | |
|                 "instruction": instr, | |
|             }) | |
|             state.draft = _extract_text(drafter_raw) | |
|             continue  # re-check the regenerated draft | |
|         # after retries, still missing: ask the user | |
|         reply_to_user = ( | |
|             "I still need the following details to finalise the draft: " | |
|             + ", ".join(missing) | |
|             + ". Could you please provide them?" | |
|         ) | |
|         state.needed_fields.update(_map_placeholders_to_fields(missing)) | |
|         _persist(memory, conv_id, user_input, reply_to_user, state) | |
|         return  # end turn; conversational chain will ask next time | |
| ``` | |
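The excerpt above calls `_map_placeholders_to_fields`, which is not defined anywhere in this package. As an illustration only, the sketch below shows one plausible shape for it, together with the placeholder regex in action. The name `map_placeholders_to_fields`, the snake_case key convention, and the use of an empty string to mark a still-missing value are all assumptions, not the project's actual behaviour.

```python
import re
from typing import Dict, List

# Same pattern as in the runner excerpt: a short bracketed token on one line
PLACEHOLDER_RE = re.compile(r"\[[^\[\]\n]{1,40}\]")


def find_placeholders(text: str) -> List[str]:
    """Return unique placeholder tokens in first-seen order."""
    return list(dict.fromkeys(PLACEHOLDER_RE.findall(text)))


def map_placeholders_to_fields(placeholders: List[str]) -> Dict[str, str]:
    """Turn tokens like "[Your Address]" into field keys like "your_address".

    Hypothetical helper: each key maps to "" to signal "value still needed".
    """
    fields: Dict[str, str] = {}
    for token in placeholders:
        label = token.strip("[]").lower()
        key = re.sub(r"[^a-z0-9]+", "_", label).strip("_")
        if key:
            fields.setdefault(key, "")
    return fields
```

Note that the regex treats any short bracketed text as a placeholder, so legitimate bracketed text in a draft (for example `[sic]`) would also be flagged; an allow-list may be needed if that matters for a given document type.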
| -------------------------------------------------- | |
| 5. Reference patch checklist | |
| -------------------------------------------------- | |
| • Add `PLACEHOLDER_RE`, `MAX_RETRIES`, `_find_placeholders` and loop to | |
| `agent_runner.py`. | |
| • Optionally extend `AUTO_FILL` map. | |
| • No other files need modification. | |
| --- End of package --- | |
| solution | |
| # Upgrade Guide: "Ask-Until-Clean" Drafting | |
| *(logic explanation + concrete change list you can paste into Vibe Coding)* | |
| --- | |
| ## How the new flow works (plain English) | |
| 1. **Generate a draft.** | |
|    `document_drafter_chain` turns the currently known field values into a full document. | |
| 2. **Check the draft for leftover tokens** like `[DATE]`, `[PARTY_ADDRESS]`. | |
|    `placeholder_checker_chain` now returns one JSON object: | |
| ```json | |
| { "is_success": true|false, "missing_desc": "natural language list of what's still empty" } | |
| ``` | |
| 3. **Branch on the result** | |
|    * **Success (`is_success == true`)**: send the polished draft to the user, stop. | |
|    * **Failure (`is_success == false`)**: fall back to the conversational chain: | |
|      > "I am an internal checker. The draft is missing: *effective date, addresses*. Please ask the user for these details." | |
|      The conversational chain politely asks the user, captures the answers, and emits `update_needed_values`. The runner merges those into state and goes back to step 1. | |
| 4. **Guard against endless loops.** | |
|    If the agent has already asked twice (`MAX_USER_PROMPTS = 2`) and the user still hasn't supplied the data, it exits gracefully: | |
|    > "I'm still missing details. Let's continue once you have them." | |
| This guarantees either a placeholder-free draft *or* a controlled, polite stop. | |
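The branching in steps 3 and 4 can be isolated as a small pure function, which makes the loop guard easy to unit-test without any LLM calls. The sketch below is one possible framing; `NextStep`, `CheckResult`, and `decide_next_step` are illustrative names that do not appear in the codebase.

```python
from dataclasses import dataclass
from enum import Enum

MAX_USER_PROMPTS = 2  # same limit as described in step 4


class NextStep(Enum):
    DELIVER = "deliver"    # draft is clean: send it to the user
    ASK_USER = "ask_user"  # hand the missing items to the conversational chain
    GIVE_UP = "give_up"    # prompt budget exhausted: stop politely


@dataclass
class CheckResult:
    """Mirrors the checker's JSON object (illustrative type)."""
    is_success: bool
    missing_desc: str = ""


def decide_next_step(check: CheckResult, prompts_so_far: int) -> NextStep:
    """Map the checker result and the number of prior asks to one action."""
    if check.is_success:
        return NextStep.DELIVER
    if prompts_so_far >= MAX_USER_PROMPTS:
        return NextStep.GIVE_UP
    return NextStep.ASK_USER
```

Keeping the decision separate from the side effects (persisting state, calling chains) means the runner only has to dispatch on the returned value.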
| --- | |
| ## What to change (file-by-file) | |
| | File | Change | | |
| | --- | --- | | |
| | **`agent/chains/placeholder_checker.py`** | *Replace* the Pydantic model with:<br>`class PlaceholderCheckOut(BaseModel): is_success: bool; missing_desc: str`<br>*Replace* the prompt with the three-line logic shown below. | | |
| | **`AgentState` dataclass / pydantic model** | Add `missing_prompt_count: int = 0`. | | |
| | **`agent/agent_runner.py`** | Remove the old regex-retry loop. Add the new orchestration block (≈25 lines) shown below. | | |
| ### New checker prompt (copy verbatim) | |
| ``` | |
| Scan the draft below. | |
| • If NO tokens like [DATE] remain: | |
|   return {"is_success": true, "missing_desc": ""} | |
| • Otherwise: | |
|   return {"is_success": false, "missing_desc": "<short English list of what's missing>"} | |
| Return ONLY that JSON. No other text. | |
| Draft: | |
| {draft} | |
| ``` | |
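A caveat when pasting this prompt into a template: the existing drafter and conversational prompts in this package double their literal braces (`{{ ... }}`), because Python format-string syntax treats single braces as variables. Assuming the checker prompt is rendered the same way (LangChain's default f-string template format follows these rules), the JSON examples need the same escaping. The sketch below demonstrates the escaping with plain `str.format`; `CHECKER_TEMPLATE` and `render_checker_prompt` are illustrative names.

```python
# Literal JSON braces are doubled; only {draft} stays a substitution variable.
CHECKER_TEMPLATE = """Scan the draft below.
• If NO tokens like [DATE] remain:
  return {{"is_success": true, "missing_desc": ""}}
• Otherwise:
  return {{"is_success": false, "missing_desc": "<short English list of what's missing>"}}
Return ONLY that JSON. No other text.
Draft:
{draft}
"""


def render_checker_prompt(draft: str) -> str:
    """Fill in the draft; doubled braces collapse back to single braces."""
    return CHECKER_TEMPLATE.format(draft=draft)
```

With the unescaped version, formatting would fail because `"is_success"` would be read as a variable name rather than literal text.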
| ### New runner logic (insert where the old placeholder code was) | |
| ```python | |
| MAX_USER_PROMPTS = 2 | |
| draft_raw = drafter.run({...}) | |
| draft = drafter_parser.parse(draft_raw)["draft"] | |
| check_raw = placeholder_checker.run({"draft": draft}) | |
| check = checker_parser.parse(check_raw) | |
| if check.is_success:  # clean: finish | |
|     state.draft = draft | |
|     state.missing_prompt_count = 0  # reset so later edits get a fresh prompt budget | |
|     reply_to_user = "All set! Your document is fully drafted. Let me know if you'd like any edits." | |
|     _persist(memory, conv_id, user_input, reply_to_user, state) | |
|     return | |
| if state.missing_prompt_count >= MAX_USER_PROMPTS:  # give up politely | |
|     reply_to_user = ( | |
|         "I'm still missing details (" + check.missing_desc + | |
|         "). Let's continue once you have them." | |
|     ) | |
|     _persist(memory, conv_id, user_input, reply_to_user, state) | |
|     return | |
| state.missing_prompt_count += 1  # ask user | |
| system_addition = ( | |
|     "I am an internal checker. The draft is missing: " + | |
|     check.missing_desc + | |
|     ". Please ask the user for these details." | |
| ) | |
| # NOTE: _CLS_PROMPT currently declares only history, user_input and state; | |
| # it needs a {system_addition} variable for this input to reach the LLM. | |
| conv_raw = conversational_chain.run({ | |
|     "system_addition": system_addition, | |
|     **standard_inputs | |
| }) | |
| conv_cmds = conv_parser.parse(conv_raw) | |
| _apply_commands(conv_cmds)  # merges user answers into state | |
| return  # wait for next turn | |
| ``` | |
| --- | |
| ## Safety & validation | |
| * **Type checking:** the conversational chain prompt already instructs the LLM to ensure answers match expected formats. | |
| * **Second line of defence (optional):** after `_apply_commands`, run lightweight regex/date parsers; if a value fails validation, ignore it and prompt again. | |
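The optional second line of defence could look roughly like the sketch below: a per-field validator table, with a generic fallback that rejects blanks and values that are themselves bracketed tokens. The field name `effective_date`, the accepted date formats, and the function names are assumptions made for illustration; the real field names depend on the document type.

```python
from datetime import datetime
from typing import Callable, Dict, List, Optional, Tuple


def _parse_date(value: str) -> Optional[str]:
    """Return an ISO date string, or None if no accepted format matches."""
    for fmt in ("%Y-%m-%d", "%B %d, %Y"):  # assumed formats, extend as needed
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def _non_empty(value: str) -> Optional[str]:
    """Reject blank values and values that are still placeholder tokens."""
    cleaned = value.strip()
    if not cleaned or (cleaned.startswith("[") and cleaned.endswith("]")):
        return None
    return cleaned


# Hypothetical field-specific validators; unknown fields use _non_empty
VALIDATORS: Dict[str, Callable[[str], Optional[str]]] = {
    "effective_date": _parse_date,
}


def validate_fields(candidates: Dict[str, str]) -> Tuple[Dict[str, str], List[str]]:
    """Split candidate values into accepted (cleaned) and rejected field names."""
    accepted: Dict[str, str] = {}
    rejected: List[str] = []
    for name, value in candidates.items():
        cleaned = VALIDATORS.get(name, _non_empty)(value)
        if cleaned is None:
            rejected.append(name)
        else:
            accepted[name] = cleaned
    return accepted, rejected
```

In the runner, only the accepted values would be merged into state, and the rejected names could be fed back into the next follow-up question.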
| --- | |
| ### Result | |
| *User sees at most two polite follow-up questions; you never deliver a document with `[PLACEHOLDERS]`; and the codebase changes only in one chain, one dataclass field, and one runner block.* | |