Tutorial 09: Custom Legal Skills, Guardrails & Agents (OpenAI)
Build custom skills for your firm's workflows, create compliance guardrails, and deploy multi-agent systems for complex legal tasks.
What You'll Learn
This tutorial shows you how to build custom legal skills, add safety checks (guardrails), and run multi-agent workflows with OpenAI. Some technical comfort is required.
Expert Level
Developer skills recommended. Estimated time: 120 minutes.
Learning Objectives
By the end of this tutorial, you will:
- Understand OpenAI's architecture (Assistants, Custom GPTs, Moderation, custom guardrails)
- Build custom legal skills for your firm's workflows
- Create guardrails for quality control and compliance
- Deploy multi-agent systems for complex legal tasks
Part 1: Understanding the OpenAI Stack
Architecture Overview
Why This Matters for Legal
| Component | Legal Application |
|---|---|
| Custom GPTs | Encode playbooks, review procedures, drafting standards |
| Assistants API | Multi-step workflows, document processing pipelines |
| Moderation API | Input/output content classification; block inappropriate content |
| Custom Wrappers | Audit logging, compliance checks, approval gates |
| Function Calling | Connect to document management, legal research tools |
| Orchestration | Parallelize document review, research tasks (your code coordinates) |
Part 2: Building Custom Legal Skills
What Are Skills in OpenAI Context?
Skills are specialized instructions and knowledge stored in Custom GPTs or Assistants. Unlike one-time prompts, skills persist and activate automatically within the configured context.
Skill Implementation via Custom GPT
Step 1: Create Custom GPT
- Go to ChatGPT and create a new Custom GPT
- Name it (e.g., "Contract Review - [Firm Name]")
Step 2: Write Instructions (SKILL.md equivalent)
Step 3: Add Knowledge Files
Upload to Custom GPT or Assistant:
- Playbook JSON or markdown
- Clause library
- Example outputs (good and bad)
Knowledge File Structure (for Custom GPT):
| File | Purpose | Format |
|---|---|---|
| playbook.json | Standard positions, risk thresholds | JSON |
| clause-library.md | Approved language for redlines | Markdown |
| good-review-example.md | Calibration reference | Markdown |
| bad-review-example.md | Anti-pattern to avoid | Markdown |
Reference these in instructions: "Apply positions from playbook.json. Use clause-library.md for approved redline language."
Example: Playbook JSON (upload as knowledge file)
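A minimal illustrative playbook. The field names (`positions`, `risk_thresholds`) and the specific positions shown are placeholders to replace with your firm's actual standards:

```json
{
  "playbook_version": "2026-01",
  "positions": {
    "liability_cap": {
      "standard": "12 months of fees",
      "fallback": "24 months of fees",
      "escalate_if": "uncapped or above 2x annual fees"
    },
    "governing_law": {
      "standard": "State of Delaware",
      "acceptable": ["New York"],
      "escalate_if": "non-US jurisdiction"
    }
  },
  "risk_thresholds": {
    "high": "deviation from standard with no acceptable fallback",
    "medium": "fallback position required",
    "low": "standard position accepted"
  }
}
```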
Example: Clause Library (upload as knowledge file)
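A sample clause-library entry. The language below is illustrative only, not firm-approved drafting:

```markdown
## Limitation of Liability — Approved Fallback
Neither party's aggregate liability under this Agreement shall exceed
the fees paid or payable in the twelve (12) months preceding the claim.

## Governing Law — Standard Position
This Agreement shall be governed by the laws of the State of Delaware,
without regard to its conflict of laws principles.
```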
Step 4: Test and Calibrate
Run 3-5 real contracts through the GPT. Compare output to expert review. Refine instructions.
Calibration Notes (add to GPT instructions or a knowledge file):
- Liability cap thresholds updated January 2026
- New data processing requirements per GDPR changes
- Updated AI/ML clause language required
Example: NDA Triage Skill
For high-volume NDA processing, create a separate Custom GPT with triage-specific instructions:
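A sketch of triage-specific instructions; the thresholds, routing labels, and "[Firm Name]" placeholder are all assumptions to calibrate against your own NDA standards:

```text
ROLE: NDA triage assistant for [Firm Name].
APPLY WHEN: User uploads an NDA or requests NDA triage.

STEPS:
1. Classify: mutual or one-way; term length; governing law.
2. Flag non-standard terms: non-solicitation, IP assignment, residuals
   clauses, term longer than 3 years.
3. Route with a stated reason: "Standard - approve",
   "Minor edits - attorney review", or "Escalate - partner review".

OUTPUT FORMAT: classification, flagged terms (bulleted), routing decision.
```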
Part 3: Building Compliance Guardrails
What Are Guardrails?
Guardrails are checks that run at specific points in the AI workflow. OpenAI provides the Moderation API for content classification; additional controls (audit logging, PII checks, approval gates) are implemented via custom wrappers or the OpenAI Guardrails Python framework (preview).
| Guardrail Type | Trigger Point | Implementation |
|---|---|---|
| Input Moderation | Before prompt processing | Moderation API or custom wrapper |
| Output Moderation | After response | Moderation API or custom wrapper |
| Content Filters | Before/after | Custom logic; PII detection, privilege checks |
| Audit Logging | Via wrapper | Custom API wrapper around OpenAI calls |
Legal Compliance Guardrail Example
Purpose: Prevent AI from outputting unauthorized changes to privileged documents.
Implementation Options:
- OpenAI Moderation API: Use for input/output content filtering
- Custom API Wrapper: Intercept requests/responses, log, validate
- Post-Processing Script: Run output through validation before use
Example: Audit Logging Wrapper (Conceptual)
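One way to sketch the wrapper in Python, assuming the official `openai` SDK (`client.chat.completions.create`). The audit-record fields and the choice to store only a SHA-256 hash of the prompt are illustrative design decisions, not a compliance standard:

```python
import hashlib
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("openai_audit")

def make_audit_record(model: str, prompt: str,
                      prompt_tokens: int, completion_tokens: int) -> dict:
    """Build an audit entry; the prompt is stored only as a hash (no PII)."""
    return {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "model": model,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
    }

def logged_chat_completion(prompt: str, model: str = "gpt-4o") -> str:
    """Call the Chat Completions API and log an audit record for the call."""
    from openai import OpenAI  # lazy import; requires OPENAI_API_KEY
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    record = make_audit_record(
        model, prompt,
        response.usage.prompt_tokens,
        response.usage.completion_tokens,
    )
    audit_log.info(json.dumps(record))
    return response.choices[0].message.content
```

Routing every OpenAI call through a function like `logged_chat_completion` is what makes the "wrapper not applied to all code paths" failure mode (see Troubleshooting) auditable.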
Additional Guardrail Use Cases
Citation Verification:
- Add instruction: "For any legal citation, note that verification in Westlaw/Lexis is required."
- Post-process output to flag citation patterns for human review
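A minimal post-processing sketch for the citation-flagging step. The regex patterns are rough heuristics for a few common US citation shapes, not a complete citator; extend them for your practice area:

```python
import re

# Rough heuristics for common US citation shapes; tune for your jurisdiction.
CITATION_PATTERNS = [
    re.compile(r"\b\d+\s+U\.S\.\s+\d+"),            # e.g., 410 U.S. 113
    re.compile(r"\b\d+\s+F\.(?:2d|3d|4th)\s+\d+"),  # federal reporters
    re.compile(r"\b\d+\s+U\.S\.C\.\s+§+\s*\d+"),    # statutes
]

def flag_citations(text: str) -> list[str]:
    """Return citation-like strings in model output for human verification."""
    hits: list[str] = []
    for pattern in CITATION_PATTERNS:
        hits.extend(pattern.findall(text))
    return hits
```

Anything this function returns should go on a verification list for Westlaw/Lexis review, never straight into a filing.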
Confidentiality Check:
- Use input moderation to detect potential client identifiers
- Add instruction: "Do not include client names or matter identifiers in output."
Privilege Protection:
- Restrict which Custom GPTs can access privileged project folders (via integration design)
- Log all document access for compliance review
Guardrail Implementation Checklist
Before deploying legal AI with guardrails:
- Input moderation: Moderation API or custom wrapper blocks inappropriate content
- Output verification: Instruction to "note that all citations require Westlaw/Lexis verification"
- Audit logging: All API calls logged with timestamp, model, token count, prompt hash (no PII)
- Confidentiality: Instructions prohibit client names and matter IDs in output
- Approval gates: For privileged folders, require explicit user approval before document access
- PII detection: Input/output scanned for identifiers; redact or block as needed
- Rate limiting: Throttle high-volume calls to avoid cost spikes and abuse
OpenAI Moderation API Quick Reference
Categories include harassment, hate, sexual, violence, self-harm, and others. See the Moderation API reference for the full list. For legal workflows, focus on preventing accidental data leakage rather than content moderation.
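A small sketch of a moderation check, assuming the `openai` Python SDK's `client.moderations.create` endpoint and the `omni-moderation-latest` model name; verify both against the current API reference:

```python
def flagged_categories(categories: dict) -> list[str]:
    """Return the names of moderation categories flagged True."""
    return sorted(name for name, flagged in categories.items() if flagged)

def moderate(text: str) -> list[str]:
    """Run text through the Moderation API; return any flagged categories."""
    from openai import OpenAI  # lazy import; requires OPENAI_API_KEY
    client = OpenAI()
    response = client.moderations.create(
        model="omni-moderation-latest",
        input=text,
    )
    return flagged_categories(response.results[0].categories.model_dump())
```

An empty list from `moderate` means no category was flagged; your own PII and privilege checks still need to run separately, since the Moderation API does not cover them.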
Part 4: Multi-Agent Legal Workflows
Understanding Multi-Agent Patterns
OpenAI supports multi-agent workflows via:
- Assistants API: Multiple assistants with different instructions; your code coordinates runs across threads
- Agents SDK: Handoffs between specialized agents (see OpenAI Agents SDK for JS/TS; openai-agents-python for Python)
- Orchestration: Your code coordinates multiple API calls (Assistants or Chat Completions)
Example: Parallel Due Diligence Review
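A minimal orchestration sketch using Python's `concurrent.futures`: your code fans documents out to a review function in parallel. The system prompt and model name inside `review_document` are placeholders:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def review_document(doc: str) -> str:
    """Review one document via the OpenAI API (prompt is a placeholder)."""
    from openai import OpenAI  # lazy import; requires OPENAI_API_KEY
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Summarize key risks in this contract."},
            {"role": "user", "content": doc},
        ],
    )
    return response.choices[0].message.content

def parallel_review(documents: list[str], reviewer: Callable[[str], str],
                    max_workers: int = 4) -> list[str]:
    """Fan documents out to `reviewer` in parallel; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(reviewer, documents))
```

Passing the reviewer in as a parameter keeps the orchestration testable without API calls, and lets you swap in a wrapped, audit-logged reviewer in production.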
Example: Research + Draft Workflow
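A sequential-handoff sketch: a research agent's findings feed a drafting agent. `make_agent` and its system prompts are illustrative helpers, not part of the OpenAI API:

```python
from typing import Callable

def research_then_draft(question: str,
                        researcher: Callable[[str], str],
                        drafter: Callable[[str, str], str]) -> str:
    """Sequential handoff: the researcher's findings feed the drafter."""
    findings = researcher(question)
    return drafter(question, findings)

def make_agent(system_prompt: str) -> Callable[[str], str]:
    """Build a single-turn agent from a system prompt (calls the OpenAI API)."""
    def agent(user_content: str) -> str:
        from openai import OpenAI  # lazy import; requires OPENAI_API_KEY
        client = OpenAI()
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_content},
            ],
        )
        return response.choices[0].message.content
    return agent

# Hypothetical usage:
# researcher = make_agent("You are a legal research assistant. List authorities.")
# draft_agent = make_agent("You draft client memos from research findings.")
# memo = research_then_draft(
#     "Is a liquidated damages clause enforceable in Delaware?",
#     researcher,
#     lambda q, findings: draft_agent(f"Question: {q}\nFindings: {findings}"),
# )
```

For richer handoffs (shared state, tool use), the Agents SDK mentioned above is the more structured option.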
Part 5: Packaging Skills for Distribution
Custom GPT Sharing
- Private: Only you can use
- Organization: Share with ChatGPT Team/Enterprise
- Public: Publish to GPT Store (use with caution for legal workflows)
Assistants API for Programmatic Use
For firm-wide deployment:
- Create Assistants via API with your skill instructions
- Store assistant IDs in your application
- Use threads for matter isolation
- Add file search for knowledge base
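The steps above can be sketched with the SDK's `client.beta.assistants.create` call. The name, instructions, and `file_search` tool shown are assumptions to adapt; verify the beta namespace against the current Assistants API reference:

```python
def build_assistant_config(name: str, instructions: str,
                           model: str = "gpt-4o") -> dict:
    """Assemble the keyword arguments for creating a firm Assistant."""
    return {
        "name": name,
        "instructions": instructions,
        "model": model,
        "tools": [{"type": "file_search"}],  # enables knowledge-base retrieval
    }

def create_firm_assistant(name: str, instructions: str) -> str:
    """Create the Assistant and return its ID for storage in your application."""
    from openai import OpenAI  # lazy import; requires OPENAI_API_KEY
    client = OpenAI()
    assistant = client.beta.assistants.create(
        **build_assistant_config(name, instructions)
    )
    return assistant.id
```

Store the returned ID in your application's configuration, and create one thread per matter to keep conversations isolated.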
Skill Package Structure (Assistants API)
When packaging skills for programmatic use, organize assets as:
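One possible layout; all file names here are suggestions, not a required convention:

```text
contract-review-skill/
├── instructions.md        # System instructions (the skill definition)
├── knowledge/
│   ├── playbook.json      # Standard positions, risk thresholds
│   ├── clause-library.md  # Approved redline language
│   ├── good-review-example.md
│   └── bad-review-example.md
├── deploy.py              # Creates/updates the Assistant via the API
└── VERSION                # Version and date for change tracking
```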
Distribution Checklist
Before sharing a Custom GPT or Assistant:
- Instructions reviewed for accuracy and firm standards
- Knowledge files (playbook, clause library) are current
- No client names or matter IDs in any uploaded content
- Sharing scope appropriate (Private vs Organization)
- Version/date noted in GPT name or description
Part 6: Security and Compliance
Skill and GPT Security
- Source verification: Only install skills or GPTs from trusted sources
- Code review: Review all integration code and API wrappers before deployment
- No client data: Never include client names, matter IDs, or privileged content in GPT instructions or knowledge files
- Version control: Track changes to instructions; note version/date in GPT name or description
- Access control: Limit who can modify firm GPTs; use Organization sharing, not Public
Data Protection Patterns
API Wrapper for Audit + Sanitization (conceptual):
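A conceptual sanitizing wrapper in Python. The matter-ID pattern is an assumed format (e.g., ABC-12345); replace it with your firm's actual numbering scheme, and treat all three regexes as starting points rather than complete PII detection:

```python
import re

# Simple patterns for common identifiers; extend for your own schemes.
REDACTIONS = [
    (re.compile(r"\b[A-Z]{2,4}-\d{4,6}\b"), "[MATTER-ID]"),  # assumed format
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[EMAIL]"),
]

def sanitize(text: str) -> str:
    """Redact likely identifiers before text crosses the firm's boundary."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

def sanitized_completion(prompt: str, model: str = "gpt-4o") -> str:
    """Sanitize input, call the API, then sanitize the output as well."""
    from openai import OpenAI  # lazy import; requires OPENAI_API_KEY
    client = OpenAI()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": sanitize(prompt)}],
    )
    return sanitize(response.choices[0].message.content)
```

Sanitizing both directions matters: the output pass catches identifiers the model echoes back or reconstructs from context.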
Compliance Requirements Checklist
- Skills/GPTs reviewed by IT security
- Guardrails tested in sandbox environment
- Audit logging enabled and retention policy defined
- Client data segregation verified (no cross-matter leakage)
- Access controls configured for Custom GPTs and Assistants
- Backup and recovery procedures documented
Part 7: Troubleshooting
Common Issues
| Issue | Cause | Fix |
|---|---|---|
| Skill not activating | Instructions too long or vague | Put activation criteria at top; use clear "Apply when" section |
| Inconsistent output | No output format in instructions | Add "OUTPUT FORMAT" with required structure |
| Guardrail bypassed | Wrapper not applied to all code paths | Ensure all OpenAI calls go through wrapper |
| Assistants timeout | Long-running runs | Use polling with backoff; consider chunking work |
| Wrong model behavior | Conflicting instructions | Simplify; remove redundant or contradictory rules |
| Token limit exceeded | Large context + long output | Summarize intermediate steps; use multiple runs |
| Custom GPT ignores knowledge files | Files not uploaded or wrong format | Re-upload; use supported formats (JSON, MD, TXT); check file size limits |
| Moderation API false positives | Legal terminology flagged | Use custom wrapper to whitelist known terms; log for review |
Assistants API Debugging Tips
- Check `run.status` and `run.last_error` for failed runs
- Use the `include` parameter to retrieve message content and tool calls
- For function calling, verify tool definitions match the expected schema
- Log `usage` (prompt_tokens, completion_tokens) to tune context size
- For file search: ensure files are attached to the assistant; verify the vector store is built
Custom GPT Debugging Tips
- Test with minimal instructions first, then add complexity
- If output is truncated, add "Provide complete response" or split into smaller requests
- Verify knowledge file content is referenced in instructions (e.g., "Apply positions from playbook.json")
When to Use Custom GPTs vs Assistants API
| Use Case | Prefer | Reason |
|---|---|---|
| Ad-hoc legal review | Custom GPT | No code; quick setup |
| Firm-wide workflow | Custom GPT (Org) | Easy distribution |
| Automated pipelines | Assistants API | Programmatic control |
| Multi-step orchestration | Assistants API | Threads, tools; or Agents SDK for handoffs |
| Audit/compliance requirements | Assistants API + wrapper | Full control over logging |
Example: Output Format Instruction
Add to Custom GPT or Assistant instructions for consistent structure:
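A sample instruction block; the headings and columns are suggestions to adapt to your review process:

```text
OUTPUT FORMAT (use these exact headings):

## Summary
One-paragraph overview of the document and overall risk level.

## Findings
| # | Clause | Issue | Risk (H/M/L) | Recommended Redline |

## Escalations
Items requiring partner review, with reasons.

## Verification Note
All citations require verification in Westlaw/Lexis before use.
```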
Do This Now
- Create a custom skill (Custom GPT) for one of your firm's review processes
- Add at least one guardrail (audit logging or output verification)
- Test a multi-step workflow for document processing
- Document your skill so your team can use it
- Consider packaging as shared Custom GPT for distribution
Quick Reference: OpenAI Commands
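A condensed sketch of the Python SDK calls used in this tutorial, gathered in one place. Treat every call signature here as something to verify against the current API reference before relying on it:

```python
def quick_reference() -> None:
    """The SDK calls used in this tutorial (requires OPENAI_API_KEY)."""
    from openai import OpenAI  # pip install openai
    client = OpenAI()

    # Chat completion
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this clause..."}],
    )

    # Moderation
    client.moderations.create(
        model="omni-moderation-latest", input="text to check"
    )

    # Assistants: create assistant, thread, message, then run
    assistant = client.beta.assistants.create(
        name="Contract Review", instructions="...", model="gpt-4o"
    )
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content="Review the attached contract."
    )
    client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id=assistant.id
    )
```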
Related family pages
- Claude Skills and Hooks - Same concepts with Claude
- Core Concepts - Platform-neutral legal workflow model
Sources
- OpenAI Assistants API (verify current API status)
- OpenAI Custom GPTs
- OpenAI Moderation API
- OpenAI Moderation API Reference
- OpenAI Function Calling
- OpenAI Agents SDK (Handoffs) (JS/TS; Python: openai-agents-python)