⚠ 17 Considerations Before Going Live
Key Considerations on the Path to Production
Before any enterprise client commits to production, they will probe across five dimensions. Understanding these concerns — and having clear, specific answers — is what moves the conversation from demo to deployment.
🛡 Safety & Guardrails 🔍 Observability 🔒 Data & Compliance ⚙ Integration & Reliability 👥 People & Change
🛡 Safety & Guardrails High Priority
Hallucination & Wrong Decisions
🤔
"What happens if the AI extracts the wrong auth number or routes to the wrong vendor?"
Every extracted field carries a confidence score. Low-confidence fields are flagged for human review before the episode proceeds. No routing happens on unconfirmed critical fields.
Confidence Scoring
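The gating described above can be sketched roughly as follows. This is a minimal illustration, not the product's implementation: the 0.90 threshold, the field names, and the `ExtractedField` type are all assumptions, since the source does not specify them.

```python
from dataclasses import dataclass

# Hypothetical values: the source does not state a threshold or field names.
REVIEW_THRESHOLD = 0.90
CRITICAL_FIELDS = {"auth_number", "vendor_id"}


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float        # 0.0 to 1.0, as reported by the extraction step
    confirmed: bool = False  # set to True once a human has reviewed the field


def fields_needing_review(fields: list[ExtractedField]) -> list[str]:
    """Return names of fields below the threshold that no human has confirmed."""
    return [
        f.name
        for f in fields
        if f.confidence < REVIEW_THRESHOLD and not f.confirmed
    ]


def can_route(fields: list[ExtractedField]) -> bool:
    """Allow routing only when no critical field is still awaiting review."""
    pending = set(fields_needing_review(fields))
    return not (pending & CRITICAL_FIELDS)
```

In this sketch a low-confidence authorization number blocks routing until a coordinator marks it confirmed; non-critical fields are still flagged but do not block.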
Scope Boundaries & Guardrails
"How do we stop the agent from doing something it shouldn't — taking actions outside its role?"
Agents operate within strictly defined action scopes — each can only read/write specific data fields and call approved integrations. No open-ended tool access. Actions are whitelisted, not blacklisted.
Action Whitelisting
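One way to picture deny-by-default action scoping is the sketch below. The agent names, action names, and the `authorize` helper are hypothetical and only illustrate the idea of an explicit allow list per agent.

```python
# Hypothetical agent and action names, used only for illustration.
ALLOWED_ACTIONS: dict[str, frozenset[str]] = {
    "intake_agent": frozenset({"read_referral", "write_extracted_fields"}),
    "outreach_agent": frozenset({"read_gaps", "send_email"}),
}


class ActionNotAllowed(Exception):
    """Raised when an agent requests an action outside its scope."""


def authorize(agent: str, action: str) -> None:
    """Deny by default: anything not explicitly listed is rejected."""
    allowed = ALLOWED_ACTIONS.get(agent, frozenset())
    if action not in allowed:
        raise ActionNotAllowed(f"{agent} may not perform {action}")
```

Because unknown agents map to an empty set, a newly added agent can do nothing until someone lists its actions explicitly.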
Prompt Injection & Adversarial Input
💀
"Can a bad actor manipulate the agent through a crafted fax or voice message?"
Input sanitization runs before any LLM processing. Structured extraction schemas reject free-form instructions embedded in documents. Voice inputs are transcribed then validated against field patterns — not executed as commands.
Input Sanitization
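Validating extracted values against field patterns might look like the sketch below. The patterns and field names are assumptions (real authorization-number formats vary by carrier); the point is that a value either matches its expected shape or is rejected, so embedded instructions are never treated as commands.

```python
import re

# Hypothetical field patterns; real formats vary by carrier.
FIELD_PATTERNS = {
    "auth_number": re.compile(r"[A-Z]{1,3}-\d{4,10}"),
    "date_of_injury": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def validate_fields(raw: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Split extracted values into accepted and rejected.

    A value is accepted only if its field is in the schema and the whole
    value matches that field's pattern. Everything else is rejected.
    """
    accepted: dict[str, str] = {}
    rejected: dict[str, str] = {}
    for name, value in raw.items():
        cleaned = value.strip()
        pattern = FIELD_PATTERNS.get(name)
        if pattern is not None and pattern.fullmatch(cleaned):
            accepted[name] = cleaned
        else:
            rejected[name] = value
    return accepted, rejected
```

Using `fullmatch` rather than `search` matters here: a valid-looking token followed by free-form text fails validation instead of passing on a partial match.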
Human Override & Kill Switch
🛑
"If something goes wrong, can we stop it immediately? Who has that authority?"
Any coordinator can pause, override, or reassign any episode at any point. A global pause halts all in-flight agent actions within seconds. Episode state is preserved, so a human picks up where the agent left off.
Human Override
🔍 Observability High Priority
Real-Time Visibility
📊
"We need to see what every agent is doing, right now, not after the fact."
A live ops dashboard shows every episode in-flight: current step, last action, elapsed time, and next scheduled action. Coordinators see the queue in real time with drill-down to field-level detail.
Live Dashboard
Decision Traceability
🧾
"If a claim is questioned, can we explain exactly what the agent did and why?"
Every agent action is logged with source attribution — which document, which field, which model call produced each decision. The full chain is exportable for audits, disputes, or regulatory review.
Full Audit Log
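A log entry with source attribution could be shaped roughly like this. The field names and the JSON Lines export format are assumptions for illustration; the source only states that document, field, and model call are recorded and that the chain is exportable.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    # All field names here are hypothetical.
    episode_id: str
    action: str
    source_document: str
    source_field: str
    model_call_id: str
    timestamp: str


def make_entry(episode_id: str, action: str, source_document: str,
               source_field: str, model_call_id: str) -> AuditEntry:
    """Create an immutable entry stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    return AuditEntry(episode_id, action, source_document,
                      source_field, model_call_id, now)


def export_jsonl(entries: list[AuditEntry]) -> str:
    """Serialize entries as one JSON object per line for audit export."""
    return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in entries)
```

Keeping entries immutable and append-only is a common choice for audit trails, since a record that can be edited later is hard to defend in a dispute.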
Alerting & Anomaly Detection
🔔
"How do we know when something is going wrong before it becomes a problem?"
Configurable alerts fire on anomalies: unusual gap rates, failed outreach cycles, episodes stuck in one state too long, or extraction accuracy falling below threshold. Alerts route to Slack, email, or PagerDuty.
Threshold Alerts
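A threshold check of this kind can be sketched as below. The metric names and limits are placeholders, not values from the product, and delivery to Slack, email, or PagerDuty is left out.

```python
# Hypothetical metric names and limits; real values would be configured per client.
THRESHOLDS: dict[str, tuple[str, float]] = {
    "gap_rate": ("max", 0.25),
    "failed_outreach_cycles": ("max", 3),
    "minutes_in_state": ("max", 240),
    "extraction_accuracy": ("min", 0.95),
}


def breached(metrics: dict[str, float]) -> list[str]:
    """Return the names of metrics that crossed their configured limit."""
    alerts = []
    for name, (kind, limit) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            continue  # metric not reported in this window
        if (kind == "max" and value > limit) or (kind == "min" and value < limit):
            alerts.append(name)
    return alerts
```

Each metric carries its own direction, so "too high" checks (gap rate, time in state) and "too low" checks (extraction accuracy) share one code path.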
🔒 Data & Compliance Critical
HIPAA & PHI Handling
🏥
"This is workers' comp health data. Where does PHI go? Who can see it? How long is it kept?"
PHI never leaves the client's environment. The agent runs in client-owned cloud infrastructure. All data is encrypted at rest and in transit. Retention policies are configurable. BAA available on request.
HIPAA Ready
LLM Data Privacy
🤖
"Does our patient data get sent to OpenAI? Can it be used to train their models?"
The client uses their own Azure OpenAI or OpenAI enterprise contract, which includes data processing agreements. PHI is never sent to shared endpoints. Zero data retention options are available.
Client LLM Contract
Regulatory & State WC Rules
"Workers' comp rules vary by state. How does the agent know which rules apply?"
A carrier and state rules engine is embedded in the intelligence layer. Auth requirements, fee schedules, and documentation rules are configurable per carrier and jurisdiction — updated as regulations change.
Rules Engine
Access Control & Permissions
🔑
"Not everyone should see every claim. How is access controlled?"
Role-based access control at the episode level. Coordinators see their assigned queue. Managers see all. Adjusters and carriers see only their claims. SSO integration with client identity provider (Okta, Azure AD).
RBAC + SSO
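The visibility rules above reduce to a small check per user and episode. In this sketch the `User` and `Episode` types are hypothetical, and matching adjusters and carriers to claims by carrier name is an assumption; the source only says they see "their claims."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    episode_id: str
    coordinator: str  # name of the assigned coordinator
    carrier: str


@dataclass(frozen=True)
class User:
    name: str
    role: str          # "coordinator", "manager", "adjuster", or "carrier"
    carrier: str = ""  # only meaningful for adjuster and carrier roles


def can_view(user: User, episode: Episode) -> bool:
    """Apply the role rules; unknown roles see nothing."""
    if user.role == "manager":
        return True
    if user.role == "coordinator":
        return episode.coordinator == user.name
    if user.role in {"adjuster", "carrier"}:
        # Assumption: claims are matched to these roles by carrier.
        return bool(user.carrier) and episode.carrier == user.carrier
    return False
```

In a deployment with SSO, the role and carrier values would typically come from claims issued by the identity provider rather than from a local table.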
⚙ Integration & Reliability Operational
Legacy System Integration
🖥
"We run Guidewire / Applied / a custom claims system. Can this connect?"
The agent outputs a structured JSON episode record via REST API or file drop — format configurable to match any downstream system. Pre-built connectors for Guidewire, Mitchell, and major WC platforms.
API + File Drop
Fallback & Failure Handling
🔄
"What happens when the LLM is down, the fax fails, or a call doesn't connect?"
Every agent step has a defined fallback: LLM timeout retries with exponential backoff; failed calls escalate to email then to a human task; fax parse failures queue for manual review. Nothing silently drops.
Defined Fallbacks
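The retry-with-exponential-backoff step for LLM timeouts can be sketched as follows. The attempt count and base delay are assumptions, and `call_with_backoff` is a hypothetical helper rather than part of any named library.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    attempts: int = 4,                        # hypothetical default
    base_delay: float = 1.0,                  # seconds; hypothetical default
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call fn, retrying on TimeoutError with delays of 1x, 2x, 4x, ... base_delay.

    The final TimeoutError is re-raised so the caller can run the next
    fallback (for example, queueing the item for a human).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except TimeoutError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Production versions usually add random jitter to the delay so that many episodes retrying at once do not hit the provider in lockstep; that is omitted here to keep the sketch short. The `sleep` parameter exists so the function can be exercised without real waiting.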
SLA & Performance
"What uptime and response time can you guarantee? What are the SLAs?"
99.9% uptime SLA for the agent orchestration layer. Fax-to-extraction p95 under 90 seconds. Outreach email dispatched within 5 minutes of gap detection. SLA breach alerts fire to ops before the client notices.
99.9% SLA
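To make the 99.9% figure concrete, the allowed downtime can be computed directly. The measurement window is an assumption here; the source does not say whether the SLA is measured monthly or annually.

```python
def downtime_budget_minutes(uptime_pct: float, days: int) -> float:
    """Minutes of downtime permitted over `days` at the given uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_pct / 100)


# 30 days is 43,200 minutes, so 99.9% uptime allows about 43.2 minutes of downtime.
monthly = downtime_budget_minutes(99.9, 30)

# 365 days is 525,600 minutes, so the same SLA allows about 525.6 minutes (8.76 hours).
yearly = downtime_budget_minutes(99.9, 365)
```

The window matters in practice: a single 45-minute outage would breach a monthly 99.9% commitment but fit comfortably inside an annual one.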
👥 People & Change Management Underestimated
Staff Adoption & Trust
🤝
"Our coordinators have been doing this for years. They won't trust a machine to do their job."
The agent handles data collection and outreach — coordinators handle exceptions, relationships, and judgment calls. Framed correctly, it removes the frustrating parts of the job, not the expertise. Pilot rollout builds trust gradually.
Augmentation, Not Replacement
Training & Onboarding
📚
"How long does it take to train our team? What does the learning curve look like?"
Coordinators need to learn one thing: when and how to intervene. The interface is designed around the exception queue — what needs a human, not the full pipeline. Most staff are productive within one week.
1-Week Onboarding
Measuring & Proving ROI
📈
"How do we show leadership this is working? What metrics prove the value?"
Built-in KPI tracking: cycle time per referral, gap resolution rate, outreach success rate, cost per episode, and coordinator capacity freed. Baseline vs. post-deployment comparison report generated automatically at 30/60/90 days.
30/60/90 Day Report
4 Safety & Guardrail Controls
3 Observability Capabilities
4 Compliance & Data Controls
3 Integration & Reliability Answers
3 Change Management Strategies