
Peter the Safety Agent: Scaling Safety and Consistency at Cando Rail & Terminals

Cando Rail & Terminals and Fractional AI launched Peter the Safety Agent to transform a manual risk assessment process with gen AI. Peter improves safety, reduces overhead by 20+ hours per assessment, and keeps human experts 100% in charge of every safety decision – setting the foundation to 6x Cando’s risk assessment program in just 17% of the time.

Who is Cando Rail & Terminals?

Cando Rail & Terminals (Cando) is one of North America’s largest owners and operators of first and last mile rail infrastructure. Cando keeps supply chains moving safely across North America, handling more than 2 million railcars carrying $100 billion in products every year.

Cando designs and operates custom rail solutions at its owned network of terminals or in customers’ rail yards, helping customers solve critical supply chain challenges (think: operating in-plant rail networks or managing last-mile freight).

At the heart of every operation is Cando’s award-winning safety culture, ensuring people, communities, and the environment are protected while maintaining reliability.

The challenge: scaling Cando’s safety-first culture  

Thorough risk assessments are central to Cando’s safety program. 

These structured evaluations analyze potential risks, the severity of those risks, and potential mitigating behaviors, helping answer questions like:

  • Should we introduce a new tool at rail yards?
  • What are the risks of opening operations in a new location?
  • What is the risk of having members of the public on our site?
  • What is the risk of stationing potash?

Each assessment focuses on preventing accidents and ensuring the safety of employees, customers, contractors, property, and the environment.

Cando set an ambitious goal: increase the number of annual risk assessments sixfold. But the existing process, while highly effective, came with significant administrative burden. Long meetings, manual documentation, and weeks of back-and-forth slowed things down and kept subject matter experts from doing what they do best.

The Solution: Peter the Safety Agent

We partnered with Cando to develop Peter the Safety Agent.

How Peter works:

  • Automates pre-work – Produces a first draft of the risk assessment in seconds at a cost of <$0.05 per generation.
  • Documents the risk assessment meeting – Transcribes and updates the risk assessment draft during the meeting of subject matter experts, turning conversations into structured documentation.
  • Preserves expert focus – Keeps the subject matter experts’ time centered on critical safety issues rather than administrative overhead. 

Our goal wasn’t to automate away the essential risk assessment meetings themselves, but to arm Cando’s team with better preparation, stronger documentation, and faster turnaround times surrounding these meetings, so the focus stays where it matters most: safety.

Impact at a glance

Peter the Safety Agent improves consistency, scales safety, and reduces overhead, enabling an 83% reduction in the time needed to 6x the number of risk assessments.

Before Peter

  • 4–16 hours of meetings per assessment for up to 10 SMEs
  • 2+ weeks of back-and-forth to finalize the assessment
  • Manual prework limited by each individual employee’s prior context
  • Heavy reliance on PDFs and manual cross-referencing

After Peter

  • Draft assessments in seconds for <$0.05 per assessment
  • 20+ hours of SME time saved per assessment
  • Consistent output with thorough expert review
  • AI assistant consolidates all historical context to support institutional learning

Looking under the hood

Project setup

  • Data: Anonymized historical risk assessments; Cando Safe Work Procedures; local operating instructions; emergency plans; training and efficiency testing data; incident and hazard reporting; and more—structured/unstructured safety data that Peter the Safety Agent can reference and link from its assessments.

  • Models & voice: GPT-4.1 for prework automation; OpenAI Realtime API for voice interaction; Recall.ai for meeting audio access; MS Teams integration to surface the AI assistant in live sessions.

Peter the Safety Agent: two distinct AI workflows

1 - Automating risk assessment drafts (meeting pre‑work):

  • Input: Starts with project name and detailed description (e.g., “introducing torque gun at rail yards”)
  • Output: Suggests risks, severity/frequency, and remedial measures as a draft risk assessment ahead of the expert meeting
  • Architecture: A LangChain agent flow with tools to access past risk assessments, company procedures, and industry safety standards, plus user-supplied, project-specific PDF context.
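To make the flow concrete, here is a framework-agnostic sketch of a tool-calling prework agent. The tool names, data structures, and the stubbed risk are illustrative only, not Cando's actual implementation; in production, the tools would query real assessment and procedure stores and an LLM call would propose the draft content.

```python
# Illustrative sketch of a tool-calling prework agent.
# Tool names, prompts, and data structures are hypothetical.

from dataclasses import dataclass, field

@dataclass
class DraftAssessment:
    project: str
    # Each entry: (risk, severity, frequency, remedial measures)
    risks: list = field(default_factory=list)

def search_past_assessments(query: str) -> list[str]:
    """Select prior assessments by their human-written titles/descriptions."""
    # In production this would query an assessment store; stubbed here.
    return [f"Past assessment matching: {query}"]

def lookup_procedures(topic: str) -> list[str]:
    """Fetch relevant Safe Work Procedures for a topic; stubbed here."""
    return [f"Procedure covering: {topic}"]

def generate_draft(project_name: str, description: str) -> DraftAssessment:
    """Gather context via tools, then (in production) ask the LLM for a draft."""
    context = search_past_assessments(description) + lookup_procedures(description)
    draft = DraftAssessment(project=project_name)
    # Placeholder for the LLM call that would propose risks, severity/frequency,
    # and remedial measures from `context`; one hypothetical risk is stubbed in.
    draft.risks.append(("pinch points from torque gun", "major", "occasional",
                        ["two-person verification", "PPE check"]))
    return draft

draft = generate_draft("Torque gun introduction",
                       "introducing torque gun at rail yards")
```

The agent's job is assembling the right context before drafting; the expert meeting remains where severity and mitigations are actually decided.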

2 - In‑Meeting AI voice assistant:

  • Input: meeting conversation; draft risk assessment from prework
  • Process:
    • Listens to meeting conversations
    • Classifies exchanges (e.g., questions, direct prompts to the bot)
    • Responds by speaking, searching, or editing the shared draft live with experts
  • Output: Updated risk assessment draft refined in real time throughout the meeting
  • Architecture: See diagram.
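The listen–classify–respond loop above can be pictured with a minimal sketch. The classification labels, the keyword-based classifier, and the handlers are illustrative stand-ins; in the real system an LLM performs the classification and the actions run through the voice and document channels.

```python
# Hypothetical sketch of the in-meeting routing loop:
# classify each utterance, then route it to a gated action.

def classify(utterance: str) -> str:
    """Toy classifier; in production an LLM returns the label."""
    text = utterance.lower()
    if text.startswith("peter"):
        return "direct_prompt"     # addressed the bot by name
    if "?" in text:
        return "question"
    return "discussion"            # background conversation

def route(utterance: str, draft: list[str]) -> str:
    label = classify(utterance)
    if label == "direct_prompt":
        return "speak"             # answer aloud via the voice channel
    if label == "question":
        draft.append(f"Open question: {utterance}")
        return "edit_draft"        # capture it in the shared draft
    return "silent"                # do nothing; experts keep talking

shared_draft = []
assert route("Peter, what did last year's assessment say?", shared_draft) == "speak"
assert route("What severity should we assign here?", shared_draft) == "edit_draft"
assert route("Let's move to the next risk.", shared_draft) == "silent"
```

The key property is that every utterance passes through classification before any action, so the assistant participates without dominating the meeting.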

Lessons Learned

  • Evals: We used LLM‑as‑judge to compare generated risk lists (and severity/frequency ratings) against historic gold standards; aggregating judgments across assessments produced stable scoring.
  • Generic RAG wasn’t always effective: we found that selecting past risk assessments based on their human-written titles and descriptions outperformed similarity search with RAG. Leveraging the meaningful summaries already provided by humans gave the LLM better context than generic RAG retrieval.
  • Response gating: we found that the voice agent loved to talk. We managed this by gating its actions. 
    • Non-reasoning models like the realtime agent struggle with conditional and nuanced instructions (e.g., "only speak when spoken to" but also "continue responding if already in a conversation"). 
    • These rules require classification before action, but the model can’t perform explicit reasoning, and exposing its chain-of-thought to the user isn’t acceptable in a conversational voice setting.
    • Our solution was to separate classification from execution: the model outputs a classification via a tool call, and OpenAI’s primitives enforce the corresponding action.
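A minimal sketch of that separation, with illustrative label and action names (the actual tool schema and enforcement live in the OpenAI Realtime integration): the model emits a structured classification via a tool call, and application code, not the model, maps the label to an allowed action.

```python
# Sketch of classification-before-action gating (names are illustrative).
# The model never free-forms a decision to speak; it emits a tool call
# with a label, and application code enforces the mapped action.

ALLOWED_ACTIONS = {
    "addressed_directly": "respond_aloud",
    "mid_conversation":  "respond_aloud",   # keep answering once engaged
    "background_talk":   "stay_silent",
}

def handle_tool_call(tool_name: str, args: dict) -> str:
    """Application-side enforcement of the classification tool call."""
    if tool_name != "classify_exchange":
        return "stay_silent"                # unknown tool: fail closed
    label = args.get("label", "background_talk")
    return ALLOWED_ACTIONS.get(label, "stay_silent")

# Simulated tool calls as the model might emit them:
print(handle_tool_call("classify_exchange", {"label": "addressed_directly"}))  # respond_aloud
print(handle_tool_call("classify_exchange", {"label": "background_talk"}))     # stay_silent
```

Because the action table lives in code, nuanced rules like "only speak when spoken to, but continue an ongoing conversation" become deterministic lookups rather than instructions the model must interpret on the fly.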