Checkr is a technology company built around background checks and workforce verification. Processing checks for over 100,000 hiring teams, including Uber and Lyft, Checkr offers various screenings, such as criminal, civil, and global checks for over 200 countries.
Checkr wanted a way for their entire revenue operations team to be able to query Salesforce data using natural language, and get back accurate information in Claude. The types of queries included account research to prepare for calls, analyzing pipeline opportunities, and creating pricing quotes.
Giving an LLM direct access to Salesforce data through an MCP server sounds like it should solve this. But when Checkr evaluated Claude against their raw Salesforce API, the model returned accurate answers only 13% of the time. It pulled from the wrong fields, queried the wrong objects, and misunderstood core business definitions.
As Zoë Mckenzie of Checkr’s RevOps team puts it:
“There’s a lot of information in there that is the wrong information if you don’t know where you’re looking. AI will return very correct-sounding answers that are not correct. RevOps’ whole goal is to never have that conversation where it’s like, my dashboard says this and your dashboard says that.”
Checkr has three separate Salesforce instances, one for each product line, with three different data models, three sets of segment definitions, and three definitions of core concepts.
In addition, Salesforce’s API exposes far more than any user actually sees in their interface. Checkr’s team deliberately curates the UI, leaving off-screen many fields that exist only to support backend automations, even though users technically retain access to them. When an AI tool queries the raw API, it sees everything and can’t tell the curated, authoritative data from the backend data, which also makes wrong answers very hard to debug.
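The gap between the raw API and the curated UI can be illustrated with a small allowlist filter. This is a minimal sketch, not Checkr’s or Credal’s implementation: the field names, the `CURATED_FIELDS` set, and the `curate` helper are all hypothetical.

```python
# Hypothetical sketch: every field name below is illustrative, not Checkr's schema.
# The allowlist stands in for the fields a RevOps team shows in the curated UI.
CURATED_FIELDS = {"Name", "Segment__c", "AnnualRevenue"}


def curate(record: dict) -> dict:
    """Keep only the fields treated as authoritative; drop backend-only fields."""
    return {key: value for key, value in record.items() if key in CURATED_FIELDS}


# A raw API response mixes curated fields with fields that only feed automations.
raw_record = {
    "Name": "Acme Corp",
    "Segment__c": "Strategic",
    "AnnualRevenue": 1_200_000,
    "Legacy_Segment_Sync__c": "Mid-Market",  # backend field, off-screen in the UI
    "Routing_Score_Internal__c": 0.42,       # backend field, off-screen in the UI
}

curated_record = curate(raw_record)
```

Without a filter like this, a model sees two plausible "segment" fields on the same record and has no way to know which one the business treats as authoritative.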
This ambiguity had always been a RevOps headache, but the rollout of Claude turned it into an urgent problem. Suddenly, far more employees were querying Salesforce data through AI tools, and they were getting answers back that sounded authoritative but were often wrong.
So how can businesses ensure that users are making decisions on accurate, high-quality data? What’s the delta between raw LLM output on Salesforce data and the desired, accurate outcome? The delta is critical business context: definitions, workflows, and SOPs that spell out when a case should be escalated, why routing works the way it does, and what an MQL means.
Working with Credal, Checkr built three separate Credal MCP servers, one per Salesforce instance. They are separate because the same term can mean completely different things depending on the business unit: a “strategic segment” in Product Line 1 is not a “strategic segment” in Product Line 2, and what counts as a “Sales Qualified Lead (SQL)” in one instance is an “opportunity” in another. Each MCP server has its own set of instructions so that it returns the right data for its instance.
Each MCP server is grounded in Confluence documentation that defines what the data actually means for its business unit, and is paired with governed Salesforce actions for querying that instance’s data.
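One way to picture the per-instance setup is as a separate context object for each Salesforce instance, from which that server’s instructions are built. The sketch below is an assumption for illustration only: `InstanceContext`, `CONTEXTS`, `build_instructions`, the instance keys, and the placeholder definitions are hypothetical and do not reflect Credal’s actual configuration format or Checkr’s real definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceContext:
    """Business context for one Salesforce instance (hypothetical structure)."""
    instance: str
    definitions: dict[str, str]


# Placeholder definitions: the real ones would come from each unit's documentation.
CONTEXTS = {
    "product_line_1": InstanceContext(
        instance="product_line_1",
        definitions={"strategic segment": "placeholder definition for Product Line 1"},
    ),
    "product_line_2": InstanceContext(
        instance="product_line_2",
        definitions={"strategic segment": "placeholder definition for Product Line 2"},
    ),
}


def build_instructions(product_line: str) -> str:
    """Assemble the instruction text for one instance's server.

    An unknown product line raises KeyError rather than falling back to a guess.
    """
    ctx = CONTEXTS[product_line]
    lines = [f"You are querying the {ctx.instance} Salesforce instance."]
    for term, meaning in ctx.definitions.items():
        lines.append(f'In this instance, "{term}" means: {meaning}.')
    return "\n".join(lines)


instructions_1 = build_instructions("product_line_1")
instructions_2 = build_instructions("product_line_2")
```

Keeping the definitions in one object per instance means the same term never has to be disambiguated at query time: the server a user connects to already determines which meaning applies.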
Checkr’s RevOps team saw an immediate jump in the reliability of AI-driven Salesforce answers: data accuracy rose from 13% to 86%, a gain of 73 percentage points.
By grounding every query in business-unit-specific context, the MCP combats the “correct-sounding but wrong” answers that AI tools previously produced. In addition, rather than guessing, the Credal MCP surfaces two or three clarifying questions before executing and lets the user choose. Results arrive with their assumptions and limitations stated alongside them, directly addressing the “my dashboard says this, yours says that” problem.
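The clarify-then-answer behavior described above can be sketched as a handler that returns either a question or an answer carrying its own caveats. This is a hedged illustration, not Credal’s API: the `Clarification` and `Answer` types, the `handle` function, the trigger condition, and all strings are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Clarification:
    """A question returned instead of an answer when the query is ambiguous."""
    question: str
    options: list[str]


@dataclass
class Answer:
    """A result that carries its assumptions and limitations alongside the value."""
    value: str
    assumptions: list[str]
    limitations: list[str]


def handle(query: str, segment_definition: Optional[str] = None) -> Union[Clarification, Answer]:
    """Ask before executing if the query depends on an unresolved definition."""
    if "segment" in query.lower() and segment_definition is None:
        return Clarification(
            question="Which segment definition should apply?",
            options=["Product Line 1 definition", "Product Line 2 definition"],
        )
    # Placeholder for the governed query; no real Salesforce call is made here.
    assumptions = []
    if segment_definition is not None:
        assumptions.append(f"Segment interpreted using: {segment_definition}")
    return Answer(
        value="<query result placeholder>",
        assumptions=assumptions,
        limitations=["Illustrative only; no live data was queried."],
    )


first = handle("Show open pipeline for the strategic segment")
second = handle("Show open pipeline for the strategic segment", "Product Line 1 definition")
```

In this sketch the first call returns a `Clarification` and the second, once the user has chosen a definition, returns an `Answer` whose assumptions record that choice, so two people comparing numbers can see why their results differ.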
Before building with Credal, Checkr weighed the obvious paths. Each addressed part of the problem; none fit a company running three Salesforce instances and a fast-growing mix of AI tools.
Zoë first explored Claude’s native plugins and skills. The limitations surfaced quickly.
First, there was no way to collaborate. Zoë could build a plugin for herself, but couldn’t share that configuration with her team, couldn’t centrally manage who had access to what, and couldn’t ensure that every user across RevOps was getting the same functionality.
Would a skill combined with an MCP suffice? Checkr decided against it because ensuring that the skill activates consistently for every user is nearly impossible. A plugin also felt like the wrong fit: a user who used the connector independently of the plugin would lose critical data.
Second, the tool-level controls weren’t granular enough. With three separate Salesforce instances, Zoë needed to specify exactly which instance each plugin should point to. Claude’s plugin model didn’t support that kind of routing. And there was no way to enforce how skills were read, meaning the carefully curated instructions about which fields were authoritative and how segment definitions differed couldn’t be guaranteed to actually shape the AI’s behavior.
Finally, plugins only work in Claude. Zoë needed something that would deliver the same governed, context-rich experience whether someone was working in Claude, Cursor, Lovable, or Slack.
Salesforce now ships hosted MCP servers and an Agent Fabric registry as part of Agentforce, letting developers expose an org’s APIs, Flows, and prompt templates to AI agents. However, Agentforce’s governance only applies to Agentforce-built agents. Most AI traffic to Salesforce happens outside the Agentforce perimeter, and Checkr wanted centralized governance across all agents, across all teams.
In addition, Salesforce’s native tooling governs within a single org. It doesn’t provide a governed, cross-instance routing layer for external AI surfaces. Salesforce’s hosted MCP exposes the org’s API surface, but doesn’t carry curated instructions about which fields are authoritative or how segment definitions differ across business units. Credal offers editable MCP instructions that save that knowledge directly.
A basic in-house server connecting one tool to one instance is achievable. The trap is maintenance, which scales non-linearly with Checkr’s reality of three instances, multiple AI surfaces, and constantly evolving tooling. Every Salesforce schema change, every MCP protocol update, every newly adopted AI tool means more engineering work. And governance (audit logging, human-in-the-loop approval, permissions mirroring, rate limiting) would all have to be built from scratch and maintained indefinitely. A custom server designed for Claude also has to be re-engineered for Cursor, Lovable, and anything else, whereas Credal publishes one configuration to many surfaces. Notably, the observability Checkr relied on to debug bad answers and iterate on instructions would itself be a bespoke build: logging pipelines, dashboards, and alerting.
Reach out to sales@credal.ai to see how Credal’s MCP can bring the same accuracy gains to your team.