A Model Context Protocol server that gives Claude, Cursor, ChatGPT — anything that speaks MCP — full mastery over a Salesforce org through one search() and one execute() call.
search() fuses four layers: local catalog grep, 6,410 chunks of official Salesforce docs in Pinecone, live Tavily web search, and live introspection of your connected org. Cohere then reranks the fused candidates. execute() runs Python in a sandbox with a 25-second timeout against a pre-authed sf SDK. A workflow that would take around 100 endpoint-level tool calls typically collapses to about 5.
Most Salesforce MCP servers expose every REST endpoint as its own tool. The model sees a wall of 800+ function signatures, burns tokens parsing them, and still can't compose multi-step workflows without losing context.
This server flips it: hand the model one tool that runs code against a pre-authed SDK, and one tool that actually finds the right method. The model writes a short Python snippet that assigns its output to _result_, the server runs it in a sandbox, and you get the result. SOQL, REST, Tooling, Composite, Bulk, Metadata: all through the same call. The fused search() finds the right SDK method and a clonable org template in one hop, so the model usually finishes the whole task in 4–6 tool calls instead of a tangled 100-step dance.
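The snippets further down this page all bind their output to _result_. A minimal sketch of that convention follows, assuming the server reads _result_ back out of the executed namespace. FakeSf and run_snippet are hypothetical names invented for illustration, and a bare exec() is not a sandbox: isolation and the 25-second limit are deliberately left out.

```python
from typing import Any


class FakeSf:
    """Hypothetical stand-in for the pre-authed SDK; returns canned rows."""

    def query(self, soql: str) -> list[dict[str, Any]]:
        # Made-up row for illustration only, not real org data.
        return [{"Name": "Acme renewal", "Amount": 120000}]


def run_snippet(code: str, sf: Any) -> Any:
    """Run model-written code and return whatever it bound to _result_.

    Plain exec() provides no isolation; a real server would add a sandbox
    and a time limit around this step.
    """
    namespace: dict[str, Any] = {"sf": sf}
    exec(code, namespace)
    # Snippets that never assign _result_ yield None.
    return namespace.get("_result_")


snippet = '_result_ = sf.query("SELECT Name, Amount FROM Opportunity LIMIT 1")'
output = run_snippet(snippet, FakeSf())
print(output)
```

The point of the convention is that the model can run several statements in one call and still hand back a single, well-defined value.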
The model asks once. The server fans out across the local catalog, a 6,410-chunk RAG index of official Salesforce docs, the live web, AND your connected org — then Cohere reranks the fused candidates for relevance. The model gets a tight, ranked list with the right SDK method, the right doc snippet, AND a clonable template from your own org.
Catalog-only search: fast, but it knew only SDK method names, and it missed half the time when the docs called something by a different name.
Fused search: knows your SDK, the docs, what shipped last week, AND what already exists in your org, so the model can clone a working template.
Local catalog: keyword index over ~1,700 SDK methods, sObjects, and REST endpoints. Fast, deterministic baseline.
Docs index (Pinecone): official Salesforce developer docs covering Metadata, Apex, REST, Tooling, Connect, Agentforce, and GenAI. Semantic match on intent, not keywords.
Live web (Tavily): recent blog posts, Stack Exchange, release notes. Catches features the docs haven't caught up to.
Live org introspection: when you mention Bot, Flow, ApexClass, GenAiFunction… it lists existing instances in your org so the model can clone a working template.
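The fan-out and rerank flow across these four layers can be sketched roughly as below. This is an illustration under stated assumptions, not the server's implementation: the four layer functions are hypothetical stubs returning canned text, and overlap_score is a simple token-overlap stand-in for the Cohere reranker, used only so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Candidate = dict[str, str]
Layer = Callable[[str], list[Candidate]]


# Hypothetical stubs: a real layer would call the catalog, Pinecone,
# Tavily, or the connected org. The returned text is made up.
def catalog_layer(query: str) -> list[Candidate]:
    return [{"source": "catalog", "text": "sf.tooling.post deploy ApexClass"}]


def docs_layer(query: str) -> list[Candidate]:
    return [{"source": "docs", "text": "Tooling API ApexClass sObject reference"}]


def web_layer(query: str) -> list[Candidate]:
    return [{"source": "web", "text": "Release notes: new Agentforce topics"}]


def org_layer(query: str) -> list[Candidate]:
    return [{"source": "org", "text": "Existing ApexClass HelloMcp in your org"}]


def overlap_score(query: str, text: str) -> float:
    """Stand-in reranker: fraction of query tokens that appear in the text."""
    q_tokens = set(query.lower().split())
    t_tokens = set(text.lower().split())
    return len(q_tokens & t_tokens) / len(q_tokens) if q_tokens else 0.0


def fused_search(query: str, layers: list[Layer], top_k: int = 3) -> list[Candidate]:
    # Fan out: query every layer concurrently; map() keeps layer order.
    with ThreadPoolExecutor(max_workers=max(1, len(layers))) as pool:
        batches = list(pool.map(lambda layer: layer(query), layers))
    # Fuse: flatten all candidates into one pool.
    candidates = [c for batch in batches for c in batch]
    # Rerank: score every candidate against the query, best first.
    ranked = sorted(
        candidates,
        key=lambda c: overlap_score(query, c["text"]),
        reverse=True,
    )
    return ranked[:top_k]


results = fused_search(
    "deploy ApexClass",
    [catalog_layer, docs_layer, web_layer, org_layer],
)
for r in results:
    print(r["source"], "|", r["text"])
```

Reranking after fusion, rather than trusting each layer's own ordering, is what lets a doc snippet and an org template compete on the same relevance scale.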
Two end-to-end sweeps: 135 calls across REST / Tooling / Composite / Bulk / SOSL / Connect, and 117 calls exercising GenAI + Agentforce + AgentScript + LWC + prompt-template deploys. Both executed against a real production-grade org. Here are the results.
Everything the sweeps created was deployed, exercised, then cleanly torn down. Zero org pollution. Zero retry storms.
Full Apex deploy → run → cleanup roundtrip in 3 MCP calls totalling 3.5s. Deployed a class, executed it via executeAnonymous, queried results, deleted it. Zero manual cleanup.
25-second timeout enforced to the millisecond. FS writes blocked (read-only filesystem). SOQL injection attempts return clean errors. No 5xx. No 429s. No hung connections.
Win-rate calculation (closed-won / total-closed × 100) — normally 3 REST calls plus glue code — done in a single execute(). CALENDAR_MONTH histograms, GROUP BY / HAVING, all native.
Caveat: every failure across both sweeps was a caller-side mistake (wrong path, wrong field name, perm scope) — the MCP itself never broke. Full report on request.
Real execute() snippets the model writes. Copy any of them into your client (after connecting your org) and they'll just run.
"Show me my top opps closing this month."
_result_ = sf.query(
    "SELECT Name, StageName, Amount, CloseDate "
    "FROM Opportunity "
    "WHERE CloseDate = THIS_MONTH "
    "ORDER BY Amount DESC NULLS LAST LIMIT 10"
)
"Deploy this Apex class to my org."
_result_ = sf.tooling.post("sobjects/ApexClass", {
    "Name": "HelloMcp",
    "Body": "public class HelloMcp {\n"
            " public static String greet() { return 'Hello from MCP'; }\n"
            "}"
})
"Find me leads I haven't touched from last week."
_result_ = sf.query(
    "SELECT Id, Name, Company, Status, CreatedDate "
    "FROM Lead "
    "WHERE IsConverted = false AND CreatedDate = LAST_WEEK "
    "ORDER BY CreatedDate DESC"
)
"Run my Apex test suite and summarize the failures."
classes = sf.query(
    "SELECT Id, Name FROM ApexClass WHERE Name LIKE '%Test%'"
)
class_ids = [c["Id"] for c in classes]
run = sf.tooling.post("runTestsSynchronous", {
    "classids": ",".join(class_ids[:25]),
    "maxFailedTests": -1
})
_result_ = {
    "ran": run.get("numTestsRun"),
    "failed": run.get("numFailures"),
    "failures": [
        {"class": f.get("name"), "method": f.get("methodName"),
         "msg": f.get("message")} for f in (run.get("failures") or [])
    ]
}
"How bad is my case backlog right now?"
_result_ = sf.query(
    "SELECT Priority, Status, COUNT(Id) c "
    "FROM Case "
    "WHERE IsClosed = false "
    "GROUP BY Priority, Status "
    "ORDER BY Priority"
)
"Mark these 50 accounts as Tier 1."
accts = sf.query(
    "SELECT Id FROM Account WHERE AnnualRevenue > 10000000 LIMIT 50"
)
records = [{"attributes": {"type": "Account"}, "Id": a["Id"],
            "Tier__c": "Tier 1"} for a in accts]
_result_ = sf.rest.patch(
    "composite/sobjects",
    {"allOrNone": True, "records": records}
)
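The 50-record update above fits in one request, but the sObject Collections resource accepts at most 200 records per call, so larger updates need batching. A minimal sketch of that batching follows. FakeRest is a hypothetical stand-in for sf.rest that only records what it was sent, the Ids are fabricated placeholders, and the response shape it returns is illustrative rather than the SDK's actual return type.

```python
from typing import Any, Iterator

MAX_COLLECTION_SIZE = 200  # sObject Collections limit per request


def chunked(
    records: list[dict[str, Any]], size: int = MAX_COLLECTION_SIZE
) -> Iterator[list[dict[str, Any]]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


class FakeRest:
    """Hypothetical stand-in for sf.rest; logs each call instead of sending it."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, int]] = []

    def patch(self, path: str, body: dict[str, Any]) -> list[dict[str, Any]]:
        self.calls.append((path, len(body["records"])))
        return [{"id": r["Id"], "success": True} for r in body["records"]]


rest = FakeRest()

# 450 placeholder records with fabricated 18-character Ids.
records = [
    {"attributes": {"type": "Account"}, "Id": f"001{i:015d}", "Tier__c": "Tier 1"}
    for i in range(450)
]

results: list[dict[str, Any]] = []
for batch in chunked(records):
    # allOrNone applies within each request only, not across batches.
    results.extend(
        rest.patch("composite/sobjects", {"allOrNone": True, "records": batch})
    )

print(rest.calls)
```

Because allOrNone is scoped to a single request, a failure in a later batch does not roll back earlier ones; a caller that needs all-or-nothing semantics across more than 200 records has to handle that separately.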
The server itself is functionally complete and battle-tested against a real org. Before opening it up for general internal availability, we are deliberately gating it behind a proper user-level access and permission-enforcement layer. The goal is that every action performed by an LLM client is anchored to a real Salesforce user and respects that user's profile, permission sets, sharing rules, and field-level security (FLS) end-to-end.
No public sign-up, no waitlist, no production endpoint to point a client at right now. This page exists to explain what the two-tool design does, why two tools beat hundreds, and where it is going.
A purpose-built Model Context Protocol server that collapses the entire headless Salesforce platform into two well-designed tools. Inspired by the Stainless code-mode pattern: instead of generating hundreds of endpoint-specific wrappers, hand the model a real SDK and let it write code.
Built on FastAPI, deployed on Heroku, exercised against a real Salesforce production org. 252 end-to-end stress tests across REST, Tooling, Composite, Bulk, Connect, GenAI / Agentforce, AgentScript, and LWC deploys — with zero server failures attributable to the MCP itself.
Designed for enterprise architects, RevOps automation teams, and AI engineers who want a deterministic, audit-friendly bridge between LLM clients and Salesforce orgs.