Discussion about this post

The function calling workaround is clever. Building a text-based command system to simulate tool use is exactly the kind of hack that makes open-weight models practical for real agent workflows.
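That kind of text-based command protocol can be surprisingly simple. A minimal sketch of the idea, assuming a made-up `TOOL_CALL {...}` convention and an illustrative `search` handler (the format, tool names, and functions here are my assumptions, not the post's implementation):

```python
import json
import re

# Hypothetical convention: the model is prompted to request tools by emitting
# a line like: TOOL_CALL {"name": "search", "args": {"query": "..."}}
TOOL_CALL_RE = re.compile(r'TOOL_CALL\s+(\{.*\})', re.DOTALL)

def search(query: str) -> str:
    # Stand-in for a real retrieval backend.
    return f"results for {query!r}"

TOOLS = {"search": search}  # illustrative tool registry

def dispatch(model_output: str):
    """Parse a TOOL_CALL out of raw model text and run the matching handler."""
    match = TOOL_CALL_RE.search(model_output)
    if not match:
        return None  # plain answer, no tool requested
    call = json.loads(match.group(1))
    return TOOLS[call["name"]](**call["args"])

print(dispatch('TOOL_CALL {"name": "search", "args": {"query": "agentic RAG"}}'))
```

The appeal is that it needs nothing from the model beyond following a prompt format, which is why it works even on models without native function calling.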

Since you wrote this, the function-calling situation has improved quite a bit. DeepSeek's later releases handle it natively, and there are services now that present these models through an Anthropic-compatible API, so Claude Code works with them out of the box. I documented the routing setup here: https://reading.sh/how-to-get-3x-claude-rate-limits-for-30-a-month-1d3fdb8658df

Your Agentic RAG architecture would actually work well with a multi-model setup. Route reasoning to DeepSeek R1 and retrieval coordination to something faster. Have you tried splitting tasks across models like that?
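To make the suggestion concrete, here is a rough sketch of such a split, with a crude keyword heuristic deciding which model handles a task. The model names and routing keywords are purely illustrative assumptions on my part:

```python
# Route deliberate reasoning to a slow, strong model and retrieval
# coordination to a fast one. Names and keywords are hypothetical.
MODEL_FOR = {
    "reasoning": "deepseek-r1",       # multi-step reasoning
    "retrieval": "fast-small-model",  # query planning, tool dispatch
}

REASONING_KEYWORDS = ("why", "prove", "analyze", "plan")

def route_task(task: str) -> str:
    """Pick a model based on a simple task-type heuristic."""
    kind = (
        "reasoning"
        if any(kw in task.lower() for kw in REASONING_KEYWORDS)
        else "retrieval"
    )
    return MODEL_FOR[kind]

print(route_task("Analyze the failure modes of this design"))   # deepseek-r1
print(route_task("Fetch the top 3 documents about vector DBs")) # fast-small-model
```

In practice you would probably replace the keyword check with a cheap classifier call, but even this naive split keeps the expensive model off the routine retrieval hops.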
