When Claude Code's 510,000 lines of source code were accidentally leaked on GitHub, the entire developer community realized: making an AI Agent actually do useful work requires enough code to rebuild half an operating system. This explains why, after a leading manufacturer spent 2 million RMB deploying a private LLM, the procurement team's price comparison work still relies on a 3,000-RMB-per-month intern manually copying data between Excel spreadsheets -- the AI can recite every raw material specification, but when facing supplier quotes sent via WeChat groups, it cannot even auto-fill a simple form.
510K lines: Claude Code source reveals Agent engineering complexity
73%: Manufacturing AI projects stalled at POC stage
0 times: Daily automated approval executions by private LLM
This is not a computing power problem. The 70B-parameter model deployed by that manufacturer achieved over 92% accuracy on professional domain questions. But when the business department asked "can it automatically compare PDF quotes from three suppliers and generate procurement recommendations," the IT department's solution was: have AI generate comparison text, then have someone manually copy it into the ERP. It's like buying a CNC machine for your factory, only to discover it can only be used as a calculator.
MCP Protocol Is Connected, But the Business Decision Chain Is Broken
MCP (Model Context Protocol), heavily promoted this year, was expected to be a game-changer. The official SDK, modelcontextprotocol/python-sdk, rapidly accumulated 12K GitHub stars and promised to seamlessly connect LLMs to enterprise systems through standardized interfaces. When an auto parts manufacturer piloted it in their procurement department, they did successfully connect the ERP, email system, and internal price comparison database via MCP.
However, an absurd scene unfolded in the first week: the AI read quotes from three suppliers, generated a flawless comparative analysis report in the system, then stopped and waited for "manual confirmation." MCP had bridged the data interfaces, but it hadn't solved the decision chain problem -- the AI didn't know that when price differences are within 5%, the long-term partner should be prioritized, nor that a specific raw material requires quality department clearance before proceeding through procurement. These rules are scattered across dozens of Excel spreadsheets and WeChat group announcements -- no matter how powerful MCP is, it cannot read tacit human agreements.
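To make the point concrete, the two tacit rules mentioned above can be written down as ordinary code once someone states them explicitly. The following is only a minimal sketch: the `Quote` class, the `QUALITY_GATED_MATERIALS` set, the material name, and the `recommend` function are all hypothetical names invented for illustration, and the reading of "within 5%" as "at most 5% above the cheapest quote" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    supplier: str
    unit_price: float
    long_term_partner: bool = False


# Hypothetical list of materials that need quality department sign-off.
QUALITY_GATED_MATERIALS = {"resin-A"}


def recommend(material: str, quotes: list, quality_cleared: bool,
              tolerance: float = 0.05) -> str:
    """Apply the two explicit rules and return a supplier name or a block reason."""
    if not quotes:
        raise ValueError("at least one quote is required")
    # Rule 1: gated materials stop here until quality has cleared them.
    if material in QUALITY_GATED_MATERIALS and not quality_cleared:
        return "BLOCKED: awaiting quality department clearance"
    cheapest = min(quotes, key=lambda q: q.unit_price)
    # Rule 2: a long-term partner wins if priced within `tolerance` of the cheapest.
    limit = cheapest.unit_price * (1 + tolerance)
    partners = [q for q in quotes if q.long_term_partner and q.unit_price <= limit]
    if partners:
        return min(partners, key=lambda q: q.unit_price).supplier
    return cheapest.supplier
```

For example, with quotes of 100 (new supplier) and 104 (long-term partner), the partner is recommended; at 106 the partner falls outside the 5% band and the cheaper supplier is chosen.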
From API to Action: The Overlooked Last Mile
Technical teams often mistakenly believe that providing an API means AI can achieve automation. In reality, manufacturing procurement comparison involves: PDF parsing (unstructured data) -> specification matching (domain knowledge) -> price calculation (numerical operations) -> supplier rating (historical data queries) -> approval triggering (workflow engine) -> result notification (cross-system sync). Exception handling at any step (e.g., handwritten modifications in a PDF) requires a fallback mechanism. Low-code platforms from the open-source community like Dify (GitHub 85k+ stars) can easily build RAG (Retrieval-Augmented Generation) workflows for the first three steps, but when it comes to complete closed-loop cross-system transaction processing, their visual orchestration often falls short.
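The chain described above, with a fallback at every step, might be sketched roughly as follows. This is an illustrative skeleton under stated assumptions, not a real integration: `StepFailed`, `run_pipeline`, and the two stub steps are hypothetical, and only two of the six stages are stubbed out to keep the example short.

```python
from typing import Callable


class StepFailed(Exception):
    """Raised by a step that cannot safely handle its input."""


Step = Callable[[dict], dict]


def parse_pdf(payload: dict) -> dict:
    # Stub: a real parser would read the file; here the rows are hard-coded.
    if payload.get("handwritten"):
        raise StepFailed("handwritten modification detected")
    return {**payload, "rows": [{"sku": "X1", "unit_price": 10.0, "qty": 3}]}


def calc_price(payload: dict) -> dict:
    # Deterministic arithmetic over the parsed rows.
    total = sum(r["unit_price"] * r["qty"] for r in payload["rows"])
    return {**payload, "total": total}


STEPS = [("parse_pdf", parse_pdf), ("calc_price", calc_price)]


def run_pipeline(steps: list, payload: dict) -> dict:
    """Run steps in order; on the first failure, hand the case to a human."""
    for name, step in steps:
        try:
            payload = step(payload)
        except StepFailed as exc:
            return {"status": "needs_human", "failed_step": name,
                    "reason": str(exc), "payload": payload}
    return {"status": "done", "payload": payload}
```

The design choice worth noting is that a failed step returns the partially processed payload along with the reason, so the person who picks it up does not start from zero.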
The POC Curse of Procurement AI
Let's take a closer look at the failed 2-million-RMB LLM case. During the POC (Proof of Concept) phase, the IT team tested with 100 historical procurement records, and the AI's performance was flawless: it could identify tables in PDFs, extract prices, calculate discounts, and even catch calculation errors in a supplier's tax-inclusive pricing. But when the system connected to live business flows, problems erupted.
Supplier quote formats were inconsistent -- some were scanned documents, others were WeChat chat screenshots converted to PDF. Some listed shipping costs separately in a notes column, others hid payment term discounts on page two. Even worse, someone might send a quick message in the WeChat group: "If you pay in full upfront, we can knock off another 3% from the total" -- this unstructured, context-dependent, constantly changing business information left the well-trained LLM with nothing useful to say.
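One way to see why these scattered details matter is to force every quote into the same shape before comparing. The sketch below is an assumption-laden illustration: the `NormalizedQuote` fields are hypothetical, discounts are assumed to be additive fractions applied to the goods total only, and shipping is assumed not to be discounted. A real contract may define these differently.

```python
from dataclasses import dataclass


@dataclass
class NormalizedQuote:
    supplier: str
    goods_total: float
    shipping: float = 0.0          # e.g. found in a notes column
    payment_discount: float = 0.0  # fraction, e.g. found on page two
    prepay_discount: float = 0.0   # fraction, e.g. offered in a chat message

    def landed_cost(self, prepay: bool = False) -> float:
        """Total cost after discounts, plus shipping, rounded to 2 decimals."""
        discount = self.payment_discount
        if prepay:
            discount += self.prepay_discount
        return round(self.goods_total * (1 - discount) + self.shipping, 2)
```

A quote of 10,000 with 200 shipping and a 3% prepayment offer comes to 10,200 normally and 9,900 when paid upfront, which can change which supplier is cheapest.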
Ultimately, business users had to copy the AI's half-finished results, confirm details in WeChat groups, and manually enter data into the ERP. The entire process shrank from 4 hours to 3.5 hours -- the 30 minutes saved didn't even offset the cognitive overhead of constantly switching between systems.
The Digitalization Gap in Organizational Processes
The root problem is that enterprises treat deploying AI like onboarding a new employee rather than as an opportunity to restructure the organization. The 3,000-RMB intern can outperform the 2-million-RMB AI not because they're smarter, but because they can navigate between systems: getting verbal commitments from WeChat groups, doing ad-hoc calculations in Excel, getting face-to-face confirmation from supervisors for exceptions, and finally manually entering results into ERP. These "informal processes" constitute the real operational logic of the enterprise, but they are implicit, unstructured, and fundamentally anti-API.
When enterprises deploy private LLMs, they are essentially asking AI to adapt to existing digital ruins -- ERP systems from 20 years ago, OA systems from 10 years ago, SaaS purchased 5 years ago, and the ever-present Excel spreadsheets filling every gap. MCP standardized tool calling, but it did not standardize business processes. It's like installing a Formula 1 engine in a vintage car -- no matter how much power it has, it still won't race.
From Knowledge Base to Digital Colleague: Bridging the Gap
The real breakthrough is not in bigger models or more expensive GPUs, but in shifting from "knowledge base thinking" to "process engineering thinking." At FluxWise, when serving manufacturing clients, we've found that successful AI Agent projects all share one common trait: they don't try to make AI "understand" the business; they make AI "execute" the business.
Specifically, this means:
- Process Externalization: First, convert the unwritten rules in the intern's head (such as "must check with quality department first") into decision trees and state machines, rather than expecting the LLM to infer them from chat logs
- Human-Machine Collaboration Boundaries: Clearly define which actions AI can execute autonomously (such as data extraction, preliminary calculations) and which require human intervention (such as final approvals, exception handling)
- Incremental Automation: Start with automating single steps (such as auto-parsing PDF quotes), then gradually expand to the full chain, rather than pursuing an "omniscient procurement AI" from the outset
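The collaboration boundary in the second point can be kept as plain, reviewable data rather than buried in a prompt. A minimal sketch, assuming hypothetical action names and a deny-by-default policy for anything not listed:

```python
from enum import Enum


class Mode(Enum):
    AUTONOMOUS = "autonomous"
    HUMAN_REQUIRED = "human_required"


# Hypothetical policy table; the action names are illustrative only.
POLICY = {
    "extract_data": Mode.AUTONOMOUS,
    "preliminary_calculation": Mode.AUTONOMOUS,
    "final_approval": Mode.HUMAN_REQUIRED,
    "exception_handling": Mode.HUMAN_REQUIRED,
}


def may_run_unattended(action: str) -> bool:
    """Unknown actions fall back to human review (deny by default)."""
    return POLICY.get(action, Mode.HUMAN_REQUIRED) is Mode.AUTONOMOUS
```

Incremental automation then amounts to moving one entry at a time from `HUMAN_REQUIRED` to `AUTONOMOUS` as each step proves reliable.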
Deconstruct Rather Than Connect
Don't simply use MCP to connect existing systems. Instead, deconstruct current processes to identify the true decision nodes. For example, in procurement comparison, price calculation is deterministic (automatable), while supplier preference is ambiguous (requires human input).
State Machines Over Prompts
Following Claude Code's architecture, use code (such as Python state machines) for process control and LLMs for content understanding. Don't let the model decide "what to do next" -- let it focus on "what is the price in this text."
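As a rough illustration of this split, the sketch below keeps control flow in a small state machine and confines the "model" to a single extraction function. Everything here is hypothetical: the states, the `ProcurementFlow` class, and `stub_extract_price`, which uses a regular expression as a stand-in for an LLM call. It is not a description of how Claude Code itself is built.

```python
import re
from enum import Enum, auto
from typing import Callable, Optional


class State(Enum):
    RECEIVED = auto()
    EXTRACTED = auto()
    COMPARED = auto()
    AWAITING_APPROVAL = auto()


# Transitions are fixed data; the model never chooses the next state.
NEXT = {
    State.RECEIVED: State.EXTRACTED,
    State.EXTRACTED: State.COMPARED,
    State.COMPARED: State.AWAITING_APPROVAL,
}


def stub_extract_price(text: str) -> float:
    """Stand-in for an LLM call: return the first number in the text."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError("no price found")
    return float(match.group(0))


class ProcurementFlow:
    def __init__(self, quotes: dict, extract_price: Callable[[str], float]):
        self.state = State.RECEIVED
        self.quotes = quotes
        self.extract_price = extract_price
        self.prices: dict = {}
        self.cheapest: Optional[str] = None

    def step(self) -> State:
        if self.state is State.RECEIVED:
            # The only place the "model" is consulted: text in, number out.
            self.prices = {s: self.extract_price(t) for s, t in self.quotes.items()}
        elif self.state is State.EXTRACTED:
            self.cheapest = min(self.prices, key=self.prices.get)
        elif self.state is State.AWAITING_APPROVAL:
            raise RuntimeError("waiting for human approval")
        self.state = NEXT[self.state]
        return self.state
```

Because the extractor is injected, the stub can be swapped for a real model call without touching the process logic, and the process logic can be tested without any model at all.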
Accept Imperfection in Your Digital Workforce
Interns make mistakes too, but they have error-correction mechanisms (asking their supervisor). AI Agents similarly need exception-handling fallbacks -- for example, automatically escalating to a human when confidence falls below 90%, rather than stubbornly giving a wrong answer.
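The escalation rule is simple enough to state in a few lines. In this sketch the `Extraction` class and `route` function are hypothetical, and the confidence score is assumed to be supplied by whatever extraction step runs upstream; how that score is produced and calibrated is outside the example.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.90  # the 90% cut-off mentioned above


@dataclass
class Extraction:
    name: str
    value: str
    confidence: float  # assumed to come from the upstream extraction step


def route(item: Extraction, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Accept confident results; send everything else to a person."""
    if item.confidence >= threshold:
        return "auto_accept"
    return "escalate_to_human"
```

For instance, `route(Extraction("unit_price", "12.50", 0.97))` is accepted automatically, while the same field at 0.62 confidence is escalated.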
Conclusion: Private Deployment Is Just the Entry Ticket
Claude Code's 510,000 lines of source code remind us that a reliable AI Agent is heavy-duty engineering, not prompt magic. For manufacturing enterprises, spending 2 million RMB on GPUs and licenses only buys the entry ticket. The real competition lies in whether you can transform business processes scattered across WeChat groups and Excel spreadsheets into digital workflows that AI can execute.
The 3,000-RMB intern won't stay cheap forever, but as long as AI can only serve as a "glorified search engine," they will remain irreplaceable for the next three years. AI will only start doing real work, instead of merely answering questions in a chat box, once enterprises realize that deploying an LLM is not the finish line but the starting point for digitizing organizational processes.



