
There’s a certain déjà vu in the way enterprises have embraced generative AI. Two years ago, large language models (LLMs) captured every boardroom’s imagination. They could draft, summarize, reason, even converse so naturally that organizations rushed to test them. Proofs of concept and pilots multiplied across departments. But somewhere between “look what this can do” and “let’s take this live,” the enthusiasm met its first roadblock: reality.
The Limits of the Lab
LLMs’ breadth of knowledge makes them ideal for experimentation – for answering “can this be done?” But the moment enterprises try to scale them into production – across business units, under real data governance, within finite budgets – the costs spiral and control falters.
The challenge isn’t capability; it’s context. General-purpose models don’t understand the nuances of enterprise data. They hallucinate and require months of validation to build trust. They also rely on external infrastructure that often sits outside compliance boundaries. That is why a promising demo can quickly become an expensive exercise in model management.
As one CIO told us recently, “We proved AI could work; now we need to prove it can work for us.”
From General to Contextual Intelligence
That question – how to make AI work for us – is driving the next phase of language model adoption in the enterprise. Most companies have realized that the future of enterprise AI isn’t about bigger models; it’s about purpose-built ones.
Small language models (SLMs) emerged from this very need. They’re compact, domain-tuned versions of large models, designed to understand enterprise-specific vocabulary, workflows, and data sensitivities. Their smaller size makes them cheaper to train and faster to deploy. Their narrow focus makes them less prone to hallucination. And because they can run within enterprise firewalls, they solve one of the biggest challenges of all – data sovereignty.
When Infosys built its foundational SLM with NVIDIA last year, the goal wasn’t to compete with the giants. It was to make AI usable where it matters most: inside regulated, cost-sensitive enterprise environments. The model now powers capabilities for platforms like Finacle, delivering contextual intelligence to banks across India, Africa, and Asia-Pacific – regions where data localization laws and infrastructure economics make hyperscale LLMs impractical.
The result: intelligence that’s fast, affordable, and compliant – a rare combination in enterprise AI.

The Hybrid Future: Where LLMs and SLMs Co-exist
So does this mean LLMs are obsolete? Not at all. The two model classes have distinct and complementary roles.
Large language models are invaluable for their breadth. They act as exploration engines, ideal for knowledge discovery, hypothesis generation, or working with unstructured, heterogeneous data. SLMs, on the other hand, are the exploitation engines: purpose-built for predictable, high-trust use cases that demand precision and control.
Enterprises need both. The smartest strategies today are hybrid by design: start broad, validate with an LLM, then distill that learning into an SLM that runs within enterprise boundaries. This “Poly AI” approach mirrors how innovation naturally matures – from curiosity to control, from open exploration to optimized execution.
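In practice, the distillation step often amounts to using the validated LLM as a teacher that labels real enterprise documents, then fine-tuning a small model on those pairs inside the firewall. The sketch below is a minimal illustration of that pattern only; `call_llm`, the task prompt, and the JSONL format are assumptions for the example, not any particular vendor’s API.

```python
# Hypothetical sketch of the "validate with an LLM, distill into an SLM" pattern.
import json

def call_llm(prompt: str) -> str:
    """Placeholder for the exploration-phase LLM (a hosted API in the pilot)."""
    raise NotImplementedError("wire up the prototype LLM here")

def build_distillation_set(documents: list[str], task_prompt: str) -> list[dict]:
    """Step 1: let the validated LLM label real documents, creating training pairs."""
    return [
        {"input": doc, "target": call_llm(f"{task_prompt}\n\n{doc}")}
        for doc in documents
    ]

def save_for_finetuning(pairs: list[dict], path: str) -> None:
    """Step 2: persist the pairs as JSONL, the common input format for
    fine-tuning a small, domain-tuned model hosted inside the enterprise."""
    with open(path, "w", encoding="utf-8") as f:
        for pair in pairs:
            f.write(json.dumps(pair) + "\n")
```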
EdgeVerve’s AI Next platform makes that lifecycle seamless. Through its Neural Connect layer, enterprises can experiment with any LLM, test feasibility, and then transition to SLMs without any disruption. The Fabric layer allows developers to encode the logic that bridges these two models and turns prototypes into scalable, compliant workflows. The result is freedom to innovate without being locked into a single model or vendor.
We’ve seen early success with this approach. A global asset manager, for example, wanted to automate the extraction of business rules from fund prospectuses – a task that demanded deep financial comprehension and absolute data confidentiality. An LLM prototype proved the concept, but token costs and compliance realities stopped it from scaling. By distilling the model into a domain-specific SLM, hosted privately on NVIDIA infrastructure and orchestrated through EdgeVerve AI Next, the company turned a costly proof of concept into a sustainable production system.
The same pattern repeated for a Middle Eastern public entity processing thousands of Arabic-language contracts. A public LLM worked in the pilot, but sovereign-data regulations ruled it out for deployment. A smaller, localized SLM – tuned for Arabic syntax and hosted within national borders – bridged the gap between innovation and implementation.
Across healthcare, insurance, and supply chain too, the story is similar. LLMs open possibilities; SLMs operationalize them.

Platforms as the New Edge
The real advantage, however, doesn’t lie only in the models; it lies in the platform that orchestrates them. Models will change faster than enterprises can adapt. As experts aptly say, “Data is permanent; models are perishable.” The question is no longer which model to choose; it’s how to stay agile when the next one arrives.
That’s where a platform-centric approach becomes indispensable – acting as a buffer between the volatility of AI innovation and the stability enterprises need. With a platform in place, organizations can swap models as needed – say, from GPT-4 to Llama 3 – without rewriting applications or re-certifying every workflow. Business users continue to consume “summarization” or “extraction” services as before, while the underlying intelligence evolves in the background.
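In code terms, that buffer is simply a stable capability interface with the model injected behind it. Here is a minimal sketch of the idea, assuming nothing about any specific platform’s API – `TextModel`, `SummarizationService`, and `StubModel` are all illustrative names for this example.

```python
from typing import Protocol

class TextModel(Protocol):
    """The narrow contract every backing model must satisfy."""
    def generate(self, prompt: str) -> str: ...

class SummarizationService:
    """Business-facing capability; its interface never changes when models do."""
    def __init__(self, model: TextModel) -> None:
        self._model = model  # injected: one LLM today, a domain SLM tomorrow

    def summarize(self, document: str) -> str:
        return self._model.generate(f"Summarize the following:\n\n{document}")

class StubModel:
    """Stand-in backend for illustration; a real adapter would call an actual model."""
    def generate(self, prompt: str) -> str:
        return prompt[:200]

# Swapping the underlying intelligence is a one-line wiring change;
# every caller of summarize() is untouched.
service = SummarizationService(StubModel())
```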
For Infosys Topaz and EdgeVerve, this synergy is deliberate. Topaz provides the foundational small language models and access to a vast partner ecosystem. EdgeVerve’s AI Next offers the orchestration layer that makes them enterprise-ready – unifying data, governance, and trust. Together, they form an ecosystem where innovation scales safely, compliantly, and at speed.
Why This Matters Now
As generative AI enters its second act, the pilot frenzy is giving way to questions of sustainability. Companies are taking a harder look at token economics, regulators are watching, and leaders are realizing that the real differentiator isn’t the size of their model but the resilience of their architecture. The shift from large to small, from single to hybrid, from model to platform is the transformation enterprises need to craft what’s next with AI.
In the coming year, we expect to see a clear divide between organizations still experimenting at the edges and those that have turned experimentation into their edge – leveraging hybrid architectures, embedding domain-specific intelligence, and scaling AI responsibly through platforms designed for permanence. Because in a world where models evolve every quarter, the true edge isn’t size or speed. It’s stability with adaptability – and that’s the language every enterprise now needs to speak.
Disclaimer: Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the respective institutions or funding agencies.

