computerweekly.com

SLM series - Nooks: Downsizing AI without shrinking its smarts

_This is a guest post for the Computer Weekly Developer Network by [Nikhil Cheerla](https://www.linkedin.com/in/nikhilcheerla/) in his role as founder & CTO of Nooks, an AI assistants company for sales organisations._

_Cheerla writes in full as follows…_

Small Language Models (SLMs) have emerged as a major development in AI, particularly for building personalised assistants.

Over the last year, I’ve seen two major trends reshaping AI assistants: achieving the same intelligence at lower cost and latency, and the advancement of reasoning capabilities that allow systems to explore solutions more deeply than before.

Evaluating whether an SLM, LLM, or a combination is most appropriate for your use case is all about finding the right match. Intelligent routing systems can play matchmaker, directing queries to the most suitable model based on complexity and domain specificity.

SLMs shine when handling specialised tasks like sentiment analysis or customer support automation, while their bigger siblings excel at more complex, general-purpose challenges. Many successful relationships take a hybrid approach, where SLMs handle routine queries and LLMs step in for the heavy lifting.
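The routing idea above can be sketched in a few lines. This is a minimal illustration, not a production router: the task list, the word-count threshold and the `route_query` helper are all hypothetical stand-ins for what would, in practice, be a trained classifier or embedding-based dispatcher.

```python
from dataclasses import dataclass

# Hypothetical set of specialised tasks an SLM handles well.
SPECIALISED_TASKS = {"sentiment", "support", "triage"}

@dataclass
class Route:
    model: str   # "slm" or "llm"
    reason: str

def route_query(query: str, task: str = "") -> Route:
    """Send routine, domain-specific queries to the SLM; escalate the rest."""
    if task in SPECIALISED_TASKS:
        return Route("slm", f"specialised task: {task}")
    # Crude complexity proxy: long or open-ended queries go to the LLM.
    if len(query.split()) > 50 or "explain" in query.lower():
        return Route("llm", "complex or open-ended query")
    return Route("slm", "routine query")

print(route_query("Classify this review", task="sentiment").model)          # slm
print(route_query("Explain the trade-offs of edge deployment").model)       # llm
```

In a real system the heuristics would be learned rather than hard-coded, but the shape is the same: cheap dispatch first, expensive model only when the query earns it.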

My vision for the future is that everyone will carry a small language model in their pocket, trained and fine-tuned specifically for their unique needs.

It will be like having a personal AI companion.

One of the most attractive qualities of SLMs is their faster training times, allowing you to iterate quickly and fine-tune models for specific domains or user preferences. This means you can engage in rapid prototyping and experimentation without the long-term commitment required by larger models.

If you need to adapt to new data or requirements, SLMs are flexible enough for frequent updates. If you’re looking for something special, they’re perfect for customisation in niche applications or industries. When AI is personal, fast and doesn’t need extensive prompt tuning to perform well, it delivers far more value to the end user.

Deciding where your SLM should reside is a crucial choice that affects everything from performance to privacy.

On-premises deployment provides enhanced data security, which is particularly valuable in regulated industries like healthcare and finance, where privacy concerns are paramount. For those seeking more flexibility while maintaining control, private cloud solutions offer an attractive middle ground. Edge deployment shines for applications where every millisecond counts, enabling real-time responses with minimal latency.

These deployment options enable SLMs to function almost like a personal companion that’s always with you, rather than a distant service you have to reach out to. Traditional AI assistance often feels generic precisely because it lacks the personal dimension and immediate availability that comes from thoughtful deployment.

Let’s be honest about what SLMs can and cannot do. These models may struggle with tasks requiring deep contextual understanding, where LLMs generally offer better performance on complex, general-purpose challenges. That said, don’t underestimate the little guys – the performance gap between SLMs and LLMs has narrowed significantly in recent years, with some SLMs achieving comparable results in specific domains. I’ve noticed that larger reasoning models sometimes behave like temperamental artists, producing different outputs each time for the same structured problem – sometimes brilliant, sometimes not so much.

This unpredictability makes them less ideal for assistant functions where reliability is important. SLMs, properly fine-tuned, can deliver more consistent responses within their domain of expertise, like that reliable friend who might not know everything but never lets you down on topics they understand well.

In practice, we’ve found that reasoning models are best used to define workflows or templates, while SLMs handle the day-to-day execution. Think of it as a consulting relationship – bring in the expensive expert to set up your strategy, then let your dedicated team implement it. You can have a large reasoning model analyse examples and create a framework, then use a small language model to implement it daily. Reasoning is valuable, but you don’t always need to reinvent the wheel for every interaction. This hybrid approach combines the best of both worlds: deep intelligence when needed and responsive, personalised assistance for daily use.
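The "expensive expert sets the strategy, the team executes it" split can be sketched as below. Both `call_reasoning_model` and `call_slm` are hypothetical stubs standing in for real API calls; the point is the division of labour, not the model interfaces.

```python
def call_reasoning_model(examples: list) -> str:
    """One-off, expensive call: distil examples into a reusable template.

    Stubbed here; in practice this would invoke a large reasoning model.
    """
    return "Summarise the call in one sentence, then list next steps for {rep}."

def call_slm(prompt: str) -> str:
    """Cheap, fast per-interaction call; stubbed for illustration."""
    return f"[SLM completion for: {prompt}]"

# Phase 1 (rare): the reasoning model analyses examples and emits a template.
template = call_reasoning_model(["call transcript A", "call transcript B"])

# Phase 2 (daily): the SLM fills the template for every interaction,
# so the expensive reasoning step is amortised across thousands of calls.
def handle_interaction(rep: str) -> str:
    return call_slm(template.format(rep=rep))

print(handle_interaction("Dana"))
```

The template is computed once and cached; each daily interaction only pays the SLM's cost, which is the economic argument for the hybrid approach.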

The shift toward personal AI assistants powered by small language models represents a different way of thinking about human-AI interaction. As these models continue to improve while remaining efficient enough for local deployment, we’ll increasingly see AI that feels less like accessing a remote service and more like working with a knowledgeable colleague who understands your specific needs and communication style.

