Why liquid neural networks are interesting and why you should pay attention
- aubrey3218
- Jan 16
- 4 min read

Liquid neural networks (LNNs) were created by MIT researchers Ramin Hasani and Daniela Rus. In an era where OpenAI is building models with a trillion parameters, LNNs show the potential to drastically reduce parameter count: instead of trillions of parameters, models might deliver the same accuracy and performance with millions.
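Before comparing LNNs with transformers, it helps to see what makes a liquid neuron "liquid": its state follows an ordinary differential equation whose effective time constant depends on the current input, so the dynamics keep adapting as new data arrives. Below is a minimal sketch of one liquid time-constant (LTC) update step in the spirit of Hasani et al.'s formulation; the layer sizes, weight names, nonlinearity and the fused semi-implicit solver step are my own simplified rendering for illustration, not the authors' reference code.

```python
import numpy as np

def ltc_step(x, I, W_r, W_i, b, tau, A, dt=0.1):
    """One fused semi-implicit update of a liquid time-constant (LTC) cell.

    The ODE being discretized is dx/dt = -(1/tau + f) * x + f * A,
    where f depends on the current input, so the effective time
    constant tau / (1 + tau * f) changes as the input changes.
    """
    # Bounded, positive nonlinearity driven by the state and the input;
    # this input dependence is what makes the dynamics "liquid".
    f = 1.0 / (1.0 + np.exp(-(W_r @ x + W_i @ I + b)))
    # Fused explicit/implicit Euler step of the ODE above.
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

# Tiny usage example with made-up sizes and random weights.
rng = np.random.default_rng(0)
hidden, inputs = 8, 3
x = np.zeros(hidden)
params = dict(
    W_r=rng.normal(scale=0.1, size=(hidden, hidden)),
    W_i=rng.normal(scale=0.1, size=(hidden, inputs)),
    b=np.zeros(hidden),
    tau=np.ones(hidden),
    A=rng.normal(size=hidden),
)
for _ in range(20):                       # unroll over a short input sequence
    x = ltc_step(x, rng.normal(size=inputs), **params)
print(np.round(x, 3))
```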
One significant difference between transformers and LNNs is that the weights aren’t static. Classic transformer models like the ones behind ChatGPT lock their weights after alignment and refinement. Once the weights are locked, the model can’t learn new things and continue to evolve. To get around that, prompt engineering is used to improve the results. This is commonly called “in-context learning” and produces significant improvement in responses. There are benefits to frozen weights and in-context learning.
The first benefit is that the language model doesn’t memorize sensitive information from your data; in healthcare and banking, this reduces the chance of leaking sensitive data. The second is that it reduces the need to fine-tune a language model for simple use cases. The trade-off with in-context learning is that the prompt needs all the data and the question to produce a good answer. If the data is small and fits within the context, the language model will use it to improve the response. If the data is too large and exceeds the context size, the model won’t produce a good answer. “The data is in the document, but the chat doesn’t return the right answer” is a common complaint and a primary cause of failed projects. Researchers and engineers are working to increase context sizes, but larger contexts increase the processing and power needed to produce a good answer. Current context windows are listed below; a quick token count before stuffing a document into a prompt (see the sketch after the table) makes the limit concrete.
| Model Name | Version / Family | Developer | Context Window (Tokens) |
|---|---|---|---|
| Claude | 3.5 Sonnet | Anthropic | 200,000 |
| Claude | 3.7 Sonnet | Anthropic | 200,000 (1M beta) |
| Claude | 4.1 Opus / Sonnet | Anthropic | 1,000,000 |
| DeepSeek | R1 (Reasoning) | DeepSeek | 164,000 |
| DeepSeek | V3 | DeepSeek | 128,000 |
| Gemini | 1.5 Pro | Google | 2,000,000 |
| Gemini | 2.5 Pro | Google | 1,000,000+ |
| Gemini | 3 Pro | Google | 2,000,000+ |
| GPT | 4.1 | OpenAI | 1,047,576 |
| GPT | 4o | OpenAI | 128,000 |
| GPT | 5 | OpenAI | 400,000 |
| Llama | 3.1 (405B) | Meta | 128,000 |
| Llama | 4 Scout / Maverick | Meta | 128,000 |
| Mistral | Large 2 | Mistral AI | 128,000 |
| o1 / o3 / o4 | mini / Reasoning | OpenAI | 128,000 – 200,000 |
| Qwen | 3-Coder-480B | Alibaba | 262,000 (extensible to 1M) |
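To make the limit concrete, here is a small sketch that counts tokens before deciding whether a document fits in a model’s context window. It uses the tiktoken tokenizer for illustration; the 128,000-token budget, the reserved headroom, and the sample document are placeholder assumptions, and other model families use different tokenizers.

```python
import tiktoken

CONTEXT_WINDOW = 128_000        # assumed budget; use your model's real limit
RESERVED_FOR_ANSWER = 4_000     # leave room for the model's response

def fits_in_context(document: str, question: str) -> bool:
    """Return True if document + question fit the assumed context window."""
    enc = tiktoken.get_encoding("cl100k_base")   # GPT-4-era tokenizer, for illustration
    used = len(enc.encode(document)) + len(enc.encode(question))
    return used + RESERVED_FOR_ANSWER <= CONTEXT_WINDOW

# Placeholder document; in practice this would be the file the user uploaded.
doc = "Refund policy: items may be returned within 30 days. " * 10_000
if fits_in_context(doc, "What is the refund policy?"):
    print("Document fits; send it all in one prompt.")
else:
    print("Document is too large; chunk it or retrieve only the relevant parts.")
```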
Depending on the application, fine-tuning an LLM with LoRA is still needed. To avoid leaking sensitive data, data masking and cleansing techniques are used to remove details the model shouldn’t memorize; any data used to train a model ends up in the model and can be exploited by attackers. What about techniques like LNNs that don’t lock the weights? At this time, it isn’t clear whether liquid networks can avoid memorizing sensitive data: they continue to learn as users interact with the model, and users might provide sensitive information.
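As a rough illustration of the mask-then-fine-tune workflow, here is a minimal sketch using Hugging Face’s transformers and peft libraries. The regex patterns, the model id, and the LoRA rank settings are placeholder assumptions, not a vetted redaction pipeline or a recommended configuration.

```python
import re
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

def mask_sensitive(text: str) -> str:
    """Crude masking of emails and US SSNs before the text enters a training set."""
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    text = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]", text)
    return text

# Attach low-rank adapters so only a small fraction of weights are trained.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")  # placeholder model id
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(base, lora)
model.print_trainable_parameters()   # typically well under 1% of the base model

clean_sample = mask_sensitive("Contact jane.doe@example.com, SSN 123-45-6789.")
print(clean_sample)   # the masked text is what the fine-tuning dataset should contain
```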
In-context learning is the foundation of what products like Google Gemini, DeepSeek R1, Anthropic Claude, Qwen and OpenAI now market as “thinking models.” The first iteration of deep thinking was a wrapper around the LLM using chain-of-thought and tree-of-thought prompting: the idea was to prompt the model multiple times to get a better response. The current iteration uses RLHF datasets to teach models how to “show the work,” with more complex problems like math, physics and programming used to refine the foundation models. The general approach is the same, but in the newest models it is “baked in”: the model is trained to show the process it used to arrive at the answer and to spend more time iterating over the problem. The trade-off is that thinking models take longer to produce an answer and can require 100 to 1,000 times more tokens than non-thinking models. From a business perspective, running thinking models can be 100 times more expensive. If the user base grows 10 times and chat traffic increases 40 times, do you have enough computing resources and budget?
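A rough sketch of that first-generation, wrapper-style approach is below: sample several chain-of-thought completions and keep the most common final answer (self-consistency voting). The `llm_generate` function is a hypothetical stand-in for whatever chat-completion API you use, and the prompt text and sample count are assumptions.

```python
from collections import Counter

def llm_generate(prompt: str, temperature: float = 0.8) -> str:
    """Hypothetical stand-in for a real chat-completion API call (canned reply here)."""
    return "Step 1: 12 * 12 = 144.\nAnswer: 144"

def think_then_answer(question: str, samples: int = 5) -> str:
    """Wrapper-style 'thinking': sample several reasoning chains, vote on the answer."""
    prompt = (
        "Think step by step, then give the final answer "
        f"on a line starting with 'Answer:'.\n\nQuestion: {question}"
    )
    answers = []
    for _ in range(samples):              # each sample burns a full reasoning chain
        completion = llm_generate(prompt)
        for line in completion.splitlines():
            if line.startswith("Answer:"):
                answers.append(line.removeprefix("Answer:").strip())
                break
    # Majority vote across the sampled chains (self-consistency).
    return Counter(answers).most_common(1)[0][0] if answers else ""

print(think_then_answer("What is 12 squared?"))   # -> 144 with the canned stub
```

Note how the wrapper multiplies token usage by the number of sampled chains, which is where the cost multiplier described above comes from.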
Since the initial introduction of LNNs in 2023, the creators Ramin and Daniela have started their own company, Liquid.ai. In July 2025 they announced liquid foundation models, and Liquid.ai’s benchmarks show their 1.2-billion-parameter model beating Qwen’s 1.7-billion-parameter model. This is good progress, but it is still too big for an entry-level smartphone to run dozens of models. Can liquid foundation models beat the best large models while staying under 500 million parameters?
Entry-level smartphones come with 4-6 GB of memory and use ARM SoCs. After you subtract the memory used by the operating system, applications and other processes, you’re lucky to have 1 GB free. A 1-billion-parameter model quantized to 8 bits already occupies roughly 1 GB, so the phone can realistically hold only one such model in memory at a time. If you want to run multiple models, the phone has to unload one model and load another, which makes everything slower and drains the battery (a rough estimate is sketched after the list below). To reach the point where users can run dozens of AI-powered applications, several things have to happen:
- Entry-level phones need more memory
- Memory bandwidth has to increase to make loading and unloading data faster
- Models have to get smaller
- Dedicated circuits (NPUs) have to speed up processing
- Different model architectures optimized for efficiency are needed
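Here is the back-of-the-envelope memory estimate behind that claim; the parameter counts, quantization levels, and 1 GB budget are illustrative assumptions, and the figures cover weights only.

```python
# Rough on-device memory estimate for model weights alone
# (ignores activations, KV cache, and runtime overhead).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params_billions: float, quant: str) -> float:
    return params_billions * 1e9 * BYTES_PER_PARAM[quant] / 2**30

FREE_MEMORY_GB = 1.0                      # assumed headroom on an entry-level phone
for params in (0.5, 1.2, 1.7, 7.0):       # illustrative model sizes in billions
    for quant in ("fp16", "int8", "int4"):
        need = weight_memory_gb(params, quant)
        fits = "fits" if need <= FREE_MEMORY_GB else "does not fit"
        print(f"{params:>4}B @ {quant}: {need:5.2f} GB -> {fits} in {FREE_MEMORY_GB} GB")
```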
The current approach pushed by OpenAI is like a gigantic shipping container: it can memorize the entire internet and carry everything, but it isn’t the best solution for every problem. If you want to get a pizza from your local shop, should you use a shipping container? If you want to get groceries for the week, do you take a skateboard? LNNs point to a future where models are purpose-built, efficient to run and orders of magnitude smaller than OpenAI’s GPT models.