Telefónica and Multiverse Computing Develop an AI-Based Model to Support Customer Service Agents With 75% Energy Savings
Madrid, November 12, 2025 -- Telefónica and Multiverse Computing have reached an important milestone in the application of artificial intelligence in the telecommunications sector by successfully compressing and fine-tuning two large language models (LLMs) for internal use in customer service.
These compressed models are expected to be used in the near future within the chat system that supports Customer Service agents as part of the “Movistar por ti” initiative, Movistar’s new, more agile, proactive and customer-focused approach to care. The goal is to speed up responses to queries while reducing the energy consumption of the systems.
The solution, based on AI model compression, delivers major improvements in speed, efficiency, energy usage and costs, all while maintaining the accuracy of the information that helps agents manage customer service more effectively.
Specifically, Multiverse Computing has applied cutting-edge quantum-inspired techniques to Meta’s Llama 3.1 8B and Llama 3.3 70B, pre-trained models that can be applied to a wide variety of intelligent assistant use cases.
The result is an 80% reduction in model size, which considerably decreases storage needs while maintaining the quality of the generated responses.
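To put that figure in context, the rough storage arithmetic can be sketched as follows. This is an illustrative estimate only: the parameter counts and the 80% reduction come from the announcement, while the 16-bit (2-byte) uncompressed baseline is an assumption; the actual weight formats and compression pipeline used by Multiverse Computing are not public.

```python
# Illustrative storage estimate for the two Llama models named above,
# before and after an 80% size reduction.

BYTES_PER_PARAM = 2        # assumption: fp16/bf16 uncompressed baseline
REDUCTION = 0.80           # 80% size reduction, per the announcement

models = {
    "Llama 3.1 8B": 8e9,   # approximate parameter counts
    "Llama 3.3 70B": 70e9,
}

for name, params in models.items():
    original_gb = params * BYTES_PER_PARAM / 1e9
    compressed_gb = original_gb * (1 - REDUCTION)
    print(f"{name}: ~{original_gb:.0f} GB -> ~{compressed_gb:.0f} GB")
```

Under these assumptions, the 70B model would shrink from roughly 140 GB to under 30 GB, which is what makes deployment on more modest on-premise hardware plausible.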
Another important aspect is the environmental dimension: in addition to running in the cloud, the compressed models can be deployed directly on Telefónica’s network, including local (on-premise) facilities. This makes it possible to reduce energy consumption by up to 75% compared to the uncompressed models.
This improvement also reinforces Telefónica and Multiverse Computing’s joint commitment to reducing the environmental impact of technology.
Furthermore, thanks to local deployment in Telefónica’s central offices, where 100% of the electricity comes from renewable sources and efficiency is continuously being improved, the operator has also managed to reduce the CO₂ emissions associated with this use of artificial intelligence.
The original Llama models that were compressed are open source, in line with Telefónica’s goal of promoting openness, security and technological neutrality to foster standards and accelerate AI adoption.
Ultimately, Telefónica’s future application of the compressed models will deliver major operational efficiencies in the use of artificial intelligence: they maintain the original quality of large language models (LLMs) while running on much more modest hardware, lowering query costs both in the cloud and on-premise. On top of this, energy consumption at Telefónica’s facilities will be kept to a minimum.
A pioneering collaboration for scalable AI
This collaboration highlights the strategic potential of combining Telefónica’s scale with Multiverse Computing’s deep technical innovation. By deploying a compressed, high-performance and environmentally efficient AI solution, both companies reaffirm their leadership in developing more accessible and scalable AI for enterprise use.