Distributed Local Inference Makes a Difference in AI Datacenters

Go big, they say. In fries, cars and perhaps sunscreen, yes… but not always in datacenters. This is the suggestion coming out of Anyway Systems, a commercial organization spun out of Swiss research body École Polytechnique Fédérale de Lausanne (EPFL).

Anyway Systems coordinates compute resources into an optimized on-premises cluster, using technology built at EPFL’s distributed computing laboratory.

Focused on finding ways to run high-performance large language (and presumably image, action, or other) models locally – and thus providing an alternative to submitting to cloud datacenter dominance – the EPFL lab team has developed working services that may represent a shake-up of the market in some application scenarios.

Privacy, Sovereignty & Sustainability

Because clusters run locally on-premises, deployments on Anyway Systems are argued to be more controllable from a data privacy, AI sovereignty and sustainability perspective. This is largely because data is not shared externally with third-party cloud service providers.

Anyway Systems reminds us that the lion’s share of AI applications make use of cloud infrastructure and services to execute. User queries are sent to remote servers in cloud datacenters so that the compute logic in each AI service can perform the required amount of inference and algorithmic reasoning. The result is then sent back to the user.

While there are advantages here in terms of economies of scale, robustness of service and task specialism, it’s hard not to see how this concentrates most of the power in a small number of cloud hyperscalers. This concentration is also regarded by some as a concern when it comes to the control of sensitive or confidential data and national sovereignty. It also raises questions related to energy consumption, especially where overprovisioning has occurred.

Anyway Systems was developed by researchers Gauthier Voron, Geovani Rizk and Rachid Guerraoui. The group explains that it allows users to download open source AI models and deploy them on local networks by coordinating multiple machines into an on-premises computing cluster.

Self-Stabilization Software

With the robustness of service, massive scalability, comprehensive security layers and extensive service feature set offered by cloud datacenters, how does the Anyway team think it can compete? The researchers point to “self-stabilization techniques” that enable their system to optimize its use of available hardware; this, in turn, allows the system to maintain its level of operational robustness without requiring any type of traditional centralized cloud service.

According to a Swiss technology news and investment forum, the technology on offer here means that large language models can be deployed on a small number of “standard machines” equipped with “commodity GPUs”, which could represent a significant cost saving for some AI project deployments. The team has not specified how wide the applicability of its technology could be, and there may be instances, for example where response times are critical, in which a specialized cloud AI server infrastructure is more suitable.