PyTorch has become the heart of an ever-growing AI ecosystem, and now it’s added two new projects to its family: Helion and Safetensors.

PARIS: At PyTorch Conference EU, the PyTorch Foundation is formally welcoming Helion, a Python-embedded Domain Specific Language (DSL) for machine learning (ML), and Safetensors, a secure tensor serialization format originally developed by Hugging Face, into its roster of hosted projects.

Python, as you probably know, has become one of AI developers’ favorite languages. From it sprang PyTorch, which started life as a Meta dynamic neural network library. In 2026, however, it’s more accurate to describe it as a sprawling ecosystem and governance center for much of the open‑source AI stack. The core of PyTorch remains an imperative tensor and autograd library that lets you build and train models in Python. Around that core, you now have compilation with torch.compile and TorchInductor, distributed training, mobile runtimes, and a growing fleet of hosted projects that tackle everything from serving to systems‑level optimization.

Now, the PyTorch Foundation has expanded from governing a single-framework project to an umbrella foundation, somewhat like the Cloud Native Computing Foundation (CNCF). These latest newcomers follow earlier PyTorch Foundation additions, such as vLLM, DeepSpeed, and Ray. 

Both new projects target pressure points that worsen as organizations move from training a few giant models to serving many variants across heterogeneous hardware: Helion targets kernel authoring and autotuning, while Safetensors targets the security and I/O paths for model weights. Taken together, they push more of the “hard parts” of high‑performance and secure PyTorch deployments onto shared, open infrastructure rather than bespoke in‑house tooling. 

Helion 

Helion’s maintainers describe it as a Python-embedded DSL that enables developers to write custom ML kernels in high‑level PyTorch-like code and compile them to multiple backends, including Triton and TileIR, with more on the way. Rather than forcing kernel authors to juggle low-level indexing, memory layout, and backend-specific tuning, Helion lifts the abstraction level and leans heavily on an autotuning engine to explore block sizes, loop orders, and memory access patterns for you. This makes programming ML much easier for developers who want to focus on higher-level issues rather than nuts-and-bolts details.  
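To make the autotuning idea concrete, here is a toy sketch in plain Python. It is not Helion code and does not use Helion's API. The blocked_sum "kernel" and the autotune helper are hypothetical names invented for illustration. The only point is the shape of the technique: run the same computation under several candidate configurations, time each one, and keep the fastest.

```python
import time


def blocked_sum(values, block_size):
    """Toy 'kernel' (hypothetical): sum a list one block at a time."""
    total = 0.0
    for start in range(0, len(values), block_size):
        total += sum(values[start:start + block_size])
    return total


def autotune(kernel, args, candidates, repeats=3):
    """Time every candidate config and return (fastest_config, timings)."""
    timings = {}
    for config in candidates:
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            kernel(*args, config)
            # Keep the best of several runs to reduce timing noise.
            best = min(best, time.perf_counter() - t0)
        timings[config] = best
    return min(timings, key=timings.get), timings


data = [float(i) for i in range(100_000)]
best_block, timings = autotune(blocked_sum, (data,), candidates=[64, 1024, 16384])
result = blocked_sum(data, best_block)
```

Which block size wins depends on the machine, which is exactly why you'd want a tool to search for you. A real autotuner explores a far larger space, and it does so on the actual target hardware.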

Crucially, Helion is designed as a PyTorch‑native language rather than a separate AI language. That means you can use your PyTorch expertise while still getting the control you need to squeeze the last drop of performance from a specific GPU. With new hardware and model architectures changing faster than you can rewrite your kernels, this combination of abstraction, autotuning, and backend portability is exactly the kind of unglamorous engineering work that keeps inference clusters up to date and makes the most of your new hardware.

The practical pitch is productivity and portability: You stay inside the PyTorch ecosystem to write kernels using familiar tools. Helion also enables you to search the implementation space in advance, then ship pre‑optimized kernels that run efficiently across GPU generations and vendors. Benchmarks from the project show Helion matching or beating hand‑written Triton and CuTe DSL kernels on workloads like softmax and RMSNorm, often with significantly less code and without hardware‑specific rewrites. 

Safetensors 

Safetensors, for its part, tackles a problem that most teams only discover the hard way: Shipping and loading huge model checkpoints using general‑purpose serialization like pickle is both slow and a security liability. The format stores only tensor data and metadata in a simple, zero‑copy layout, explicitly avoiding executable payloads and arbitrary code execution paths that attackers can abuse in model supply chains. 
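If you haven't seen why pickle is a liability, this minimal, harmless sketch uses only Python's standard library. The class name and the PICKLE_DEMO environment variable are made up for the demonstration. The payload here merely sets an environment variable, but an attacker could substitute any code they like.

```python
import os
import pickle


class NotReallyWeights:
    """Hypothetical stand-in for an object hidden in a tampered checkpoint."""

    def __reduce__(self):
        # pickle records "call exec(code, {})" and replays it at load time.
        code = "import os; os.environ['PICKLE_DEMO'] = 'code ran at load time'"
        return (exec, (code, {}))


payload = pickle.dumps(NotReallyWeights())

os.environ.pop("PICKLE_DEMO", None)   # start from a clean slate
loaded = pickle.loads(payload)        # merely loading runs the embedded code
side_effect = os.environ.get("PICKLE_DEMO")
```

After the load, side_effect holds the string the payload wrote, even though nothing in the calling code asked for it to run. A format that stores only tensor bytes and metadata has no equivalent hook.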

Deeply integrated with PyTorch, Safetensors is designed as a drop‑in replacement for torch.load and torch.save, binding directly to native APIs such as torch.UntypedStorage, Tensor.narrow, and Tensor. to support lazy deserialization and zero‑copy loading without touching existing model code.  

Pragmatically, if you treat model weights as untrusted input, and you really should, Safetensors gives you a format that cannot hide arbitrary code execution paths. The design keeps things deliberately simple: tensors are stored in a binary layout with a small JSON header describing shapes and dtypes, enabling constant‑time indexing and zero‑copy slices, but nothing resembling an executable payload.
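Here is a rough, standard-library-only sketch of that layout, based on my reading of the format's public description: an 8-byte little-endian header length, a JSON header mapping each tensor name to its dtype, shape, and byte offsets, then the raw tensor bytes. The function names are hypothetical, and this is not the Safetensors library's implementation. It also skips details such as header padding and validation.

```python
import json
import struct


def write_safetensors_like(tensors):
    """Pack {name: (dtype, shape, raw_bytes)} into a single bytes object."""
    header, chunks, offset = {}, [], 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            # Offsets are relative to the start of the data section.
            "data_offsets": [offset, offset + len(raw)],
        }
        chunks.append(raw)
        offset += len(raw)
    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + b"".join(chunks)


def read_header(blob):
    """Return (header_dict, byte position where tensor data begins)."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(bytes(blob[8:8 + header_len]))
    return header, 8 + header_len


def tensor_view(blob, name):
    """Return a zero-copy memoryview over one tensor's bytes."""
    header, data_start = read_header(blob)
    begin, end = header[name]["data_offsets"]
    return memoryview(blob)[data_start + begin:data_start + end]


blob = write_safetensors_like({
    "weight": ("F32", [2, 2], struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)),
    "bias": ("F32", [2], struct.pack("<2f", 0.5, -0.5)),
})
header, _ = read_header(blob)
bias = struct.unpack("<2f", tensor_view(blob, "bias"))
```

Notice that reading the bias never touches the weight bytes: the header tells you where to look, and the memoryview slices the buffer without copying it. Everything in the file is either JSON or raw numbers, so there is nowhere for executable code to hide.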

Because it slots directly into existing PyTorch workflows, most teams can adopt Safetensors just by switching save/load calls or by consuming models from ecosystems like the Hugging Face Hub that have already standardized on the format. The payoff shows up in three places practitioners care about: Reduced supply‑chain risk from malicious checkpoints; faster startup times for large models, especially in tensor‑parallel, multi‑GPU serving; and a common, open format that vendors and frameworks can agree on instead of proliferating yet another proprietary container.

Safetensors has already become the default checkpoint format across the Hugging Face Hub, which means that the most popular PyTorch models, such as Llama, Gemma, Cohere’s models, and their thousands of fine‑tuned derivatives, are now distributed in Safetensors, with noticeably faster load times in multi‑GPU and multi‑node deployments where I/O contention is a bottleneck. If, like me, you’re also a big fan of standardized interoperability, this is a big win.   

What this means for the bottom line of AI deployments, newly appointed PyTorch Foundation Executive Director Mark Collier said, is that “Safetensors is an important step towards scaling production-grade AI models.” That’s because “Safetensors ensures secure model distribution and de-risks code execution, all while offering significant speed across complex computing architectures. For security, Safetensors is a crucial piece of the open source AI stack that will drive fast, secure, and technically advanced AI.”

These new projects matter, Collier argues, because kernel‑level performance and secure model formats are too important to be left as ad hoc extensions or vendor‑specific differentiators. By hosting Helion and Safetensors alongside the core framework, PyTorch is signaling that things like autotuned kernels and safe, portable weight files are part of the “standard library” of modern AI engineering, not niche optimization projects.

Strategically, Collier said, it also gives the broader AI ecosystem a neutral place to collaborate on performance and security primitives that have to work across NVIDIA, AMD, custom ASICs, and distributed training setups. Going forward, that means AI programmers have open-source answers to questions such as “how do I write a custom kernel?” and “how do I safely ship this 100‑GB checkpoint?” instead of having to reinvent the wheel.

Personally, anything that makes low-level AI programming easier, faster, and safer is a win in my book. I urge you to seriously consider using both projects in your next AI projects.