Amazon Web Services (AWS) this week unveiled its AWS Trainium2 processor, a faster alternative to graphics processing units (GPUs) for training artificial intelligence (AI) models.
Announced at the AWS re:Invent 2023 conference, the AWS Trainium2 processor is, the company claims, four times faster than the first-generation Trainium chip.
This option will provide organizations with a less expensive alternative for training AI models at a time when GPUs are both scarce and expensive, says Chetan Kapoor, director of product management for the Elastic Compute Cloud (EC2) service at AWS.
In addition, AWS Trainium2 provides a more energy-efficient option that reduces carbon emissions, he noted.
The overall goal is to reduce the amount of time required to train an AI model as part of an effort to make it less costly to build a large language model (LLM), says Kapoor.
While GPUs are widely used to train AI models, they were not designed from the ground up for that purpose. The Trainium processors designed by AWS are accelerators built specifically for machine learning workloads. Because the chip never needs to run a graphics library, it provides a less expensive alternative to a GPU for tasks such as training a smaller LLM on a narrow set of domain-specific data, notes Kapoor. All that is required is the ability to efficiently run matrix multiplications at the scale needed to train an LLM, he adds.
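To see why matrix multiplication dominates, consider a minimal sketch of a single transformer feed-forward block. The shapes, sizes and the use of NumPy below are illustrative assumptions, not anything specific to Trainium; the point is simply that the expensive operations are dense matrix multiplies, with only cheap elementwise work in between.

```python
# Illustrative sketch: the compute core of LLM training is dense matrix
# multiplication, which is the workload an ML accelerator is built around.
# All shapes below are arbitrary assumptions chosen for demonstration.
import numpy as np

batch, seq_len, d_model, d_ff = 8, 128, 512, 2048

x = np.random.randn(batch * seq_len, d_model).astype(np.float32)
w1 = np.random.randn(d_model, d_ff).astype(np.float32)
w2 = np.random.randn(d_ff, d_model).astype(np.float32)

# A single feed-forward block: two matmuls plus a cheap elementwise op.
hidden = np.maximum(x @ w1, 0.0)   # matmul followed by ReLU
out = hidden @ w2                  # second matmul

# Roughly 2*m*n*k FLOPs per matmul; these dominate the training budget.
flops = 2 * x.shape[0] * d_model * d_ff + 2 * x.shape[0] * d_ff * d_model
print(f"FLOPs in this one block: {flops:,}")
```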
Trainium2 will be available in Amazon EC2 Trn2 instances, each containing 16 Trainium2 chips. Trn2 instances can be scaled up to 100,000 Trainium2 chips in EC2 UltraClusters. That capability makes it possible to train a 300-billion-parameter LLM in weeks rather than months, says Kapoor.
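The weeks-versus-months claim is easy to sanity-check with back-of-envelope arithmetic. The sketch below uses the common 6ND rule of thumb for training FLOPs; the token count, per-chip throughput and utilization figures are assumptions for illustration only, not AWS-published Trainium2 numbers.

```python
# Back-of-envelope sketch of why cluster scale shortens training time.
# The 6*N*D FLOPs rule of thumb and all figures below are assumptions
# chosen for illustration, not published Trainium2 specifications.
params = 300e9          # 300B-parameter LLM (the size cited in the article)
tokens = 2e12           # assumed training corpus of 2T tokens
flops_needed = 6 * params * tokens

chip_flops = 200e12     # assumed sustained FLOP/s per accelerator
utilization = 0.4       # assumed real-world efficiency

for chips in (16, 1_000, 100_000):
    seconds = flops_needed / (chips * chip_flops * utilization)
    print(f"{chips:>7} chips: ~{seconds / 86_400:,.1f} days")
```

Under these assumptions, a single 16-chip instance would take decades, while a 100,000-chip UltraCluster finishes in days, which is why training at this parameter count is only practical at cluster scale.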
Longer term, AWS is moving toward making clusters and pods of compute resources available, rather than the individual instances IT teams provision today, adds Kapoor. The goal, as application environments become more complex, is to make it simpler for IT teams to provision cloud infrastructure based on a mix of processor types developed by AWS or by partners such as Intel and NVIDIA.
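Today, that provisioning happens at the instance level. A minimal sketch using boto3's EC2 API is shown below; the AMI ID is a placeholder, and the instance type shown is the existing first-generation Trainium type, since Trn2 instance names should be confirmed against AWS documentation for your region.

```python
# Minimal provisioning sketch using boto3's EC2 run_instances call.
# The AMI ID is a placeholder; the instance type is the existing
# first-generation Trainium type. Check AWS docs for Trn2 names.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder Neuron-enabled AMI
    InstanceType="trn1.32xlarge",      # swap in the Trn2 type once available
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])
```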
Most organizations will initially focus on customizing existing LLMs, using techniques that expose the models to additional data in ways enterprise IT teams can securely govern. In time, however, more IT organizations will look to build small-scale LLMs on their own data to ensure more accurate results. In addition, regulations that are still evolving may require some organizations to retain total control over any LLM they employ.
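One such governance-friendly technique is retrieval augmentation: proprietary data stays in a store the IT team controls, and only relevant snippets are injected into the prompt. The toy keyword-overlap retriever below is an assumption-laden sketch; production systems would use vector embeddings and access controls rather than word matching.

```python
# Toy sketch of retrieval-augmented prompting: governed documents stay
# in-house, and only matching snippets are added to the model's prompt.
# The word-overlap scoring here is a stand-in for a real vector search.
docs = {
    "policy.txt": "Refunds are processed within 14 business days.",
    "pricing.txt": "The enterprise tier is billed annually.",
}

def retrieve(query: str, k: int = 1) -> list[str]:
    words = set(query.lower().split())
    scored = sorted(docs.values(),
                    key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

question = "How long do refunds take?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
print(prompt)  # this prompt would then be sent to the customized LLM
```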
Regardless of the motivation, optimizing consumption of cloud infrastructure resources to minimize costs is going to be essential. Understanding which types of AI models lend themselves to which types of processors will be critical to achieving that goal, especially as the availability of GPUs remains constrained in the AI era.