Amazon Web Services (AWS) announced the general availability of Amazon Elastic Compute Cloud (Amazon EC2) DL1 instances powered by Gaudi accelerators from Habana Labs, an Intel company. 
According to AWS, the Amazon EC2 DL1 instances deliver up to 40% better price performance for training machine learning models than the latest GPU-powered Amazon EC2 instances. As a result, customers can train their ML models faster and more cost-effectively.
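To make the price-performance figure concrete, here is a minimal arithmetic sketch. It assumes that "40% better price performance" means 40% more training work per dollar, which is one common reading and not something the announcement spells out. The baseline cost of 1,000 is a hypothetical number chosen only for illustration.

```python
def cost_after_improvement(baseline_cost: float, improvement: float) -> float:
    """Cost of the same workload when work-per-dollar rises by `improvement`.

    Assumption: price performance is work done per dollar spent, so a
    40% gain (improvement=0.40) divides the cost by 1.40.
    """
    return baseline_cost / (1.0 + improvement)


# Hypothetical baseline: a training run that costs 1000 (any currency)
# on a GPU-based instance. This figure is illustrative, not from AWS.
baseline = 1000.0
new_cost = cost_after_improvement(baseline, 0.40)
savings = 1.0 - new_cost / baseline

print(f"Cost on DL1 under this assumption: {new_cost:.2f}")  # 714.29
print(f"Savings versus baseline: {savings:.1%}")             # 28.6%
```

Under this reading, a 40% price-performance gain corresponds to roughly 28.6% lower cost for the same training job, not 40% lower cost.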
Customers can use the DL1 instances for natural language processing, object detection and classification, fraud detection, recommendation and personalization engines, intelligent document processing, business forecasting, and more.
Pat Gelsinger (@PGelsinger) tweeted on October 26, 2021:
“Excited about @habanalabs’ Gaudi AI accelerators bringing a new level of #deeplearning and training efficiency to @awscloud customers. Amazon EC2 DL1 instances will drive the next wave of #EmergingAI and enable more companies to implement #AI training. https://t.co/ne8SzPXFVz”
The new DL1 instances feature up to eight Gaudi accelerators, 32 GB of high-bandwidth memory per accelerator, 768 GiB of system memory, 2nd generation Amazon custom Intel Xeon Scalable (Cascade Lake) processors, 400 Gbps of networking throughput, and up to 4 TB of local NVMe storage.
Because the Gaudi accelerators were designed specifically to accelerate ML model training, they deliver higher compute efficiency at a lower cost than general-purpose GPUs.
The DL1 instances include the Habana SynapseAI SDK, which is integrated with leading ML frameworks such as TensorFlow and PyTorch. This helps customers migrate existing ML models that run on GPU-based or CPU-based instances onto DL1 instances with minimal code changes.
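As a rough illustration of what "minimal code changes" can look like on the PyTorch side, the sketch below picks a device string at runtime. It relies on two details that come from Habana's SynapseAI documentation rather than from this announcement: the PyTorch bridge ships as the `habana_frameworks` package, and Gaudi is addressed as the `"hpu"` device. The helper names `module_installed` and `pick_device` are hypothetical and exist only for this example.

```python
from importlib.util import find_spec
from typing import Callable


def module_installed(name: str) -> bool:
    """Return True if a top-level package can be located without importing it."""
    return find_spec(name) is not None


def pick_device(installed: Callable[[str], bool] = module_installed) -> str:
    """Choose a device string for a training script (hypothetical helper).

    Assumption: the presence of the `habana_frameworks` package signals
    that the SynapseAI PyTorch bridge is installed and "hpu" is usable.
    """
    if installed("habana_frameworks"):
        return "hpu"
    return "cpu"


device = pick_device()
print(f"Training would run on: {device}")

# In an existing PyTorch script, the change would then be confined to
# lines such as:
#   model = model.to(torch.device(device))
#   batch = batch.to(torch.device(device))
```

A real migration may involve more than a device swap, so Habana's reference models remain the better starting point for production code.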
Developers and data scientists can start with the reference models optimized for Gaudi accelerators within Habana’s GitHub repository. They include popular models for diverse applications, including image classification, object detection, natural language processing, and recommendation systems.
Furthermore, the DL1 instances are built on the Nitro System. It’s a rich collection of building blocks that offloads many traditional virtualization functions to dedicated hardware and software. In turn, it delivers high performance, high availability, and high security while also reducing virtualization overhead.
Customers can launch DL1 instances using AWS Deep Learning AMIs or, for containerized applications, Amazon Elastic Kubernetes Service (Amazon EKS) or Amazon Elastic Container Service (Amazon ECS).
David Brown, Vice President of Amazon EC2 at AWS, said:
“The use of machine learning has skyrocketed. One of the challenges with training machine learning models, however, is that it is computationally intensive and can get expensive as customers refine and retrain their models. AWS already has the broadest choice of powerful compute for any machine learning project or application. The addition of DL1 instances featuring Gaudi accelerators provides the most cost-effective alternative to GPU-based instances in the cloud to date. Their optimal combination of price and performance makes it possible for customers to reduce the cost to train, train more models, and innovate faster.”
The DL1 instances don’t require an upfront investment. Instead, they are available as On-Demand Instances, Reserved Instances, Spot Instances, or as part of a Savings Plan. Currently, DL1 instances are available in the US East (N. Virginia) and US West (Oregon) regions. The company also plans to add support for DL1 instances in Amazon SageMaker soon.
