Summary
In an innovative move, Amazon Web Services (AWS) has launched a service that allows customers to rent Nvidia GPUs for their AI projects. This initiative addresses the needs of companies running large language models that require access to GPUs. The new service, Amazon Elastic Compute Cloud (EC2) Capacity Blocks for ML, gives customers access to Nvidia H100 Tensor Core GPU instances for a defined time.
Photo from developer.nvidia.com
In a move that significantly changes the landscape for artificial intelligence (AI) projects, Amazon Web Services (AWS) has launched a new service that allows customers to rent Nvidia GPUs. This development is set to revolutionize how companies execute their AI projects, particularly running large language models requiring access to GPUs.
The demand for GPUs, especially those from Nvidia, has skyrocketed recently. This surge in demand, combined with high prices, has made these resources precious commodities that are often in short supply. The situation is even more challenging for companies that need them for single jobs, where renting a long-term instance from a cloud provider may be prohibitively costly.
AWS’ Innovative Solution
To address this problem, AWS has launched Amazon Elastic Compute Cloud (EC2) Capacity Blocks for ML. This new service enables customers to buy access to these GPUs for a defined amount of time, typically to run AI-related jobs such as training a machine learning model or experimenting with an existing one. As described by Channy Yun in a blog post announcing the new feature, this is an innovative way to schedule GPU instances: users can reserve the number of instances they need for a future date, for just the amount of time they require. This kind of short-term, scheduled capacity suits workloads like those of Stability AI, the startup behind the Stable Diffusion image-generation model.
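The reservation flow can be sketched with the AWS SDK for Python (boto3). The parameter names below follow the EC2 Capacity Blocks API as announced, but treat them as assumptions and verify against the official documentation; the actual API calls are shown commented out since they require AWS credentials.

```python
# Sketch of searching for a Capacity Block offering before purchase.
# Parameter names (InstanceType, InstanceCount, StartDateRange,
# EndDateRange, CapacityDurationHours) are assumptions based on the
# announced EC2 API -- check the docs before relying on them.
from datetime import datetime, timedelta, timezone

def build_offering_query(instance_count: int, days: int,
                         weeks_ahead: int = 2) -> dict:
    """Build search parameters for a future GPU capacity block."""
    start = datetime.now(timezone.utc) + timedelta(weeks=weeks_ahead)
    return {
        "InstanceType": "p5.48xlarge",       # 8 Nvidia H100 GPUs per instance
        "InstanceCount": instance_count,     # cluster size, 1-64 instances
        "StartDateRange": start,
        "EndDateRange": start + timedelta(days=days),
        "CapacityDurationHours": days * 24,  # reserved in one-day increments
    }

# Hypothetical usage against the live API (requires AWS credentials):
# import boto3
# ec2 = boto3.client("ec2", region_name="us-east-2")  # US East (Ohio)
# offerings = ec2.describe_capacity_block_offerings(
#     **build_offering_query(instance_count=4, days=2))
# A chosen offering would then be bought with a separate
# purchase_capacity_block call.
```

The query-building step is separated out so the total cost and timeframe of an offering can be reviewed before anything is purchased.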
What the Product Offers
The product provides customers with access to Nvidia H100 Tensor Core GPU instances in cluster sizes ranging from one to 64 instances, with each instance carrying 8 GPUs. Customers can reserve instances for up to 14 days in one-day increments, up to eight weeks in advance. The instances automatically shut down once the committed time frame is over.
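The reservation limits above can be expressed as a small validation check. This is a minimal illustrative sketch, not part of any AWS API; the function name and error messages are invented here.

```python
# Validate a hypothetical Capacity Blocks request against the limits
# described above: 1-64 instances (8 GPUs each), 1-14 days in one-day
# increments, booked at most eight weeks ahead.
def validate_reservation(instances: int, days: int, weeks_ahead: float) -> int:
    """Return the total GPU count for a valid request, else raise ValueError."""
    if not 1 <= instances <= 64:
        raise ValueError("cluster size must be 1-64 instances")
    if not 1 <= days <= 14:
        raise ValueError("duration must be 1-14 days")
    if weeks_ahead > 8:
        raise ValueError("start date can be at most eight weeks out")
    return instances * 8  # each instance has 8 H100 GPUs

# A maximum-size block: 64 instances for 14 days, booked 8 weeks out.
# validate_reservation(64, 14, 8) returns 512 GPUs in total.
```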
This new service is akin to reserving a hotel room for specific dates. Customers know exactly how long the job will run, how many GPUs they'll use, and the total cost upfront, giving them cost certainty. In turn, it lets Amazon allocate these in-demand resources in an auction-like fashion while locking in revenue.
Dynamic Pricing and Availability
The price for these resources varies based on supply and demand. When users request a block, the service displays the total cost for the chosen timeframe and resources; users can then adjust the request to fit their needs and budget before agreeing to buy.
The new feature is generally available today in the AWS US East (Ohio) Region. This step by AWS is set to make AI project execution more flexible and cost-effective for companies.