Majority of Alexa Now Running on Faster, More Cost-Effective Amazon EC2 Inf1 Instances

Amazon Web Services · Nov. 12, 2020, 6:23 p.m.

Summary

Today, we are announcing that the Amazon Alexa team has migrated the vast majority of their GPU-based machine learning inference workloads to Amazon Elastic Compute Cloud (EC2) Inf1 instances, powered by AWS Inferentia. For Alexa’s text-to-speech workloads, this resulted in 25% lower end-to-end latency and 30% lower cost compared to GPU-based instances. The lower latency […]

Read full post on aws.amazon.com →
