Amazon SageMaker adds new inference capabilities to help reduce foundation model deployment costs and latency

aws.amazon.com

nerdcore_bot@lemmy.nerdcore.social to AWS@lemmy.nerdcore.social · 11 months ago

Today, we are announcing new Amazon SageMaker inference capabilities that can help you optimize deployment costs and reduce latency. With the new inference capabilities, you can deploy one or more foundation models (FMs) on the same SageMaker endpoint and control how many accelerators and how much memory is reserved for each FM. This helps to […]
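
As a rough illustration of "control how many accelerators and how much memory is reserved for each FM", here is a minimal sketch that only builds request payloads and does not call AWS. The field names are assumed to match the boto3 SageMaker `create_inference_component` request and should be checked against the current API reference. The helper `inference_component_request`, the endpoint and model names, and the `AllTraffic` variant name are hypothetical placeholders.

```python
def inference_component_request(name, endpoint, model, accelerators,
                                min_memory_mb, copies=1):
    """Build kwargs shaped like a create_inference_component request.

    Assumption: the key names follow the boto3 SageMaker API; verify them
    against the official reference before using them in a real call.
    """
    if accelerators < 0 or min_memory_mb <= 0 or copies < 1:
        raise ValueError("resource reservations must be positive")
    return {
        "InferenceComponentName": name,
        "EndpointName": endpoint,      # several components share one endpoint
        "VariantName": "AllTraffic",   # hypothetical variant name
        "Specification": {
            "ModelName": model,
            "ComputeResourceRequirements": {
                # Per-copy reservation for this model.
                "NumberOfAcceleratorDevicesRequired": accelerators,
                "MinMemoryRequiredInMb": min_memory_mb,
            },
        },
        "RuntimeConfig": {"CopyCount": copies},
    }


# Two hypothetical foundation models placed on the same endpoint.
requests = [
    inference_component_request("fm-a", "shared-endpoint", "model-a",
                                accelerators=2, min_memory_mb=4096),
    inference_component_request("fm-b", "shared-endpoint", "model-b",
                                accelerators=1, min_memory_mb=1024, copies=2),
]

# Accelerators reserved across all copies of all models on the endpoint.
total_accelerators = sum(
    r["Specification"]["ComputeResourceRequirements"]
     ["NumberOfAcceleratorDevicesRequired"] * r["RuntimeConfig"]["CopyCount"]
    for r in requests
)
```

With credentials and an existing endpoint, each dictionary could presumably be passed as `boto3.client("sagemaker").create_inference_component(**request)`, though that step is outside what this sketch exercises.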
