Lab: Autoscaling Amazon SageMaker endpoints
In this lab, we’ll use the AWS Console to set up an Amazon SageMaker endpoint and then configure autoscaling for it.
Select a trained model
Create an endpoint configuration
Create an endpoint
Click on “Create endpoint”, then configure the endpoint (its name and the endpoint configuration to use) and create it.
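For readers who prefer scripting over the console, the endpoint configuration and endpoint steps above can be sketched with boto3 roughly as follows. This is a minimal sketch, not what the lab itself does: the instance type, the initial instance count, the variant name "AllTraffic", and every resource name in the usage comment are assumptions chosen for illustration.

```python
def build_endpoint_config(model_name, config_name,
                          instance_type="ml.m5.large",  # assumed instance type
                          initial_count=1,              # assumed starting size
                          variant_name="AllTraffic"):   # assumed variant name
    """Build the keyword arguments for SageMaker's create_endpoint_config call."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": variant_name,
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": initial_count,
            }
        ],
    }


def create_endpoint(sm_client, model_name, config_name, endpoint_name):
    """Create an endpoint configuration, then an endpoint that uses it."""
    sm_client.create_endpoint_config(
        **build_endpoint_config(model_name, config_name)
    )
    sm_client.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=config_name,
    )
    # Block until the endpoint reaches the InService state.
    sm_client.get_waiter("endpoint_in_service").wait(EndpointName=endpoint_name)


# Usage (requires AWS credentials; all names below are hypothetical):
#   import boto3
#   create_endpoint(boto3.client("sagemaker"),
#                   "my-model", "my-endpoint-config", "my-endpoint")
```

Keeping the request builder separate from the API calls makes it possible to inspect the request before anything is created in the account.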
Configure autoscaling
Click on the endpoint to open its settings page.
Then scroll down to “Endpoint runtime settings”, select the variant, and click on “Configure autoscaling”.
Now configure the autoscaling properties. For this example, I set “Maximum instance
count” to 2 and “Target value” to 100 for the target metric
“SageMakerVariantInvocationsPerInstance” (the average number of times per minute
that each instance for a variant is invoked), then clicked on “Save”.
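The same autoscaling settings can also be applied outside the console through the Application Auto Scaling API. The sketch below mirrors the values used in this example (maximum instance count 2, target value 100). The minimum capacity of 1, the policy name, and the endpoint and variant names in the usage comment are assumptions, not values from the lab.

```python
def build_scaling_requests(endpoint_name, variant_name,
                           max_capacity=2,       # "Maximum instance count" from the lab
                           target_value=100.0,   # "Target value" from the lab
                           min_capacity=1):      # assumed; not stated in the lab
    """Build the requests for register_scalable_target and put_scaling_policy."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"
    target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }
    policy = {
        # Hypothetical policy name, chosen for illustration only.
        "PolicyName": f"{endpoint_name}-{variant_name}-invocations",
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_value),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
        },
    }
    return target, policy


def configure_autoscaling(aas_client, endpoint_name, variant_name):
    """Register the variant as a scalable target and attach a target-tracking policy."""
    target, policy = build_scaling_requests(endpoint_name, variant_name)
    aas_client.register_scalable_target(**target)
    aas_client.put_scaling_policy(**policy)


# Usage (requires AWS credentials; names below are hypothetical):
#   import boto3
#   configure_autoscaling(boto3.client("application-autoscaling"),
#                         "my-endpoint", "AllTraffic")
```

With a target-tracking policy, Application Auto Scaling adds or removes instances to keep the per-instance invocation rate near the target value, within the registered minimum and maximum capacity.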