Lab: Autoscaling Amazon SageMaker endpoints
In this lab, we’ll use AWS Console to setup Amazon Sagemaker endpoint and then configure its autoscaling
Select a trained model
Create an endpoint configuration
Create an Endpoint
Click on “Create endpoint” Then, create and configure endpoint
Configure autoscaling
Click on the endpoint to open it’s settings page then, scroll down to “Endpoint runtime settings” to select the variant and then click on “Configure autoscaling” Now, configure autoscaling properties - for this example I have set “Maximum instance count” to 2 and “Target value” to 100 for the target metric “SageMakerVariantInvocationPerInstance” (the average number of times per minute that each instance for a variant is invoked), then clicked on “Save”