Meanwhile, the new SageMaker Training Compiler automatically compiles a user’s Python training code and generates GPU kernels specifically for their model. The compiled training code uses less memory and compute, so models train faster; AWS says it can speed up training by up to 50%.

AWS also rolled out a preview of SageMaker Serverless Inference, a new option that lets users deploy ML models for inference without having to configure or manage the underlying infrastructure. It automatically provisions, scales, and turns off compute capacity based on the volume of inference requests. Customers pay only for the time their inference code runs and the amount of data processed.
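
To make the pay-per-use billing concrete, here is a minimal sketch of how such a bill could be estimated. The function name, the rates, and the workload figures are all hypothetical placeholders chosen for illustration; they are not AWS prices or part of any AWS API.

```python
def estimate_serverless_cost(requests, avg_duration_s, avg_payload_gb,
                             rate_per_compute_second, rate_per_gb):
    """Estimate a pay-per-use bill: compute time plus data processed.

    All rates are hypothetical inputs supplied by the caller.
    """
    compute_seconds = requests * avg_duration_s   # time spent running inference code
    data_gb = requests * avg_payload_gb           # total data processed
    return compute_seconds * rate_per_compute_second + data_gb * rate_per_gb


# Hypothetical monthly workload and made-up rates, for illustration only.
monthly_cost = estimate_serverless_cost(
    requests=1_000_000,
    avg_duration_s=0.2,
    avg_payload_gb=0.00001,
    rate_per_compute_second=0.0001,
    rate_per_gb=0.02,
)
print(f"Estimated monthly cost: ${monthly_cost:.2f}")
```

Under this model a month with no requests costs nothing, which reflects the idea that capacity is turned off when there is no inference traffic.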