Enable pod-based GPU metrics in Amazon CloudWatch

Maximize Stable Diffusion performance and lower inference costs with AWS Inferentia2

Accelerate PyTorch with DeepSpeed to train large language models with Intel Habana Gaudi-based DL1 EC2 instances