Run multiple generative AI models on GPU using Amazon SageMaker multi-model endpoints with TorchServe and save up to 75% in inference costs

Elevating the generative AI experience: Introducing streaming support in Amazon SageMaker hosting

Deploy self-service question answering with the QnABot on AWS solution powered by Amazon Lex with Amazon Kendra and large language models

MLOps for batch inference with model monitoring and retraining using Amazon SageMaker, HashiCorp Terraform, and GitLab CI/CD

Machine learning with decentralized training data using federated learning on Amazon SageMaker

Build ML features at scale with Amazon SageMaker Feature Store using data from Amazon Redshift

Intelligent video and audio Q&A with multilingual support using LLMs on Amazon SageMaker

Zero-shot and few-shot prompting for the BloomZ 176B foundation model with the simplified Amazon SageMaker JumpStart SDK

Build production-ready generative AI applications for enterprise search using Haystack pipelines and Amazon SageMaker JumpStart with LLMs

Build a centralized monitoring and reporting solution for Amazon SageMaker using Amazon CloudWatch