Applied LLM Quantisation with AWS SageMaker | Analytics.gov

A Priority-Based Scheduler for Amazon SageMaker Training Jobs

Optimized Deployment of Mistral 7B on Amazon SageMaker Real-Time Inference

Deploying Large Language Models with SageMaker Asynchronous Inference

Optimizing Instance Type Selection for AI Development in Cloud Spot Markets

Building an LLMOps Pipeline

Hosting Multiple LLMs on a Single Endpoint

Deploy a Custom ML Model as a SageMaker Endpoint

Augmenting LLMs with RAG

Host Hundreds of NLP Models Utilizing SageMaker Multi-Model Endpoints Backed By GPU Instances