Reducing the Size of AI Models

Deploying Large Language Models: vLLM and Quantization: Step-by-Step Guide on How to Accelerate…

Quantizing the AI Colossi

Improving LLM Inference Latency on CPUs with Model Quantization

Exploring “Small” Vision-Language Models with TinyGPT-V

ExLlamaV2: The Fastest Library to Run LLMs

Run Llama 2 70B on Your GPU with ExLlamaV2

Tensor Quantization: The Untold Story

Quantize Llama models with GGML and llama.cpp

GPTQ or bitsandbytes: Which Quantization Method to Use for LLMs — Examples with Llama 2