Reducing the Size of AI Models towardsdatascience.com Post date November 4, 2024 External Tags artificial-intelligence, LLM, machine-learning, model-quantization, quantization
Deploying Large Language Models: vLLM and Quantization. Step by Step Guide on How to Accelerate… towardsdatascience.com Post date April 16, 2024 External Tags artificial-intelligence, deep learning, large-language-models, machine-learning, quantization
Quantizing the AI Colossi towardsdatascience.com Post date April 15, 2024 External Tags computer-vision, deep learning, editors-pick, LLM, quantization
Improving LLM Inference Latency on CPUs with Model Quantization medium.com Post date February 29, 2024 External Tags artificial-intelligence, data-science, generative-ai-tools, LLM, quantization
Exploring “Small” Vision-Language Models with TinyGPT-V towardsdatascience.com Post date January 12, 2024 External Tags artificial-intelligence, large-language-models, machine-learning, quantization, vision-language-model
ExLlamaV2: The Fastest Library to Run LLMs medium.com Post date November 20, 2023 External Tags artificial-intelligence, data-science, large-language-models, programming, quantization
Run Llama 2 70B on Your GPU with ExLlamaV2 towardsdatascience.com Post date September 29, 2023 External Tags llama 2, machine-learning, programming, quantization, software-engineering
Tensor Quantization: The Untold Story towardsdatascience.com Post date September 8, 2023 External Tags machine-learning, model-quantization, normalization, quantization, zero-point
Quantize Llama models with GGML and llama.cpp medium.com Post date September 4, 2023 External Tags data-science, large-language-models, machine-learning, programming, quantization
GPTQ or bitsandbytes: Which Quantization Method to Use for LLMs (Examples with Llama 2) towardsdatascience.com Post date August 25, 2023 External Tags artificial-intelligence, large-language-models, machine-learning, programming, quantization