Democratizing LLMs: 4-bit Quantization for Optimal LLM Inference (towardsdatascience.com, January 15, 2024). Tags: deep-dives, gguf, hugging face, llamaindex, model-quantization

Scaling Down, Scaling Up: Mastering Generative AI with Model Quantization (feeds.feedburner.com, November 10, 2023). Tags: ai, Applications, artificial-intelligence, blogathon, deep learning, generative-ai, Healthcare, Intermediate, model-quantization, Models, precision, pytorch, Real Time, TensorFlow, time

Tensor Quantization: The Untold Story (towardsdatascience.com, September 8, 2023). Tags: machine-learning, model-quantization, normalization, quantization, zero-point