4 points by montanalow almost 2 years ago

1 comment

montanalow almost 2 years ago

Quantization lets PostgresML fit larger models in less RAM. These algorithms also run inference significantly faster on NVIDIA, Apple, and Intel hardware. Half-precision floating point and quantized optimizations are now available for your favorite LLMs downloaded from Hugging Face.
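To make the RAM claim concrete, here is a rough back-of-the-envelope sketch of how weight storage shrinks with precision. It is not PostgresML code: the function name `weight_memory_gib` and the 7-billion-parameter model size are assumptions chosen for illustration, and it counts only raw weights, ignoring activations, KV cache, and the per-group scale metadata that real quantized formats such as GPTQ and GGML also store.

```python
def weight_memory_gib(n_params: int, bits_per_weight: float) -> float:
    """Approximate memory for the raw weights alone, in GiB (hypothetical helper)."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: a 7-billion-parameter model, picked only as an example size.
N_PARAMS = 7_000_000_000

for label, bits in [("float32", 32), ("float16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>8}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions, half precision halves the footprint relative to float32, and 4-bit quantization cuts it to roughly one eighth, which is why a model that does not fit in memory at full precision may fit once quantized.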

PostgresML Adds GPTQ and GGML Quantized LLM Support for HuggingFace Transformers
