How to run it on a mobile device?

by KoiSikhaDo - opened Sep 25, 2024

Sep 25, 2024 • edited Sep 25, 2024

Many third-party (non-allenai) blog posts say that the model is small enough to run on a mobile device. I just wanted to know whether this can actually be done.
Can the model be quantised using bitsandbytes to make it small enough to run on a mobile device?

KoiSikhaDo changed discussion status to closed Sep 25, 2024

KoiSikhaDo changed discussion status to open Sep 25, 2024

Muennighoff

Sep 25, 2024

Yes, actually @soldni ran https://huggingface.co/allenai/OLMoE-1B-7B-0924-GGUF on a mobile device. I'm not sure whether there is a public guide for it anywhere.

For this model to work the same way as the one above, MolmoE support would first need to be merged into llama.cpp (i.e. through a PR like https://github.com/ggerganov/llama.cpp/pull/9462).
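
For a rough sense of why a quantised GGUF build can fit on a phone, here is a back-of-the-envelope sketch of weight memory at different bit widths. It assumes about 7B total parameters, read off the "1B-7B" in the model name, and it counts weights only. Real GGUF quantisation formats add per-block overhead, and the KV cache and runtime need extra memory, so treat the numbers as estimates rather than measured figures. The function name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumption: ~7e9 total parameters, taken from the "1B-7B" in the model name.
TOTAL_PARAMS = 7e9

# Illustrative bit widths; actual quantisation schemes use slightly more
# bits per weight than the nominal value because of scales and offsets.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gib(TOTAL_PARAMS, bits):.1f} GiB")
```

Since this is a mixture-of-experts model, all experts have to be held in memory even though only about 1B parameters are active per token, so the total parameter count is what drives the memory estimate.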

soldni

Sep 25, 2024

Stay tuned, we are trying to get this one running!

davidchi

Dec 8, 2024

@soldni Hello, are there any updates on this? I would like to test this model with vLLM.

Thank you very much for your amazing models!
