Models and configs for Ollama backend
#4
by kndtran - opened
For granite-common issue https://github.com/ibm-granite/granite-common/issues/94.
PR notes
- Conversion scripts for the Ollama backend. Designed to run once to create the files in this PR.
  - `convert_io_yaml_files.py` - rewrites the vLLM `io.yaml` for an Ollama backend
    - The `response_format` value is remapped to a different location in the `io.yaml` for Ollama. No modification of the value is currently needed.
  - `convert_to_gguf.py` - converts the `.safetensors` to `.gguf` format and writes the `Modelfile` for each LoRA adapter
- Ollama model naming scheme on the filesystem renamed from `granite4:micro` to `granite4_micro` to avoid issues on Windows systems.
- Simplify the `run_ollama.sh` script to assume that the repo already contains converted files.
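
The renaming and `Modelfile` notes above can be illustrated with a minimal sketch. It is not the code from `convert_to_gguf.py`; the function names and the `./adapter.gguf` path are hypothetical, chosen only for illustration. The sketch assumes that the Windows issue comes from the `:` character, which is not allowed in Windows file names, and that each adapter `Modelfile` uses Ollama's `FROM` and `ADAPTER` instructions.

```python
def fs_safe_model_name(ollama_tag: str) -> str:
    """Map an Ollama tag such as 'granite4:micro' to a directory name.

    Assumption: ':' is the only problematic character, since Windows
    does not allow it in file or directory names.
    """
    return ollama_tag.replace(":", "_")


def render_modelfile(base_model: str, adapter_path: str) -> str:
    """Return Modelfile text that layers a GGUF LoRA adapter on a base model.

    FROM and ADAPTER are Ollama Modelfile instructions; the arguments
    passed below are placeholders, not values taken from this PR.
    """
    return f"FROM {base_model}\nADAPTER {adapter_path}\n"


# Hypothetical usage: the on-disk name drops the colon, while the
# Modelfile still refers to the base model by its Ollama tag.
directory_name = fs_safe_model_name("granite4:micro")
modelfile_text = render_modelfile("granite4:micro", "./adapter.gguf")
print(directory_name)
print(modelfile_text)
```

Keeping the filesystem name separate from the Ollama tag means only the directory layout changes; the tag that Ollama itself sees can stay in its usual `name:tag` form.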
kndtran changed pull request status to open
Please do not merge this branch yet. There are issues with some of the quantized LoRA adapters as reported in the granite-common issue above. I can't seem to "unpublish" this PR.
This branch is ready to merge. The corresponding granite-common issue above will need to be updated to use the main branch once this PR is merged.
frreiss changed pull request status to merged