Qi Liu

baiall

AI & ML interests

None yet

Organizations

None yet

liked a model 27 days ago

ApsaraStackMaaS/EvoQwen2.5-VL-Retriever-7B-v1

Visual Document Retrieval • 8B • Updated Nov 4 • 72 • 17

liked a model 30 days ago

aliRafik/invoices-donut-finetuned-lora

Updated Aug 30 • 1

liked a model 3 months ago

moondream/moondream3-preview

Image-Text-to-Text • 9B • Updated Oct 9 • 5.9k • 531

liked a Space 3 months ago

Moondream3 Preview

🐠

Process images and text to answer questions, caption, detect objects, and find points

New activity in numind/NuMarkdown-8B-Thinking 4 months ago

NuMarkdown-8B-reasoning on A100 40GB is extremely slow (even for 1 token)

👍 1

#4 opened 4 months ago by

Fedoration

liked a model 4 months ago

google/embeddinggemma-300m

New activity in numind/NuMarkdown-8B-Thinking 4 months ago

Quantizations version

#5 opened 4 months ago by

baiall

New activity in futurehouse/ether0 5 months ago

Does it work with vLLM?

#3 opened 5 months ago by

baiall

New activity in numind/NuExtract-2.0-4B 5 months ago

Why is NuExtract-2.0-8B inferior to 4B?

#1 opened 5 months ago by

ikiransuryavanshi

reacted to anakin87's post with ❤️ 5 months ago

Haystack can now see 👀

The latest release of the Haystack OSS LLM framework adds a long-requested feature: image support!

📓 Notebooks below

This isn't just about passing images to an LLM. We built several features to enable practical multimodal use cases.

What's new?
🧠 Support for multiple LLM providers: OpenAI, Amazon Bedrock, Google Gemini, Mistral, NVIDIA, OpenRouter, Ollama and more (support for Hugging Face API coming 🔜)
🎛️ Prompt template language to handle structured inputs, including images
📄 PDF and image converters
🔍 Image embedders using CLIP-like models
🧾 LLM-based extractor to pull text from images
🧩 Components to build multimodal RAG pipelines and Agents

I had the chance to lead this effort with @sjrhuschlee (great collab).

📓 Below you can find two notebooks to explore the new features:
• Introduction to Multimodal Text Generation https://haystack.deepset.ai/cookbook/multimodal_intro
• Creating Vision+Text RAG Pipelines https://haystack.deepset.ai/tutorials/46_multimodal_rag

(🖼️ image by @bilgeyucel)