A quick way to use open-source LLMs like Qwen2 is to pull the model into a Docker container based on the Ollama image. Ollama is an open-source tool for running LLMs locally. The commands below pull the Ollama image, start a container that exposes the API on port 11434, and then pull the qwen2:0.5b model inside it.
➜ docker pull ollama/ollama:latest
➜ docker run -d --name ollama -p 11434:11434 -v ollama_data:/root/.ollama ollama/ollama:latest
➜ docker exec -it ollama ollama pull qwen2:0.5b
➜ docker exec -it ollama ollama list
NAME          ID              SIZE      MODIFIED
qwen2:0.5b    6f48b936a09f    352 MB    23 seconds ago
To test it, send a simple prompt to Ollama's HTTP API and confirm that the model responds.
➜ curl -s http://localhost:11434/api/generate \
-d '{"model":"qwen2:0.5b","prompt":"hello","stream":false}'
{"model":"qwen2:0.5b","created_at":"2026-02-18T12:33:01.697006098Z","response":"Hello! How can I assist you today?","done":true,"done_reason":"stop","context":[151644,872,198,14990,151645,198,151644,77091,198,9707,0,2585,646,358,7789,498,3351,30],"total_duration":1018464957,"load_duration":860213493,"prompt_eval_count":9,"prompt_eval_duration":54162213,"eval_count":10,"eval_duration":92238954}
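In code you will usually want just the `response` field rather than the full JSON blob. A minimal Python sketch of the same request, using only the standard library; the URL, model name, and payload mirror the curl call above, and the function names here are just for illustration:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama API endpoint

def extract_text(body):
    """Pull the generated text out of Ollama's JSON reply."""
    return body["response"]

def generate(prompt, model="qwen2:0.5b", url=OLLAMA_URL):
    """Send a non-streaming generate request and return only the response text."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return extract_text(json.load(resp))
```

With the container running, `generate("hello")` should return the same text seen in the `response` field of the curl output.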