Luminal Cloud

Chat with images using Moondream 3

Performance Benchmarks

Results from 100 requests, each with a context length of 800 tokens

Scenario      Request pattern            Prefill (tok/s)  Decode (tok/s)
Burst Load    All requests simultaneous  49,498           1,002
Low Traffic   10 requests per second     4,518            91
High Traffic  50 requests per second     17,579           355

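Request rate strongly affects token throughput because the server batches concurrent requests: prefill rises from 4,518 tok/s at 10 requests per second to 17,579 tok/s at 50 requests per second, and to 49,498 tok/s when all requests arrive at once.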
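To make the scaling concrete, the short Python sketch below expresses each scenario's throughput as a multiple of the Low Traffic figures. The numbers are copied from the results above; the names `BENCHMARKS` and `speedup_over` are illustrative only and are not part of any Luminal API.

```python
# Throughput figures copied from the benchmark results (tokens per second).
BENCHMARKS = {
    "Burst Load": {"prefill": 49_498, "decode": 1_002},
    "Low Traffic": {"prefill": 4_518, "decode": 91},
    "High Traffic": {"prefill": 17_579, "decode": 355},
}


def speedup_over(baseline, results=BENCHMARKS):
    """Return each scenario's throughput as a multiple of the baseline scenario."""
    base = results[baseline]
    return {
        name: {phase: round(value / base[phase], 2) for phase, value in figures.items()}
        for name, figures in results.items()
    }


if __name__ == "__main__":
    for name, ratios in speedup_over("Low Traffic").items():
        print(f"{name}: {ratios['prefill']}x prefill, {ratios['decode']}x decode")
```

On these figures, High Traffic delivers roughly 3.9 times the Low Traffic throughput and Burst Load roughly 11 times, for both prefill and decode.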