Hardware details search for AI model benchmark values
Motivation
Benchmark speed values for a given AI model are published in tokens per second, but the hardware used for the benchmark is not described. Research is therefore requested to find combinations of AI model speed values and hardware details.
AI model
The AI model is Qwen3 VL 30B A3B Instruct. The model details are described at https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct
Benchmark speed values
An example of published benchmark speed values is at https://artificialanalysis.ai/models/qwen3-vl-30b-a3b-instruct/providers
Three relevant speed values are provided there:
- Fireworks: 141.7 t/s
- Novita: 105.3 t/s
- Alibaba Cloud: 104.3 t/s
The fourth speed value is not relevant because it is for a quantized version of the model (FP8, 8 bits per parameter).
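To relate the speed values above to VRAM throughput, a rough lower bound can be sketched as follows. This is a minimal estimate under stated assumptions, not a description of how any provider actually runs the model: it assumes roughly 3 billion active parameters per token (read off the "A3B" suffix in the model name), 2 bytes per parameter for the 16-bit version, a single request at a time, and no KV-cache or activation traffic. The function name `min_bandwidth_gbs` and the constants are hypothetical. Providers typically batch requests and may spread the model over several GPUs, so real hardware can deviate from this estimate in either direction.

```python
# Rough lower-bound estimate; all constants are assumptions, not measured values.
ACTIVE_PARAMS = 3e9          # assumed from the "A3B" suffix (~3 billion active parameters per token)
BYTES_PER_PARAM_16BIT = 2    # 16 bits per parameter

# Speed values quoted above, in tokens per second.
SPEEDS_TPS = {"Fireworks": 141.7, "Novita": 105.3, "Alibaba Cloud": 104.3}


def min_bandwidth_gbs(tokens_per_second,
                      active_params=ACTIVE_PARAMS,
                      bytes_per_param=BYTES_PER_PARAM_16BIT):
    """Return the memory throughput in GB/s needed to read all active
    weights once per generated token (single request, weights only)."""
    bytes_per_token = active_params * bytes_per_param
    return tokens_per_second * bytes_per_token / 1e9


for provider, tps in SPEEDS_TPS.items():
    print(f"{provider}: {tps} t/s -> at least {min_bandwidth_gbs(tps):.0f} GB/s")
```

Passing `bytes_per_param=1` gives the corresponding estimate for the 8-bit quantized version.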
Tasks
Your tasks are:
1. Find out which hardware was used for the three speed values of the example and what VRAM throughput in GB/s this hardware has.
2. If you were not able to solve task 1, search for at least two combinations of speed values and hardware details for the AI model at its original parameter size of 16 bits per parameter. If you succeed, end here.
3. If you were able to solve neither task 1 nor task 2, search for at least two combinations of speed values and hardware details for the AI model quantized to 8 bits per parameter.