New: Agile-Environment.md, Search_Requests.md

tlg
2026-03-13 17:25:53 +01:00
parent 0056946e8a
commit 1fcec1a954
2 changed files with 197 additions and 0 deletions

43
Search_Requests.md Normal file

@@ -0,0 +1,43 @@
# Hardware details search for AI model benchmark values
## Motivation
Benchmark speed values for a given AI model are published in tokens
per second, but the hardware used in the benchmark is not
described. Research is therefore requested to find combinations
of AI model speed values and hardware details.
## AI model
The AI model is
Qwen3 VL 30B A3B Instruct.
The model details are described at
https://huggingface.co/Qwen/Qwen3-VL-30B-A3B-Instruct
## Benchmark speed values
An example of published benchmark speed values is at
https://artificialanalysis.ai/models/qwen3-vl-30b-a3b-instruct/providers
Three relevant speed values are provided there:
- Fireworks: 141.7 t/s
- Novita: 105.3 t/s
- Alibaba Cloud: 104.3 t/s

The fourth speed value is not relevant because it is for a quantized
version of the model (FP8, 8 bits per parameter).
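
The difference between 16 and 8 bits per parameter can be made concrete with a rough weight-size calculation. This is a minimal sketch, not a measured value: the total parameter count of 30 billion is an assumption read off the model name ("30B"), and the function name is hypothetical.

```python
# Assumption: nominal total parameter count taken from the model name "30B".
TOTAL_PARAMS = 30e9


def weight_size_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


size_16bit_gb = weight_size_gb(TOTAL_PARAMS, 16)  # original parameter size
size_8bit_gb = weight_size_gb(TOTAL_PARAMS, 8)    # FP8 quantized version

print(f"16 bit: {size_16bit_gb:.0f} GB, 8 bit: {size_8bit_gb:.0f} GB")
```

Under these assumptions the quantized version needs about half the memory for its weights, which is why its speed value is not comparable with the other three.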
## Tasks
Your tasks are:
1. Find out which hardware was used for the three speed values of the example
and what VRAM throughput in GB/s this hardware has.
2. If you were not able to solve 1., search for at least two
combinations of speed values and hardware details for the AI model
at the original parameter size of 16 bits per parameter.
If successful, stop here.
3. If you were able to solve neither 1. nor 2.,
search for at least two
combinations of speed values and hardware details for the AI model
at the quantized parameter size of 8 bits per parameter.
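
One way to relate a speed value in t/s to a VRAM throughput in GB/s is a rough memory-bandwidth estimate: if generating one token requires reading every active weight once, then throughput divided by bytes per token gives an upper bound on speed. The sketch below is illustrative only. It assumes about 3 billion active parameters per token (read off "A3B" in the model name), a single request stream, and no KV-cache traffic, batching, or speculative decoding; the function names are hypothetical.

```python
# Assumption: nominal active parameters per token, from "A3B" in the model name.
ACTIVE_PARAMS = 3e9

# Speed values from the Artificial Analysis example, in tokens per second.
PROVIDER_SPEEDS = {"Fireworks": 141.7, "Novita": 105.3, "Alibaba Cloud": 104.3}


def bytes_per_token(bits_per_param: int = 16) -> float:
    """Bytes of weights read per generated token under the assumptions above."""
    return ACTIVE_PARAMS * bits_per_param / 8


def max_tokens_per_second(bandwidth_gb_s: float, bits_per_param: int = 16) -> float:
    """Upper bound on t/s for a given VRAM throughput in GB/s."""
    return bandwidth_gb_s * 1e9 / bytes_per_token(bits_per_param)


def min_bandwidth_gb_s(tokens_per_second: float, bits_per_param: int = 16) -> float:
    """VRAM throughput in GB/s needed to reach a given t/s."""
    return tokens_per_second * bytes_per_token(bits_per_param) / 1e9


for provider, speed in PROVIDER_SPEEDS.items():
    print(f"{provider}: {speed} t/s needs at least {min_bandwidth_gb_s(speed):.1f} GB/s")
```

Such an estimate can only serve as a plausibility check for hardware details found during the search; it does not replace the hardware information requested in the tasks.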