Qwerky 72B (free)
featherless/qwerky-72b:free
About Qwerky 72B (free)
Qwerky-72B is a linear-attention RWKV variant of the Qwen 2.5 72B model, designed to significantly reduce computational cost at scale. By replacing quadratic attention with RWKV's linear attention, it achieves large reported inference speedups (>1000x) while retaining competitive accuracy on common benchmarks such as ARC, HellaSwag, LAMBADA, and MMLU. It inherits knowledge and language support from Qwen 2.5, covering approximately 30 languages, which makes it well suited to efficient inference in large-context applications.
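The model ID on this page follows the OpenRouter slug format, so a minimal sketch of calling it through an OpenAI-compatible chat completions endpoint might look like the following. The base URL and the OPENROUTER_API_KEY environment variable name are assumptions; adjust them for your provider.

```python
# Minimal sketch: querying Qwerky 72B (free) via an OpenAI-compatible
# chat completions API. Base URL and env var name are assumptions.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # assumed OpenRouter endpoint
    api_key=os.environ["OPENROUTER_API_KEY"],  # assumed env var name
)

response = client.chat.completions.create(
    model="featherless/qwerky-72b:free",  # slug from this page
    messages=[
        {"role": "user", "content": "Summarize RWKV linear attention in two sentences."}
    ],
)
print(response.choices[0].message.content)
```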
Specifications
Context Length: 32,768 tokens
Tokenizer: Other
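Because the tokenizer is listed only as "Other", exact token counts cannot be computed client-side. One rough way to stay under the 32,768-token window is a character-based estimate; the ~4 characters-per-token ratio below is an assumption, not a property of this model's tokenizer.

```python
# Rough guard against overflowing the 32,768-token context window.
# The tokenizer is unspecified ("Other"), so we assume ~4 characters
# per token as a coarse heuristic; treat this as an approximation.
CONTEXT_LIMIT = 32_768
CHARS_PER_TOKEN = 4  # assumed average; tune for your data

def fits_in_context(prompt: str, reserved_for_output: int = 1024) -> bool:
    """Conservatively estimate whether a prompt fits the context window."""
    estimated_tokens = len(prompt) // CHARS_PER_TOKEN + 1
    return estimated_tokens + reserved_for_output <= CONTEXT_LIMIT

print(fits_in_context("Hello, Qwerky!"))  # True
```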
Pricing
Prompt: $0.000
Completion: $0.000
Image: $0
Request: $0

All rates are $0; this variant of the model is free to use.
Last updated: 4/11/2025