Qwen2-Math: A new era for AI maths whizzes

Alibaba Cloud’s Qwen team has unveiled Qwen2-Math, a series of large language models specifically designed to tackle complex mathematical problems.

These new models, built upon the existing Qwen2 foundation, demonstrate remarkable proficiency in solving arithmetic and mathematical challenges, and outperform former industry leaders.

The Qwen team built Qwen2-Math using a vast and diverse mathematics-specific corpus. This corpus comprises high-quality resources including web texts, books, code, exam questions, and synthetic data generated by Qwen2 itself.

Rigorous evaluation on both English and Chinese mathematical benchmarks, including GSM8K, MATH, MMLU-STEM, CMATH, and GaoKao Math, revealed the exceptional capabilities of Qwen2-Math. Notably, the flagship model, Qwen2-Math-72B-Instruct, surpassed the performance of proprietary models such as GPT-4o and Claude 3.5 in various mathematical tasks.

“Qwen2-Math-Instruct achieves the best performance among models of the same size, with RM@8 outperforming Maj@8, particularly in the 1.5B and 7B models,” the Qwen team noted.

This superior performance is attributed to the effective implementation of a math-specific reward model during the development process.
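
For readers unfamiliar with the two metrics in the quote: as commonly defined, Maj@8 samples eight responses and takes the most frequent final answer, while RM@8 samples eight responses and takes the one the reward model scores highest. The sketch below illustrates that difference only. The function names, answers, and scores are invented for illustration and are not taken from the Qwen team's evaluation code.

```python
from collections import Counter


def maj_at_k(answers):
    """Majority voting: return the most frequent final answer among the k samples."""
    return Counter(answers).most_common(1)[0][0]


def rm_at_k(answers, reward_scores):
    """Reward-model selection: return the answer of the highest-scored sample."""
    best_index = max(range(len(answers)), key=lambda i: reward_scores[i])
    return answers[best_index]


# Hypothetical data: final answers from 8 sampled responses to one problem,
# plus made-up reward-model scores for each response.
sampled_answers = ["12", "15", "12", "12", "15", "18", "12", "15"]
reward_scores = [0.21, 0.93, 0.30, 0.25, 0.88, 0.10, 0.35, 0.90]

maj_answer = maj_at_k(sampled_answers)               # most common answer
rm_answer = rm_at_k(sampled_answers, reward_scores)  # best-scored answer

print("Maj@8 picks:", maj_answer)
print("RM@8 picks:", rm_answer)
```

In this toy case the two strategies disagree: the majority answer differs from the answer the reward model prefers. RM@8 outperforms Maj@8 when the reward model is right more often than the crowd in such disagreements.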

Further showcasing its prowess, Qwen2-Math demonstrated impressive results in challenging mathematical competitions such as the American Invitational Mathematics Examination (AIME) 2024 and the American Mathematics Competitions (AMC) 2023.

To ensure the model’s integrity and prevent contamination, the Qwen team implemented robust decontamination methods during both the pre-training and post-training phases. This rigorous approach involved removing duplicate samples and identifying overlaps with test sets to maintain the model’s accuracy and reliability.
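
The two steps described above, dropping duplicates and filtering out samples that overlap with test sets, can be sketched roughly as follows. This is a minimal illustration, not the Qwen team's pipeline: the article does not say how overlap is measured, so the word n-gram matching, the default `n=5`, the function names, and the sample data are all assumptions.

```python
def word_ngrams(text, n):
    """Return the set of lowercase word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def decontaminate(train_samples, test_samples, n=5):
    """Drop exact duplicates and samples sharing any word n-gram with a test item.

    Assumption: overlap is measured by shared word n-grams; the threshold n
    is a hypothetical choice for this sketch.
    """
    test_ngrams = set()
    for item in test_samples:
        test_ngrams |= word_ngrams(item, n)

    seen = set()
    kept = []
    for sample in train_samples:
        # Normalize case and whitespace so trivial variants count as duplicates.
        key = " ".join(sample.lower().split())
        if key in seen:
            continue
        seen.add(key)
        # Skip samples that overlap with any benchmark item.
        if word_ngrams(sample, n) & test_ngrams:
            continue
        kept.append(sample)
    return kept


# Hypothetical training and test data.
train = [
    "What is 2 + 2? The answer is 4.",
    "what is 2 + 2?  The answer is 4.",
    "If a train travels 60 miles in 2 hours, what is its average speed?",
    "Compute the derivative of x squared.",
]
test = ["A train travels 60 miles in 2 hours, what is its average speed in mph?"]

clean = decontaminate(train, test)
print(clean)
```

Here the second sample is dropped as a duplicate of the first, and the third is dropped because it shares a five-word run with the test item.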

Looking ahead, the Qwen team plans to expand Qwen2-Math’s capabilities beyond English, with bilingual and multilingual models in the pipeline. This commitment to inclusivity aims to make advanced mathematical problem-solving accessible to a global audience.

“We will continue to enhance our models’ ability to solve complex and challenging mathematical problems,” affirmed the Qwen team.

The Qwen2 models are available on Hugging Face.
