Baichuan Intelligent Technology Releases Its First Large-scale Pre-training Language Model baichuan-7B

Baichuan Intelligent Technology, a company founded by former Sogou CEO Wang Xiaochuan, has formally launched its first large-scale Chinese and English pre-trained language model, baichuan-7B. The model, which contains seven billion parameters, has been released on several platforms, including Hugging Face, GitHub, and ModelScope, and has achieved strong results on a number of benchmark tests.

The GitHub repository describes it as follows: baichuan-7B is an open-source, large-scale pre-trained language model developed by Baichuan Intelligent Technology. Based on the Transformer architecture, it contains 7 billion parameters and was trained on roughly 1.2 trillion tokens. It supports both Chinese and English with a context window of 4,096 tokens, and it achieves the best performance among models of the same size on standard, authoritative Chinese and English benchmarks (C-Eval, MMLU, etc.).

The performance of baichuan-7B has been verified through comprehensive tests on influential Chinese benchmarks such as C-Eval, AGIEval, and Gaokao. In these evaluations, the model has consistently achieved excellent results, surpassing other pre-trained models of the same parameter scale and becoming the top-performing native Chinese pre-trained model. In the AGIEval evaluation, baichuan-7B scored 34.4 points, significantly outperforming other open-source models such as LLaMA-7B, Falcon-7B, Bloom-7B, and ChatGLM-6B.

In the C-Eval test, baichuan-7B scored 42.8 points, exceeding ChatGLM-6B's 38.9 points, and in the Gaokao evaluation it scored 36.2 points, clearly leading other pre-trained models of the same parameter scale.

AGIEval is a benchmark launched by Microsoft Research aimed at comprehensively evaluating the capabilities of foundation models on human cognition and problem-solving tasks. C-Eval, jointly created by Shanghai Jiao Tong University, Tsinghua University, and the University of Edinburgh, is a comprehensive examination-style evaluation suite for Chinese language models covering 52 subjects across different fields. The Gaokao benchmark, created by a research team at Fudan University, uses Chinese college entrance examination questions as a dataset to test large models' Chinese language understanding and logical reasoning.

Baichuan-7B not only excels in Chinese but also performs strongly in English. In the MMLU evaluation, baichuan-7B scored 42.5 points, well ahead of the English open-source pre-trained model LLaMA-7B's 34.2 points and the Chinese open-source model ChatGLM-6B's 36.9 points.

SEE ALSO: iFlytek Claims Its Large Language Model Outperforms ChatGPT in Three Key Areas

The training corpus is crucial to the results of large-model training. Baichuan Intelligent Technology built a high-quality pre-training corpus based on high-quality Chinese corpora while also integrating high-quality English data. The original data includes a large amount of Chinese and English web data, some open-source Chinese and English datasets, and a large quantity of other high-quality data.

The model also underwent an efficient and stable training process. Compared to models of the same parameter scale, baichuan-7B shows superior results on key training indicators such as perplexity (PPL) and training loss.
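For context on the perplexity metric mentioned above, here is a minimal sketch (not Baichuan's actual evaluation code) of how perplexity is derived from per-token losses: it is simply the exponential of the mean per-token negative log-likelihood, so lower training loss directly implies lower PPL.

```python
import math

def perplexity(per_token_nlls):
    # Perplexity is the exponential of the mean per-token
    # negative log-likelihood (cross-entropy) over a text.
    return math.exp(sum(per_token_nlls) / len(per_token_nlls))

# Hypothetical per-token losses from a language model; a lower
# average loss yields a lower perplexity, i.e. a better fit.
losses = [2.1, 1.8, 2.4, 2.0]
print(f"PPL = {perplexity(losses):.2f}")
```

A model that assigned uniform probability over a 64,000-token vocabulary would have PPL of 64,000; real language models score far lower because they concentrate probability on likely continuations.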

Most existing open-source models have a context window of 2K tokens or less. With an optimized tokenization algorithm, baichuan-7B extends this to a dynamic window of 4K tokens, making it more versatile for a wide range of applications.
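To illustrate why window size matters, here is a hypothetical truncation helper (not part of baichuan-7B's code): any input longer than the context window must be cut down before inference, so a 4K window preserves roughly twice as much leading context as a 2K window.

```python
def fit_to_window(token_ids, max_context=4096):
    # Keep only the most recent tokens that fit in the model's
    # context window; earlier tokens are dropped from the left.
    if len(token_ids) <= max_context:
        return token_ids
    return token_ids[-max_context:]

# A 5,000-token prompt keeps its last 4,096 tokens under a 4K
# window, but would lose far more under a 2K window.
prompt = list(range(5000))
print(len(fit_to_window(prompt)))        # 4096 tokens kept
print(len(fit_to_window(prompt, 2048)))  # 2048 tokens kept
```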




Copyright for syndicated content belongs to the linked source: Pandaily – https://pandaily.com/baichuan-intelligent-technology-releases-its-first-large-scale-pre-training-language-model-baichuan-7b/
