HIT Shenzhen Team Develops Multimodal Large Model ‘JiuTian’, Tops OpenCompass Ranking

A team from the Computing and Intelligence Research Institute at Harbin Institute of Technology (Shenzhen), relying on Shenzhen Hashen Asset Management Co., Ltd. to commercialize its research achievements, has established a multimodal large model development company: Shenzhen Ruoyu Technology Co., Ltd. (abbreviated as ‘Ruo Yu Technology’).

‘JiuTian’, the first multimodal large model from Shenzhen Ruoyu Technology Co., Ltd., topped the OpenCompass multimodal large model ranking in its debut evaluation.

‘123 billion parameters’, ‘120 million image-text pairs’, ‘5.5 million bilingual language samples’, ‘1.2 million fine-tuning data samples’, ‘500,000 reinforcement data samples’… The improvement in these core figures brings about a qualitative change in the model’s capabilities. The JiuTian multimodal large model has achieved remarkable performance in logical reasoning, relational reasoning, and perceptual abilities.

With billions of parameters, JiuTian fuses text, images, audio, and video. Its intelligent understanding and response capabilities not only cover fields such as natural language processing, computer vision, and speech recognition but also effectively break down the information boundaries between different modalities, integrating them into a unified ‘JiuTian’.

‘“JiuTian” symbolizes the highest celestial realm in ancient Chinese mythology, representing our boundless pursuit of technological progress and our longing for an intelligent future. With its powerful understanding and response capabilities, this model transcends the boundaries between modalities such as text, images, audio, and video, achieving true multimodal fusion,’ explained Dr. Sun Teng, CEO of Ruoyu Technology. ‘By finding bridges that connect different fields in a disordered, fragmented world of information, and by integrating information from domains such as natural language processing, computer vision, and speech recognition, the model breaks down the information silos between modalities and achieves a truly orderly flow and exchange of information.’

The Shenzhen campus of Harbin Institute of Technology has established an asset joint-stock company to encourage faculty and staff to commercialize and implement their research achievements. HIT (Shenzhen) receives policy support for the integration of industry, education, and research. Shenzhen Ruoyu Technology Co., Ltd. was established with the university as an initial shareholder from the start, which has provided strong support for the company’s development.

Recently, the well-known journal IEEE Intelligent Systems announced its list of ‘AI’s 10 to Watch’ for 2022. Professor Nie Liqiang was included on this list for his contributions to multimodal research. Professor Nie is a recipient of the DAMO Academy Qingcheng Award and the TR35 China Award. He stated that the achievements of Harbin Institute of Technology (Shenzhen) in artificial intelligence should not only exist inside laboratories but also be transformed into practical applications that serve national defense, aerospace, and society.

Another AI expert among the co-founders of Ruoyu Technology Co., Ltd. is Professor Zhang Min. Professor Zhang is the Assistant President of Harbin Institute of Technology (Shenzhen), the first Distinguished Young Scholar in the NLP field in China, a national “Top Talent” recipient, a mid-career expert recognized by the state for outstanding contributions, and a recipient of the State Council special allowance. Harbin Institute of Technology ranks first among Chinese research institutions in NLP according to CSRankings (2022-2023), an authoritative ranking in computer science. Professor Zhang is the most influential figure at Harbin Institute of Technology in this field.

Dr. Sun Teng, co-founder and CEO of Ruoyu Technology Co., Ltd., is also a core expert in the company’s research and development team. Dr. Sun’s research has always focused on multimedia computing, with related work published at CCF A-class conferences and in IEEE/ACM Transactions journals. Dr. Sun has previous successful entrepreneurial experience and end-to-end experience in applying artificial intelligence technology in vertical fields, as well as company management expertise.

Geng Chen, another co-founder of Ruoyu Technology Co., Ltd., serves as the company’s strategic advisor. He has been repeatedly recognized as the best technology analyst by New Fortune magazine and has accumulated rich industry resources over his years as a research analyst. He is responsible for investment and financing activities and for connecting the company with industrial resources for real-world deployment.

‘Ruoyu Technology Co., Ltd. was established at this moment with its own historical mission and ideals. As cutting-edge researchers, we deeply feel the transformative impact of artificial intelligence on future society. The productivity explosion brought by generative AI will redefine production relationships in various industries. It is our honor and mission to have the opportunity to participate in it.’ Computing power, data, and technology are the three main barriers to entering the field of large models, and Ruoyu Technology Co., Ltd. has gathered these core elements since its inception. The in-house research and development team, led by top talent, has built independent iteration capabilities. In the future, under the leadership of technical experts, ‘JiuTian’ will continue to iterate.

With a top-notch founding team, core capabilities in self-developed multimodal large models, and successful practical experience, Ruo Yu Technology says it will bring a touch of brilliance to the ‘Battle of a Hundred Models’.

Reshaping every sector on the foundation of large model capabilities has become an industry consensus. According to OpenAI’s development path, when models reach a certain size, new abilities emerge, including some previously unseen capabilities.

On whether JiuTian will continue to iterate in the future, Dr. Sun Teng said: ‘JiuTian is still iterating in both directions, larger and smaller. On the one hand, it is increasing the scale of its parameters to explore the points at which the abilities of general multimodal large models emerge. On the other hand, in order to meet the application needs of industry customers and achieve maximum effect with minimal computing power, it is necessary to compress large models into lightweight ones and combine them with edge computing devices.’
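The article does not say how Ruo Yu Technology compresses its models. As an illustration only, the sketch below shows one widely used compression idea, symmetric 8-bit weight quantization, on a toy list of weights. The function names `quantize_int8` and `dequantize` and the sample values are hypothetical and are not taken from JiuTian.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale.

    Illustrative assumption: a single scale for the whole list
    (per-tensor symmetric quantization).
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


# Toy weights standing in for one layer of a much larger model.
weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored as a small integer plus one shared scale, which cuts memory roughly fourfold compared with 32-bit floats at the cost of a bounded rounding error. That trade-off is what makes approaches of this kind attractive for edge devices.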

Built on the multimodal framework of ‘JiuTian’, Ruo Yu Technology’s business model differs fundamentally from that of the AI 1.0 era. In the past, the business model required redeveloping algorithms for each specific demand and operated on a project basis. With ‘JiuTian’ as a unified multimodal foundation, there is no need to redesign the framework; only minor adjustments based on data from different industries are needed to obtain the corresponding industry models. Customers can even make further adjustments themselves, using their own data, to suit their specific domain requirements.

The difficulty of multimodal large models lies in the fusion of multimodal information. Common fusion methods include linear addition, cascading, and other relatively crude means. However, the final effect is often not as impressive as that of a single modality. This is because some technical teams lack experience and capability in fine-tuning on multimodal data and in integrating and aligning multimodal features.
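To make the two ‘crude’ fusion methods named above concrete, here is a minimal sketch on toy feature vectors. It assumes that ‘cascading’ refers to concatenating feature vectors end to end. The function names, the weighting parameter, and the sample values are hypothetical and do not describe JiuTian’s own fusion method.

```python
from typing import List

Vector = List[float]


def fuse_by_addition(text_vec: Vector, image_vec: Vector,
                     text_weight: float = 0.5) -> Vector:
    """Linear (weighted) addition: both inputs must share one dimension."""
    if len(text_vec) != len(image_vec):
        raise ValueError("additive fusion needs vectors of equal length")
    image_weight = 1.0 - text_weight
    return [text_weight * t + image_weight * i
            for t, i in zip(text_vec, image_vec)]


def fuse_by_concatenation(text_vec: Vector, image_vec: Vector) -> Vector:
    """Cascading, read here as concatenation: stack the vectors end to end."""
    return list(text_vec) + list(image_vec)


# Toy features standing in for encoder outputs of two modalities.
text_features = [1.0, 0.0, 2.0]
image_features = [3.0, 4.0, 0.0]

added = fuse_by_addition(text_features, image_features)
concatenated = fuse_by_concatenation(text_features, image_features)
print(added)         # [2.0, 2.0, 1.0]
print(concatenated)  # [1.0, 0.0, 2.0, 3.0, 4.0, 0.0]
```

Neither method models how the modalities relate to each other. Addition blends values position by position, and concatenation leaves all interaction to later layers, which is one way to read the article’s point that alignment matters more than the fusion operator itself.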

JiuTian has a fully integrated, independently developed model training framework covering multimodal feature extraction, alignment, fusion, and inference, as well as a comprehensive and meticulous process for collecting and cleaning multimodal data. The model’s top ranking on the multimodal large model list demonstrates the team’s leading capabilities in the field of multimodal large models.

Robots are system-level application products in the industrial field, and they are a key direction empowered by the ‘Ruo Yu-Jiu Tian’ multimodal large model base. Harbin Institute of Technology has deep industry-academia-research experience in robotics. In the future, embodied robots will require the fusion of multimodal information such as speech, vision, decision-making, and control to form a closed loop. The ‘JiuTian’ multimodal large model base will further integrate research built on Harbin Institute of Technology’s accumulated expertise in robotics, and the company has already established deep cooperation with several large consumer electronics and automotive companies.

With the ‘JiuTian’ multimodal large model base, Ruo Yu Technology can provide personalized and customized services for customers in various fields by fine-tuning its existing multimodal large model bases. It offers capabilities such as language pre-trained large models, multimodal pre-trained large models, and vertical-domain pre-trained large models, aiming to build a general-purpose AI platform and infrastructure for the future.

Copyright for syndicated content belongs to the linked source: Pandaily – https://pandaily.com/hit-shenzhen-team-develops-multimodal-large-model-jiutian-tops-opencompass-ranking/
