Inception, a subsidiary of Abu Dhabi artificial intelligence (AI) firm G42 Group, has launched “Jais”, an advanced Arabic large language model that can power generative AI applications, according to state news agency WAM.
Jais is a 13-billion parameter model trained on a newly developed 395-billion-token Arabic and English dataset, part of which is computer code. Its release marks a significant milestone for AI in the Arabic-speaking world.
“With this release, we are setting a new standard for AI advancement in the Middle East and ensuring that the Arabic language, with its depth and heritage, finds its voice within the AI landscape. Jais is a testament to our commitment to excellence and dedication to democratising AI and promoting innovation,” said Andrew Jackson, CEO of Inception.
The homegrown large language model offers more than 400 million Arabic speakers the opportunity to harness the potential of generative AI.
G42 said Jais will facilitate and expedite innovation while cementing Abu Dhabi’s position as a hub for AI, innovation, culture preservation and international collaboration.
By open-sourcing Jais, Inception aims to engage the scientific, academic, and developer communities to accelerate the growth of a vibrant Arabic language AI ecosystem.
The model outperforms existing Arabic models by a sizable margin and is also competitive with English models of similar size.
“Today, we are excited to announce the launch of Jais, the world’s highest quality open-source Arabic large language model (#LLM) and a collaboration between Inception, a G42 company, @MBZUAI & @CerebrasSystems,” G42 (@G42ai) posted on Twitter on August 30, 2023.
Named after the highest peak in the UAE, Jais is a collaboration between G42’s Inception, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and Cerebras Systems. “We believe that innovation thrives when we collaborate,” added Jackson.
Jais is a transformer-based large language model that incorporates many cutting-edge features, including ALiBi position embeddings, which enable the model to extrapolate to much longer inputs, providing better context handling and accuracy.
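For readers unfamiliar with ALiBi, the idea can be sketched in a few lines of Python. This is an illustration of the general technique, not Jais's actual code: each attention head adds a penalty to its attention scores that grows linearly with the distance between the query and key positions, and no learned position vectors are involved. The function names and the head count of 8 are assumptions chosen for the example, and the slope schedule shown is the one commonly used for power-of-two head counts.

```python
def alibi_slopes(num_heads):
    """Per-head slopes: a geometric sequence starting at 2^(-8/num_heads).

    Assumes num_heads is a power of two (illustrative choice).
    """
    start = 2 ** (-8 / num_heads)
    return [start ** (h + 1) for h in range(num_heads)]


def alibi_bias(seq_len, slope):
    """Bias added to one head's attention scores before the softmax.

    bias[i][j] = -slope * (i - j) for keys at or before the query position;
    future positions are masked with -inf (causal attention).
    """
    return [
        [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)             # hypothetical 8-head model
bias = alibi_bias(4, slopes[0])      # 4-token sequence, first head
```

Because the penalty depends only on relative distance, the same rule applies unchanged to sequences longer than those seen in training, which is the property behind the extrapolation claim.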
“Developing such a high-calibre Arabic LLM demanded cutting-edge AI research in addition to an in-depth and nuanced understanding of the Arabic language, its diversity and heritage, and the growing importance of LLMs across all echelons of society,” said MBZUAI President and University Professor Eric Xing.
Other state-of-the-art techniques include SwiGLU and maximal update parameterisation to improve the model’s training efficiency and accuracy.
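As a rough illustration of SwiGLU, the sketch below implements a gated feed-forward layer in plain Python. It is a generic example rather than Jais's implementation: the helper names, the tiny 2x2 weight matrices and the omission of bias terms are all assumptions made for readability. One projection of the input passes through the Swish (SiLU) function and acts as a gate on a second projection, and the gated result is projected back down.

```python
import math


def silu(z):
    """Swish/SiLU activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def matvec(x, w):
    """Multiply a vector x (length d_in) by a matrix w (d_in rows, d_out columns)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """Feed-forward layer with a SwiGLU activation (no bias terms)."""
    gate = [silu(g) for g in matvec(x, w_gate)]   # gating branch
    up = matvec(x, w_up)                          # linear branch
    hidden = [g * u for g, u in zip(gate, up)]    # element-wise gating
    return matvec(hidden, w_down)                 # project back to model width


identity = [[1.0, 0.0], [0.0, 1.0]]               # toy weights for illustration
out = swiglu_ffn([1.0, 2.0], identity, identity, identity)
```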
Jais is available for download on Hugging Face. Users can also try Jais online upon registering interest on its website and receiving an invite to access the playground environment.
Jais leverages Condor Galaxy 1
Jais was created with the help of Condor Galaxy 1 (CG-1), the supercomputer produced by Silicon Valley-based Cerebras Systems, which designs dinner plate-sized chips. CG-1 has a capacity of 4 exaFLOPs and 54 million cores, with 64 Cerebras CS-2 systems linked together into a single, easy-to-use AI supercomputer.
CG-1 is part of a network of nine interconnected supercomputers that were unveiled by G42 and Cerebras Systems in July. The supercomputers offer a new approach to AI computing that promises to significantly reduce model training time.
Cerebras and G42 offer CG-1 as a cloud service, allowing customers to enjoy the performance of an AI supercomputer without having to manage or distribute models over physical systems.
The two entities plan to deploy two more such supercomputers, CG-2 and CG-3, in the US in early 2024. With a planned capacity of 36 exaFLOPs in total, this unprecedented supercomputing network will revolutionise the advancement of AI globally.
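The capacity figures quoted above can be cross-checked with simple arithmetic. The short Python sketch below assumes that each of the nine planned supercomputers matches CG-1's 4 exaFLOPs and that the 54 million cores are spread evenly across the 64 CS-2 systems; the variable names are illustrative, and the per-system values are derived estimates rather than figures stated by G42 or Cerebras.

```python
# Figures reported in the article for Condor Galaxy 1 (CG-1)
EXAFLOPS_PER_SUPERCOMPUTER = 4
CORES_PER_SUPERCOMPUTER = 54_000_000   # rounded figure as reported
SYSTEMS_PER_SUPERCOMPUTER = 64         # Cerebras CS-2 systems
PLANNED_SUPERCOMPUTERS = 9

# Assumption: every supercomputer in the network has CG-1's capacity
total_exaflops = PLANNED_SUPERCOMPUTERS * EXAFLOPS_PER_SUPERCOMPUTER

# Derived per-system estimates (1 exaFLOP = 1,000 petaFLOPs)
petaflops_per_system = EXAFLOPS_PER_SUPERCOMPUTER * 1000 / SYSTEMS_PER_SUPERCOMPUTER
cores_per_system = CORES_PER_SUPERCOMPUTER / SYSTEMS_PER_SUPERCOMPUTER

print(total_exaflops, petaflops_per_system, cores_per_system)
```

Under these assumptions, nine supercomputers at 4 exaFLOPs each account for the planned 36 exaFLOPs, and each CS-2 system contributes roughly 62.5 petaFLOPs.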