MediaTek Research launches the world’s first AI LLM in Traditional Chinese

    MediaTek’s AI research group has recently released the world’s first Large Language Model (LLM) for Traditional Chinese. The multilingual model, BLOOM-zh, outperforms its predecessor, BLOOM, on most Traditional Chinese benchmarks while maintaining its English capability.

    Starting from the released BLOOM models, the team extended pre-training by an additional 7.4 billion tokens of Traditional Chinese and English text covering a variety of domains, including news articles, books, encyclopedias, educational materials, and spoken language. To characterize BLOOM-zh, the team evaluated its performance on both existing and newly created benchmark scenarios. The research was conducted in collaboration with Academia Sinica.

    • For further information, a technical paper is available here
    • The model is publicly available to use here