Building MediaTek’s AI future with an on-premises AI factory powered by NVIDIA

    As a global leader powering nearly two billion devices each year, MediaTek is accelerating its evolution into a full-stack AI innovator. With AI now transforming experiences across mobile, home, automotive, and enterprise workflows, we are investing strategically to continually advance the AI capabilities of our chipsets, foster a vibrant developer ecosystem, and deliver turnkey, localized AI solutions that drive long-term demand.

    To support these initiatives, we established a secure, high-performance on-premises AI factory built on NVIDIA DGX SuperPOD™. This environment centralizes our most demanding training and inference workloads, unlocks major productivity gains, and provides a scalable foundation for next-generation AI development.

    Scaling AI ambitions with a unified high-performance platform

    As we’ve grown our LLM efforts, such as the Breeze series, which includes a 480-billion-parameter Traditional Chinese model, the compute demands quickly outpaced conventional infrastructure. We routinely process billions of tokens each month and run tens of thousands of training iterations, all while protecting sensitive data and enabling rapid experimentation.

    NVIDIA DGX SuperPOD provided the ideal solution to meet our business needs; its scalable architecture, powered by NVIDIA Blackwell-based systems, allows us to train and deploy models with hundreds of billions of parameters at unprecedented speed and efficiency.

    “Our AI factory, powered by DGX SuperPOD, processes approximately 60 billion tokens per month for inference and completes thousands of model-training iterations every month,”

    David Ku, Co-COO and CFO at MediaTek
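    To put that figure in perspective, 60 billion tokens per month corresponds to a sustained average of roughly 23,000 tokens per second. A minimal back-of-the-envelope check (the 30-day month is an assumption for this sketch):

```python
# Back-of-the-envelope: sustained token rate implied by
# 60 billion inference tokens per month (assuming a 30-day month).
tokens_per_month = 60e9
seconds_per_month = 30 * 24 * 3600  # 2,592,000 s

avg_tokens_per_second = tokens_per_month / seconds_per_month
print(round(avg_tokens_per_second))  # prints 23148, i.e. ~23k tokens/s
```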

    Model inference at this scale often requires distributing massive models across multiple GPUs. The DGX SuperPOD’s tightly coupled systems make this possible, delivering ultra-fast communication, synchronized GPU memory access, and industry-leading throughput.
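    The memory arithmetic behind that requirement is easy to sketch. The following is an illustrative estimate only: the 16-bit precision and 192 GB per-GPU memory figure are assumptions for the sketch, not MediaTek’s actual deployment configuration.

```python
# Illustrative sizing: why a 480B-parameter model cannot fit on one GPU.
# Precision and per-GPU memory capacity are assumed values.
params = 480e9               # 480 billion parameters
bytes_per_param = 2          # 16-bit (FP16/BF16) weights
gpu_memory_gb = 192          # assumed per-GPU HBM capacity

weights_gb = params * bytes_per_param / 1e9   # weights alone, in GB
min_gpus = -(-weights_gb // gpu_memory_gb)    # ceiling division

print(f"{weights_gb:.0f} GB of weights, at least {min_gpus:.0f} GPUs")
# prints: 960 GB of weights, at least 5 GPUs
```

    In practice the KV cache, activations, and serving overhead push the real GPU count higher still, which is why tightly coupled multi-GPU systems matter for inference at this scale.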

    "The DGX SuperPOD is indispensable for our inference workloads. It allows us to deploy and run massive models that wouldn't fit on a single GPU or even a single server, ensuring we achieve the best performance and accuracy for our most demanding AI applications," 

    David Ku, Co-COO and CFO at MediaTek

    Accelerating R&D with enterprise-grade AI software

    Beyond hardware acceleration, NVIDIA’s software ecosystem has also played a key role in maximizing the value of our AI factory. For example, using TensorRT-LLM and NIM microservices on DGX, our teams have achieved:

    • 40% faster inference speeds
    • 60% higher token throughput
    • Smaller, distilled models optimized for on-device and edge deployment
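    As a quick illustration of what those two percentages mean in combination (the baseline numbers below are assumed for the sketch, not measured MediaTek data, and “40% faster” is interpreted as a 1.4× speedup):

```python
# What "40% faster inference" and "60% higher token throughput" imply,
# using an assumed baseline of 100 ms/request and 10,000 tokens/s.
baseline_latency_ms = 100.0
baseline_throughput = 10_000.0

new_latency_ms = baseline_latency_ms / 1.4    # 1.4x speedup in latency
new_throughput = baseline_throughput * 1.6    # 60% more tokens/s

print(f"{new_latency_ms:.1f} ms, {new_throughput:.0f} tokens/s")
# prints: 71.4 ms, 16000 tokens/s
```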

    These improvements translate directly into AI-assisted coding, automated design analysis, and documentation agents built on domain-adapted LLMs, which now help our engineers complete workflows in days instead of weeks.

    “With DGX SuperPOD, we’ve gone from training 7-billion-parameter models in a week to training models exceeding 480 billion parameters in the same timeframe—a dramatic leap that has accelerated our AI capabilities,” 

    David Ku, Co-COO and CFO at MediaTek

    NVIDIA Riva, integrated into NVIDIA DGX Spark™, which uses the NVIDIA GB10 Grace Blackwell superchip co-designed by MediaTek, has accelerated our work on small language models (SLMs) and speech. It enables high-quality ASR and TTS pipelines for voice-enabled features such as search, scheduling, and messaging, helping us bring more natural interactions to future MediaTek-powered devices.