Google Cloud debuts new AI chips, tools for building agents
The Straits Times

LAS VEGAS - Alphabet Inc’s Google Cloud division unveiled the latest generation of its tensor processing unit, or TPU, a home-grown chip designed to make AI computing services faster and more efficient.
The new line-up will come in two versions, the company said on April 22 at its Google Cloud Next event, where it also announced a US$750 million (S$957.36 million) fund to help boost corporate AI adoption and showed off tools for building AI agents.
The TPU 8t is tailored for creating artificial intelligence software, while the TPU 8i is designed to run AI services after they’ve been created – a stage known as inference.
Shares of Alphabet gained 1.7 per cent before markets opened in New York.
Google has emerged as one of the most successful makers of in-house AI chips in an industry dominated by Nvidia Corp.
TPUs have become a hot commodity in Silicon Valley in recent months, and the company is looking to build on that momentum with the latest versions.
The effort is part of a broader push to make it cheaper and less energy-intensive to roll out AI software.
The company is also working to make services more responsive.
The new TPUs store more information on the chip, helping provide the rapid responses that users crave.
But the demands of increasingly complex layers of software are only growing.
“It’s about how you deliver the lowest possible latency of the response at the lowest possible cost per transaction,” said Mr Mark Lohmeyer, Google’s vice-president of compute and AI infrastructure.
“The number of transactions is going way up, and the cost per transaction needs to go way down for it to scale.”
Creating AI services and software relies on systems that can sift through massive amounts of data very quickly to make connections and establish patterns that can be represented mathematically.
Inference, running the software and services, benefits from processors that have huge amounts of memory integrated into them.
This approach helps make AI responses more instantaneous because the component doesn’t have to fetch information stored elsewhere.
It’s particularly useful when computers “reason” through problems, taking multiple steps and learning from their own actions.
The training chip, 8t, can be combined into groups of 9,600 semiconductors. Google said that when deploying such massive systems, power is increasingly the major constraint in data centres.
Owners therefore need systems that are more efficient to get the best out of the limited availability of electricity.
TPU 8t delivers 124 per cent more performance per watt than the preceding generation, with TPU 8i providing a gain of 117 per cent.
That step-up is helped by improved in-house networking that lets the chips communicate with one another more efficiently.
AI systems built on the chips will be “generally available later this year,” Google said in a statement.
The company will continue to offer services based on Nvidia chips to customers who want to use the systems that currently dominate AI computing, it said.
Google intends to be among the first to deploy gear based on a new design from Nvidia coming in the second half of the year, Mr Lohmeyer said.
Like Google, Nvidia is focusing more on the inference stage of AI. Its forthcoming lineup will include technology from its acquisition of Groq – technology tailored specifically for providing ultrafast responsiveness.
Nvidia chief executive officer Jensen Huang has said that more than 20 per cent of AI workloads might be best served by that type of chip.
Groq was founded in 2016 by a group of former Google engineers.
Last December, Nvidia paid US$20 billion for a licence to use its technology and hired most of its engineering team.
Separately on April 22, Google’s cloud computing unit showcased a set of tools that can create AI agents and track their work within companies, including a dedicated inbox for the virtual bots to post information and progress reports.
Google also introduced updates across its Workspace productivity suite and offered up a vision in which AI agents dramatically overhaul the day-to-day routines of the average worker.
The newly launched US$750 million fund, meanwhile, is meant to help consultancies bring agentic AI to their clients.
Google’s AI lab DeepMind will give early access to Gemini models to select firms, which will use the AI tools and provide feedback ahead of launch.
Google engineers will also work alongside consulting firms to help solve client issues.
The capital will be deployed over the next 12 months and will be used to do things like help consulting firms train engineers and develop AI agents through Gemini’s enterprise platform. BLOOMBERG