{"id":8987,"date":"2025-10-18T04:06:21","date_gmt":"2025-10-18T04:06:21","guid":{"rendered":"https:\/\/jobuzo.com\/en\/alibaba-cloud-claims-to-slash-nvidia-gpu-use-by-82-with-new-pooling-system\/"},"modified":"2025-10-18T04:06:21","modified_gmt":"2025-10-18T04:06:21","slug":"alibaba-cloud-claims-to-slash-nvidia-gpu-use-by-82-with-new-pooling-system","status":"publish","type":"post","link":"https:\/\/jobuzo.com\/en\/alibaba-cloud-claims-to-slash-nvidia-gpu-use-by-82-with-new-pooling-system\/","title":{"rendered":"Alibaba Cloud claims to slash Nvidia GPU use by 82% with new pooling system"},"content":{"rendered":"<div>\n<div><img decoding=\"async\" src=\"https:\/\/cdn.i-scmp.com\/sites\/default\/files\/styles\/og_image_scmp_generic\/public\/d8\/images\/canvas\/2025\/10\/17\/6846f06e-9449-41c8-903d-30f855635b92_84e2a06b.jpg?itok=DcdpX-8t&amp;v=1760700559\" class=\"ff-og-image-inserted\"><\/div>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">Alibaba Group Holding has introduced a computing pooling solution that it said led to an 82 per cent cut in the number of Nvidia graphics processing units (GPUs) needed to serve its artificial intelligence models.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">The system, called Aegaeon, was beta tested in Alibaba Cloud&rsquo;s model marketplace for more than three months, where it reduced the number of Nvidia H20 GPUs required to serve dozens of models of up to 72 billion parameters from 1,192 to 213, according to a research paper presented this week at the 31st Symposium on Operating Systems Principles (SOSP) in Seoul, South Korea.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">&ldquo;Aegaeon 
is the first work to reveal the excessive costs associated with serving concurrent LLM workloads on the market,&rdquo; the researchers from Peking University and Alibaba Cloud wrote.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">Alibaba Cloud is the AI and cloud services unit of Hangzhou-based Alibaba, which owns the Post. Its chief technology officer, Zhou Jingren, is one of the paper&rsquo;s authors.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">Cloud services providers, such as Alibaba Cloud and ByteDance&rsquo;s Volcano Engine, serve thousands of AI models to users concurrently, meaning that many application programming interface calls are handled at the same time.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">However, a small handful of models such as Alibaba&rsquo;s Qwen and DeepSeek are most popular for inference, with most other models only sporadically called upon. 
This leads to resource inefficiency, with 17.7 per cent of GPUs allocated to serve only 1.35 per cent of requests in Alibaba Cloud&rsquo;s marketplace, the researchers found.<\/p>\n<p datatype=\"p\" data-qa=\"Component-Component\" class=\"e8zc9q40 css-1c6uqr6 ec74h0k1\">Researchers globally have sought to improve efficiency by pooling GPU power, allowing one GPU to serve multiple models, for instance.<\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Alibaba Group Holding has introduced a computing pooling solution that it said led to an 82 per cent cut in the number of Nvidia graphics processing units (GPUs) needed to serve its artificial intelligence models. The system, called Aegaeon, was beta tested in Alibaba Cloud&rsquo;s model marketplace for more than three months, where it&#8230;<\/p>\n<p class=\"more-link-wrap\"><a href=\"https:\/\/jobuzo.com\/en\/alibaba-cloud-claims-to-slash-nvidia-gpu-use-by-82-with-new-pooling-system\/\" class=\"more-link\">Read More<span class=\"screen-reader-text\"> &ldquo;Alibaba Cloud claims to slash Nvidia GPU use by 82% with new pooling system&rdquo;<\/span> 
&raquo;<\/a><\/p>\n","protected":false},"author":1,"featured_media":8988,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-8987","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news"],"_links":{"self":[{"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/posts\/8987","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/comments?post=8987"}],"version-history":[{"count":0,"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/posts\/8987\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/media\/8988"}],"wp:attachment":[{"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/media?parent=8987"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/categories?post=8987"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jobuzo.com\/en\/wp-json\/wp\/v2\/tags?post=8987"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}