Time: Friday, January 9, 2026, 8:30–9:30
Venue: Room 1104, Building A, Science and Education Building, Feicui Lake Campus
Speaker: Prof. Xia Hu (胡俠)
Affiliation: Shanghai AI Laboratory
Host: College of Artificial Intelligence Innovation
Abstract:
Large language models (LLMs) exhibit human-like conversational abilities, but scaling them to long contexts (e.g., information extraction from lengthy healthcare articles) faces two key challenges: they cannot exceed their pre-training context lengths, and deployment is difficult because inference memory requirements grow with context length. A critical insight is LLMs' strong robustness to noise from lossy computation (e.g., low-precision computing). This talk discusses advances in large-scale LLM deployment for long contexts. On the algorithmic side, we extend LLM context length by at least 8× by coarsening the positional information of distant tokens. On the systems side, we quantize the intermediate states of past tokens to 2-bit, achieving 8× memory efficiency and a 3.5× wall-clock speedup without sacrificing performance. Finally, we highlight the latest healthcare applications of LLMs, particularly the use of long-context retrieval techniques to mitigate hallucinations in healthcare chatbots.
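The idea of storing past-token states at 2-bit precision can be illustrated with a minimal group-wise quantizer sketch. This is an assumption-laden toy (the function names, group size, and asymmetric min/max scheme are illustrative choices, not the speaker's actual method): each group of values is rounded to one of four levels, so each entry needs 2 bits instead of 16 for fp16, roughly the 8× memory saving the abstract mentions before per-group overhead.

```python
import numpy as np

def quantize_2bit(x, group_size=64):
    """Toy group-wise asymmetric 2-bit quantization (illustrative only).

    Each group of `group_size` values is mapped to one of 4 levels
    spaced evenly between the group's min and max.
    """
    x = x.reshape(-1, group_size)
    mn = x.min(axis=1, keepdims=True)
    mx = x.max(axis=1, keepdims=True)
    scale = (mx - mn) / 3.0          # 2 bits -> 4 levels, indices 0..3
    scale[scale == 0] = 1.0          # guard against constant groups
    q = np.round((x - mn) / scale).astype(np.uint8)  # codes in {0,1,2,3}
    return q, mn, scale

def dequantize_2bit(q, mn, scale):
    """Reconstruct approximate values from codes and per-group stats."""
    return q * scale + mn

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 64)).astype(np.float32)
q, mn, scale = quantize_2bit(x)
x_hat = dequantize_2bit(q, mn, scale)
```

Rounding guarantees the reconstruction error of each value is at most half a quantization step (`scale / 2`); the robustness claim in the abstract is that LLM outputs tolerate exactly this kind of bounded noise on cached states.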
Speaker Bio:
Prof. Xia Hu is currently Assistant to the Director and a Leading Scientist at the Shanghai AI Laboratory. He was previously a full professor at Rice University in the United States and director of its data science center, and he co-founded the company AIPOW, serving as co-founder and chief scientist. His research has long focused on machine learning and artificial intelligence; he has published more than 200 papers at top international conferences and journals such as ICLR, NeurIPS, KDD, WWW, and SIGIR, with over 40,000 citations. The open-source automated machine learning system AutoKeras, whose development he led, has become one of the most widely used AutoML frameworks; the NCF algorithm and system he proposed were adopted into the official recommendations of the mainstream AI framework TensorFlow; and the anomaly detection systems he developed have been widely deployed in products at NVidia, General Electric, Trane, Apple, and other companies. Prof. Hu has received best paper awards or nominations at ICML, WWW, WSDM, and INFORMS, the US NSF CAREER Award, the KDD Rising Star Award, and the IEEE Atluri Scholar Award, among other honors. He serves as an associate editor of ACM TIST and the Big Data journal and as an editorial board member of DMKD, and he has served as general chair of WSDM 2020 and of a medical informatics conference.