Time: Friday, January 9, 2026, 8:30–9:30
Venue: Room 1104, Building A, Science and Education Building, Feicui Lake Campus
Speaker: Prof. Xia Hu (胡俠)
Affiliation: Shanghai AI Laboratory
Host: College of Artificial Intelligence Innovation
Abstract:
Large language models (LLMs) exhibit human-like conversational abilities, but scaling them to long contexts (e.g., information extraction from lengthy healthcare articles) faces two key challenges: they cannot exceed their pre-training context lengths, and deployment is difficult because inference memory requirements grow with context length. A critical insight is LLMs' strong robustness to noise from lossy computation (e.g., low-precision computing). This talk discusses advances in large-scale LLM deployment for long contexts. On the algorithmic side, we extend LLM context length by at least 8× by coarsening the positional information of distant tokens. On the systems side, we quantize the intermediate states of past tokens to 2-bit, achieving 8× memory efficiency and a 3.5× wall-clock speedup without sacrificing performance. Finally, we highlight the latest healthcare applications of LLMs, particularly the use of long-context retrieval techniques to mitigate hallucinations in healthcare chatbots.
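The idea of storing past-token states at 2-bit precision can be illustrated with a minimal group-wise quantizer sketch. This is an assumption-laden toy (the function names, group size, and asymmetric min/max scheme are illustrative choices, not the speaker's actual method): each group of values is rounded to one of four levels, so each entry needs 2 bits instead of 16 for fp16, roughly the 8× memory saving the abstract mentions before per-group overhead.

```python
import numpy as np

def quantize_2bit(x, group_size=64):
    """Toy group-wise asymmetric 2-bit quantization (illustrative only).

    Each group of `group_size` values is mapped to one of 4 levels
    spaced evenly between the group's min and max.
    """
    x = x.reshape(-1, group_size)
    mn = x.min(axis=1, keepdims=True)
    mx = x.max(axis=1, keepdims=True)
    scale = (mx - mn) / 3.0          # 2 bits -> 4 levels, indices 0..3
    scale[scale == 0] = 1.0          # guard against constant groups
    q = np.round((x - mn) / scale).astype(np.uint8)  # codes in {0,1,2,3}
    return q, mn, scale

def dequantize_2bit(q, mn, scale):
    """Reconstruct approximate values from codes and per-group stats."""
    return q * scale + mn

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 64)).astype(np.float32)
q, mn, scale = quantize_2bit(x)
x_hat = dequantize_2bit(q, mn, scale)
```

Rounding guarantees the reconstruction error of each value is at most half a quantization step (`scale / 2`); the robustness claim in the abstract is that LLM outputs tolerate exactly this kind of bounded noise on cached states.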
Speaker Bio:
Prof. Xia Hu is currently Assistant to the Director and a Leading Scientist at the Shanghai AI Laboratory. He was previously a full professor at Rice University in the United States and director of its data science center, and he co-founded the company AIPOW, serving as co-founder and chief scientist. His research has long focused on machine learning and artificial intelligence; he has published more than 200 papers at top international conferences and journals such as ICLR, NeurIPS, KDD, WWW, and SIGIR, with over 40,000 citations. The open-source automated machine learning system AutoKeras, whose development he led, has become one of the most widely used AutoML frameworks; the NCF algorithm and system he proposed were adopted into the official recommendations of the mainstream AI framework TensorFlow; and the anomaly detection systems he developed have been widely deployed in products at NVidia, General Electric, Trane, Apple, and other companies. Prof. Hu has received best paper awards or nominations at ICML, WWW, WSDM, and INFORMS, the US NSF CAREER Award, the KDD Rising Star Award, and the IEEE Atluri Scholar Award, among other honors. He serves as an associate editor of ACM TIST and the Big Data journal and as an editorial board member of DMKD, and he has served as general chair of WSDM 2020 and of a medical informatics conference.