Professor Yingying Fan, University of Southern California: An Empirical Bayes Solution for Selection Bias in Functional Data

([SWUFE News] Published: 2019-07-03)

Guanghua Lecture Series: Forum of Public Figures and Entrepreneurs, No. 5481

 

Topic: An Empirical Bayes Solution for Selection Bias in Functional Data

Speaker: Professor Yingying Fan, University of Southern California

Host: Associate Professor Guo Bin (郭斌)

Time: Friday, July 5, 2019, 10:00-11:00 a.m.

Venue: Conference Room 408, Hongyuan Building, Liulin Campus

Organizers: Center for Statistical Research, School of Statistics, Office of Research

 

About the speaker:

Yingying Fan is Dean's Associate Professor in Business Administration in the Data Sciences and Operations Department of the Marshall School of Business at the University of Southern California, Associate Professor in the Departments of Economics and Computer Science at USC, and an Associate Fellow of the USC Dornsife Institute for New Economic Thinking (INET). She received her Ph.D. in Operations Research and Financial Engineering from Princeton University in 2007. She was a Lecturer in the Department of Statistics at Harvard University from 2007 to 2008. Her research interests include statistics, data science, machine learning, economics, and big data and business applications.

Abstract:

Selection bias results from the sampling of extreme observations and is a well-recognized issue for standard scalar or multivariate data. Numerous approaches have been proposed to address the issue, dating back at least as far as the James-Stein shrinkage estimator. However, the same potential issue arises, albeit with additional complications, for functional data. Given a set of observed functions, one may wish to select for further analysis those which are most extreme according to some metric such as the average, variance, or maximum value of the function. However, given that functions are often noisy realizations of some underlying mean process, these outliers are likely to generate biased estimates of the quantity of interest. In this paper we propose an empirical Bayes approach, using a variant of Tweedie's formula, to adjust such functional data to generate approximately unbiased estimates of the true mean functions. Our approach has several advantages. It is non-parametric in nature, but is capable of automatically shrinking back towards a James-Stein type estimator in low signal situations. It is also computationally efficient and possesses desirable theoretical properties. Furthermore, we demonstrate through extensive simulations and real data analyses that our approach can produce significant improvements in prediction accuracy relative to possible competitors.
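
For readers unfamiliar with Tweedie's formula, the following is a minimal sketch of the classical scalar version only, not the functional variant presented in the talk. It assumes observations z_i = mu_i + noise with Gaussian noise of known standard deviation sigma, in which case E[mu | z] = z + sigma^2 * d/dz log f(z), where f is the marginal density of z. The function name `tweedie_adjust`, the Gaussian kernel density estimate of f, the rule-of-thumb bandwidth, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def tweedie_adjust(z, sigma=1.0, bandwidth=None):
    """Scalar Tweedie correction: z + sigma^2 * (f'(z) / f(z)).

    f is estimated with a Gaussian kernel density estimate (an assumption
    of this sketch; other density estimators could be used instead).
    """
    z = np.asarray(z, dtype=float)
    n = z.size
    if bandwidth is None:
        # Rule-of-thumb bandwidth for a Gaussian kernel.
        bandwidth = 1.06 * z.std() * n ** (-1 / 5)
    # diff[i, j] = (z_i - z_j) / h
    diff = (z[:, None] - z[None, :]) / bandwidth
    kern = np.exp(-0.5 * diff ** 2)
    # Density and its derivative at each z_i, up to a shared constant
    # that cancels in the ratio f' / f.
    f = kern.sum(axis=1)
    f_prime = (-diff / bandwidth * kern).sum(axis=1)
    return z + sigma ** 2 * f_prime / f


# Simulated example (hypothetical data): true means and noisy observations.
rng = np.random.default_rng(0)
n = 2000
mu = rng.normal(0.0, 1.0, size=n)
z = mu + rng.normal(0.0, 1.0, size=n)

# Select the 100 most extreme observations, as in the selection-bias setting.
top = np.argsort(z)[-100:]
adjusted = tweedie_adjust(z, sigma=1.0)

naive_bias = float(np.mean(z[top] - mu[top]))
adjusted_bias = float(np.mean(adjusted[top] - mu[top]))
print("average bias of raw selected values:     ", naive_bias)
print("average bias of Tweedie-adjusted values: ", adjusted_bias)
```

In this simulation the raw values of the selected observations overestimate their true means on average, and the adjustment shrinks them back toward the bulk of the data. The kernel estimate is noisy in the far tails, so the correction is only approximate for the most extreme points.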
