Scalable Probabilistic Tensor Factorization for Binary and Count Data

7 Pages · 2015 · 281 KB · English

Piyush Rai, Changwei Hu, Matthew Harding†, Lawrence Carin
Department of Electrical & Computer Engineering, Duke University
†Sanford School of Public Policy & Department of Economics, Duke University
Durham, NC 27708, USA
{piyushrai, ch237, matthewharding, lcarin} [email protected]

Abstract

Tensor factorization methods provide a useful way to extract latent factors from complex multirelational data, and also to predict missing data. Developing tensor factorization methods for massive tensors, especially when the data are binary or count-valued (which is true of most real-world tensors), however, remains a challenge. We develop a scalable probabilistic tensor factorization framework that enables us to perform efficient factorization of massive binary and count tensor data. The framework is based on (i) the Pólya-Gamma augmentation strategy, which makes the model fully locally conjugate and allows closed-form parameter updates when data are binary or count-valued; and (ii) an efficient online Expectation Maximization algorithm, which allows processing data in small minibatches and facilitates handling massive tensor data. Moreover, various types of constraints on the factor matrices (e.g., sparsity, non-negativity) can be incorporated under the proposed framework, providing good interpretability, which can be useful for qualitative analyses of the results. We apply the proposed framework to several binary and count-valued real-world data sets.

1 Introduction

Tensor factorization methods [Kolda and Bader, 2009] offer a useful way to learn latent factors from complex multiway data. These methods decompose the original tensor data into a set of factor matrices (one for each mode or "way" of the tensor), which can be used as a latent feature representation for the objects in each of the tensor modes, and can be used for other tasks, such as tensor completion. Among tensor factorization methods, probabilistic approaches [Chu and Ghahramani, 2009; Xu et al., 2013; Rai et al., 2014] are especially appealing because they provide a proper generative model of the data, which allows modeling different data types and handling missing data in a natural way.

Real-world tensor data are often binary or count-valued [Nickel et al., 2011; Chi and Kolda, 2012]. For example, a multirelational social network [Nickel et al., 2011] can be described as a three-way binary tensor with two modes denoting people and the third mode denoting the types of relationships. Likewise, from a database of research publications, one may construct a three-way (AUTHORS × WORDS × VENUES) count-valued tensor, where the three dimensions could be authors, words, and publication venues, and each entry of the tensor denotes the number of times an author used a specific word at a specific venue. Tensor factorization on this multiway data can be used for topic modeling on such a publications corpus (the latent factors would correspond to topics). Another application could be in recommender systems: having learned the latent factors of authors and venues, one can use these factors for author-author recommendation (for potential co-authors) or author-venue recommendation (recommending the most appropriate venues for a given author).

Although several tensor factorization methods have been proposed in recent years [Kolda and Bader, 2009; Chu and Ghahramani, 2009] and there has been significant recent interest in developing scalable tensor factorization methods [Kang et al., 2012; Inah et al., 2015; Papalexakis et al., 2012; Beutel et al., 2014], most of these methods treat data as real-valued, and are therefore inappropriate for handling binary and count data; also see Related Work (Section 7). Motivated by the prevalence of such binary and count-valued tensors, we present a probabilistic tensor factorization framework which can handle binary and count-valued tensors, while being scalable for massive tensor data. Our starting point will be a conjugate, fully Bayesian model for both binary and count data, for which we develop an efficient Gibbs sampler. The framework is based on the Pólya-Gamma data
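
As a rough illustration of the AUTHORS × WORDS × VENUES example in the introduction, the sketch below builds a small count tensor from (author, word, venue) records and forms a rank-R reconstruction from one factor matrix per mode. The toy records, the sizes, the rank, and the CP-style (sum of outer products) form are assumptions made for illustration only; this excerpt does not specify the exact decomposition or likelihood the authors use, and no inference is performed here.

```python
import numpy as np

# Hypothetical toy records: one (author, word, venue) index triple per word occurrence.
records = [(0, 1, 0), (0, 1, 0), (1, 2, 1), (2, 0, 1), (0, 1, 1)]
n_authors, n_words, n_venues = 3, 4, 2

# Count tensor: entry (a, w, v) is the number of times author a used word w at venue v.
counts = np.zeros((n_authors, n_words, n_venues), dtype=np.int64)
np.add.at(counts, tuple(np.array(records).T), 1)  # accumulates repeated triples

# One factor matrix per mode, each with `rank` columns (latent factors, e.g. topics).
# Values are random placeholders, not learned from the data.
rank = 2
rng = np.random.default_rng(0)
U_authors = rng.gamma(1.0, 1.0, size=(n_authors, rank))
U_words = rng.gamma(1.0, 1.0, size=(n_words, rank))
U_venues = rng.gamma(1.0, 1.0, size=(n_venues, rank))

# Assumed CP-style reconstruction: sum over r of the outer product of the r-th columns.
reconstruction = np.einsum("ar,wr,vr->awv", U_authors, U_words, U_venues)
```

Each row of a factor matrix then acts as the latent feature vector for one object in that mode, for example one author or one venue, which is what the recommendation use cases in the introduction rely on.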
