Project Information

Super β-NTF 2000 (by Antoine Liutkus)

In a nutshell, the beta_ntf module:
- supports weighted PARAFAC factorization of nonnegative numpy ndarrays of arbitrary shape, minimizing any β-divergence
- fits in one small single .py file
- requires numpy 1.6
Overview

The beta_ntf Python module provides a very simple to use, light and yet powerful implementation of the nonnegative PARAFAC model, also known as Nonnegative Tensor Factorization, Nonnegative Matrix Factorization (NMF) or Nonnegative Canonical Decomposition. As an example, let X be an ndarray of shape (1000, 400, 5) composed of nonnegative entries:

>>> X.shape
(1000, 400, 5)

The objective of beta_ntf in that case is to approximate X using 3 matrices of respective shapes (1000, K), (400, K) and (5, K), where K is called the number of components. Let us call these matrices A, B and C. We want to approximate X as:

    X[i, j, l] ≈ Σ_q A[i, q] B[j, q] C[l, q],   with q = 0, ..., K-1

Such problems occur in many fields such as source separation, data mining, image processing, etc. To perform this with, say, K=10 while minimizing the Kullback-Leibler divergence, simply use:

>>> import beta_ntf
>>> ntf = beta_ntf.BetaNTF(X.shape, n_components=10, beta=1, n_iter=100, verbose=True)
>>> ntf.fit(X)

Now, the ntf object contains an approximation of X using 10 components. You can:
- Directly access the factors as ntf.factors_, which is a list composed of 3 matrices here.
- Get the approximation using beta_ntf.parafac(ntf.factors_)
- Get any of the constitutive components, e.g. using ntf[..., 3] for the 4th component
- Compute the total β-divergence of your approximation using ntf.score(X)
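The approximation above can be sketched with plain numpy. This is a numpy-only illustration of the PARAFAC reconstruction described in the text (the factor names A, B and C mirror the text; this is not beta_ntf's actual code):

```python
import numpy as np

K = 10
A = np.abs(np.random.randn(1000, K))
B = np.abs(np.random.randn(400, K))
C = np.abs(np.random.randn(5, K))

# X[i, j, l] is approximated by the sum over q of A[i, q] * B[j, q] * C[l, q];
# einsum performs this sum over the shared component axis q.
model = np.einsum('iq,jq,lq->ijl', A, B, C)
print(model.shape)  # (1000, 400, 5)
```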
Detailed Features

1) Simply compute the PARAFAC model

Given a list factors of matrices, i.e. ndarrays with the same number of columns, the beta_ntf module makes it possible to efficiently compute the corresponding PARAFAC model. For example, assume:

>>> factors = [A, B, C, D]

where

>>> A.shape
(100, 5)
>>> B.shape
(50, 5)
>>> C.shape
(10, 5)
>>> D.shape
(5, 5)

To compute the (100, 50, 10, 5) tensor P given by

    P[i, j, k, l] = Σ_q A[i, q] B[j, q] C[k, q] D[l, q]

simply use:

>>> P = beta_ntf.parafac(factors)

2) Easily generate random nonnegative data

As a small extra, the beta_ntf module provides the nnrandn function, which generates nonnegative random ndarrays of arbitrary shape. Nothing complex really, but useful:

>>> E = beta_ntf.nnrandn((10, 40, 10))
>>>E.shape
(10, 40, 10)

3) Arbitrary shapes are supported

Interestingly, the beta_ntf module supports fitting data of arbitrary shape (up to 25 dimensions). In the example above, X could well have been a (100, 40, 5, 10, 8) ndarray, with exactly the same syntax for using beta_ntf. This feature is noticeable, since it is not so common (as of this writing in December 2012) to find light implementations of nonnegative PARAFAC models for higher-order tensors, fitting in only one source file of less than 250 lines.

>>> import beta_ntf

Generating synthetic data:

>>> data_shape = (15, 10, 5, 40)
>>> true_ncomponents = 50
>>> true_factors = [beta_ntf.nnrandn((shape, true_ncomponents)) for shape in data_shape]
>>> X = beta_ntf.parafac(true_factors)
>>> X.shape
(15, 10, 5, 40)

Creating the BetaNTF object:

>>> ntf = beta_ntf.BetaNTF(X.shape, n_components=10, beta=1, n_iter=100, verbose=False)

Fitting the data:

>>> ntf.fit(X)
Fitting NTF model with 100 iterations....
Done.
<beta_ntf.BetaNTF instance at 0x7f664c01e0e0>

Now, construct the approximation:

>>> model = beta_ntf.parafac(ntf.factors_)
>>>model.shape
(15, 10, 5, 40)

Print the shape of the factors:

>>> for f in ntf.factors_: print f.shape
(15, 10)
(10, 10)
(5, 10)
(40, 10)

Extract the 3 first components:

>>> firstComponents = ntf[..., 0:3]
>>> firstComponents.shape
(15, 10, 5, 40, 3)

4) All β-divergences are supported

While performing the approximation, beta_ntf minimizes a cost function between the original data X and its approximation P. The cost function D to be minimized for this optimization is any β-divergence, summed over all entries of X. We thus have:

    D(X | P) = Σ_i d_β(X[i] | P[i])

where the sum runs over all entries i of X, and where the β-divergence d_β to use for each entry is defined as:

    d_β(x | y) = x/y - log(x/y) - 1                                  if β = 0
    d_β(x | y) = x log(x/y) - x + y                                  if β = 1
    d_β(x | y) = (x^β + (β-1) y^β - β x y^(β-1)) / (β(β-1))          otherwise

Particular cases of interest are:
- β = 2 : (squared) Euclidean distance
- β = 1 : Kullback-Leibler divergence
- β = 0 : Itakura-Saito divergence
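The entrywise definition above translates directly into numpy. Here is a hypothetical helper (beta_ntf's internal implementation may differ) computing the total β-divergence between two nonnegative arrays:

```python
import numpy as np

# A numpy-only sketch of the total beta-divergence defined above
# (hypothetical helper; not taken from beta_ntf's source).
def beta_divergence(x, y, beta):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if beta == 0:    # Itakura-Saito
        d = x / y - np.log(x / y) - 1
    elif beta == 1:  # Kullback-Leibler
        d = x * np.log(x / y) - x + y
    else:            # general case; beta = 2 gives half the squared Euclidean distance
        d = (x ** beta + (beta - 1) * y ** beta
             - beta * x * y ** (beta - 1)) / (beta * (beta - 1))
    return d.sum()

print(beta_divergence([1.0, 2.0], [2.0, 1.0], 2))  # 0.5*((1-2)**2 + (2-1)**2) = 1.0
```

Note that for any β, d_β(x | x) = 0, which is why the score can be read as a distance-like quantity.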
Simply change the beta property of a BetaNTF object to change the divergence it will minimize during fit.

5) Easily compute the score of your decomposition

It may be required to compute the score corresponding to a given decomposition, e.g. for model selection. If you want to know how far your NTF model is from data X, simply use:

>>> ntf.score(X)
304.01426732381253

The quantity returned gives the overall selected β-divergence between the model and the given ndarray, which ought to be of the same shape as ntf.data_shape. Since this score is a divergence, the smaller it is, the better the approximation.

6) Weighting the cost function

Some datapoints may be more important than others. Similarly, some datapoints may not be available, leading to missing data. This is common in data inpainting, or in source separation. In that case, it is desirable to weight the cost function accordingly. More precisely, let X be the nonnegative ndarray to approximate and let P be the corresponding NTF model. Until now, I have only described estimation of P that minimizes the simple sum D of individual entrywise β-divergences:

    D(X | P) = Σ_i d_β(X[i] | P[i])

where d_β has been defined above. Instead of giving the same weight to all these individual divergences, we may weight the entries so as to consider instead:

    D_W(X | P) = Σ_i W[i] d_β(X[i] | P[i])

where W gives the weights to assign to individual divergences. If a datapoint X[i] is less important than others, its corresponding weight W[i] may be smaller, or even 0 if X[i] is actually not known. This feature is implemented in beta_ntf. If required, you can simply provide these weights as the W parameter to the fit function. As a first example, suppose we have two nonnegative ndarrays X and W such that:

>>> X.shape
(400, 50, 30)
>>> W.shape
(400, 50, 30)

You can use W to weight the divergence simply through:

>>> ntf.fit(X, W)

where the BetaNTF object ntf has been defined as above. Interestingly, W and X do not have to be of the same shape. It is only required that W*X be defined, and thus that they can be broadcast and multiplied. We could thus have instead:

>>> W.shape
(400, 1, 30)

and use the fit function the same way. This can be practical, e.g., for giving different weights to whole modalities of the experiment.

About the algorithm

The algorithm implemented by beta_ntf features standard multiplicative updates minimizing the β-divergence, which were recently shown to guarantee a decrease of the cost function at each step. For recent references on the topic, check for example:
- A. Cichocki, R. Zdunek, A. H. Phan, and S. Amari, Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation, Wiley Publishing, September 2009.
- C. Févotte and J. Idier, "Algorithms for nonnegative matrix factorization with the beta-divergence", Neural Computation 23 (2011), no. 9, 2421-2456.

Algorithms for the weighted NTF are derived exactly the same way. As far as I know, they were first derived by Virtanen in his 2007 paper:
- T. Virtanen, "Monaural Sound Source Separation by Perceptually Weighted Non-Negative Matrix Factorization", Technical report, Tampere University of Technology, Institute of Signal Processing, 2007.

In any case, the update equations I derived in the general case of arbitrary data dimension are really nothing original with respect to those works and the references therein. The main challenge was actually in terms of implementation. Still, I will write a small technical report about this and upload it here when available. beta_ntf makes use of the powerful voodoo magic of numpy.einsum to perform its computations, which explains both its computational efficiency and its conciseness. Hence, it requires numpy 1.6, in which einsum first appeared. Extensions of this small module for very large datasets would be nice. For this small project, I have focused on a light implementation that is easy to start with. It should run quite fast provided the data used fits in memory.
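To give a flavor of the multiplicative updates mentioned above, here is a minimal 2-D (NMF) sketch of the weighted β-divergence updates, assuming X ≈ A @ B.T with nonnegative factors. This is an illustration under those assumptions, not beta_ntf's actual N-way einsum implementation, and the function name nmf_beta is hypothetical:

```python
import numpy as np

def nmf_beta(X, K=5, beta=1, n_iter=100, W=None):
    rng = np.random.RandomState(0)
    m, n = X.shape
    W = np.ones_like(X) if W is None else W  # entrywise weights (all ones = unweighted)
    A = np.abs(rng.randn(m, K))
    B = np.abs(rng.randn(n, K))
    eps = 1e-12                              # avoids divisions by zero
    for _ in range(n_iter):
        # standard multiplicative update: ratio of weighted gradient parts
        P = A @ B.T + eps
        A *= ((W * X * P ** (beta - 2)) @ B) / (((W * P ** (beta - 1)) @ B) + eps)
        P = A @ B.T + eps
        B *= ((W * X * P ** (beta - 2)).T @ A) / (((W * P ** (beta - 1)).T @ A) + eps)
    return A, B

rng = np.random.RandomState(1)
X = np.abs(rng.randn(40, 30))
A, B = nmf_beta(X, K=5, beta=1, n_iter=50)  # beta=1: Kullback-Leibler updates
```

Because each update multiplies the factors by a nonnegative ratio, nonnegativity is preserved automatically, which is the main appeal of this family of algorithms.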
