Matlab: rows and columns for random permutation of a 2-D array

I have a large matrix (approx. 80,000 X 60,000), and I basically want to scramble all the entries (that is, randomly permute both rows and columns independently).

I believe it'll work if I loop over the columns, and use randperm to randomly permute each column. (Or, I could equally well do rows.) Since this involves a loop with 60K iterations, I'm wondering if anyone can suggest a more efficient option?

I've also been working with numpy/scipy, so if you know of a good option in python, that would be great as well.

Thanks! Susan

Thanks for all the thoughtful answers! Some more info: the rows of the matrix represent documents, and the data in each row is a vector of tf-idf weights for that document. Each column corresponds to one term in the vocabulary. I'm using pdist to calculate cosine similarities between all pairs of papers. And I want to generate a random set of papers to compare to.

I think that just permuting the columns will work, then, because each paper gets assigned a random set of term frequencies. (Permuting the rows just means reordering the papers.) As Jonathan pointed out, this has the advantage of not making a new copy of the whole matrix, and it sounds like the other options all will.
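
For the numpy side of the question, here is a minimal sketch of shuffling the entries within each column independently, without an explicit Python loop. It assumes a dense array and NumPy 1.20 or later (for Generator.permuted); the function name shuffle_each_column and the small demo array are made up for illustration.

```python
import numpy as np

def shuffle_each_column(x, seed=None):
    """Shuffle the entries within every column independently, in place.

    Hypothetical helper: assumes x is a dense 2-D ndarray.
    """
    rng = np.random.default_rng(seed)
    # axis=0 permutes along the rows, so each column gets its own shuffle.
    # out=x writes the result back into x instead of returning a new array.
    rng.permuted(x, axis=0, out=x)
    return x

# Small stand-in for the documents-by-terms matrix.
x = np.arange(12, dtype=float).reshape(4, 3)
original = x.copy()
shuffle_each_column(x, seed=0)
```

Each column keeps exactly the values it had before, only in a different row order. Passing axis=1 instead would shuffle within each row.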

You should be able to reshape the matrix to a 1 × 4800000000 "array", randperm it, and finally reshape it back to a 80000 × 60000 matrix. This will require copying the 4.8 billion entries 3 times at worst. This might not be efficient.

EDIT: Actually Matlab automatically uses linear indexing, so the first reshape is not needed. Just

reshape(x(randperm(4800000000)), 80000, 60000)

is enough (which avoids one potentially unnecessary copy).
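
In numpy the same flatten-and-shuffle idea could be sketched roughly as follows. This assumes a dense, C-contiguous array, so that reshape(-1) gives a view of the same memory and the shuffle happens in place; scramble_all is a hypothetical name, not a library function.

```python
import numpy as np

def scramble_all(x, seed=None):
    """Randomly permute every entry of a dense array, in place.

    Hypothetical helper: requires a C-contiguous array so the flat
    view below shares memory with x.
    """
    if not x.flags.c_contiguous:
        raise ValueError("expected a C-contiguous array")
    rng = np.random.default_rng(seed)
    flat = x.reshape(-1)   # 1-D view of the same data, no copy
    rng.shuffle(flat)      # shuffles the view, and therefore x itself
    return x

# Small stand-in for the real matrix.
x = np.arange(12, dtype=float).reshape(4, 3)
original = x.copy()
scramble_all(x, seed=0)
```

The shape is unchanged and the set of values is the same; only their positions differ.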

Note that this assumes you have a dense matrix. If you have a sparse matrix, you could extract the nonzero values and then randomly reassign indices to them. If there are N nonzero entries, then at most 8N copy operations are needed (three numbers are required to describe each entry).
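
A rough numpy sketch of the sparse variant, assuming the matrix is held as (row, column, value) triplets: draw N distinct linear positions at random and convert them back to row and column indices. The helper name scramble_sparse and the demo values are illustrative only, and drawing without replacement from a very large index range may itself be costly.

```python
import numpy as np

def scramble_sparse(values, shape, seed=None):
    """Assign each nonzero value a new, distinct random position.

    Hypothetical helper: values is a 1-D array of the nonzero entries,
    shape is (n_rows, n_cols). Returns (rows, cols, values).
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = shape
    # Distinct linear indices into the n_rows * n_cols grid.
    flat = rng.choice(n_rows * n_cols, size=len(values), replace=False)
    # Convert linear indices back to (row, column) pairs.
    rows, cols = np.unravel_index(flat, shape)
    return rows, cols, values

# Three nonzero entries scattered over a 4 x 5 grid.
values = np.array([0.5, 1.2, 0.3])
rows, cols, vals = scramble_sparse(values, (4, 5), seed=0)
```

The resulting triplets can then be handed to whatever sparse constructor you are using.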
