I have a large matrix (approx. 80,000 X 60,000), and I basically want to scramble all the entries (that is, randomly permute both rows and columns independently).

I believe it'll work if I loop over the columns, and use randperm to randomly permute each column. (Or, I could equally well do rows.) Since this involves a loop with 60K iterations, I'm wondering if anyone can suggest a more efficient option?

I've also been working with numpy/scipy, so if you know of a good option in python, that would be great as well.

Thanks! Susan

Thanks for all the thoughtful answers! Some more info: the rows of the matrix represent documents, and the data in each row is a vector of tf-idf weights for that document. Each column corresponds to one term in the vocabulary. I'm using pdist to calculate cosine similarities between all pairs of papers. And I want to generate a random set of papers to compare to.

I think that just permuting the columns will work, then, because each paper gets assigned a random set of term frequencies. (Permuting the rows just means reordering the papers.) As Jonathan pointed out, this has the advantage of not making a new copy of the whole matrix, and it sounds like the other options all will.
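For the numpy side of the question, here is a minimal sketch of the "shuffle within each column" idea without an explicit loop. It assumes a dense array and NumPy 1.20 or later (for `Generator.permuted`); the function name `shuffle_within_columns` and the toy matrix are made up for illustration.

```python
import numpy as np

def shuffle_within_columns(x, seed=None):
    """Shuffle the entries of each column of x independently, in place."""
    rng = np.random.default_rng(seed)
    # axis=0: every column is permuted on its own, independently of the others.
    # out=x: the result is written back into x instead of a new array.
    rng.permuted(x, axis=0, out=x)
    return x

# Toy stand-in for the tf-idf matrix: 4 documents x 3 terms.
docs = np.arange(12, dtype=float).reshape(4, 3)
shuffle_within_columns(docs, seed=0)
```

Each column keeps exactly the same set of values, so each term keeps its overall weight distribution while documents get a random mix of weights. Passing `out=x` is what should avoid allocating a second full-size matrix.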

`reshape` the matrix to a 1 × 4800000000 "array", `randperm` it, and finally `reshape` it back to an 80000 × 60000 matrix.

EDIT: Actually, MATLAB supports linear indexing directly, so the first `reshape` is not needed. Just

    reshape(x(randperm(4800000000)), 80000, 60000)

is enough (thus avoiding one potentially unnecessary copy).
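Since the question also asks about numpy, the same flatten, permute, reshape idea might look roughly like this. It is only a sketch for a dense array; `scramble_all` is a hypothetical helper name, and it returns a new array, so it needs memory for a second copy of the matrix.

```python
import numpy as np

def scramble_all(x, seed=None):
    """Return a copy of x with all entries randomly rearranged."""
    rng = np.random.default_rng(seed)
    # ravel() flattens x (a view when x is contiguous);
    # permutation() returns a shuffled copy of that flat array,
    # which is then reshaped back to the original dimensions.
    return rng.permutation(x.ravel()).reshape(x.shape)

m = np.arange(6).reshape(2, 3)
scrambled = scramble_all(m, seed=0)
```

The original `m` is left untouched here, which is convenient for comparing against the scrambled version but doubles the memory footprint for a matrix of the size in the question.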

Note that this assumes you have a dense matrix. If you have a sparse matrix, you could extract the nonzero values and then randomly assign new indices to them. If there are N nonzero entries, then at most 3N numbers need to be copied (three numbers, a row index, a column index, and a value, are required to describe one entry).
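A rough numpy sketch of the sparse idea, assuming the matrix is held as coordinate triples (row, column, value). The helper `scramble_sparse` and the sample values are hypothetical, not part of any library.

```python
import numpy as np

def scramble_sparse(vals, shape, seed=None):
    """Assign each stored value a random, distinct (row, col) position."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = shape
    # Draw one distinct linear index per stored value, without replacement.
    flat = rng.choice(n_rows * n_cols, size=len(vals), replace=False)
    # Convert the linear indices back to (row, col) pairs.
    rows, cols = np.unravel_index(flat, shape)
    return rows, cols, vals

vals = np.array([0.5, 1.2, 0.3])  # the nonzero entries
rows, cols, out_vals = scramble_sparse(vals, (4, 5), seed=0)
```

Only the nonzero entries are touched, so the work scales with N rather than with the full matrix size. The resulting triples could then be handed to a coordinate-format constructor such as `scipy.sparse.coo_matrix((out_vals, (rows, cols)), shape=shape)`.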
