## Numpy Submatrix Assignment Of Benefits

Integer array indexing allows selection of arbitrary items in the array based on their *N*-dimensional index. Each integer array represents a number of indexes into that dimension.

#### Purely integer array indexing

When the index consists of as many integer arrays as the array being indexed has dimensions, the indexing is straightforward, but different from slicing.

Advanced indexes are always broadcast and iterated as *one*.

Note that the result shape is identical to the (broadcast) shape of the indexing arrays.
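The broadcasting rule can be checked directly. The following is a minimal sketch; the array `x` and the index arrays `ind_1` and `ind_2` are illustrative names chosen here, not taken from the original text.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)      # 4x3 array holding 0..11
ind_1 = np.array([[0], [3]])         # row indexes, shape (2, 1)
ind_2 = np.array([0, 2, 1])          # column indexes, shape (3,)

# Both index arrays are broadcast to the common shape (2, 3).
result = x[ind_1, ind_2]
broadcast_shape = np.broadcast(ind_1, ind_2).shape

# Each result element pairs the broadcast index arrays position by position.
b_1, b_2 = np.broadcast_arrays(ind_1, ind_2)
elementwise = np.array([[x[b_1[i, j], b_2[i, j]] for j in range(3)]
                        for i in range(2)])

print(result.shape == broadcast_shape)       # True
print(np.array_equal(result, elementwise))   # True
```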

Example

From each row, a specific element should be selected. The row index is just `[0, 1, 2]` and the column index specifies the element to choose for the corresponding row, here `[0, 1, 0]`. Using both together the task can be solved using advanced indexing:
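A short sketch of this selection, assuming a small 3x2 array `x` chosen for illustration:

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4],
              [5, 6]])

rows = [0, 1, 2]   # one entry per row
cols = [0, 1, 0]   # column to pick in the corresponding row

picked = x[rows, cols]   # selects x[0, 0], x[1, 1], x[2, 0]
print(picked)            # [1 4 5]
```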

To achieve a behaviour similar to the basic slicing above, broadcasting can be used. The function `numpy.ix_` can help with this broadcasting. This is best understood with an example.

Example

From a 4x3 array the corner elements should be selected using advanced indexing. Thus all elements for which the column is one of `[0, 2]` and the row is one of `[0, 3]` need to be selected. To use advanced indexing one needs to select all elements *explicitly*. Using the method explained previously one could write:
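A sketch of the explicit form, assuming a 4x3 array `x` filled with 0..11; the names `rows` and `columns` are illustrative:

```python
import numpy as np

x = np.array([[0,  1,  2],
              [3,  4,  5],
              [6,  7,  8],
              [9, 10, 11]])

# Spell out the row and column of every element of the 2x2 result.
rows = np.array([[0, 0],
                 [3, 3]], dtype=np.intp)
columns = np.array([[0, 2],
                    [0, 2]], dtype=np.intp)

corners = x[rows, columns]
print(corners)
# [[ 0  2]
#  [ 9 11]]
```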

However, since the indexing arrays above just repeat themselves, broadcasting can be used (compare operations such as `rows[:, np.newaxis] + columns`) to simplify this:
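A sketch of the broadcast form, again assuming a 4x3 array `x` filled with 0..11:

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

rows = np.array([0, 3], dtype=np.intp)
columns = np.array([0, 2], dtype=np.intp)

# rows[:, np.newaxis] has shape (2, 1); it broadcasts against columns (2,)
# to the 2x2 grid of (row, column) pairs.
corners = x[rows[:, np.newaxis], columns]
print(corners)
# [[ 0  2]
#  [ 9 11]]
```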

This broadcasting can also be achieved using the function `numpy.ix_`:
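A sketch using `np.ix_`, with the plain paired form shown alongside for contrast; the array `x` is an illustrative 4x3 array filled with 0..11:

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

rows = np.array([0, 3], dtype=np.intp)
columns = np.array([0, 2], dtype=np.intp)

# np.ix_ reshapes the 1-D index arrays so that they broadcast to a grid.
grid = x[np.ix_(rows, columns)]
print(grid)
# [[ 0  2]
#  [ 9 11]]

# Without np.ix_, the index arrays are paired element by element.
paired = x[rows, columns]   # x[0, 0] and x[3, 2]
print(paired)               # [ 0 11]
```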

Note that without the `np.ix_` call, only the diagonal elements would be selected, as was used in the previous example. This difference is the most important thing to remember about indexing with multiple advanced indexes.

#### Combining advanced and basic indexing

When there is at least one slice (`:`), ellipsis (`...`) or `newaxis` in the index (or the array has more dimensions than there are advanced indexes), then the behaviour can be more complicated. It is like concatenating the indexing result for each advanced index element.

In the simplest case, there is only a *single* advanced index. A single advanced index can, for example, replace a slice and the result array will be the same; however, it is a copy and may have a different memory layout. A slice is preferable when it is possible.

Example
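A sketch of a single advanced index standing in for a slice, assuming an illustrative 4x3 array `x`. `np.shares_memory` is used here only to show which result is a view and which is a copy:

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

sliced = x[1:2, 1:3]       # basic slicing: a view into x
fancy = x[1:2, [1, 2]]     # advanced index in place of the slice: a copy

print(sliced)                        # [[4 5]]
print(fancy)                         # [[4 5]]
print(np.shares_memory(sliced, x))   # True
print(np.shares_memory(fancy, x))    # False
```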

The easiest way to understand the situation may be to think in terms of the result shape. There are two parts to the indexing operation, the subspace defined by the basic indexing (excluding integers) and the subspace from the advanced indexing part. Two cases of index combination need to be distinguished:

- The advanced indexes are separated by a slice, ellipsis or newaxis. For example `x[arr1, :, arr2]`.
- The advanced indexes are all next to each other. For example `x[..., arr1, arr2, :]` but *not* `x[arr1, :, 1]` since `1` is an advanced index in this regard.

In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that. In the second case, the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array (the latter logic is what makes simple advanced indexing behave just like slicing).

Example

Suppose `x.shape` is (10,20,30) and `ind` is a (2,3,4)-shaped indexing array, then `result = x[..., ind, :]` has shape (10,2,3,4,30) because the (20,)-shaped subspace has been replaced with a (2,3,4)-shaped broadcasted indexing subspace. If we let *i, j, k* loop over the (2,3,4)-shaped subspace then `result[..., i, j, k, :] = x[..., ind[i, j, k], :]`. This example produces the same result as `x.take(ind, axis=-2)`.
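The shape and the element-wise relation can be verified with a small sketch. The contents of `x` and `ind` below are arbitrary values chosen for illustration:

```python
import numpy as np

x = np.arange(10 * 20 * 30).reshape(10, 20, 30)
ind = np.arange(24).reshape(2, 3, 4) % 20    # valid indexes into the axis of length 20

result = x[..., ind, :]
print(result.shape)    # (10, 2, 3, 4, 30)

# result[..., i, j, k, :] equals x[..., ind[i, j, k], :] for every i, j, k.
relation_holds = all(
    np.array_equal(result[..., i, j, k, :], x[..., ind[i, j, k], :])
    for i, j, k in np.ndindex(*ind.shape)
)
print(relation_holds)  # True

# take along the second-to-last axis gives the same array.
same_as_take = np.array_equal(result, x.take(ind, axis=-2))
print(same_as_take)    # True
```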

Example

Let `x.shape` be (10,20,30,40,50) and suppose `ind_1` and `ind_2` can be broadcast to the shape (2,3,4). Then `x[:, ind_1, ind_2]` has shape (10,2,3,4,40,50) because the (20,30)-shaped subspace from `x` has been replaced with the (2,3,4) subspace from the indices. However, `x[:, ind_1, :, ind_2]` has shape (2,3,4,10,30,50) because there is no unambiguous place to drop in the indexing subspace, thus it is tacked on to the beginning. It is always possible to use `.transpose()` to move the subspace anywhere desired. Note that this example cannot be replicated using `take`.
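A sketch of both cases. The index arrays are illustrative and were chosen so that they broadcast to (2,3,4); `int8` is used only to keep the example array small:

```python
import numpy as np

x = np.zeros((10, 20, 30, 40, 50), dtype=np.int8)

ind_1 = np.zeros((2, 3, 1), dtype=np.intp)   # shape (2, 3, 1)
ind_2 = np.arange(4)                         # shape (4,), broadcasts to (2, 3, 4)

adjacent = x[:, ind_1, ind_2]        # advanced indexes next to each other
separated = x[:, ind_1, :, ind_2]    # advanced indexes split by a slice

print(adjacent.shape)    # (10, 2, 3, 4, 40, 50)
print(separated.shape)   # (2, 3, 4, 10, 30, 50)

# transpose can move the indexing subspace back behind the first axis.
moved = separated.transpose(3, 0, 1, 2, 4, 5)
print(moved.shape)       # (10, 2, 3, 4, 30, 50)
```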

### 1. Introduction

This question is difficult because:

It's not clear what the function does. It's always a good idea to write a docstring for a function, specifying what it does, what arguments it takes, and what it returns. (And test cases are always appreciated.)

It's not clear what the role of the arguments and is. The code in the post only ever passes for and for . So is that a requirement? Or is it sometimes possible to pass in other values?

I am going to assume in what follows that:

the specification of the function is ;

is always and is always ;

the Cython details are not essential to the problem, and that it's OK to work in plain Python.

Here's my rewrite of the function. Note the docstring, the doctest, and the simple implementation, which loops over the sequence elements rather than their indices:
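The original function is not preserved in this text, so the following is only a sketch of what such a rewrite could look like. The name `count_greater_loop` and the specification (count the pairs with one element from each sequence where the second is greater than the first) are assumptions made for illustration.

```python
def count_greater_loop(v, w):
    """Return the number of pairs (x, y) with x in v, y in w and y > x.

    Hypothetical specification, assumed for illustration.

    >>> count_greater_loop([1, 2, 3], [0, 2, 4])
    4
    """
    # Loop over the elements themselves rather than over their indices.
    return sum(y > x for x in v for y in w)


print(count_greater_loop([1, 2, 3], [0, 2, 4]))   # 4
```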

And here's a 1000-element test case, which I'll use in the rest of this answer to compare the performance of various implementations of this function:

### 2. Vectorize

The whole reason for using NumPy is that it enables you to vectorize operations on arrays of fixed-size numeric data types. If you can successfully vectorize an operation, then it executes mostly in C, avoiding the substantial overhead of the Python interpreter.

Whenever you find yourself iterating over the elements of an array, then you're not getting any benefit from NumPy, and this is a sign that it's time to rethink your approach.

So let's vectorize the function. This is easy using a sparse `numpy.meshgrid`:
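A sketch of the vectorized approach. The function name `count_greater_meshgrid` and the pair-counting specification (pairs with `y > x`, `x` from `v`, `y` from `w`) are assumptions, since the original code is not preserved here.

```python
import numpy as np


def count_greater_meshgrid(v, w):
    """Count pairs (x, y) with x in v, y in w and y > x (assumed spec)."""
    # sparse=True returns arrays of shape (1, len(v)) and (len(w), 1);
    # the comparison broadcasts them to the full (len(w), len(v)) grid.
    x, y = np.meshgrid(v, w, sparse=True)
    return int(np.sum(y > x))


print(count_greater_meshgrid([1, 2, 3], [0, 2, 4]))   # 4
```

The sparse form avoids building two full grids before the comparison, although the Boolean result still has one entry per pair.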

Let's see how fast that is on the 1000-element test case:

That's about 1500 times faster than the loop-based version.

### 3. Improve the algorithm

The vectorized version still takes $O(n^2)$ time on arrays of length $O(n)$, because it has to compare every pair of elements. Is it possible to do better than that?

Suppose that I start by sorting the first array. Then consider an element from the second array, and find the point where it would fit into the sorted first array, that is, the index at which it could be inserted while keeping the array sorted. Then the element is greater than that many elements from the first array. This position can be found in time $O(\log n)$ using binary search (Python's `bisect` module), and so the algorithm as a whole has a runtime of $O(n \log n)$.

Here's a straightforward implementation:
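A sketch of the sort-and-bisect idea in plain Python. The function name `count_greater_bisect` and the pair-counting specification are assumptions; `bisect_left` is chosen here so that equal elements are not counted as greater.

```python
from bisect import bisect_left


def count_greater_bisect(v, w):
    """Count pairs (x, y) with x in v, y in w and y > x (assumed spec)."""
    sorted_v = sorted(v)   # O(n log n)
    # bisect_left returns how many elements of sorted_v are strictly less than y.
    return sum(bisect_left(sorted_v, y) for y in w)


print(count_greater_bisect([1, 2, 3], [0, 2, 4]))   # 4
```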

This implementation is about three times faster than the vectorized version on the 1000-element test case.

This shows the importance of finding the best algorithm, not just speeding up the algorithm you've got. Here an $O(n \log n)$ algorithm in plain Python beats a vectorized $O(n^2)$ algorithm in NumPy.

### 4. Vectorize again

Now we can vectorize the improved algorithm, using `numpy.searchsorted`:
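A sketch of the vectorized binary search. The function name `count_greater_searchsorted` and the pair-counting specification are assumptions made for illustration.

```python
import numpy as np


def count_greater_searchsorted(v, w):
    """Count pairs (x, y) with x in v, y in w and y > x (assumed spec)."""
    sorted_v = np.sort(v)
    # side='left' gives, for each y in w, the number of elements of v
    # that are strictly less than y.
    positions = np.searchsorted(sorted_v, w, side='left')
    return int(positions.sum())


print(count_greater_searchsorted([1, 2, 3], [0, 2, 4]))   # 4
```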

And this is six times faster still:

### 5. Answers to your questions

In comments, you asked:

"What does vectorizing mean?" Please read the "What is NumPy?" section of the NumPy documentation, in particular the section starting:

Vectorization describes the absence of any explicit looping, indexing, etc., in the code - these things are taking place, of course, just “behind the scenes” (in optimized, pre-compiled C code).

"What is meshgrid?" Please read the documentation for `numpy.meshgrid`.

I use `meshgrid` to create NumPy arrays that together represent all pairs of elements, one taken from each of the two input arrays. Then I apply the comparison function to those pairs, getting an array of Booleans, which I sum. Try it out in the interactive interpreter and see for yourself:
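A sketch of such a session, written as a script. The input arrays and the direction of the comparison (`y > x`) are assumptions chosen for illustration.

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([0, 2, 4])

x, y = np.meshgrid(v, w, sparse=True)
print(x.shape, y.shape)   # (1, 3) (3, 1)

greater = y > x           # one Boolean per (x, y) pair
print(greater)
# [[False False False]
#  [ True False False]
#  [ True  True  True]]

total = int(greater.sum())
print(total)              # 4
```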
