The fundamental problem with most of the linear algebra techniques is that they scale badly for large matrices. Machine Learning algorithms compute all the eigenvalues and eigenvectors, all require ∼ n3 operations and ∼ n2 storage for n × n matrices.

You explicitly store and compute with all of the matrix entries, regardless of their values. Hence, they are sometimes called dense matrix algorithms.

## Sparse Matrices

The saving grace is that most really large matrices are sparse = mostly zeros (or have some other special structure with similar consequences). You only have to store the nonzero entries, and you can multiply matrix × vector quickly (you can skip the zeros).

In SciPy, there are many functions to work with sparse matrices by only storing the nonzero elements. The simplest one is the csr_matrix sparse function. Given a matrix A, the `csr` function creates a special data structure that only stores the nonzero elements:

```import numpy as np
from scipy import sparse

# Larger numpy matrix
matrix = np.array([[0, 0, 0, 0, 0, 0, 8, 0, 0, 0],
[0, 0, 0, 0, 6, 0, 0, 0, 0, 0],
[4, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

matrix_sparse = sparse.csr_matrix(matrix)
```

In practice, you would want to create the sparse matrix directly, rather than first making the “dense” matrix A and then converting it to a sparse data structure. We’ve actually seen this several times in graph/network-based problems, where we often get matrices of the form:

`A = GT DG`

D is diagonal (very sparse!) and G is the incidence matrix. Since each graph node is typically only connected to a few other nodes, G is sparse and so is A.

If each node is connected to a bounded number of other nodes (say, ≤ 20), then A only has ∼ n (i.e. proportional to n, not equal to n) entries, and Ax can be computed in ∼ n operations and ∼ n storage (unlike ∼ n2 for a general matrix).

Here we create matrix product tests, using the `sparse.random` method to create a sparse matrix with a specified sparsity. Sparse matrix multiplication performed better.

## Computational Complexity

The computational complexity of sparse operations depends on the number of nonzero elements in the matrix. Computational complexity also depends linearly on the row size `m` and column size `n` of the matrix, but is independent of the product `m*n`, the total number of zero and nonzero elements.

The complexity of fairly complicated operations, such as the solution of sparse linear equations, involves factors like ordering and fill-in, which are discussed in the previous section. In general, however, the computer time required for a sparse matrix operation is proportional to the number of arithmetic operations on nonzero quantities.

## Reordering for Sparsity

Reordering the columns of a matrix can often make its LU or QR factors sparser. Reordering the rows and columns can often make its Cholesky factors sparser. The simplest such reordering is to sort the columns by nonzero count. This is sometimes a good reordering for matrices with very irregular structures, especially if there is great variation in the nonzero counts of rows or columns.

The `colperm` computes a permutation that orders the columns of a matrix by the number of nonzeros in each column from smallest to largest.

## Conclusion

Sparse matrices also have significant advantages in terms of computational efficiency. Unlike operations with full matrices, operations with sparse matrices do not perform unnecessary low-level arithmetic, such as zero-adds (`x+0` is always `x`). The resulting efficiencies can lead to dramatic improvements in execution time for programs working with large amounts of sparse data.