In recent years there has been strong interest in modifying classical dimensionality reduction algorithms, such as principal component analysis (PCA), so that the result (say, a direction of high data variance) is more interpretable, for example by being sparse. In this talk we review some applications of these sparse methods to large data sets, highlighting the non-obvious fact that sparse versions can be easier to apply than their classical counterparts. We also explore the application of the row-by-row method (block coordinate descent) to these sparse dimensionality reduction problems.