Mining of Massive Datasets

Big data is essentially the computational analysis of very large data sets to reveal patterns, trends, and associations, especially relating to human behavior and interactions. The popularity of the Web and Internet commerce provides many extremely large data sets from which information can be gleaned by data mining. Consequently, the private sector invests heavily in big data. It is also useful to many government entities, INGOs, and businesses.

To support wider engagement in big data analytics, Cambridge University Press, with the agreement of Stanford University and the authors, has made available a free electronic copy of the book ‘Mining of Massive Datasets’.

This book is based on the Stanford Computer Science course CS246: Mining Massive Datasets (and CS345A: Data Mining). The book, like the course, is pitched at the undergraduate computer science level and has no formal prerequisites. To support deeper exploration, most chapters are supplemented with references for further reading.

You can download a copy of the e-book here:

Please note: Cambridge University Press retains copyright on the work and expects that you will obtain its permission and acknowledge the authors if you republish part or all of it.