
Mining of Massive Datasets

2nd Edition

$70.00 (P)

  • Date Published: December 2014
  • availability: In stock
  • format: Hardback
  • isbn: 9781107077232


Other available formats:
eBook


Looking for an examination copy?

If you are interested in the title for your course we can consider offering an examination copy. To register your interest please contact collegesales@cambridge.org providing details of the course you are teaching.

  • Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the map-reduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, as well as the problems of finding frequent itemsets and clustering. This second edition includes new and extended coverage of social networks, machine learning and dimensionality reduction.

    • Contains brand new material and extended coverage of important topics
    • Includes a range of over 150 exercises to challenge even the most able student
    • Slides, homework assignments, project requirements and exams are available from http://infolab.stanford.edu/~ullman/mining/mining.html

    Product details

    • Edition: 2nd Edition
    • Date Published: December 2014
    • format: Hardback
    • isbn: 9781107077232
    • length: 476 pages
    • dimensions: 253 x 180 x 30 mm
    • weight: 0.99kg
    • contains: 150 b/w illus. 210 exercises
    • availability: In stock
  • Table of Contents

    Preface
    1. Data mining
    2. Map-reduce and the new software stack
    3. Finding similar items
    4. Mining data streams
    5. Link analysis
    6. Frequent itemsets
    7. Clustering
    8. Advertising on the Web
    9. Recommendation systems
    10. Mining social-network graphs
    11. Dimensionality reduction
    12. Large-scale machine learning
    Index.

  • Resources for

    Mining of Massive Datasets

    Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman

    General Resources

    Welcome to the resources site

    Here you will find free-of-charge online materials to accompany this book. The range of materials we provide across our academic and higher education titles is an integral part of the book package, whether you are a student, instructor, researcher or professional.

    This title has one or more locked files, and access is given only to instructors adopting the textbook for their class. We need to enforce this strictly so that solutions are not made available to students. To gain access to locked resources, you first need to sign in or register for an account.


    These resources are provided free of charge by Cambridge University Press with permission of the author of the corresponding work, but are subject to copyright. You are permitted to view, print and download these resources for your own personal use only, provided any copyright lines on the resources are not removed or altered in any way. Any other use, including but not limited to distribution of the resources in modified form, or via electronic or other media, is strictly prohibited unless you have permission from the author of the corresponding work and provided you give appropriate acknowledgement of the source.

    If you are having problems accessing these resources please email lecturers@cambridge.org

  • Authors

    Jure Leskovec, Stanford University, California
    Jure Leskovec is Assistant Professor of Computer Science at Stanford University. His research focuses on mining large social and information networks. The problems he investigates are motivated by large-scale data, the Web and online media. This research has won several awards, including a Microsoft Research Faculty Fellowship, an Alfred P. Sloan Fellowship, an Okawa Foundation Fellowship, and numerous best paper awards. His research has also been featured in popular press outlets such as the New York Times, the Wall Street Journal, the Washington Post, MIT Technology Review, NBC, BBC, CBC and Wired. Leskovec has also authored the Stanford Network Analysis Platform (SNAP, http://snap.stanford.edu), a general-purpose network analysis and graph mining library that easily scales to massive networks with hundreds of millions of nodes and billions of edges. You can follow him on Twitter at @jure.

    Anand Rajaraman, Milliways Laboratories
    Anand Rajaraman is a serial entrepreneur, venture capitalist, and academic based in Silicon Valley. He is a Founding Partner of two early-stage venture capital firms, Milliways Labs and Cambrian Ventures. His investments include Facebook (one of the earliest angel investors in 2005), Aster Data Systems (acquired by Teradata), Efficient Frontier (acquired by Adobe), Neoteris (acquired by Juniper), Transformic (acquired by Google), and several others. Anand was, until recently, Senior Vice President at Walmart Global eCommerce and co-head of @WalmartLabs, where he worked at the intersection of social, mobile, and commerce. He came to Walmart when Walmart acquired Kosmix, the startup he co-founded, in 2011. Kosmix pioneered semantic search technology and semantic analysis of social media. In 1996, Anand co-founded Junglee, an e-commerce pioneer. As Chief Technology Officer, he played a key role in developing Junglee's award-winning Virtual Database technology. In 1998, Amazon.com acquired Junglee, and Anand helped launch the transformation of Amazon.com from a retailer into a retail platform, enabling third-party retailers to sell on Amazon.com's website. Anand is also a co-inventor of Amazon Mechanical Turk, which pioneered the concepts of crowdsourcing and hybrid Human-Machine computation. As an academic, Anand's research has focused on the intersection of database systems, the World-Wide Web, and social media. His research publications have won several awards at prestigious academic conferences, including two retrospective 10-year Best Paper awards at ACM SIGMOD and VLDB. In 2012, Fast Company magazine named Anand to its list of '100 Most Creative People in Business'. In 2013, he was named a Distinguished Alumnus by his alma mater, IIT Madras. You can follow Anand on Twitter at @anand_raj.

    Jeffrey David Ullman, Stanford University, California
    Jeffrey David Ullman is the Stanford W. Ascherman Professor of Computer Science (Emeritus) and is currently the CEO of Gradiance. His research interests include database theory, data mining, and education using the information infrastructure. He is one of the founders of the field of database theory, and was the doctoral advisor of an entire generation of students who later became leading database theorists in their own right. He was the Ph.D. advisor of Sergey Brin, one of the co-founders of Google, and served on Google's technical advisory board. Ullman was elected to the National Academy of Engineering in 1989 and the American Academy of Arts and Sciences in 2012, and has held Guggenheim and Einstein Fellowships. Recent awards include the Knuth Prize (2000) and the SIGMOD E. F. Codd Innovations Award (2006). Ullman is also the co-recipient (with John Hopcroft) of the 2010 IEEE John von Neumann Medal, for 'laying the foundations for the fields of automata and language theory and many seminal contributions to theoretical computer science'.
