Search results for Knowledge Management, Databases and Data Mining

Excel Basics to Blackbelt

An Accelerated Guide to Decision Support Designs
3rd edition
Elliot Bendoly
Published online: 08 May 2020
Print publication: 28 May 2020
- Book
This third edition capitalizes on the success of the previous editions and leverages the important advancements in visualization, data analysis, and sharing capabilities that have emerged in recent years. It serves as an accelerated guide to decision support designs for consultants, service professionals and students. This 'fast track' enables a ramping up of skills in Excel for those who may have never used it to reach a level of mastery that will allow them to integrate Excel with widely available associated applications, make use of intelligent data visualization and analysis techniques, automate activity through basic VBA designs, and develop easy-to-use interfaces for customizing use. The content of this edition has been completely restructured and revised, with updates that correspond with the latest versions of software and references to contemporary add-in development across platforms. It also features best practices in design and analytical consideration, including methodical discussions of problem structuring and evaluation, as well as numerous case examples from practice.

Mining of Massive Datasets

3rd edition
Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman
Published online: 16 April 2020
Print publication: 09 January 2020
- Textbook
Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the MapReduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream-processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, the problems of finding frequent itemsets, and clustering. This third edition includes new and extended coverage on decision trees, deep learning, and mining social-network graphs.

17 - The Statistics behind Online Controlled Experiments
from Part V - Advanced Topics for Analyzing Experiments
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 185-192
- Chapter
Summary

Why you care: Statistics are fundamental to designing and analyzing experiments.

Index
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 266-272
- Chapter

3 - Techniques
from Part I - Conceptual Introductions
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 66-96
- Chapter
Summary

There are many tools and techniques that a data scientist is expected to know or acquire as problems arise. Often, it is hard to separate tools and techniques. One whole section of this book (four chapters) is dedicated to teaching how to use various tools, and, as we learn about them, we also pick up and practice some essential techniques. This happens for two reasons. The first one is already mentioned here – it is hard to separate tools from techniques. Regarding the second reason – since our main purpose is not necessarily to master any programming tools, we will learn about programming languages and platforms in the context of solving data problems.

Acknowledgments
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp xxii-xxiv
- Chapter

Acknowledgments
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp xvii-xviii
- Chapter

14 - Choosing a Randomization Unit
from Part IV - Advanced Topics for Building an Experimentation Platform
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 166-170
- Chapter
Summary

Why you care: The choice of randomization unit is critical in experiment design, as it affects both the user experience and which metrics can be used to measure the impact of an experiment. When building an experimentation system, you need to think through what options you want to make available. Understanding the options and the considerations to use when choosing among them will lead to improved experiment design and analysis.

About the Author
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp xx-xxi
- Chapter

10 - Unsupervised Learning
from Part III - Machine Learning for Data Science
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 290-318
- Chapter

References
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 246-265
- Chapter

20 - Triggering for Improved Sensitivity
from Part V - Advanced Topics for Analyzing Experiments
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 209-218
- Chapter
Summary

Why you care: Triggering provides experimenters with a way to improve sensitivity (statistical power) by filtering out noise created by users who could not have been impacted by the experiment. As organizational experimentation maturity improves, we see more triggered experiments being run.

Appendix D: Installing and Configuring Tools
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 385-389
- Chapter

11 - Hands-On with Solving Data Problems
from Part IV - Applications, Evaluations, and Methods
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 321-353
- Chapter
Summary

So far in this book we have taken one topic or tool at a time and looked at how we could tackle a given data problem. Now, it is time to start bringing them together to develop a deeper understanding of the nature of data problems and methods, as well as extend our reach and skillset to address new problems that may emerge. There is, of course, no way we could cover all that you would encounter in real life, but we can certainly try to go through a few examples to see where you could take your data science skills.

5 - Python
from Part II - Tools for Data Science
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 125-160
- Chapter
Summary

Python is a simple-to-use yet powerful scripting language that allows one to solve data problems of varying scale and complexity. It is also the most used tool in data science and the one most frequently listed as a requirement in data science job postings. Python is a very friendly and easy-to-learn language, making it ideal for the beginner. At the same time, it is very powerful and extensible, making it suitable for advanced data science needs.

Index
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 418-434
- Chapter

4 - UNIX
from Part II - Tools for Data Science
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 99-124
- Chapter
Summary

While there are many powerful programming languages that one could use for solving data science problems, people forget that one of the most powerful and simplest tools to use is right under their noses. And that is UNIX. The name may generate images of old-time hackers hacking away on monochrome terminals. Or, it may hearken the idea of UNIX as a mainframe system, taking up lots of space in some warehouse. But, while UNIX is indeed one of the oldest computing platforms, it is quite sophisticated and supremely capable of handling almost any kind of computational and data problem. In fact, in many respects, UNIX is leaps and bounds ahead of other operating systems; it can do things of which others can only dream!

8 - Machine Learning Introduction and Regression
from Part III - Machine Learning for Data Science
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 209-234
- Chapter
Summary

So far, our work on data science problems has primarily involved applying statistical techniques to analyze the data and derive some conclusions or insights. But there are times when it is not as simple as that. Sometimes we want to learn something from that data and use that learning or knowledge to solve not only the current problem but also future data problems. We might want to look at shopping data at a grocery chain, combined with farming and poultry data, and learn how supply and demand are related. This would enable us to make recommendations for investments in both the grocery store and the food industries.

Copyright page
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp viii-viii
- Chapter

1911 results in Knowledge Management, Databases and Data Mining
