Search results for Knowledge Management, Databases and Data Mining

Index
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 418-434
- Chapter

4 - UNIX
from Part II - Tools for Data Science
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 99-124
- Chapter
Summary

While there are many powerful programming languages that one could use for solving data science problems, people forget that one of the most powerful and simplest tools to use is right under their noses. And that is UNIX. The name may generate images of old-time hackers hacking away on monochrome terminals. Or, it may hearken the idea of UNIX as a mainframe system, taking up lots of space in some warehouse. But, while UNIX is indeed one of the oldest computing platforms, it is quite sophisticated and supremely capable of handling almost any kind of computational and data problem. In fact, in many respects, UNIX is leaps and bounds ahead of other operating systems; it can do things of which others can only dream!

8 - Machine Learning Introduction and Regression
from Part III - Machine Learning for Data Science
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 209-234
- Chapter
Summary

So far, our work on data science problems has primarily involved applying statistical techniques to analyze the data and derive some conclusions or insights. But there are times when it is not as simple as that. Sometimes we want to learn something from that data and use that learning or knowledge to solve not only the current problem but also future data problems. We might want to look at shopping data at a grocery chain, combined with farming and poultry data, and learn how supply and demand are related. This would enable us to make recommendations for investments in both the grocery store and the food industries.

Copyright page
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp viii-viii
- Chapter

10 - Unsupervised Learning
from Part III - Machine Learning for Data Science
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 290-318
- Chapter
Summary

In the previous chapter, we saw how to learn from data when the labels or true values associated with them are available. In other words, we knew what was right or wrong and we used that information to build a regression or classification model that could then make predictions for new data. Such a process fell under supervised learning. Now, we will consider the other big area of machine learning where we do not know true labels or values with the given data, and yet we will want to learn the underlying structure of that data and be able to explain it. This is called unsupervised learning.

Preface
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp xv-xix
- Chapter

Part IV - Applications, Evaluations, and Methods
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 319-378
- Chapter

19 - The A/A Test
from Part V - Advanced Topics for Analyzing Experiments
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 200-208
- Chapter
Summary

Why you care: Running A/A tests is a critical part of establishing trust in an experimentation platform. The idea is so useful because the tests fail many times in practice, which leads to re-evaluating assumptions and identifying bugs.

9 - Supervised Learning
from Part III - Machine Learning for Data Science
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 235-289
- Chapter

Appendix F: Using Cloud Services
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 393-406
- Chapter

4 - Experimentation Platform and Culture
from Part I - Introductory Topics for Everyone
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 58-78
- Chapter
Summary

As discussed in Chapter 1, running trustworthy controlled experiments is the scientific gold standard in evaluating many (but not all) ideas and making data-informed decisions. What may be less clear is that making controlled experiments easy to run also accelerates innovation by decreasing the cost of trying new ideas, as the quotation from Moran shows above, and learning from them in a virtuous feedback loop. In this chapter, we focus on what it takes to build a robust and trustworthy experiment platform. We start by introducing experimentation maturity models that show the various phases an organization generally goes through when starting to do experiments, and then we dive into the technical details of building an experimentation platform.

Reviews
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp ii-vi
- Chapter

3 - Techniques
from Part I - Conceptual Introductions
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 66-96
- Chapter

2 - Data
from Part I - Conceptual Introductions
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp 37-65
- Chapter

Part I - Introductory Topics for Everyone
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 1-78
- Chapter

9 - Ethics in Controlled Experiments
from Part II - Selected Topics for Everyone
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 116-124
- Chapter
Summary

Why you care: Understanding the ethics of experiments is critical for everyone, from leadership to engineers to product managers to data scientists; all should be informed and mindful of the ethical considerations. Controlled experiments, whether in technology, anthropology, psychology, sociology, or medicine, are conducted on actual people. Here are questions and concerns to consider when determining when to seek expert counsel regarding the ethics of your experiments.

Dedication
Chirag Shah, University of Washington
Book: A Hands-On Introduction to Data Science
Published online: 01 February 2020
Print publication: 02 April 2020, pp v-vi
- Chapter

Part III - Complementary and Alternative Techniques to Controlled Experiments
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 125-150
- Chapter

Part V - Advanced Topics for Analyzing Experiments
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 183-245
- Chapter

21 - Sample Ratio Mismatch and Other Trust-Related Guardrail Metrics
from Part V - Advanced Topics for Analyzing Experiments
Ron Kohavi, Diane Tang, Ya Xu
Book: Trustworthy Online Controlled Experiments
Published online: 13 March 2020
Print publication: 02 April 2020, pp 219-225
- Chapter
Summary

Why you care: Guardrail metrics are critical metrics designed to alert experimenters about violated assumptions. There are two types of guardrail metrics: organizational and trust-related. Chapter 7 discusses organizational guardrails that are used to protect the business, and this chapter describes the Sample Ratio Mismatch (SRM) in detail, which is a trust-related guardrail. The SRM guardrail should be included for every experiment, as it is used to ensure the internal validity and trustworthiness of the experiment results. A few other trust-related guardrail metrics are also described here.

1835 results in Knowledge Management, Databases and Data Mining
