You want to identify hotels in a city that are underpriced for their location and quality. You have scraped the web for data on all hotels in the city, including prices for a particular date, and many features of the hotels. How can you check whether the data you have is clean enough for further analysis? And how should you start the analysis itself?
Poor-quality measurements are likely to yield meaningless or unrepeatable findings. High-quality measurements are characterised by validity and reliability. Validity relates to whether the right quantity is measured and is assessed by comparing a metric with a gold-standard metric. Reliability relates to whether measurements are repeatable and is assessed by comparing repeated measurements. The accuracy and precision with which measurements are made affect both validity and reliability. A major source of unreliability in behavioural data comes from the involvement of human observers in the measurement process. Where trade-offs are necessary, it is better to measure the right quantity somewhat unreliably than to measure the wrong quantity very reliably. Floor and ceiling effects can make measurements useless for answering a question, even if they are valid and reliable. Outlying data points should only be removed if they can be proved to be biologically impossible or to result from errors.
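The two checks described above (reliability from repeated measurements, validity from comparison with a gold standard) can be illustrated with a minimal sketch. The data values, the variable names, and the choice of the Pearson correlation and mean difference as summary statistics are all assumptions made for illustration; the chapter itself does not prescribe a specific statistic.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_bias(metric, gold):
    """Average signed difference between a metric and the gold standard."""
    return sum(m - g for m, g in zip(metric, gold)) / len(gold)


# Hypothetical scores for six subjects (made-up numbers, not real data).
session_1 = [12, 15, 9, 20, 17, 11]       # first measurement
session_2 = [13, 14, 10, 19, 18, 10]      # repeated measurement
gold_standard = [12, 15, 10, 20, 17, 11]  # assumed gold-standard metric

# Reliability: agreement between repeated measurements.
reliability = pearson(session_1, session_2)

# Validity: agreement with the gold standard, plus any systematic offset.
validity = pearson(session_1, gold_standard)
bias = mean_bias(session_1, gold_standard)

print(f"reliability r = {reliability:.3f}")
print(f"validity r = {validity:.3f}, mean bias = {bias:.3f}")
```

A correlation close to 1 between sessions suggests repeatable measurements, but a high correlation alone does not rule out a constant offset from the gold standard, which is why the mean bias is reported separately.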
Interpreting results correctly and communicating them honestly are vital parts of what scientists do. Incorrect interpretation of data often results from avoidable statistical mistakes. Common pitfalls arise from abuse of significance testing, misunderstanding of correlations and overgeneralisation of findings. Publishing peer-reviewed papers in scientific journals is the primary means by which researchers communicate their findings to other scientists. A scientific paper has an established basic format comprising title, abstract, introduction, methods, results and discussion. Open Science practices are an important part of the modern publication process. Non-technical (lay) summaries and press releases are tools for communicating behavioural research to journalists and the public. All science involves potential conflicts of interest, and their influence on scientific communication is an unresolved cause for concern. Several organisations oversee the integrity of science, but ultimately it is the personal responsibility of each individual researcher to behave with openness and integrity.
This chapter looks at cases where those subject to Roman hegemony attempted to throw off Roman control and also where the power of individuals within the state became so contested that it threatened the constitutional integrity of the republic. In the first half coin evidence is used to look at South Italian communities that sided with Hannibal during the Second Punic War, uprisings of enslaved peoples and Roman responses, and the failed attempt by Rome’s former Italian allies to set up a rival federal state. The second half examines what numismatic evidence can tell us about the autocratic ambitions of Marius, Sulla, and Pompey and ends with a close look at how Sulla’s memory was used during the period of Pompey’s ascendency.
Are larger companies better managed? To answer this question, you downloaded data from the World Management Survey. How should you describe the relationship between firm size and the quality of management? In particular, can you describe that with the help of a single number, or an informative graph?
There is a substantial difference in the average earnings of women and men in all countries. You want to understand more about the potential origins of that difference, focusing on employees with a graduate degree in your country. You have data on a large sample of employees with a graduate degree, with their earnings and some of their characteristics, such as age and the kind of graduate degree they have. Women and men differ in those characteristics, which may affect their earnings. How should you use this data to uncover gender differences that are not due to differences in those other characteristics? And can you use regression analysis to uncover patterns of association between earnings and those other characteristics that may help understand the origins of gender differences in earnings?
Public trust in science depends on scientists behaving legally and ethically. Ethical science is also often better science. To be ethical, research must be of sufficient quality to further scientific understanding and its potential benefits should outweigh the risks of harm to subjects or other stakeholders. All research must also be lawful. Conducting a harm–benefit analysis is central to ensuring that ethical standards are maintained in research and is required for the majority of behavioural studies. Formal ethical approval must be obtained before starting to collect data. Research on animals should minimise animal suffering by following the 3Rs principles of replacement, reduction and refinement. Humane end points should be used to limit unnecessary suffering. Research on humans should respect the autonomy and rights of participants and will generally require informed consent, the right to withdraw and debriefing. Deception is potentially harmful and should only be used following careful consideration.
You need to predict rental prices of apartments using various features. You don’t know how the various features may interact with each other in determining price, so you would like to use a regression tree. But you want to build a model that gives the best possible prediction, better than a single tree. What methods are available that keep the advantage of regression trees but give a better prediction? How should you choose from those methods?
This chapter explores the diverse ways in which coins serve as ‘monuments in miniature’, commemorating a wide variety of aspects of Roman public life. The first section uses two case studies to exemplify the different types of interactions of individuals, families, and the state seen through the coins. The first looks at the coinage produced over three generations by the Marcii Philippi; the second looks at the diversity of commemorative strategies used within the divisive years 56-55 BCE. The second section looks at how the Romans conceived of their empire as proof of divine favor. This type of ideology is evident in their foundation legends, how Rome is personified, the importance of priesthoods to individual and family status, and how military victories are themselves the subject of religious thanksgiving.
You want to predict rental prices of apartments in a big city using their location, size, amenities, and other features. You have access to data on many apartments with many variables. You know how to select the best regression model for prediction from several candidate models. But how should you specify those candidate models to begin with? In particular, which of the many variables should they include, in what functional forms, and in what interactions? More generally, how can you make sure that the candidates include the truly good predictive models?
High-quality behavioural data can be recorded using cheap and simple technologies such as check sheets and sound recorders. Advances in technologies for data recording have made big data available to behavioural scientists, which in turn has stimulated the development of AI technologies for automated data processing. A data pipeline describes the workflow of data recording, processing and analysis, including details of the technologies used in each step. The choice of technology for capturing behavioural data will depend on the research question and the resources available, the quantity of data required, where the data is to be collected, the amount of interaction with subjects and the likely impact of the technology on the subjects and their environment. Data that are initially recorded in a relatively rich form will require subsequent processing to code behavioural metrics. Coding of data can be either manual or automated using rules-based approaches and machine learning.
You have a car that you want to sell in the near future. You want to know what price you can expect if you were to sell it. You may also want to know what you could expect if you were to wait one more year and sell your car then. You have data on used cars with their age and other features, and you can predict price with several kinds of regression models with different right-hand-side variables in different functional forms. How should you select the regression model that would give the best prediction?