
Chapter 3 - Membership

Learning to Become a Competent Data User

Published online by Cambridge University Press:  29 May 2026

Götz Hoeppe
Affiliation:
University of Waterloo

Summary

Educating graduate students aims at making them competent members of a disciplinary community and culture. This chapter identifies PhD student training as a curious process in which instruction and the advancement of science go together. It examines how a PhD student was instructed to tackle a common, though often challenging, problem of science with large datasets: calibrating a new dataset and combining it with data from a different source for analysis. By following this student over two years as she achieved this goal, the author learnt how she became a competent member of the community and culture of extragalactic astronomy. Conversely, her case offers insights into why combining scientific datasets is often so challenging. As such, this chapter applies the tactics of Chapter 2 – take a problem of data-intensive science, consider how it is “staffed” in a specific case, and follow its management ethnographically – to another setting. This account serves as a resource for the next two chapters, on uses of diagrams and mundane reasoning in research with large datasets.


Figure 3.1 Flowchart of the MAMBO research group’s data reductions as taken from a conference presentation of one of its members. It depicts the work as a sequence of operations. First, a set of exposures is taken of specific selected fields in the sky through a series of color filters (B, R, I, z, H). These are then used to algorithmically detect objects, take photometric measurements at the object positions, and classify objects by identifying the best-fitting match from a library of template spectra. The results are object positions, radiation flux densities, classifications of the object type (star, galaxy), and estimates of the photometric redshift, a measure of cosmic distance. Note: The online version shows the colors of the original figure.

(Reproduced with permission by the author)

Figure 3.2 Scheme depicting basic steps of the MAMBO team’s spectral energy distribution template-fitting technique. First, exposures taken through each color filter are processed and calibrated with standard data-reduction procedures (a). In the resulting images, objects are detected algorithmically and assigned catalog numbers, their fluxes are measured, converted into magnitudes, and saved in a table (b). These can then be plotted as a spectral energy distribution (c), wherein measurements taken with the telescope in Chile (d) join those taken with the telescope in Spain (e). Crosses indicate measurement errors. Next, the best-fitting spectral energy distribution template, shown here as a continuous line, is selected algorithmically from the template library (f). Using the best-fitting template, the object is classified as a “star” or “galaxy,” and, in case of the latter, its redshift and physical parameters, such as mass and luminosity, are inferred from this fit and entered in the catalog (g). Note: The online version shows the colors of the original figure.
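The template-selection step described in this caption can be illustrated with a minimal sketch. It is not the MAMBO team's code: the function names, the chi-square criterion with a free amplitude, the AB magnitude zero point of 3631 Jy, and the toy template library are all assumptions made for illustration.

```python
import math


def flux_to_ab_magnitude(flux_jy):
    """Convert a flux density in jansky to an AB magnitude.

    Assumption: the AB system (zero point 3631 Jy); the source does not say
    which magnitude system the team used.
    """
    return -2.5 * math.log10(flux_jy / 3631.0)


def best_fit_template(fluxes, errors, templates):
    """Return (name, scale, chi2) for the template with the lowest chi-square.

    fluxes, errors: measured flux densities and 1-sigma errors, one per filter.
    templates: dict mapping a (hypothetical) template name to model fluxes
    in the same filters and the same order.
    """
    best = None
    for name, model in templates.items():
        # Analytic least-squares amplitude for a template that may be rescaled.
        num = sum(f * m / e**2 for f, m, e in zip(fluxes, model, errors))
        den = sum(m * m / e**2 for m, e in zip(model, errors))
        scale = num / den
        chi2 = sum(
            ((f - scale * m) / e) ** 2 for f, m, e in zip(fluxes, model, errors)
        )
        if best is None or chi2 < best[2]:
            best = (name, scale, chi2)
    return best


# Toy example with invented numbers: three filters, two library templates.
library = {"star": [3.0, 2.0, 1.0], "galaxy_z0.5": [1.0, 2.0, 3.0]}
name, scale, chi2 = best_fit_template([2.0, 4.0, 6.0], [0.1, 0.1, 0.1], library)
print(name, scale, chi2)
```

In a real library each galaxy template would be tabulated over a grid of redshifts, so the name of the winning template would carry the photometric redshift estimate along with the star or galaxy classification.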


Figure 3.3 At a team meeting, Otfried is concerned about an offset visible in some galaxy spectral energy distribution fits that Nadine had prepared. This diagram depicts measurements of flux density over wavelength for one of these galaxies. As in Figure 3.2, dots with error bars represent the measurements and the continuous line represents the template model. Note: The online version shows the colors of the original figure.

(Photograph: Götz Hoeppe)

Figure 3.4 Following the group meeting, Nadine and Otfried assess the flatfield frames visually. Note: The online version shows the colors of the original figure.

(Photograph: Götz Hoeppe)

Figure 3.5 Guided by Otto, Nadine assesses the flatfield frame quantitatively by plotting the differences between her brightness measurements of stars in the field and those in the public 2MASS catalog. Note: The online version shows the colors of the original figure.

(Photograph: Götz Hoeppe)
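The quantitative check described in the caption of Figure 3.5 can be sketched as follows. The sketch assumes that the stars have already been cross-matched with the reference catalog, and that the offsets are summarized in radial bins around the field center, a choice suggested by the ring-shaped artifact but not stated in the source. The function name and the numbers are hypothetical.

```python
import math
from statistics import median


def magnitude_offsets(stars, center, bin_width):
    """Median (measured - reference) magnitude in radial bins.

    stars: iterable of (x, y, measured_mag, reference_mag) tuples, assumed
    to be cross-matched with the reference catalog beforehand.
    center: (x0, y0) pixel position of the field center.
    bin_width: width of each radial bin in pixels.
    Returns a dict {bin_index: median offset}, ordered by bin index.
    """
    bins = {}
    for x, y, measured, reference in stars:
        radius = math.hypot(x - center[0], y - center[1])
        bins.setdefault(int(radius // bin_width), []).append(measured - reference)
    return {index: median(values) for index, values in sorted(bins.items())}


# Invented measurements: offsets grow toward the edge of the frame.
matched = [
    (10.0, 0.0, 15.0, 15.0),
    (0.0, 20.0, 14.1, 14.0),
    (150.0, 0.0, 16.2, 16.0),
    (0.0, 160.0, 13.3, 13.0),
]
print(magnitude_offsets(matched, center=(0.0, 0.0), bin_width=100.0))
```

A flat, zero-centered trend would suggest a sound flatfield, whereas a systematic offset at particular radii would point to a position-dependent calibration error.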

Figure 3.6 Summarized on Otto’s office blackboard, a group discussion yields a recipe to correct for the flatfield ring, which is now considered an artifact due to scattered light in the infrared camera. A sequence of arithmetic operations applied to the flatfield frames specifies the remedy (steps 1 to 5, as seen on the right). Note: The online version shows the colors of the original figure.

(Photograph: Götz Hoeppe)

Figure 3.7 Histogram in Nadine’s draft manuscript, showing the number of galaxies as a function of redshift for all objects detected in the A2713 field. This printout includes Peter’s handwritten notes as made prior to the group meeting (unlabeled) as well as the marks he added during the discussion with Nadine and Otfried as described in the text (labeled A and B). Note: The online version shows the colors of the original figure.


Figure 3.8 Revised histogram as it appeared in the published paper, showing on the right the number of galaxies as a function of redshift for all objects in the same field as in Figure 3.7, selected according to refined criteria from Nadine’s optical plus near-infrared galaxy catalog. The dip seen around z = 0.5 in the earlier version is less prominent due to the adoption of different selection criteria for the galaxy sample. On the left, four similar histograms are now included, each depicting one of four fields that the MAMBO team observed. These histograms use the “old” (optical-only) dataset. They illustrate “cosmic variance,” that is, how much the numbers and volume densities of galaxies vary when observed with the same technique in different parts of the sky.

(© American Astronomical Society. Reproduced with permission)

Figure 3.9 Journal articles accompanying the public release of deep field survey data often contain plots characterizing a new dataset and comparing it with existing ones. Number counts, showing the number of detected objects in a field (vertical axis) as a function of flux density or magnitude (horizontal axis), illustrate the degree of the datasets’ “sameness.” As such, these diagrams assert and specify the reliability of new observations.

(Figure 12 of Capak et al. 2007, 114; © American Astronomical Society. Reproduced with permission.)
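Number counts of the kind described in this caption amount to a normalized histogram. The following minimal sketch assumes counts per magnitude bin, divided by the survey area and the bin width. The function name, the normalization, and the sample values are illustrative assumptions and are not taken from Capak et al. 2007.

```python
import math


def number_counts(magnitudes, mag_min, mag_max, bin_width, area_deg2):
    """Differential number counts per magnitude and per square degree.

    Returns a list of (bin_center, counts / (area * bin_width)) pairs.
    Objects outside [mag_min, mag_max) are ignored.
    """
    n_bins = int(round((mag_max - mag_min) / bin_width))
    counts = [0] * n_bins
    for mag in magnitudes:
        index = int(math.floor((mag - mag_min) / bin_width))
        if 0 <= index < n_bins:
            counts[index] += 1
    return [
        (mag_min + (i + 0.5) * bin_width, c / (area_deg2 * bin_width))
        for i, c in enumerate(counts)
    ]


# Invented catalog magnitudes for a hypothetical field of 0.5 square degrees.
sample = [20.1, 20.4, 21.2, 21.7, 21.9, 23.5]
print(number_counts(sample, mag_min=20.0, mag_max=22.0, bin_width=1.0, area_deg2=0.5))
```

Computing the same quantity for a new dataset and an existing one, and overplotting the two curves, is one way to display the “sameness” that such figures assert.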


  • Membership
  • Götz Hoeppe, University of Waterloo
  • Book: How Data Need People
  • Online publication: 29 May 2026
  • Chapter DOI: https://doi.org/10.1017/9781009686754.005