
Data-driven hypothesis generation among inexperienced clinical researchers: A comparison of secondary data analyses with visualization (VIADS) and other tools

Published online by Cambridge University Press:  04 January 2024

Xia Jing*
Affiliation:
Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC, USA
James J. Cimino
Affiliation:
Informatics Institute, School of Medicine, University of Alabama, Birmingham, AL, USA
Vimla L. Patel
Affiliation:
Cognitive Studies in Medicine and Public Health, The New York Academy of Medicine, New York City, NY, USA
Yuchun Zhou
Affiliation:
Department of Educational Studies, The Patton College of Education, Ohio University, Athens, OH, USA
Jay H. Shubrook
Affiliation:
Department of Clinical Sciences and Community Health, College of Osteopathic Medicine, Touro University California, Vallejo, CA, USA
Sonsoles De Lacalle
Affiliation:
Department of Health Science, California State University Channel Islands, Camarillo, CA, USA
Brooke N. Draghi
Affiliation:
Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC, USA
Mytchell A. Ernst
Affiliation:
Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC, USA
Aneesa Weaver
Affiliation:
Department of Public Health Sciences, College of Behavioral, Social and Health Sciences, Clemson University, Clemson, SC, USA
Shriram Sekar
Affiliation:
Electrical Engineering and Computer Science, Russ College of Engineering and Technology, Ohio University, Athens, OH, USA
Chang Liu
Affiliation:
Russ College of Engineering and Technology, Ohio University, Athens, OH, USA
* Corresponding author: X. Jing, MD, PhD; Emails: xjing@clemson.edu, xia.xjing@gmail.com.

Abstract

Objectives:

To compare how clinical researchers generate data-driven hypotheses with a visual interactive analytic tool (VIADS, a visual interactive analysis tool for filtering and summarizing large datasets coded with hierarchical terminologies) or other tools.

Methods:

We recruited clinical researchers and separated them into “experienced” and “inexperienced” groups. Within each of these groups, participants were randomly assigned to either the VIADS group or the control group. Each participant completed a remote 2-hour hypothesis-generation session with the same study facilitator, using the same datasets and following a think-aloud protocol. Screen activities and audio were recorded, transcribed, coded, and analyzed. Seven experts evaluated the hypotheses on their validity, significance, and feasibility. We used multilevel random-effect modeling for statistical tests.

Results:

Eighteen participants generated 227 hypotheses, of which 147 (65%) were valid. The VIADS and control groups generated a similar number of hypotheses. The VIADS group took significantly less time to generate each hypothesis (e.g., among inexperienced clinical researchers, 258 s versus 379 s, p = 0.046, power = 0.437, ICC = 0.15). The VIADS group received significantly lower ratings than the control group on feasibility and on the combined rating of validity, significance, and feasibility.
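
For readers unfamiliar with the intraclass correlation coefficient (ICC) reported above, the following is a minimal sketch of its standard definition in a random-intercept model: the share of total variance attributable to differences between clusters (here, presumably participants). The function name and the variance values are illustrative assumptions, not the authors' code or data.

```python
def icc_from_variance_components(between_var: float, within_var: float) -> float:
    """Return the ICC of a random-intercept model.

    ICC = between-cluster variance / (between-cluster + within-cluster variance).
    Both inputs are hypothetical variance components, not values from the study.
    """
    if between_var < 0 or within_var < 0:
        raise ValueError("variance components must be non-negative")
    total = between_var + within_var
    if total == 0:
        raise ValueError("total variance must be positive")
    return between_var / total


# Illustrative values only: a between-cluster variance of 0.15 and a
# within-cluster variance of 0.85 correspond to an ICC of 0.15.
example_icc = icc_from_variance_components(0.15, 0.85)
```

Under this definition, an ICC of 0.15 would mean that roughly 15% of the variation in the outcome lies between clusters and the remainder within them.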

Conclusion:

The role of VIADS in hypothesis generation seems inconclusive. The VIADS group took a significantly shorter time to generate each hypothesis. However, the combined validity, significance, and feasibility ratings of their hypotheses were significantly lower. Further characterization of hypotheses, including specifics on how they might be improved, could guide future tool development.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial licence (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of The Association for Clinical and Translational Science
Figure 1. Study flow for the data-driven hypothesis generation (IRB, Institutional Review Boards; VIADS, a visual interactive analysis tool for filtering and summarizing large datasets coded with hierarchical terminologies).

Table 1. Profile of eighteen participants

Table 2. Expert panel quality rating results for hypotheses generated by VIADS and control groups

Table 3. Multilevel random intercept modeling results on hypotheses quality ratings for different strategies

Table 4. Multilevel random intercept modeling results on time used to generate hypotheses for different strategies

Table 5. Follow-up questions (verbal) and answers after each study session (all study participants)

Figure 2. Scientific hypothesis generation framework: contributing factors.

Supplementary material

Jing et al. supplementary material 1 (File, 352.9 KB)
Jing et al. supplementary material 2 (File, 3.1 MB)
Jing et al. supplementary material 3 (File, 178.8 KB)
Jing et al. supplementary material 4 (File, 129.4 KB)
Jing et al. supplementary material 5 (File, 67.8 KB)
Jing et al. supplementary material 6 (File, 471 KB)
Jing et al. supplementary material 7 (File, 97.1 KB)
Jing et al. supplementary material 8 (File, 117.8 KB)