Empowering professionals: An intensive short course on fundamentals of clinical data science

Published online by Cambridge University Press:  10 October 2025

Richard F. Ittenbach*
Affiliation:
Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital, University of Cincinnati College of Medicine, Cincinnati, OH, USA
Brian McCourt
Affiliation:
Duke Clinical Research Institute, Duke University, Durham, NC, USA
Maurizio Macaluso
Affiliation:
Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital, University of Cincinnati College of Medicine, Cincinnati, OH, USA
*
Corresponding author: R.F. Ittenbach; Email: richard.ittenbach@cchmc.org

Abstract

Clinical data science, like the broader discipline of all data science, has quickly grown from obscurity only a few decades ago to one of the fastest growing specialties in biomedical research today. Yet, the education and training of the workforce have not kept pace with the growth of the field, the complexity of the science, or the needs of the profession. The purpose of this paper is to provide a template for an intensive short course on fundamentals of clinical data science that meets the needs of working professionals in academic, industry, and government research settings. Care will be taken to introduce students to essential roles, responsibilities, and practice patterns within the field, the foundational components from which they come, and many of the soft skills needed for professional practice and advancement in the field today. The course is designed as an evidence-based, immersive learning experience delivered over a 5-day period on a university campus, using principles of best educational practice and multiple modalities to assure optimal interaction and engagement throughout the week. This template may be reproduced by any institution interested in and capable of offering such a program.

Information

Type
Special Communication
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Association for Clinical and Translational Science

Introduction

Clinical data science, like the broader discipline of all data science, has quickly grown from obscurity only a few decades ago to one of the fastest growing specialties in biomedical research today [1]. Yet, the education and training of the workforce have not kept pace with the growth of the field, the complexity of the science, or the needs of the profession. A new approach to education is desperately needed for the field, and its dedicated professionals, to reach their full potential today [2].

The pace, complexity, and sophistication of clinical research continue to expand well beyond what was imagined only a few decades ago [3, 4]. With a typical Phase III clinical trial now having 263 procedures, 22 endpoints, and 3.6M data points (up from 100 procedures, 7 endpoints, and 0.5M data points in 2005) [5] and with 60% of clinical sites now using more than 20 overlapping software applications at any one time [6], even the more rigorously trained professionals find it challenging to keep up. Concomitant rises in ethical issues, cost of healthcare, and advancing technologies have raised the stakes even further, necessitating that today’s clinical data scientists’ understanding go well beyond simply managing the data to now include the usability of the data, their merits as a scientific tool, and their role in the scientific process [7–9]. The scientific community is now expecting these professionals to be true “scientists.”

While there are many programs training data scientists, there are currently no programs training “clinical” data scientists – at either the graduate or undergraduate level. Today’s clinical data scientists have generally received their training through a collection of efforts and opportunities [2]. The most rigorous cases are traditional or rebranded statistics or informatics programs, which are strong with respect to analytics, systems development, and technology, but often lack formal training in the data itself and most certainly lack the ability to manage the flow of data through a clinical research study. The more typical cases are employer-driven continuing education programs focusing on the processing of data rather than the theory, rationale, or scientific attributes of the data. Professional associations have offered specialized courses and programs just as individuals have created unique individually-tailored programs; however, these are not reliable mechanisms to train an entire workforce.

A trend that has become apparent in the business community over the past several decades is to bring working professionals together for targeted, brief, intensive, immersive educational experiences [10]. Such courses, often referred to as executive education courses, have made inroads in the business world and traditional university communities alike [11–13]. According to Mena-Guacas et al., short courses offer benefits not typically available through other instructional formats, such as being taught by professional educators used to delivering the content, relocating students from the typical office and work-day routines to a more typical learning environment, and a holistic integration of content [14]. The benefits of using an evidence-based framework add to the impact of the experience through an enriched appreciation of content, higher levels of trust in the instructors, and instructors’ ability to package the information in a more usable format for learners [11, 15].

As with all investments, efficacy and the potential for a return on one’s investment remain concerns. Beginning with the managerial revolution of business schools in the United States, and with Harvard Business School’s Advanced Management Program as a model for other programs in 1945, executive education programs are now taught worldwide across a wide range of formats [16, 17]. In a survey of 52 corporations conducted by the Consortium for University-based executive education programs, 92% of companies reported using participant feedback data sources to evaluate their executive education programs and 49% reported using it in career tracking and promotions, with 60% connecting executive education to their overall strategy [18].

Given the pace and complexity of clinical research today, the lack of instructional programs globally, and the general lack of educational opportunities for working professionals, the purpose of this paper is to provide a template for an intensive short course on fundamentals of clinical data science that meets the needs of working professionals in academic, industry, and government research settings.

Materials and methods

Expectations of the course

This week-long course was developed with several expectations in mind: first, to map the coursework onto existing curricula and competency frameworks in the professional literature, specifically drawing from the foundational domains of biostatistics, biomedical informatics, biomedical science, regulatory science, and the clinical research literature more generally (see Figure 1) [8, 19–24]. The course, while short, needs to be a snapshot of content shared in a graduate-level program in clinical data science “devoted to the measurement, acquisition, care, treatment, and inferencing of clinical research data.” Second, the content must be scaled to the abbreviated nature of the program and the accelerated pace of the instruction, but still prioritize weaving the information into a coherent whole, with each module building on the ones that came before it and laying the foundation for the ones that come after it (vertical articulation) [25]. Third, for it to truly be a clinical data science course, it needs to show fidelity to “science” and to the scientific method. Furthermore, executive education faculty are encouraged to incorporate experiential learning activities to better integrate knowledge with applications of the material in typical work settings.

Figure 1. Clinical data science components.

As with all scientific disciplines, there is core content to help define the discipline and supporting content that connects the new knowledge base to the other sciences. In the case of clinical data science, what began as an operational subspecialty to help get drugs and devices to market faster and with better reliability, effectiveness, and safety has evolved into a scientific specialty in its own right, one that draws from the scientific knowledge bases of related, foundational disciplines [26, 27]. Consequently, a short course in clinical data science should consist of core knowledge representing the essential components of the field such as first principles, roles and responsibilities, and fundamentals of design and implementation, while also orienting learners to core tenets of the disciplines on which the new field is based (biostatistics, biomedical informatics, biomedical science, and regulatory science) [2].

Pedagogy

Implicit within the course is a commitment to adhering to principles of best educational practice. That is, to meet the students where they are professionally, to engage them using multiple instructional formats (lecture, case study, group discussion), and to introduce them to exemplary models of literature and professional practice. While all instructional modules are organized around specific learning objectives, not all learning will come from structured exercises. Some will come from the less structured parts of the course by design – the breaks, meals, and networking. The course should offer an educational experience that is greater than the sum of its parts and stimulate learning and development once the students leave the course.

To be consistent with the educational characteristics of a graduate program, this intensive, abbreviated course strives to relay content from a full graduate program, just in a more condensed form. If the course is successful, it will move the students toward a deeper understanding of clinical data science. To help guide the curriculum and provide a point of reference for what students will be learning in the short course, we offer seven learning objectives – one from each of our three core modules and one from each of the four foundational disciplines of biostatistics, biomedical informatics, biomedical science, and regulatory science.

With respect to the core modules, the learning objectives begin with mapping a very simple data-flow diagram and process onto the scientific method (Objective 1), move to recalling from memory and describing elements of a data management plan (Objective 2), and end with connecting data fields and variables from the study protocol to the statistical analysis plan, to the data management plan, and finally to the study database (Objective 3). Implicit in these core modules is a developmental sequence that takes the learner from the scientific method to the plan for the data to how the fields and variables progress through all phases of the study from both an operational and a scientific standpoint.
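The kind of field-level tracing described in Objective 3 can be illustrated with a small sketch. It is not part of the course materials: the document names follow the text, but every variable name below is hypothetical, and a real study would draw these lists from the actual protocol, statistical analysis plan, data management plan, and database specification.

```python
# Hypothetical variable inventories for each study document.
# None of these names come from the course; they are for illustration only.
documents = {
    "protocol": {"subject_id", "visit_date", "systolic_bp", "adverse_event"},
    "statistical_analysis_plan": {"subject_id", "visit_date", "systolic_bp"},
    "data_management_plan": {"subject_id", "visit_date", "systolic_bp", "adverse_event"},
    "study_database": {"subject_id", "visit_date", "adverse_event"},
}


def trace_variables(documents):
    """Map each variable to the documents that omit it (only variables with gaps)."""
    all_variables = set().union(*documents.values())
    gaps = {}
    for variable in sorted(all_variables):
        missing_from = [name for name, fields in documents.items() if variable not in fields]
        if missing_from:
            gaps[variable] = missing_from
    return gaps


gaps = trace_variables(documents)
for variable, missing_from in gaps.items():
    print(f"{variable}: not found in {', '.join(missing_from)}")
```

A variable flagged here is not necessarily an error (an adverse event field may legitimately be absent from a given analysis plan), but each gap is a prompt for the learner to explain why the field does or does not carry through.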

Regarding the research courses that support the core content, the learner moves from recognizing and understanding the impact of measurement bias and imprecision on the scientific process (Objective 4, biostatistics); to formatting data elements using globally-recognized standards (Objective 5, biomedical informatics); to distinguishing among various health conditions of patients and research participants along with the role of diagnosis and intervention in responding to those conditions (Objective 6, biomedical science); and, finally, to understanding and relying on influential guidelines protecting human subjects in clinical research (Objective 7, regulatory science). Please see Figure 2 for a list of the seven learning objectives for this short course.

Figure 2. Short course learning objectives and evaluation.

As important as the roadmap for learning is, the educational process would not be complete without an evaluation plan. As such, the evaluation mechanism used for the short course will be similar to the one used for our graduate program and will follow a two-phase approach to program evaluation. Phase I will include a straightforward pretest-posttest approach to evaluation in which the students are asked a series of questions related to typical work products, immediately prior to and immediately following delivery of the content (see Figure 2 for examples of the work product evaluation). In each case, the assessments will be designed to be succinct, focused, time limited, and based on content used in the course. Phase II will consist of components from the Environment, Pedagogy, Institution, and Course (EPIC) system, which is a standardized, NSF-funded evaluation system of surveys for university-level data science programs – but tailored to our short course rather than to an entire semester-long course [28].
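A minimal sketch of how Phase I pretest-posttest results might be summarized is given here. The scores, the 0-10 scale, and the student identifiers are all hypothetical; the paper does not specify a scoring scheme, and a simple gain score is only one of several reasonable summaries.

```python
from statistics import mean

# Hypothetical scores on a 0-10 scale for five students; not data from the course.
pretest = {"s01": 4, "s02": 6, "s03": 5, "s04": 7, "s05": 3}
posttest = {"s01": 7, "s02": 8, "s03": 5, "s04": 9, "s05": 6}


def gain_scores(pre, post):
    """Return posttest minus pretest for each student who has both scores."""
    return {sid: post[sid] - pre[sid] for sid in pre if sid in post}


gains = gain_scores(pretest, posttest)
mean_gain = mean(list(gains.values()))
improved = sum(1 for gain in gains.values() if gain > 0)
print(f"mean gain = {mean_gain:.1f}; {improved} of {len(gains)} students improved")
```

Restricting the calculation to students with both scores keeps the comparison paired, which matters in a short course where a participant may miss one of the two assessments.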

One challenge will be to find a meaningful but efficient way of incorporating homework into the intensive short course. One strategy that is available to instructors is to have the students bring specific examples of work products with them to the short course that they can use in the break-out sessions following content-related discussions (e.g., Data Management Plan, abbreviated protocol, data-flow diagram). Foundational concepts can be effectively demonstrated using data science tools that faculty employ in real-world applications. With the widespread availability of online data, online data-collection methods, databases, and analytical tools, these elements can be integrated into case studies, enabling participants to gain a deeper understanding of the material encountered in traditional classroom discussions. Finally, the course must be developed in keeping with the fiscal expectations and limitations of host institutions.

Budgetary considerations

As with all innovative programs, this one must be funded in a way that is not only affordable but is also sustainable and does not deplete resources from other programs and initiatives. As such, we have provided a preliminary budget that details typical costs of such a program (see Table 1).

Table 1. Example budget

Note. Catering costs for instructional faculty and staff should be built into the overhead costs.

The proposed budget is divided into four categories: facilities, catering, speaker travel/honoraria, and project management support. As one might imagine, implementing an intensive, 5-day course such as this one requires meticulous planning and coordinated efforts as well as time for follow-up communication and post-course processing. The higher the quality, the more likely students are to attend from great distances, and, as a result, the greater the need for communication, planning, and organizational support. We have proposed the use of a skilled project manager at 30% time for six months. The underlying value of this program lies in the carefully curated and coordinated content and opportunities for interaction with leaders in the field. Unlike traditional university courses based on a single instructor, a multidisciplinary program such as this one requires the input and synthesis of multiple experts from many areas, driving a significant portion of the budget. Facilities and catering are the most easily quantifiable, and the most easily scaled to align with the expectations and needs of the program. Cost estimates provided in Table 1 are drawn from actual examples of conferences offered at two university medical centers and from experience with on- as well as off-campus continuing education courses and meetings.

Results

The clinical data science course is designed to be a 50-hour course consisting of 19 educational modules taught over an intensive, five-day period. The 19 modules are organized as follows:

  • 5 primary modules associated with core content (overview, roles and responsibilities, design and implementation, field placements, good clinical data management practice)

  • 5 supporting modules associated with foundational research content (biomedical informatics, biostatistics, biomedical science, regulatory science, and clinical research ethics)

  • 4 supporting modules on soft skills needed for successful professional practice (leadership, critical thinking, team science, communication)

  • 5 supporting modules devoted to challenging case studies.
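The module counts above can be checked against the stated total with a few lines of arithmetic. The category labels are shorthand for the four bullet points, and the even split of hours across days is an assumption made only for illustration; the actual daily schedule is given in Figure 3.

```python
# Module counts as listed in the four bullet points above.
modules = {
    "core content": 5,
    "foundational research content": 5,
    "soft skills": 4,
    "case studies": 5,
}

total_modules = sum(modules.values())

# Stated course length; dividing evenly across days is an illustrative assumption.
course_hours = 50
course_days = 5
hours_per_day = course_hours / course_days

print(f"{total_modules} modules, about {hours_per_day:.0f} hours per day")
```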

Day 1 of the program begins with an opening session, which includes a light breakfast, an introduction to the program, faculty, course objectives, housekeeping, and news and notes for the day. The remaining days will begin similarly but be much more focused, beginning with a light breakfast, a recap of the prior day, an introduction to the current day and any news or notes that may be needed. The instructional portion of the day is divided into morning and afternoon sessions, with each half-day beginning with a substantive content session (e.g., Overview of Clinical Data Science) followed by a supporting case study or instruction in a “soft skill” deemed important for professional practice (e.g., critical thinking). A number of different instructional methods will be used to keep delivery fresh and engaging for students with different areas of interest, ranging from the traditional didactic method to hands-on exercises and small group discussions.

When people are brought to campus for an intensive and demanding short course, especially when brought to campus from great distances, having meals available and aligned with the conference is crucial to keeping the students present, focused, and on schedule. As alluded to previously, even the meals are designed to offer the students time to network with faculty, staff, and other students – to be fun and relaxing but still immerse them in the field of clinical data science. Three of the five days will end with a working dinner and a professional speaker from the community (see Figure 3).

Figure 3. Example curriculum.

The most beneficial component of a short, intensive program such as this one is what it can do for current, working professionals who may not have the time or resources to enroll in a more traditional, multiyear program. While the benefits of the course will be shaped by the instructors as much as each student’s commitment to learning, this short course will be designed to equip the students with an integrated knowledge base that extends the foundational content of the parent disciplines. Provided below are examples of ways in which the short course will benefit working professionals, organized by the four respective foundational areas:

Biostatistics

Data science as a distinct discipline is rooted in statistics (or biostatistics) and computer science. As such, the new discipline draws heavily from the analytical side of clinical inquiry – one that is premised upon searching for certainty in a world of unknown, but probabilistic events. Appreciation of the uncertainty that necessarily affects data capture, and of the importance of a rigorous statistical approach to harnessing the uncertainty of the inferences and the point estimates on which the inferences are based, will require a critical understanding of study design and the ways in which the data are measured and collected. The connection among fundamentals of measurement, the statistical tests used to analyze the data, and other supporting principles discussed in most introductory statistics courses will highlight the importance of the rationale, validity, and need for the precision of the biomedical sciences. Such relationships are generally detailed in the core content, the Statistical Analysis Plan, and its companion document, the Data Management Plan, noted above (Objectives 2, 4).

Biomedical informatics

Information science and the technology that supports it provide the means to ensure scientific rigor to a degree previously deemed unachievable – across all areas of science. Mastery of the data capture software and the programming techniques that support it may allow presenting survey questions with branching logic in a user-friendly format, representing the data entry modules in a table that reproduces a table of events, and displaying summary tables reporting which forms have been completed or how many data elements are missing. Alignment of data formats to existing standards will ensure that the data produced are “findable, accessible, interoperable and reusable” (FAIR), enhancing the impact of the clinical research study. The Clinical Data Interchange Standards Consortium (CDISC) has developed standards that cover most aspects of clinical research, from protocol development to data collection, presentation, and analysis. Adherence to CDISC standards is mandatory for submission of data to the U.S. Food and Drug Administration (FDA) (Objectives 3, 5).
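Two of the tasks mentioned above, summarizing missing data elements by form and aligning a data element to a recognized format, can be sketched briefly. The records, form names, and field names are hypothetical, and ISO 8601 is used here as an assumed target date format rather than as a statement of what any particular study or standard requires.

```python
from datetime import datetime

# Hypothetical case report form records; None marks a missing data element.
records = [
    {"form": "demographics", "subject_id": "001", "birth_date": "03/15/1980", "sex": "F"},
    {"form": "demographics", "subject_id": "002", "birth_date": None, "sex": "M"},
    {"form": "vitals", "subject_id": "001", "visit_date": "11/02/2024", "systolic_bp": None},
]


def missing_by_form(records):
    """Count missing data elements per form, ignoring the form label itself."""
    counts = {}
    for record in records:
        missing = sum(1 for key, value in record.items() if key != "form" and value is None)
        counts[record["form"]] = counts.get(record["form"], 0) + missing
    return counts


def to_iso_date(us_date):
    """Convert an MM/DD/YYYY string to an ISO 8601 date string (YYYY-MM-DD)."""
    return datetime.strptime(us_date, "%m/%d/%Y").date().isoformat()


summary = missing_by_form(records)
iso_birth_date = to_iso_date(records[0]["birth_date"])
print(summary)
print(iso_birth_date)
```

In a classroom exercise, students could extend the summary to report completed versus expected forms, which is the other kind of table the paragraph describes.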

Biomedical science

Understanding the research questions that motivate a clinical study, the students will verify how the study protocol operationalizes the scientific objectives into a schedule of events and outlines the procedures needed to capture interventions and endpoints in a synthetic study flow diagram. It is also important, however, for the clinical data scientist to have some basis for understanding the health condition being studied, the diagnostic strategies used, and the avenues of intervention. The mechanism through which the health-related data will be collected, processed, and stored should be specified in a data management plan [29] and its corresponding manual of operations, as should the roles of key clinical research staff, who in turn will interface with the clinical data managers (Objectives 3, 6).

Regulatory science

The technical aspects of designing and deploying a clinical research study must be embedded in an overarching ethical/legal framework that reinforces the principles of respect for persons, beneficence, and distributive justice specified in the Belmont Report and further developed in the current body of rules and regulations governing research with human subjects [30]. Understanding these principles is indispensable for clinical data scientists, whose technical expertise must be put at the service of the ethical treatment of research participants. Thus, the research protocol must not only reflect the scientific objectives of the study but must align the study procedures and objectives in such a way that review boards can make an independent assessment of the risks and benefits to the participants. The work of the data management team must be inspired by the respect for the autonomy and privacy of the participants and must adhere to all applicable data confidentiality regulations (Objectives 1, 7).

Integration of principles and concepts from the four foundational areas of biostatistics, biomedical informatics, biomedical science, and regulatory science transforms a clinical data manager into a “clinical” data scientist [31, 32]. A clinical data scientist does not simply use the data as a tool but rather treats the clinical research data as a scientific object in its own right, worthy of study, and further scientific development. The clinical data scientist strengthens the precision of the data to achieve the objectives of a research study and enhances the usability of the data and its power to better serve clinical and translational science, in general [4].

Discussion

Executive education courses offer a new and potentially exciting option for filling the gap in continuing education opportunities for working professionals. Intensive short courses such as the one presented here can delve more deeply into selected domains and still put the information into a broader context. Executive education programs remain a vastly under-utilized form of continuing education fully capable of having an impact at both the individual and institutional levels [12]. This paper provides a template for an intensive short course on fundamentals of clinical data science that meets the needs of working professionals in academic, industry, and government research settings.

The most notable feature of this course is instruction in the core tenets of clinical data science – the measurement, acquisition, care, treatment, and inferencing of clinical research data [2]. The data are not simply the product of research but the basis for it. This is the material that distinguishes the content from other content received in other settings. It is the material that distinguishes this profession from all others. With respect to the core content noted previously, the following is a snapshot of the content covered during the morning sessions:

  • Overview of the field, focusing on the data as the basis for scientific hypotheses and the inferences that come from them: Module 1

  • Roles and responsibilities of today’s data scientists: Module 2

  • Design and implementation, from development of the protocol to database lock and closeout, and every step in between: Module 3

  • Simulation of field-based work: Module 4

  • Good Clinical Practice/Data Management Practice Guidelines: Module 5

All of the above are critical components of a clinical data scientist’s world and routines. Modules 1–5 will be the featured content in the mornings, respectively (see Figure 3).

Research courses

The field of clinical data science is little more than 50 years old. Yet, the field as it is known today actually evolved out of other related fields, initially at the intersection of statistics and computer science, but now with heavy influence from the biomedical and regulatory sciences. The fact that today’s professionals must now deal with complex issues not fully appreciated in years past requires that today’s clinical data scientists receive instruction in the tenets of clinical research ethics. Whereas core content will be featured in the mornings, the research courses will be featured in the afternoons:

  • Principles of biomedical informatics, the linking of computational tools and algorithms with biomedical information and data: Module 6

  • Principles of biostatistics, inferencing, and clinical trial design: Module 7

  • Principles of biomedical science, including that which directly relates to healthcare: Module 8

  • Federal regulatory guidelines and principles that shape biomedical research today, particularly those that pertain to the protection of research subjects: Module 9

  • Clinical research ethics, a topic that is often passed over for the more familiar and comfortable analytical content: Module 10

Supplemental courses

Recognizing that science is no longer practiced in a vacuum, scientists are now expected to work closely and collaboratively with investigators from multiple disciplines. Consequently, the closing hour of each morning and afternoon will be devoted to either a challenging case study reinforcing principles presented in that day’s session, or supplemental content on one of four soft skills deemed critical to the practice of clinical data science and advancement in the field: critical thinking, communication with colleagues, leadership fundamentals and, finally, team science skills (Modules 11 through 14), with challenging case studies designed to illustrate the challenges in today’s intricate and fast-paced world of biomedical research (Modules 15 through 19). Sessions featuring soft skills and challenging case studies are designed to be offered in alternating sessions (see Figure 3).

Benefits of short courses

For individuals whose training has become outdated, those trained in a different area of expertise, or even those whose supervisors simply want the staff member to have additional training, intensive short courses can provide the education needed for strengthening one’s professional practice. Whereas it is often difficult to devote time to a semester-long course that may be disruptive to a person’s life or work schedule, most colleagues understand dedicating several days to continuing education. And most will agree that 40 to 50 hours of formal instruction by university faculty is substantial enough to improve one’s knowledge base irrespective of level. Whether through supplemental readings, repetition of concepts, increased exposure time, or in-depth dialog between the students and instructors, intensive short courses are likely to produce returns on investment well beyond what employer- and learner-driven webinars can typically offer.

Not surprisingly, students are not the only ones who benefit from these short, intensive courses. Courses such as these can also have tremendous appeal for university faculty. Staffing the courses with experienced university faculty who are familiar with the content makes sense from an administrative standpoint. They know the material and the pitfalls – and, in many cases, have dealt with the barriers to practice. This is their world. What university faculty have not always encountered, however, are the problems of managers and employees in the trenches. Here faculty get to see the problems and pitfalls of practice through a different lens, that of seasoned administrators and employees, which can be both exciting and challenging for all involved. Because the courses are short, university faculty are often willing to invest their time in the area and with students who have an acute need to know. The faculty can then return to their normal schedules. In short, university faculty can often draw from both theory and practice to go deeper, more quickly, than others for whom the teaching is an add-on to their normal work routine.

In addition, universities often appreciate the inclusion of short courses in their portfolio of classes. These courses often bring non-traditional students to campus, expanding the exposure of the program and the university to new groups of potentially interested students. In addition, these courses have the potential to put university faculty and staff in direct contact with industry professionals, opening opportunities for other forms of collaboration, consultation, and engagement. And why not expand the portfolio of learning opportunities to these new groups of students? The best data available suggest that universities currently account for less than 1% of the executive education market – a market that universities are most qualified to compete in, and one for which there is ample room for growth in this profoundly important area of professional development [10, 33]. Intensive short courses and the opportunities for interaction mean that industry professionals also have access to new faculty with whom they can interact when needed.

As noted previously, faculty should strive to incorporate a number of different instructional models in their teaching. As working professionals know, professional education classes can often be long and boring. But lessons learned from the best elementary and secondary teachers suggest that active engagement is the best way to learn and retain information. For this reason, it is recommended that short course faculty balance traditional lectures with small group discussions and activity-based assignments, so that students can help shape the direction of the content and engage actively with the material. An example of an activity-based assignment is to create a data-flow diagram, data-collection form, example code, or output using public datasets and freely accessible tools. This blend will likely need to be adjusted frequently based on the faculty’s strengths and the needs of the content. Asking students to bring examples from their prior work experience into activities can be a highly valuable way of engaging them and of aligning the activities with individual learning goals.

Budgetary considerations

Education costs. More importantly, though, investments in education represent the values of a person, the community, and the profession. Intensive short courses such as the one presented here begin to make the content available to a much wider audience than ever before: working professionals from a broad range of backgrounds. The week-long commitment requires time away from family, work, and typical routines, but elevates the knowledge and skills of those who engage.

The course brings together and packages information that is not otherwise available to the professional community that can directly benefit from it. If the students leave the course able to achieve the objectives listed here and begin operating at levels substantially above their previous levels of performance, everyone will benefit – the student, the studies they are working on, the employees they supervise and work with, the organizations in which they work, and science more generally. Investment in education pays dividends well beyond the cost of the instruction. But education does indeed cost; the more essential the education, the greater the cost. This program is an example. Organizations committed to these processes and their outcomes may replicate this model, its purpose, and its scope.

The budget provided in Table 1 is generated from conference expense sheets designed to increase professional education in biomedical research at the authors’ institutions. We are now applying that model to clinical data science. The short course presented here proposes a budget of approximately $140,000, which not all organizations will be able, or even interested, to commit. It is not a formal prescription for a one-size-fits-all course but may be scaled up or down to meet the needs of the students, sponsors, and institutions involved. There are always ways to trim expenses and to scale the course to the budgets and resources available: using more local, less experienced instructors, offering the program in less expensive venues, and/or not providing snacks and meals. A piecemeal, fragmented system of webinars and short courses has been used with this segment of the workforce for decades; it is now time to offer them more. This is a profession that oversees increasingly complex biomedical data without any formal training. It is time to change that model and to offer these professionals training that is commensurate with their importance to the field.

No revenue projections are provided here; because institutions differ in what they are willing and able to offer, each will need to gauge its own rate of return. The intensive short course presented here is designed for industry-leading organizations and workforces willing to invest in training at the highest levels. For purposes of completeness, a registration fee of $3,500 for 40 students would cover the costs of the course described here, while a registration fee of $4,370 for the same 40 students would return approximately 25% above costs for the host institutions.
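The registration-fee arithmetic above can be checked with a short sketch. The $140,000 budget and the 40-student cohort come from the text; the function names are hypothetical, and reading the 25% figure as a markup over cost (rather than a margin on revenue) is an assumption made here for illustration. It also assumes every seat is filled and all costs are fixed.

```python
def fee_per_student(total_cost: float, students: int, markup: float = 0.0) -> float:
    """Registration fee that recovers total_cost plus a fractional markup over cost."""
    if students <= 0:
        raise ValueError("students must be a positive number")
    return total_cost * (1.0 + markup) / students


def implied_markup(fee: float, students: int, total_cost: float) -> float:
    """Fractional return over cost produced by a given fee at full enrollment."""
    return fee * students / total_cost - 1.0


# Figures taken from the text: approximate budget and cohort size.
TOTAL_COST = 140_000
STUDENTS = 40

break_even_fee = fee_per_student(TOTAL_COST, STUDENTS)       # cost recovery only
fee_with_25_pct = fee_per_student(TOTAL_COST, STUDENTS, 0.25)  # assumed 25% markup
markup_at_4370 = implied_markup(4_370, STUDENTS, TOTAL_COST)   # fee quoted in the text

print(f"Break-even fee:        ${break_even_fee:,.2f}")
print(f"Fee for 25% over cost: ${fee_with_25_pct:,.2f}")
print(f"Return at $4,370 fee:  {markup_at_4370:.1%}")
```

Under these assumptions, the break-even fee matches the $3,500 quoted above, and a 25% markup works out to $4,375 per student, so the $4,370 figure corresponds to a return just under 25%. An institution scaling the budget up or down could substitute its own cost and enrollment figures.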

Additional factors will also need to be considered, such as student versus practicing-professional rates and early-bird versus regular registration rates. Even drop-outs can affect one’s budget markedly once expenses have been committed, so establishing firm drop-out deadlines can provide a cushion should drop-outs come too late for others to register. In addition, host institutions will need to decide whether they wish to pursue sponsorship from educational and/or commercial organizations and whether they will want to offer continuing education credit. Such factors are usually institution-specific and were not factored into this model. As with all continuing education, not all organizations will be able to undertake, or interested in undertaking, this type of training for their community. This entire model can be scaled based on the needs of an organization.

Evaluation

Care should be taken to evaluate not only the delivery of information (pedagogy) but also the course’s timeliness and its reception by the students, to keep the material fresh and useful for the learners. The specifics of the evaluation tool should be up to the course organizers; however, one key component should be whether the students have accurately and effectively mastered the seven objectives presented in Figure 2. Other process-oriented questions on planning, such as content, meals, and overall costs, may be included as well. The evaluation plan should be designed a priori, before the program is launched, to help in determining its effectiveness and in planning for subsequent instruction [Reference Lockyer, Ward and Toews34].

Conclusion

The purpose of this paper was to provide a template for an intensive short course on fundamentals of clinical data science that meets the needs of working professionals in academic, industry, and government research settings. The 50-hour, 19-module course is divided into three sets of instructional modules: core content, supporting research content, and supplemental modules. Whereas the first set of modules is designed to instruct on the core content of the profession, the research modules are designed to convey information that serves as the basis for the new field, and the supplemental modules are designed to guide instruction on challenging case studies as well as many of the soft skills needed for practice as a clinical data scientist today.

Acknowledgement

The authors thank Ms. Karen Whyte, MSLIS, MA, of Cincinnati Children’s Hospital, for her insight and help with the literature search.

Author contributions

Richard F. Ittenbach: Conceptualization, Methodology, Project administration, Supervision, Writing – original draft, Writing – review and editing; Brian McCourt: Conceptualization, Investigation, Methodology, Writing – original draft, Writing – review and editing; Maurizio Macaluso: Conceptualization, Investigation, Methodology, Writing – original draft, Writing – review and editing.

Funding statement

This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Competing interests

The authors have no conflicts of interest to declare.

References

U.S. Department of Labor. Occupational outlook handbook. U.S. Government Printing Office. (https://www.bls.gov/ooh/fastest-growing.htm) Accessed February 11, 2025.
Ittenbach, RF. From clinical data management to clinical data science: time for a new educational model. Clin Transl Sci. 2023;16:1340–1351. doi: 10.1111/cts.13545.
Zerhouni, EA. The NIH roadmap. Sci. 2003;302:63–72.
Austin, CP. Opportunities and challenges in translational science. Clin Transl Sci. 2021;14:1629–1647.
Smith, Z, Bilke, R, Pretorius, S, Getz, K. Protocol design variables highly correlated with, and predictive of, clinical trial performance. Ther Innov Regul Sci. 2022;56:333–345.
Hooker, D, Johnson, J, Bokus, T, Silvester, R, Curtis, J. Clinical trial technologies. Charlotte, NC: Bourne Partners Market Research Report, 2025. https://bourne-partners.com/wp-content/uploads/2025/09/Bourne-Partners-Clinical-Trial-Technology-Report-Jan-2025.pdf. Accessed October 23, 2025.
Beller, E. Clinical data management: challenges in training. Health Info Manag. 1996;26:17–19.
Succar, T, Terteryan, A, Manrique, K, Pacifici, E. Optimizing clinical trial education in academia. Clin Res. 2024;38:6–15.
Weil, SA, Crumpler, A, Medendorp, SV, Weil, SA, Medendorp, SV. The implementation of a data learning series focused on clinical development teams in a contract research organization. J Soc Clin Data Manag. 2022;2:1–12. doi: 10.47912/jscdm.39.
Voller, S, Honoré, S. Innovation in Executive Development: A Case-based Study of Practice in International Business Schools. Ashridge, 2008.
Smith, MA, Keaveney, SM. A technical/strategic paradigm for online executive education. Decis Sci J Innov Edu. 2017;15:82–100.
Tushman, ML, O’Reilly, C, Fenollosa, A, Kleinbaum, AM, McGrath, D. Relevance and rigor: executive education as a lever in shaping practice and research. Acad Manag Learn Edu. 2007;6:345–362.
Nielson, FA, Bittencourt, JP, Presada, WA, Cavalcanti, C, Berardo, BM. Pedagogical Innovation: Best Practices Through The Perspectives Of Some Major Business Schools Around The World. https://uniconexed.org/wp-content/uploads/2022/02/UNICON-Research-Report-2019-Pedagogical-Innovation-Best-Practices.pdf. Published 2019. Accessed October 23, 2025.
Mena-Guacas, AF, Chacon, MF, Munar, AP, Ospina, M, Agudelo, M. Evolution of teaching in short-term courses: a systematic review. Heliyon. 2023;9(6):1–15.
Jessani, NS, Hendricks, L, Nicol, L, Young, T. University curricula in evidence-informed decision making and knowledge translation: integrating best practice, innovation, and experience for effective teaching and learning. Front Public Health. 2019;7:313.
Amdam, RP. Executive education and the managerial revolution: the birth of executive education at Harvard Business School. Bus Hist Rev. 2016;90:671–690.
Amdam, RPS. Executive Education. Oxford Research Encyclopedia of Business and Management. Oxford University Press, 2020.
Cataldo, P, Stilliard, B, Topping, P. ROI on Executive Education: Revisiting the Past and Looking to the Future. UNICON Research Report. UNICON, 2018.
Joint Task Force for Clinical Trial Competency. Domains and leveled core competencies. Multi-Regional Clinical Trials Center of Brigham and Women’s Hospital and Harvard. (https://mrctcenter.org/clinical-trial-competency/framework/domains/) Accessed March 6, 2022.
Valenta, AL, Berner, ES, Boren, SA, et al. AMIA Board White Paper: AMIA. 2017 core competencies for applied health informatics education at the master’s degree level. J Am Med Info Assoc. 2018;25:1657–1668.
Zozus, MN, Lazarov, A, Smith, LR, et al. Analysis of professional competencies for the clinical research data management profession: implications for training and professional certification. J Am Med Info Assoc. 2017;24:737–745.
Takata, M, Miyaji, T, Hayashi, Y, Sanada, S, Yamaguchi, T, Takata, M. Analysis of core competencies for the clinical data management profession in Japan. J Soc Clin Data Manag. 2024;4:1–9. doi: 10.47912/jscdm.355.
Sonstein, SA, Jones, CT. Joint task force for clinical trial competency and clinical research professional workforce development. Front Pharmacol. 2018;16(9):1148.
Read, KB. Adapting data management education to support clinical research projects in an academic medical center. J Med Libr Assoc: JMLA. 2019;107:89.
Hlebowitsh, PS. Designing the School Curriculum. Allyn & Bacon, 2005.
Banach, MA, Fendt, KH, Proeve, J, Plummer, D, Qureshi, S, Limaye, N. Clinical data management in the United States: Where we have been and where we are going. J Soc Clin Data Manag. 2021;1(3):1–6. doi: 10.47912/jscdm.61.
Meadows, M. Promoting safe and effective drugs for 100 years. FDA Consumer Magazine. 2006;40(1):14–20.
Unfried, A, Whitaker, D, Batackci, L, et al. The Big Picture: A Family of Instruments for Understanding University-Level Statistics and Data Science Attitudes. In: Peters, S, Zapata-Cardona, LBF, Fan, A, eds. 11th International Conference on Teaching Statistics (ICOTS 2022). Rosario, Argentina: National Science Foundation, 2022. https://doi.org/10.52041/iase.icots11.T8A1. Accessed October 23, 2025.
Lebedys, E, Famatiga-Fay, C, Bhatkar, P, Johnson, D, Viswanathan, G, Zozus, MN. Data management plan. J Soc Clin Data Manag. 2021;1:1–20. doi: 10.47912/jscdm.X.
Protection of human subjects; Belmont Report: notice of report for public comment. Fed Regist. 1979;44(76):23191–23197.
Wilkinson, MD, Dumontier, M, Aalbersberg, IJ, et al. The FAIR guiding principles for scientific data management and stewardship. Sci Data. 2016;3:1–9.
Wilkinson, M, Dumontier, M, Jan Aalbersberg, I, Appleton, G, Axton, M, Baak, A. Erratum: addendum: the FAIR guiding principles for scientific data management and stewardship (Scientific Data (2016) 3 (160018)). Sci Data. 2019;6:6.
Lloyd, FR, Newkirk, D. University-based executive education markets and trends. Unpublished manuscript. Retrieved from UNICON (University Consortium for Executive Education). 2011;166:2011. Accessed October 23, 2025.
Lockyer, J, Ward, R, Toews, J. Twelve tips for effective short course design. Med Teach. 2005;27:392–395.

Figure 1. Clinical data science components.

Figure 2. Short course learning objectives and evaluation.

Table 1. Example budget.

Figure 3. Example curriculum.