Policy Significance Statement
The known: The use of large language models (LLMs) has increased rapidly since the introduction of ChatGPT in November 2022. The new: Most survey respondents are aware of, if not using, LLMs in their work across our hospital, research, and university campus. Diverse uses were reported, including generating or editing text and exploring ideas. There were varying attitudes towards LLMs. Perceived risks included privacy and security risks. A key perceived opportunity was increased efficiency. The implications: LLM tools are already widely used on our campus, highlighting the need for guidelines and governance to keep up with practice.
1. Introduction
Large language models (LLMs) are computational models trained on huge amounts of text to recognize and mimic nuanced patterns in human language, allowing them to receive multimodal prompts and generate responses perceived to approximate human performance on many tasks (Bender and Koller, 2020; Naveed et al., 2024). The performance, public availability, and awareness of LLMs increased dramatically after the introduction of ChatGPT in November 2022 (Naveed et al., 2024), leading to extensive use in professional and learning settings. This has created unprecedented challenges in ensuring LLMs are used in responsible and ethical ways (e.g. Kasneci et al., 2023; Sahoo et al., 2024), accentuated by their typically opaque nature and frequently undocumented use in workplaces (Barman et al., 2024). The use of these tools presents opportunities for efficiencies and innovations in administrative, clinical, and research settings. However, clinical, research, and education applications of LLMs have raised concerns, including confidentiality and privacy, inaccuracies, ‘hallucinations’ (generation of content that is not real), medical liability, biases embedded within models, and a lack of accountability or transparency in how LLMs make decisions (Abràmoff et al., 2023; Chin et al., 2023; Zaretsky et al., 2024).
The Melbourne Children’s Campus is a fully integrated paediatric academic hospital and research institute. While anecdotally it appeared LLMs were used frequently throughout the Campus in clinical, research, and teaching contexts, we did not previously have evidence of how prevalent this use was, in which contexts and for what purposes LLMs were used, or what attitudes staff and students held towards them. Such evidence would help support better governance of LLM and broader Generative Artificial Intelligence (GenAI) use and integration across our Campus. Expert viewpoints and reviews have been reported on LLM use and opportunities for medical education (Gray et al., 2022; Abd-alrazaq et al., 2023; Karabacak and Margetis, 2023; Safranek et al., 2023; Lucas et al., 2024), drug discovery (Huang et al., 2024; Rovenchak and Druchok, 2024), and higher degree research (Butson and Spronken-Smith, 2024; Sarangi et al., 2025), as well as other clinical Artificial Intelligence (AI) applications (Scott et al., 2021). However, there is sparse evidence regarding the use of LLMs, user perspectives, and opportunities in the context of an integrated specialist paediatric academic hospital and research campus.
1.1. Objective
We aimed to summarise the current uses of and attitudes towards LLMs across the clinical, research, and teaching contexts of our Campus.
2. Methods
2.1. Context
Our Campus spans a paediatric hospital, children’s medical research institute, and paediatric university department in a metropolitan, Australian city. The paediatric hospital is a quaternary hospital with over 6000 staff and 350 beds, servicing the entire state in which it is located. The medical research institute employs over 1800 researchers working across 150 diseases and conditions affecting children. The university paediatric department comprises over 70 academic staff, 15 professional staff, 150 graduate research students, and over 400 Honorary staff members. We estimate there are approximately 7500 staff and students across the Campus, although many have joint affiliations across the three partners.
2.2. Qualitative approach
Based on our objective of generating insights from participants across the Campus, we followed a systematic grounded theory approach (Glaser and Strauss, 1967), which emphasises the development of concepts and themes directly from the data rather than imposing predetermined frameworks. Grounded theory is particularly suited to exploratory work such as ours, where the goal is to understand emerging attitudes, behaviours, and concerns around the use of large language models (LLMs) in clinical, research, and teaching contexts. We circulated a survey amongst all staff and students across our Campus to gather information about current uses, expected future uses, opportunities, and risks of LLM use in clinical, research, and teaching and learning contexts. The survey is available in Supplementary Material 1. As most questions on LLM use and attitudes were open-ended, we followed the Standards for Reporting Qualitative Research (SRQR, see Supplementary Material 2) (O’Brien et al., 2014).
The researchers involved in the survey design, data collection, and data analysis formed a Working Group assembled to assess the current state of LLM use in academic child health and to identify opportunities and challenges for the future. The group’s positions and expertise are broad, spanning early-career and senior researchers, clinicians, informaticians and data scientists, and leadership, as well as child clinical care consumer representatives.
2.3. Participants
All staff and students across our Campus were eligible to participate, spanning clinicians, health and medical researchers, postgraduate researchers, data scientists, and administrative and support staff. This project received quality assurance approval from the Royal Children’s Hospital Research Ethics & Governance Office (100638).
2.4. Survey design
We developed a survey based on an iterative discussion between members of the Working Group. MS drafted survey items, which were then piloted with the Working Group and refined until the group was satisfied. During this process, we referred to a survey of GenAI use in commercial entities (Writer, 2024).
2.5. Data collection methods
We created a web-based survey using LimeSurvey (https://www.limesurvey.org/). We distributed the survey via various mailing lists across the Campus, estimated to have reached about 1000 clinical staff, 1800 researchers, and 100 students. The survey was open for a period of 4 weeks (28th August to 22nd September 2023).
2.6. Data processing, coding, and thematic analysis
Data were deidentified and aggregated, and distinctive written text was paraphrased to avoid identification. We performed descriptive statistical analysis of the quantitative data using Microsoft Excel; we did not perform inferential statistics.
We analysed free text responses using inductive analysis in NVivo 14 (Lumivero, 2023). Coding of the data, a systematic process of labelling and categorising to generate themes, was shared evenly between GLD and LG. First, GLD coded a segment of the data and established initial codes and definitions. The remaining data were single-coded by GLD or LG, who added new codes and definitions as needed. The definitions of codes were refined by discussion. After coding, GLD conducted an inductive thematic analysis of the codes using grounded theory. The resulting themes were reviewed and approved by LG and NP.
To establish a timely overview of current uses and perceptions of LLM use in our campus, we deemed it sufficient to single-code the data.
3. Results
3.1. Summary statistics
We received 281 survey responses. Most respondents had an affiliation with the hospital (n = 174), many with the research institute (n = 131), and fewer with the university department (n = 85); many respondents have more than one affiliation, so these numbers exceed the total number of respondents. Over half reported having a clinical role (n = 158), half a research role (n = 140), about one quarter a teaching role (n = 75), and a tenth a role in data services (n = 27); again, respondents can work in multiple capacities. This distribution may be related to the employment structure on our campus, where the research institute employs more staff than the university department. It may also be an indicator of the relatively greater interest and activity around AI in healthcare (Xie et al., 2025) compared to education (Durak et al., 2024), as suggested by recent trends in publications.
Respondents were asked to report what best described their level of experience. Most identified as “senior” (senior, fully qualified, post-doctoral, senior data analyst, or developer; n = 201) and fewer as “junior” (junior, trainee, student, junior data analyst, or developer; n = 56).
3.2. Quantitative results
Respondents were asked about specific LLM tools that were gaining traction at the time; as a result, LLMs that were prominent at the time of publication, such as Anthropic’s Claude, were not included. Over 90% of respondents were familiar with LLM tools, most commonly ChatGPT. Regarding use, 64% reported current or previous use of an LLM tool in their work. Figure 1 shows respondents’ reported familiarity with and use of specific LLM tools. The most widely used tool by far was ChatGPT (GPT3, 63%; GPT4, 19%). There was also some use of Grammarly (24%) and Microsoft Bing Chat (7%), and very small numbers reported using any other tools. In terms of familiarity, some respondents reported having heard of but never used ChatGPT (GPT3, 28%; GPT4, 16%) or Grammarly (40%). Some had heard of but never used Microsoft Bing Chat (23%), Google Bard (15%), or GitHub Copilot (10%). Very small numbers reported familiarity with any other tools.

Figure 1. Reported familiarity with and use of LLMs. Green represents use of the LLM tool, yellow familiarity with but no use of the tool, and orange no familiarity with the tool.
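For readers who wish to recreate a chart of this kind, the following is a minimal illustrative sketch in Python with matplotlib (not part of the study’s own analysis pipeline, which used LimeSurvey and Microsoft Excel). It rebuilds a Figure 1-style stacked bar chart from the percentages quoted in Section 3.2, limited to the tools with figures reported in the text; the "not familiar" shares are assumed to be the remainder after the reported use and heard-of shares.

```python
# Illustrative only: reconstruct a Figure 1-style stacked bar chart from the
# percentages reported in the text. The "not familiar" share per tool is assumed
# to be whatever remains after the "used" and "heard of, never used" shares.
import matplotlib.pyplot as plt

tools = ["ChatGPT (GPT3)", "ChatGPT (GPT4)", "Grammarly", "Microsoft Bing Chat"]
used = [63, 19, 24, 7]          # % reporting current or previous use
heard_only = [28, 16, 40, 23]   # % who have heard of but never used the tool
unfamiliar = [100 - u - h for u, h in zip(used, heard_only)]  # assumed remainder

fig, ax = plt.subplots(figsize=(8, 4))
ax.barh(tools, used, color="green", label="Used")
ax.barh(tools, heard_only, left=used, color="yellow", label="Heard of, never used")
left_sum = [u + h for u, h in zip(used, heard_only)]
ax.barh(tools, unfamiliar, left=left_sum, color="orange", label="Not familiar")
ax.set_xlabel("Respondents (%)")
ax.invert_yaxis()  # list the most widely used tool at the top
ax.legend(loc="lower right")
plt.tight_layout()
plt.show()
```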
3.3. Qualitative synthesis
The responses suggest a wide array of current LLM uses in clinical and research settings. Respondents were generally well aware of the potential benefits of LLM use, but also of a range of possible risks.
3.3.1. Current and future uses
A substantial number of respondents reported using LLMs for generating content, e.g. generating or editing text. While it is possible to distinguish generating text from editing it, during the thematic analysis GLD took the perspective that editing is itself a form of generation, in that LLMs generate corrections (Schmaltz et al., 2016); this interpretation was adopted by LG and NP. Other uses included knowledge support (e.g. exploring ideas, identifying areas of improvement, or helping write programming code), data processing (e.g. data analysis, extraction, management, or interpretation), and using LLMs as an information source or to find information sources. Relatively few reported using LLMs for technical Natural Language Processing research or development, such as text classification, or for administrative tasks. Several respondents also reported no current use of LLMs, as reflected in the quantitative results.
Respondents frequently identified potential future uses related to generating text, matching current uses of LLMs. Respondents also frequently foresaw that LLMs could be used in the future for clinical decision support, research activities such as data analysis or computer programming, and teaching activities such as generating learning materials, grading, reviewing, and developing human skills.
Table 1 presents example quotes for the themes described above, selected through our grounded theory approach to reflect the key conceptual categories identified.
Table 1. Example quotes from key themes of current and future uses of LLMs

3.3.2. Opportunities and risks
The primary opportunity identified by respondents was efficiency, namely saving time or requiring fewer resources; see Table 2 for example quotes for the themes we describe. Some respondents saw opportunities to expand human skills using LLMs. Some reported that using LLMs could result in higher quality and more accurate outputs, although there were more reports of the risk of lower quality and accuracy.
Table 2. Example quotes from key themes of opportunities and risks of LLMs

The main risks identified related to privacy, intellectual property (IP), security, integrity, and public trust, particularly among clinical and research respondents. Low accuracy was another common risk reported by respondents across all three affiliations. Many teaching and learning respondents (i.e. educators or students) indicated concern that LLM outputs are likely to be of lower quality.
Other risk themes were identified less frequently, including the risk of misinterpreting or uncritically using LLM output, integrity risks, low transparency and oversight of LLM use, and the loss of human skills when relying on LLMs.
4. Discussion
4.1. Main findings
We found that most respondents have heard of LLM tools and about two-thirds have used them in their work. Respondents reported a variety of uses, including generating or editing text, knowledge support, data processing, and using LLMs as a source of information. Many respondents identified increased efficiency as a major opportunity of LLM use. Many also seemed aware of, and realistic about, the limitations and potential risks of LLMs. Privacy, IP, and security concerns were the most frequently cited risks of LLM use, along with the lower accuracy and quality of LLM-generated outputs. In contrast, some respondents’ comments suggest they may not be aware of these risks, particularly the risks of entering private information into LLMs or of relying on LLMs as an information source when they have not been verified as reliable for this purpose (e.g. see example quotes in Table 1). This suggests the need for clearer and more accessible usage guidelines and training. A few themes arose both as an opportunity and as a risk (e.g. higher or lower accuracy and quality, or the increase or decrease of human skills), highlighting that opinions differ on LLM capabilities and potential.
Many of the respondents’ attitudes echo issues reported in the literature around LLM and other AI use in medical education and higher degree research. Key opportunities identified in a systematic review of LLM use in medical education included creating courses and assessments, using LLMs as a writing tool, and allowing greater access to published knowledge and research (Lucas et al., 2024). These correspond to the many reports of current LLM use in the themes of content generation and knowledge support. The key challenges the authors of that review identified were ethical, legal, and privacy issues, academic integrity, incorrect responses, and overreliance. These correspond to the risks reported by our respondents around privacy, IP, security, integrity and public trust, accuracy, and human skills. Other viewpoints and editorial papers on LLM use in medical research have expressed similar attitudes on opportunities and risks (e.g. Abd-alrazaq et al., 2023; Safranek et al., 2023). A critical dialogue between two academics in higher education also highlights the tension between using these tools to improve efficiency whilst retaining academic integrity (Butson and Spronken-Smith, 2024). A review of studies reporting stakeholder attitudes towards AI applications in clinical practice found clinicians’ attitudes similar to those in our study, including the potential for greater efficiency, higher or lower accuracy, and privacy breaches (Scott et al., 2021).
4.2. Implications
Our findings show that LLM tools are already widely used on our campus, and for a range of purposes. We were concerned by some of the uses identified by respondents that implied potential confidentiality and privacy breaches, which would violate the Victorian Department of Health’s advice on the use of unregulated AI in health services (Victorian Department of Health, 2023). Certain responses also implied that some respondents may use LLMs as an information source, even though they are not fully reliable for this purpose. This only strengthens the pressing need to develop agile policies and relevant training that evolve with emerging use cases and the risks that result. Insights from this survey were therefore used by members of our Working Group to draft recommendations for the use of LLMs, and GenAI more broadly, on our campus (see Supplementary Material 3).
We consider that banning the use of LLM tools in certain domains is unlikely to work in practice. These tools are already in significant use and users are experiencing obvious benefits, including through offerings from major electronic medical record providers in Australia (Epic Systems Corporation, 2024). There is no practical way to limit access, and the very same tools are publicly available to patients and their families.
It is also important to recognise the enthusiasm and interest staff have around the benefit of LLMs, and encourage them to explore the technology, as long as this is done with transparency for the organisations and with appropriate governance informed by local and external policies.
To ensure LLM use leads to higher accuracy and quality of outputs, and to the augmentation rather than erosion of human skills, users could be provided with training and resources. These should include guidance on effective LLM use, such as prompt engineering strategies and critical appraisal of LLM output, as well as guidance on the ethical, responsible, and context-aware use of LLMs, particularly where outputs influence patient care, research quality, or educational integrity.
The draft recommendations our Working Group developed based on insights from this survey (Supplementary Material 3) highlight key domains for safe and effective integration of LLMs. These include the formation of dynamic governance structures that match the rapid evolution of the technology and the shifting regulatory landscape, coupled with long-term oversight of LLM projects and uses. Staff use of LLMs should be visible to the organisation, including ICT and digital teams, to ensure alignment with cybersecurity and infrastructure considerations. Clear boundaries must be in place around privacy when interacting with publicly accessible models, particularly around the handling of sensitive or proprietary data. Additionally, it should be emphasised that while AI-generated content may serve as a beneficial starting point, it is meant to support human insight, not replace it; outputs must therefore be reviewed, verified, and interpreted by qualified individuals. Staff remain fully responsible for the accuracy and impact of GenAI outputs, especially when these affect patient or consumer safety.
Beyond contributing to the formulation of internal guidelines and policies, it is hoped that the findings of our survey will support the federal and state governments’ work in this area, such as the 2024 Safe and Responsible AI in Australia consultation (Department of Industry, Science and Resources, 2023). Here, the government’s recommended ‘risk-based approach’ to regulation requires lawmakers to identify low- and high-risk AI applications in various sectors. Thus, the spectrum of activities identified in our survey can provide insights into the nature and extent of LLM use in the health sector. Given the rapid evolution of AI tools, we suspect that developing generalisable principles for responsible AI use may be more sustainable than overly specific rules.
4.3. Transferability
We expect the exact uses of LLM tools to differ across institutions, but the results of the current study may reflect general patterns in academic, clinical, and tertiary education settings. Other institutions may repeat our methods to better understand LLM use in their own context and should develop guidelines relevant to that context.
4.4. Limitations
Our survey was completed voluntarily across our Campus, so the quantitative results represent the LLM use of respondents, not of the entire Campus. Our relatively low response rate limits the generalisability of our quantitative findings. While a larger sample size would have strengthened the representativeness of our results, the insights captured still offer valuable perspectives on current patterns of LLM use. Our qualitative results provide a summary of uses of and attitudes towards LLMs on our campus, though they should be interpreted as exploratory. Our survey responses also reflect a snapshot in time (August–September 2023) of LLM uses and attitudes, which are expected to change as the LLM landscape develops rapidly. However, the results provide us with a baseline understanding of LLM use and attitudes in our context and will allow us to track changes over time.
5. Conclusion
We surveyed the current uses of and attitudes towards large language models on our campus across clinical, research, and teaching contexts. Most respondents have heard of LLMs, and around two-thirds have already used them in their work, primarily for content generation, knowledge support, data processing, and information sourcing. Many saw opportunities for increased efficiency and improved access to knowledge; however, respondents also expressed concerns around accuracy, data privacy, and the potential erosion of human expertise and skills. This highlights the need for governance to keep pace with practice. Our Working Group has used the insights gained through this survey to develop recommendations for the use of GenAI and LLMs on our campus, covering key areas such as privacy, transparency, human oversight, organisational visibility, and education. These recommendations aim to support safe, context-aware, and ethically sound use of LLMs, and may offer a useful framework for other institutions navigating similar challenges.
Abbreviations
- AI: Artificial Intelligence
- GenAI: Generative Artificial Intelligence
- IP: intellectual property
- LLM: Large Language Model
- SRQR: Standards for Reporting Qualitative Research
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/dap.2025.10044.
Data availability statement
The data that support the findings of this study are available upon reasonable request from the corresponding author, LG. The data are not publicly available in order to protect the privacy of the participants.
Acknowledgements
We thank all the respondents to our survey for their time and contribution. For insightful contributions, we thank the other members of the Melbourne Children’s Working Group on Generative Artificial Intelligence in Child Health.
Author contribution
Conceptualization-Equal: L.G., N.P., D.C., R.R., J.B., J.H., S.R., C.Q., N.S., M.W., M.S., G.D.; Data curation-Lead: G.D.; Data curation-Supporting: L.G.; Formal analysis-Equal: M.S., G.D.; Investigation-Lead: M.S.; Methodology-Equal: M.S., G.D.; Methodology-Supporting: L.G.; Visualization-Equal: M.S., G.D.; Writing – Original Draft-Lead: L.G.; Writing – Review & Editing-Equal: N.P., D.C., R.R., J.B., J.H., S.R., C.Q., N.S., M.W., M.S., G.D.; Writing – Review & Editing-Lead: L.G.
Funding statement
LG was supported by an Australian Government Research Training Program (RTP) Scholarship and MCRI PhD Top Up Scholarship. Research at the Murdoch Children’s Research Institute (MCRI) was supported by the Victorian Government’s Operational Infrastructure Support Program. The funding organisations are independent of all researchers and were not involved in any of the study design, the collection, analysis, and interpretation of data, the writing of the report or the decision to submit the manuscript for publication.
Competing interests
The authors declare none.
Ethics statement
This quality improvement project received quality assurance approval from the Royal Children’s Hospital Research Ethics & Governance Office (100638).