Evaluation of automated anthropometrics produced by smartphone-based machine learning: a comparison with traditional anthropometric assessments

Automated visual anthropometrics produced by mobile applications are accessible and cost effective with the potential to assess clinically relevant anthropometrics without a trained technician present. Thus, the aim of this study was to evaluate the precision and agreement of smartphone-based automated anthropometrics against reference tape measurements. Waist and hip circumference (WC; HC), waist:hip ratio (WHR) and waist:height ratio (W:HT) were collected from 115 participants (69 F) using a tape measure and two smartphone applications (MeThreeSixty®, myBVI®) across multiple smartphone types. Precision metrics were used to assess test-retest precision of the automated measures. Agreement between the circumferences produced by each mobile application and the reference were assessed using equivalence testing and other validity metrics. All mobile applications across smartphone types produced reliable estimates for each variable with intraclass correlation coefficients ≥ 0·93 (all P < 0·001) and root mean square coefficient of variation between 0·5 and 2·5 %. Precision error for WC and HC was between 0·5 and 1·9 cm. WC, HC, and W:HT estimates produced by each mobile application demonstrated equivalence with the reference tape measurements using 5 % equivalence regions. Mean differences via paired t-tests were significant for all variables across each mobile application (all P < 0·050) showing slight underestimation for WC and slight overestimation for HC which resulted in a lack of equivalence for WHR compared with the reference tape measure. Overall, the results of our study support the use of WC and HC estimates produced from automated mobile applications, but also demonstrates the importance of accurate automation for WC and HC estimates given their influence on other anthropometric assessments and clinical health markers.

Obesity is rapidly increasing with disparately high rates for individuals who are of low socio-economic status (1) or living in rural communities (2,3) . Specifically, both low socio-economic status (4,5) and rural occupancy (6) are associated with higher rates of abdominal obesity, which is itself linked to a higher risk of cardiometabolic abnormalities (7) . Traditionally, abdominal obesity is evaluated by standard anthropometric assessments that include circumferences of the waist -and -hips specifically, given their association with cardiometabolic health risks (8,9) and mortality (10) . Nevertheless, traditional anthropometric measures lack feasibility for those without access to clinical care, which is concerning given that there are few alternative methods that can successfully provide remote and cost-effective assessments without a trained technician present. Therefore, the development of remote healthcare tools that can provide accurate anthropometric assessments without additional costs are of critical importance.
Interestingly, the adaptations made to traditional healthcare models during the COVID-19 pandemic may have, unintentionally, provided a potential solution to this pressing issue. At the onset of the pandemic, healthcare facilities were forced to find remote alternatives to providing clinical care. Given that the majority of USA adults own a smartphone (11) with a similar ownership existing across more vulnerable populations (12,13) , the demand for remote healthcare solutions led to the swift integration of mobile healthcare models that enable patients with access to care regardless of limited transportation or geographical location. In fact, recent evidence demonstrates the feasibility of mobile healthcare interventions in a rural setting showing reductions in waist circumference (WC) and visceral adiposity with this approach (14) . Additionally, and contrary to the observations made in a traditional healthcare setting (15) , intention to use and satisfaction with mobile health services is inversely associated with perceived health status which notes its utility for individuals of poor health. However, increased access to communication with healthcare providers is only one facet of the total healthcare experience. As such, the surge in mobile healthcare use coupled with the accelerated increase in obesity requires parallel advancements in remote health tools, such as automated anthropometrics, that can conveniently assess patient health status without additional costs.
Because virtually all smartphones can now employ highresolution imaging through advances in smartphone camera technology, newly developed mobile applications can now leverage machine learning to automate anthropometric assessments from as little as two self-taken images. Furthermore, this method has recently shown to be associated with traditional measurement methods (16) , agree with multicompartment body composition estimation models (17) and demonstrates higher predictive ability of visceral adipose tissue compared with physical circumferences and other measures (18) . Thus, this methodological approach may be a potential solution to collecting large-scale anthropometric information and improve access to health information for those with monetary or geographic limitations. However, the consumer level and accessible nature of this method fosters continual industry competition and thus there are several mobile applications that claim to produce accurate automated anthropometric evaluations from smartphone-based imaging. As such, there are currently no studies, to our knowledge, that have fully assessed the equivalence of clinically significant measures of abdominal adiposity such as WC, hip circumference (HC), w:hip ratio (WHR) and waist:height ratio (W:HT) produced by multiple mobile applications across smartphone types to a reference measurement. Therefore, the purpose of this study was to determine the agreement and precision of automated anthropometric assessments produced by threedimensional (3D) mobile scanning applications compared with a reference tape measure.

Participants
A total of 115 individuals (F: 69, M: 46) between ages 18 and 75 years were prospectively recruited for this cross-sectional study. Participants were excluded if they were younger than 18 or older than 75; were missing any limbs or part of a limb that influenced an accurate assessment of the primary anthropometric measures; were pregnant; trying to become pregnant or breast-feeding or lactating. The study took place from March 2022 through July 2022 and was conducted according to the guidelines laid down in the Declaration of Helsinki, with all procedures involving human participants approved by the university ethics committee (IRB#21-213). Written informed consent was obtained from all participants.

Procedures
Our procedures for visual body composition scanning have been previously reported (19) but are summarised below. Participants reported to the laboratory after abstention from food, beverages, supplements/medication and exercise for ≥ 8 h. Upon arrival participants were asked to remove any external accessories (jewelry, shoes, etc.) and/or loose clothing and underwent measurements of height collected by a digital stadiometer (SECA, Hamburg, Germany), weight collected by a calibrated digital scale (SECA, Hamburg, Germany) and WC and HC collected using an aluminum tape measure. Following tape measurements, participants were lead to a specific area of the laboratory to complete the smartphone-based assessments. For scanning on each mobile application, participants were instructed to wear minimal form-fitting clothing. For example, female participants were instructed to wear a sports bra and tight-fitting shorts/leggings, and male participants were instructed to wear compression shorts/tights only. Higher waisted shorts that covered the participants bellybutton were altered to expose the participants entire abdominal region to the smartphone camera. Participants with long hair were instructed to tie their hair up so that no hair was present below the shoulder line.

Reference tape measurements
Traditional WC and HC were collected by an aluminum tape measure, and these measurements were used as, or used to calculate, all reference variables for this study. Because there are no standardised measurement sites for WC and HC across mobile applications, estimates of WC and HC from these applications could be generated from different locations at or around the waist and hip regions. Therefore, the reference tape measurements were standardised to specific locations, where the reference WC was measured at the level of the iliac crest (20) and the reference HC was measured at the widest portion of the lateral hips (9) given the ease in the visual detection of pronounced or distinct body areas during the landmarking procedures of smartphone-based imaging. In addition, the American Heart Association waist circumference risk classifications are based off measurements collected at the level of the iliac crest (21) . All tape measurements were conducted by the same two investigators for all participants. WC and HC were used to produce a WHR by dividing WC by HC and measurements of W:HT by dividing WC by height collected from the digital stadiometer. For WHR measurements, the first WC was divided by the first HC and the second WC by the second HC. Because height was measured at a single timepoint, both the first and second WC were divided by the same height measurement. All measurements were conducted in duplicate and averaged to produce a final estimate.

Mobile applications and smartphone types
Our procedures for each smartphone application and type have been described elsewhere (19) . Two mobile applications were used for this study which included MeThreeSixty ® (ME360; Size Stream LLC) and myBVI ® (Select Research LTD). Because mobile applications are frequently updated under real-world circumstances, and because these updates are often necessary to fix unavoidable issues with the application's performance, applications were updated daily prior to testing, when available. The study began with software version 3.3.0 (ME360 Apple) , 3.2.2 (ME360 Samsung) and 3.0.0 (myBVI ® ) and ended using versions 3.4.2 (both ME360) and 3.1.2 (myBVI ® ), respectively. Further, two different smartphones were used for this study to compare the precision and agreement of the applications across smartphone types. The smartphones used in this study included an iPhone ® 12 Pro (Apple ® Inc.) and a Samsung Galaxy ® S21þ (Samsung ® Group). Assessments using the iPhone ® were conducted using the same software version for the entirety of the study (iOS 15·0·1) but due to Samsung's ® forced security updates, multiple software versions were employed for this smartphone specifically (One UI version 3·1, 4·0 and 4·1 and Android ® version 11 and 12). All images for ME360 were collected using the front facing camera from both smartphones, whereas images from myBVI ® were collected using the front facing camera from the iPhone ® only due to compatibility issues between myBVI ® and Samsung Galaxy ® .

Anthropometric assessment protocol for the mobile applications
Our procedures for collecting smartphone-based body composition estimates have been previously reported (19) . To perform the assessments for each mobile application, participants were taken to a designated area of the laboratory that did not have any objects or light behind the participants' back. All images were taken in front of a gray vinyl wall in this designated area, and all external windows were covered so that no other background or external light source polluted the scanning region. The smartphone was positioned at a standardised distance from the participant's mid-foot unless instructed otherwise by the application and a standardised height for all participants using a stationary tripod with adjustable angle settings. The smartphone order was assigned randomly for each participant. Each participant's personal information (age, sex, height and weight) was uploaded into the application before testing commenced. The smartphone was locked into place at an angle determined appropriate by the mobile application. Once the smartphone and the participant were situated appropriately, participants were asked to stand in two positions, in accordance with the manufacturer guidelines, while images were collected. For the first image, participants faced the camera and stood with arms and feet positioned away from the torso. For the second image, participants were instructed to turn to their profile with either their left (ME360) or right (myBVI ® ) shoulder facing the camera, face forward, extend the elbows completely and place their hands against their lateral thigh while images were collected. All assessments were conducted in duplicate and subjectively inspected for quality to ensure that there were no errors during landmarking procedures. WC and HC from ME360 were provided directly by the mobile application. WHR ME360 was calculated as WC ME360 divided by HC ME360 . Because myBVI ® provides WHR and W:HT, but not WC or HC, WC myBVI was calculated as height from the stadiometer multiplied by W: HT myBVI . The newly produced WC myBVI was divided by WHR myBVI to produce HC myBVI . Because myBVI ® requires the user to round their height, and because ME360 W:HT used height measured by stadiometer, we calculated W:HT myBVI by dividing WC myBVI from the stadiometer height. For WHR measurements from each application, the first automated WC was divided by the first automated HC and the same was done for the second scans. Similar to the reference method, both the first and second automated WC were divided by the same height measurement. All measurements were conducted in duplicate and averaged to produce a final estimate.

Statistical analysis
We conducted a non-directional power analysis to determine the sample size necessary to detect significant differences using a paired-samples t test (the primary statistical analysis for determining group mean differences (MD)). Prior to analysis, we determined ±4·0 cm to be a meaningful MD and thus using a MD of 4·0 ± 6·0 and an α = 0·05 it was determined that twenty participants were necessary to observe at least 80 % power. All outcome variables were normally distributed as assessed by Shapiro-Wilk and visual inspection of Q-Q plots. Means and 95 % confidence intervals (95 %CI) for each device were calculated for WC, HC, WHR and W:HTand the MD and 95 %CI for each variable were calculated as the mobile application in question minus the reference. Test-retest precision of each anthropometric assessment was assessed using intraclass correlation coefficients (ICC) with two-way, random effects and absolute agreement. Precision was also measured using precision error (PE) and root mean square coefficient of variation (RMS-% CV). For ME360, precision metrics were used to determine precision between smartphone types using the first scan from each smartphone. Because WHR and W:HT are unitless, PE was not calculated for these variables. A device error resulted in one participant (n 1) missing a single scan from one smartphone (Samsung ® ) for ME360. Therefore, 114 participants were used to determine precision, and 115 participants were used to assess agreement. An average of the two scans for each mobile application was used to determine agreement, and only one scan was used for the participant with a missing scan. Equivalence testing was used to determine equivalence with the reference tape measurements for WC, HC, WHR and W:HT using 5 % equivalence regions. Additionally, and because WC is often used to assess abdominal obesity for the evaluation of cardiometabolic health risk, the percentage of correct abdominal obesity classifications according to the guidelines put forth by the American Heart Association (≥ 88 cm for females; ≥ 102 cm for males) are presented for each mobile assessment (21) . Agreement with the reference method was also assessed by separate paired samples t test, Pearson correlation coefficients, root mean square error (RMSE = p ∑(predicted-actual) 2 /n) and standard error of the estimate. Individual accuracy was assessed using the methods of Bland and Altman (22) to determine the 95 % limits of agreement (LOA), and regression techniques were used to determine proportional biases. Subgroup analyses were conducted for sex and racial differences. For race, analyses were conducted for non-Hispanic white and non-Hispanic Black/African-American (B/AA) individuals (Table 1). Data from other racial and ethnic groups were included in the complete sample analyses only.
Smartphone-based automated anthropometrics 1079 Anthropometric differences between groups were assessed by independent samples t test. Statistical significance was accepted at P < 0·05. Data were analysed using the TOSTER package (23) in R version 4.1.2, IBM SPSS version 27 and Microsoft Excel version 16.

Results
Participant characteristics are reported as mean and the 95 %CI or as a percentage of the total for each column (Table 1).

Precision analysis
Results of the precision analysis are presented in Table 2. For precision, all ICC, including assessments between smartphone types, ranged from 0·933 to 0·998 (all P < 0·001). For WC and HC, ME360 Apple produced the lowest PE and RMS-%CV followed by tape measurements and ME360 Samsung . myBVI ® had the largest PE and RMS-%CV for both WC and HC across applications.

Agreement analysis
Results of the agreement analysis for the total sample are presented in Table 3. Overall, the number of correct abdominal obesity classifications for each mobile application was 103 (89·6 %) for ME360 Apple , 104 (90·4 %) for ME360 Samsung and 106 (92·2 %) for myBVI ® . Of the incorrect classifications, both ME360 Apple (twelve incorrect) and myBVI ® (nine incorrect) incorrectly classified participants as having abdominal obesity on three occasions with the remaining incorrectly classifying participants as not having abdominal obesity. ME360 Samsung incorrectly classified participants (eleven incorrect) as having abdominal obesity on five occasions with the remaining incorrectly classifying participants as having abdominal obesity. Paired t tests revealed that all variables produced by the mobile applications differed significantly from the reference tape measurement for the total sample. WC was slightly underestimated by each mobile application (all MD: ≤ −2·5 cm, Age AA, African-American. * Statistically significant at P < 0·001. † Statistically significant at P < 0·010. ‡ Statistically significant at P < 0·050.
P < 0·050) relative to the reference method for each application. Conversely, HC was slightly overestimated across each mobile application relative to the reference method with myBVI ® demonstrating the lowest MD (1·9 cm) followed by ME360 with negligible differences between smartphone types (all P < 0·050). MD for WHR and W:HT were similar across devices. Despite significantly different MD, equivalence testing revealed that all devices demonstrated equivalence to the reference method using a 5 % equivalence region for WC, HC and W:HT (Fig. 1, Table 3 Table 3 and illustrated in Fig. 2. LOA ranged from ±9·0 to ±12·4 cm for WC and from ±6·7 to ±10·7 cm for HC. For WHR and W:HT, LOA were similar across applications. Significant proportional bias was observed for WC estimates produced by ME360 but not myBVI ® . Conversely, significant      proportional bias was observed for HC estimates produced by all methods.

Sex differences in agreement
Agreement analyses by sex groups are presented in Table 4. When stratified by sex, there were no significant differences between the WC estimates produced by each device and the reference method for males (all P > 0·050); however, WC was significantly underestimated for females across all devices (all P < 0·001). Significant overestimations of HC were observed for both males and females using ME360 (P < 0·001) but only for males using myBVI ® (P < 0·001). WHR was significantly underestimated for both males and females across all applications (all P < 0·05). For WC, RMSE was lower for males across all devices. RMSE for HC estimates was negligible between males and females using ME360 but were substantially higher for males using myBVI ® . RMSE for WHR was similar between males and females other than WHR estimates produced by ME360 Apple which, for females, was more than double that of males. Results for the Bland-Altman analysis by sex group are displayed in Table 4. LOA ranged from ±6·6 to ±12·9 cm for WC and were smaller in males across all devices. LOA ranged from ±6·5 to ±10·1 cm for HC and were smaller in females across all devices, albeit similar for ME360. For WHR and W:HT, LOA were similar between males and females although all LOA were slightly lower for males. Significant proportional biases were observed for WC estimates produced by ME360 (all P < 0·001), but not myBVI® (both P > 0·050) and were similar between males and females. Significant proportional biases were observed for HC estimates produced by myBVI ® and were similar between males and females (both P < 0·050). For HC produced by ME360, no proportional bias was observed for males, but was observed for females using ME360 Apple (P < 0·010) but not ME360 Samsung (P = 0·058). Significant proportional biases were observed for WHR produced by ME360 in both males and females (all P < 0·050) but not for WHR produced by myBVI ® (P < 0·050).

Racial differences in agreement
Agreement analyses by the race groups evaluated in this study are presented in Table 5. WC estimates from ME360 Apple , but not ME360 Samsung , were significantly underestimated for both White and B/AA participants (P < 0·010). Although non-significant, the MD for WC produced by ME360 Samsung for B/AA participants was more than double the MD for White participants. myBVI ® significantly underestimated WC for White participants (P < 0·001) but not B/AA participants (P = 0·08). HC estimates from each application were all significantly overestimated (all P < 0·050); however, MD for HC estimates were markedly higher for B/AA participants compared with White participants. WHR estimates were significantly underestimated for all applications (all P < 0·001) and were similar between White and B/AA participants. For WC, RMSE was similar between White and B/AA participants across devices; however, RMSE for HC was substantially higher for B/AA participants across applications. RMSE for WHR and W:HT was similar between White and B/AA participants across applications.

Smartphone-based automated anthropometrics
Results for the Bland-Altman analysis by race are displayed in Table 5. LOA ranged from ±9·0 to ±13·4 cm for WC and were similar for White and B/AA participants. For HC, LOA ranged from ±5·5 to ±11·9 and were noticeably higher for B/AA participants for each device. For WHR and W:HT, LOA were similar for White and B/AA participants. Significant proportional bias was observed for WC and W:HT estimates produced by ME360 (all P < 0·050) but not myBVI ® which were similar for White and B/AA participants. For HC, significant proportional bias was observed across all devices in White participants (P < 0·010) but only for myBVI ® in B/AA participants (P = 0·024); however, coefficients for each device were similar between groups. There was no significant proportional bias observed for WHR estimates across devices, although the coefficients for WHR produced by ME360 were non-existent for White participants and high for B/ AA participants.

Discussion
While traditional tape measurements are considered to be cost effective and accessible, there are questions regarding their social acceptance and reliability (24) , particularly for those with overweight or obesity (25) . As such, this study sought comprehensively evaluate the precision and agreement of automated and clinically significant anthropometric variables across multiple mobile applications, and smartphones, against a reference tape measurement. The principle findings were (1) all variables produced by each mobile application and between smartphones exhibited acceptable precision which were comparable with tape measurements; (2) WC, HC and W:HT from all mobile applications demonstrated equivalence with tape measurements; however, WC was slightly underestimated, whereas HC was slightly overestimated; (3) the slight under-and overestimations for WC and HC were small enough to demonstrate equivalence, but resulted in non-equivalence for WHR across all automated methods; (4) there were no differences between WC produced by the automated methods and the reference in males, but WC was significantly underestimated across all applications in females and (5) some variation existed across applications, but all variables demonstrated slightly lower agreement for B/ AA participants compared with White participants which may be a product of the weight status differences between groups or demographics of the populations used for method development. Overall, the results of our study support the use of WC and HC estimates produced from automated mobile applications, but demonstrates the importance of accurate automation for WC and HC estimates given their influence on other anthropometric assessments and clinical health markers.
First, each mobile application used in our study demonstrated acceptable precision for all automated assessments. There was a slight drop-off in precision when estimates were compared between smartphone types; however, precision remained within an acceptable range. While there are only a few studies that have evaluated the precision of automated anthropometrics using mobile applications (16,26) , there is contention regarding the precision of traditional tape measurements citing inaccuracies between self-measured and professionally measured  assessments (27) and measurements for those of higher weight status (27,28) . Another method shown to produce reliable digital anthropometrics is 3D optical scanning (29) . However, 3D scanners are generally unavailable for simple but important circumference measures in clinical practice, especially for those in lower socio-economic status or rural areas. Thus, the limitations of other methods highlight the need for precise estimates that are both accessible and cost effective. In our study, all automated anthropometrics produced precision estimates that were similar to the tape measurements performed in our study (which were conducted professionally) and those produced by 3D scanning in others (9,29) . Interestingly, the ME360 Apple application produced better precision estimates than those collected via tape measure with ME360 Samsung only marginally lower. It is possible that these small differences are due to differences in the developmental software (i.e. iOS or Android ® ), where initial mobile applications are built for a specific device (i.e. iPhone ® or Samsung ® ) with operating systems that require a particular coding language (30) . Despite this, all ICC were > 0·930 and PE were all ≤ 2·0 cm. Considering that this degree of precision was produced simply from two two-dimensional pictures on highly accessible mobile applications, it is plausible that these applications be considered reliable and comparable with traditional methods. Several investigations have evaluated the agreement between automated anthropometrics from a mobile application and traditional tape measurements and have demonstrated considerable variation (16,26,31,32) , likely due to the different mobile applications employed in each study. Overall, WC, HC, and W:HT across all mobile applications in our study demonstrated equivalence to our reference method; however, there were slightly significant under-and overestimations for WC and HC, respectively, for each application. These bi-directional biases, where an underestimated WC was divided by a larger overestimated HC, resulted in significant underestimation of WHR for each application that did not demonstrate equivalence. So, while biases for WC and HC were relatively small, these small differences manifested in discrepancies across other variables. This relationship is also reflected in our results for W:HT, where all mobile applications underestimated W:HT as an extension of an underestimated WC. These small inconsistencies for WC and HC are problematic considering that WC and HC estimates are commonly used in other anthropometric screening tools to predict several health risks (10,33) . This is especially concerning considering the proportional biases and large LOA observed in our study. Specifically, we found that both ME360 applications demonstrated significant proportional bias for WC, where WC was underestimated to a greater degree in those with larger average WC. Significant proportional bias was observed for HC produced by both ME360 applications, but these biases were relatively small and much smaller than the proportional biases produced for WC using the same application. Because participants with a higher average WC had greater underestimations of WC without similar underestimations of HC, WHR was underestimated using this application. Interestingly, there was no proportional bias for WC using myBVI ® ; however, HC produced by myBVI ® demonstrated significant proportional bias, where HC was overestimated for those at larger average HC. Similar to ME360, albeit in differing directions, the overestimation of HC without simultaneous overestimations in WC led to underestimations of WHR. Therefore, while automated WC and HC estimates from a mobile application may demonstrate equivalence, those planning to employ these estimates as a part of a larger screening battery should do so with caution; although WC estimates produced by each mobile application did show utility in correctly determining abdominal obesity classification; a common cardiometabolic health risk assessment.
There are several issues that may explain our aforementioned results. First, the artificial intelligence used to develop each application is dependent upon the method used to train the application. For instance, if the mobile application was trained by a 3D scanner it is possible that comparisons to a tape measurement would result in lower levels of agreement. However, many mobile applications are trained by both 3D scanning and traditional tape measures, and recent investigations show agreement in body circumferences assessed by tape measurements, 3D scanners, and mobile applications (16) . Typically, these studies determine agreement by comparing automated measures to tape measurements taken at sites specifically defined by the mobile application (16,31) . While this methodological approach may result in better agreement, it may also limit real-world application given that self-or professionally measured circumferences may be taken from considerably different locations than from those suggested by the mobile application. Moreover, measurement sites may be markedly different between mobile applications making them difficult to compare. Therefore, to determine their 'real-world' performance we standardised the location of each tape measurement. It should be noted, however, that while the automated WC and HC were equivalent to tape measurements, this is specific to the performance of our investigators and the location in which our tape measurements were taken and thus, estimates taken by individual users, by other professionals, or at different locations may lead to differing results.
In addition to what has been previously suggested, our results may also be explained by our sex and race comparisons. For males, there were no significant differences between the WC measures and the reference for any mobile application. In fact, MD in WC for males were all ≤ −0·6 cm which is comparable to the results for both males and females produced by Nana et al. (31) using a single mobile application. Conversely, WC was significantly underestimated by ≥ −3·0 cm across applications for females. Size differences between our female participants and those in the study by Nana et al. (31) may explain the differences between studies, where our female participants had substantially higher weight, WC, and HC. However, male participants in our study were also much larger. As previously suggested by Neufeld et al. (32) , it is possible that differences in clothing between male and female participants, combined with the larger size of our participants, may explain our findings. Specifically, male participants wore only compression shorts/tights whereas female participants wore tights and a sports bra. It is common for females to wear high-waisted tights that cover a significant area (and often the majority) of the abdomen where the automated WC would be collected. Although steps were taken to alleviate this in our study, the automated landmarking procedures used by each mobile application may be altered when a considerable amount of abdominal mass is covered. Additionally, we observed that clothing worn by females was often tighter fitting than clothing worn by males. The tightness of the clothing, especially in female participants of larger body size, may alter the natural body shape in areas around the abdomen. It is also possible that the discrepancies in WC between the automated assessments and the reference in females was due to the inherent difficulty in measuring the female waist. As you descend the body's vertical axis, the variation in WC along that axis is more exaggerated in females compared with males as a result of normal adult female body shape (34,35) . Thus, it is possible that these issues can introduce error into the automated measurement process as each application attempts to detect specific landmarks for its WC estimate.
The sex differences in HC for myBVI ® may also explain a number of our other findings. For males, myBVI ® significantly overestimated HC which was not observed for females. Interestingly, the majority of our male participants had either overweight or obesity, in addition to larger WC and HC, compared with our female participants. Considering that the proportional biases for HC from myBVI ® were nearly identical between sexes, it is more likely that the larger WC and weight status contributed to these differences. It is well known that males tend to deposit more fat in the anterior abdominal area whereas females tend to deposit more fat in the gluteal-femoral region (36) leading to a larger WC for males. At higher degrees of overweight and obesity, the excessive accumulation of fat in the abdominal area is exposed to gravity and may extend well into the pubic region. Given that the pubic area is within the normal assessment region for HC, it is possible that the larger abdominal fat mass in male participants extended into the HC area for the automated assessments whereas tape measurements could avoid this interference. This potential discrepancy can also be observed in our comparisons by race. B/AA participants had significantly higher weight, WC, and HC compared with their White counterparts and 50 % of our B/AA participants had obesity, with all but one of our participants with severe obesity being B/AA males (the one participant was a B/AA female). Further, HC was overestimated to a greater degree for B/AA participants but without differences in proportional biases. Given the large differences in weight and WC and the lack of proportional bias in HC in this group, it is possible that at the extremes of abdominal adiposity, abdominal fat mass may impede the anterior area of the HC measurement resulting in an automated HC measurement that accounts for a portion of the waist, leading to HC overestimation. The potential impedance of the abdomen in larger individuals, coupled with the intrinsically larger distributions of fat in the gynoid region for B/AA individuals (37) , could result in larger measurement errors that may be exaggerated for B/AA males.
In conclusion, automated anthropometrics produced by twodimensional pictures from mobile applications are cost effective and accessible tools that can be used to collect clinically significant anthropometric information. Automated anthropometrics from mobile applications demonstrate high levels of precision regardless of the smartphone used. Mobile anthropometrics also demonstrate agreement with traditional tape measurements (via equivalence testing) for WC and HC estimates. However, slight deviations in WC and HC, which may be due to technical issues that are exaggerated in estimates for female and B/AA participants, may lead to inaccuracies for other measurements such as WHR. As such, these data support the overall use of this technique for estimates of WC and HC, but individuals should include these data in comprehensive health assessments with caution given the range of individual error and over-and-underestimations. Despite this, WC estimates produced by each mobile application demonstrate utility in assessing abdominal obesity. These findings also support the potential use of this assessment method in prospective studies, where participants could self-assess across an intervention without the need for additional laboratory visits. However, future research examining the accuracy of self-and at-home assessments are necessary in addition to studies examining accurate assessment over time.