CEO Selection and Executive Appearance

Abstract
Survey assessments have found limited evidence of benefits of executive attractiveness. We use an objective measure of facial attractiveness that is correlated with survey assessments but less noisy and identify several benefits from executive facial attractiveness previously found in the general population but heretofore empirically elusive among executives. We examine the effect of both measures on executive compensation, promotion to CEO and the corresponding shareholder reaction, and promotion to board chair. The objective measure identifies significantly positive labor market effects for executive attractiveness in all outcomes in contrast to survey assessments of attractiveness that do not correlate with any outcome.


I. Introduction
The importance of quantitative individual executive characteristics in influencing firm policy decisions and outcomes is well established in the literature 1 but qualitative characteristics, though more difficult to measure, can perhaps be even more influential. Appearance is one such characteristic that is known to significantly influence workers' labor market outcomes (Hamermesh and Biddle (1994)) as well as impact financial contracting decisions (Duarte, Siegel, and Young (2012), Brooks, Huang, Kearney, and Murray (2014), Blankespoor, Hendricks, and Miller (2017), and Ravina (2019)). In addition, as the above quote reveals, appearance has played a role, whether explicitly or implicitly, in leadership choices for centuries. Yet, empirical evidence on the role of appearance, particularly attractiveness, of executives is scarce or mixed at best. One reason for this lack of definitive evidence on the influence of executive attractiveness is the reliance upon survey assessments to measure attractiveness. At first glance, survey assessments of attractiveness seem to be the most natural way to assess or measure beauty. Studies in psychology and economics have long relied on survey assessments of attractiveness. However, in these studies the subjects, decision makers in the experiment, are the ones influenced by attractiveness (e.g., Dipboye, Fromkin, and Wiback (1975), Dipboye, Arvey, and Terpstra (1977), and Heilman and Saruwatari (1979)). In contrast, when studying the effects of executive attractiveness, those making decisions about the executives are not the ones being surveyed.
We are grateful to James Bates, Kathleen Centilli, Claiborn (Mac) Fountain, Kellie Hensley, and Katy Wallace for valuable research assistance. We have also benefited from conversations with and comments from Renée Adams
Since it is the board of directors who evaluate executives, set pay and appoint a CEO or chair, the ideal survey sample to study the influence of attractiveness on these decisions would be the directors, but this is not feasible. As a result, previous researchers have relied on survey results from a different sample of individuals to obtain a subjective assessment of executive attractiveness or other characteristics. Unfortunately, this approach is fraught with noise. First, demographic differences between survey participants and those for whom their assessment serves as a proxy can contribute to the noise. For example, Foo and Clark (2011) find that young adults rate younger faces as more attractive than older faces, but older adults do not make such a distinction. Thus, young survey participants may rate an older face as less attractive, while an older person may evaluate the same face as relatively more attractive. Second, any familiarity with the assessed subjects by the survey participants can also create noise (Zajonc (1968), Rhodes, Halberstadt, and Brajkovich (2001), and Peskin and Newell (2004)). These issues make replicating findings difficult and inhibit our understanding of the role of appearance in executive labor markets.
For example, using surveys obtained through Amazon's Mechanical Turk service, Duarte et al. (2012) find a significant link between the appearance of trustworthiness and actual individual creditworthiness, but no significant relation between attractiveness and creditworthiness. However, using a different sample of surveyors Ravina (2019) finds evidence of a significant relation between attractiveness and creditworthiness. Relatedly, Graham, Harvey, and Puri (2017) use survey assessments of executive appearance by graduate students of business and find a significant association between competence and executives who are CEOs but only find a weak relation between attractiveness and CEOs. Thus, while fundamental elements of executive attractiveness may indeed be influential, survey assessments of attractiveness from one group used to proxy for those of another have not definitively identified strong economic relations.
We address these concerns by introducing a new non-survey-based measure of fundamental attractiveness. Specifically, we use an objective measure of facial attractiveness derived from a scientifically based and practitioner-proven facial mask. We use the sum of absolute deviations from 25 key facial nodes on the mask to the corresponding nodes on an executive's face as an objective measure of facial attractiveness (smaller total deviations correspond to more attractive faces). For comparison, we also collect survey assessments of executive attractiveness in our sample. In these surveys, we find substantial variation and range of rater scores, reflecting the noise inherent in this measure of attractiveness. The mask-based measure correlates with the survey-based measure but is less noisy.
Using a sample of 237 executives most likely competing to be CEO in 100 firms, we examine the effect of attractiveness, using both the objective and survey measures, on several executive outcomes. First, we examine compensation and find that, similar to Graham et al. (2017), even after adjusting for the errors-in-variables (EIV) problem inherent in survey measures, we are unable to identify a strong significant relation between survey-based measures of attractiveness and compensation. However, we do find a strong relation using our objective mask-based measure that is economically meaningful. A 1-standard-deviation increase in executive attractiveness (smaller deviations) is associated with a 20.5% increase in compensation.
Since the objective measure is able to identify the heretofore-elusive economic link between attractiveness and executive pay, we analyze other executive labor market events to see whether the objective mask-based measure provides additional insights beyond those provided by survey-based measures. We find no association between the survey-based measure of attractiveness and the likelihood of an executive's promotion to CEO. However, we find strong evidence that facial attractiveness, measured using the objective mask-based measure, is significantly associated with an executive's promotion to CEO after controlling for titles, board membership, and executive pay ranking.
Focusing on the subsample of 100 newly selected CEOs, we examine whether the survey-based measure or the mask-based measure of attractiveness can identify whether the new CEO's attractiveness influences shareholders. While both measures of attractiveness are associated with a positive shareholder reaction, only the mask-based measure's relation is significant. Lastly, consistent with attractiveness influencing decision makers (e.g., Mobius and Rosenblat (2006)), we find that newly appointed CEOs who are more attractive according to the mask-based measure are appointed board chair significantly faster. Again, we are unable to detect a significant relation when using the survey-based measure.
Our findings make several important contributions to the literature. First, researchers in the fields of psychology (e.g., Aronson and Mills (1965)), politics (e.g., Todorov, Mandisodza, Goren, and Hall (2005)), economics (e.g., Hamermesh and Biddle (1994)), and finance (e.g., Duarte et al. (2012)) have long been interested in the role of physical appearance in a variety of outcomes (Hatfield and Sprecher (1986)) including the job market. Why appearances matter in the job market and what relevant information appearances convey about job performance are important and largely unanswered questions in this literature.
One reason for the remaining unanswered questions despite this vast history of research is, in part, the difficulty of replicability. Most of the related prior research has relied on survey assessments of appearance. Replicating a study based on survey assessments requires utilizing the original survey participants, which is usually neither practical nor feasible. Another reason for the inconclusiveness of prior research, particularly that in the executive labor market, is that survey assessments are only a proxy for the assessments of appearance held by the actual decision makers (e.g., board of directors), which introduces another level of measurement error. Our findings highlight the difficulty in replicating prior findings on the role of appearance in the executive labor market when using a survey-based measure while showing that an objective-based measure is less susceptible to measurement error. Thus, our findings demonstrate that how appearance is measured matters. Having an objective measure of appearance that is less susceptible to measurement error and that facilitates replicability helps us better understand the role of appearance in the executive labor market.
Using the objective measure of attractiveness, our results complement and extend earlier findings on the benefits of attractiveness widely documented in the general population to the executive population. Given the vast evidence of the influence of attractiveness among the population (e.g., Aronson and Mills (1965), Widgery and Webster (1965), Goldman and Lewis (1977), Guise, Pollans, and Turkat (1982), and Warner and Sugarman (1986)), it is reasonable to expect executive attractiveness to play a role in executive labor market outcomes. By using an objective measure, we confirm this intuitive but empirically elusive relation. Being able to identify associations between fundamental elements of executive attractiveness also helps to shed light on prior findings. For example, our finding that attractive CEOs receive the board chair title faster suggests attractiveness can facilitate entrenchment and thus weaken any positive relation between CEO attractiveness and firm performance. This, in addition to the EIV problem, may help further explain why Graham et al. (2017) do not find a significant relation between survey assessments of CEO attractiveness and firm performance.
Our use of an objective measure of fundamental facial attractiveness also extends the recent literature that is beginning to use objective measures of qualitative executive traits. Hsieh, Kim, Wang, and Wang (2020) use a machine-learning-based facial-feature-point detection methodology to measure CFO trustworthiness and the corresponding association with audit fees. They apply their measure, which they base on an index related to four facial features (angle of inner eyebrow ridges, face roundness, chin width, and nose-to-lip distance), to Google Images of CEOs and CFOs and find more trustworthy-appearing executives to be associated with lower audit fees. Halford and Hsu (2020) measure CEO attractiveness using a web-based facial analysis application on Anaface.com to study the relation between CEO attractiveness and shareholder reaction to their appointment and merger and acquisition decisions. To implement this, they sample each CEO six times through Anaface and take the mean of the measures. Unfortunately, the authors are not privy to the computational algorithm used by Anaface. Both studies are able to identify significant relations between their respective measures of facial appearance and the outcomes they examine. Similar to our approach, both studies base their measure of either trustworthiness or attractiveness on measurements from key nodes on the executives' faces. However, since the exact algorithms used are not disclosed, it is unclear on what basis attractiveness is assessed. Conversely, our approach provides a clear objective standard from which to measure deviations, providing a simple, replicable and systematic means of measuring attractiveness.
Our evidence also makes a significant contribution to the literature by uncovering a new and important executive trait, attractiveness. Unlike other characteristics, such as education or prior experience, elemental attractiveness is not alterable through experience or choices (with the exception of elective cosmetic surgery, which is certainly feasible). As such, it represents an innate trait that executives receive at birth that can influence their executive labor market outcomes later in life, much like one's potential IQ or height (Adams, Keloharju, and Knüpfer (2018)). More generally, our finding that beauty plays a role in executive labor markets complements studies of the value of beauty in other settings like education (Hatfield and Sprecher (1986)) and political contests (Todorov et al. (2005)).

II. Measuring Attractiveness and Related Research
A. An Objective Measure of Attractiveness
Recently, researchers have begun to exploit new technologies to help quantify and assess investor responses to various types of previously hard-to-measure qualitative information. For example, Mayew and Venkatachalam (2012) use voice analysis software to detect and quantify qualitative information in the variation of emotion in executives' voices during earnings conference calls. Similarly, Mayew, Parsons, and Venkatachalam (2013) find evidence that male CEOs with a deeper pitch voice head larger companies, make more money, and are forced out less often. While these new technologies do not capture every aspect of the qualitative characteristics they are measuring (for example, voice analysis software does not capture hand gestures, facial expressions, or body language used by the speaker that also convey information about executive emotion), they do capture important information related to these characteristics. Similar to these technologies, an objective mask-based measure of facial attractiveness provides a simple means of systematically quantifying a previously difficult-to-measure source of qualitative information: elemental components of facial attractiveness. Scientists in the field of psychology have long studied what people find attractive or esthetically pleasing. Throughout this research, which dates as far back as Fechner (1871), scientists have documented consistent evidence of strong preferences for objects or appearances in proportions of 1.618:1, often referred to as the golden ratio, or phi. The presence of phi as a measure of esthetic appeal is well known in art (The Sacrament of the Last Supper by Salvador Dali), music (Sonneries de la Rose+Croix by Erik Satie), architecture (Egyptian pyramids), design (shape of postcards), nature (geometry of crystals) and mathematics (shape of pentagon).
The value of phi (φ) is derived as follows: quantities a and b, with a > b > 0, are in the golden ratio φ if
$$\frac{a+b}{a} = \frac{a}{b} = \varphi = \frac{1+\sqrt{5}}{2} \approx 1.618.$$
While the preference for the golden ratio, over others such as unity, is still being studied (Davis and Jahnke (1991)), more recent practitioner reports reveal additional support for the use of the golden ratio, specifically as an objective standard of beauty for the human face. Dr. Stephen Marquardt developed a proprietary facial mask based on the golden ratio for the purpose of cosmetic surgery (Marquardt (2002)). While studying photographs of faces around the world considered to be beautiful, Dr. Marquardt found that the golden ratio was prominently displayed in two dimensions. Based on these observations, he developed a facial mask known as the Phi-Mask. Male and female Phi-Masks are displayed in Figures 1 and 2, respectively. In an interview with the Journal of Clinical Orthodontics, Marquardt says "The simplest configuration that describes the Golden Ratio in two dimensions is an acute Golden Triangle with sides of 1.618 and a base of 1, or an obtuse Golden Triangle with a base of 1.618 and sides of 1. Together these elements form a Golden regular pentagon, and the regular pentagon itself, if duplicated, inverted, and superimposed on itself, forms the Golden Decagon, a regular vertex radial decagon." The details of the derivation and properties of this Phi-Mask are available in the patent documents. 2
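As a worked step, the golden-ratio condition on a and b can be reduced to the familiar closed form. The derivation below is the standard textbook one rather than anything specific to the Phi-Mask; it assumes a > b > 0 and writes φ for the ratio a/b.

```latex
\frac{a+b}{a} = \frac{a}{b} = \varphi
\;\Longrightarrow\;
1 + \frac{1}{\varphi} = \varphi
\;\Longrightarrow\;
\varphi^{2} - \varphi - 1 = 0
\;\Longrightarrow\;
\varphi = \frac{1 + \sqrt{5}}{2} \approx 1.618
```

Only the positive root of the quadratic is retained because a and b are lengths. The same value gives the 1.618:1 proportions of the Golden Triangles described in the Marquardt quotation.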

FIGURE 1
Male Phi-Mask
"... derive the remaining component lines and points of the mask: [n = 6; Z = 1] is used twice for the two iris complexes; [n = 5; Z = 1] is used 3 times, for the nasal tip complex, the internal lip complex, and the internal nares complex; [n = 5; Z = φ^(1/3)] is used once as the inner nasal tip halo complex; [n = 5; Z = 2/φ] is used once as the outer nasal tip halo complex; [n = 4, Z = 1] is used 14 times, for the nose/mouth complex, the mouth/chin complex, chin inferior border complex, chin complex, right and left sided chin complexes, right and left eyebrow complexes; right and left cheek complexes, and right and left nose/mouth complexes; [n = 3, Z = φ^(1/3)] is used twice for the right and left eyebrow/cheek complexes; [n = 2, Z = 1] is used once for the frontal repose smile complex; and [n = 1, Z = 1] is used once for the internal facial pentagon system."
The details of the mask make it a useful measure of attractiveness since it captures more aspects of facial appeal than prior measures that rely on simple characteristics. Consistent with this intuition, Schmid, Marx, and Samal (2008) find that measures of facial attractiveness based on the golden ratio better capture beauty than do simple measures of symmetry. Since its creation, there have been several studies evaluating the use of the mask as an objective measure of facial beauty. For example, the American Society of Plastic Surgeons published two articles, Bashour (2006a), (2006b), in the journal Plastic and Reconstructive Surgery on the use of the golden ratio as an objective measure useful in reconstructing facial features to enhance appearance. 3 Just as voice analysis software quantifies important information conveyed in executive emotions, the facial mask technology allows us to quantify important aspects of attractiveness, which can be informative to shareholders or directors as they formulate their expectations of executives' abilities.
Also, like these other technologies, it is not perfect. This two-dimensional objective measure does not fully capture all aspects of attractiveness, such as height or body type. Nor does it capture all dimensions of facial attractiveness such as smiles or skin tones. 4 Furthermore, we cannot be certain that our measure does not correlate with other important factors associated with executive ability such as perceived trustworthiness that may be reflected in appearance through one's genotype. Thus, the Sum of Mask Deviations from the Phi-Mask could capture personality traits that are important in a CEO. Nonetheless, this objective measure provides a systematic way to quantify fundamental features of facial attractiveness that is not subject to the varied biases of survey participants. Such a measure can contribute significantly to research in labor economics on this previously difficult to assess, but important, individual characteristic.

B. Challenges With Survey-Based Measurements of Attractiveness
If beauty is indeed in the "eye of the beholder," then using survey assessments of attractiveness from one population, say Amazon's Mechanical Turk workers, as a proxy for how another population, say the board of directors, assesses attractiveness can be problematic. Differences in age, psychological, economic, cultural, and social status between survey participants and the board of directors, whose assessment of attractiveness they are to represent, can make this proxy very noisy (Foo and Clark (2011)). Furthermore, the familiarity associated with the assessments of highly visible executives involved in a CEO selection tournament can introduce additional noise into the survey assessments (Zajonc (1968), Rhodes et al. (2001), and Peskin and Newell (2004)).
These various sources of noise result in an attenuation bias when trying to identify any economic impact from fundamental elements of executive attractiveness. Empirically, this is an EIV problem, which results from an imprecise measurement of the variable of interest (attractiveness). We illustrate this using a modified illustration from Greene (1993). Assume that the model fits the classical normal model,
$$y = \beta x^{*} + \varepsilon, \tag{1}$$
where y is an executive outcome and x* is fundamental executive attractiveness. Since executive attractiveness is not measured precisely, assume that x is the survey-based measure of executive attractiveness that measures fundamental attractiveness with some error u:
$$x = x^{*} + u. \tag{2}$$
Assume that u is independent of y and x* and that $\sigma_u^2$ is positive. Substituting equation (2) into (1) yields:
$$y = \beta x + (\varepsilon - \beta u). \tag{3}$$
Because x contains u, the regressor in equation (3) is correlated with the composite disturbance $\varepsilon - \beta u$. Since this is a violation of a central assumption of the classical model, the least squares estimate of β is inconsistent and biased toward zero. This bias is worse the greater the variability in the measurement error, and is referred to as attenuation. The intuition is that when we regress y on x, we have omitted a variable, u, from the regression. If u were observed, then there would be no identification problem. Thus, the key to estimating β is to produce information about the degree of measurement error. Unfortunately, for survey assessments of attractiveness, it is impossible to determine the error with which fundamental attractiveness is measured. In Section IV.A, we discuss how to estimate a minimum feasible level of reliability for the survey measure of attractiveness and, from that estimate, how to implement this estimation. However, an objective measure of attractiveness alleviates this concern.
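The attenuation described above can be illustrated with a small simulation. The sketch below is not the paper's estimation; it assumes a true slope of 1, standard normal fundamental attractiveness, and measurement error with the same variance, all of which are hypothetical choices. Under the classical EIV setup the least squares slope on the noisy measure converges to β multiplied by the reliability ratio, σ²(x*) / (σ²(x*) + σ²(u)), which is 0.5 for these values.

```python
import random

def ols_slope(x, y):
    """Least squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx

def simulate(beta=1.0, sd_true=1.0, sd_noise=1.0, n=50_000, seed=0):
    """Return the slope using true x* and the slope using noisy x = x* + u."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, sd_true) for _ in range(n)]        # x*
    y = [beta * xs + rng.gauss(0.0, 1.0) for xs in x_true]      # y = beta x* + e
    x_obs = [xs + rng.gauss(0.0, sd_noise) for xs in x_true]    # x = x* + u
    return ols_slope(x_true, y), ols_slope(x_obs, y)

slope_true, slope_noisy = simulate()
# Probability limit of the noisy slope: beta times the reliability ratio.
expected_noisy = 1.0 * 1.0**2 / (1.0**2 + 1.0**2)
print(f"slope with x*: {slope_true:.3f}")
print(f"slope with noisy x: {slope_noisy:.3f} (limit {expected_noisy:.3f})")
```

Raising `sd_noise` pushes the estimated slope further toward zero, which is the sense in which noisier survey ratings make a true relation harder to detect.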

III. Data
To compare the ability of these two measures of attractiveness, survey assessments and our mask-based measure, to identify associations between attractiveness and executive labor market outcomes, we focus on executives who are likely candidates for promotion to CEO. We start by identifying 351 succession events at S&P 500 firms between 2000 and 2009, excluding transitions due to mergers or acquisitions or when the new CEO is an interim appointee. We also determine whether the new CEO appointment followed a forced or voluntary departure of the outgoing CEO. To obtain an attractiveness score using the facial mask, we first used Google Images to extract multiple pictures of each of the newly appointed CEOs from annual reports and other internet sources around the time of the succession event. We then provided these images to a graphic designer who filtered out the cases where she could not apply the mask. For example, the designer discarded images when CEO pictures were in profile or where the picture pixels were limited, thereby preventing facial node identification. This process yielded a mask score for 100 new CEO pictures. We then applied a similar screening strategy when collecting pictures for the non-CEO executives of these 100 firms. In total, we obtained facial attractiveness measurements for 255 executives in 100 firms that appointed a new CEO between 2000 and 2009.
We identified executives in these 100 firms most likely to be competing to be CEO by considering only those whom each firm employed during the appointment year and the prior year. We focus on these candidates since most CEO successions are internal (over 80% of our sample firms) and acquiring pictures and mask measurements for all possible external executives would be prohibitively expensive and provide little additional insight.
Among the internal executives, we use board seats, relative compensation and age to identify those most likely to be competing to be CEO. Executives competing to be CEO are often appointed to the board of directors during the evaluation process (Hermalin and Weisbach (1988)) and their compensation is likely greater than other executives (Murphy (1985), Bognanno (2001)). Thus, we consider an executive to be a CEO candidate if the executive is also on the board or if the executive's compensation ranking is in the top three within the firm. Finally, we exclude executives older than 65 since they are not likely to be long-term CEO candidates. In our sample, we find the average (median) number of executives competing to be CEO is 2.37 (2). Three firms had zero internal candidates using our criteria. Thirteen had only one, presumably an heir apparent. (Interestingly, some of these heirs were not selected as the firm went with an outside CEO.) Forty-two had 2 competitors, 31 firms had 3, 8 had 4, and 3 had 5.
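The screening rule in this paragraph can be written as a short predicate. This is an illustrative sketch only: the `Executive` record, its field names, and the example executives are hypothetical, while the thresholds (board seat or top-three pay, age no greater than 65) and the per-firm candidate counts are taken from the text. The tally at the end checks that the reported distribution is consistent with 100 firms, 237 candidates, and a mean of 2.37.

```python
from dataclasses import dataclass

@dataclass
class Executive:
    # Hypothetical record layout; the paper does not describe its data schema.
    name: str
    on_board: bool
    pay_rank: int  # 1 = highest paid executive within the firm
    age: int

def is_ceo_candidate(e, max_pay_rank=3, max_age=65):
    """Board member or top-three pay, excluding executives older than 65."""
    if e.age > max_age:
        return False
    return e.on_board or e.pay_rank <= max_pay_rank

# Hypothetical executives exercising each branch of the rule.
examples = [
    Executive("A", on_board=True, pay_rank=5, age=58),   # board seat
    Executive("B", on_board=False, pay_rank=2, age=50),  # top-three pay
    Executive("C", on_board=True, pay_rank=1, age=67),   # too old
    Executive("D", on_board=False, pay_rank=4, age=45),  # neither criterion
]
flags = [is_ceo_candidate(e) for e in examples]

# Reported distribution: number of candidates -> number of firms.
firms_by_candidate_count = {0: 3, 1: 13, 2: 42, 3: 31, 4: 8, 5: 3}
n_firms = sum(firms_by_candidate_count.values())
n_candidates = sum(k * v for k, v in firms_by_candidate_count.items())
mean_candidates = n_candidates / n_firms
print(flags, n_firms, n_candidates, mean_candidates)
```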
The final sample includes 255 executives, 100 of whom became CEO while the remaining 155 were runners-up. The sample size is similar to those used in the various tests of Graham et al. (2017). Of the new CEO appointments, 18 are external and 82 are internal appointments. Excluding the externally appointed CEOs results in 237 internal candidates competing in CEO tournaments across the 100 sample firms. Of the internal candidates, 82 are winners and in these firms, there are 123 runners-up. Among firms selecting an external CEO, there are 32 internal candidates not chosen.

A. Appearance Score: Phi-Mask and Survey Assessments
We measure elemental facial attractiveness using deviations of 25 key points on executive faces from the corresponding nodes of the Phi-Mask. Specifically, for each sample executive we search annual reports and use Google Images to locate a facial photo taken as near to the CEO announcement date as possible. We impose this restriction to best capture the image evaluated by the board and do not use images more than a year after appointment to avoid changes in executive appearance due to age. We hire graphic designers 5 to align each facial image with that of the Phi-Mask and then measure absolute deviations from the 25 nodes on the Phi-Mask to the corresponding nodes on the executive's face in units of points. 6 The sum of these 25 deviations represents our measure of executive facial attractiveness where faces with larger (smaller) deviations are less (more) attractive. We compute this measure for all 255 executives.
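A minimal sketch of the Sum of Mask Deviations follows. It assumes that each "absolute deviation" is the Euclidean distance between a mask node and the matching node on the aligned face image; the text does not specify the distance metric, so this is an assumption. The node coordinates are invented for illustration, and the alignment step performed by the graphic designers is not modeled.

```python
import math

def sum_of_mask_deviations(mask_nodes, face_nodes):
    """Sum of node-by-node distances between mask and face landmarks.

    Assumption: each deviation is the Euclidean distance between a mask
    node and the corresponding face node, in the image's units (points).
    Smaller totals correspond to more attractive faces.
    """
    if len(mask_nodes) != len(face_nodes):
        raise ValueError("mask and face must have the same number of nodes")
    return sum(math.dist(m, f) for m, f in zip(mask_nodes, face_nodes))

# Hypothetical data: 25 mask nodes, and a face whose every node is
# displaced by (3, 4) points, i.e., 5 points away from the mask node.
mask = [(10.0 * i, 5.0 * i) for i in range(25)]
face = [(x + 3.0, y + 4.0) for x, y in mask]
total_deviation = sum_of_mask_deviations(mask, face)
print(total_deviation)  # 25 nodes, each 5 points from the mask
```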
The second measure of attractiveness is based on survey assessments. Recent papers on appearance have used survey assessments from MBA students (Graham et al. (2017)) or workers from Amazon's Mechanical Turk service (Duarte et al. (2012)). We follow Duarte et al. and employ Amazon's Mechanical Turk service to obtain 25 independent workers to rate each executive photograph on the attribute of facial attractiveness, with 10 being very attractive and 1 being very unattractive. We average the ratings for each executive to derive a Survey Attractiveness Score. For ease of comparison of these two different measures of executive facial attractiveness, in the regression analyses, we report results using the standardized values of these two measures of attractiveness as well as all control variables.
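The averaging and standardization steps can be sketched as follows. The ratings and executive labels are hypothetical (five raters instead of 25, for brevity), and the use of the sample standard deviation is an assumption, since the text does not say which form is used. Standardizing does not change the direction of either measure: a higher survey z-score still means more attractive, while a higher mask-deviation z-score would still mean less attractive.

```python
from statistics import mean, stdev

def survey_attractiveness_score(ratings):
    """Average of the individual rater scores (1 to 10) for one executive."""
    return mean(ratings)

def standardize(values):
    """Z-scores: subtract the sample mean and divide by the sample std. dev."""
    mu = mean(values)
    sd = stdev(values)  # assumption: sample (n - 1) standard deviation
    return [(v - mu) / sd for v in values]

# Hypothetical ratings for three executives.
ratings_by_executive = {
    "exec_a": [3, 4, 5, 4, 4],
    "exec_b": [6, 7, 5, 6, 6],
    "exec_c": [2, 2, 3, 1, 2],
}
scores = {name: survey_attractiveness_score(r)
          for name, r in ratings_by_executive.items()}
z_scores = dict(zip(scores, standardize(list(scores.values()))))
print(scores)
print(z_scores)
```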

B. Descriptive Statistics
To ensure that the 100 CEO transitions we examine are not systematically different from the other CEO transitions, we identify, collect, report, and compare key descriptive statistics on all 351 newly selected CEOs in S&P 500 firms within our sample period. Panel A of Table 1 reports these statistics. Nineteen percent of the new CEOs are from outside the firm. Nineteen percent also have prior CEO experience and 13% replace a forcefully removed CEO.
Next, we compare these characteristics for the subsample of CEOs with a mask deviation measure to the subsample of CEOs without a mask deviation measure. The last two columns in Panel A of Table 1 report t-tests of the differences in the means and the corresponding p-values. These columns show that the two subsamples of CEOs are not statistically different based on prior experience, or age, or whether or not they are outsiders. They differ only in the fraction of females and the fraction following a forced CEO departure. There are 4 female CEOs in our sample with facial measurements and none in the remaining sample. In our sample, 18% of the CEOs followed a forced CEO departure, compared to 10% of the remaining sample.
Panel B of Table 1 reports descriptive statistics for the 237 internal executives competing to be the next CEO for these 100 CEO transitions. The average age of the executive pool is 53 and females represent about 8% of the executive sample. The mean Sum of Mask Deviations is 381.3 points and the mean Survey Attractiveness Score is 4.47.

C. Variation in Survey Assessments of Attractiveness
Using multiple raters and taking the average reduces noise, but if there is a wide range of attractiveness assessments among the raters, the resulting measure may still be a very noisy proxy for the effects of attractiveness on the decision agents, in our case, the board of directors or shareholders. While the issue of rater variation is known, it is difficult to quantify. For example, coefficient alpha (or Cronbach's alpha (Cronbach (1947))) is a measure of interrater reliability reported by several papers using survey assessments of attractiveness (e.g., Biddle and Hamermesh (1998), Ravina (2019)), but it is hampered by the fact that it increases with the number of survey participants (Cortina (1993)). 7 Thus, more raters and a higher coefficient alpha do not imply less noise. Duarte et al. (2012) illustrate in a simulation exercise reported in their Supplementary Material that the EIV problem arising from variation in survey assessments of attractiveness and other surveyed characteristics can still obfuscate true relations.
TABLE 1 (notes)
Panel B reports descriptive statistics for the executives at the 100 CEO transitions in our sample of CEOs with mask measurements who are most likely candidates to be the next CEO. This sample consists of 82 executives selected to be the next CEO and 155 who were not selected. SURVEY_ATTRACTIVENESS_SCORE is measured as the average score for each executive from survey participants on Amazon's Mechanical Turk, where each executive is ranked on a scale of 1 to 10, with 10 being the most attractive looking. Panel C reports the estimate of interrater variability (square root of the average variance for the 237 internal executive candidates) for each executive by quartile of survey-based attractiveness. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
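The dependence of coefficient alpha on the number of raters can be seen from the standardized form of the statistic, alpha = k r / (1 + (k - 1) r), where k is the number of raters and r is the average correlation between raters. The sketch below holds r fixed at a hypothetical value of 0.10, so that agreement between any two raters is unchanged, and varies only k.

```python
def standardized_alpha(k, mean_r):
    """Standardized coefficient alpha for k raters with mean correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)

mean_r = 0.10  # hypothetical average interrater correlation
alphas = {k: round(standardized_alpha(k, mean_r), 3) for k in (5, 25, 100)}
print(alphas)  # {5: 0.357, 25: 0.735, 100: 0.917}
```

With the same weak pairwise agreement, alpha rises from 0.357 with 5 raters to 0.917 with 100, which is why a high alpha by itself does not indicate a low-noise measure.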
Given the concern inherent with using multiple raters to generate survey assessments of attractiveness, we next closely examine the variation within our Mechanical Turk rater scores. As a first estimate of the variation among the Mechanical Turk raters of each executive's attractiveness, we calculate an estimate of the interrater variability among the rating scores of each executive's picture. To derive an estimate of interrater variability for the sample, we first compute the survey rating variance among the 25 raters for each of our 237 executives. We then calculate the pooled estimate of variance for our 237 executives by taking the average of these 237 variances. Finally, we take the square root of this pooled estimate of variance as our measure of interrater variability, which for the full sample is 1.61. This degree of variation reflects the noise inherent in survey assessments of attractiveness. However, it is quite possible that this variation is not uniform across the sample. For example, since survey scores are within a finite range, truncation on either end of the scale may produce variation that is greater in the middle of the distribution and smaller in the two tails. It is also reasonable to expect that there would be less variation among survey participants when rating the most or least attractive executives.
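The three steps of the interrater variability calculation (per-executive variance, average across executives, square root) can be sketched as below. The ratings are hypothetical and use two raters per executive for brevity; the choice of the sample (n - 1) variance is an assumption, since the text does not state which variance formula is used.

```python
import math
from statistics import variance

def interrater_variability(ratings_by_executive):
    """Square root of the average per-executive rating variance.

    Assumption: each per-executive variance is the sample (n - 1) variance.
    """
    variances = [variance(r) for r in ratings_by_executive]
    pooled_variance = sum(variances) / len(variances)
    return math.sqrt(pooled_variance)

# Hypothetical ratings: variances are 2 and 8, so the pooled variance is 5.
example_ratings = [[1.0, 3.0], [2.0, 6.0]]
variability = interrater_variability(example_ratings)
print(round(variability, 3))
```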
To test this conjecture, we divide our sample of executives into quartiles based on the mean Mechanical Turk rater scores, with the least (most) attractive being in the lowest (highest) quartile. As a first step, a 1-way ANOVA of these four groups reveals that the variation across the four groups is not equal. In Panel C of Table 1, we report the estimate of the interrater variability for each quartile. The greatest variation is in the middle two quartiles, which is where we expect survey participants to vary the most, among the sample of "average" looking executives. Next, we conduct pairwise F-tests for the equality of variances across each pair of quartiles. We find that the variation across the middle two quartiles is not significantly different. However, the variation in each of these middle two quartiles is significantly greater than the variation in both the top and bottom quartiles at the 5% level or better. Lastly, there is less variation (i.e., more agreement among raters) in the two extreme quartiles, the least and most attractive executives. However, even within the top and bottom quartiles, the estimate of interrater variability is still over 1.5 points out of a possible 10. The results in Panel C of Table 1 highlight two important aspects of survey assessments of attractiveness. First, there is considerable interrater variability, which reflects the noise inherent in surveys. Second, this interrater variability also varies with executive attractiveness. These multiple degrees of variation underscore a fundamental challenge with survey assessments of attractiveness. To further explore this variability, we consider the rater range, the difference between the highest and lowest scores an executive receives.
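The quartile comparison can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the helper names are hypothetical, the quartile cut points use simple rounding, and only the variance-ratio (F) statistic and its degrees of freedom are computed. Obtaining a p-value would require an F-distribution routine from a statistics package.

```python
import statistics


def split_into_quartiles(ratings_by_executive):
    """Sort executives by mean rating and split them into four groups."""
    ranked = sorted(ratings_by_executive, key=statistics.fmean)
    n = len(ranked)
    bounds = [round(n * q / 4) for q in range(5)]
    return [ranked[lo:hi] for lo, hi in zip(bounds, bounds[1:])]


def pooled_variance(group):
    """Within-executive variance pooled over a group, with its degrees of freedom."""
    df = sum(len(scores) - 1 for scores in group)
    sum_sq = sum((len(scores) - 1) * statistics.variance(scores) for scores in group)
    return sum_sq / df, df


def variance_ratio(group_a, group_b):
    """F statistic for equality of variances: var(group_a) / var(group_b)."""
    var_a, df_a = pooled_variance(group_a)
    var_b, df_b = pooled_variance(group_b)
    return var_a / var_b, df_a, df_b


# Hypothetical toy data: four executives (one per quartile), three raters each.
toy_ratings = [[8, 9, 10], [3, 5, 7], [1, 2, 3], [4, 6, 8]]
quartiles = split_into_quartiles(toy_ratings)
# Compare a middle quartile with the bottom quartile.
f_stat, df_mid, df_low = variance_ratio(quartiles[1], quartiles[0])
```

When every executive has the same number of raters, the pooled variance here equals the simple average of per-executive variances described in the text.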
Across the full sample, we find that the average rater range is 6.6 and that 55% of our observations have a range greater than 6. However, since ranges are driven by outliers, we next focus on the ratings that are in agreement by looking at the difference between the most common rater score (mode) and the second most common score. If these two differ by only 1, it suggests greater agreement among the raters about the executive's attractiveness. However, if there is greater disagreement among raters, then this difference is likely greater than 1. The difference between the most common and the second most common score within an executive's set of ratings is greater than 1 for 37.6% of our sample (28.3% differ by 2, 7.2% differ by 3, and 1.7% differ by 4). These statistics reveal that the noise inherent in the survey data on attractiveness creates an EIV problem that contributes to an attenuation bias, which obfuscates any possible links between attractiveness and the studied outcomes.
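Both statistics can be computed per executive with a few lines of code. The sketch below is illustrative only: the function names and toy scores are hypothetical, and ties between equally common scores are broken by first appearance, which is an assumption because the text does not say how ties are handled.

```python
from collections import Counter


def rater_range(scores):
    """Difference between the highest and lowest score for one executive."""
    return max(scores) - min(scores)


def top_two_score_gap(scores):
    """Absolute gap between the most common and second most common score."""
    ranked = Counter(scores).most_common()
    if len(ranked) < 2:
        return 0  # every rater gave the same score
    return abs(ranked[0][0] - ranked[1][0])


# Hypothetical ratings for two executives on the 1-10 scale.
toy_a = [5, 5, 5, 6, 6, 3, 9]  # mode 5, runner-up 6: raters largely agree
toy_b = [4, 4, 4, 7, 7, 5, 2]  # mode 4, runner-up 7: raters disagree more
```

Here `toy_a` has a wide range driven by two outlying scores but a top-two gap of only 1, which is the distinction the paragraph draws between range and agreement.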

D. A Comparison of Survey Assessments and Objective Assessments of Executive Attractiveness
Next, we examine sources of variation among survey assessments. We expect the mask-based measure to be correlated with survey assessments (Bashour (2006a), (2006b)), but we also consider several other sources of variation, such as age, gender, race, and familiarity. Age is of particular importance in our study given that Mechanical Turk workers are much younger than directors in large firms. Duarte et al. (2012) surveyed a subsample of the Mechanical Turk workers used in their analysis and found that 52% were less than 35 years of age, while 85% were less than 50 years of age. Furthermore, the Mechanical Turk worker population is not as large or as diverse as it appears (Stewart, Ungemach, Harris, Bartels, Newell, Paolacci, and Chandler (2015)).
We also control for firm size (market capitalization) to capture the visibility of firms as a proxy for the visibility of their executives. Finally, we use Google Trends data from 2013 to June 2015, the end of our survey period, to proxy for survey participants' familiarity with the executives in our sample. The Google Trends score is indexed and scaled so that a high score indicates the search term was searched more frequently throughout the trend period and a low score reflects relatively infrequent search activity for the term. We use the firm name as the search term, under the premise that survey participants are more likely to be familiar with the executives of firms with which they are more familiar. 8 In Table 2, model 1, we regress the survey assessment measure on the Sum of Mask Deviations and other executive and firm characteristics. Since we have multiple executives per firm, we report robust standard errors clustered by firm. We find that the mask-based objective measure of attractiveness correlates significantly with survey assessments of attractiveness at the 5% level. However, we also find strong evidence of significant age-related effects. Specifically, executives who are older, bald, wear glasses, or have white hair are associated with significantly lower survey assessments. That survey assessments are significantly and negatively related to age is important because not only does it reveal an age bias, but this bias can vary greatly across survey participants. If different people age differently, this phenotypical effect can increase variation in survey assessments of attractiveness due to survey assessors' varying age-related biases. 9 We also find that females and African Americans are associated with higher assessment scores and Asians are associated with lower scores.
The Google Trends Search Score measure of familiarity relates positively to the survey assessment of attractiveness and is significant at the 10% level, which is consistent with familiarity bias being embedded in the survey measure of attractiveness. Together, these results suggest that subjective assessments of the attractiveness of senior executives are very noisy. Thus, the objective mask-based measure, which is correlated with subjective assessments but is not subject to these biases, can be a good alternative.

Determinants of Measures of Executive Attractiveness
Table 2 reports the results of OLS regression analysis of executive attractiveness measures. The dependent variable in model 1 is a measure of the SURVEY_ASSESSMENT_OF_ATTRACTIVENESS, which is the average rating from Mechanical Turk workers who rate each executive's attractiveness on a scale of 1 to 10, with 10 being the most attractive. The dependent variable in model 2 is the SUM_OF_MASK_DEVIATIONS. Sum of Mask Deviations is the absolute sum of deviations from each of the 25 nodes on the Phi-Mask to the corresponding point on the executive's face. Standard errors are robust and clustered by firm. p-values are reported in parentheses beneath the coefficient estimates. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
To assess the effect of these same characteristics on the Phi-Mask measure, we run a similar regression and report the results in model 2, with the dependent variable being the Sum of Mask Deviations. The only characteristics with significant coefficient estimates are age and facial hair. Interestingly, both of these traits can affect the precision with which any objective measure of facial appearances, such as mask deviations, is measured. For example, facial hair can obscure key facial nodes. Thus, it is not surprising that executives with facial hair are associated with greater mask deviations. Conversely, the coefficient on age is negative and significant. While the mask measure is not prone to the same phenotype effects as survey-based measures, it may capture a different phenotype effect. For example, if age-related facial changes make the underlying bone structure, or fundamental elements of facial attractiveness, more prominent, perhaps due to thinning of the skin, this can reduce measurement noise in the mask deviation measurements. Thus, while an older face with thinner skin may be viewed by survey participants as relatively less attractive (model 1), the face may actually have smaller mask deviations because one can better capture fundamental elements of attractiveness with less noise.
These two results suggest that other executive traits may also affect the precision of mask deviation measurements. For example, if more fit executives have less facial fat, their mask deviations can also be measured with less noise. Thus, to the degree that executive characteristics, such as age or fitness, affect the precision of mask deviation measurements, the mask may capture these traits. Despite these shortcomings, the effects appear to be small. A 1-year increase in age is associated with only a 4.15-point reduction in the Sum of Mask Deviations, which is only 1.1% of the sample mean and median. Moreover, the decrease in deviations from a 10-year increase in age represents only one-fourth of a standard deviation of the Sum of Mask Deviations. The coefficient for facial hair suggests a similar magnitude. Thus, while mask measurements may be affected by executive age, fitness, or grooming, these effects appear to be very small.
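The magnitudes quoted above can be checked with simple arithmetic. The sketch below assumes the 162.4-point standard deviation of the Sum of Mask Deviations quoted in the compensation analysis. The implied sample mean is a back-of-the-envelope inference from the 1.1% figure, not a reported statistic.

```python
# Figures taken from the text.
age_coefficient = -4.15  # change in Sum of Mask Deviations per year of age
mask_sd = 162.4          # 1 standard deviation, from the compensation analysis
share_of_mean = 0.011    # 4.15 points described as 1.1% of the sample mean

# Effect of a 10-year increase in age, in points and in standard deviations.
ten_year_change = 10 * age_coefficient
ten_year_in_sd = abs(ten_year_change) / mask_sd

# Sample mean implied by "4.15 points is 1.1% of the mean" (an inference only).
implied_mean = abs(age_coefficient) / share_of_mean
```

A 10-year age difference corresponds to 41.5 points, or roughly 0.26 standard deviations, in line with the "one-fourth of a standard deviation" stated in the text.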

IV. Attractiveness and Executive Labor Market Outcomes
In this section, we use both measures of executive facial attractiveness and examine several economic outcomes for which attractiveness is likely to play an important role.

A. Executive Facial Attractiveness and Compensation
While numerous studies and books (e.g., Hamermesh and Biddle (1994), Hamermesh (2011)) document that attractiveness is associated with higher compensation in the labor market, there is little evidence that "Beauty Pays" among corporate executives. Graham et al. (2017) find weak evidence that subjective assessments of attractiveness are associated with higher compensation among CEOs but they find a stronger association with subjective assessments of competence. In this section, we explore whether our objective measure of facial attractiveness, which is less noisy than survey assessments, provides empirical evidence of this connection.
In Table 3, we report results replicating the Graham et al. (2017) findings and regress the natural log of total compensation on firm size, operating performance, and industry and year fixed effects using our pool of executive CEO candidates. 10 We also report robust standard errors clustered by firm since we have multiple executive observations in each firm year. In model 1, we find a positive association between the survey-based measure of executive attractiveness and pay, but the relation is not statistically significant. In model 2, we use the survey-based measure of competence and find a positive and significant relation between competence and pay. A 1-standard-deviation increase in survey competence score is associated with a 14.1% increase in compensation (or $0.923 million relative to our sample mean). This is similar to Graham et al. (2017), who document that a 1-standard-deviation increase in survey-assessed competence is associated with an 11%-14% increase in total pay. In model 3, we include both survey-based measures and continue to find that competence, but not attractiveness, is associated with higher executive pay.
These results suggest that the appearance of competence, rather than attractiveness, influences executive pay. Alternatively, it could be that the EIV problems arising from the noise inherent in survey assessments of attractiveness inhibit these models from identifying any significant relation between executive attractiveness and pay. The EIV concern, as illustrated earlier, is that noise in the measurement of executive attractiveness can bias the standard OLS coefficient estimate toward zero. We can adjust for this bias if we know the reliability of the measurement. Unfortunately, we do not know the reliability with which our multiple survey participants measure fundamental attractiveness. However, we can estimate the minimum reliability of the survey assessments of attractiveness required to make the variable useful. The \(R^2\) of a regression of survey assessments on all other variables, including pay, represents how well all other variables explain survey assessment variability. If the reliability of the survey measure of attractiveness is not at least at this level, then it does not provide any additional explanatory power for the regression and need not be included. In our case, the \(R^2\) from this regression is 0.2475, which therefore represents the lowest reasonable reliability for our survey measure of attractiveness. Thus, to account for the EIV concern, we run EIV-adjusted OLS using a reliability ranging from 1.0 (standard OLS) to as low as 0.2475. 11 As we decrease the reliability of the survey measure of attractiveness, its coefficient estimate becomes economically larger and positive. We report the results for the lowest level of reliability in model 4, but we still do not observe a significantly positive relation between survey-based measures of attractiveness and executive pay. Thus, either there is no statistically significant relation between attractiveness and pay among the executive ranks, as there is among lower-level workers (Hamermesh and Biddle (1994), Hamermesh (2011)), or we are unable to overcome the EIV problem inherent in survey assessments of attractiveness. To address this latter possibility, we next turn to our objective measure of executive attractiveness.
10 We selected the set of controls used in Table 3 based on Graham et al. (2017) because the focus of this analysis is to replicate their earlier findings. In addition, we use a limited set of controls similar to Graham et al. (2017) because, like them, we have a relatively small sample size. In unreported results, including the additional controls yielded similar results, though slightly weaker statistically.
11 To account for the negative bias stemming from the measurement error of the survey assessment of attractiveness, we need the reliability ratio. While we do not know the precise reliability for our survey assessments of attractiveness, we can estimate a lower bound of this reliability as described above. Given the lower bound of reliability, we can estimate the errors-in-variables adjusted OLS estimates following Treiman (2009). The reliability ratio, \(r\), is equal to 1 - (variance of the measurement noise / variance of observed survey assessments of attractiveness) or, equivalently, (variance of true assessment of attractiveness / variance of observed survey assessments of attractiveness) (Fuller (1987)). In a simple linear regression, the OLS estimator, \(b\), of \(\beta\) is given by \(S_{xy}/S_{xx}\), where \(S_{xy}\) and \(S_{xx}\) are the sample covariance and sample variance, respectively. However, as discussed earlier, when \(X\) is measured with error, the OLS estimator is biased downward because \(S_{xx}\) is larger than its true value due to the greater deviations introduced by the measurement errors. If we know the variance of the measurement noise, we can adjust this estimate to remove the bias. Thus, a consistent estimator is given by \(S_{xy}/(S_{xx} - \sigma_u^2)\), where \(\sigma_u^2\) is the variance of the measurement noise (Kmenta (1971)). If we know the reliability, we can estimate \(\sigma_u^2\) as \((1 - r)\) times the variance of observed survey assessments of attractiveness, that is, \(\sigma_u^2 = (1 - r) S_{xx}\) (Lockwood and McCaffrey (2020)), and thus obtain a consistent estimate of \(\beta\). We estimate the variance-covariance matrix following Buonaccorsi (2010).
Table 3 reports regression results of executive total compensation. The dependent variable is ln(tdc1 + 1), where tdc1 is the executive's total annual compensation from Execucomp. Each model includes industry and year fixed effects. All models are OLS regressions. Model 4 incorporates an errors-in-variables estimation that makes corresponding adjustments because the SURVEY_ATTRACTIVENESS_SCORE variable is measured with error. The measurement reliability is unknown, but we assume it to be at the lowest possible value. We estimate the assumed reliability to be just higher than the \(R^2\) from a regression of the SURVEY_ATTRACTIVENESS_SCORE on all other variables, including total compensation. Since this regression had an \(R^2\) of 0.247, we assumed a measurement reliability of this variable to be 0.25. SUM_OF_MASK_DEVIATIONS is the absolute sum of deviations from each of the 25 nodes on the Phi-Mask to the corresponding point on the executive's face. SURVEY_ASSESSMENT_OF_ATTRACTIVENESS (COMPETENCE) is the average rating from Mechanical Turk workers who rate each executive's attractiveness (competence) on a scale of 1 to 10, with 10 being the most attractive (competent). p-values are in parentheses beneath each coefficient estimate. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
In model 5, we include the Sum of Mask Deviations and find this measure to be significantly associated with total compensation. A 1-standard-deviation decrease (more attractive) in mask deviations (162.4 points or about 2.25 inches) is associated with a 20.5% increase in executive compensation, or about $1.3 million more than the average executive. This magnitude is similar to other studies that attempt to quantify the effects of difficult-to-measure CEO characteristics on pay. For example, it is similar to the 19% greater ($0.85 million) pay for a generalist CEO relative to that of a specialist CEO documented in Custodio, Ferreira, and Matos (2013). 12 Falato, Li, and Milbourn (2015) document that an improvement in their credential measures of press coverage or in a career fast track is associated with approximately $2.8 million higher pay. Similarly, an improvement in their college credential is associated with approximately $1 million higher pay (about 20% of their sample mean). These results suggest that the objective mask-based measure better captures fundamental aspects of executive attractiveness with less error than a subjective survey assessment. Next, in model 6, we include both the survey-based and the mask-based attractiveness measures. We continue to find no significant relation between the survey-based measure of attractiveness and compensation, whereas we do find a positive and significant coefficient estimate for the survey-based measure of competence and a negative and significant coefficient estimate for the Sum of Mask Deviations. Because the survey-based measure of competence and the mask-based measure of attractiveness are both measures of executive appearance, it is difficult to attribute the association with greater CEO pay to either characteristic.
For example, if both measures relate to aspects of appearance that affect CEO pay, such as personality traits stemming from the executive's genotype and reflected in the executive's appearance, then we cannot attribute the fact that appearance does affect CEO pay to either characteristic. In an effort to isolate the effect of attractiveness from the appearance of competence, in unreported results, we repeat this regression replacing the Sum of Mask Deviations with the residuals from a regression of the Sum of Mask Deviations on the Survey Competence Score, and we find the results are unchanged. While we cannot rule out that the mask may be correlated with other personality traits that could be reflected in an executive's genotype, these unreported results suggest that the effect of attractiveness on CEO pay is distinct from that of the appearance of competence on CEO pay.
In summary, due to the EIV problem associated with survey assessments of executive attractiveness, we are unable to identify a significant association between survey-assessed executive attractiveness and pay. In contrast, our objective measure of attractiveness, which is uninhibited by EIV concerns, provides a strong link between executive attractiveness and compensation. Together, these results reveal that our objective measure of facial appearance provides additional explanatory power beyond that provided by survey assessments of either attractiveness or competence.
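The errors-in-variables adjustment described in footnote 11 can be illustrated for the simple one-regressor case. This is a minimal sketch under stated assumptions, not the estimation behind model 4: the function names and toy data are hypothetical, the paper's models include additional controls and fixed effects, and the adjusted variance-covariance matrix is not covered here.

```python
import statistics


def sample_cov(x, y):
    """Sample covariance with an n - 1 denominator."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)


def eiv_adjusted_slope(x_observed, y, reliability):
    """Slope S_xy / (S_xx - sigma_u^2), with sigma_u^2 = (1 - r) * S_xx."""
    s_xx = sample_cov(x_observed, x_observed)
    s_xy = sample_cov(x_observed, y)
    noise_variance = (1.0 - reliability) * s_xx
    return s_xy / (s_xx - noise_variance)


# Hypothetical toy data: a noisy regressor and an outcome.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]

ols_slope = eiv_adjusted_slope(toy_x, toy_y, reliability=1.0)        # standard OLS
adjusted_slope = eiv_adjusted_slope(toy_x, toy_y, reliability=0.25)  # low reliability
```

In this one-regressor case the adjusted slope equals the OLS slope divided by the reliability, so lowering the assumed reliability from 1.0 toward 0.25 scales the coefficient up by as much as a factor of four. This matches the pattern described in the text, where the coefficient grows as the assumed reliability falls.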

B. Tournaments and Facial Attractiveness
In Table 4, we use regression models to explore determinants of an executive being selected as CEO. Models 1-5 report results from probit model regressions while models 6-7 report results using an OLS regression (linear probability model), with industry fixed effects. The dependent variable is 1 if the executive is appointed as CEO and 0 otherwise. Since we have multiple executives per firm-year observation we incorporate robust standard errors clustered by firm. We control for board membership, compensation rank (Mobbs and Raheja (2012)), having the title of President (Naveen (2006)), ownership (excluding options), age, gender, and race.
In model 1, we find that, consistent with Naveen (2006) and Mobbs and Raheja (2012), board membership, compensation rank, and holding the title of President each significantly increase the likelihood of winning the tournament. These characteristics are important because they are also measures of human capital accumulated by executives throughout their careers. It is noteworthy that, in unreported analysis, these three variables alone explain over 27% of the variation in the likelihood of selection to be the next CEO. We also find that female executives are associated with a lower likelihood of becoming the next CEO and Asian executives have a positive association with becoming the next CEO.
Our main question is whether either the survey-based or mask-based measure of executive attractiveness provides any incremental influence in executive promotion to CEO. We begin with the survey-based measure of attractiveness and report the results in model 2. The survey assessment does not relate significantly to the likelihood of an executive's appointment as CEO. Given the purported benefits of attractiveness in the labor market, this result is puzzling. However, as noted previously, the noise in this measurement and the associated EIV concerns may prevent this measure from identifying an important significant relationship with the likelihood of an executive being promoted to CEO. In model 3, we include the mask-based measure of attractiveness, which does not have the same concerns. Here we find a significant relation between executive attractiveness (lower mask deviations) and the likelihood of being selected as CEO.
Table 4 reports results from regression analyses of the likelihood of an executive being selected to be the next CEO. The dependent variable equals 1 if the board selects the executive to be the CEO in the following year and 0 otherwise. SUM_OF_MASK_DEVIATIONS is the absolute sum of deviations from each of the 25 nodes on the Phi-Mask to the corresponding point on the executive's face. SURVEY_ASSESSMENT_OF_ATTRACTIVENESS is the average rating from Mechanical Turk workers who rate each executive's attractiveness on a scale of 1 to 10, with 10 being the most attractive. The identified most likely CEO candidates are those who are younger than 65 and either on the board or ranked in the top 3 in pay. Models 1 to 5 are probit models. Models 6 and 7 are linear probability models and include Fama-French defined industry fixed effects. EXECUTIVE_IS_A_DIRECTOR is an indicator variable that equals 1 if the executive is also on the board during the year. COMPENSATION_RANK is the rank of the executive based on total compensation with 1 being the highest ranking (i.e., highest paid) executive. PRESIDENT is an indicator variable that equals 1 if the executive held the title of President in the prior year and 0 otherwise. OWNERSHIP is the percentage of shares outstanding owned by the executive, excluding options. AGE is the executive's age in the year the board selects the CEO. Standard errors are robust and clustered by firm. p-values are reported in parentheses beneath each coefficient estimate. *, **, and *** indicate significance at the 10%, 5%, and 1% levels, respectively.
In the column to the right of the coefficient estimates of model 3, we report the marginal effects for our sample. A 1-standard-deviation decline (more attractive) in the Sum of Mask Deviations is associated with a 0.423 increase in the probability of the executive being selected as CEO. This economic impact is quite large as the unconditional probability of an executive being appointed CEO is 0.18, or about 1 of 5 executives. The effect of attractiveness is also large relative to other characteristics. For example, it represents an effect more than twice that of a 1-standard-deviation increase in compensation rank (lower number).
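For readers less familiar with probit marginal effects, the sketch below shows how a probit coefficient translates into a change in the selection probability. It is illustrative only: the index value and coefficient are hypothetical rather than estimates from Table 4, and the text does not state whether the reported marginal effects are evaluated at sample means or averaged over observations.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution used by the probit link


def probit_probability(index):
    """Probability of selection given the linear index x'b."""
    return STD_NORMAL.cdf(index)


def marginal_effect(index, coefficient):
    """Derivative of the probability with respect to one regressor."""
    return STD_NORMAL.pdf(index) * coefficient


def effect_of_change(index, coefficient, delta):
    """Discrete change in probability when the regressor moves by delta."""
    return probit_probability(index + coefficient * delta) - probit_probability(index)


# Hypothetical values: a negative coefficient on a standardized deviation measure,
# evaluated at an index of zero, for a 1-standard-deviation decline (delta = -1).
example_effect = effect_of_change(index=0.0, coefficient=-0.5, delta=-1.0)
example_slope = marginal_effect(index=0.0, coefficient=-0.5)
```

Because the normal density is largest at an index of zero and shrinks in the tails, the same coefficient implies a smaller probability change for executives whose baseline probability is already close to 0 or 1.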
Introducing this measure of attractiveness to the model also notably increased the explanatory power from a pseudo-\(R^2\) of 34.59% in model 2 to 65.54%. To explore this relation further, in Figure 3, we plot the fraction of executives appointed to the CEO position in each Sum of Mask Deviation decile. A 1-standard-deviation increase in attractiveness (decrease in Sum of Mask Deviations) is equivalent to a drop of two and a half to three deciles. Greater fractions of the executives in the more attractive (smaller) deciles become the next CEO. Interestingly, the most attractive executives, those in the lowest two deciles, all become CEOs. To be sure these most attractive executives are not driving our results, we repeat the analysis excluding the executives in the bottom two deciles (most attractive). We report the results in model 4, with the corresponding marginal effects in the column to the right of the coefficient estimates. In this smaller subsample, the unconditional probability of an executive being appointed CEO is only 0.04, or 1 in 25 executives. We find results similar to those for the full sample in model 3. The Sum of Mask Deviations is still negative and significant at the 1% level, though the magnitude of the marginal effect drops to a 0.068 increase in the probability of becoming CEO for a 1-standard-deviation increase in attractiveness. This economic effect is also closer to that of being a director or having a higher compensation rank, albeit still slightly stronger. The marginal effect for being a director indicates that if the executive is a director, the likelihood of becoming CEO increases by 0.038, almost doubling the unconditional probability.
Fraction of Executives Selected to Be the Next CEO by Phi-Mask Deviation Decile
Figure 3 shows the fraction of executives selected as CEO within each Phi-Mask deviation decile, with decile 9 being those with the greatest deviation, and decile 0 being those with the least deviation. Axis label: Fraction of CEO Appointments.
The marginal effect for a 1-standard-deviation improvement in compensation ranking is associated with an increase of 0.045 in the likelihood of being selected as CEO.
The coefficient for holding the title of President suggests an economic effect of about half that of a compensation rank increase. In model 5, we further exclude the executives in the bottom three deciles of Sum of Mask Deviations and find very similar results to those in model 4. Thus, although the coefficient estimate from the full sample in model 3 may be biased upward due to the concentration of many attractive executives who are appointed CEO, the results are not solely driven by this skewness. In summary, facial beauty has a significant statistical and economic impact on tournament succession outcomes. Next, in model 6, we repeat model 3 using a linear probability model to be sure that the probit specification is not driving this finding (Caudill (1988)). We find consistent results with similar economic magnitudes. In model 7, we exclude the bottom two deciles of Sum of Mask Deviations and continue to find similar results, but a lower economic impact from the Sum of Mask Deviations. As for the additional executive characteristics, we find that ownership relates positively to the likelihood of selection in one of the models. Executive age does not affect the likelihood of being selected CEO. The Female indicator is negative and significant in three of the models. Finally, the Asian and African American indicators are positive in all models. However, the African American indicator is significant in only four models, whereas the Asian indicator is significant in all models. 13, 14

C. Shareholder Reaction
Next, we consider how shareholders respond to the attractiveness of the newly appointed CEOs. In Table 5, we examine the determinants of the 3-day cumulative abnormal return (CAR) surrounding the announcement of the 100 newly appointed CEOs. Since we have only one observation per firm and several firms per industry, we use robust standard errors clustered by the Fama-French-49 industry classifications. In model 1, the only explanatory variable is the survey-based measure of the newly appointed CEO's attractiveness. We find that it is positive, but insignificantly related to shareholders' 3-day CAR. In model 2, we add the mask-based measure of attractiveness and find a significant relation between this measure and shareholders' reactions. A decrease in the Sum of Mask Deviations (improvement in facial attractiveness) by 1 standard deviation is associated with a 0.58 percentage point increase in shareholder reaction. This finding underscores the importance of having an objective measure less encumbered by noise that is able to detect evidence that the market indeed incorporates this qualitative information.
13 In unreported analysis, we also include a control for the intensity of the tournament incentives, measured as the compensation gap between the CEO and the median non-CEO executive. Greater tournament incentives could increase the value of attractiveness in winning the tournament. However, we find no evidence of tournament incentives strengthening or weakening the effect of executive attractiveness on succession likelihood.
14 Kamiya, Kim, and Park (2018) find that the facial width to height ratio (fWHR) of the CEO predicts the riskiness of the firm. To control for this effect, we use the fWHR of the Phi Mask as our baseline. For the Phi Mask, the fWHR is the inverse of the golden ratio (0.618). For each executive in our sample we determine if their fWHR is greater or lesser than that for the Phi Mask. We measure the (x,y) deviations from the far left/right nodes and the top/bottom nodes of the executive's face and create an indicator variable that equals one if the fWHR is greater than that of the Phi Mask as a possible determinant for the likelihood of being selected as CEO. We find that 81% of the 237 executives in our sample have a fWHR greater than the Phi Mask and 70% of the 100 newly selected CEOs have a fWHR greater than the Phi Mask. However, we find that this variable is negatively related to the likelihood of being selected as CEO, but it is not quite significant (p-value = 0.11). The coefficient estimate for mask deviations remains significantly negative at the 1% level. This result is consistent with Kamiya et al. (2018), who find that executives with high fWHRs are less likely to be appointed as the CEO after the dismissals of CEOs with financial misconduct.
In model 3, we include the number of press releases the executive was responsible for in the 3 years prior to the CEO appointment announcement (Greene and Smith (2021)). The coefficient estimate is positive and significant at the 10% level. While this association is consistent with shareholders valuing the experience of publicly representing a firm, it is also consistent with shareholders being more familiar with these candidates and thus being more certain of their ability at their appointment announcement. If firms are more inclined to select more attractive candidates to speak to the press, the shareholder reaction could be due to greater familiarity rather than attractiveness. However, in unreported results, we find there is no significant difference between the Sum of Mask Deviations for CEOs in the top and bottom quartiles of press releases. Furthermore, even after controlling for the number of press releases, the coefficient estimate for the Phi-Mask deviations remains essentially unchanged from that in model 2.
In model 4, we also control for the age of the newly selected CEO and find a positive and significant coefficient estimate, consistent with shareholders responding more favorably to the selection of older more experienced CEOs. Again, the masked-based measure of executive facial attractiveness remains statistically significant at the 5% level. In addition, the survey assessment of facial attractiveness, which is subject to age-related biases and greater noise in its measurement, is now Shareholder Reaction to CEO Appointment Announcement Table 5 reports regression results of the 3-day cumulative abnormal return (CAR) for the [À1,1] day window around the announcement of a new CEO appointment for the 100 CEO tournament winners. We estimate the market model using the value-weighted CRSP index as a proxy for the market returns over days [À201,À10]. We calculate the abnormal return for each day in the event window by subtracting the expected return (market model) from the actual return. SUM_OF_MASK_ DEVIATIONS is the sum of the deviations from the 25 nodes on the Phi-Mask to the actual point on the CEO's face in points as measured using Adobe Photoshop. NUMBER_ OF_PRESS_RELEASE_STATEMENTS is the number of times the executive is quoted in a press release in the 3 years prior to being appointed as CEO. Standard errors are robust and clustered by the Fama-French 49 defined industries. p-values are in parentheses beneath each coefficient estimate. *, **, and *** indicate significance at the 10%, 5%, and 1% levels respectively. only marginally insignificant at traditional levels (p-value = 0.11). Nonetheless, the flaws with a survey-based measure of attractiveness that lead to the EIV concern continue to inhibit its ability to identify significant economic relations between attractiveness and shareholder value. 
In summary, the evidence in Table 5 reveals that shareholders attribute value to CEO attractiveness but this value is difficult to detect when using noisy survey-based assessments.
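The event-study calculation described in the notes to Table 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the synthetic return series are hypothetical, the market model is a single-factor OLS regression of stock returns on market returns, and the windows default to the [−201,−10] estimation window and the [−1,1] event window reported in the table notes.

```python
from statistics import mean


def fit_market_model(stock, market):
    """OLS fit of stock = alpha + beta * market; returns (alpha, beta)."""
    m_bar, s_bar = mean(market), mean(stock)
    sxx = sum((m - m_bar) ** 2 for m in market)
    sxy = sum((m - m_bar) * (s - s_bar) for m, s in zip(market, stock))
    beta = sxy / sxx
    alpha = s_bar - beta * m_bar
    return alpha, beta


def cumulative_abnormal_return(stock, market, days,
                               est=(-201, -10), event=(-1, 1)):
    """Sum of abnormal returns over the event window.

    stock, market, and days are parallel lists; days holds trading days
    relative to the announcement (day 0).
    """
    # Fit the market model on the estimation window only.
    est_idx = [i for i, d in enumerate(days) if est[0] <= d <= est[1]]
    alpha, beta = fit_market_model([stock[i] for i in est_idx],
                                   [market[i] for i in est_idx])
    # Abnormal return = actual return minus market-model expected return.
    car = 0.0
    for i, d in enumerate(days):
        if event[0] <= d <= event[1]:
            car += stock[i] - (alpha + beta * market[i])
    return car


# Synthetic example (hypothetical data): the stock follows the market model
# exactly, plus a 1% abnormal return on each of the three event days.
days = list(range(-201, 2))
market = [0.001 * ((d % 7) - 3) for d in days]
stock = [0.0002 + 1.2 * m + (0.01 if -1 <= d <= 1 else 0.0)
         for d, m in zip(days, market)]
print(round(cumulative_abnormal_return(stock, market, days), 6))  # 0.03
```

Because the synthetic stock has no noise in the estimation window, the fit recovers the assumed alpha and beta, and the CAR equals the three injected 1% abnormal returns.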

D. New CEO Chair Appointment
Next, we use our measures of CEO attractiveness to examine another important promotion, the appointment of the CEO to board chair. Prior literature in economics and psychology suggests that attractiveness may facilitate faster promotions as attractive individuals are effective persuaders (e.g., Aronson and Mills (1965)). It is not clear whether CEO-Chair duality is beneficial for the firm, but it is clear that CEOs benefit from the combined role by having greater control over the board. Thus, it is reasonable to expect that all CEOs would desire this duality.
In addition to our measures of attractiveness, we also control for firm size, abnormal stock performance over the CEO's appointment year, an indicator if the new CEO is over 55 years old, an indicator if the new CEO was previously the firm President, and a measure of tournament strength. Larger firms can have greater information asymmetry between managers and the board, which makes combining the CEO and chair role informationally efficient (Brickley, Coles, and Jarrell (1997)). Conversely, boards of larger firms may require a longer "evaluation" period of new CEOs before appointing them as board chair, and outgoing CEOs may retain the chair title for an extended period (Mobbs (2015)). First-year performance reflects the new CEO's talent and ability in the position. A poor showing can delay the joint CEO-chair appointment, whereas a strong first-year performance can expedite the appointment as CEOs prove their ability to the board. Older CEOs are likely to have more experience and, thus, be better prepared to handle the additional responsibilities associated with the chair position, whereas younger CEOs may require more time to prove their ability to the board. Finally, because the type of succession can influence the time to chair appointment, we include two succession controls. First, since "passing the baton" successions can carry expectations about when the new CEO will also become the chair (Vancil (1987)), we control for whether the chosen CEO held the title of president prior to the succession. In addition, we control for the strength of the tournament incentives with the percentage gap between the prior CEO's compensation and the median compensation of the tournament competitors. 15 In Table 6, we report results from Cox Proportional Hazard models where the dependent variable equals the number of days until the new CEO receives the chair position of the board.
Since we only have one executive observation for each firm year but multiple firms per industry, we use robust standard errors clustered by industry. In model 1, we report the hazard ratios from the parameter estimates for a model using only the control variables. We find that stock performance and the Previous President indicator both relate significantly to the likelihood of becoming chair. In Cox Proportional Hazard models, explanatory variables with a hazard ratio greater than 1 imply a greater "risk" or "hazard," whereas hazard ratios less than 1 imply a lower "hazard." In our context, the "hazard" is appointment to board chair. The hazard ratio for stock returns, which is greater than 1, indicates that better stock returns hasten the "hazard," that is, increase the likelihood that the CEO receives the board chair sooner. Conversely, the hazard ratio for the Previous President indicator, which is less than 1, indicates that CEOs holding the title of President before their CEO appointment face a lower "hazard" rate, or a lower likelihood that they will receive the board chair appointment on a given date. Firms implementing a "pass the baton" succession plan, where the heir receives the title of President before becoming the CEO, often allow the departing CEO to hold the board chair position for an extended period after the CEO transition occurs (Vancil (1987)).
In model 2, we include the survey-based measure of CEO attractiveness and find no evidence that attractiveness affects the hazard rate of the CEO reaching the board chair position. In model 3, we incorporate the mask-based objective measure of CEO attractiveness. Here we find evidence that attractiveness does affect the timeline of the CEO's appointment to the board chair. The significant hazard ratio on Sum of Mask Deviations of 0.823 indicates that less attractive CEOs have a significantly lower "hazard" rate for becoming board chair, implying that more attractive CEOs have a significantly higher "hazard" rate. Economically, a 1-standard-deviation increase in mask deviations (less attractive) is associated with a 17.7% (1 − 0.823) reduction in the expected hazard, in other words, a reduction in the likelihood of becoming chair on a given date since CEO appointment. 16 Lastly, we compute the Schoenfeld residuals to test the proportional-hazards assumption in this model. The p-values for all of the coefficients and for the global test are 0.15 or higher. Thus, there is no evidence of a nonproportional hazard. In other words, there is no evidence that the effects of the covariates on the hazard vary over time, which would limit the effectiveness of the model and thus our inferences.

Likelihood of the CEO Being Awarded the Chair Near CEO Appointment
Table 6 reports the results of regression analyses on the likelihood of the new CEO being appointed as chair of the board using Cox Proportional Hazard models. The dependent variable, DAYS_TO_CHAIR, is the number of days until the CEO is appointed as Chair. SUM_OF_MASK_DEVIATIONS is the absolute sum of deviations from each of the 25 nodes on the Phi-Mask to the corresponding point on the executive's face. ABNORMAL_STOCK_PERFORMANCE is the 12-month compounded return for the fiscal year less the 12-month compounded return of the market. PRESIDENT is an indicator variable that equals 1 if the executive held the title of President in the prior year and 0 otherwise. Standard errors are robust and clustered by industry. We report p-values in parentheses beneath each coefficient estimate. *, **, and *** indicate significance at the 10%, 5%, and 1% levels respectively.

In summary, using the objective mask-based measure of CEO attractiveness, we find that CEOs that are more attractive gain the board chair significantly faster. This outcome improves our understanding of the CEO-Chair duality decision and is consistent with attractiveness enabling the CEO to assert greater influence over the board of directors. However, these results hold only when using the objective measure of attractiveness and not when using the survey-based measure.
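The arithmetic behind the hazard-ratio interpretation above can be checked with a short sketch. It assumes only the standard Cox relation that a hazard ratio is the exponentiated coefficient, so negating a covariate inverts its hazard ratio; the function names are hypothetical and the only input taken from the paper is the reported ratio of 0.823.

```python
import math


def pct_change_in_hazard(hazard_ratio):
    """Percent change in the hazard for a one-unit increase in the covariate."""
    return (hazard_ratio - 1.0) * 100.0


def hazard_ratio_after_sign_flip(hazard_ratio):
    """Hazard ratio when the covariate is negated: exp(-beta) = 1 / exp(beta)."""
    return 1.0 / hazard_ratio


hr_mask = 0.823            # reported hazard ratio on Sum of Mask Deviations
beta = math.log(hr_mask)   # implied Cox coefficient (negative)

print(round(pct_change_in_hazard(hr_mask), 1))          # -17.7
print(round(hazard_ratio_after_sign_flip(hr_mask), 3))  # 1.215
```

The inverted ratio of roughly 1.215 matches, up to rounding, the 1.22 that the paper reports when the sign of the mask deviations is reversed so that higher values mean greater attractiveness.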

V. Conclusions
Survey assessments of physical characteristics such as facial attractiveness are subject to measurement error, which biases estimates of any association between the measured characteristic and related outcomes. We address this limitation of survey-based measures of facial attractiveness by introducing and utilizing a scientifically based measure of facial beauty to study the role of executive facial attractiveness. We use the Phi-Mask to measure the facial attractiveness of over 230 executives competing for CEO positions in 100 large publicly traded companies. The Phi-Mask is based on the esthetically pleasing aspects of the golden ratio, which is supported by historical evidence in a variety of contexts dating back over 2,000 years and by scientific research, cited in our paper, dating back over 100 years. Since this scientifically based measure is free from the noise that affects subjective survey assessments (e.g., differences in age, culture, or familiarity between survey participants and the board of directors), we are better able to identify empirically the positive link between executive attractiveness and labor market outcomes.
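The bias that measurement error induces can be illustrated with a small simulation. The sketch below is not the authors' analysis: it assumes a single regressor, classical (independent, mean-zero) measurement noise, and hypothetical parameter values. Under those assumptions the OLS slope on the noisy measure shrinks toward zero by the reliability ratio var(x) / (var(x) + var(noise)).

```python
import random


def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def simulate_attenuation(n=20000, beta=1.0, noise_sd=1.0, seed=0):
    """Return (slope using the true regressor, slope using a noisy proxy)."""
    rng = random.Random(seed)
    true_x = [rng.gauss(0.0, 1.0) for _ in range(n)]           # latent trait
    y = [beta * x + rng.gauss(0.0, 0.5) for x in true_x]       # outcome
    noisy_x = [x + rng.gauss(0.0, noise_sd) for x in true_x]   # noisy proxy
    return ols_slope(true_x, y), ols_slope(noisy_x, y)


slope_true, slope_noisy = simulate_attenuation()
# With var(x) = 1 and var(noise) = 1 the reliability ratio is 0.5, so the
# slope on the noisy proxy should be roughly half the slope on the true value.
print(slope_true, slope_noisy)
```

Raising `noise_sd` pushes the estimated slope on the noisy proxy further toward zero, which is the sense in which a noisier measure makes a true relation harder to detect.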
Similar to the benefits attributed to attractiveness in the general population, using this new technology we are able to identify that more attractive executives receive higher compensation and are more likely to be promoted to CEO. Furthermore, CEOs that are more attractive are associated with a more favorable shareholder reaction to the news of their appointment and are associated with being appointed as chair of the board faster. Survey assessments of executive attractiveness, which are much noisier due to inherent biases, are not able to identify these same relations.
In summary, using a more objective standard of attractiveness reveals that executive facial attractiveness can be a distinguishing trait that is important in executive labor market advancement. Moreover, the objectivity of the new method of measuring elemental aspects of beauty used in this study opens avenues for further research in other diverse fields of scientific inquiry including investments, economics, psychology, politics, and medical studies of reconstructive surgery.

16 Conversely, when we use the negative of the sum of mask deviations such that the more attractive CEOs have higher values (less negative), the coefficient estimate of the hazard ratio is 1.22. This indicates that a 1-standard-deviation increase in attractiveness is associated with a 22% higher hazard rate, or likelihood of being appointed to chair on a given date. Alternatively, when we repeat the analysis using an indicator for being in the bottom quartile of mask deviations (most attractive quartile) the coefficient estimate is 1.46, which suggests that a CEO in the top quartile of attractiveness has a 46% greater hazard rate compared to other CEOs.

Supplementary Material
Supplementary Material for this article is available at https://doi.org/10.1017/S0022109022000461.