The foreign language effect on motivational quotes

According to the "reduced emotionality hypothesis", we are less emotionally driven when reasoning in a foreign language (FL) than in a native language (NL). We examined whether this foreign language effect (FLe) extends to the way we perceive motivational quotes (i.e., encouraging slogans conveying a profound and inspirational message): we expected FL participants to rate motivational quotes as less profound than NL participants. Strikingly, we observed the opposite: FL participants found motivational quotes more profound than NL participants, even after controlling for potential confounders (e.g., IQ, reasoning style). Both FL and NL participants gave similarly low profundity ratings to pseudo-profound bullshit sentences (i.e., meaningless sentences sounding profound), indicating that the message must be meaningful for the FLe to arise. We propose that, like space or time, language could promote psychological distance. This favours a focus on the background of a message to indicate profoundness.

The origin of the FLe remains unclear. According to the "reduced emotionality hypothesis" (Costa et al., 2014a; Keysar et al., 2012), FL use lowers the intensity with which the emotional content of linguistic material is processed. Presumably, this occurs because the NL is linked to the emotions that arise when one experiences events early in life (Costa et al., 2014a). This means that associated emotions are retrieved when linguistic expressions commonly used in similar situations are processed (at least to some extent). FL use is not burdened by these early links, and as such, does not retrieve the same emotional states. The implication of this view is that making a decision involving an affective component is less distressing in an FL than in an NL. This could explain why it is possible to make less emotionally driven decisions in contexts entailing an affective component when using an FL. Some authors, however, have raised the concern that a potential "cognitive load effect", rather than reduced emotionality, is what may actually drive the FLe (Keysar et al., 2012). This concern is based on the assumption that using an FL increases cognitive load because it is costlier to process and also on the observation that cognitive load promotes deliberative reasoning (Greene, Morelli, Lowenberg, Nystrom & Cohen, 2008). It follows from these two premises that FL use leads to more utilitarian and/or less intuitive decisions due to the cognitive load associated with processing linguistic material (e.g., choices will be less conditioned by moral guidelines or fear of loss; Keysar et al., 2012). However, this concern has been addressed in prior studies that showed no correlation between the dependent variable (e.g., utilitarian responses) and continuous measures of FL proficiency level (e.g., Circi et al., 2021; Corey et al., 2017).
Moreover, the fact that different studies have shown no FLe in different cognitive biases (e.g., the tendency to consider a decision as more appropriate if it has resulted in a good outcome) that do not involve emotionality also argues against the cognitive load concern (e.g., Vives, Aparici & Costa, 2018). This is because one would expect the higher deliberative reasoning promoted by an FL's higher cognitive load to reduce such cognitive biases.
Unfortunately, although the "reduced emotionality hypothesis" is widely accepted in the field, it is difficult to test it experimentally because its foundations remain vague. This has led to most authors in the field focusing on exploring the phenomenon itself (i.e., the situations in which the FLe arises). Nevertheless, this line of research is not exempt from challenges. A concern that researchers of the FLe phenomenon must face is that most situations used in its assessment lack ecological validity. For instance, many dilemmas involve extreme situations that one may never encounter (e.g., having to kill someone) or habits in which some do not engage (e.g., gambling). This raises the question of whether the FLe could be observed in more common circumstances that do not necessarily require a choice. Hadjichristidis, Geipel and Surian (2019) observed that the FLe is also present in a context that may drive everyday behaviour to a lesser or greater extent in the population: superstitious beliefs. They found that FL reduces the influence of superstitious beliefs on the (positive and negative) feelings prompted by imaginary actions (e.g., finding a four-leaf clover in the grass or seeing a falling star in the sky will be less associated with good luck in an FL scenario; walking under a ladder or breaking a mirror will cause less fear of bad luck in an FL context).
In this study, we aimed to contribute to identifying ecological situations in which the FLe takes place. To this end, we assessed whether the FLe reduces the influence of a phenomenon that is increasingly present in everyday situations: MOTIVATIONAL QUOTES. These positive and encouraging slogans are ubiquitous in social media (Twitter, Facebook, Instagram) and many other social-professional contexts, including working environments, gyms, and language schools (e.g., "Your teacher can open the door, but you must enter by yourself"). Because motivational quotes prompt positive affect, they have been progressively integrated in clinical and educational settings, such as therapeutic programs, to promote positive thinking and self-esteem. For instance, they have been shown to increase confidence, motivation, and satisfaction in adults suffering anxiety, depression, or stress (Bedrov & Bulaj, 2018), especially in treatments taking a positive psychology approach (Kour, El-Den & Sriratanaviriyakul, 2019). However, individual differences in sociocultural background or general intelligence seem to modulate the influence of motivational quotes (Pennycook, Cheyne, Barr, Koehler & Fugelsang, 2015). We believe that, among the possible variables modulating the effects of motivational quotes, language should be considered. These quotes are carefully worded to sound profound and create an affective impression that prompts behaviour according to the value transmitted by the quote. In other words, they are written to help individuals adopt relevant attitudes at a personal level (e.g., effort, resilience, perseverance). Interestingly, the multilingualism that characterizes current society means that many people frequently encounter these messages in a non-native language.
From the perspective of the "reduced emotionality hypothesis", the efficiency of a motivational quote may be reduced if it is presented in an FL rather than an NL because its affective essence is processed differently. Of note, motivational quotes are positive emotion-laden sentences, which contrasts with the fact that most previous FLe literature has used linguistic material that elicits negative affect (see Del Maschio et al.'s (2022) and Circi et al.'s (2021) recent meta-analyses). However, in line with the results of Hadjichristidis et al. (2019) with positive superstitions, for which they found an FLe similar to that for negative superstitions, we expect to find the abovementioned reduction in the efficiency of motivational quotes.
Thus, given the results of most previous studies and the "reduced emotionality hypothesis", the most straightforward prediction is that motivational quotes will be perceived as less profound if they are presented in an FL compared to an NL. As in prior literature investigating the receptivity of motivational quotes (e.g., Pennycook et al., 2015), we consider a quote profound when its meaning extends below the surface by having a broad significance that involves a transcendental value. Therefore, in the current study, we investigate whether FL use reduces the extent to which motivational quotes are perceived as profound or transcendental. To test this prediction, we asked participants to rate the profundity of ten motivational quotes using a Likert scale ("1 = not at all profound" to "5 = very profound"). Half of the participants performed the task in their NL and the other half in their FL. Confirmation of our hypothesis would follow if participants in the FL group rated motivational quotes as less profound than participants in the NL group.
A potential concern might be that FL participants could give lower profundity values to motivational quotes because of mild comprehension difficulties preventing them from fully grasping the meaning of a quote. Therefore, we also introduced ten sentences considered "pseudo-profound bullshit": that is, grammatically correct sentences that are void of content, but that are written to impress and that pretend to convey a positive, encouraging, and deeply transcendental message (e.g., "Hidden meaning transforms unparalleled abstract beauty") (Bainbridge, Quinlan, Mar & Smillie, 2019; Gligorić & Vilotijević, 2020; Pennycook et al., 2015). These pseudo-profound sentences can be considered a type of motivational quote that lacks a clear meaning, and for which the pomposity of their wording results in many individuals perceiving them as profound (Pennycook et al., 2015). In fact, they are commonly used in social media. For instance, Pennycook and Rand (2020) demonstrated that the inclination to share fake news on these media was positively associated with pseudo-profound bullshit receptivity (that is, the predisposition to attribute profundity to these sentences). We applied the same rationale as we did with the motivational quotes: FL users would be less impressed by the pomposity of the wording and, hence, less inclined to rate pseudo-profound bullshit sentences as profound compared to NL participants. However, the main reason for including these pseudo-profound bullshit sentences was to remove the underlying meaning and assess only the extent to which their wording was able to impress the participant. Hence, we presented the motivational quotes and pseudo-profound bullshit sentences randomly, with participants being unaware that they were rating motivational quotes and pseudo-profound bullshit sentences. Observing an FLe in both sentence types would reduce concern about mild comprehension difficulties.
That is, in the case of pseudo-profound bullshit sentences, the potential comprehension difficulties of FL participants would not exert any detrimental effect on grasping the meaning compared to NL participants. This is because pseudo-profound bullshit sentences already have unclear significance, which means that neither NL nor FL participants will be able to grasp a clear meaning from them. Hence, the two types of participants are expected to give their profundity ratings to pseudo-profound bullshit sentences solely guided by their impressive wording. Therefore, the concern that the lower profundity ratings that the FL participants are expected to give to motivational quotes are driven by mild comprehension deficits would be reduced if the same participants also show reduced profundity ratings to the pseudo-profound bullshit sentences.
2 Barbara Braida et al.
Of note, individual differences regarding receptivity to motivational quotes and pseudo-profound bullshit seem to exist. Pennycook et al. (2015) observed that such receptivity tends to correlate negatively with general intelligence and deliberative reasoning. Therefore, we included additional questionnaires in our study to control for potential confounding by these two variables: Raven's Progressive Matrices and the Cognitive Reflection Test (CRT). Finally, FL users completed an English proficiency self-assessment questionnaire and an English lexical task. The lexical task was used to obtain an objective measure of estimated proficiency. We used this measure to control for the potential cognitive load effects associated with the presumably higher cognitive demands of processing linguistic material in an FL compared to an NL.

Participants
In total, 115 female participants gave written consent to participate in exchange for course credits. The study was granted ethical approval by the Bioethical Committee of the University of Barcelona (Institutional Review Board 00003099). All participants lived and studied in Barcelona (Spain) at the time the study was conducted and were Catalan-Spanish bilinguals with a native level in both languages. Given that one of the two languages may be considered more dominant than the other (even if only slightly), the 57 participants in the NL group completed the study in Spanish or Catalan depending on their self-reported dominant language. Another 58 participants completed the study in English as an FL. There were no between-group differences in mean age (NL group = 20.63, SD = 3.7; FL group = 19.91, SD = 1.71; t = 1.34, p > .18) or educational attainment (all were undergraduates or masters' students at the Faculty of Psychology of the University of Barcelona).
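As a rough arithmetic check on the age comparison above, the reported t value can be approximately recovered from the summary statistics alone. The sketch below assumes an independent-samples t-test; the text does not state whether a pooled-variance or a Welch test was used, so both are computed. The function names are illustrative and are not taken from the study's analysis scripts.

```python
import math

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's t for two independent groups, assuming equal variances."""
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

def welch_t(m1, sd1, n1, m2, sd2, n2):
    """Welch's t for two independent groups, allowing unequal variances."""
    return (m1 - m2) / math.sqrt(sd1**2 / n1 + sd2**2 / n2)

# Reported age statistics (mean, SD, n): NL group and FL group.
nl = (20.63, 3.7, 57)
fl = (19.91, 1.71, 58)

t_pooled = pooled_t(*nl, *fl)
t_welch = welch_t(*nl, *fl)
print(round(t_pooled, 2), round(t_welch, 2))
```

Both forms round to the reported t = 1.34, so the summary statistics are consistent with the reported test under either assumption.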
Participants in the FL group reported having started acquiring English through classroom instruction at around age 8 years. Most had a Common European Framework (CEF) English proficiency level of "upper intermediate" (CEF = B2) or "upper & lower advanced/proficient" (CEF = C1 & C2). We estimated this proficiency level through the Lexical Test for Advanced Learners of English, LexTALE (Lemhöfer & Broersma, 2012), which has consistently been found to correlate with CEF proficiency. Participants also completed a self-rated English proficiency questionnaire in which they used a 7-point Likert-type scale to assess competence in reading, writing, speaking, and comprehension (1 = low proficient, 7 = highly proficient). Congruent with the CEF levels estimated by LexTALE, the mean self-rating value in each language domain (including the understanding capacity for material written in English, "reading") approximated lower-upper intermediate proficiency levels (see Table 1).

Material and procedure
All participants rated the motivational quotes and pseudo-profound bullshit sentences. These ratings were given by completing the Motivational and Pseudo-profound Bullshit Scale, which was presented at the beginning of the study. Next, participants completed the control tasks and questionnaires in the following order: LexTALE, CRT, Raven's Progressive Matrices, and the English proficiency self-assessment questionnaire. This protocol was created and presented via Qualtrics.

Motivational and Pseudo-profound Bullshit Scale
Participants were presented with ten motivational quotes and ten pseudo-profound bullshit sentences. The former comprised meaningful and encouraging statements (e.g., "The creative adult is the child who survived"), while the latter comprised combinations of vague buzzwords composing a valid syntactic structure that lacked any meaning (e.g., "Wholeness quiets infinite phenomena"). All sentences were taken from Pennycook et al. (2015). These authors selected the motivational quotes from the internet. The pseudo-profound bullshit sentences were selected from two generator websites. One of these websites (http://wisdomofchopra.com) creates pseudo-profound bullshit sentences by randomly combining a list of words used in tweets by Deepak Chopra, which have been categorized as nonsensical by many (e.g., Shermer, 2010). The other website (The New Age Bullshit Generator, http://sebpearce.com/bullshit/) works in a similar way to the first one but uses a list of buzzwords compiled by its author (Seb Pearce). Pennycook et al. reported good internal consistency for both sub-scales (motivational quotes and pseudo-profound bullshit). In addition, the fact that the mean ratings for the pseudo-profound bullshit sub-scale were lower than those for the motivational sub-scale was taken as a measure of the scale's sensitivity. This is because the fact that pseudo-profound bullshit sentences are void of content necessarily implies a lower perception of profundity compared to sentences with an actual underlying meaning (i.e., motivational quotes).
Participants were asked to rate the profundity of the sentences on 5-point Likert scales (1 = not at all profound, 5 = very profound). We used the same instructions as those reported by Pennycook et al. (2015): "We are interested in how people experience the profound. Below are a series of statements taken from relevant websites. Please read each statement and take a moment to think about what it might mean. Then please rate how 'profound' you think it is. Profound means 'of deep meaning; of great and broadly inclusive significance'."
*N = adds up to more than the total number of the subjects because some indicated more than one place of exposure. ** Other = academies, audiovisual media, job, social media, etc.
Bilingualism: Language and Cognition 3
All 20 sentences appeared at once and remained visible on the same screen until the scale was completed, with the two sentence types intermixed in random orders that differed among participants (a full list of items for the Motivational and Pseudo-profound Bullshit Scale can be found in Appendix A). We used the mean profundity rating given by each participant to each type of sentence: the MOTIVATIONAL QUOTE RECEPTIVITY (MQR) SCORE was used to measure receptivity to motivational quotes, and the BULLSHIT RECEPTIVITY (BSR) SCORE was used to measure receptivity to pseudo-profound bullshit.
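The scoring rule just described (one mean rating per participant and sentence type) can be illustrated with a minimal sketch. The item identifiers, the "MQ"/"BS" labels, and the function name are hypothetical, and the example uses three items per type rather than the ten in the actual scale.

```python
from statistics import mean

def receptivity_scores(ratings, item_types):
    """Return (MQR, BSR): one participant's mean rating per sentence type.

    ratings    -- dict mapping item id to a 1-5 profundity rating
    item_types -- dict mapping item id to "MQ" (motivational quote)
                  or "BS" (pseudo-profound bullshit)
    """
    mq = [r for item, r in ratings.items() if item_types[item] == "MQ"]
    bs = [r for item, r in ratings.items() if item_types[item] == "BS"]
    return mean(mq), mean(bs)

# Hypothetical participant; item ids and ratings are invented for illustration.
item_types = {"q1": "MQ", "q2": "MQ", "q3": "MQ",
              "b1": "BS", "b2": "BS", "b3": "BS"}
ratings = {"q1": 4, "q2": 5, "q3": 3, "b1": 2, "b2": 3, "b3": 1}

mqr, bsr = receptivity_scores(ratings, item_types)
print(mqr, bsr)
```

Because the two sentence types were intermixed on screen, the scoring relies only on each item's type label, not on presentation order.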

Cognitive Reflection Test
We used the multiple-choice version of this task (Sirota & Juanchich, 2018) to assess the reasoning style of participants (intuitive vs. deliberative). This tested the ability of a participant to suppress a prepotent (but incorrect) intuitive response and engage in cognitive reflection to resolve a set of mathematical word problems (seven questions in this study). Participants had to choose between four options (Frederick, 2005; Sirota & Juanchich, 2018): the correct answer (one point each, giving a "CRT-reflective score"), the incorrect intuitive answer (one point each, giving a "CRT-intuitive score"), as well as two other incorrect answers (Frederick, 2005; Pennycook, Cheyne, Koehler & Fugelsang, 2016; Shenhav, Rand & Greene, 2012; Sirota & Juanchich, 2018). As we sought a measure of the reasoning style that was not influenced by a potential FLe, all participants (including those in the FL group) completed the CRT in their NL. The two groups did not differ in either the CRT-INTUITIVE SCORE (NL mean = 3.07, SD = 1.68; FL mean = 3.26, SD = 1.76; p = .56) or in the CRT-REFLECTIVE SCORE (NL mean = 2.46, SD = 1.98; FL mean = 2.57, SD = 1.78; p = .75). This indicates that the two groups were equivalent in their reasoning style.
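The two CRT scores described above can be sketched as simple counts over the seven questions. The answer key and responses below are invented for illustration (they are not the actual CRT items), and the function name is hypothetical.

```python
def crt_scores(responses, answer_key):
    """Return (reflective, intuitive) counts for one participant.

    responses  -- list of chosen option labels, one per question
    answer_key -- list of (correct_label, intuitive_label) pairs
    """
    reflective = sum(r == correct for r, (correct, _) in zip(responses, answer_key))
    intuitive = sum(r == lure for r, (_, lure) in zip(responses, answer_key))
    return reflective, intuitive

# Hypothetical key and responses for seven four-option questions.
answer_key = [("a", "b"), ("c", "a"), ("d", "b"), ("b", "c"),
              ("a", "d"), ("c", "b"), ("d", "a")]
responses = ["a", "a", "d", "c", "b", "c", "a"]

reflective, intuitive = crt_scores(responses, answer_key)
print(reflective, intuitive)
```

Since each question also offers two other incorrect options, the two scores need not sum to seven, which is why they are reported separately.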

Raven's Progressive Matrices
We assessed general intelligence by means of the Superior Scale I of Raven's Advanced Progressive Matrices. The task comprised 12 items, each showing a picture with a missing piece. Participants needed to indicate which of the eight pieces arranged next to the picture completed it correctly within 10 minutes (participants were informed that the test would close after this time had passed). To avoid a potential FLe in this task, all participants were given the instructions in their NL, including those in the FL group. The total number of correct responses was taken as a measure of general intelligence (RAVEN SCORE). The mean Raven score was similar in the two groups (NL group = 9.81, SD = 1.61; FL group = 9.17, SD = 2.2; p = .08).

English Proficiency and Use Questionnaire
FL participants self-reported information about English acquisition (context and age when started), percentage use across lifespan (i.e., 0-3 years, 3-6 years, 6-12 years, 12-18 years, 18 years to present), and context of use (e.g., friends, classes, watching television, or watching movies on platforms like Netflix). A summary of the bilingual profile of the FL group is presented in Table 1.

Data analyses
Data analyses were carried out using mixed-effect models. The dependent variable was the participants' ratings for different items on the MQR-BSR scale. Given that the NL and FL groups did not differ in age, Raven score, CRT-intuitive score, or CRT-reflective score, we did not include any of these variables in the analyses. Following recent recommendations to interpret significance in mixed-effect models (Luke, 2017), we applied the Satterthwaite approximation for degrees of freedom (Satterthwaite, 1941) and used restricted maximum likelihood (REML) to fit the models. In addition, we used deviation coding for categorical predictor variables (i.e., scale and group) to enable interpretation of their main effects and interactions. That is, the two levels of the predictor variables were coded as -0.5 and 0.5 (instead of 0 and 1, which is what is used in dummy coding). This means that no level of a predictor factor was taken as a baseline with which to compare the other level (otherwise, coefficients would represent simple effects rather than main effects). Instead, the deviation coding (-0.5, 0.5) meant the intercept represented the grand mean (across the two levels of a factor), enabling us to interpret the coefficients of the models as main effects and interactions. All models had the maximal random effect structure justified by the data (Barr, Levy, Scheepers & Tily, 2013), and the analyses were conducted with the lmerTest package for R (version 3.4.1; R Core Team, 2018).
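The difference between deviation and dummy coding can be illustrated with a deliberately simplified sketch. It fits a saturated fixed-effects 2 x 2 model to the four descriptive cell means reported for the main analysis, ignoring the random effects entirely, so it is not a reproduction of the mixed-effect models fitted with lmerTest. The helpers `solve` and `fit_2x2` are hypothetical names introduced only for this illustration.

```python
def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * p for x, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_2x2(cell_means, type_code, group_code):
    """Coefficients [intercept, type, group, type x group] of the saturated model."""
    A, y = [], []
    for (stype, group), value in cell_means.items():
        t, g = type_code[stype], group_code[group]
        A.append([1.0, t, g, t * g])
        y.append(value)
    return solve(A, y)

# Descriptive cell means (statement type, language group) from the main analysis.
cell_means = {("MQR", "FL"): 3.59, ("MQR", "NL"): 3.21,
              ("BSR", "FL"): 2.67, ("BSR", "NL"): 2.65}

deviation = fit_2x2(cell_means, {"BSR": -0.5, "MQR": 0.5}, {"NL": -0.5, "FL": 0.5})
dummy = fit_2x2(cell_means, {"BSR": 0.0, "MQR": 1.0}, {"NL": 0.0, "FL": 1.0})

print([round(c, 2) for c in deviation])
print([round(c, 2) for c in dummy])
```

Under deviation coding, the intercept is the unweighted mean of the four cells (3.03) and the statement-type and group coefficients (0.74 and 0.20) are differences averaged over the other factor, that is, main effects. Under dummy coding, the intercept is the baseline cell (BSR in the NL group, 2.65) and the same coefficients become simple effects at the baseline level (0.56 and 0.02). The interaction coefficient (0.36) is the same under both codings.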

Foreign language effect in motivational quotes
We evaluated whether FL use has an effect on profundity ratings, using a model with statement type (MQR vs. BSR), language group (NL vs. FL), and the interaction between these variables as experimental predictors. Besides participant and item, the by-participant random slope for the interaction between type of statement and language group was included in the random effects structure. The results showed a main effect of statement type (t = 4.25, p = .0002), indicating that overall ratings of the BSR statements were lower (mean = 2.68, SD = 0.74) than those of the MQR statements (mean = 3.4, SD = 0.68). This observation replicates the results of Pennycook et al. (2015) because, as expected, participants found meaningless statements (BSR) to be less profound than meaningful statements (the MQR ones). The main effect of language group did not reach significance (t = 1.85, p > .07). Importantly, the two-way interaction between statement type and language group was significant (t = 2.38, p = .019), suggesting that language exerts a differential influence on the ratings of MQR and BSR items. Profoundness ratings in response to MQR were higher when sentences were presented in the FL (mean = 3.59, SD = 0.6) compared to the NL (mean = 3.21, SD = 0.71; t = 3.14, p = .002), whereas no differences appeared with respect to BSR items (FL mean = 2.67, SD = 0.69; NL mean = 2.65, SD = 0.8; t < 1) (Figure 1).
The main finding of this study is that FL use by participants was associated with rating motivational quotes as more profound when compared to NL use. By contrast, the profoundness ratings to pseudo-profound bullshit were unaffected by language (NL or FL). This raises the question of whether these results were driven by participants in the NL group rating meaningful sentences (motivational quotes) in the same way as meaningless statements (pseudo-profound bullshit). If this were the case, then the language effects would not be genuine but, rather, by-products of methodological confounding. To exclude this possibility, we ran two complementary models. The first complementary model sought to rule out that participants in the NL group rated the profundity of pseudo-profound bullshit and motivational quotes similarly. Hence, the dependent variable included the ratings that NL participants gave to all MQR-BSR scale items, with statement type (MQR vs. BSR) included as the only experimental predictor. The random effect structure included participant, item, and the by-participant random slope for statement type. The main effect of statement type (t = 2.39, p = .023) confirmed that NL participants rated pseudo-profound bullshit as significantly less profound (mean = 2.69, SD = 0.78) than motivational quotes (mean = 3.21, SD = 0.71), excluding the possibility that the FLe on motivational quotes in the main analyses could be explained merely by NL participants treating motivational quotes the same as pseudo-profound bullshit (i.e., void of content). The second complementary model sought to test whether the FL group also showed differential ratings between the pseudo-profound bullshit and the motivational quotes. Therefore, this matched the previous model, with two exceptions. First, we only considered participant ratings from the FL group.
Second, we included the LexTALE score as a control predictor and its interaction with the experimental predictor (statement type) to examine if FL proficiency affected the profundity ratings. As expected, there was a main effect of statement type (t = 5.49, p = .0001), reflecting that FL use was associated with participants rating pseudo-profound bullshit (mean = 2.67, SD = 0.69) as significantly less profound than motivational quotes (mean = 3.59, SD = 0.6). There was no main effect or interaction involving the LexTALE score (ts < 1). The lack of interaction between the LexTALE score and statement type is particularly relevant because it limits the risk that the high profundity ratings by FL participants to motivational quotes (compared to NL participants) were driven by mild comprehension difficulties and/or cognitive load effects.
Comprehension issues could also have resulted in motivational quotes being rated as more profound by FL users. The lack of between-group differences in profundity ratings to pseudo-profound bullshit excludes the potential that FL use led to sentences that were not fully understood being interpreted as more (or less) profound. It is therefore unlikely that FL users gave higher profundity ratings to motivational quotes because they did not fully grasp their meaning. One could still argue that slight comprehension difficulties may explain the higher profundity ratings, based on the possibility that FL users did not completely understand all words in the motivational quotes. As observed by Montero-Melis, Isaksson, van Paridon and Ostarek (2020), these mild comprehension difficulties would not prevent them from grasping the general meaning of the quote, but the mental representation may be more abstract, which in turn, might promote a greater sense of profundity. However, as argued elsewhere (see also Douven, 2018), we would expect FL users merely to show a bias toward the middle of the Likert scale if this were the case. That is, we would not expect these slight comprehension difficulties to lead FL users to differ from NL speakers in their interpretation of motivational quotes that are generally given extreme profundity ratings (not at all profound, or very profound), which is in fact where the differences arose, thereby excluding this as a potential concern. Indeed, a visual inspection (see Figure 2) reveals that low profundity ratings (1 and 2) tended to be given by NL participants, while high profundity ratings (4 and 5) were more frequent in the FL group, and the two groups overlapped for the middle value (3).

Discussion
The aim of this study was to contribute to the field of FLe by examining whether this phenomenon arises during the processing of a particular type of linguistic material, motivational quotes, which are of ecological relevance due to their increasing presence in everyday situations. With this aim in mind, we investigated whether language influences how profound or transcendental we find motivational quotes, hypothesizing that participants would rate motivational quotes presented in an FL as less profound than those presented in an NL. Given the positive affective content of motivational quotes, this prediction seemed consistent with the "reduced emotionality hypothesis", which argues that the FLe reduces emotionality because the link between affective states and linguistic expressions is weaker (or even absent) when using an FL. However, our prediction was not confirmed. First, the FLe was observed in the opposite direction, with FL rather than NL use associated with participants rating motivational quotes as more profound. This effect could not be attributed to the potential influence of general intelligence (measured by Raven's Progressive Matrices), reasoning style (measured by the CRT), or the proficiency level of FL participants (measured by the LexTALE). We included pseudo-profound bullshit sentences and made the same prediction that FL use would reduce the emotional impression and lead to them being rated as less profound (compared to an NL). However, FL did not influence the profundity ratings for pseudo-profound bullshit. As mentioned in the Introduction, the rationale for incorporating pseudo-profound bullshit sentences was to control for the possibility of comprehension difficulties. However, in this unexpected scenario, where the FLe for motivational quotes went in the opposite direction than expected, the lack of an FLe for pseudo-profound bullshit is still informative.
It reveals that the FLe observed in motivational quotes does not arise with any linguistic material formulated to impress the reader. Instead, it seems that the underlying message of the linguistic material must be meaningful for the FLe to arise.
To our knowledge, no prior study has observed a reverse FLe (but see Geipel, Hadjichristidis & Surian, 2016, discussed below). One difference between our study and most previous ones is that the material we used prompted positive rather than negative affect, like that one may experience when facing imaginary losses in terms of lives or money. In addition, studies using the framing effect have shown that an FL reduces loss aversion in terms of economic (e.g., Keysar et al., 2012) and social decision-making (e.g., Liu, Wang, Timmer & Jiao, 2022), strengthening the idea that it lowers emotional reactions that are particularly negative. Since the FLe has not been widely explored in contexts of positive affect, one may wonder whether it could be affected by affective valence. In fact, some studies seem to indicate that positive and negative emotion-laden linguistic material may not be processed in the same way, at least at the lexical-semantic level. For example, it has been shown that (with respect to neutral words) positive emotion-laden words speed up lexical processing whereas negative emotion-laden words delay it (e.g., Rodríguez-Ferreiro & Davies, 2019). One may wonder whether this difference could somehow influence the FLe. However, this does not seem to be the case. For example, with a written lexical decision task, Conrad, Recio and Jacobs (2011) observed the typical faster response latencies for positive-laden words and slower response latencies for negative words (relative to neutral words). Importantly, both effects (the facilitation of positive words and the interference of negative words) were reduced in an FL compared to an NL: that is, an FL reduces the processing of affect at the lexical-semantic level regardless of the direction of the effect (facilitatory or interfering). In addition, the results of an earlier study (where both positive and negative linguistic material was used to investigate the FLe) go in the same direction (Hadjichristidis et al., 2019).
These authors found that an FL reduces the receptivity to positive and negative superstitions to the same extent (prompting the feeling of good and bad luck respectively). However, the results of another study by the same authors (Geipel et al., 2016) suggest that the modulation of FLe by affective valence is complex and needs further investigation. In the specific case of their study, participants' assessment of different factors characterizing a given situation may have interacted with affective valence. This was reflected by the FL increasing moral goodness judgements (i.e., FLe in the reverse direction) of actions carried out with negative intentions but ending up with positive outcomes (e.g., a company running a charity campaign for good publicity that boosted its profits), whereas the FL reduced such judgments (i.e., FLe in the typical direction) if the intentions were positive but the outcomes negative (e.g., giving money to a poor boy who used it to buy drugs and died of an overdose).
The fact that the reverse FLe observed in the present study is difficult to accommodate within the "reduced emotionality hypothesis" casts doubt on its capacity to explain the general nature of the FLe. In this regard, a hypothesis regarding the FLe that we had not considered during the study design concerns the possible influence of "psychological distance". In what follows, we briefly describe psychological distance and develop a tentative proposal about how it may explain the FLe, accounting for our result as well as those reported in prior studies. Because psychological distance as a potential driving force of the FLe is a tentative hypothesis, it needs to be taken with caution. We propose it here only as a potential line of future research that may serve to expand the rather vague knowledge we currently have about the origin of the FLe.
Psychological distance as a potential driving force of the FLe?
According to Liberman, Trope and Stephan (2007), one effect of psychological distancing is to construe the mental representation of an event more abstractly (as opposed to more detailed or concrete). Several factors can modulate the degree of abstraction of mental representations (e.g., time or space). Using time to illustrate our point, the more distant an event is perceived in time (either past or future), the more abstract its mental representation. Critical to this is that it has been consistently and robustly demonstrated that a high degree of abstraction entails focusing more on the goal or purpose of an action rather than the details of executing that action (Liberman & Trope, 1998; Sagristano, Trope & Liberman, 2002). For example, imagine that a person holds environmental preservation among their values. At one point they may be presented with the possibility of taking an action, such as joining a beach cleaning expedition early in the morning, before the bathers arrive. If the individual is then asked to indicate how likely it is that they will join the expedition, the proximity in time will be relevant. On the one hand, if the event is considered close in time (e.g., two days ahead), the individual will construct a concrete mental representation that includes details about the effort and its inconveniences (e.g., getting up very early and then getting tired and dirty). A response may be more realistic when indicating the probability of joining an activity in the near future, because the concrete mental representation highlights factors that can put the individual off the activity and place value systems in conflict with stated behavioural intentions. On the other hand, if the situation is formulated at a more distant time (e.g., within two months), the individual may construct an abstract mental representation in which specific details are not considered (e.g., the effort entailed). Rather, the value placed on the act (preserving the environment) may prevail. In the case of some imagined activity planned in the distant future, the individual's value system rather than the viability concerns may best predict the probability of engagement in the activity. We propose that FL use may also contribute to psychological distancing in a similar way to time distancing, that is, by leading to more abstract mental representations of a situation. This means that, compared to an NL, an FL leads to greater consideration of the value associated with an action and to less concern with the detail of carrying out the action. We believe that the results obtained with the motivational quotes could represent a first clue in favour of this proposition. These phrases could be considered more profound or transcendent in an FL because an abstract mental representation of their meaning has been constructed, leading to a focus on the value conveyed by the quote. However, those motivational quotes may not have been considered as profound in the NL because their message is interpreted more concretely, leading individuals to examine what it would entail to behave in line with the values underpinning the quote. Of note, the "psychological distance hypothesis" tentatively proposed above predicts the same effect as the "reduced emotionality hypothesis" in many situations, such as in contexts of moral decision-making and gambling: both hypotheses predict more utilitarian behaviour and risky choices in an FL. What distinguishes the two hypotheses is how they explain these effects.
According to the "psychological distance hypothesis", it is a rather abstract construction of the situation that reduces the concern with the detail of carrying out an action (e.g., killing a man, betting a large amount of money). This is because the abstract construction of the situation will lead us to focus more on the objective (e.g., saving as many people as possible, winning money) than on how it may be achieved (e.g., killing an innocent person, putting our money at risk). According to the "reduced emotionality hypothesis", the reduction of this concern is due to the FL system being unlinked from emotions. The fact that the two hypotheses make the same predictions in most affective situations makes them difficult to disentangle. In fact, both of them are based on the same assumption: that an FL is not used when one experiences events early in life. According to the "reduced emotionality hypothesis", this only has an impact on the link between the FL and the emotions. In contrast, the "psychological distance hypothesis" assumes that all sorts of concrete details (including but not restricted to emotions) will be affected, which is the reason why mental representations would be rather abstract when based on FL linguistic material. If the "psychological distance hypothesis" is true, then we should describe any imaginary situation (including those with no affective connotations) in less detail in an FL than in an NL. Moreover, this reduced richness of details should correlate with FLe measures. This could be an avenue to explore in future studies.
In short, the present study has contributed to the FLe literature by identifying motivational quotes as a new source of this effect. Beyond the underlying causes of the FLe, the identification of new contexts where it arises has been a growing line of research in the field. Information gathered in this area has potential practical applications at socio-economic and political levels, among others. For instance, the choice of language in speeches delivered to audiences that need to make decisions (e.g., voting, investing their money, supporting social policies, etc.) may be decisive. In the particular case of motivational quotes, the results of our study suggest that presenting them in a foreign language promotes their effect, which could roughly be summarized as the promotion of positive thinking. This feature could benefit different sorts of therapeutic programs, including positive psychology treatments for anxiety or depression.
However, two specific limitations of the present study call for caution when generalizing the results to the general population. The first limitation is that our sample was exclusively composed of women. It is unknown whether gender influences the receptivity to motivational quotes; it is possible that it depends on the specific value underpinning the quote, responding to current gender stereotypes. For instance, it has been shown that women tend to characterize themselves as less assertive and less competent in leadership (e.g., Hentschel, Heilman & Peus, 2019). Therefore, they could obtain more benefit from motivational quotes with values that promote the development of a confident, forceful attitude. This entails that an FL might have more or less room to exert its beneficial effect depending on gender and the type of quote. Beyond any baseline differences between genders in personality traits and social behaviour that may constrain the magnitude of the FLe, it is also possible that gender has a direct impact on individuals' propensity to experience the FLe. For example, Gargalianou, Urbig and Van Witteloostuijn (2017) observed no differences between men and women in social dilemmas presented in their NL. However, in the FL condition, women behaved more cooperatively than men, suggesting that women experienced less FLe. Nevertheless, this issue remains unclear, since prior studies have found no interaction between gender and language in moral decision-making tasks (e.g., Białek, Paruzel-Czachura & Gawronski, 2019; Hayakawa et al., 2017).
The second limitation concerns the use of a between-subject rather than a within-subject design. Most studies in the field use a between-subject design (e.g., Corey et al., 2017; Costa et al., 2014a, 2014b; Keysar et al., 2012). This has been common practice in order to exclude the concern that the first language (NL or FL) used sets the specific mindset that will be applied throughout the entire task (more or less utilitarian), even if the two languages are used in two separate sessions. However, this precaution means that it cannot be known whether the NL and FL groups of participants may be unbalanced in unknown individual differences.
An a posteriori and secondary contribution of the present study has been the proposal of a tentative hypothesis regarding the origin of FLe. Psychological distance, rather than reduced emotionality, seems to be able to accommodate the reverse FLe observed in our results. This suggests that there is scope to focus research on psychological distance.

Conclusions
We provide the first evidence that an FL affects people's receptivity to motivational quotes: they are rated as more profound when presented in an FL than when presented in an NL. The surprising direction of this FLe cannot be accommodated by the "reduced emotionality hypothesis", which has been widely used to account for the FLe. This finding highlights that more research is needed to understand the origin of the FLe. We tentatively propose that FL use creates psychological distance, similar to the influence of space or time. Compared to NL use, FL use may promote a focus on the underlying message; consequently, motivational quotes may tend to be evaluated as more profound or transcendental in an FL. In any case, along with one previous study (Geipel et al., 2016), our results may represent early evidence that the FLe may reverse in certain circumstances. Further research is needed to understand when and why.