Inventors’ explorations across technology domains

Published online by Cambridge University Press:  06 November 2017

Jeff Alstott
Affiliation:
Singapore University of Technology and Design, Singapore; Massachusetts Institute of Technology, Media Lab, USA
Giorgio Triulzi*
Affiliation:
Singapore University of Technology and Design, Singapore; Massachusetts Institute of Technology, Institute for Data, Systems, and Society, USA; United Nations University, MERIT, The Netherlands
Bowen Yan
Affiliation:
Singapore University of Technology and Design, Singapore
Jianxi Luo
Affiliation:
Singapore University of Technology and Design, Singapore
*Email address for correspondence: gtriulzi@mit.edu

Abstract

Technologies are created through the collective efforts of individual inventors. Understanding inventors’ behaviors may thus enable predicting invention, guiding design efforts or improving technology policy. We examined data from 2.8 million inventors’ 3.9 million patents and found that most patents are created by ‘explorers’: inventors who move between different technology domains during their careers. We mapped the space of latent relatedness between technology domains and found explorers were 250 times more likely to enter technology domains that were highly related to the domains of their previous patents, compared to an unrelated domain. The great regularity of inventors’ behavior enabled accurate prediction of individual inventors’ future movements: a model trained on just 5 years of data predicted inventors’ explorations 30 years later with a log-loss below 0.01. Inventors entering their most related domains were associated with patenting up to 40% more in the new domain, but with reduced citations per patent. These findings may be instructive for inventors exploring design directions, and useful for organizations or governments in forecasting or directing technological change.

Information

Type
Research Article
Creative Commons
Distributed as Open Access under a CC-BY 4.0 license (http://creativecommons.org/licenses/by/4.0/)
Copyright
Copyright © The Author(s) 2017
Figure 1. Three examples of how relatedness, $R$, was calculated. Each year, citations to patents in the Semiconductors domain from patents in other domains were counted and compared to the number expected by random chance, given all domains’ number of citations and other factors. $R$ was the portion of years in which the number of citations was above expectation.
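
The final step of that calculation can be illustrated with a minimal sketch. It assumes the per-year expected citation counts are already available; the null model that produces them ("all domains' number of citations and other factors") is not specified in this caption and is not reproduced here. The function name, the dictionaries, and the use of a strict "greater than" comparison are illustrative assumptions, and the numbers are invented.

```python
def relatedness(observed_by_year, expected_by_year):
    """Portion of years in which observed citations exceed the expected count.

    Both arguments map a year to a citation count. The expected counts are
    taken as given (hypothetical input); only the final ratio is computed.
    """
    years = sorted(set(observed_by_year) & set(expected_by_year))
    if not years:
        raise ValueError("no years in common between observed and expected")
    # Assumption: "above expectation" is read as strictly greater than.
    years_above = sum(
        1 for year in years if observed_by_year[year] > expected_by_year[year]
    )
    return years_above / len(years)


# Invented counts for one pair of domains over four years.
observed = {1990: 12, 1991: 3, 1992: 9, 1993: 5}
expected = {1990: 8.0, 1991: 4.5, 1992: 7.2, 1993: 5.0}
r = relatedness(observed, expected)
print(r)  # 0.5: two of the four years are above expectation
```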

Figure 2. An inventor’s $R$ to an unentered domain was the average of the $R$ to the domain from each of the domains in which they had previously patented. (Top) A diagram of an inventor before their first move. Since the inventor had patented in only one domain, the inventor’s $R$ to each unentered domain was the same as the $R$ from that one domain. (Bottom) A diagram of the same inventor before their second move. At this point the inventor had patented in two domains, so the inventor’s $R$ to each unentered domain was the mean of the $R$s from the two domains. Actual technology domains were very fine-grained (examples shown in Figure 10).
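
The averaging described in this caption can be sketched as follows. The sketch assumes $R$ is stored per ordered pair (source domain, target domain); the single-letter domain names and the $R$ values are invented for illustration.

```python
def inventor_relatedness(domain_r, previous_domains, all_domains):
    """Mean R from an inventor's previous domains to each unentered domain.

    domain_r maps (source_domain, target_domain) to R. This keying is an
    assumption made for the sketch, not the authors' data structure.
    """
    previous = set(previous_domains)
    result = {}
    for target in all_domains:
        if target in previous:
            continue  # already entered, so no R to an "unentered" domain
        values = [domain_r[(source, target)] for source in previous]
        result[target] = sum(values) / len(values)
    return result


# Invented R values between four hypothetical domains.
domain_r = {
    ("A", "B"): 0.5, ("A", "C"): 0.8, ("A", "D"): 0.2,
    ("B", "C"): 0.4, ("B", "D"): 0.6,
}
domains = ["A", "B", "C", "D"]

# Before the first move: one previous domain, so R is copied from it.
before_first_move = inventor_relatedness(domain_r, ["A"], domains)
# Before the second move: two previous domains, so R is their mean.
before_second_move = inventor_relatedness(domain_r, ["A", "B"], domains)
print(before_first_move)
print(before_second_move)
```

With these invented values, the inventor's $R$ to domain C falls from 0.8 to about 0.6 after entering B, because B is less related to C than A is.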

Figure 3. Inventors regularly explore across technology domains. The probability that an inventor’s next patent was in a previously unentered technology domain, given the number of patents the inventor already had (double logarithmic axes).

Figure 4. Inventors were far more likely to explore a domain if it was related to their previous work. (A) The probability density function of how likely an inventor was to move to a domain, given its mean $R$ to the inventor’s previous domains. (B) As A, conditional on the popularity of the domain (the number of patents in the domain in the previous year). (C) As A, conditional on whether the inventor had previous co-authors who had patented in the domain before working with the inventor, and whether any of the inventor’s previous patents had citations to the domain.

Figure 5. Inventors’ explorations were predictable. When each inventor moved, a naive predictive model ranked the inventor’s unentered domains in order of their probability to be entered. The model had been trained with patent data from 1976 up to 1980 (blue), 1990 (green), or 2000 (red). Regardless of the model used, the domain actually entered was typically high on the prediction list. Perfect prediction: 1.0, Random prediction: 0.5. Other measures of predictive power: c-statistic ${>}0.9$, log-loss ${<}0.01$ (Figure 14).
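
One way to read the scale in this caption (1.0 for perfect prediction, 0.5 for random) is as a normalized rank of the entered domain within the model's ordered list. The sketch below is an assumption about that scoring, not the authors' code: it returns the fraction of the other unentered domains that the model ranked below the domain actually entered, counting ties as half. The domain names and probabilities are invented.

```python
def rank_score(predicted_prob, entered):
    """Normalized rank of the entered domain among an inventor's unentered domains.

    predicted_prob maps each unentered domain to the model's probability of
    entry. Returns 1.0 when the entered domain was ranked first and 0.0 when
    it was ranked last; random orderings average 0.5.
    """
    target = predicted_prob[entered]
    others = [p for domain, p in predicted_prob.items() if domain != entered]
    if not others:
        raise ValueError("need at least one other unentered domain")
    below = sum(1 for p in others if p < target)
    ties = sum(1 for p in others if p == target)
    return (below + 0.5 * ties) / len(others)


# Invented probabilities for three hypothetical unentered domains.
probs = {"optics": 0.6, "batteries": 0.3, "polymers": 0.1}
print(rank_score(probs, "optics"))     # 1.0
print(rank_score(probs, "batteries"))  # 0.5
print(rank_score(probs, "polymers"))   # 0.0
```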

Figure 6. Explorers were more likely to have a guide if they entered a domain with high $R$ to their previous patents.

Figure 7. The number of co-authors an explorer had on their first patent in a new domain increased with $R$, while the number of co-authors who were also explorers to the domain was largely unassociated with $R$.

Figure 8. Explorers’ first patent in an entered domain was more likely to have co-authors with higher $R$ to the domain if the explorer had a higher $R$ themself. When an explorer entered a new domain, their first patent in the domain could have co-authors, and those co-authors could also be explorers to the domain. For those fellow explorers, we can ask how probable it is that at least one of them has a high $R$ to the entered domain, with varying thresholds for what counts as a ‘high’ $R$ (different colored lines). The probability of having a high $R$ co-author was a function of the explorer’s own $R$ ($x$-axis). Note that the plotted lines terminate once the explorer’s own $R$ is above the threshold; thus, this is the probability of an explorer having a co-author with a higher $R$ than their own. The data presented are for the 45% of explorers who did not have a guide on their entering patent (a guide being a co-author who had previously patented in the entered domain).

Figure 9. Inventors had higher total performance when they entered domains more related to their previous work, but lower average citations. (Blue) An explorer’s number of additional patents in the entered domain after the entering patent. (Green) The number of citations those patents received. (Red) The average number of citations per patent. All values are expressed relative to entering a domain with $R=0$. Dashed lines: empirical averages, binned by $R$ in 11 bins from 0 to 1. Solid colors: association of $R$ with performance, as inferred from models that accounted for other factors (Line: median expectation of performance, assigning all empirical entries a given value of $R$ and holding all other parameters constant at their empirical values. Shading: 95% credible interval). Parameter values for these models are shown in Figure 13.

Figure 10. Three instances of inventors exploring a new technology domain. (Left) In 1996 three inventors, Wim, Sandra, and George, all entered the domain of ‘heterocyclic compounds’ (chemicals with a ring of carbon and non-carbon atoms). They had all patented in only one domain previously, had no co-authors who had previously patented in the new domain, and their patents had not cited the new domain. Their performance in the new domain was related to the $R$ of the new domain with their previous experience (table). (Right) Maximum spanning tree of the full set of 629 technology domains and their $R$ to each other. To aid visualization, a community structure is highlighted, and some of the larger domains are labeled. Link width: $R$ between two domains. The node sizes and link widths are visualized using all patent data from 1976 to 2010, but the rank order of both moved little during the years visualized.
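
The maximum spanning tree in the right panel keeps, for a connected network, the set of links with the largest total weight that still joins every node without forming a cycle. A minimal sketch using Kruskal's algorithm is given below. It assumes $R$ is treated as a symmetric link weight; the four domains and their weights are invented, and this is not the authors' visualization code.

```python
def maximum_spanning_tree(nodes, weighted_edges):
    """Kruskal's algorithm with edges taken in order of decreasing weight.

    weighted_edges is a list of (node_u, node_v, weight) tuples, with weight
    standing in for R between two domains (assumed symmetric here).
    """
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the component root, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for u, v, weight in sorted(weighted_edges, key=lambda e: e[2], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so no cycle
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


# Invented R values between four hypothetical domains.
edges = [
    ("A", "B", 0.9), ("B", "C", 0.7), ("A", "C", 0.8),
    ("C", "D", 0.2), ("B", "D", 0.1),
]
tree = maximum_spanning_tree(["A", "B", "C", "D"], edges)
print(tree)  # [('A', 'B', 0.9), ('A', 'C', 0.8), ('C', 'D', 0.2)]
```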

Figure 11. The portion of pairs of domains with low $R$ was very high, but inventors’ entries were more evenly distributed across the values of $R$. The $R$ associated with each entry (green line) is calculated using patent data from the years before that entry. The $R$ associated with each pair of domains (blue line) is calculated using patent data from 2010. The difference in dates biases the two lines to be closer together (the $R$ values of the blue line to be lower and those of the green line to be higher), but they are still clearly distinct.

Figure 12. Explorers’ first patent in an entered domain was more likely to have co-authors with higher $R$ to the domain if the explorer had a higher $R$ themself. As Figure 8, but the co-author’s $R$ is within a specific range (legend).

Figure 13. Parameter values for models of inventors’ future number of patents (blue) and citations (green) in an entered domain. Lines: kernel density estimates of the posterior distribution of each parameter’s values. Shading: parameters’ 95% credible interval.

Figure 14. Multiple measures of the predictive model’s power show persistently accurate prediction on long time horizons. Predictive models were created using data from 1976 up to 1980 (blue), 1990 (green) and 2000 (red). These models were then used to predict explorers’ movements in subsequent years, after the time period included in the model training. (A) The c-statistic of the models’ predictions (area under the receiver operating characteristic curve). (B) The logarithmic loss of the models’ predictions.
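
For readers unfamiliar with the two measures in panels A and B, the sketch below computes them from their standard definitions for binary outcomes (1 if a candidate domain was entered, 0 otherwise). The labels and probabilities are invented, and the pairwise c-statistic is written for clarity rather than speed.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clip so that log(0) never occurs
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)


def c_statistic(labels, probs):
    """Area under the ROC curve, computed as a pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher predicted probability than a randomly chosen negative case,
    with ties counted as one half.
    """
    positives = [p for y, p in zip(labels, probs) if y == 1]
    negatives = [p for y, p in zip(labels, probs) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p_pos in positives:
        for p_neg in negatives:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented example: one entered domain among four candidates.
labels = [1, 0, 0, 0]
probs = [0.7, 0.2, 0.1, 0.05]
auc = c_statistic(labels, probs)
loss = log_loss(labels, probs)
print(auc)             # 1.0: the entered domain outranks every other candidate
print(round(loss, 4))  # 0.1841
```

The two measures capture different things: the c-statistic depends only on how candidates are ordered, while log-loss also penalizes probabilities that are far from the observed outcome.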

Figure 15. The relationships between $R$ and explorers’ performance were robust to different measures of technology relatedness.

Figure 16. The predictability of explorers’ future moves was maintained when using different measures of domains’ interactions to quantify technology relatedness. The predictive power of three different models, measuring domains’ relatedness through three different kinds of interactions: their number of citations (Citations: A–C), how often an inventor’s portfolio has patents in both domains (Inventor Co-Occurrence: D–F), how often a patent is classified in both domains simultaneously (Co-Classification: G–I).