Petros Gkotsis, Emanuele Pugliese, Antonio Vezzani.
Abstract
In this work we use clustering techniques to identify groups of firms competing in similar technological markets. Our clustering highlights technological similarities, grouping together firms normally classified in different industrial sectors. Technological development leads to a continuously changing structure of industries and firms. For this reason, we propose a data-driven approach to classifying firms, which allows the classification to adapt quickly to the changing technological landscape. In this respect we differ from previous taxonomic exercises of industries and innovation, which are based on more general common features. In our empirical application, we use patent data as a proxy for the firms' capabilities of developing new solutions in different technological fields. On this basis, we extract what we define as a Technologically Driven Classification (TDC). To validate the result of our exercise, we use information theory to look at the amount of information explained by our clustering and the amount of information shared with an industrial classification. All in all, our approach provides a good grouping of firms on the basis of their technological capabilities and represents an attractive option to compare firms in the technological space and better characterise competition in technological markets.
Keywords: clustering; industrial classification; industrial economics; innovation studies; patents studies; scoreboard
Year: 2018 PMID: 33266611 PMCID: PMC7512469 DOI: 10.3390/e20110887
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Representation of the clustering up to 38 clusters. The 5 cluster families are highlighted in the dendrogram.
Figure 2. Normalized mutual information (NMI) with respect to the number of clusters. The blue line compares the clusterings obtained using the IPC4 and the IPC3 codes as input. The green dot represents the NMI between the results of our clustering using the IPC4 codes and the ICB classification.
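To make the comparison in Figure 2 concrete, the following is a minimal Python sketch of how an NMI between two labelings of the same firms could be computed. It assumes the arithmetic-mean normalization, NMI = 2 I(A;B) / (H(A) + H(B)); the source does not state which normalization the authors use, and the function names (`entropy`, `mutual_information`, `nmi`) and the toy labels are illustrative only.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a list of cluster labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two labelings of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in count_ab.items()
    )


def nmi(a, b):
    """NMI with arithmetic-mean normalization (an assumption, see lead-in)."""
    if len(a) != len(b):
        raise ValueError("both labelings must cover the same items")
    h_a, h_b = entropy(a), entropy(b)
    if h_a + h_b == 0:
        # Both labelings put every item in a single group.
        return 1.0
    return 2 * mutual_information(a, b) / (h_a + h_b)


# Hypothetical toy labels: one entry per firm.
clusters_ipc4 = [0, 0, 1, 1]
clusters_ipc3 = [1, 1, 0, 0]   # same partition, different label names
sectors = [0, 1, 0, 1]         # partition independent of the first one

same_partition = nmi(clusters_ipc4, clusters_ipc3)
independent = nmi(clusters_ipc4, sectors)
```

Under this normalization the score is 1 when the two partitions coincide up to relabeling and 0 when they are statistically independent, which is what makes it usable for comparing a clustering with a sector classification that has a different number of groups.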
Figure 3. Share of variance of different variables explained by the clustering for different numbers of clusters (blue line) versus the share of variance explained by the ICB3 classification (green dot).
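The quantity plotted in Figure 3 can be illustrated with a short Python sketch. It assumes the usual decomposition in which the explained share equals one minus the within-group sum of squares divided by the total sum of squares; the source does not spell out the exact estimator, and the function name `explained_variance_share` and the toy data are hypothetical.

```python
from collections import defaultdict


def explained_variance_share(values, labels):
    """Share of the variance of `values` explained by the grouping `labels`.

    Computed as 1 - within-group SS / total SS (an assumed estimator).
    """
    if len(values) != len(labels):
        raise ValueError("values and labels must have the same length")
    n = len(values)
    grand_mean = sum(values) / n
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        # A constant variable has no variance to explain.
        return 0.0

    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    within_ss = 0.0
    for members in groups.values():
        group_mean = sum(members) / len(members)
        within_ss += sum((v - group_mean) ** 2 for v in members)

    return 1 - within_ss / total_ss


# Hypothetical toy data: one value of some firm-level variable per firm.
variable = [1.0, 1.0, 5.0, 5.0]
good_grouping = ["a", "a", "b", "b"]   # groups coincide with the two levels
poor_grouping = ["a", "b", "a", "b"]   # groups mix the two levels evenly

share_good = explained_variance_share(variable, good_grouping)
share_poor = explained_variance_share(variable, poor_grouping)
```

Because this share can only grow as groups are split further, comparing the clustering with the ICB3 classification is most informative at a similar number of groups, which is how the figure presents it.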
Figure 4. The two subpanels highlight the main characteristics of the 38 clusters. The size of the circles is proportional to the number of firms in the cluster. The left panel shows the share of patents in each of the 8 main sections of the IPC classification (IPC1): A: Human Necessities; B: Performing Operations, Transporting; C: Chemistry, Metallurgy; D: Textiles, Paper; E: Fixed Constructions; F: Mechanical Engineering, Lighting, Heating, Weapons, Blasting; G: Physics; H: Electricity. Two groups of sections relate to the two main patenting industrial sectors: Health (sections A and C) and ICT (sections G and H). The right panel shows, in log-scale, the distribution of firms' technological diversification (number of IPC4 codes in their patent portfolio) in the clusters; firms are divided into clusters with a similar breadth of technological diversification. Firms in clusters of family 1 are not very diversified (around 10 fields), that is, they are active in the development of a relatively low number of technologies compared to the average Scoreboard firm, while the diversification of our sample is generally very high. Firms in family 5 in particular are often characterized by a diversification above 100 different subclasses (out of 629 total subclasses in the IPC classification).
Descriptive statistics of the clusters.
| Macro Cluster | Macro Cluster Patent Propensity | Cluster | Patent Propensity | Diversification (# IPC4) | # of Firms | R&D | R&D Intensity | Top 3 Subclasses | % Top 3 IPC4 |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 0.91 | 5D | 7.6 | 280 | 2 | 629 | 1.2% | G06F,H05K,H01R | 36% |
| | | 5O | 2.0 | 198 | 13 | 662 | 5.5% | G03G,H04N,G06F | 30% |
| | | 5J | 1.4 | 124 | 16 | 415 | 1.3% | H01L,E21B,H01M | 44% |
| | | 5M | 0.9 | 136 | 18 | 773 | 4.3% | H01L,G06F,A61B | 39% |
| | | 5N | 0.7 | 157 | 12 | 4140 | 9.6% | G06F,H04W,H04L | 53% |
| | | 5C | 0.7 | 289 | 7 | 3180 | 6.2% | H01L,G06F,H04N | 33% |
| | | 5A | 0.6 | 178 | 12 | 650 | 3.6% | F01D,E02F,F02C | 15% |
| | | 5G | 0.5 | 141 | 11 | 1715 | 10.3% | H01L,G02B,A61K | 23% |
| | | 5K | 0.5 | 136 | 6 | 366 | 0.5% | C22C,F25J,F17C | 15% |
| | | 5F | 0.4 | 285 | 7 | 3628 | 5.0% | H01M,B60W,B60R | 14% |
| | | 5B | 0.4 | 321 | 8 | 3209 | 4.0% | F01D,F02C,G06F | 13% |
| | | 5E | 0.3 | 205 | 8 | 3099 | 5.2% | H01L,H01M,B60W | 12% |
| | | 5H | 0.3 | 140 | 8 | 922 | 4.5% | C08G,C07C,C08L | 18% |
| | | 5I | 0.2 | 221 | 4 | 2582 | 5.7% | A61B,A61F,A61K | 20% |
| | | 5L | 0.03 | 134 | 2 | 937 | 2.5% | G06Q,G06F,B01D | 24% |
| 4 | 0.54 | 4F | 1.2 | 93 | 19 | 273 | 6.3% | B41J,H01L,G06F | 17% |
| | | 4D | 0.8 | 57 | 42 | 124 | 4.6% | H01L,G02F,G06F | 34% |
| | | 4E | 0.5 | 66 | 44 | 606 | 7.5% | G06F,H04W,H04L | 39% |
| | | 4C | 0.4 | 36 | 75 | 331 | 4.1% | G06F,H04L,H04W | 46% |
| | | 4A | 0.4 | 47 | 57 | 127 | 2.3% | H01R,G01V,H01L | 16% |
| | | 4B | 0.3 | 27 | 41 | 165 | 13.9% | G06F,H01L,G11C | 44% |
| 3 | 0.38 | 3F | 0.6 | 99 | 19 | 226 | 3.3% | F16C,H02K,B62D | 21% |
| | | 3A | 0.5 | 67 | 37 | 150 | 2.7% | D06F,A47L,B29C | 9% |
| | | 3C | 0.4 | 49 | 24 | 98 | 1.0% | C22C,F01D,C23C | 19% |
| | | 3D | 0.4 | 37 | 33 | 95 | 2.6% | A01D,F16D,B60T | 21% |
| | | 3B | 0.2 | 34 | 34 | 136 | 0.7% | E21B,F24H,F16K | 20% |
| | | 3E | 0.2 | 113 | 25 | 1132 | 4.3% | B60R,F16H,B62D | 16% |
| 2 | 0.37 | 2C | 0.7 | 86 | 11 | 228 | 2.8% | B60C,A63B,C08L | 41% |
| | | 2B | 0.5 | 71 | 40 | 177 | 1.2% | H01L,C08L,C08G | 18% |
| | | 2A | 0.3 | 42 | 53 | 218 | 0.8% | H01M,B01J,C08F | 17% |
| | | 2D | 0.2 | 61 | 23 | 385 | 3.3% | A61K,C11D,A61Q | 30% |
| | | 2E | 0.2 | 78 | 15 | 1402 | 6.5% | A61M,A61K,A61F | 36% |
| 1 | 0.11 | 1E | 0.4 | 42 | 32 | 154 | 1.3% | A61F,B65D,A47J | 25% |
| | | 1B | 0.2 | 21 | 137 | 73 | 1.5% | H01L,G02B,G06F | 30% |
| | | 1D | 0.1 | 23 | 105 | 441 | 10.7% | A61K,A61B,A61M | 37% |
| | | 1C | 0.1 | 10 | 215 | 155 | 3.9% | G06F,H04L,G06Q | 46% |
| | | 1F | 0.1 | 9 | 173 | 107 | 4.3% | A61K,C07D,A61P | 56% |
| | | 1A | 0.05 | 6 | 288 | 89 | 1.3% | H01L,B60N,H01R | 14% |
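
As an illustration of the portfolio-based columns in the table, the Python sketch below derives a diversification count, the top 3 subclasses, and their combined share from a list of IPC4 codes. It assumes one code per patent and that "% Top 3 IPC4" is the share of patents falling in the three most frequent subclasses; both are assumptions, as the source does not define the columns in detail, and `portfolio_summary` and the sample portfolio are hypothetical.

```python
from collections import Counter


def portfolio_summary(ipc4_codes):
    """Summarize a patent portfolio given one IPC4 subclass code per patent."""
    counts = Counter(ipc4_codes)
    total = sum(counts.values())
    top3 = counts.most_common(3)
    return {
        # Number of distinct IPC4 subclasses in the portfolio.
        "diversification": len(counts),
        # The three most frequent subclasses, most frequent first.
        "top3_subclasses": [code for code, _ in top3],
        # Share of patents in those three subclasses (0.0 if no patents).
        "top3_share": sum(c for _, c in top3) / total if total else 0.0,
    }


# Hypothetical portfolio of 11 patents spread over 4 subclasses.
portfolio = ["G06F"] * 5 + ["H04L"] * 3 + ["H04W"] * 2 + ["A61K"]
summary = portfolio_summary(portfolio)
```

The same function can be applied to the pooled patents of all firms in a cluster to obtain cluster-level figures, if that is how the table entries are aggregated.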