Michele Corazza1, Fabio Tamburini1, Miguel Valério2, Silvia Ferrara1. 1. Department of Classical Philology and Italian Studies, University of Bologna, Bologna, Italy. 2. Departament de Prehistòria, Universitat Autònoma de Barcelona, Barcelona, Spain.
Abstract
Ancient undeciphered scripts present problems of different nature, not just tied to linguistic identification. The undeciphered Cypro-Minoan script from second millennium BCE Cyprus, for instance, currently does not have a standardized, definitive inventory of signs, and, in addition, stands divided into three separate subgroups (CM1, CM2, CM3), which have also been alleged to record different languages. However, this state of the art is not consensually accepted by the experts. In this article, we aim to apply a method that can aid to shed light on the tripartite division, to assess if it holds up against a multi-pronged, multi-disciplinary approach. This involves considerations linked to paleography (shapes of individual signs) and epigraphy (writing style tied to the support used), and crucially, deep learning-based strategies. These automatic methods, which are widely adopted in many fields such as computer vision and computational linguistics, allow us to look from an innovative perspective at the specific issues presented by ancient, poorly understood scripts in general, and Cypro-Minoan in particular. The usage of a state-of-the-art convolutional neural model that is unsupervised, and therefore does not use any prior knowledge of the script, is still underrepresented in the study of undeciphered writing systems, and helps to investigate the tripartite division from a fresh standpoint. The conclusions we reached show that: 1. the use of different media skews to a large extent the uniformity of the sign shapes; 2. the application of several neural techniques confirm this, since they highlight graphic proximity among signs inscribed on similar supports; 3. multi-stranded approaches prove to be a successful tool to investigate ancient scripts whose language is still unidentified. 
More crucially, these aspects, together, point in the same direction, namely the validation of a unitary, single Cypro-Minoan script, rather than the current division into three subgroups.
Cypro-Minoan is the term commonly used to describe a group of inscriptions dated to the latter part of the second millennium BCE, found mainly on the island of Cyprus. A dozen inscriptions also come from the port-town of Ugarit on the coast of Syria, and three have been found at Tiryns, in Greece. Currently, these inscriptions are undeciphered, which implies that the language (or languages) recorded has not been identified. The script in which they are written is syllabic and derives directly from the Linear A script of Crete [1-4], whose status is undeciphered too. Cypro-Minoan is found on a diverse array of epigraphic supports, from tablets (though in minimal quantities), to small clay balls, metal vessels, ceramic vessel fragments and other media [5, 6]. The overall corpus counts fewer than 300 inscribed objects. The purpose of this article is to address some outstanding problems related to the inventory of signs in the syllabary, which is still undetermined, and to reassess a long-standing issue tied to the internal structure of the script itself. In light of its varied epigraphic nature, Cypro-Minoan currently stands divided into three separate, alleged sub-scripts (CM1, CM2, CM3). This implies that we do not have solid grounds to infer the homogeneous nature of the script itself. Moreover, the first proponent of the division suggested that the three alleged scripts also recorded different languages [1], so there are linguistic implications to consider. To shed light on this state of affairs and on the nature of the script, we have applied specific deep learning techniques, together with other traditional methods of analysis. Indeed, our approach involves a multi-stranded methodology, merging epigraphic and paleographic considerations with the application of deep neural networks.
This synergy has never been applied before to any of the undeciphered scripts from the Aegean area, Linear A or the Cretan Hieroglyphic script, or indeed Cypro-Minoan. In recent years, neural networks have risen to prominence in many fields, thanks to the increased availability of data and progress in the methods. They have proved useful in advancing our understanding of ancient writing systems, too, for instance to reconstruct documents from the Greek [7, 8] and Babylonian [9] cultures whose content is damaged, broken or incomplete. Other approaches have applied various machine learning techniques to investigate the scribal attribution of one of the Dead Sea scrolls [10] and of Linear B signs [11], identifying the hands responsible for their composition. Different approaches, however, are needed when dealing with scripts that are undeciphered. For example, a neural-network-based approach was used to classify the presence of textual content in texts written in the Indus valley script [12], but to the best of our knowledge, no deep learning method has been proposed to investigate the sign inventory of an undeciphered writing system. To investigate Cypro-Minoan, we necessarily had to use an unsupervised deep learning technique, since the status of the sign inventory is still uncertain. The neural architecture we adopted marks the first attempt to investigate an ancient undeciphered writing system using an unsupervised method, and shows promise for further applications of neural networks in the field.
A tripartite division
According to the current reference corpus produced by Jean-Pierre Olivier [5], Cypro-Minoan (CM) inscriptions mainly comprise three different scripts, termed CM1, CM2, and CM3, with a total of 96 different syllabic signs in the overall inventory (Fig 1). One early inscription is thought to represent an ‘archaic’ stage of CM, closer to Minoan Linear A, termed CM0. According to this divisive classification, especially as theorized originally in 1974 [1], CM1 was said to include most of the inscriptions in the corpus (over 200 heterogeneous texts), to use the highest number of signs, and to represent the main writing system on Cyprus throughout most of the Late Bronze Age (ca. 1525–1050 BCE). CM2 was considered a system derived from CM1, represented only by four clay tablet fragments from the Cypriot site of Enkomi (dated no later than the 12th century BCE), and therefore of restricted use. Finally, CM3 was initially described as a second derivative of CM1, represented by some of the inscriptions from the port-town of Ugarit in Syria and adapted to write the local Ugaritic language (although Olivier later redefined CM3 on a geographical basis, as the whole set of inscriptions from Syria).
Fig 1
Repertoire of 96 Cypro-Minoan syllabograms and their classification in three sub-corpora according to Olivier [5].
Therefore, 32 signs were supposedly shared by CM1, CM2 and CM3, while other signs were only common to two of these alleged subgroups, or even peculiar to just one (cf. Fig 1). Following the reasoning behind Masson’s original classification [1], signs unique to either CM2 or CM3 were assumed to have been adaptive innovations to write sounds of languages distinct from the language of CM1, thus explaining their absence from the latter sub-corpus. By the same token, signs absent from only CM2 or CM3 could allegedly have been discarded for linguistic reasons as well. The tripartite division of CM has been challenged in several works [2–4, 13–16]. One important criticism is that it does not fully consider the way in which different media and inscribing techniques impinge on the shapes of signs. This affects how proponents of the division have identified signs supposedly peculiar to CM2 and CM3, and then interpreted such potential signs as innovations and as evidence that CM2 and CM3 were scripts derived from CM1. In terms of method, distinguishing variants of the same sign (e.g. our letters Q and q) is a trivial task when the writing system is known. Conversely, it can be problematic when the script is undeciphered. Different signs may look very similar across inscriptions if two hands intervene, and the opposite holds true too. With Cypro-Minoan, it has been argued that there are inconsistencies in the way signs (graphemes) have been distinguished from mere variants of the same sign (allographs). This means that a sign exclusive to one CM subgroup might be just the allograph of a sign from the other subgroups. For example, the reference editions recognize that certain signs (e.g. 070, 087, 092) have angular and compressed forms on clay tablets, all assigned to either CM2 or CM3, but elongated variants in various types of documents classed as CM1.
Conversely, while shapes 088 (CM1), 089 (CM2), 090 (CM2) show the exact same graphic behavior, they were catalogued as three different signs (Fig 2) [3].
Fig 2
Incoherent classification of signs 070, 087 and 092 vis-a-vis 088, 089, 090 in [5] (after [3]).
At the same time, other scholars have recently argued that CM2 [17] and CM3 [18] are separate scripts. Arguments for or against the division of Cypro-Minoan are tied to the problem of the correct identification of signs, and the classification remains a debated issue. The reason is that specialists can have different views as to whether two CM sign shapes are similar or not, and therefore on whether they represent variants or distinct graphemes. Paleographic assessment remains subjective to an extent, and the problem is aggravated by the lack of longer and more homogeneous inscriptions. This calls for a more neutral strategy to assess the classification of CM.
Materials and methods
The dataset
Our dataset contains images of signs from the CM inscriptions as published in the two reference catalogues [5, 6] and later works [3, 19–24]. This starting dataset comprised 3499 sign shapes from 230 inscriptions. This figure excludes data from 18 inscriptions without a published or usable drawing (totaling approximately 63 signs), 5 single-sign inscriptions (often considered ‘marks’ rather than writing stricto sensu), and 5 objects which are unepigraphic or whose status as proper inscriptions is doubted. Out of the 3499 initial signs, we removed 556 that were damaged, which meant that 10 inscriptions comprising only damaged text (= 31 sign images) were entirely discarded as well. As our method considers the positional distribution of CM signs in sequences, and this data may be skewed if inscriptions written in a different script or language were included, our protocol also excluded:
The ‘archaic’ inscription ##001 (23 signs), as it differs significantly from the rest of the corpus in terms of chronology and paleography, and 19 of its signs do not repeat [3, 25];
6 inscriptions (23 signs) belonging or suspected of belonging to the later Cypro-Greek syllabary (dated from ca. 1050 BCE onwards), which is a different writing system deciphered as a dialect of ancient Greek (##092, following [26], and ##170–172 and ##189–190, as discussed in [3, 14, 27]).
Finally, we further excluded the last two signs of ##088, as their proper segmentation is debated (making the current drawings unusable) [3].
In total, after 600 sign images were excluded, we were left with a set of 2899 signs from 213 inscriptions (see S1 Table for the comparison between our dataset and the extant CM corpus).
Henceforth, by “dataset” we will mean our filtered dataset, which does not comprise the damaged material and excluded inscriptions. All categories of signs that have been identified in Cypro-Minoan are represented: signs used in sequences (syllabograms), two alleged logograms, numerical signs and punctuation signs. Out of the 96 categories of syllabograms established by Olivier, 95 are present in the dataset. This is because one sign shape, 083, is actually a ‘ghost sign’: its single attestation is doubtful [5] (and hence excluded here). 084 is in a similar situation, but a recent publication has identified this shape in a new inscription from Erimi-Kafkalla, so we had to count it for our purposes. As shown by the figures reported above, our dataset comprises the majority of the extant CM corpus, and is therefore representative. The CM signs in the dataset are represented by individual images as drawn in the published editions, which at present constitute the starting point for paleographic discussions. The digitization process used for the dataset started with a high-contrast, black-and-white scan of the pages of published drawings. Each sign drawing was then manually cropped and annotated by inscription, position in the inscription, and published transcription (i.e. each sign image was assigned the sign number provided in the published editions of the inscriptions). The transcriptions follow the editions of Olivier [5] in the case of inscriptions ##002-##217, and Ferrara [6] and individual publications in the case of material edited after 2007 (labelled with ##ADD). Afterwards, a threshold was applied to remove noise resulting from the scan, by fitting a quadratic polynomial curve to the color histogram of the image and selecting its minimum as the threshold, so that any color with a value higher than the threshold is considered white.
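The histogram-based threshold selection just described can be sketched in a few lines of Python. This is only an illustration of the procedure: the bin count and the fallback value are our assumptions, not details from the text.

```python
import numpy as np

def histogram_threshold(gray):
    """Select a binarization threshold by fitting a quadratic polynomial
    to the grayscale histogram and taking the parabola's minimum, as
    described in the text (bin count and fallback are our assumptions)."""
    counts, edges = np.histogram(gray, bins=256, range=(0, 256))
    centers = (edges[:-1] + edges[1:]) / 2
    a, b, c = np.polyfit(centers, counts, deg=2)  # counts ~ a*x^2 + b*x + c
    if a <= 0:
        return 128.0  # downward parabola has no interior minimum: fallback
    return -b / (2 * a)  # vertex = minimum of the fitted curve

def binarize(gray):
    """Pixels above the threshold become white (255), the rest black (0)."""
    t = histogram_threshold(gray)
    return np.where(gray > t, 255, 0).astype(np.uint8)
```

For a bimodal scan (dark ink strokes, light paper), the fitted parabola dips in the sparsely populated mid-tones, which is where the threshold lands.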
Whenever artifacts from the scan could not be removed automatically, we applied the Potrace algorithm [28] to the cleaned images and then retraced them manually to fully match the original. Issues with the drawings of some inscriptions in their current editions have been raised [3], but until a new corpus or dataset with revised illustrations is published, they remain the reference and starting point for all scholarly discussions on CM. Thus, we have purposefully not altered such drawings. It is worthwhile to underline certain properties of our CM dataset (as described above). The first is the geographic provenance of the inscriptions and the signs that make them up. Fig 3 shows the number of signs in the dataset found at each archaeological site in Cyprus and in Syria. It emerges that the largest portion of the dataset originates from the Cypriot site of Enkomi, and a significant amount from Kalavasos-Ayios Dhimitrios (also on Cyprus) and from Syria, while the rest of the dataset comprises smaller amounts of material from various other Cypriot sites.
Fig 3
Distribution of CM signs in our dataset according to archaeological site.
Our dataset contains 1153 signs of CM1, 1430 of CM2 and 316 of CM3. We therefore have a relatively similar amount of data for CM1 and CM2, while CM3 constitutes a smaller, though not irrelevant, fraction of the total number of signs. Moreover, 1732 signs are from clay tablets, while 1167 are found on other types of documents. That most CM signs are found on tablets is unsurprising, as these documents tend to contain longer texts. Still, the amount of data on other supports is an important fraction of the total. The script as represented in the dataset shows a severely Zipfian distribution, which can be observed in the frequency of the most attested signs with secure readings (as provided in the reference editions) (see Table 1). The sequence divider (|) is by far the most frequent sign, while the rest of the signs are comparatively rare.
Table 1
Number of attestations for the 15 most frequent signs in the dataset.
Sign     Attestations
|        466
023      109
102      89
082      84
004      74
025      71
006      65
027      65
075      63
009      60
097      57
087      48
104      44
038      42
107      42
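The skew reported in Table 1 can be inspected with a few lines of Python (the counts are copied from the table; the comparison with an ideal Zipf curve is illustrative):

```python
# Attestation counts from Table 1, most frequent sign first (the divider).
counts = [466, 109, 89, 84, 74, 71, 65, 65, 63, 60, 57, 48, 44, 42, 42]

# Under an ideal Zipf law, frequency is proportional to 1/rank, so the
# top count divided by the count at rank r approximates r. The divider's
# dominance shows in how steeply the ratio jumps after rank 1.
ratios = [counts[0] / c for c in counts]
```

The ratio between the divider and the second-ranked sign is already above 4, i.e. steeper than rank alone would predict, which quantifies the divider's outsized share of the data.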
The situation is even more problematic when we consider sign trigrams. Table 2 shows the 10 most attested sequences of three signs, of which 8 involve a divider. In fact, sign sequences that contain at least one divider constitute 57% of all sign trigrams. This has implications for our aim of building an unsupervised model that considers the positional distribution of signs in sequences, rather than just their shape. Since dividers are so prominent, any method directed towards this goal needed to use this type of sign as a starting point, as we will argue in the next section.
Table 2
Number of attestations for the 10 most frequent sign trigrams in the dataset.
Trigram    Attestations
51-28-|    11
|-51-28    9
|-102-75   7
|-102-35   6
|-38-33    5
4-75-|     4
4-87-25    4
4-97-|     4
6-82-|     4
9-60-59    4
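The trigram statistics above can be reproduced from a list of sign labels with a short sketch like the following (the sequence and labels are illustrative, not real CM data):

```python
from collections import Counter

def trigram_stats(signs, divider="|"):
    """Count sign trigrams in one inscription's sign sequence and report
    the share of trigrams containing at least one divider (a sketch of
    the 57% statistic quoted in the text)."""
    trigrams = [tuple(signs[i:i + 3]) for i in range(len(signs) - 2)]
    counts = Counter(trigrams)
    with_divider = sum(n for tg, n in counts.items() if divider in tg)
    total = sum(counts.values())
    return counts, (with_divider / total) if total else 0.0

# illustrative sequence, not a real CM inscription
counts, share = trigram_stats(["51", "28", "|", "102", "75", "|", "102", "35"])
```

Applied to the full dataset (one sequence per inscription, counts pooled), this yields the divider share reported in the text.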
DeepCluster-V2
To shed light on the inventory and classification of CM signs from an unbiased perspective, we tested the application of an unsupervised approach, whereby the model learns the relationships between signs without using any prior knowledge. Convolutional Neural Networks (CNNs or convnets) [29] are computational models able to analyse images from different perspectives and perform different tasks on them, consistently reaching very high rates of success. As almost all neural systems dealing with images in some capacity are based on CNNs [30], they seemed most fitting for our purposes. We based our model on an unsupervised method called DeepCluster-v2. This model uses a CNN with residual connections (ResNet) [31], a specific kind of CNN, to perform automatic clustering of images. DeepCluster-v2 is based on DeepCluster [32], which uses a K-Means clustering algorithm to extract pseudo-labels from the images, which in turn are used to train a classifier over them (Fig 4). The clustering step operates on normalized (unit-norm) vectors and uses the dot product as the distance metric, which is equivalent to cosine distance on a hypersphere. The main issue with this model, however, is that the classification layer used to train the model on the pseudo-labels needs to be reinitialized at every epoch, as the assignments of images to clusters change during training. This leads to model instability and impacts its performance. DeepCluster-v2 addresses this problem by replacing the last layer of the model with the centroids obtained from K-Means. Applying the last layer to the vectors then corresponds to taking the dot product between each vector and each centroid. Since both the centroids and the vectors are normalized to unit norm, this corresponds to calculating the cosine proximity between centroids and vectors. Furthermore, data augmentation is applied to the images, in the form of random crops, color distortion and random horizontal flips.
This way, multiple augmented versions of each image are used to train the model. Finally, additional improvements, such as a Multi-Layer Perceptron (MLP) projection head and a cosine learning rate schedule, are added to the model.
Fig 4
DeepCluster structure [32].
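The centroids-as-classifier idea behind DeepCluster-v2 can be sketched in NumPy. This is a minimal illustration of the mechanism described above, not the authors' implementation; the temperature value is our assumption.

```python
import numpy as np

def unit(v):
    """L2-normalize the rows of a matrix."""
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def cluster_logits(features, centroids):
    """DeepCluster-v2-style head (sketch): the classifier's last layer is
    replaced by the unit-norm K-Means centroids, so applying it computes
    the cosine similarity between each embedding and each centroid."""
    return unit(features) @ unit(centroids).T

def pseudo_label_loss(logits, assignments, temperature=0.1):
    """Softmax cross entropy against the K-Means cluster assignments
    (the temperature value is our assumption, not from the text)."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scaled - np.log(np.exp(scaled).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(assignments)), assignments].mean()
```

Because both embeddings and centroids are unit-norm, every logit is a cosine similarity in [-1, 1], and no classification layer needs to be reinitialized when the cluster assignments change.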
Since no gold-standard categorization exists for the signs of CM, we used a dimensionality reduction technique, namely t-distributed stochastic neighbor embedding (t-SNE) [33], to project the high-dimensional latent space into 3D for a preliminary visual inspection of the results. The use we made of this neural approach, other validation steps for the model, and parameter selection will be discussed in the next section.
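The 3D projection step can be reproduced with scikit-learn's t-SNE implementation. The input here is random stand-in data, and the hyperparameters are illustrative choices, not the paper's settings.

```python
import numpy as np
from sklearn.manifold import TSNE

# stand-in for the concatenated high-dimensional latent vectors
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 64))

# project into 3D for visual inspection; perplexity must stay below the
# number of samples (the value chosen here is an assumption)
tsne = TSNE(n_components=3, perplexity=20, init="random", random_state=0)
coords = tsne.fit_transform(latent)  # one 3D point per sign image
```

The resulting `coords` array is what gets rendered as the 3D scatter plot.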
Unsupervised model for undeciphered scripts: Sign2Vec
As graphic similarity between any two sign shapes is not a sufficient criterion to prove that they represent the same grapheme, we needed a method that overcame this limitation. Thus, we modified DeepCluster-v2 to be context-aware and adapted the model to our purposes. By ‘context’ we mean the collocation of any sign with respect to other signs in any string of Cypro-Minoan text. This follows the premise that any sign in a writing system is bound to occur more frequently in certain positions within a sequence (a word or phrase) by virtue of its sound value and the distribution of that sound (or sounds) in the underlying language. Thus, by deeming any two signs more closely related considering not just how similar their own shapes are, but also how much their neighboring signs resemble one another, the factor of chance resemblance is reduced. As the number of damaged signs in our dataset and the short length of many of the texts limit the amount of available contexts significantly, we leveraged an important aspect of CM: the frequent use of dividers that separate sequences. Since some signs are found predominantly in sequence-initial or sequence-final position, we taught the model to predict, based on the images in the left and right positions of a trigram, whether the central image is a divider. We named the resulting model Sign2Vec (Fig 5). The implementations of Sign2Vec and DeepCluster-v2 used in this paper are hosted at https://github.com/ashmikuz/sign2vec_d.
Fig 5
Sign2Vec (signs drawn after [5]).
For each non-damaged sign X we considered the non-damaged signs before and after it (L and R) to predict whether X is a sequence divider. Three identical ResNet CNNs analyse the sign images to produce intermediate representations, and three identical MLP networks then project the outputs of the ResNets. The output obtained from the central sign (M(X)) is then multiplied with the centroid matrix obtained from K-Means (C). The concatenation of the vectors obtained from the context (M(L, R)) is then fed to a linear layer that performs a binary classification task, to decide whether X is a divider or not. The Sign2Vec loss is then defined as the sum of two components: the first is the categorical cross entropy H between the cosine similarities of the latent representation of the sign and the centroids, M(X), and the actual centroid Y obtained from K-Means. The second component is the binary cross entropy H between the linear projection of the concatenated vectors representing the left and right context images, denoted as M(L, R), and a binary value that indicates whether the central sign between L and R is a divider. The λ constant is used to weight the relative importance of the two components of the loss function; in our experiments it was set to 0.7. CM sequences found in isolation or at the beginning and end of inscriptions tend to lack dividers at one or both of their extremes. In such cases, a solution was needed to mark the context of initial and final signs. Thus, an image of a divider, selected randomly from the many separators attested in the dataset, was imposed as the left context of an unmarked beginning or the right context of an unmarked end of a sequence.
When the real beginning or end of a sequence is unknown because the inscription is damaged or broken, context was instead marked by a randomly generated artificial image comprised of dots that denote damage, following the convention for the representation of damaged text in the drawings of the editions of Olivier [5]. Finally, since the contextual component of Sign2Vec is tasked with predicting whether the central sign of a trigram is a divider, we also needed to consider the artificial dividers we inserted at the ends of inscriptions. A fully black image was used to denote the end of a document whenever there was no sign after an artificial divider. We trained 20 different models initialized with different random parameters. This procedure makes the model less susceptible to the random initialization of its parameters. The full model parameters are presented in S2 Table. Afterwards, we used the 20 trained models to perform statistical tests or to concatenate the vector representations of the signs. We then applied the model to the CM dataset and examined the result by creating a 3D scatter plot from a t-SNE projection of the concatenated learned representation of each sign from all 20 models. This visualization suggested that our model learns a meaningful approximation of the distinction between graphemes in CM. This procedure was repeated for both DeepCluster-v2 and Sign2Vec.
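The two-part loss described above can be sketched as follows. How exactly λ enters the sum and the softmax/sigmoid details are our reading of the description, and the input values in the usage line are illustrative.

```python
import numpy as np

def sign2vec_loss(centroid_sims, centroid_label, divider_logit, is_divider,
                  lam=0.7):
    """Sketch of the Sign2Vec loss: categorical cross entropy between the
    central sign's cosine similarities to the centroids and its K-Means
    assignment, plus lambda times the binary cross entropy of the
    divider prediction made from the context signs."""
    # categorical component: softmax over similarities (our assumption)
    s = centroid_sims - centroid_sims.max()
    log_probs = s - np.log(np.exp(s).sum())
    h_categorical = -log_probs[centroid_label]
    # binary component: sigmoid + binary cross entropy on the context output
    p = 1.0 / (1.0 + np.exp(-divider_logit))
    h_binary = -(is_divider * np.log(p) + (1 - is_divider) * np.log(1.0 - p))
    return h_categorical + lam * h_binary

# illustrative values: 3 centroids, assigned to centroid 0,
# context strongly predicting "divider" for an actual divider
loss = sign2vec_loss(np.array([0.9, 0.1, -0.5]), 0, 2.0, 1)
```

With λ = 0.7, the clustering objective dominates while the divider-prediction task still shapes the shared representations.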
Results and discussion
The 2560-dimensional representation obtained by applying the proposed neural model, namely Sign2Vec, and the baseline, DeepCluster-v2, was the starting point for any further processing and paleographic consideration.
Arrangement of signs in two paleographic subgroups
We produced a 2560-dimensional representation by applying the proposed neural model, Sign2Vec, and the baseline, DeepCluster-v2 (results hosted at http://corpora.ficlit.unibo.it/INSCRIBE/PaperCM/). In the resulting 3D scatter plot, we observed that at the macro level CM signs were distributed in two main groups: at its periphery, the hypersphere contained mainly signs from clay tablets, whereas at the core we found mostly signs inscribed on other types of objects (Fig 6). This separation is similar to the traditional division between CM1 and CM2, though not completely equivalent, as signs from inscriptions classed as CM3 are found both on clay tablets and other media. At closer range, we observed the same tendency: some of the consensually established CM graphemes appear either in a single “filament” (meaning that the sign images stay on the same branch of the plot) or as two clusters, but along an invisible axis traced from the core to the periphery. Thus, the instances of these graphemes inscribed on clay tablets (i.e., CM2 and some CM3 inscriptions) appear mainly at the outskirts of the plot, whereas attestations on other kinds of media (i.e., CM1 and other CM3 inscriptions) are placed mainly at the center (see e.g. sign 097 in Fig 7).
Fig 6
Separation of CM signs from clay tablets (in green) and signs found in other types of inscription (in red) in the 3D scatter plot.
Fig 7
Separation of a CM grapheme in two groups in the 3D scatter plot.
Example of sign 097.
Importantly, while filaments or continua of clusters appear to represent graphemes (at least some of them), single clusters often do not. In fact, clusters in different parts of the scatter plot often blended sign shapes that scholars have classed as distinct graphemes (for example, because they co-exist and contrast in the same inscription). Hence, we inferred that the model tended to arrange individual graphemes as filaments of sign images running from the core to the peripheral parts of the sphere, whereas the different layers of said sphere mainly reflect specific paleographic styles of signs. Because the outermost layer mainly displays signs from clay tablets (classed as CM2 as well as CM3), which characteristically have more angular (“squarish”) shapes and fewer strokes, we also inferred that this layer reflects such shapes for multiple graphemes. That our model arranged CM signs in this way without supervision provides independent evidence in favor of the hypothesis that CM2 is not a distinct writing system, but rather a specific script style employed on clay tablets from Enkomi. If this is correct, the implication is that the sign shapes peculiar to CM2 are not separate graphemes absent from the script found in CM1 and CM3 inscriptions. Rather, they would constitute variants of signs found in the other subcorpora, but with different forms (generally more angular or ‘compressed’, and/or with fewer strokes). Another implication of the model is that we should expect to find certain shapes arranged along the same core-periphery axis if they were variants of the same CM grapheme. We therefore used this property of the model to test the hypothesis that certain signs individuated by Olivier are rather allographs. However, before doing so, we needed to consider one potential issue with the data.
Preliminary relabelling of single signs
The 3D scatter plot obtained from a t-SNE projection of the 2560-dimensional concatenation of the 20 models’ outputs highlighted 27 instances of signs whose reading (transcription) in the published editions is incoherent even in terms of the conventional classification (example in Fig 8), and for which corrections have previously been proposed [3] in most cases. In other words, some signs appear clustered with similar or identical shapes in the scatter plot, but their labels (which follow the reference transcriptions) differ. These problematic labels include cases that are simply obvious misprints in Olivier’s edited corpus [5]. As subsequent quantitative analyses of the results were based also on these labels, any incorrect transcription would affect their accuracy. Thus, we performed a set of preliminary tests on both DeepCluster-v2 and Sign2Vec to validate suggested corrections for these 27 instances (S3 and S4 Tables, respectively).
Fig 8
Example of sign with incoherent label (reading) in the editions of CM [5]: The sign marked with a contour has been transcribed as 049, even though its shape is consistent with 052.
Such cases were tested for correction by means of a non-parametric Mann–Whitney U-test.
We used the cosine distances between sign images to test statistically whether a proposed correction was better supported by the models than the “conventional” reading. Note that the cosine distance is a valid metric for the vector space, as it is the distance metric used by the K-Means step of DeepCluster-v2. For each sign with a problematic transcription, we applied a non-parametric Mann-Whitney U-test [34] comparing the distances between the sign in question and the current transcription vs. the distances between the sign and the proposed correction, across all 20 models. Instead of using the concatenated, 2560-dimensional vector representation of signs, we derived the distances between the signs in all 20 vector spaces and compared their distributions by means of the statistical test. We set the significance level at 0.05 and applied corrections when the test indicated that the population of distances from the sign to the correction was significantly smaller than the population of distances from the sign to the current reading. We applied this method to both the DeepCluster-v2 and Sign2Vec models. Out of a total of 31 tests performed, the results led us to re-label 26 and 20 single signs, respectively.
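The relabelling test can be sketched with SciPy's Mann-Whitney U-test. The one-sided direction and the synthetic distance samples below are our assumptions for illustration, not the paper's data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def supports_correction(dist_to_current, dist_to_correction, alpha=0.05):
    """Sketch of the relabelling test: across the trained models, compare
    the cosine distances from a problematic sign to its current reading
    vs. to the proposed correction. A one-sided Mann-Whitney U-test asks
    whether the distances to the correction are stochastically smaller
    (the one-sided direction is our reading of the procedure)."""
    _, p = mannwhitneyu(dist_to_correction, dist_to_current,
                        alternative="less")
    return p < alpha

# illustrative distance samples, one value per trained model (not real data)
rng = np.random.default_rng(0)
d_current = rng.normal(0.8, 0.05, size=20)     # far from the current reading
d_correction = rng.normal(0.3, 0.05, size=20)  # close to the proposed sign
```

When the two distance populations are well separated, as in this synthetic case, the test supports the correction; overlapping populations leave the published reading in place.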
The paleographic vector
We divided our dataset into two subgroups, reflecting the core/periphery separation displayed by the 3D scatter plot rather than the tripartite division into CM1, CM2 and CM3. Thus, one subset was termed Tablet and comprised inscriptions on clay tablets (again, the equivalent of CM2 and part of CM3); the other was labelled Other and, as the name implies, included all other documents (CM1 and another part of CM3).

We then hypothesized that in the 2560-dimensional space, obtained by concatenating the latent representations of signs from the 20 models, we could find a vector that would encode the arrangement of instances of a single grapheme along this Other-Tablet axis. First and foremost, we would expect this to be true of signs which scholars already agree are well attested and shared by the two larger subcorpora of CM, CM1 and CM2 (which means both clay tablets and other types of inscriptions). Thus, if the vector existed, it could then be used to find potentially missing correspondences between signs in the Other subset and their allographs in the Tablet subset (see Fig 9). Put differently, we would obtain evidence that some signs so far catalogued separately are allographs if the vector encoded the same Other > Tablet direction both for sign shapes accepted as single graphemes and for sign shapes thought to represent distinct graphemes (according to Olivier's classification).
Fig 9
On the left: The paleographic vector (red) was first calculated from the difference vector (black) between centroids of signs from clay tablets and centroids of signs from other documents.
On the right: the vector was applied to find a missing correspondence between sign shapes from the same two sets.
Formally, the paleographic vector v is defined as follows:
v = (1/N) ∑_{i=1}^{N} [c(T_i) − c(O_i)]    (Eq 1)

Where:
N is the total number of signs present on both types of documents;
T_i, O_i are sets containing the attestations of sign i that are on tablets (Tablet) and on other documents (Other), respectively;
c(X) denotes the centroid (mean) of a given set.

In Eq 1, we use the centroids of signs in the two subgroups to compute an average direction for the alignment of sign variants from clay tablets (Tablet) and sign variants from other types of inscription (Other). If this paleographic vector, v, is able to encode this direction, we would expect that:
argmin_{T ∈ 𝒯} δ(c(O) + v, c(T)) ≈ O

Where 𝒯 is the set of all sets of signs on tablets and δ is our distance metric, the cosine distance. We use ≈ to denote that two sets contain the same sign. We applied the vector to the centroid of the attestations of a given sign as found on inscriptions that are not clay tablets and looked for the closest centroid among the signs from tablets. We then expect T and O to be the same grapheme.
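To make Eq 1 and the matching step concrete, here is a minimal sketch with toy 2-D vectors standing in for the 2560-dimensional representations (all names and data are illustrative, not the authors' code):

```python
import math

def centroid(vectors):
    """Mean of a set of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (math.sqrt(sum(a * a for a in u)) *
                        math.sqrt(sum(b * b for b in v)))

def paleographic_vector(pairs):
    """pairs: list of (tablet_attestations, other_attestations), one per sign.

    Implements v = (1/N) * sum_i [c(T_i) - c(O_i)].
    """
    n = len(pairs)
    dims = len(pairs[0][0][0])
    v = [0.0] * dims
    for tablet_set, other_set in pairs:
        c_t, c_o = centroid(tablet_set), centroid(other_set)
        for d in range(dims):
            v[d] += (c_t[d] - c_o[d]) / n
    return v

def closest_tablet_match(other_set, v, tablet_centroids):
    """Project the Other centroid along v; return the nearest Tablet sign."""
    c_o = centroid(other_set)
    projected = [a + b for a, b in zip(c_o, v)]
    return min(tablet_centroids,
               key=lambda kv: cosine_distance(projected, kv[1]))[0]

# Toy example: Tablet variants sit at the Other variants shifted by (+2, 0)
pairs = [
    ([[3.0, 1.0], [3.2, 1.0]], [[1.0, 1.0], [1.2, 1.0]]),
    ([[3.0, 2.0]], [[1.0, 2.0]]),
]
v = paleographic_vector(pairs)
tablet_centroids = [("sign_A", centroid(pairs[0][0])),
                    ("sign_B", centroid(pairs[1][0]))]
match = closest_tablet_match(pairs[1][1], v, tablet_centroids)
```

In the toy data the learned vector recovers the (+2, 0) offset, and projecting the Other centroid of the second sign lands on its own Tablet centroid.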
Validating the paleographic vector
To test the hypothesis, we applied the paleographic vector to 32 consensual graphemes attested in the two largest traditional subgroups, CM1 and CM2 (insofar as they are also present in our dataset): 001, 004, 005, 006, 008, 009, 011, 012, 017, 021, 023, 024, 025, 027, 028, 033, 036, 037, 038, 044, 061, 068, 070, 075, 082, 087, 096, 097, 102, 104, 107, 110 (Fig 10). The premise is the following: if the vector is largely able to match the variants of graphemes from CM1 with their counterparts in CM2, along the Other-Tablet axis, then it is valid. Note that we considered CM signs attested in inscriptions classed as CM1 and CM2 even if they are not also attested in CM3, because the latter subgroup comprises far fewer inscriptions: if a sign is not yet attested in CM3, there is a greater chance that its absence is accidental.
Fig 10
32 consensual signs attested both in the Other and Tablet subsets.
We applied the vector to these 32 signs, taking as starting point their instances in the subset Other. After "projecting" it, for each sign we produced two rankings of the closest matches from the subset Tablet. These rankings (S5 and S6 Tables) were obtained from DeepClusterv2 and Sign2Vec, respectively. The procedure combined the calculation of the vector with its validation: each time we applied the vector to a sign, we calculated it by excluding the sign under scrutiny from the computation. We wanted to test whether the Other > Tablet direction learned from the other 31 signs was sufficiently general to predict the direction of the excluded sign. The second aim was to compare the results of DeepClusterv2 and Sign2Vec and determine whether the context-aware version of the model effectively achieved more accurate results. The results (Table 3) indicate that the paleographic vector accurately reconstructs the match between Other and Tablet instances of the same grapheme, and does so more accurately in its context-aware Sign2Vec version. We evaluate performance using top-N accuracy, which measures the proportion of tests in which the expected sign is found in the top N positions when the paleographic vector is applied. Sign2Vec has a top-one accuracy of 0.69, while its top-two accuracy is 0.81. This means that for 22 out of 32 signs (≈69%) the correct Tablet instance was the first (closest) prediction; if we also consider the signs that appear as the second-closest prediction, the rate of success rises to 81%.
Table 3
Application of the paleographic vector to consensual graphemes using both DeepClusterv2 and Sign2Vec.
Model | Top-1 Accuracy | Top-2 Accuracy | Top-3 Accuracy | Top-5 Accuracy
DeepClusterv2 | 0.66 | 0.75 | 0.81 | 0.97
Sign2Vecd | 0.69 | 0.81 | 0.94 | 1.0
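The exclude-one validation described above can be sketched as follows (a toy illustration under assumed helper names, not the published pipeline): for each held-out sign, the paleographic vector is recomputed from the remaining signs, the held-out Other centroid is projected, and every Tablet centroid is ranked by cosine distance.

```python
import math

def centroid(vs):
    return [sum(v[d] for v in vs) / len(vs) for d in range(len(vs[0]))]

def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (math.sqrt(sum(a * a for a in u)) *
                        math.sqrt(sum(b * b for b in v)))

def loo_rankings(signs):
    """signs: dict name -> (tablet_attestations, other_attestations).

    For each sign, compute the paleographic vector from all *other* signs
    (leave-one-out), project the held-out sign's Other centroid, and rank
    every Tablet centroid by cosine distance.
    """
    names = list(signs)
    rankings = {}
    for held_out in names:
        rest = [n for n in names if n != held_out]
        dims = len(signs[held_out][0][0])
        v = [0.0] * dims
        for n in rest:
            c_t, c_o = centroid(signs[n][0]), centroid(signs[n][1])
            for d in range(dims):
                v[d] += (c_t[d] - c_o[d]) / len(rest)
        projected = [a + b for a, b in zip(centroid(signs[held_out][1]), v)]
        rankings[held_out] = sorted(
            names, key=lambda s: cosine_distance(projected, centroid(signs[s][0])))
    return rankings

def top_n_accuracy(rankings, n):
    """Share of signs whose own Tablet centroid appears in the first n ranks."""
    hits = sum(1 for name, ranked in rankings.items() if name in ranked[:n])
    return hits / len(rankings)

# Toy demo: three signs whose Tablet variants are the Other variants shifted by (+2, 0)
demo = {
    "A": ([[3.0, 1.0]], [[1.0, 1.0]]),
    "B": ([[3.0, 5.0]], [[1.0, 5.0]]),
    "C": ([[7.0, 3.0]], [[5.0, 3.0]]),
}
demo_rankings = loo_rankings(demo)
```

When the Other > Tablet offset is consistent across signs, as in the demo, the held-out sign's own Tablet centroid is ranked first every time.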
To assess the probability that this highly accurate result is the product of chance, we performed a simple test using a binomial distribution. We used the top-one accuracy value of DeepClusterv2, as it is the lowest level of accuracy attained by either of our models (Table 3). We computed the following tail probability:
P(X ≥ c) = ∑_{k=c}^{n} C(n, k) · p^k · (1 − p)^{n−k}

Where:
c = 21 is the number of signs correctly found in first position by DeepClusterv2;
n = 32 is the number of tests that we performed;
p = 1/64 is the probability of randomly finding the correct sign from 64 alternatives.

The formula shows that the probability of obtaining 21 or more correct answers by chance is 1.28×10⁻³⁰, which shows just how improbable it is that these results are the product of chance. Additionally, we can compute the expected value of the distribution, which is given by E[X] = n·p = 32 × (1/64) = 0.5. In other words, if we matched the signs randomly, we would expect to predict a single sign correctly only half of the time. These results demonstrate the overall validity of the paleographic vector and the greater accuracy of the Sign2Vec model.
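This tail probability can be checked with a few lines of standard-library Python:

```python
import math

def binomial_tail(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(c, n + 1))

p_value = binomial_tail(21, 32, 1 / 64)   # ~1.28e-30
expected = 32 * (1 / 64)                  # 0.5 correct matches expected by chance
```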
The vector as evidence of allography
A set of hypotheses has been put forward [3] which question the validity of the tripartite division and imply that several pairs of sign shapes inventoried separately in Olivier are rather variants of the same grapheme. We tested whether the paleographic vector supports these hypotheses, focusing on two types of situations:

Type 1: Hypotheses of complementary distribution: these propose the merger of pairs of signs where one is allegedly exclusive to CM1 and the other is supposedly peculiar to CM2.

Type 2: Hypotheses that involve the merger of two signs, one scarcely attested and present only in the CM1 subcorpus, the other present in all subcorpora.

In both types of hypotheses, we sought a potentially missing correspondence between a sign shape restricted to the subgroup Other and its possible Tablet counterpart. By applying the vector to a sign restricted to Other (taken as point of departure) and projecting it towards the peripheral layer of the hypersphere, we obtained the closest matches in Tablet (excluding the sign of departure from the rankings with the calculated distances). Thus, for example, Valério [3, 4] hypothesizes that 039 (supposedly restricted to CM1) is merely an allograph of 049 (allegedly an innovation of CM2). To test this proposal, we checked whether the vector indicated 049 as the closest Tablet match for 039, which in the scatter plot lay in the subset Other. The tests of Type 1 (hypotheses of complementary distribution) involve the pairs listed in Fig 11. All these potential pairs of allographs have been proposed in [3, 4]; we also tested the more tentative suggestion of a merger of the pair 073 (CM1/3) + 076 (CM2).
These hypotheses were based on complementary distributions between two very similar sign shapes, one supposedly attested only in CM2 and the other allegedly present only in CM1, or in both CM1 and CM3 (including CM3 clay tablets). This kind of distribution is not directly matched by the Other/Tablet separation observed in the scatter plot, which reveals a broad distinction between signs from clay tablets and signs from other supports. However, this need not be taken as an indication that allographs of a sign must be identical on all clay tablets, both at Ugarit (= CM3) and Enkomi (= CM2). In other words, we contend that we can test the notion that the equivalent of 073 on the clay tablets from Enkomi is 076, while the clay tablets from Ugarit still employed the same shape as other media (073). Other hypotheses involve details that need to be accounted for:
Fig 11
Pairs of CM1 (left column) / CM2 (right column) signs in complementary distribution and hypothesized as variants of the same grapheme.
It has been hypothesized [3] that 053 (attested in CM1 and CM3), 054 (CM2) and 055 (CM3) are the same grapheme; this had to be tested separately, by applying the vector to the pairs 053–054 and 055–054.

What Olivier classed as sign 064 and deemed attested both in CM1 and CM2 was originally catalogued as two different signs by Masson [1]: 064 (CM1) and 065 (CM2). It has now been argued [3, 4], based on the different positional distributions of these shapes, that two different graphemes are indeed represented: the 064 of CM1 (henceforth 064a), mainly sequence-initial, was suggested as the counterpart of the 062 of CM2; and the 064 of CM2 (henceforth 064b), mostly sequence-final, was hypothesized as the counterpart of sign shape 099 of CM1/3 (in addition to shape 100 from CM3). We tested these two parallel mergers.

089 and 090 (CM2) have been hypothesized as representing the same sign, and therefore jointly the counterpart of 088 (CM1). Thus, we tested whether both were indicated as the closest matches.

We tested the 13 hypotheses in Fig 11 by projecting the vector from the shapes attested in CM1 (013, 019, 034, 039, 041, 046, 050, 053, 055, 064a, 073, 088, 099). The acceptable targets were all sign shapes restricted to the subset Tablet in our dataset (010, 029, 040, 047, 049, 051, 054, 056, 060, 062, 064b, 074, 076, 078, 079, 080, 089, 090, 095, 100). The vector yielded the expected result with 8/13 top-one accuracy and 9/13 top-two accuracy (S7 Table). As with the validation process, we estimated the probability that our results were the product of chance. Since two tests (088 vs. 089/090, 099 vs. 064b/100) sought a match with two target Tablet signs, we needed to consider multiple success probabilities, so we resorted to the Poisson binomial distribution. The probability of having 8 or more correct matches by chance was estimated to be very low (p-value = 1.0×10⁻⁷).
We also calculated the mean number of successes that such a distribution would yield, by summing the probabilities of each successful event: E[X] = ∑_i p_i. Therefore, if we chose the matching signs randomly, we would on average get less than one of the matches right. Thus, the probability of achieving the above results by chance is very low, and we interpret them as strong independent evidence in support of the allography hypotheses.

We also tested three hypotheses of Type 2. In this case, both the starting and target sign shapes are present (as sign images) at the core of the scatter plot (Other), but we still expect the vector to direct us from there to the correct target at the periphery (Tablet). Thus, we tested 015, 085 and 101 (attested only in CM1) as potential allographs of the better attested 021, 096, and 102 (CM1-2-3), respectively (Fig 12). The result was again largely positive: 2/3 top-one and top-two accuracy (S8 Table). We then applied the Poisson binomial distribution and obtained a p-value of 1.7×10⁻³. Moreover, the mean value of the distribution is 0.07, implying that we would find a correct match by chance less than 1 out of 10 times.
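Because the tests do not share a single success probability (the two tests with two acceptable targets have twice the chance of a random hit), the null distribution is Poisson binomial rather than binomial. Its exact PMF can be computed by a simple dynamic program; the sketch below uses illustrative per-test chance levels (1/20 for single-target tests and 2/20 for the two double-target tests, assuming 20 candidate Tablet shapes), which are our assumptions, not figures stated by the authors:

```python
def poisson_binomial_pmf(probs):
    """Exact PMF of the number of successes in independent Bernoulli trials
    with heterogeneous probabilities, via dynamic programming."""
    pmf = [1.0]  # probability of 0 successes after 0 trials
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            nxt[k] += q * (1 - p)      # this trial fails
            nxt[k + 1] += q * p        # this trial succeeds
        pmf = nxt
    return pmf

def tail_probability(probs, c):
    """P(X >= c) under the Poisson binomial distribution."""
    return sum(poisson_binomial_pmf(probs)[c:])

# Illustrative chance levels for 11 single-target and 2 double-target tests
chance = [1 / 20] * 11 + [2 / 20] * 2
p_at_least_8 = tail_probability(chance, 8)
mean_successes = sum(chance)  # E[X] = sum_i p_i, here 0.75
```

With equal probabilities the DP reduces to the ordinary binomial, which provides a convenient correctness check.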
Fig 12
Pairs comprised of a shape attested only in CM1 (left column) and a shape attested in all sub-corpora (right column), hypothesized as variants of the same grapheme [3].
Altogether, these results show that the vector aligned instances of consensual graphemes (69% at top one) and signs separated by Olivier (11 out of 16 tests, or 68.75%) with a nearly identical rate of success.
Conclusions
The vectors obtained from our Sign2Vec unsupervised deep-learning model largely separated CM signs found on clay tablets (periphery) from signs attested on other supports (core), as observed in the resulting 3D scatter plot. Moreover, while following this separation, the variants of several consensual graphemes as found on clay tablets and other media were still aligned along an invisible axis running from the core to the periphery of the scatter plot. This is consistent with the notion that the main differentiation among CM signs concerns how the paleographic style of signs written on tablets diverges from that of signs on other types of inscriptions. By contrast, nothing in the model is consistent with a separation of signs traditionally seen as peculiar to one subcorpus (CM1, CM2, or CM3) and diagnostic of distinct CM writing systems. A pattern of the latter kind is what might have suggested structural divergences (presence or absence of graphemes) rather than stylistic variation.

A priori, it was questionable to what extent the model's representation of the relationships between CM signs was consistent with the reality of the script (or scripts) present in the corpus. To validate it, we took as ground truth 32 CM signs whose status as graphemes is not questioned by either proponents or opponents of the tripartite division of CM as maintained in the work of Olivier (because they are well attested and show relatively little graphic variation). That 69% of these graphemes are arranged along a core-periphery vector that reflects the stylistic differentiation of signs inscribed on clay tablets (and is mathematically regular) strongly implies that the model's representation is well grounded.
Moreover, statistical tests show that it is highly improbable that this result is accidental.

The traditional classification that divides CM into three alleged scripts largely rests on the assumption that significant numbers of signs exist which are peculiar to, and innovations within, a given subgroup. The model reported here provides measurable evidence against this assumption, suggesting that signs supposedly diagnostic of different scripts have been incorrectly identified. For example, the paleographic vector supports the notion that certain pairs of sign shapes hitherto treated as separate entities (e.g. 088 and 089+090), and potentially peculiar to a given subcorpus of CM (CM1 in the case of 088, CM2 in the case of 089 and 090), are allographs. This is because these pairs of shapes are arranged in the same way as mere variants of consensual independent graphemes (e.g., 087 as attested on clay tablets and other media). Thus, if the latter have been accepted as single entities, we suggest that for reasons of coherence the former should also be considered as potential single graphemes. With these results, the tripartite division loses substantial empirical basis, and the hypothesis that it is invalid (as previously put forward based on paleographic and distributional evidence) gains strength.

The implications tied to Masson's 1974 starting assumption [1] that different languages were present in the corpus of CM inscriptions cannot be tested in the scope of this paper, nor can we infer anything new about the language(s) represented based on our results.

If all hypotheses of allography were correct, the failure of 31.25% of the vector tests would likely not be tied to issues of sampling: the sign shapes involved in the unsupported hypotheses were not, as a rule, less well attested than shapes whose merger was supported.
Thus, the imprecise matches in pairs such as 013–078 or 019–079 may have to do with variations in shape or distribution beyond the average variability found among the consensual signs that yielded the paleographic vector. That these sign shapes differ more is reflected in the existing lack of consensus on their classification.

Overall, the application of an unsupervised approach to CM has shown that neural networks can be fruitfully applied to ancient undeciphered scripts. In particular, the application of Sign2Vec provided an independent way to test hypotheses of sign and script classification. Nevertheless, it remains the case that any application of neural networks to undeciphered scripts must be developed ad hoc, as no general method is single-handedly effective in reaching solid and cogent results. In the future, other aspects of CM should be investigated, such as its relationship with the undeciphered syllabic Linear A script from Minoan Crete and the script historically developed from it on Cyprus, the 1st-millennium BCE Cypro-Greek syllabary.
List of CM inscriptions included in and excluded from the dataset.
NB: Reference edition refers to the source of the drawing used in the text and is not necessarily the first publication of the inscribed object. (PDF)
DeepClusterv2 hyperparameters.
(PDF)
Preliminary corrections to incoherent sign labels using non-parametric Mann-Whitney U test on the outputs of DeepClusterv2.
Since both tests for 086 and 112 (inscription ##211) were positive, we performed an additional test that compared them directly. We applied 112 as a correction even though this statistical test was inconclusive, since 112 was favored by the model. (PDF)
Preliminary corrections to incoherent sign labels using non-parametric Mann-Whitney U test on the outputs of Sign2Vecd.
Since both tests for 086 and 112 (inscription ##211) were positive, we performed an additional test that compared them. The same is true for 102 and 024 (inscription ##215). (PDF)
Validation of CM Other > Tablet paleographic vector for DeepClusterv2.
The correct targets are marked in bold. (PDF)
Validation of CM Other > Tablet paleographic vector for Sign2Vecd.
The correct targets are marked in bold. (PDF)
Test of hypothesized mergers of signs in complementary distribution.
The hypothesized correct targets are marked in bold. Some matches are impossible, because the starting Other sign shape and the target Tablet shape are known to coexist in a single inscription, where they are contrastive graphemes. Thus, 055 contrasts with both 051 and 095 on clay tablet ##215; 073, 074 and 095 are also contrastive on the same inscription. Impossible matches are discarded and struck through in the table. (PDF)
Test of hypothesized mergers of sign shapes attested only in CM1 with sign shapes attested in all subcorpora.
The hypothesized correct targets are marked in bold. Impossible matches are discarded and struck through in the table. (PDF)

1 Nov 2021
PONE-D-21-29213
Unsupervised Deep Learning Supports Reclassification of Bronze Age Cypriot Writing System
PLOS ONE
Dear Silvia Ferrara,

Thank you for submitting your manuscript for consideration at PLOS ONE. I now have in hand reports from two reviewers — one a specialist in computational linguistics and machine learning, the other a specialist of the Cypro-Minoan script with a balanced view of current debates. Both reviewers find merit in this highly original and innovative paper, but recommend that you make extensive changes in order for it to be suitable for publication. I am inviting you to submit a revision which will then be sent back to the same two reviewers. In case the reviews reveal strong disagreements over publication, or new issues, I may contact a third reviewer in addition to those two.

Reviewer 1 provides a great deal of technical advice worth following, especially concerning a relative lack of clarity and exhaustivity in explaining the methods. One concern that I share with them has to do with the validation of your model on cursive Hiragana, which yields lackluster results, calling into question the subsequent applicability of the model to the Cypro-Minoan data. This, in my view, is the number one issue raised by your paper. Please address it in depth, by explaining why the model may yield reliable conclusions in spite of its limited applicability to a better known and better documented script, and by qualifying your conclusions accordingly.

Like Reviewer 1, I noticed that your paper was not accompanied by open data and code, and that you declared some restrictions would apply to the sharing of the data. Given the highly technical and innovative nature of your study, I do think that giving Reviewer 1 access to your data and code is important to let them appreciate the robustness of the results.

Reviewer 2 confesses serious misgivings about your raw data, but also notes that your conclusion is plausible, and indeed can be defended on other grounds.
Please address their thorough and detailed comments; they mostly deal with the quality and completeness of the sources. I share their last remark on the fact that the three versions of Cypro-Minoan might be one and the same script, without necessarily encoding one and the same language.

Thank you again for allowing us to consider your manuscript.

Olivier Morin

P.S. Please bear in mind this standard caveat if and when you revise the paper: inviting a revision does not entail that the next version, or any subsequent version, will be accepted for publication. It is my policy to avoid a protracted editorial process that may in any case end in rejection. I am not pre-judging this particular case, but this is something I warn all authors of.
Please submit your revised manuscript by Dec 16 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.Please include the following items when submitting your revised manuscript:
A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at
https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and
https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2.
We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

Reviewers' comments:

Reviewer's Responses to Questions
Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: No

2. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: I Don't Know

3. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data — e.g. participant privacy or use of data from a third party — those must be specified.
Reviewer #1: No
Reviewer #2: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes

5. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics.
(Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The paper is clearly written and has a straightforward research question, which aims at investigating if three subgroups of the Cypro-Minoan script are the same language or not. The methods used in the paper are relevant for the research question and well described. The appreciation toward the paper is generally positive. However, some revisions could be made to clarify more details, which is why the reviewer suggests: Accept with (major?) revisions.General comments:- The title mentions 'deep learning' but within the text, the term changes between 'machine learning' (e.g., also the term ‘machine-based techniques on p2) and 'deep learning'. I suggest to synchronize which term to use when referring to the methods. IMHO, both methods are used in the paper, e.g., k-means is more likely to be affiliated to the machine learning category while neural network is more likely to be affiliated to the deep learning category.- I understand that the authors have concern to release detailed code and data upon acceptance, but in the current state it is hard to judge how robustly was the analysis conducted. For example, there is not much details about the detailed settings of the parameters and few information from the attached supplementary tables can be used to interpret the robustness of the analysis. The description of the method is well-written though, so the editors may decide if the code is needed for reviewing or not.- P3: Is there a table or a part of text giving the distribution of the three scripts in the used data? I could not find the information in the text or in the supplementary materials (sorry if I missed it). 
My follow-up question on the distribution would be: If there is a lack of balance in the data, does this lack of balance between the three scripts have an effect on the output of the experiments?- P4-P5: “We tested various supervised and unsupervised deep learning models … Our preliminary experiments found ….” If these additional experiments are mentioned, their procedure and output should probably be provided somewhere, either in the text or in the supplementary materials.- P5: “The model therefore tries to reconstruct the category to which the sign belongs, both from the context (preceding/following sign) and from the sign itself.” This is a cool idea! A quick question though: if the vector model considers the context of each character, isn’t it inherently biased toward a separation of the three scripts? Since only scripts from the same category will occur together?- P6: for sign2vec, did you consider different window size and type? E.g., three surrounding characters instead of one? Or only considering the words before/after rather than symmetric context? What is the dimension of the output vector? E.g., 50, 100, 500? Sorry if it is already written somewhere and I missed it.- P6 Table 1: I understand that the authors are considering the Rand index, which is easily affected by the size of different clusters between the predicted and the actual data. Maybe the adjusted Rand index could be considered? Plus, the definition of the metrics listed in Table 1 could be explained. If the journal was a CS or CL journal that might not be necessary, but since PLOS has a larger audience, I suggest to add some brief explanations about those metrics.- P6 “The scores were not so high because…” If I understand correctly the flow of this section, the authors wanted to validate the model on the Japanese writing system. 
If the results are not conclusive for Japanese, how do the authors show that the model is reliable?- P8 “To evaluate our model, we could only use as ground truth a set of 37 signs” I might be a bit confused here. If only those signs were used to evaluate the model, why include the other signs? This is probably already written somewhere in the text but I might have missed it.- P9 “This demonstrates that, while the vector is not 100% accurate, it is still a reliable method to test the hypothesis that some signs allegedly exclusive to CM1 and CM2 are in reality paleographic variants.” Would it be possible to compare the accuracy obtained here with a random/majority baseline to be able to assess how high or low is the accuracy?- P11 “These results strengthen the hypothesis that the division of CM in three sub-scripts is invalid, as previously put forward on the basis of paleographic and structural evidence. The implications are of paramount importance for the script,” AFAIU, since the results do not provide a clear-cut (e.g., the accuracy of the models is not very high), I suggest that the authors could be a bit more modest when mentioning the impact of the results. The limitations of the study should also be mentioned somewhere in the conclusion, e.g., the distribution of data? The accuracy of the models and its implication on the interpretability of the results, etc…Minor comments:- P1, abstract: “assess if it holds up against a multi-pronged, multi-disciplinary attack”, I suggest to avoid using too strong terms such as 'attack'. 
However, that might be a personal preference.

- P1: If space allows it, a map showing the location of the sites where the inscriptions have been found could be helpful for readers not familiar with the topic.

- P4: “Almost all neural systems treating images in some ways are based on CNN, thus they seemed most fitting to our ends.” While I agree with the authors, a few references here would be nice to support this statement.

- P5: “we applied some quantitative measures using the MNIST dataset confirming”. What are those measures again? I might have missed it.

- P5: I suggest avoiding phrases such as “as mentioned above” in the paper; if you do use them, please refer to the exact location/section in the text.

- P6: “Finally, we combine the DeepCluster-v2 loss, …” This is a bit abstract to follow, in my opinion. Maybe a toy example would help?

- Figure 6 and Figure 7 are hard to interpret visually. Maybe replacing the characters with points and using shapes/colors to distinguish the characters would make them easier to read?

- The format of the references should be made consistent. For example: in [2,15] the page numbers seem to be missing; in [10,11,30] the publisher is missing; if the place of publication is required, as in [39], it should be added to the other references too.

Reviewer #2: The central question posed in this paper is an old one, and there seems to me some potential to try to address it with new methods of the sort proposed. However, I have serious misgivings about the way in which this research has been conducted. I hope that my specific comments below will demonstrate the grounds for my misgivings, and the reasons why, on balance, I felt compelled to record that the data (or rather the way in which the data were analysed) do not appear to support the conclusions offered.
Unless the authors can address these issues seriously, I fear that the paper comes across as a superficial ‘confirmation’ of pre-existing theories that may otherwise be quite adequately argued via other methods.

P1: The summary of CM inscriptions overlooks at least one further inscription from Tiryns, on the handle of a clay vessel, published by Brent Davis – this work is even in the bibliography (no. 20)! There is also a new potmark from the same site which I believe will be published by the same author.

P3 L64: It seems misrepresentative to say that signs not attested in the tiny repertoire of CM3 were ‘allegedly discarded for linguistic reasons’ (L65). Masson and Olivier both seem to have accepted that there could be signs that simply have yet to be attested in the corpus from Ras Shamra. It is also worth noting that Olivier was openly sceptical of any linguistic distinction for CM3, making clear in Olivier 2007 that the designation is nothing more than geographical, and often using scare quotes for it (‘CM3’) – even though he maintained Masson’s categorisation.

P4 L106-8: The possibility that some inscriptions at the end of the chronological timespan for Cypro-Minoan might actually be written in the Cypro-Greek syllabary is raised here without any critical commentary on the implications of such an assumption. These documents could be excluded on chronological grounds, but the authors should ideally take some position on their epigraphic status (whether agnostic or not). There have been several recent discussions of the problem, including e.g.:

Duhoux, Y. (2012) ‘The most ancient Cypriot text written in Greek: The Opheltas’ spit’, Kadmos 51, 71-91.
Egetmeyer, M. (2013) ‘From the Cypro-Minoan to the Cypro-Greek syllabaries: linguistic remarks on the script reform’, in Steele, P.M. (ed.) Syllabic Writing on Cyprus and its Context, Cambridge, 107-131.
Egetmeyer, M.
(2017) ‘Script and language on Cyprus during the Geometric Period: An overview on the occasion of two new inscriptions’, in Steele, P.M. (ed.) Understanding Relations Between Scripts: The Aegean Writing Systems, Oxford, 108-201.
Steele, P. (2018) Writing and Society in Ancient Cyprus, Cambridge, second chapter.

P4 L111ff: Excluding signs on an essentially linguistic basis is methodologically worrying (the reasoning is repeated on P12). Whether or not there exist arguments in favour of linguistic differentiation, any study of sign shapes / palaeography should be blind to linguistic considerations – which surely is what the authors intend by pursuing the kinds of analysis on offer in this paper. There might be some sense in excluding all the material from Ras Shamra simply on the grounds that writing practices at that site could be somewhat different from those on Cyprus – though, on the other hand, this might be a good reason for including them. But it must be all or nothing, and the methods employed here cannot seriously investigate the possibility or otherwise that CM3 should be considered as a separate entity from the rest of the CM corpus if tablets #212 and #215 are excluded (and, along with them, six sign shapes thus not represented among the data used for this study).

P4 L124ff: Given the aim to achieve a more neutral analysis of palaeographic variation in Cypro-Minoan, it is a shame that the authors used published drawings, presumably largely from Olivier, where some examples could be criticised as to their representation of features. Those drawings also tend to flatten some kinds of variation owing to palaeographic factors, such as the comparative width of strokes*. Perhaps it is impossible for the present study, but the results of ongoing scanning projects could be particularly beneficial to this kind of analysis because of their more accurate measurement of sign features.
There is nevertheless a risk here that the results of the analysis will be affected by pre-existing assumptions and biases on the part of the person who drew the signs, given that any drawing is already in itself interpretive.

*Considerations of this kind indeed seem to have affected the analysis, given the divergent clustering of signs on clay documents and signs on other supports, as noted by the authors at P8-9.

P9 L316-8: “This property supports the argument that CM2 is not a script distinct from CM1, but rather a form of the same writing system that differs mainly due to the use of a different writing medium as well as scribal style (smaller and more angular signs).” This seems to me to be quite a bold claim (not that CM2 is not a separate script in its own right, which is surely at some level true, but that the present investigation can be used as evidence for such a position). I am not convinced that the results can only be read in this way. For one thing, it may be that the quite consistent way in which the CM2 signs were drawn (presumably by Olivier?) predisposed them to a differential analysis by the neural network – as I mentioned above, this is a serious risk to the results of the study and needs to be considered carefully. It would also be helpful to know to what extent differences of scale have been factored in. The signs of the CM2 tablets are far smaller than signs on many other supports, and this makes a difference a) to what it was possible for the author to render, and b) to the accuracy of any modern drawing of the signs.
Published editions tend to flatten the degree of difference in size between signs in different inscriptions, but this could indeed be a significant factor in their recognisability (whether to ancient humans or modern computational methods).

P10-11: In the section ‘Application of the Vector’, it is clear that the authors seem to have drawn conclusions that supported pre-held beliefs, but very little information is given as to how the conclusions are supported. Accuracy levels such as 6/10, 7/10, 2/3, 3/3 need to be explained in some detail – what exactly is denoted by these numbers, and what does ‘accuracy’ mean here? Have the results been tested for statistical significance?

P12 L449-451: “If the inscriptions in our dataset (mainly CM1 and CM2) represent the same script, then the likelihood increases that this single script recorded the same language.” This is an extremely bold and methodologically unsound claim. There are countless examples across the world and across different time periods of different languages being written in a single script / writing system. The language-related considerations offered here do not seem appropriate to the purposes of the paper.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Philippa M. Steele

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments".
If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

12 May 2022

We thank the reviewers for their invaluable feedback. Thanks to their comments, the revised manuscript has improved both in the experimental settings and the content in numerous ways. Due to the sheer amount of feedback from the editor and reviewers, we provide a detailed response to each question in a separate file included in the revised submission.

Submitted filename: Response to Reviewers.pdf

24 May 2022

Unsupervised Deep Learning Supports Reclassification of Bronze Age Cypriot Writing System

PONE-D-21-29213R1

Dear Dr. Ferrara,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date.
If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Olivier Morin
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Dear author,

I am happy to tell you that I am accepting your paper, which both reviewers found to be much improved as a result of this round of revisions. This is an extremely exciting and well designed piece of research which I am certain will move debates forward in several fields. Thank you for considering PLoS.

OM

Reviewers' comments:

Reviewer's Responses to Questions
Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous.
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I thank the authors for making the revisions and providing the data and code. I think that the authors did a really good job updating the paper. The provided code also has clear documentation. I suggest that the paper can be published. If the editors think that the following minor comments are relevant, they can be transmitted to the authors. If not, the editors may choose to ignore these comments.

Minor comments:

p4: "a recent publication has wanted to see this shape in a new inscription from Erimi-Kafkalla, so we had to count it for our purposes." -> Which publication? Please add the reference.

Fig 3 -> The circles are a bit hard to read, and the two plots take quite a lot of space while conveying partially overlapping information. Could they be merged? E.g., could the numbers be printed on the circles on the map?

Fig 6 and Fig 7 -> I suggest putting the link to the live 3D scatter plot in the captions, so that it is easier for readers to find.

p7: "We trained 20 different models initialized with different random parameters. By applying this procedure, the model becomes less susceptible to the random initialization of its parameters." Why 20 random initializations? Why not, e.g., 50 or 100? Please explain briefly how 20 random initializations are expected to cover the space of random parameters. I totally understand that there are technical limitations too, as running the code takes time.
So, it would not be a problem to mention the technical limitation of time, but at least how the number 20 was chosen should be made clear.

p7: "The 2560-dimensional representation obtained by applying the proposed neural model, namely Sign2Vecd, and the baseline, DeepClusterv2, was the starting point for any further processing and paleographic consideration." The use of the term 'baseline' for DeepClusterv2 is not explained before and suddenly shows up here. I suggest adding a sentence or two (here or earlier in the text) mentioning why DeepClusterv2 is the baseline. It can be understood from the context, but it is always better to make it clear for readers.

Reviewer #2: I would like to thank the authors for their careful consideration of comments from both reviewers. I felt that all my comments had been addressed satisfactorily, and I think that the article now reads very well. I am happy to recommend it for publication without further modifications.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Philippa M. Steele

22 Jun 2022

PONE-D-21-29213R1

Unsupervised Deep Learning Supports Reclassification of Bronze Age Cypriot Writing System

Dear Dr. Ferrara:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department. If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours.
Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org. If we can help with anything else, please email us at plosone@plos.org. Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff
on behalf of
Dr. Olivier Morin
Academic Editor
PLOS ONE