Compatibility and Conjugacy on Partial Arrays.

Abstract

Research in combinatorics on words goes back a century. Berstel and Boasson introduced partial words in the context of gene comparison. The alignment of two genes can be viewed as a construction of two partial words that are said to be compatible. In this paper, we examine to what extent the fundamental properties of partial words, such as compatibility and conjugacy, remain true for partial arrays. We study a relaxation of the compatibility relation called k-compatibility, as well as k-conjugacy of partial arrays.

Year: 2016 PMID： 27774111 PMCID： PMC5059777 DOI： 10.1155/2016/5010316

Source DB: PubMed Journal: Comput Math Methods Med ISSN： 1748-670X Impact factor: 2.238

1. Introduction

The genetic information in almost all organisms is carried by molecules of DNA. A DNA molecule is a long but finite string of nucleotides of four possible types: a (for adenine), c (for cytosine), g (for guanine), and t (for thymine). Recent work in combinatorics on words has been stimulated by the study of biological sequences such as DNA and proteins, which play an important role in molecular biology [1-3]. Sequence comparison is one of the primitive operations in molecular biology. To align two sequences is to place one sequence above the other [2, 4] so that the correspondence between similar letters or substrings of the sequences becomes clear.

Partial words appear in the comparison of genes. Indeed, the alignment of two strings can be viewed as a construction of two partial words that are compatible. The compatibility relation [5] considers two arrays with only a few isolated insertions (or deletions). In some cases, it allows the insertion of letters that relate to errors or mismatches. Such errors arise, for example, when the same gene is sequenced by two different labs that want to differentiate the gene expression, or when the same long sequence is typed twice into a computer.

A partial array A of size (m, n) over a finite alphabet Σ is a partial function $A : \mathbb{Z}_+^2 \to \Sigma$, where $\mathbb{Z}_+$ is the set of all positive integers. In this paper, we extend the combinatorial properties of partial words to partial arrays. We also study a relation called k-compatibility, in which a number of insertions and deletions are allowed as well as k mismatches. The conjugacy result [6], which was proved for partial words, is extended to partial arrays, and k-conjugacy of partial arrays is discussed.

2. Preliminaries on Partial Words

In this section, we give a brief overview of partial words [7].

Definition 1 .

A partial word u of length n over a nonempty finite alphabet A is a partial map u : {1, 2, …, n} → A. If 1 ≤ i ≤ n, then i belongs to the domain of u (denoted by Domain(u) or D(u)) if u(i) is defined; otherwise, i belongs to the set of holes of u (denoted by Hole(u) or H(u)). A word [8-10] is a partial word over A with an empty set of holes.

Definition 2 .

Let u be a partial word of length n over A. The companion of u (denoted by u◊) is the map u◊ : {1, 2, …, n} → A ∪ {◊} defined by

$$u_\Diamond(i) = \begin{cases} u(i) & \text{if } i \in \mathrm{Domain}(u), \\ \Diamond & \text{otherwise.} \end{cases}$$

The symbol ◊ is viewed as a “do not know” symbol. For example, the word u◊ = ba◊ab◊ is the companion of the partial word u of length 6 with D(u) = {1, 2, 4, 5} and H(u) = {3, 6}.

Let u and v be two partial words of length n. The partial word u is said to be contained in the partial word v (denoted by u ⊂ v) if Domain(u) ⊂ Domain(v) and u(i) = v(i) for all i ∈ Domain(u). The partial words u and v are called compatible (denoted by u↑v) if there exists a partial word w such that u ⊂ w and v ⊂ w, in which case u∨v denotes the partial word defined by u ⊂ u∨v, v ⊂ u∨v, and Domain(u∨v) = Domain(u) ∪ Domain(v). As an example, the partial words with companions u◊ = aba◊◊a and v◊ = abab◊a are compatible, and (u∨v)◊ = abab◊a.

The following rules are useful for computing with partial words:

Multiplication: If u↑v and x↑y, then ux↑vy.
Simplification: If ux↑vy and |u| = |v|, then u↑v and x↑y.
Weakening: If u↑v and w ⊂ u, then w↑v.
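The definitions of domain, containment, compatibility, and u∨v can be sketched in a few lines of Python. This is only an illustrative sketch: the representation of a partial word by its companion string and the function names `domain`, `contained`, `compatible`, and `join` are assumptions made here, not notation from the paper.

```python
HOLE = "◊"  # the "do not know" symbol of the companion word


def domain(u):
    """Domain(u): the 1-based positions where the partial word is defined."""
    return {i for i, ch in enumerate(u, start=1) if ch != HOLE}


def contained(u, v):
    """u ⊂ v: equal length, and v agrees with u wherever u is defined."""
    return len(u) == len(v) and all(a == HOLE or a == b for a, b in zip(u, v))


def compatible(u, v):
    """u ↑ v: equal length, and no position where both are defined but differ."""
    return len(u) == len(v) and all(
        a == HOLE or b == HOLE or a == b for a, b in zip(u, v)
    )


def join(u, v):
    """u ∨ v for compatible u and v: defined exactly on Domain(u) ∪ Domain(v)."""
    if not compatible(u, v):
        raise ValueError("u and v are not compatible")
    return "".join(b if a == HOLE else a for a, b in zip(u, v))


# The two examples given in the text above.
u, v = "aba◊◊a", "abab◊a"
print(sorted(domain("ba◊ab◊")))                      # [1, 2, 4, 5]
print(contained(u, v), compatible(u, v), join(u, v))  # True True abab◊a
```

Since a witness w with u ⊂ w and v ⊂ w exists exactly when u and v never disagree on a common defined position, the sketch tests that condition directly instead of searching for w.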

Lemma 3 .

Let u, v, x, y be partial words such that ux↑vy. If |u| ≥ |v|, then there exist partial words w, z such that u = wz, v↑w, and y↑zx. If |u| ≤ |v|, then there exist partial words w, z such that v = wz, u↑w, and x↑zy.

Definition 4 .

Two partial words u and v are called conjugate, if there exist partial words x and y such that u ⊂ xy and v ⊂ yx.

Definition 5 .

Two partial words u and v are called k-conjugate if there exist nonnegative integers $k_1$, $k_2$ whose sum is k and partial words x and y such that $u \subset_{k_1} xy$ and $v \subset_{k_2} yx$.
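Definition 4 can be tested by brute force. If |x| = p, then u ⊂ xy and v ⊂ yx hold for some x and y exactly when the first p letters of u are compatible with the last p letters of v and the remaining letters of u are compatible with the remaining letters of v. The sketch below tries every p. The companion-string representation, the function names, and the sample words are assumptions chosen for illustration.

```python
HOLE = "◊"


def compatible(u, v):
    """u ↑ v: equal length and no clash on commonly defined positions."""
    return len(u) == len(v) and all(
        a == HOLE or b == HOLE or a == b for a, b in zip(u, v)
    )


def join(u, v):
    """u ∨ v, assuming u and v are compatible."""
    return "".join(b if a == HOLE else a for a, b in zip(u, v))


def conjugacy_witness(u, v):
    """Return (x, y) with u ⊂ xy and v ⊂ yx, or None if no split works."""
    n = len(u)
    if len(v) != n:
        return None
    for p in range(n + 1):            # p = |x|
        u1, u2 = u[:p], u[p:]         # u1 lies under x, u2 under y
        v1, v2 = v[:n - p], v[n - p:]  # v1 lies under y, v2 under x
        if compatible(u1, v2) and compatible(u2, v1):
            return join(u1, v2), join(u2, v1)
    return None


print(conjugacy_witness("a◊b", "ba◊"))  # ('a◊', 'b')
print(conjugacy_witness("aa", "ab"))    # None
```

The returned pair is the smallest witness for the first split that works; any x and y containing it serve equally well.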

3. Preliminaries on Partial Arrays

This section reviews the basic concepts of partial arrays [11].

Definition 6 .

A partial array A of size (m, n) over Σ, a nonempty set or an alphabet, is a partial function $A : \mathbb{Z}_+^2 \to \Sigma$, where $\mathbb{Z}_+$ is the set of all positive integers. For 1 ≤ i ≤ m and 1 ≤ j ≤ n, if A(i, j) is defined, then we say that (i, j) belongs to the domain of A (denoted by (i, j) ∈ D(A)). Otherwise, we say that (i, j) belongs to the set of holes of A (denoted by (i, j) ∈ H(A)). An array [5] over Σ is a partial array over Σ with an empty set of holes.

Definition 7 .

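If A is a partial array of size (m, n) over Σ, then the companion of A (denoted by A◊) is the total function $A_\Diamond : \mathbb{Z}_+^2 \to \Sigma \cup \{\Diamond\}$ defined by

$$A_\Diamond(i, j) = \begin{cases} A(i, j) & \text{if } (i, j) \in D(A), \\ \Diamond & \text{otherwise,} \end{cases}$$

where ◊ ∉ Σ. The bijectivity of the map A → A◊ allows the catenation of two partial arrays to be defined in a trivial way.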

Example 8 .

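The companion A◊ of a partial array A of size (3, 3) with D(A) = {(1,1), (1,2), (1,3), (2,2), (2,3), (3,1), (3,3)} and H(A) = {(2,1), (3,2)} carries the symbol ◊ exactly at positions (2,1) and (3,2).

Let A and B be two partial arrays. By column catenation, we mean the array obtained by placing B to the right of A, which requires A and B to have the same number of rows. By row catenation, we mean the array obtained by placing B below A, which requires A and B to have the same number of columns.

If A and B are two partial arrays of equal size, then A is contained in B, denoted by A ⊂ B, if D(A) ⊆ D(B) and A(i, j) = B(i, j) for all (i, j) ∈ D(A).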

Definition 9 .

Partial arrays A and B are said to be compatible, denoted by A↑B, if there exists a partial array C such that A ⊂ C and B ⊂ C.
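The notions of this section can be sketched in Python as follows. The sketch assumes a partial array is stored as a list of rows with `None` marking a hole; this representation, the function names, and the letters placed in the sample arrays are illustrative assumptions. Only the hole positions (2,1) and (3,2) of A follow Example 8.

```python
HOLE = None  # a hole of the partial array; printed as ◊ in the companion


def domain(A):
    """D(A): the 1-based positions (i, j) where A is defined."""
    return {(i, j) for i, row in enumerate(A, 1)
            for j, x in enumerate(row, 1) if x is not HOLE}


def companion(A):
    """The companion of A: every hole is replaced by the symbol ◊."""
    return [["◊" if x is HOLE else x for x in row] for row in A]


def col_cat(A, B):
    """Column catenation: B to the right of A (equal number of rows)."""
    if len(A) != len(B):
        raise ValueError("column catenation needs equal row counts")
    return [ra + rb for ra, rb in zip(A, B)]


def row_cat(A, B):
    """Row catenation: B below A (equal number of columns)."""
    if len(A[0]) != len(B[0]):
        raise ValueError("row catenation needs equal column counts")
    return [list(r) for r in A + B]


def same_size(A, B):
    return len(A) == len(B) and all(len(ra) == len(rb) for ra, rb in zip(A, B))


def contained(A, B):
    """A ⊂ B: B agrees with A on every position of D(A)."""
    return same_size(A, B) and all(
        x is HOLE or x == y for ra, rb in zip(A, B) for x, y in zip(ra, rb))


def compatible(A, B):
    """A ↑ B: no position where both arrays are defined and differ."""
    return same_size(A, B) and all(
        x is HOLE or y is HOLE or x == y
        for ra, rb in zip(A, B) for x, y in zip(ra, rb))


# Hypothetical letters; holes of A at (2,1) and (3,2) as in Example 8.
A = [["a", "b", "a"], [HOLE, "b", "a"], ["b", HOLE, "a"]]
B = [["a", HOLE, "a"], ["b", "b", HOLE], ["b", "a", "a"]]
print(companion(A)[1])                     # ['◊', 'b', 'a']
print(compatible(A, B), contained(A, B))   # True False
```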

4. Compatibility and k-Compatibility of Partial Arrays

4.1. Compatibility

The rules stated for partial words also hold for partial arrays. Let A, B, X, Y be partial arrays.

Multiplication: If A↑B and X↑Y, then AX↑BY, either by column catenation or by row catenation.
Simplification: If AX↑BY, either by column catenation or by row catenation, with A and B of the same size, then A↑B and X↑Y.
Weakening: If A↑B and C ⊂ A, then C↑B.

The version of Lemma 3 for partial arrays can be stated as follows.

Lemma 10 .

Let A, B, X, Y be partial arrays such that AX↑BY, either by column catenation or by row catenation. If the order of A ≥ the order of B, then there exist partial arrays C, Z such that A = CZ, B↑C, and Y↑ZX. If the order of A ≤ the order of B, then there exist partial arrays C, Z such that B = CZ, A↑C, and X↑ZY.

4.2. k-Compatibility

Definition 11 .

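If A and B are two partial arrays of the same size and k is a nonnegative integer, then A is said to be k-contained in B, denoted by $A \subset_k B$, if D(A) ⊂ D(B) and there exists a subset E of D(A) of cardinality k, called the error set, such that

$$A(i, j) = B(i, j) \ \ \forall (i, j) \in D(A) \setminus E; \qquad A(i, j) \ne B(i, j) \ \ \forall (i, j) \in E.$$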

Definition 12 .

If A and B are two partial arrays of the same order and k is a nonnegative integer, then A and B are called k-compatible, denoted by $A \uparrow_k B$, if there exist a partial array Z and nonnegative integers $k_1$, $k_2$ such that

$A \subset_{k_1} Z$ with error set $E_1$;
$B \subset_{k_2} Z$ with error set $E_2$;
$E_1 \cap E_2 = \emptyset$;
$k_1 + k_2 = k$.

Example 13 .

If A = […] and B = […], then there exists a partial array Z = […] with $E_1$ = {(1,1), (1,2)}, $E_2$ = {(1,3)}, and $k_1 = 2$, $k_2 = 1$, so that k = 3; that is, $A \uparrow_3 B$.

Equivalently, A and B are k-compatible if there exists a subset E of D(A) ∩ D(B) of cardinality k, called the error set, such that

$$A(i, j) = B(i, j) \ \ \forall (i, j) \in (D(A) \cap D(B)) \setminus E; \qquad A(i, j) \ne B(i, j) \ \ \forall (i, j) \in E.$$

If A and B are arrays, then $A \uparrow_0 B$ means A = B. We sometimes use the notation $A \uparrow_{\le k} B$ if the set E has cardinality ≤ k.

Multiplication. If $A \uparrow_{k_1} B$ and $X \uparrow_{k_2} Y$, then $AX \uparrow_{k_1 + k_2} BY$, where A, B, X, and Y are partial arrays and $k_1$, $k_2$ are nonnegative integers, using column catenation.

Example 14 .

A = […], B = […], X = […], Y = […]. $AX \uparrow_{6+7} BY$.

Simplification. If $AX \uparrow_k BY$ and the order of A is equal to the order of B, then $A \uparrow_{k_1} B$ and $X \uparrow_{k_2} Y$ for some $k_1$, $k_2$ satisfying $k_1 + k_2 = k$.

Example 15 .

A = […], B = […], X = […], Y = […]. $AX \uparrow_8 BY$ ⇒ $A \uparrow_5 B$ and $X \uparrow_3 Y$, with 5 + 3 = 8.

Weakening. If $A \uparrow_k B$ and Z ⊂ A, then $Z \uparrow_{\le k} B$.

Example 16 .

A = […], B = […], Z = […]. $Z \uparrow_{\le 7} B$ with k = 7.
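The equivalent formulation of k-compatibility (an error set E inside D(A) ∩ D(B)) and the multiplication and weakening rules can be checked on small inputs with the sketch below. The arrays are hypothetical and chosen only for illustration, as are the `None`-for-hole representation and the function names.

```python
HOLE = None  # a hole of the partial array


def error_set(A, B):
    """Positions of D(A) ∩ D(B) where A and B carry different letters."""
    return {(i, j)
            for i, (ra, rb) in enumerate(zip(A, B), 1)
            for j, (x, y) in enumerate(zip(ra, rb), 1)
            if x is not HOLE and y is not HOLE and x != y}


def col_cat(A, B):
    """Column catenation: B to the right of A (equal number of rows)."""
    return [ra + rb for ra, rb in zip(A, B)]


# Hypothetical partial arrays of order 2 x 2 and 2 x 1.
A = [["a", HOLE], ["b", "a"]]
B = [["b", "b"], ["b", HOLE]]
X = [["a"], ["a"]]
Y = [["b"], ["a"]]

k1 = len(error_set(A, B))                          # A is k1-compatible with B
k2 = len(error_set(X, Y))                          # X is k2-compatible with Y
k = len(error_set(col_cat(A, X), col_cat(B, Y)))   # errors of AX against BY
print(k1, k2, k)                                   # 1 1 2

# Weakening: Z ⊂ A has fewer defined positions, so it has at most k1 errors.
Z = [[HOLE, HOLE], ["b", "a"]]
print(len(error_set(Z, B)) <= k1)                  # True
```

Under column catenation the error set of AX against BY is the error set of A against B together with that of X against Y shifted right by the number of columns of A, which is why the two counts add.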

Theorem 17 .

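Let A and B be partial arrays of orders a × b and a × c, respectively, with powers taken by column catenation. If there exist an array Z of order a × d and integers $k_1$, $k_2$, m, and n such that $A \subset_{k_1} Z^m$ with error set $E_1$ and $B \subset_{k_2} Z^n$ with error set $E_2$, then there exist integers p and q (one may take p = n and q = m) such that $A^p \uparrow_{\le k} B^q$ with

$$k = \|(D(A)(a, |b|, n) \cap E_2(a, |c|, m)) \cup (D(B)(a, |c|, m) \cap E_1(a, |b|, n))\|.$$

Moreover, if $E_1(a, |b|, n) \cap E_2(a, |c|, m) = \emptyset$, then $A^n \uparrow_k B^m$.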

Proof

Let A and B be partial arrays of orders a × b and a × c, respectively. Let an array Z of order a × d exist such that, by using column catenation, $A \subset_{k_1} Z^m$ and $B \subset_{k_2} Z^n$ for some integers $k_1$, $k_2$, m, and n. Let $E_1$ be the error set of cardinality $k_1$ such that $A(i, j) = Z^m(i, j)$ for all (i, j) ∈ D(A)∖$E_1$ and $A(i, j) \ne Z^m(i, j)$ for all (i, j) ∈ $E_1$, and let $E_2$ be the error set of cardinality $k_2$ such that $B(i, j) = Z^n(i, j)$ for all (i, j) ∈ D(B)∖$E_2$ and $B(i, j) \ne Z^n(i, j)$ for all (i, j) ∈ $E_2$. We have $A^n \subset_{nk_1} Z^{mn}$ with error set $E_1(a, |b|, n)$ of cardinality $nk_1$ and $B^m \subset_{mk_2} Z^{mn}$ with error set $E_2(a, |c|, m)$ of cardinality $mk_2$. Let (1, 1) ≤ (i, j) ≤ (a, dmn) and $Z^{mn}(i, j) = a$ for some letter a. There are 4 possibilities.

Case 1. If (i, j) ∉ $E_1(a, |b|, n)$ and (i, j) ∉ $E_2(a, |c|, m)$, then $A^n(i, j)$ ∈ {◊, a} and $B^m(i, j)$ ∈ {◊, a}. This gives no error when we align $A^n$ with $B^m$.

Case 2. If (i, j) ∉ $E_1(a, |b|, n)$ and (i, j) ∈ $E_2(a, |c|, m)$, then $A^n(i, j)$ ∈ {◊, a} and $B^m(i, j) = b$ for some b ≠ a. This gives an error in the alignment of $A^n$ with $B^m$ only when $A^n(i, j) = a$, that is, when (i, j) ∈ D(A)(a, |b|, n).

Case 3. If (i, j) ∈ $E_1(a, |b|, n)$ and (i, j) ∉ $E_2(a, |c|, m)$, then $B^m(i, j)$ ∈ {◊, a} and $A^n(i, j) = b$ for some b ≠ a. This gives an error in the alignment of $A^n$ with $B^m$ only when $B^m(i, j) = a$, that is, when (i, j) ∈ D(B)(a, |c|, m).

Case 4. If (i, j) ∈ $E_1(a, |b|, n)$ and (i, j) ∈ $E_2(a, |c|, m)$, then $A^n(i, j) = b$ for some b ≠ a and $B^m(i, j) = c$ for some c ≠ a. This gives an error in the alignment of $A^n$ with $B^m$ only when b ≠ c.

Therefore, if $E_1(a, |b|, n) \cap E_2(a, |c|, m) = \emptyset$, then $A^n \uparrow_k B^m$ with k = ‖(D(A)(a, |b|, n) ∩ $E_2(a, |c|, m)$) ∪ (D(B)(a, |c|, m) ∩ $E_1(a, |b|, n)$)‖, and if $E_1(a, |b|, n) \cap E_2(a, |c|, m) \ne \emptyset$, then $A^n \uparrow_{\le k} B^m$.

Example 18 .

A = […], B = […], Z = […]. We have $A \subset_4 Z^3$ with error set $E_1$ = {(1,2), (2,2), (2,3), (3,3)} and $B \subset_2 Z^2$ with error set $E_2$ = {(1,2), (2,2)}. k = 6.

D(A) = {(1,1), (1,2), (2,1), (2,2), (2,3), (3,2), (3,3)}.
D(B) = {(1,1), (1,2), (2,1), (2,2), (3,2)}.
D(A)(a, |b|, 2) = {(1,1), (1,2), (2,1), (2,2), (2,3), (3,2), (3,3), (1,4), (1,5), (2,4), (2,5), (2,6), (3,5), (3,6)}.
D(B)(a, |c|, 3) = {(1,1), (1,2), (2,1), (2,2), (3,2), (1,3), (1,4), (2,3), (2,4), (3,4), (1,5), (1,6), (2,5), (2,6), (3,6)}.
$E_1$(a, |b|, 2) = {(1,2), (2,2), (2,3), (3,3), (1,5), (2,5), (2,6), (3,6)}.
$E_2$(a, |c|, 3) = {(1,2), (2,2), (1,4), (2,4), (1,6), (2,6)}.
$E_1$(a, |b|, 2) ∩ $E_2$(a, |c|, 3) ≠ ∅.

k = ‖(D(A)(a, |b|, 2) ∩ $E_2$(a, |c|, 3)) ∪ (D(B)(a, |c|, 3) ∩ $E_1$(a, |b|, 2))‖
= ‖({(1,1), (1,2), (2,1), (2,2), (2,3), (3,2), (3,3), (1,4), (1,5), (2,4), (2,5), (2,6), (3,5), (3,6)} ∩ {(1,2), (2,2), (1,4), (2,4), (1,6), (2,6)}) ∪ ({(1,1), (1,2), (2,1), (2,2), (3,2), (1,3), (1,4), (2,3), (2,4), (3,4), (1,5), (1,6), (2,5), (2,6), (3,6)} ∩ {(1,2), (2,2), (2,3), (3,3), (1,5), (2,5), (2,6), (3,6)})‖
= ‖{(1,2), (1,4), (1,5), (2,2), (2,3), (2,4), (2,5), (2,6), (3,6)}‖.

k = 9: $A^2 \uparrow_{\le 9} B^3$ ($A^2 \uparrow_6 B^3$).
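The set computations of Example 18 can be reproduced mechanically. The sketch below assumes that S(a, |w|, t) denotes the union of t copies of S, the r-th copy shifted right by r·w columns (r = 0, …, t − 1), which is the reading suggested by the sets listed in the example. The helper name `replicate` is hypothetical, and the column counts b = 3 and c = 2 are read off from those shifts.

```python
def replicate(S, width, times):
    """S(a, |width|, times): copies of S shifted right by 0, width, 2*width, ..."""
    return {(i, j + t * width) for t in range(times) for (i, j) in S}


# Data copied from Example 18.
D_A = {(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 2), (3, 3)}
D_B = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 2)}
E1 = {(1, 2), (2, 2), (2, 3), (3, 3)}
E2 = {(1, 2), (2, 2)}
b, c = 3, 2   # numbers of columns of A and B
m, n = 3, 2   # A is 4-contained in Z^3, B is 2-contained in Z^2

DA_n, E1_n = replicate(D_A, b, n), replicate(E1, b, n)
DB_m, E2_m = replicate(D_B, c, m), replicate(E2, c, m)

overlap = E1_n & E2_m                       # nonempty, so only a bound follows
bound = (DA_n & E2_m) | (DB_m & E1_n)
print(sorted(overlap))   # [(1, 2), (2, 2), (2, 6)]
print(len(bound))        # 9
```

Because the two replicated error sets intersect, Theorem 17 yields only the upper bound 9 on the number of mismatches between A² and B³.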

5. Conjugacy and k-Conjugacy of Partial Arrays

5.1. Conjugacy

Definition 19 .

Two partial arrays A and B of the same order are called conjugate if there exist partial arrays X and Y such that A ⊂ XY and B ⊂ YX, using row catenation or column catenation. 0-conjugacy on partial arrays of the same order is trivially reflexive and symmetric but not transitive.

Example 20 .

A = […], B = […], C = […]. By taking X = […] and Y = […], we get A ⊂ XY and B ⊂ YX, showing that A and B are conjugate. Similarly, by taking X′ = […] and Y′ = […], we get B ⊂ X′Y′ and C ⊂ Y′X′, showing that B and C are conjugate. But A and C are not conjugate. That is, the conjugacy relation is not transitive.

Proposition 21 .

Let A and B be nonempty partial arrays of the same size. If A and B are conjugate, then there exists a partial array C such that AC↑CB, either by column catenation or by row catenation.

Proof

Let A and B be nonempty partial arrays of the same order. Suppose A and B are conjugate, and let X, Y be partial arrays such that A ⊂ XY and B ⊂ YX, either by column catenation or by row catenation. Then AX ⊂ XYX and XB ⊂ XYX. So, for C = X, we have AC↑CB.

5.2. k-Conjugacy

Definition 22 .

Two partial arrays A and B of the same order are k-conjugate if there exist nonnegative integers $k_1$, $k_2$ whose sum is k and partial arrays X and Y such that $A \subset_{k_1} XY$ and $B \subset_{k_2} YX$ with row or column catenation.

Theorem 23 .

Let A and B be nonempty partial arrays of the same order. If A and B are k-conjugate, then there exists a partial array Z such that $AZ \uparrow_{\le k} ZB$.

Proof

Let A, B be two partial arrays of the same order. Suppose that A and B are k-conjugate. Then, by definition, there exist nonnegative integers $k_1$, $k_2$ whose sum is k and partial arrays X and Y such that $A \subset_{k_1} XY$ with error set $E_1$ and $B \subset_{k_2} YX$ with error set $E_2$, using row catenation or column catenation accordingly. Then $AX \subset_{k_1} XYX$ with error set $E_1$ and $XB \subset_{k_2} XYX$ with error set $E_2' = \{(i + \text{number of rows of } X,\ j) \mid (i, j) \in E_2\}$ or $E_2' = \{(i,\ j + \text{number of columns of } X) \mid (i, j) \in E_2\}$, according to row or column catenation. So, for Z = X, we have $AZ \uparrow_{\le k} ZB$.
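The construction in the proof (take Z = X and shift the second error set by the width of X) can be traced on a small case. The arrays below are hypothetical and use column catenation. The `None`-for-hole representation and the function names are assumptions made for this sketch.

```python
HOLE = None  # a hole of the partial array


def col_cat(A, B):
    """Column catenation: B to the right of A (equal number of rows)."""
    return [ra + rb for ra, rb in zip(A, B)]


def error_set(A, B):
    """Positions of D(A) ∩ D(B) where A and B carry different letters."""
    return {(i, j)
            for i, (ra, rb) in enumerate(zip(A, B), 1)
            for j, (x, y) in enumerate(zip(ra, rb), 1)
            if x is not HOLE and y is not HOLE and x != y}


# Hypothetical data: X is 2 x 2, Y is 2 x 1, A and B are 2 x 3.
X = [["a", "b"], ["b", "a"]]
Y = [["a"], ["b"]]
A = [["b", HOLE, "a"], ["b", "a", "a"]]
B = [["a", "b", "b"], [HOLE, "b", "a"]]

E1 = error_set(A, col_cat(X, Y))   # errors of A against XY
E2 = error_set(B, col_cat(Y, X))   # errors of B against YX
k = len(E1) + len(E2)              # A and B are k-conjugate with k = k1 + k2

shift = len(X[0])                  # number of columns of X
E2_shifted = {(i, j + shift) for (i, j) in E2}

Z = X
mismatches = error_set(col_cat(A, Z), col_cat(Z, B))
print(sorted(E1), sorted(E2_shifted))   # [(1, 1), (2, 3)] [(1, 4)]
print(sorted(mismatches), k)            # [(1, 1), (1, 4)] 3
```

Here position (2, 3) lies in $E_1$ but falls on a hole of ZB, so AZ and ZB disagree in only 2 of the at most k = 3 positions, which illustrates why the theorem states ≤ k and not equality.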

Example 24 .

Given A = […], B = […]. There exist X = […] and Y = […] with $A \subset_3 XY$ and $B \subset_2 YX$, so k = $k_1 + k_2$ = 5. There exists Z = […] such that $AZ \uparrow_{\le 5} ZB$.

6. Conclusion

Motivated by the compatibility and conjugacy properties of partial words, we define the conjugacy of partial arrays and derive the compatibility properties of partial arrays. By relaxing the compatibility relation, we discuss k-compatibility and k-conjugacy of partial arrays. We prove that, given partial arrays A, B and integers p, q satisfying $|A^p| = |B^q|$, we can find k such that $A^p \uparrow_k B^q$. Also, for k-conjugate partial arrays A and B, there exists a partial array Z such that $AZ \uparrow_{\le k} ZB$.


1. Special factors in biological strings.

Authors: A Colosimo; A De Luca
Journal: J Theor Biol Date: 2000-05-07 Impact factor: 2.691
