Sparse complete sets for coNP: Solution of the P versus NP problem

P versus NP is considered one of the most important open problems in computer science. It asks the following question: is P equal to NP? A precise statement of the P versus NP problem was introduced independently in 1971 by Stephen Cook and Leonid Levin. Since then, all efforts to find a proof for this problem have failed. Another major complexity class is coNP. Whether NP = coNP is another fundamental question that is as important as it is unresolved. In 1979, Fortune showed that if any sparse language is coNP-complete, then P = NP. We prove that there is a sparse language that is coNP-complete. In this way, we demonstrate that the complexity class P is equal to NP.


INTRODUCTION
The P versus NP problem is a major unsolved problem in computer science [1]. It is considered by many to be the most important open problem in the field [1]. It is one of the seven Millennium Prize Problems selected by the Clay Mathematics Institute to carry a US$1,000,000 prize for the first correct solution [1]. It was essentially mentioned in 1955 in a letter written by John Nash to the United States National Security Agency [2]. However, the precise statement of the P versus NP problem was introduced in 1971 by Stephen Cook in a seminal paper [1].
In 1936, Turing developed his theoretical computational model [3]. The deterministic and nondeterministic Turing machines have become two of the most important definitions related to this theoretical model of computation [3]. A deterministic Turing machine has only one next action for each step defined in its program or transition function [3]. A nondeterministic Turing machine may have more than one action defined for each step of its program, so that the program is no longer a function but a relation [3].
Another relevant advance in the last century has been the definition of a complexity class. A language over an alphabet is any set of strings made up of symbols from that alphabet [4]. A complexity class is a set of problems, represented as languages, grouped by measures such as running time, memory, etc. [4].
The set of languages decided by deterministic Turing machines within time f is an important complexity class denoted TIME(f(n)) [5]. In addition, the complexity class NTIME(f(n)) consists of those languages that can be decided within time f by nondeterministic Turing machines [5]. The most important complexity classes are P and NP. The class P is the union of all languages in TIME(n^k) for every possible positive constant k [5]. At the same time, NP consists of all languages in NTIME(n^k) for every possible positive constant k [5]. Whether P = NP or not is still a controversial and unsolved problem [2]. In this work, we prove that the complexity class P is equal to NP. Hence, we solve one of the most important open problems in computer science.
This article may appear completely or partially on the Academia.edu, figshare, Zenodo and HAL preprint servers. This is not plagiarism, since I am the author of those manuscripts.

MOTIVATION
The biggest open question in theoretical computer science concerns the relationship between these classes: is P equal to NP? In 2012, a poll of 151 researchers showed that 126 (83%) believed the answer to be no, 12 (9%) believed the answer to be yes, 5 (3%) believed the question may be independent of the currently accepted axioms and therefore impossible to prove or disprove, and 8 (5%) said that they did not know, did not care, or did not want the answer to be yes or the problem to be resolved [6]. It is fully expected that P ≠ NP [5]. Indeed, if P = NP then there are stunning practical consequences [5]. For that reason, P = NP is considered a very unlikely event [5]. Certainly, P versus NP is one of the greatest open problems in science, and a correct solution will have a great impact not only on computer science but on many other fields as well [2].

SUMMARY
In computational complexity theory, a sparse language is a formal language (a set of strings) such that the complexity function, counting the number of strings of length n in the language, is bounded by a polynomial function of n. The complexity class of all sparse languages is called SPARSE. SPARSE contains TALLY, the class of unary languages, since these have at most one string of any one length.
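The complexity (census) function described above can be illustrated on a toy example. The sketch below is only an illustration: the helper `census` and the tally language `is_even_tally` are hypothetical names that do not come from the paper, and the enumeration is exponential in n, so it is meant for tiny lengths only.

```python
from itertools import product


def census(accepts, n, alphabet="01"):
    """Count the strings of length n over `alphabet` accepted by `accepts`."""
    return sum(1 for chars in product(alphabet, repeat=n) if accepts("".join(chars)))


def is_even_tally(w):
    # Hypothetical unary language: strings made only of '1' with even length.
    return set(w) <= {"1"} and len(w) % 2 == 0


# A tally language has at most one string per length, so its census is
# bounded by the constant polynomial 1 and the language is sparse.
counts = [census(is_even_tally, n) for n in range(7)]
print(counts)
```

By contrast, the language of all binary strings has census 2^n, which no polynomial bounds.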
Fortune showed in 1979 that if any sparse language is coNP-complete, then P = NP (this is Fortune's theorem) [7]. Mahaney used this to show in 1982 that if any sparse language is NP-complete, then P = NP [8]. A simpler proof of this based on left-sets was given by Ogihara and Watanabe in 1991 [9]. Mahaney's argument does not actually require the sparse language to be in NP, so there is a sparse NP-hard set if and only if P = NP [8].
We create a class with the opposite definition, that is, a class of languages that are dense instead of sparse. We show that there is a sequence of NP-complete languages whose density grows as we move forward in the sequence. The first element of the sequence is a variation of the NP-complete problem known as HAM-CYCLE [10]. The next element in the sequence is constructed from this new version of HAM-CYCLE. Indeed, each language is created from the previous language in the sequence.
Since the density grows as we move forward in the sequence, there must be a language so dense that its complement is sparse. We find this property in a language built from the languages of this sequence. Moreover, we show that this very dense language is still NP-complete. Thus, the complement of this language is coNP-complete, because the complement of every NP-complete language is complete for coNP [11].
In this way, we find a sparse language that is coNP-complete. As a consequence of Fortune's theorem, we demonstrate that P is equal to NP. To sum up, we prove that there is a sparse complete set for coNP and therefore we solve the P versus NP problem.

BASIC DEFINITIONS
Let Σ be a finite alphabet with at least two elements, and let Σ* be the set of finite strings over Σ [12]. A Turing machine M has an associated input alphabet Σ [12]. For each string w in Σ* there is a computation associated with M on input w [12]. We say that M accepts w if this computation terminates in the accepting state, that is, M(w) = "yes" [12]. Note that M fails to accept w either if this computation ends in the rejecting state, that is, M(w) = "no", or if the computation fails to terminate [12].
The language accepted by a Turing machine M, denoted L(M), has an associated alphabet Σ and is defined by $L(M) = \{w \in \Sigma^* : M(w) = \text{"yes"}\}$. We denote by $t_M(w)$ the number of steps in the computation of M on input w [12]. For $n \in \mathbb{N}$ we denote by $T_M(n)$ the worst case run time of M; that is, $T_M(n) = \max\{t_M(w) : w \in \Sigma^n\}$, where $\Sigma^n$ is the set of all strings over Σ of length n [12]. We say that M runs in polynomial time if there is a constant k such that for all n, $T_M(n) \le n^k + k$ [12]. In other words, the language L(M) can be accepted by the Turing machine M in polynomial time. Therefore, P is the complexity class of languages that can be accepted in polynomial time by deterministic Turing machines [4]. A verifier for a language L is a deterministic Turing machine M, where L = {w : M(w, c) = "yes" for some string c}.
We measure the time of a verifier only in terms of the length of w, so a polynomial time verifier runs in polynomial time in the length of w [12]. A verifier uses additional information, represented by the symbol c, to verify that a string w is a member of L. This information is called a certificate. NP is also the complexity class of languages defined by polynomial time verifiers [5]. If NP is the class of problems that have succinct certificates, then the complexity class coNP must contain those problems that have succinct disqualifications [5]. That is, a "no" instance of a problem in coNP possesses a short proof of its being a "no" instance [5].
A function f : Σ* → Σ* is a polynomial time computable function if some deterministic Turing machine M, on every input w, halts in polynomial time with just f(w) on its tape [3]. Let {0, 1}* be the infinite set of binary strings. We say that a language $L_1 \subseteq \{0, 1\}^*$ is polynomial time reducible to a language $L_2 \subseteq \{0, 1\}^*$, written $L_1 \le_p L_2$, if there exists a polynomial time computable function f : {0, 1}* → {0, 1}* such that for all x ∈ {0, 1}*, $x \in L_1$ if and only if $f(x) \in L_2$. An important complexity class is NP-complete [11]. A language L ⊆ {0, 1}* is NP-complete if L ∈ NP and $L' \le_p L$ for every L' ∈ NP.
An instance of the language HAM-CYCLE is a simple graph G = (V, E) where V is the set of vertices and E is the set of edges, each edge being an unordered pair of vertices [4]. We say (u, v) ∈ E is an edge in a simple graph G = (V, E) where u and v are vertices. A simple graph is an undirected graph without multiple edges or loops [4]. For a simple graph G = (V, E), a simple cycle is a sequence of distinct vertices $v_0, v_1, \ldots, v_k$ such that $(v_k, v_0) \in E$ and $(v_i, v_{i+1}) \in E$ for i = 0, 1, ..., k − 1 [4]. A Hamiltonian cycle is a simple cycle of the simple graph which contains all the vertices of the graph. A simple graph that contains a Hamiltonian cycle is said to be Hamiltonian; otherwise, it is non-Hamiltonian [4]. The problem HAM-CYCLE asks whether a simple graph is Hamiltonian [4].
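The role of a certificate can be made concrete for HAM-CYCLE. The following is a minimal sketch of a polynomial time verifier, not the paper's construction: the function name `verify_ham_cycle`, the 0-based vertex numbering and the choice of an ordered vertex list as certificate are all assumptions made for illustration. It also assumes that a cycle needs at least three vertices.

```python
def verify_ham_cycle(n, edges, certificate):
    """Return True if `certificate` orders the vertices 0..n-1 into a Hamiltonian cycle.

    n           -- number of vertices (assumed to be numbered 0..n-1)
    edges       -- iterable of unordered vertex pairs
    certificate -- proposed cyclic ordering of all vertices
    """
    # The certificate must list every vertex exactly once.
    if n < 3 or sorted(certificate) != list(range(n)):
        return False
    edge_set = {frozenset(e) for e in edges}
    # Every consecutive pair, including the wrap-around pair, must be an edge.
    return all(
        frozenset((certificate[i], certificate[(i + 1) % n])) in edge_set
        for i in range(n)
    )


square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(verify_ham_cycle(4, square, [0, 1, 2, 3]))
print(verify_ham_cycle(4, square, [0, 2, 1, 3]))
```

The check runs in time polynomial in the size of the graph, while finding a suitable certificate is the hard part of the problem.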

RESULTS
Definition 5.1. A dense language on m is a formal language (a set of binary strings) such that, for some positive integer $n_0$, the number of strings of length $n \ge n_0$ in the language is greater than or equal to $2^{n-m}$, where m is a real number and 0 ≤ m ≤ 1. The complexity class of all dense languages on m is called DENSE(m).
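The counting condition in Definition 5.1 can be checked on a finite range of lengths. The sketch below is an illustration only: the names `meets_density_bound`, `holds_up_to` and the example language (binary strings whose first bit is 1) are hypothetical, and a finite check of this kind cannot establish membership in DENSE(m) for all n.

```python
def meets_density_bound(count, n, m):
    """Per-length condition of Definition 5.1: count >= 2^(n - m)."""
    return count >= 2.0 ** (n - m)


def holds_up_to(count_of_length, m, n0, n_max):
    """Check the bound for every length n0 <= n <= n_max (finite check only)."""
    return all(
        meets_density_bound(count_of_length(n), n, m)
        for n in range(n0, n_max + 1)
    )


def first_bit_one(n):
    # Hypothetical language: binary strings starting with 1,
    # which has exactly 2^(n-1) strings of each length n >= 1.
    return 2 ** (n - 1)


print(holds_up_to(first_bit_one, 1, 1, 40))    # bound 2^(n-1) is met
print(holds_up_to(first_bit_one, 0.5, 1, 40))  # bound 2^(n-0.5) is not met
```

Smaller values of m demand more strings per length; m = 0 requires every string of each sufficiently large length.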
In this work, we represent simple graphs with an adjacency matrix [4]. For the adjacency-matrix representation of a simple graph G = (V, E), we assume that the vertices are numbered 1, 2, ..., |V| in some arbitrary manner. The adjacency-matrix representation of a simple graph G consists of a |V| × |V| matrix $A = (a_{i,j})$ such that $a_{i,j} = 1$ when (i, j) ∈ E and $a_{i,j} = 0$ otherwise [4]. In this way, every simple graph of k vertices is represented by $k^2$ bits.
Observe that the adjacency matrix of a simple graph is symmetric along its main diagonal.
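The encoding and the symmetry noted above can be sketched as follows. This is an illustration under assumptions: the helper names `encode_graph` and `is_symmetric` are not from the paper, and the matrix is assumed to be written row by row into a string of '0' and '1' characters.

```python
import math


def encode_graph(k, edges):
    """Row-by-row adjacency-matrix bits of an undirected graph on vertices 1..k."""
    rows = [["0"] * k for _ in range(k)]
    for i, j in edges:
        rows[i - 1][j - 1] = "1"
        rows[j - 1][i - 1] = "1"  # undirected edge: set both a_{i,j} and a_{j,i}
    return "".join("".join(row) for row in rows)


def is_symmetric(x):
    """True if the k x k bit matrix stored row by row in x satisfies a_{i,j} = a_{j,i}."""
    k = math.isqrt(len(x))
    if k * k != len(x):
        raise ValueError("length is not a perfect square")
    return all(
        x[i * k + j] == x[j * k + i]
        for i in range(k)
        for j in range(i + 1, k)
    )


path = encode_graph(3, [(1, 2), (2, 3)])
print(path, len(path), is_symmetric(path))
```

A graph on k vertices yields exactly k * k bits, and any string produced by `encode_graph` passes the symmetry test by construction.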
We define the transpose of a matrix $A = (a_{i,j})$ to be the matrix $A^T = (a^T_{i,j})$ given by $a^T_{i,j} = a_{j,i}$. Hence the adjacency matrix A of a simple graph is its own transpose, $A = A^T$.
Definition 5.2. The language NON-SIMPLE contains all the graphs that are represented by an adjacency matrix A such that $A \ne A^T$.
Lemma 5.3. NON-SIMPLE ∈ P.
Proof. Given a binary string x, we can check whether x is an adjacency matrix which is not equal to its own transpose in time $O(|x|^2)$ by iterating over each bit $a_{i,j}$ in x and checking whether $a_{i,j} = a_{j,i}$ or not, where |...| represents the bit-length function [4].
Proof (of Lemma 5.6). OEIS A000088 gives the total number of graphs on n unlabeled points [13]. For 8 points there are 12346, and just over half the graphs on 8 points are Hamiltonian [13].
For 12 points, the highest in the Hamiltonian list, there are 152522187830 Hamiltonian graphs out of 165091172592, so over 92% of the 12-point graphs are Hamiltonian [13]. For n = 2 there are two graphs, neither of which is Hamiltonian [13]. For n < 8 over half the graphs are not Hamiltonian [13]. It does not seem surprising that once n gets large most graphs are Hamiltonian [13].
Choosing a graph on n vertices at random is the same as including each edge in the graph with probability $\frac{1}{2}$, independently of the other edges [14]. You get a more general model of random graphs if you choose each edge with probability p [14]. This model is known as $G_{n,p}$ [14]. It turns out that for any constant p > 0, the probability that G contains a Hamiltonian cycle tends to 1 when n tends to infinity [14]. In fact, this is true whenever $p > \frac{c \times \log n}{n}$ for some constant c. In particular this is true for $p = \frac{1}{2}$, which is our case [14].
For all the binary strings z such that z = xy, where the bit-length of x is equal to $(\lfloor\sqrt{|z|}\rfloor)^2$, the number of elements of size |z| in HAM-CYCLE' is equal to the number of binary strings x with x ∈ HAM-CYCLE or x ∈ NON-SIMPLE multiplied by $2^{|y|}$, the number of possible suffixes y. Since the proportion of Hamiltonian graphs increases as n grows, it does not seem surprising either that once n gets large most binary strings belong to HAM-CYCLE'. We can therefore affirm that, for a sufficiently large positive integer $n_0$, the binary strings of length $n \ge n_0$ which belong to HAM-CYCLE' number at least $2^{n-1}$. In this way, we prove HAM-CYCLE' ∈ DENSE(1).
Definition 5.7. We define a sequence of languages HAM-CYCLE'_k for every integer k ≥ 1. We set HAM-CYCLE'_1 to be the language HAM-CYCLE'. Recursively, from a language HAM-CYCLE'_k, we define HAM-CYCLE'_{k+1} as the language of binary strings of size n + i (where i = n or i = n − 1) whose prefix of size i belongs to HAM-CYCLE'_k or whose suffix of size n belongs to HAM-CYCLE'_k.
Lemma 5.8. HAM-CYCLE'_k ∈ NP for every integer k ≥ 1.
Proof. We can extend this property to every positive integer k > 3 in HAM-CYCLE'_k. Indeed, HAM-CYCLE'_k is in NP for every integer k ≥ 1, because verifying whether the whole string or its substrings are elements of HAM-CYCLE'_1 can be done in polynomial time with the appropriate certificates.
Theorem 5.9. HAM-CYCLE'_k ∈ NP-complete for every integer k ≥ 1.
Proof. This is true for k = 1 by Lemma 5.5. Assume it is valid for some positive integer k' ≥ 1. We prove it for k' + 1. We already know that the adjacency matrix of $n^2$ zeros represents a simple graph of n vertices which does not contain any edge. Such a simple graph does not belong to HAM-CYCLE'_1. Suppose we have an instance y of HAM-CYCLE'_{k'}. We can reduce y to zy, where the binary string z is exactly a sequence of |y| zeros, such that y ∈ HAM-CYCLE'_{k'} if and only if zy ∈ HAM-CYCLE'_{k'+1}. Since this reduction runs in polynomial time for every positive integer k' ≥ 1, HAM-CYCLE'_{k'+1} is NP-hard. Moreover, HAM-CYCLE'_{k'+1} is also NP-complete, because of Lemma 5.8.
Theorem 5.10. For every integer k ≥ 1, if the language HAM-CYCLE'_k is in DENSE(k') for every natural number $n \ge n_0$, then HAM-CYCLE'_{k+1} is in DENSE($\frac{k'}{2}$) for every integer $n \ge 2 \times n_0 + 1$.
Proof. If the language HAM-CYCLE'_k is in DENSE(k') for every natural number $n \ge n_0$, then for every integer $n \ge n_0 + 1$ the number of elements of size n + i in HAM-CYCLE'_{k+1} (where i = n or i = n − 1) is greater than or equal to
$$2^{i-k'} \cdot 2^{n} + 2^{n-k'} \cdot 2^{i} - 2^{i-k'} \cdot 2^{n-k'}.$$
The reason is that there must be at least $2^{i-k'}$ elements of size i in HAM-CYCLE'_k which are prefixes of the binary strings of size n + i in the language HAM-CYCLE'_{k+1}. Moreover, there must be at least $2^{n-k'}$ elements of size n in HAM-CYCLE'_k which are suffixes of the binary strings of size n + i in HAM-CYCLE'_{k+1}. If we join both properties, counting only once the strings that satisfy both, we obtain the formula above. Indeed, this formula can be simplified to
$$2^{n+i-k'} + 2^{n+i-k'} - 2^{n+i-2k'},$$
and extracting a common factor we obtain
$$2^{n+i} \cdot (2^{-k'} + 2^{-k'} - 2^{-2k'}),$$
which is equal to
$$2^{n+i} \cdot (2^{1-k'} - 2^{-2k'}).$$
It remains to check that $2^{1-k'} - 2^{-2k'} \ge 2^{-k'/2}$. Certainly, if we multiply both members of this inequality by $2^{k'}$, we obtain
$$2 - 2^{-k'} \ge 2^{k'/2},$$
which is true for every real number 0 ≤ k' ≤ 1. Thus
$$2^{n+i} \cdot (2^{1-k'} - 2^{-2k'}) \ge 2^{n' - (k'/2)}$$
where $n' = n + i$. Since every binary string of size n' also has bit-length n + i for some natural number n (where i = n or i = n − 1), there are at least $2^{n' - (k'/2)}$ elements of the language HAM-CYCLE'_{k+1} with length $n' \ge 2 \times n_0 + 1$. In this way, we show that HAM-CYCLE'_{k+1} is in DENSE($\frac{k'}{2}$) for every integer $n \ge 2 \times n_0 + 1$.
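The two ingredients of this proof can be sanity-checked numerically. The sketch below rests on assumptions that should be read as such: it takes the counting step to be an inclusion-exclusion count of strings whose prefix lies in one set or whose suffix lies in another, and it takes the final inequality, after dividing out the common power of two, to be 2 − 2^(−k) ≥ 2^(k/2) for a real parameter written `k` here. All function names are hypothetical.

```python
from itertools import product


def union_count(prefix_set, suffix_set, i, n):
    """Enumerate strings xy with |x| = i, |y| = n and x in prefix_set or y in suffix_set."""
    total = 0
    for bits in product("01", repeat=i + n):
        s = "".join(bits)
        if s[:i] in prefix_set or s[i:] in suffix_set:
            total += 1
    return total


def inclusion_exclusion(a, b, i, n):
    """Closed form: a prefixes with any suffix, b suffixes with any prefix, minus overlap."""
    return a * 2 ** n + b * 2 ** i - a * b


def reduced_inequality_holds(k):
    """Assumed reduced form of the bound: 2 - 2^(-k) >= 2^(k/2)."""
    return 2 - 2 ** (-k) >= 2 ** (k / 2)


prefixes, suffixes = {"00", "01"}, {"111"}
print(union_count(prefixes, suffixes, 2, 3), inclusion_exclusion(2, 1, 2, 3))
print(all(reduced_inequality_holds(j / 100) for j in range(101)))
```

The grid check covers only sampled values of k in [0, 1]; it illustrates the inequality and does not replace the analytic argument.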

Lemma 5.11. HAM-CYCLE'_k ∈ DENSE($\frac{1}{2^{k-1}}$) for every natural number $n \ge 2^{k-1} \times n_0 + 2^{k-1} - 1$, where the constant $n_0$ is the positive integer used in Definition 5.1 and Lemma 5.6 for HAM-CYCLE'.
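The threshold in Lemma 5.11 matches what one gets by iterating the step of Theorem 5.10, which halves the density parameter and maps the length threshold n to 2n + 1. The sketch below checks this against the closed form; it assumes the starting point (1, n_0) given by HAM-CYCLE' ∈ DENSE(1), and the function names are hypothetical.

```python
from fractions import Fraction


def iterate_step(n0, k):
    """Apply (m, n) -> (m / 2, 2 * n + 1) exactly k - 1 times, starting from (1, n0)."""
    m, n = Fraction(1), n0
    for _ in range(k - 1):
        m, n = m / 2, 2 * n + 1
    return m, n


def closed_form(n0, k):
    """Density parameter 1 / 2^(k-1) and threshold 2^(k-1) * n0 + 2^(k-1) - 1."""
    return Fraction(1, 2 ** (k - 1)), 2 ** (k - 1) * n0 + 2 ** (k - 1) - 1


print(iterate_step(3, 3), closed_form(3, 3))
```

The threshold doubles at every step, so it grows exponentially in k while the density parameter shrinks toward 0.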
Theorem 5.13. HAM-CYCLE'_∞ ∈ NP-complete.
Proof. HAM-CYCLE'_∞ ∈ NP, since we can build a single polynomial time verifier such that, for a given binary string $x\,n_{k-1}$, the verifier accepts when x ∈ HAM-CYCLE'_{k+1} according to Lemma 5.8. Certainly, we can verify whether some substrings of a given binary string belong to HAM-CYCLE' or not with the appropriate certificates. In addition, we only need to verify first whether $n_{k-1}$ is the binary representation of the integer k − 1 ≥ 0 and then check the membership of x in HAM-CYCLE'_{k+1} using a certificate on a single polynomial time verifier, according to the properties of these NP languages described in Lemma 5.8. Moreover, we can reduce every element y of HAM-CYCLE' such that y ∈ HAM-CYCLE' if and only if yz ∈ HAM-CYCLE'_∞, where the binary string z is exactly a sequence of |y| + 1 zeros. Certainly, the numbers represented by the binary strings which are suffixes of yz could be 0 or greater than or equal to $2^{|y|+1}$. However, there is no chance that y belongs to some language HAM-CYCLE'_k when $k > 2^{|y|+1} + 1$. Hence, the only possible candidate is the integer k = 1 represented by the binary string 0, which is a suffix of yz. In this way, if yz is in HAM-CYCLE'_2, then this implies that y is in HAM-CYCLE', since the sequence of zeros z can never belong to HAM-CYCLE' (see more in Theorem 5.9). Consequently, HAM-CYCLE'_∞ is NP-hard and in NP, and therefore HAM-CYCLE'_∞ ∈ NP-complete.
Theorem 5.14. HAM-CYCLE'_∞ ∈ DENSE(0) when the bit length n of the binary strings tends to infinity.
Proof. When k tends to infinity, $\frac{1}{2^{k-1}}$ tends to 0. In this way, when k tends to infinity, HAM-CYCLE'_k ∈ DENSE(0) as a consequence of Lemma 5.11. In addition, when k grows, the constant $n_0$ becomes exponentially larger in relation to k, where $n_0$ is the positive integer used in Definition 5.1 for HAM-CYCLE'_k. However, the elements of HAM-CYCLE'_k when k tends to infinity are prefixes of the binary strings which are elements of HAM-CYCLE'_∞ when the bit length n of the binary strings tends to infinity. In conclusion, the density grows to its maximum for HAM-CYCLE'_∞ when the bit length n of the binary strings tends to infinity.
Corollary 5.15. coHAM-CYCLE'_∞ ∈ SPARSE.
Proof. coHAM-CYCLE'_∞ is the complement of the language HAM-CYCLE'_∞. In Theorem 5.14, we obtain HAM-CYCLE'_∞ ∈ DENSE(0) when the bit length n of the binary strings tends to infinity and thus the complexity function, counting the number of strings of length n in coHAM-CYCLE'_∞, is bounded by a polynomial function of n. Certainly, a language is sparse if and only if its complement is in DENSE(0) when the bit length n of the binary strings tends to infinity [8]. Indeed, sparse languages are called sparse because there are a total of $2^n$ strings of length n, and if a language only contains polynomially many of these, then the proportion of strings of length n that it contains rapidly goes to zero as n grows (which means its complement should be in DENSE(0) when n tends to infinity) [8]. Therefore, as a consequence of Theorem 5.14, coHAM-CYCLE'_∞ ∈ SPARSE.

Theorem 5.16. There is a sparse language in coNP-complete.
Proof. Due to Theorem 5.13, we have coHAM-CYCLE'_∞ ∈ coNP-complete, because the complements of the NP-complete problems are complete for coNP [11]. In Corollary 5.15, we prove coHAM-CYCLE'_∞ ∈ SPARSE and thus this sparse language in coNP-complete actually exists.
Lemma 5.17. P = NP.
Proof. By Fortune's theorem, if any sparse language is coNP-complete, then P = NP [7]. As a result of Theorem 5.16, there is a sparse language in coNP-complete. Finally, we demonstrate that P is equal to NP.

DISCUSSION
A logarithmic space Turing machine has a read-only input tape, a write-only output tape, and a read/write work tape [3]. The work tape may contain O(log n) symbols [3]. In computational complexity theory, LOGSPACE is the complexity class containing those decision problems that can be decided by a deterministic logarithmic space Turing machine [5]. Whether LOGSPACE = P is another fundamental question that is as important as it is unresolved [5].
A logarithmic space Turing machine M may compute a function f : Σ* → Σ*, where f(w) is the string remaining on the output tape after M halts when it is started with w on its input tape [3]. We call f a logarithmic space computable function [3]. We say that a language $L_1 \subseteq \{0, 1\}^*$ is logarithmic space reducible to a language $L_2 \subseteq \{0, 1\}^*$, written $L_1 \le_l L_2$, if there exists a logarithmic space computable function f : {0, 1}* → {0, 1}* such that for all x ∈ {0, 1}*, $x \in L_1$ if and only if $f(x) \in L_2$. The logarithmic space reduction is frequently used for the class P-complete [5].
In 1999, Jin-Yi Cai and D. Sivakumar, building on work by Ogihara, showed that if there exists a sparse P-complete problem, then LOGSPACE = P [15]. We might extend the proof of this paper to demonstrate that LOGSPACE = P. Certainly, we might only need to find some P-complete language which belongs to DENSE(1), because P-completeness is closed under complement [5]. Indeed, the other steps of that possible proof might be similar to the arguments that we follow in this paper. Consequently, this work would help us to solve not only P versus NP, but also LOGSPACE versus P.

CONCLUSION
No one has been able to find a polynomial time algorithm for any of more than 300 important known NP-complete problems [10]. A proof of P = NP will have stunning practical consequences, because it leads to efficient methods for solving some of the important problems in NP [1]. The consequences, both positive and negative, arise since various NP-complete problems are fundamental in many fields [1]. Because P = NP, this result supports the existence of a practical solution for the NP-complete problems.
Cryptography, for example, relies on certain problems being difficult. A constructive and efficient solution to an NP-complete problem such as 3SAT will break most existing cryptosystems, including public-key cryptography [16], symmetric ciphers [17] and one-way functions used in cryptographic hashing [18]. These would need to be modified or replaced by information-theoretically secure solutions not inherently based on the assumption that P ≠ NP.
There are enormous positive consequences that will follow from rendering tractable many currently mathematically intractable problems.For instance, many problems in operations research are NP-complete, such as some types of integer programming and the traveling salesman problem [10].Efficient solutions to these problems have enormous implications for logistics [1].Many other important problems, such as some problems in protein structure prediction, are also NP-complete, so this will spur considerable advances in biology [19].
But such changes may pale in significance compared to the revolution an efficient method for solving NP-complete problems will cause in mathematics itself. Stephen Cook says: "...it would transform mathematics by allowing a computer to find a formal proof of any theorem which has a proof of a reasonable length, since formal proofs can easily be recognized in polynomial time." [1].
Research mathematicians spend their careers trying to prove theorems, and some proofs have taken decades or even centuries to find after problems have been stated.For instance, Fermat's Last Theorem took over three centuries to prove.A method that is guaranteed to find proofs to theorems, should one exist of a "reasonable" size, would essentially end this struggle.
Indeed, with a polynomial algorithm for an NP-complete problem, we could solve not merely one Millennium Problem but all seven of them [2]. This observation rests on the following: once we fix a formal system such as first-order logic plus the axioms of ZF set theory, we can find a proof in time polynomial in n whenever a given statement has a proof at most n symbols long in that system [2]. This assumes that the other six Clay conjectures have ZF proofs that are not too large, as was the case for Perelman's proof [2].
Besides, a P = NP proof reveals the existence of an interesting relationship between humans and machines [2]. For example, suppose we want to program a computer to create new Mozart-quality symphonies and Shakespeare-quality plays. When P = NP, this could be reduced to the easier problem of writing a computer program to recognize great works of art [2].
In October 2017, Frank Vega contributed as co-author with a presentation at the 7th International Scientific Conference on economic development and standard of living ("EDASOL 2017 - Economic development and Standard of living"). In February 2017, his book "Protesta" (a book of poetry and short stories in Spanish) was published by the Alexandria Library Publishing House. He was also Director of two IT companies (Joysonic and Chavanasoft) created in Serbia.

Definition 5.4. The language HAM-CYCLE' contains all the binary strings z such that z = xy, the bit-length of x is equal to $(\lfloor\sqrt{|z|}\rfloor)^2$ and x ∈ HAM-CYCLE or x ∈ NON-SIMPLE, where |...| represents the bit-length function and y could be the empty string.
Lemma 5.5. HAM-CYCLE' ∈ NP-complete.
Proof. Given a binary string x, we can decide in polynomial time whether x ∉ NON-SIMPLE just by verifying whether $x = x^T$. In this way, we can reduce in polynomial time a simple graph G = (V, E) of k vertices encoded as the binary string x such that, when x has $k^2$ bits and x ∉ NON-SIMPLE, then x ∈ HAM-CYCLE if and only if x ∈ HAM-CYCLE'. Then, we can reduce in polynomial time each element of HAM-CYCLE to HAM-CYCLE'. Therefore, HAM-CYCLE' is NP-hard. Moreover, we can verify in polynomial time whether a binary string z such that z = xy, where the bit-length of x is equal to $(\lfloor\sqrt{|z|}\rfloor)^2$, complies with x ∈ HAM-CYCLE or x ∈ NON-SIMPLE, since HAM-CYCLE ∈ NP, NON-SIMPLE ∈ P and P ⊆ NP [5]. Consequently, HAM-CYCLE' is in NP. Hence, HAM-CYCLE' ∈ NP-complete.
Lemma 5.6. HAM-CYCLE' ∈ DENSE(1).
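Membership in HAM-CYCLE' can be sketched by brute force for very short strings. The block below is an illustration under stated assumptions, not the paper's procedure: it reads the prefix length as the largest perfect square not exceeding |z|, treats the prefix as a row-by-row adjacency matrix, ignores the diagonal entries, and requires at least three vertices for a cycle. All function names are hypothetical, and the Hamiltonicity search is exponential.

```python
import math
from itertools import permutations


def split_prefix(z):
    """Split z into (x, y) where |x| is the largest perfect square <= |z| (assumed reading)."""
    k = math.isqrt(len(z))
    return z[: k * k], z[k * k:]


def is_symmetric(x, k):
    """True if the k x k bit matrix stored row by row in x equals its transpose."""
    return all(x[i * k + j] == x[j * k + i] for i in range(k) for j in range(k))


def has_ham_cycle(x, k):
    """Brute-force search over vertex orderings that start at vertex 0 (tiny k only)."""
    if k < 3:
        return False
    for rest in permutations(range(1, k)):
        order = (0,) + rest
        if all(x[order[t] * k + order[(t + 1) % k]] == "1" for t in range(k)):
            return True
    return False


def in_ham_cycle_prime(z):
    """z is accepted if its prefix x is asymmetric (NON-SIMPLE) or encodes a Hamiltonian graph."""
    x, _ = split_prefix(z)
    k = math.isqrt(len(x))
    if not is_symmetric(x, k):
        return True
    return has_ham_cycle(x, k)


triangle = "011101110"
print(in_ham_cycle_prime(triangle + "0"), in_ham_cycle_prime("0" * 9))
```

The suffix y never influences the answer, which is why each accepted prefix contributes every possible suffix of the remaining length.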

Frank Vega is essentially a back-end programmer who graduated in Computer Science in 2007. In August 2012, he ventured into research as an Independent Researcher in Computational Complexity, writing a work published by IEEE Latin America Transactions which claimed a solution to an outstanding problem of Computer Science, but flaws were found in the proof. In August 2017, he was invited as a guest reviewer for a peer review of a manuscript about Theory of Computation in the flagship journal of the IEEE Computer Society.