VB Helper Tutorial: Does P = NP?

Home
Search

What's New
Index
Books
Links
Q & A
Newsletter
Banners

Feedback
Tip Jar

C# Helper...

Tutorial: Does P = NP?
This article describes the sets P and NP, asks the question "does P = NP?" and explains why that question is of profound philosophical and practical importance.

P

Does P = NP? This is undoubtedly the most profound question in computer science. To understand the question and its impact, you first need to know what P and NP are. This discussion is a little imprecise, but it should give you an idea of what's involved.

P is the set of problems that can be solved in deterministic polynomial time. That means for a problem with inputs of size N, there must be some way to solve the problem in F(N) steps for some polynomial F. F can be any polynomial, even N to the 10 millionth power.

For example, consider the problem of finding the smallest number in a list of N integers n1, n2, n3, etc. One method for solving this problem is:

    Dim max_number As Integer

    max_number = n1
    If n2 > max_number Then max_number = n2
    If n3 > max_number Then max_number = n3
    If n4 > max_number Then max_number = n4
    etc.

Because there are N numbers in the list, this takes roughly N steps. N is a polynomial function of N, so this problem is in the set P.

[Technically when you talk about the size of a problem's inputs, you need to consider the size of the numbers themselves. For example, if these numbers are 32-bit integers, the complete size of the inputs is N * 32. But N is also a polynomial function of this value so in this case we can ignore the extra factor of 32.]

If the numbers are stored in an array, you can write this program with less code as:

    Dim max_number As Integer
    Dim i As Integer

    max_number = n(1)
    For i = 2 To N
        If n(i) > max_number Then max_number = n(i)
    Next i

This version takes less code and will be easier to maintain, but it uses the same number of steps as the previous version. It just repeats the loop N - 1 times instead of executing N - 1 comparison statements on separate lines.

NP

NP is the set of problems you can solve in non-deterministic polynomial time. That means for a problem with inputs of size N, there must be some way to solve the problem in F(N) steps for some polynomial F just as before. In NP, however, the program is allowed to make lucky guesses, though it must prove the solution is correct. The standard format for a program in NP is:

Guess the answer.
Verify that the answer is correct in polynomial time.

For example, factoring is in NP. Suppose you have a number A that you want to break into two factors. The NP program is:

    Guess factors P and Q.
    Multiply P times Q and verify that the results is A.

This takes only two non-deterministic steps so the problem is in NP.

Note that all problems in P are also in NP. You do not need to use non-deterministic guesses in an NP program if you don't want to.

Note also that factoring is not known to be in P. It is trivial to write a fast MP algorithm for factoring, but no one has ever discovered a deterministic polynomial time algorithm for factoring.

This can be a little confusing since factoring does not seem like a hard problem. You can easily write a program that factors some relatively large numbers. In Visual Basic, you can use long integers to factor 10-digit numbers very quickly.

The catch is that the size of the inputs is actually the length of the number you are factoring, not the number itself. If you are factoring 1,128,768,361, the number of digits 9 is the value N you need to use in finding the polynomial F(N). Or if you want to think in binary, this is a 32-bit number so you need to find a polynomial using the value 32, not the value 1,28,768,361 which is much bigger.

While you can easily write a Visual Basic program to factor 10-digit numbers, try factoring a few 100-digit numbers. That is much more difficult. In fact, it likely that any Visual Basic program you write would take hundreds if not millions of years to factor randomly selected 100-digit numbers. (You might be able to factor some numbers quickly, but many you could not.)

Does P = NP?

Now that you know what P and NP are, does P = NP? No one has ever proven that they are equal and no one has ever proven they are not. A lot of smart people have spent a lot of years showing one problem is as hard as another. There is a large class of problems called NP-complete that you can show are equally difficult. For example, if you can find a deterministic polynomial solution for the Traveling Salesman problem (finding the shortest path that visits a set of cities and returns to the start), you can also find one for the SAT problem (find an assignment of values to Boolean variables in a logical statement to make it true). Despite all efforts, however, no one has ever shown that an NP-complete problem is also in P.

Because no one has found such an example, many researchers believe that P <> NP. Still, programming is a relatively new science and lots of other hard problems remained unsolved for centuries before someone came up with a solution, so perhaps this one just needs more time.

In some sense, this is a question of solvability. NP-complete problems are extremely hard to solve. In most cases, you can write a program to solve small examples. Often you can write a program to find an approximate solution to larger ones. But solving them exactly is impossible.

Philosophically this means there are a large number of problems, many with useful practical applications, that you cannot solve precisely. There are some things you just cannot know exactly.

Who Cares?

So who cares? Aside from its theoretical and philosophical implications, this question has a practical application of huge importance: cryptography. Much of modern cryptography is based on tough problems such as factoring. Factoring is in NP but is not known to be in P (it's also not know whether it is NP-complete). If you could factor big numbers quickly, a lot of the modern cryptographic systems used by banks, governments, military forces, large corporations, and even secure Web sites would become worthless overnight. The stakes are unbelievably large.

And just so you don't think this is the end of the story, there are some problems with no known solutions even in NP.

The study of the speed and complexity of programs is called "complexity theory." It is a great way to learn more about designing efficient algorithms both theoretically and practically.
My book Ready-to-Run Visual Basic Algorithms shows how to analyze the complexity of Visual Basic programs more precisely. It also describes several other NP-complete problems and explains approximate solutions for them.

Subscribe to the VB Helper newsletter

www.vb-helper.com/tut8.htm

Updated