It is traditional to evaluate the complexity of an algorithm by the amount of basic computer resources it consumes: processor time and RAM. Accordingly, the concepts of the time complexity and the space (volume) complexity of an algorithm are introduced.

The time complexity parameter becomes especially important for problems that involve interactive program operation or for real-time control problems. Often a programmer writing a control program for some technical device has to look for a compromise between the accuracy of the calculations and the running time of the program. Typically, an increase in accuracy leads to an increase in time.

The volume (space) complexity of a program becomes critical when the amount of data being processed approaches the limit of the computer's RAM capacity. On modern computers the severity of this problem is reduced both by the growth of RAM capacity and by the effective use of a multi-level storage system: the program has access to a very large, practically unlimited memory area (virtual memory). A shortage of main memory leads only to some slowdown because of exchanges with the disk, and techniques such as cache memory and hardware look-ahead over program instructions (which allows the required values to be transferred from disk to main memory in advance) minimize the time lost on such exchanges. Based on the above, we can conclude that minimizing space complexity is not a priority, so in what follows we will mainly be interested in the time complexity of algorithms.

The program execution time is proportional to the number of operations performed. Of course, in time units (seconds) it also depends on the speed of the processor (its clock frequency). In order for the time complexity indicator of an algorithm to be invariant with respect to the technical characteristics of the computer, it is measured in relative units: typically, time complexity is measured by the number of operations performed.

Typically, the time complexity of an algorithm depends on the input data: it may depend both on the values of the initial data and on their volume. If we denote the time complexity of algorithm α by Tα and let V be some numerical parameter characterizing the source data, then the time complexity can be represented as a function Tα(V). The choice of the parameter V depends on the problem being solved or on the kind of algorithm used to solve it.

Example 1. Let us estimate the time complexity of the algorithm for calculating the factorial of a positive integer.

Function Factorial(x: Integer): Integer;
Var m, i: Integer;
Begin
  m := 1;
  For i := 2 To x Do m := m * i;
  Factorial := m
End;

Let us count the total number of operations performed by the program for a given value of x. The statement m:=1 is executed once; the body of the loop (which contains two operations: a multiplication and an assignment) is executed x - 1 times; the assignment Factorial:=m is performed once. If each of these operations is taken as a unit of complexity, the time complexity of the whole algorithm is 1 + 2(x - 1) + 1 = 2x. Hence it is clear that the value x should be taken as the parameter, and the time complexity function turned out to be T(x) = 2x.

In this case, we can say that the time complexity depends linearly on the data parameter - the value of the argument of the factorial function.

Example 2. Calculation of the scalar product of two vectors A = (a1, a2, …, ak), B = (b1, b2, …, bk).

AB := 0;
For i := 1 To k Do AB := AB + A[i] * B[i];

In this problem, the volume of the input data is n = 2k. The number of operations performed is 1 + 3k = 1 + 3(n/2). Here we can take V = k = n/2. The complexity of the algorithm does not depend on the values of the elements of the vectors A and B. As in the previous example, we can speak here of a linear dependence of the time complexity on the data parameter.

Two theoretical problems are usually associated with the time complexity of an algorithm. The first is to answer the question: what limit of time complexity can be reached by improving the algorithm for solving the problem? This limit depends on the problem itself and is therefore a characteristic of the problem.

The second problem is related to the classification of algorithms by time complexity. The function Tα(V) usually grows as V increases. How fast does it grow? There are algorithms with a linear dependence of Tα on V (as was the case in the examples we considered), with a quadratic dependence, and with a dependence of higher degrees. Such algorithms are called polynomial. And there are algorithms whose complexity grows faster than any polynomial. The problem that algorithm theorists often solve is the following question: is a polynomial algorithm possible for a given problem?

Functions often encountered when analyzing algorithms:

  • log n (logarithmic time),
  • n (linear time),
  • n log n,
  • n^2 (quadratic time),
  • 2^n (exponential time).

The first four functions have a low growth rate, and algorithms whose running time is estimated by these functions can be considered fast. The growth rate of the exponential function is sometimes characterized as "explosive". For comparison, suppose there are algorithms whose complexity (number of operations) is quite accurately reflected by these functions, and let these algorithms run on a computer performing 10^12 operations per second. With an input length of n ≤ 100,000, algorithms whose running time is estimated by the first four functions will produce an answer in a tiny fraction of a second. For an algorithm with complexity 2^n the running time is estimated as follows (a sample calculation is given after the list):

  • n = 50 ≈ 19 minutes,
  • n = 60 ≈ 320 hours,
  • n = 70 ≈ 37 years.
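For example, the estimate for n = 50 follows directly from the assumed speed: 2^50 ≈ 1.13·10^15 operations, and 1.13·10^15 / 10^12 ≈ 1130 seconds, i.e. about 19 minutes; for n = 70 we get 2^70 ≈ 1.18·10^21 operations, i.e. about 1.18·10^9 seconds, or roughly 37 years.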

Question 15. Sequential, cyclic and recursive algorithms.

Sequential algorithms are algorithms in which the blocks are executed sequentially, one after another, in the order given by the scheme.

Example. Calculate the perimeter of a triangle with sides a, b, c.
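A minimal sketch of this sequential algorithm in C (the input/output details are illustrative assumptions, since the example gives only the problem statement):

#include <stdio.h>

int main(void) {
    double a, b, c, p;
    scanf("%lf %lf %lf", &a, &b, &c);   /* read the sides a, b, c */
    p = a + b + c;                      /* compute the perimeter: the steps run strictly one after another */
    printf("P = %f\n", p);              /* output the result */
    return 0;
}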

Branching structure algorithm

In practice, it is rarely possible to present the solution of a problem as an algorithm of linear structure. Often, depending on some intermediate results, the calculation is carried out by one formula or another; that is, depending on whether some logical condition is satisfied, the computational process follows one or another formula. The algorithm of such a computational process is called an algorithm of branching structure.

Branching is a control structure that organizes the execution of only one of two specified actions depending on whether some condition is true.

A condition is a question that has two possible answers: yes or no.

Branching is written in two forms: full and incomplete (Fig. 1: a) full form, b) incomplete form).
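A small illustration of both forms in C (the variables and conditions are hypothetical):

int abs_value(int x) {
    int y;
    if (x >= 0) {        /* full form: exactly one of the two branches is executed */
        y = x;
    } else {
        y = -x;
    }
    return y;
}

void clamp_to_zero(int *x) {
    if (*x < 0) {        /* incomplete form: the action is performed only when the condition holds */
        *x = 0;
    }
}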

Cyclic algorithms are algorithms in which values have to be computed repeatedly using the same mathematical dependencies (blocks of the scheme) for different values of the quantities involved. Using loops can significantly reduce the size of the flowchart of the algorithm and the length of the corresponding program. There are loops with a known and with an unknown number of repetitions: with a known number of repetitions one uses a loop with a counter; with an unknown number of repetitions, a loop with a precondition or a loop with a postcondition.
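A sketch in C of the three kinds of loop just mentioned, on the hypothetical task of summing the numbers from 1 to n (assuming n >= 1):

int sum_counter(int n) {          /* known number of repetitions: loop with a counter */
    int s = 0;
    for (int i = 1; i <= n; i++)
        s += i;
    return s;
}

int sum_precondition(int n) {     /* loop with a precondition: the check comes before the body */
    int s = 0, i = 1;
    while (i <= n) {
        s += i;
        i++;
    }
    return s;
}

int sum_postcondition(int n) {    /* loop with a postcondition: the body runs at least once */
    int s = 0, i = 1;
    do {
        s += i;
        i++;
    } while (i <= n);
    return s;
}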

A function (or procedure) that directly or indirectly refers to itself is called recursive. Recursion is a way of defining a function through its previously defined values, and also a way of organizing computations in which a function calls itself with a different argument.

When implementing recursive algorithms, each recursion step does not solve the problem directly but reduces it to the same problem of smaller size. This process must eventually lead to a problem whose solution is quite easy. The "reverse pass" then gives successive solutions for problems of ever larger size, up to the original one. The implementation of a recursive procedure relies on a stack (last-in, first-out memory), which stores the data involved in all the calls of the procedure that have not yet finished their work. Recursion is a way of organizing the computational process in which the algorithm refers to itself. The principle of recursion makes it possible to solve a complex problem by successively solving simpler subproblems. As a rule, recursion is needed when too many variants have to be enumerated. Recursion is considered one of the varieties of cyclic algorithms, and the recursive form of organization allows the algorithm to be given a more compact form. Thus the problem is solved from the complex to the simple: the recursive algorithm expresses a more complex object through a simpler one of the same type. Typically a recursive algorithm contains the following main parts:

– the termination condition (base case);

– the body of the recursion, which includes the actions to be performed at each iteration;

– the recursion step, in which the recursive algorithm calls itself.

A distinction is made between direct and indirect recursion. In the first case, the algorithm contains a function that calls itself. If a function calls another function, which in turn calls the first one, then such a function is called indirectly recursive.

The main requirement for recursive algorithms is that the recursive process must not be infinite. In other words, a check for termination of the calls must be implemented, or the recursive definition must contain a restriction under which further recursive calls stop.

An example of a recursive function is calculating the factorial of a number.

int factorial(int n)
{
    if (n) return n * factorial(n - 1);
    else return 1;
}

Example of a recursive procedure:

procedure Rec(a: integer);
begin
  if a > 0 then Rec(a - 1);
  writeln(a);
end;

Let's consider what happens if a call of the form Rec(3) is made in the main program. Rec(3) calls Rec(2), which calls Rec(1), which calls Rec(0); for a = 0 the condition a > 0 fails, so Rec(0) prints 0 and returns, after which the suspended calls print 1, 2 and 3 in turn.




Computational complexity theory is a branch of the theory of computation that studies the amount of work required to solve a computational problem.

A problem is considered difficult if solving it requires a large amount of resources, regardless of the algorithm used to solve it. The theory formalizes this intuitive concept by introducing mathematical models of computation to study these problems and by quantifying the amount of resources needed to solve them, such as time and memory used. Other measures of complexity are also possible, such as the number of messages (communication complexity), the number of gates in a circuit of functional elements (circuit complexity) and the number of processors. In particular, computational complexity theory defines practical limits on what computers can and cannot do.

Closely related to computational complexity theory are the analysis of algorithms and computability theory. The main difference between computational complexity theory and algorithm analysis is that the latter is concerned with analyzing the amount of resources required by a particular algorithm to solve a problem, while the former asks a more general question about all possible algorithms that could be used to solve the same problem. More precisely, computational complexity theory attempts to classify problems that can or cannot be solved with an appropriate amount of limited resources. In turn, the introduction of restrictions on the available resources is what distinguishes computational complexity theory from computability theory: the latter asks which problems can, in principle, be solved algorithmically, without limiting computational resources.

Computational problems

Problem instances

Computational problems can be viewed as an infinite set of pairs (problem instance, solution for that instance). The input string of a computational problem is a string describing the problem instance. The output string of a computational problem is a description of the solution for the problem instance described by the input string. For example, the problem of recognizing whether a number is prime: an instance of the problem is a number for which it must be determined whether it is prime or not; the solution is the string "yes" if the number is prime and "no" otherwise. Computational complexity theory considers only mass problems, i.e. the requirement of an infinite set of problem instances is mandatory.

Problem representation

When considering computational problems, the description of a problem instance is a string over an alphabet. As a rule, the alphabet is taken to be binary (i.e. the set {0,1}). Various mathematical objects must be encoded accordingly. For example, integers can be represented in binary notation, and graphs can be encoded directly via their adjacency matrices or via adjacency lists encoded in binary.

Recognition problems

Recognition problems are one of the central objects of research in the theory of computational complexity. A recognition problem is a special type of computational problem to which the answer is either “yes” or “no” (1 or 0). The recognition problem can be formulated as a problem of whether an input string belongs to a certain subset (language) of the set of all input strings. An input problem string belongs to the corresponding language if and only if the answer to that string is “yes”. Thus, the recognition task is the task of recognizing whether an input string belongs to a certain language.

An example of a recognition problem. Input string: a description of an arbitrary graph. The problem is to decide whether the given graph is connected or not. The language of connected graphs is the set of descriptions of all connected graphs. To obtain a precise definition of this language, one has to decide how graphs are encoded as binary strings.

Search problems

A search task is a computational task where the output value is more complex than in a recognition task (that is, it is not simply “yes” or “no”).

An example of a search problem is the traveling salesman problem, one of the most famous combinatorial optimization problems. The task is to find the most profitable route that passes through the specified cities at least once and then returns to the starting city. The conditions of the problem specify the criterion of route profitability (shortest, cheapest, an aggregate criterion, etc.) and the corresponding matrices of distances, costs, etc. As a rule, it is required that the route pass through each city only once; in this case the choice is made among Hamiltonian cycles. Input string: a description of a weighted graph (i.e. a graph with numerical labels on the edges). Output string: a description of the optimal route of the traveling salesman.

There is a close relationship between recognition problems and search problems: a search problem can be reformulated as a recognition problem. For example, for the search problem "multiply two numbers," the corresponding recognition problem can be represented as the set of triples (A, B, C) such that the relation A × B = C holds.

Measuring complexity

The theory of computational complexity arose from the need to compare the performance of algorithms and clearly describe their behavior (execution time, amount of memory required, etc.) depending on the size of the input and output.

The number of elementary operations spent by an algorithm to solve a specific instance of a problem depends not only on the size of the input data, but also on the data itself. For example, the number of operations of the insertion sort algorithm is much less if the input data is already sorted. To avoid such difficulties, consider the concept of worst-case time complexity of an algorithm.
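To make this dependence concrete, here is a generic textbook insertion sort in C (a sketch, not code from any particular source): on an already sorted array the inner loop exits immediately, so only about N comparisons are made, while on a reverse-sorted array about N^2/2 element shifts are performed.

void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        /* on already sorted input this loop body is never entered */
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];   /* shift the larger element one position to the right */
            j--;
        }
        a[j + 1] = key;
    }
}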

The time complexity of an algorithm (in the worst case) is a function of the size of the input and output data, equal to the maximum number of atomic operations performed by the algorithm to solve a problem instance of a specified size. In problems where the size of the output is not greater than or proportional to the size of the input, one can view the time complexity as a function of the size of the input only.

Similarly to the concept of time complexity in the worst case, the concept of time complexity of an algorithm in the best case is defined. We also consider the concept of the average operating time of an algorithm, that is, the mathematical expectation of the operating time of the algorithm. Sometimes they say simply: "Time complexity of an algorithm" or "Running time of an algorithm", meaning the time complexity of an algorithm in the worst, best or average case (depending on the context).

By analogy with time complexity, the spatial complexity of the algorithm is determined, only here we are not talking about the number of elementary operations, but about the amount of memory used.

Despite the fact that the time complexity function of an algorithm can in some cases be determined exactly, in most cases it is pointless to look for its exact value. The point is that, firstly, the exact value of the time complexity depends on the definition of elementary operations (for example, complexity can be measured in the number of arithmetic operations or Turing machine steps), and secondly, as the size of the input data grows, the contribution of constant factors and lower-order terms appearing in the expression for the exact running time becomes extremely insignificant.

Consideration of large input data and estimation of the order of growth of the algorithm's running time leads to the concept of asymptotic complexity of the algorithm. Moreover, an algorithm with lower asymptotic complexity is more efficient for all input data, with the possible exception of small data.

Complexity is determined based on the computational model in which the calculations are performed.

Computational models

There are many different models of computing: the Post machine, the Minsky machine, lambda calculus, partially recursive functions, normal Markov algorithms, random access memory machines (RAM machines), etc. We will only mention the most popular computing model - the Turing machine.

Turing machine

A Turing machine (TM) is an abstract executor (an abstract computing machine). It was proposed by Alan Turing in 1936 to formalize the concept of an algorithm.

A Turing machine is an extension of a finite state machine and, according to the Church-Turing thesis, is capable of simulating all other executors (by specifying transition rules) that somehow implement the process of step-by-step calculation, in which each step of the calculation is quite elementary.

A Turing machine consists of a tape that is infinite in both directions (Turing machines with several infinite tapes are also possible), divided into cells, and a control device that can be in one of a set of states. The number of possible states of the control device is finite and fixed in advance.

The control device can move left and right along the tape, and read and write symbols of some finite alphabet in the cells of the tape. A special blank symbol is designated; it fills all cells of the tape except those (a finite number of them) in which the input data is written.

The control device operates according to transition rules that represent the algorithm implemented by a given Turing machine. Each transition rule instructs the machine, depending on the current state and the symbol observed in the current cell, to write a new symbol into this cell, move to a new state and move one cell to the left or right. Some states of a Turing machine can be marked as terminal, and a transition to any of them means the end of the work, stopping the algorithm.
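One possible way to represent such transition rules in C (purely illustrative, not a formalization used in this text) is as a table of records:

typedef enum { MOVE_LEFT, MOVE_RIGHT } Move;

typedef struct {
    int  state;       /* current state of the control device */
    char read;        /* symbol observed in the current cell */
    char write;       /* symbol to write into this cell      */
    Move move;        /* move one cell to the left or right  */
    int  next_state;  /* new state (possibly a terminal one) */
} Rule;

/* a tiny hypothetical program: in state 0 replace every '1' by '0' while moving right,
   and switch to the terminal state 1 on the blank symbol '_' */
static const Rule rules[] = {
    { 0, '1', '0', MOVE_RIGHT, 0 },
    { 0, '_', '_', MOVE_RIGHT, 1 },
};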

A Turing machine is said to be deterministic if each combination of state and tape symbol corresponds to at most one rule in the table. If there is a pair (tape symbol, state) for which there are two or more instructions, such a Turing machine is called non-deterministic.

The Turing machine model allows for various extensions. One can consider Turing machines with an arbitrary number of tapes and multi-dimensional tapes with various restrictions; machines that use a source of randomness.

The Turing machine is one of the main models of computation in complexity theory.

Complexity classes

Complexity classes are sets of computational problems that are roughly equal in computational complexity. There are language complexity classes and functional complexity classes. A complexity class of languages is a set of predicates (functions that take a word as input and return the answer 0 or 1) that use roughly the same amount of resources for computation. The concept of a functional complexity class is similar, except that it is a set of functions rather than predicates. In complexity theory, by default a complexity class means a complexity class of languages. A typical definition of a complexity class looks like this:

The complexity class X is the set of predicates P(x), computable on Turing machines and using O(f(n)) resources for calculation, where n is the length of the word x.

The resources are usually taken to be the computation time (the number of working steps of the Turing machine) or the working space (the number of cells used on the tape during the work). Languages recognized by predicates from some class (that is, the sets of words on which the predicate returns 1) are also said to belong to that class.

Additionally, many classes can also be described in terms of mathematical logic or game theory.

Classes are usually denoted by capital letters. The complement of a class C (that is, the class of languages whose complements belong to C) is denoted co-C.

For each class there is a category of problems that are "most difficult". This means that any problem from the class reduces to such a problem, and moreover, the problem itself lies in the class. Such problems are called complete problems for the given class.

Class P

Class P (from the English polynomial) is a set of recognition problems that can be solved on a deterministic Turing machine in a time polynomial in the length of the input. Similarly, for search problems the class FP (from the English functional polynomial) is defined.

More formally, consider deterministic Turing machines that calculate the answer given a word from the input alphabet given on the input tape. The operating time of a Turing machine for a fixed input word x is the number of working cycles of a Turing machine from the start to the stop of the machine. The complexity of a function calculated by some Turing machine is a function that depends on the length of the input word and is equal to the maximum operating time of the machine for all input words of a fixed length:

\(T_M(n) = \max_{|x| = n} t_M(x)\), where \(t_M(x)\) denotes the running time of the machine M on the input word x.

If for a function f there is a Turing machine M computing it such that \(T_M(n) \leq n^c\) for some number c and all sufficiently large n, then f is said to belong to the class FP, i.e. to be polynomial in time.

Class P is one of the fundamental classes in the theory of computational complexity.

Class NP

The class NP (from the English non-deterministic polynomial) is the set of recognition problems whose solution time may depend significantly on the size of the input data; at the same time, there exists an algorithm that, having received, along with a description of the input values, some additional information (a witness of the solution), can solve the problem quite quickly (in time not exceeding a polynomial in the size of the data).

More formally, a language L is said to belong to the class NP if there exist a two-place predicate R(x, y) from the class P (that is, computable in polynomial time) and a polynomial p such that for any word x of length n the condition "x belongs to L" is equivalent to the condition "there exists y of length less than p(n) such that R(x, y) is true". The word y is called a witness of the membership of x in the language L. Thus, if we have a word that belongs to the language and a witness word of bounded length (which may be hard to find), then we can quickly verify that x indeed belongs to L. Any problem in NP can be solved in exponential time by enumerating all possible witnesses of length less than p(n).

Example problem from NP: recognition problem “Existence of an integer solution to a system of linear inequalities.” The witness is the solution to the system of inequalities. It is easy to verify in polynomial time that the witness solution is suitable.
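A sketch in C of such a polynomial-time check, under the purely illustrative assumption that the system is given as an m×n integer matrix A and a vector b, and the witness is a candidate integer vector x with the condition A·x <= b componentwise:

#include <stdbool.h>

/* Checks whether the candidate solution x satisfies A*x <= b componentwise.
   A is stored row by row; the check takes O(m*n) operations. */
bool is_witness(int m, int n, const int *A, const int *b, const int *x) {
    for (int i = 0; i < m; i++) {
        long long s = 0;
        for (int j = 0; j < n; j++)
            s += (long long)A[i * n + j] * x[j];
        if (s > b[i])
            return false;   /* the i-th inequality is violated */
    }
    return true;
}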

The NP class includes the P class.

Open Issues

In the theory of computational complexity there are many unsolved problems; they mainly concern questions of the separation or inclusion of particular complexity classes. One of these questions is the problem of the equality of the classes P and NP.

Problem of equality of classes P and NP

In essence, the problem of the equality of the classes P and NP is this: if a positive answer to a question can be verified quickly (in polynomial time), is it true that the answer to this question can also be found quickly (in polynomial time)?

From the definition of the classes P and NP the inclusion \(P \subseteq NP\) follows immediately. However, nothing is yet known about the strictness of this inclusion, i.e. whether there is a problem that lies in NP but not in P. If no such problem exists, then all problems belonging to the class NP can be solved in polynomial time, which promises enormous computational benefits. At present the hardest problems of the class NP (the so-called NP-complete problems) can be solved only in exponential time, which is almost always unacceptable.

The question of the equality of these two classes is considered one of the most difficult open problems in the field of theoretical computer science. Currently, most mathematicians believe that these classes are not equal. The Clay Mathematics Institute included this problem in the list of the Millennium Problems, offering a reward of one million US dollars for its solution.


To evaluate the effectiveness of an algorithm, the most important indicators are:

- the algorithm's execution time,
- the required amount of RAM.

Nowadays, thanks to half a century of technological progress, the first indicator (execution time) is often much more important than the second, so in what follows we will consider only it in detail.

Simplifications for estimating execution time of algorithms


In the works of D. Knuth the following approach to analyzing the execution time of algorithms was proposed: the total time is composed of the products cost * frequency for each basic operation. Basic operations may include addition, multiplication, division, fetching an element by index from an array, comparing integers, etc. It is easy to see that computing an estimate of the execution time of an algorithm in this way is quite tedious. Therefore A. Turing noted that it is convenient to use even rough approximations of the execution time: one can assign weights to various operations depending on how often they occur during the work of the algorithm and take into account only the operations with the highest weights. For example, when multiplying matrices one should take into account only operations such as multiplication and writing of numbers, because these are the most frequent operations. Considering only the most frequently occurring operations is the first simplification proposed for the approximate calculation of the execution time of algorithms.

The second simplification is to discard lower-order terms (summands) that contribute little to the total running time of the algorithm. For example (here and below the number N characterizes the size of the input data),

\(1/6 N^3 + 20N + 16 \sim 1/6N^3\),

instead of \(1/6\,N^3\) they write "this algorithm has complexity \(O(N^3)\)", and instead of \(3N^4\) they write "this algorithm has complexity \(O(N^4)\)".
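A quick sanity check of why the discarded terms do not matter: for \(N = 1000\) we have \(\frac{1}{6}N^3 \approx 1.7 \cdot 10^8\), while the discarded part \(20N + 16 = 20016\), i.e. roughly 0.01% of the total.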

Definition of big O

We say that \(f\) is "big O" of \(g\) as \(x \to x_0\) if there is a constant \(C>0\) such that for all \(x\) in a neighborhood of the point \(x_0\) the inequality \(|f(x)| \leq C|g(x)|\) holds. The definition can be pictured on a graph whose \(x\) axis is the size of the input data and whose \(y\) axis is the execution time of the algorithm: starting from a certain point, as the input size tends to \(\infty\), \(f(n)\) grows no faster than \(C g(n)\), so \(g(n)\) essentially bounds it from above.

Examples. \(1 = O(N), N = O(N^2).\)

Along with estimates of the form \(O(N)\), the estimate \(\Omega(N)\) ("big omega") is used. It denotes a lower bound on the growth of the function. For example, let the number of operations of an algorithm be described by the function \(f(N)=\Omega(N^2)\). This means that even in the most favorable case at least on the order of \(N^2\) actions will be performed, while the estimate \(O(N^3)\) guarantees that in the worst case no more than on the order of \(N^3\) actions will be performed. The estimate \(\Theta(N)\) is also used; it is an upper and lower asymptotic estimate at the same time, applicable when \(O(N)\) and \(\Omega(N)\) coincide. So, \(O(N)\) is an approximate estimate of the algorithm on the worst input data, \(\Omega(N)\) on the best input data, and \(\Theta(N)\) is a shorthand notation for identical \(O(N)\) and \(\Omega(N)\).

Running time estimates for different algorithms

Let us denote T(N) as the execution time of the algorithm. Let the algorithm under study have the form:

1. a set of instructions, including only basic operations:

Statement 1;
...
statement k;

Then T(N) = T(statement 1) + ... + T(statement k).

Since each statement includes only basic operations, the execution time of this piece of code does not depend on the size of the input data (does not grow as the input grows), i.e. it is constant. This algorithm has complexity O(1).

2. if-else statements

if (condition) {
    sequence of statements 1
}
else {
    sequence of statements 2
}

Here either sequence of statements 1 or sequence of statements 2 will be executed; therefore, since we want to estimate the worst-case execution time, T(N) = max(T(sequence of statements 1), T(sequence of statements 2)). For example, if the execution time of sequence of statements 1 is O(N) and that of sequence of statements 2 is O(1), then T(N) = O(N).

3. Loops.

for (i = 0; i < N; i++) {
    sequence of statements
}

Since the loop will be executed N times, the sequence of statements will also be executed N times. If T(sequence of statements) = O(1), then T(N) = O(N)*O(1) = O(N).

4. Nested loops.

for (i = 0; i < N; i++) {
    for (j = 0; j < M; j++) {
        ...
    }
}

The outer loop is executed N times. Each time the outer loop is executed, the inner loop is executed M times, so the complexity of this fragment is O(N*M).

Now consider this code:

for (i = 0; i < N; i++) {
    for (j = i + 1; j < N; j++) {
        sequence of statements
    }
}

Let's look at the change in the number of iterations of the inner loop depending on the iteration of the outer loop.

i (outer loop)     iterations of the inner loop
0                  N-1
1                  N-2
2                  N-3
...
N-1                0

Then the sequence of statements will be executed (N-1) + (N-2) + ... + 1 times. To quickly compute such sums, formulas from mathematical analysis are useful, in this case the formula \(1 + 2 + \dots + m = \frac{m(m+1)}{2}\), which gives \((N-1) + (N-2) + \dots + 1 = \frac{N(N-1)}{2}\).

That is, this algorithm will have complexity \(O(N^2)\).

Other formulas that are often needed in such cases include, for example, the sum of squares \(1^2 + 2^2 + \dots + N^2 = \frac{N(N+1)(2N+1)}{6}\) and the sum of a geometric progression \(1 + x + x^2 + \dots + x^N = \frac{x^{N+1}-1}{x-1}\) (for \(x \neq 1\)).

5. When a statement includes a method call, the estimate of the statement's execution time is calculated taking into account the estimate of the method's execution time. For example:

for (j = 0; j < N; j++) {
    method(j);   /* a call to some method (hypothetical name) whose running time is g(N) */
}
If the execution time of the method is \(g(N)=O(N)\), then \(T(N) = O(N)*O(N) = O(N^2)\).

6. Binary search.

int l = 0;
int u = A.length - 1;
int m;
while (l <= u) {
    m = l + (u - l) / 2;
    if (A[m] < k) {
        l = m + 1;
    }
    else if (A[m] == k) {
        return m;
    }
    else {
        u = m - 1;
    }
}
return -1;

Binary search allows you to find the index of the number k in a sorted array; if this number is not in it, then -1 is returned. First we compare k with the number in the middle of the array. If k is less than this number, then we must then look for it in the left half of the array, if it is more, then in the right half. Next, k is compared with the number located in the middle of the half of the array selected at the previous step, etc. With each iteration, the search space is reduced by half. The question arises: how many iterations will need to be done in the worst case (that is, when a number equal to k is never found in the array and there is no data left for comparison).

We see that after the 1st iteration \(N/2\) elements remain to be searched for the index of \(k\), after the 2nd iteration \(N/4\) remain, after the 3rd \(N/8\), and so on. We find the number of iterations in the worst case by solving the equation \(\frac{N}{2^x}=1\). This equation is equivalent to \(2^x=N\), hence \(x=\log_2 N\) (see the definition of the logarithm). Therefore, the complexity estimate of the binary search algorithm is \(O(\log N)\).
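For a sense of scale (a rough illustrative estimate): for a sorted array of \(N = 10^6\) elements, binary search needs at most about \(\log_2 10^6 \approx 20\) iterations, whereas a straightforward linear scan may need up to \(10^6\) comparisons.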

The good news is that to characterize the running time of most algorithms just a few functions are enough: \(1, \log N, N, N\log N, N^2, N^3, 2^N\). The graph of these functions illustrates the different rates of growth of an algorithm's execution time depending on the size of the input data:

From this graph it is clear, in particular, that if the running time of an algorithm is "logarithmic", i.e. the algorithm has complexity \(O(\log N)\), this is very good, because its running time grows very slowly as the size of the input data increases; if the running time depends linearly on the size of the input, that is also not bad; but algorithms with exponential running time (\(O(2^N)\)) are better avoided, or used only on very small inputs.

Classes P and NP

A real non-negative function \(f(m)\), defined for positive integer values of the argument, is called polynomially bounded if there is a polynomial \(P(m)\) with real coefficients such that \(f(m) \leq P(m)\) for all \(m \in N^+\). Problems for which there exist algorithms with "polynomial" running time belong to the class P (these problems are mostly solved quickly and without difficulty).

Formal definition. A language L belongs to the class P if and only if there exists a deterministic Turing machine M such that:

- for any input data, M finishes its work in polynomial time,
- for all \(x \in L\), M outputs the result 1,
- for all \(x\) not belonging to \(L\), M outputs the result 0.

Problems of the class NP are problems that satisfy the following condition: if an answer (a possible solution) is given, it is easy to verify it, i.e. to check whether it really is a solution.

Let's consider an example of a problem from the class NP. Suppose a set of integers is given, for example (-7, -3, -2, 5, 8). We need to find out whether there is a non-empty subset of these numbers that adds up to 0. In this case the answer is "yes" (for example, such a subset is (-3, -2, 5)). As the size of the set of integers grows, the number of its subsets grows exponentially. Meanwhile, if we are given one such subset (it is also called a certificate), we can easily check whether the sum of its elements equals 0.
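A sketch in C of how cheaply such a certificate is checked (the function name and the fixed example are illustrative):

#include <stdbool.h>
#include <stdio.h>

/* verifying a certificate: just sum its elements, O(k) operations */
bool check_certificate(const int *subset, int k) {
    long long s = 0;
    for (int i = 0; i < k; i++)
        s += subset[i];
    return s == 0;
}

int main(void) {
    int certificate[] = { -3, -2, 5 };
    printf("%s\n", check_certificate(certificate, 3) ? "yes" : "no");   /* prints "yes" */
    return 0;
}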

Formal definition:

A language L belongs to the class NP if and only if there exist polynomials \(p\) and \(q\) and a deterministic Turing machine M such that:

- for any \(x, y\), the machine M on the input \((x, y)\) runs in time \(p(|x|)\),
- for any \(x \in L\) there is a string \(y\) of length \(q(|x|)\) such that \(M(x,y)=1\),
- for any \(x\) not in \(L\) and any string \(y\) of length \(q(|x|)\), \(M(x,y)=0\).

Polynomial reducibility, or Karp reducibility. The function \(f_1\) reduces to the function \(f_2\) if there is a function \(f\) computable in polynomial time such that \(f_1(x) = f_2(f(x))\) for any \(x\).


A problem T is called NP-complete if it belongs to the class NP and any other problem from NP can be reduced to it in polynomial time. Perhaps the most famous example of an NP-complete problem is SAT (from the word satisfiability). Suppose we are given a formula containing Boolean variables, the operators "AND", "OR" and "NOT", and parentheses. The problem is: can the values false and true be assigned to all variables occurring in the formula so that the formula takes the value true?
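Checking a proposed assignment against a concrete formula is again cheap; here is a sketch in C for the hypothetical formula (x1 OR NOT x2) AND (x2 OR x3):

#include <stdbool.h>

/* evaluates (x1 || !x2) && (x2 || x3) for a given assignment of the variables */
bool formula(bool x1, bool x2, bool x3) {
    return (x1 || !x2) && (x2 || x3);
}

/* e.g. formula(true, true, false) == true, so this assignment satisfies the formula */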

A problem T is called NP-hard if there exists an NP-complete problem that reduces to T in polynomial time. Cook reducibility is meant here: a Cook reduction of a problem \(R_1\) to \(R_2\) is a polynomial-time algorithm that solves \(R_1\) provided that a function finding a solution of \(R_2\) is given to it as an oracle, i.e. calling it takes only one step.

The possible relationships between the classes described above are either \(P \subsetneq NP\) or \(P = NP\) (scientists are still not sure whether P and NP coincide).

