Modular Multiplicative Inverse

Let us start with few facts and know importance of this topic. While using mod (%) operator,

(a + b) % m = a%m + b%m
(a * b) % m = a%m * b%m

But this relation doesn’t satisfy when performing division.

(a / b) % m ≠ a%m / b%m

Being a commonly used operator, this problem has to be overcome. i.e. to apply mod operator on a division operation. Extended Euclidean algorithm comes to rescue when b,m are co-primes.

When b and m are co-primes, find MI = modular inverse of b over m. This modular inverse can be used to find quotient of division applied to modulus.
i.e

(a / b) % m = a%m * MI%m

Here is the C++ function for finding modular multiplicative inverse using extended euclidean:

int modInv(int b, int m)
{
	int r1 = m,r2 = b;
	int t1 = 0,t2 = 1;
	int q,r,t;
	while(r2 > 0)
	{
		q = r1/r2;
		r = r1 - q*r2;
		r1= r2;
		r2= r;
		t = t1 - q*t2;
		t1 = t2;
		t2 = t;
	}
	if(r1 == 1)
		return t1;
	else
	{
		printf("Inverse doesn't exist");
		return -1;
	}
}

Inverse doesn’t exist only in the cases when b and m are not coprimes.

Fermet’s Little Theorem:
This theorem is saviour to overcome the difficulty in able to remember the Extended Euclidean algorithm. In most of the problems on competitive programming, MOD value is generally a constant prime, usually with a value of 10^9 + 7. Fermet’s Little theorem states that inverse of a number ‘A’ over mod ‘N’, when N is prime is modPow(A,N-2,N). i.e.

A-1 = AN-2 % N

Though the value of N is very large, you can always use Modular Exponentiation Algorithm to solve it efficiently.

Modular Exponentiation Algorithm

In this post I will be discussing a method to find (ab)%MOD. The problem is simple, we have been doing it from long time to find ab. But as b increases, we may think of writing a code for it. Say a simple method like this:

long long int modPow(long long int m, long long int n, long long int mod)
{
    long long int p = 1;
    for(int i = 0;i < n;i++) 
    { 
        p *= m; 
        p %= mod; 
    } 
    return mod; 
}  

Above mentioned function solves the need. But is not efficient. Complexity of the function is O(n) and hence cannot find required answer for large values of b or n(for n values > 108) in reasonable time. This can be optimised using Modular Exponentiation Algorithm. This algo solves the problem in O(log n) complexity. Here is the function:

 
long long int fastModPow(long long int a, long long int b, long long int mod) 
{ 
    long long int ans = 1; 
    while(b > 0)
    {
        if(b%2 == 1)
            ans = (ans*a)%mod;
        a = (a*a)%mod;
        b /= 2;
    }
    return ans;
}

How this function works?
Let us solve for 550. Leave the modulus as it is just an additional operation that is done commonly at each step.

bin(50) = 110010

Initial values    : ans = 1                      a = 5                      b = 50
Step 1            : ans = 1                      a = 5*5                      b = 25
Step 2            : ans = 1 * 52               a = 52 * 52               b = 12
Step 3            : ans = 52                      a = 54 * 54             b = 6
Step 4            : ans = 52                      a = 58 * 58              b = 3
Step 5            : ans = 52 * 516            a = 516 * 516            b = 1
Step 6            : ans = 518 * 532          a = 532 * 532            b = 0

We are left with ans = 550 which is the desired result. As we move from LSB to MSB in binary form of exponent, compare the respective steps shown above. The calculations followed are different for the bit values 1 and 0.

Largest submatrix of all 1’s in a binary matrix

Problem Statement : Given a binary matrix(matrix that have only 1’s and 0’s), find the largest rectangular submatrix that have only 1’s.

Explanation:
We need to construct a matrix of histograms from given matrix. The height of histogram at a cell(H[i][j]) would be number of consecutive 1’s above the current cell along with current cell itself.

Ex: For the binary matrix

0 0 0 0 1
1 1 0 0 1
0 0 1 1 1
0 0 1 1 1
1 1 1 1 1

Histogram matrix would be

0 0 0 0 1
1 1 0 0 2
0 0 1 1 3
0 0 2 2 4
1 1 3 3 5

Now, we need to traverse the matrix and calculate the maximum area possible for each histogram. For this to understand, please refer this page. Observe the graphs shown in the link to understand how to calculate the maximum possible rectangular area in histograms. In our histogram matrix, each row can be considered as a separate histogram and should keep track of maximum area possible for the complete matrix.

Code :

#include <bits/stdc++.h>

using namespace std;

int maxArea(int a[][100],int m,int n)
{
	int area = 0,count,temp;
	/*
	 *Finds maximum area of rectangle from the histograms build.
	*/
	for(int i = 0;i < m;i++)
	{
		for(int j = 0;j < n;j++)
		{
			if(a[i][j] > 0)
			{
				count = a[i][j];
				temp = j-1;
				while(temp >= 0 && a[i][temp] >= a[i][j])
				{
					count+=a[i][j];
					temp--;
				}
				temp = j+1;
				while(temp < n && a[i][temp] >= a[i][j])
				{
					count += a[i][j];
					temp++;
				}
				if(count > area)
					area = count;
			}
		}
	}
    return area;
}

int solve(int a[][100],int m,int n)
{
	int temp = 0;
	/*
	 *Sets a[i][j] to consecutive 1's above it till that cell.
     *It is like building histograms on each row with size of histogram at 
     *an index of a row as number of consecutive 1's till that index.
    */
    for(int i = 0;i < n;i++)
	{
		for(int j = 0;j < m;j++)
		{
			if(a[i][j] == 1)
			{
				if(i >= 1)
					a[i][j] = a[i-1][j] + 1;
				else
					a[i][j] = 1;
			}
		}
	}
    return maxArea(a,m,n);
}

int main()
{
	int m,n;
	scanf("%d %d",&m,&n);
	int a[100][100];
	for(int i = 0;i < m;i++)
	{
		for(int j = 0;j < n;j++)
		{
			scanf("%d",&a[i][j]);
		}
	}
	printf("%d\n",solve(a,m,n));
	return 0;
}
Ex :
Matrix A[][]

5 5
0 1 0 1 1
0 1 0 0 1
1 1 1 1 1
0 1 1 1 0
0 0 1 0 1

Histogram matrix H[][] will be

0 1 0 1 1
0 2 0 0 2
1 3 1 1 3
0 4 2 2 0
0 0 3 0 1

When calculated for maximum area in histogram, the cells H[3][2] and H[3][3] give count = 6 and is the maximum area over any other cell in matrix H. In the code, I wrote values of histogram matrix in given matrix(A) itself so that no extra space is used to solve the problem.

For any doubts on this topic, comment below.

Maximum number of 1’s in a submatrix of given area in a binary matrix

Firstly, let me make the statement clear. You will be given a binary matrix. A binary matrix is one which only consists of 1’s and 0’s. You will be asked to find a maximum number of 1’s in rectangular sub-matrix of a given area.

Naive Approach:
Find all possible rectangles for given area.
For all possible rectangles in given matrix, find number of 1’s in given area.

(I will be discussing for a square matrix of dimensions NxN. Similar solution for any rectangular matrix can be constructed)

for(i = 1;i <= N;i++)
{
	if(area%i == 0 && area/i <= N) 
	{//(area/i) should be less than N else the rectangle goes is out of NxN
		x = i;
		y = area/i;
		for(j = 1;j <= N;j++)
		{
			for(k = 1;k <= N;k++)
			{
				if(j+x-1 <= N && k+y-1 <= N)
				{//checking if other ends lie in range of matrix dimensions
					temp = 0;
					for(l = j; l <= j+x-1 ; l++)
					{//here I find number of 1's in the rectangle
						for(m = k;m <= k+y-1; m++)
						{
							if(a[l][m] == 1)
								temp++;
						}
					}
					if(temp > max)
						max = temp;
				}
				if(j+y-1 <= N && k+x-1 <= N)
				{//this is another rectangle of same dimensions but different orientation
					temp = 0;
					for(l = j; l <= j+y-1 ; l++)
					{
						for(m = k;m <= k+x-1; m++)
						{
							if(a[l][m] == 1)
								temp++;
						}
					}
					if(temp > max)
						max = temp;
				}
			}
		}
	}
}
printf("%d\n",max);

This approach takes a long time to give output for even a values of N around 100. Let us now learn a better approach to this problem.

Solving using Dynamic programming :
We can construct a 2d DP where dp[x][y] has number of 1’s in submatrix a[i][j] where 1 ≤ i ≤ x and 1 ≤ j ≤ y.
The DP can be constructed by this formula

dp[i][j] = 	dp[i-1][j] + dp[i][j-1] - dp[i-1][j-1]	    ,if a[i][j] = 0
		dp[i-1][j] + dp[i][j-1] - dp[i-1][j-1] + 1  ,if a[i][j] = 1

In single statement,

dp[i][j] =     dp[i-1][j] + dp[i][j-1] - dp[i-1][j-1] + a[i][j]

Understanding the formula : For dp[x][y], we add (number of 1’s in sub-matrix a[i][j], where 1 ≤ i ≤ x and 1 ≤ j ≤ y-1) and (number of 1’s in sub-matrix a[i][j], where 1 ≤ i ≤ x-1 and 1 ≤ j ≤ y).
In doing this we add a sub-matrix a[i][j], where 1 ≤ i ≤ x-1 and 1 ≤ j ≤ y-1 twice. So we subtract that value from the sum.
And lastly, we check if the value at a[i][j] is 1 or not and add the value to sum.

Now we find all possible dimensions of the rectangles, and find number of 1’s in all these rectangles. Return maximum of values found as answer.

How to find number of 1’s in a rectangle now? For example, you are at index (x,y) and dimensions of rectangle is (r,s), number of 1’s in that rectangle would be,

dp[x+r-1][y+s-1] - (dp[x+r-1][y-1] + dp[x-1][y+s-1]) + dp[x-1][y-1].

This is much similar as the way in DP formula that we discussed earlier.

void constructDP(int a[][1001],int dp[][1001])
{
	for (int i = 0; i <= n; ++i)
	{
		dp[0][i] = 0;
		dp[i][0] = 0;
	}
	for (int i = 1; i <= n; ++i)
	{
		for(int j = 1;j <= n; ++j)
		{
			dp[i][j] = dp[i-1][j] + dp[i][j-1] - dp[i-1][j-1] + a[i][j];
		}
	}
}

int solve(int a[][1001],int dp[][1001],int n)
{
	constructDP(a,dp);
	ans = INT_MAX;
	for (int i = 1; i <= n; ++i)
	{
		if(count%i == 0 && count/i <= n)
		{
			x = i;
			y = count/i;
			for(int j = 1;j <= n;j++)
			{
				for(int k =1; k <= n;k++)
				{
					if((j+x-1 <= n) && (k+y-1 <= n))
						ans = max(ans, dp[j+x-1][k+y-1]-dp[j-1][k+y-1]-dp[j+x-1][k-1]+dp[j-1][k-1]);
					if((j+y-1 <= n) && (k+x-1 <= n))
						ans = max(ans, dp[j+y-1][k+x-1]-dp[j-1][k+x-1]-dp[j+y-1][k-1]+dp[j-1][k-1]);
				}
			}
		}
	}
	return ans;
}

Using the function, you can note that, with a O(N2) pre-computation, we reduce a query of finding number of 1’s in a rectangle every time from O(N2) to O(1).

Problems related with this concept : SWAPMATR

Heap Sort Algorithm

There are a numerous sorting algorithms. Among them Heap Sort is one which has a consistent complexity for any input. For any of worst-case, best-case, reverse order heap sort has O(nlogn) runtime which is considerably best complexity for sorting.

To perform heap sorting, the given array is first constructed into heap. Heap is a tree-based data structure where there is a relation for all parent-child nodes. There are two heaps namely “max heap” and “min heap”. In max heap, parent node has a value greater than its children nodes. Similarly, in min heap, parent node has a value lesser than its children nodes.

For heap sorting we use complete binary max-heap which mean the heap data structure is a complete binary tree.

Let us here construct the required heap in an array instead of making a tree. For constructing a heap, for every new element we take in, we need to compare it with a set of already existing elements in the array. Let us first look its function.

#define swapAB(a,b) a=a+b-(b=a)

void heapify(int a[],int heap[],int n)
{
    int i = 1,temp;
    heap[0] = a[0];
    for(;i < n;++i)
    {
		heap[i] = a[i];
		temp = i;
		while(heap[(temp-1)/2] < heap[temp])
		{
			swapAB(heap[(temp-1)/2],heap[temp]);
			temp = (temp-1)/2;
		}
	}
}

Let a[] is the input array for which heap is to be constructed. heap[] will be initially empty and consists pre-order traversal of heap tree after execution of this function. n is number of elements in input array. In the construction, each new element that is entering heap[] is compared to its immediate parent node. If value of parent node is less than current node, values are swapped and current node is shifted to parent node. The process is continued until the swapping takes place( or till parent node has greater value than child node) or till it reaches root node. Parent node of heap[i] would be heap[(i-1)/2] as we follow 0-based indexing.

If you now see the heap, we could notice that root node(heap[0]) is the maximum value of the array. In sorting the inputs, we take this fact and swap first index with last index and omit the last index in further calculations as we gave the last position to largest number. We need to make sure that remaining heap, has the heap conditions satisfied i.e. root node value is greater than children node values. As we changed only one node in remaining heap, i.e. heap[0], we need to start checking only for that node till it reaches to its correct position.

Heap sort implementation in C++.

#include <bits/stdc++.h>

using namespace std;

#define swapAB(a,b) a=a+b-(b=a)

void printA(int a[],int n) //prints array
{
	int i = 0;
	while(i < n)
		cout<<a[i++]<<" ";
	cout<<endl;
}

void heapify(int a[],int heap[],int n)  //constructs heap
{
    int i = 1,temp;
    heap[0] = a[0];
    for(;i < n;++i)
    {
		heap[i] = a[i];
		temp = i;
		while(heap[(temp-1)/2] < heap[temp])
		{
			swapAB(heap[(temp-1)/2],heap[temp]);
			temp = (temp-1)/2;
		}
	}
}

void heapsort(int heap[],int n)
{
	int temp,m;
	while(n!=1)
	{
		swapAB(heap[0],heap[n-1]);
		n=n-1;
		temp = 0;
        while(1)
        {
			if((2*temp + 2) < n)
			{
				m = max(heap[2*temp + 1],heap[2*temp + 2]);
				if(heap[temp] < m)
				{
                    if(heap[2*temp + 1] == m)
					{
						swapAB(heap[temp],heap[2*temp + 1]);
						temp = 2*temp + 1;
					}
					else
					{
						swapAB(heap[temp],heap[2*temp + 2]);
						temp = 2*temp + 2;
					}
				}
				else goto endwhile;
			}
			else if(2*temp + 1 < n)
			{
				if(heap[temp] < heap[2*temp + 1])
				{
					swapAB(heap[temp],heap[2*temp + 1]);
					temp = 2*temp + 1;
				}
				else goto endwhile;
			}
			else
				break;
        }
        endwhile: {}
	}
}

int main()
{
    int n;
    cout<<"No. of elements in Array : ";
    scanf("%d",&n);
    int i = 0;
    int a[n],heap[n];
    cout<<"Enter elements : ";
    for(;i < n;++i)
		scanf("%d",&a[i]);
    heapify(a,heap,n);
    cout<<"Heap : ";
    printA(heap,n);  //prints array
    heapsort(heap,n);
    cout<<"Sorted Array : ";
    printA(heap,n);
	return 0;
}

Maximum Subarray Sum algorithm

This algorithm is used to find the maximum sum of a contiguous sub array in a given array of integers. If the set of integers include only positive integers, maximum sum would be sum of all the elements of array. But when the array include negative integers too, the problem comes of how to solve it efficiently.

A naive approach would be finding sum of elements of all sub arrays.

MaxSubarraySumNaive(a[])
    max = 0
    for i = 1 to n
        for j = i to n
            temp = 0
            for k = i to j
                temp = temp + a[i]
            if temp > max
                max = temp
    return max

That’s an O(n3) solution which is not at all appreciable. So let’s learn this algo which could solve this problem with an O(n) complexity.

MaxSubarraySum(a[])
    max_sum = 0
    temp = 0
    for i = 1 to n
        temp = max( 0 , temp+a[i] )
        max_sum = max( max_sum , temp )
    return max_sum

This short function solves the problem in real quick time. Working of the function with an example is shown below

For increasing i,
Array elements  : 1 -2 8 -1  6  3 -12  4 -16  5 10
temp values     : 1  0 8  7 13 16   4  8   0  5 15
max_sum values  : 1  1 8  8 13 16  16 16  16 16 16

Fast Doubling method to find nth Fibonacci number

One among very common questions asked in maths category in competitive programming is Fibonacci Series.

For a question that asks to find nth term of Fibonacci series, a naive approach to solve is an iterative method like

#define MOD 1000000007
long long int fib(long long int n)
{
    if(n < 2)
        return n;
    long long int a = 0,b = 1,ans;
    int i = 1;
    while(i < n)
    {
        ans = (a+b) % MOD;
        a = b;
        b = ans;
        i++;
    }
    return ans;
}

Above function has an O(n) complexity. With all our patience we may use it to calculate for at most n = 10^9 which gives output in around 10-15 seconds.

But as n gets larger, it takes hours,days,months,years,decades and so on for increasing n.

So the question is can we optimise it? Do we have methods to find nth Fibonacci number in less than a second?

Yes. We have few methods to do this. Out of them matrix exponentiation is most commonly used concept. Another well known concept is fast doubling method, which we are going to learn now.

Fast doubling is based on two formulae

F(2n) = F(n)[2*F(n+1) – F(n)]
F(2n + 1) = F(n)2 + F(n+1)2

Let us consider n starts from 0 and F(0) = 0. So our Fibonacci series would be F(0) = 0, F(1) = 1, F(2) = 1, F(3) = 2, F(4) = 3, F(5) = 5, F(6) = 8 …

For calculating terms, F(2n) & F(2n + 1), lets have a record of F(n) & F(n+1). By following this statement and taking care of equations and data type overflow, code would be as follows,

#include <bits/stdc++.h>

using namespace std;

#define MOD 1000000007;
long long int a,b,c,d;

void fast_fib(long long int n,long long int ans[])
{
    if(n == 0)
    {
        ans[0] = 0;
        ans[1] = 1;
        return;
    }
    fast_fib((n/2),ans);
    a = ans[0];             /* F(n) */
    b = ans[1];             /* F(n+1) */
    c = 2*b - a;
    if(c < 0)
        c += MOD;
    c = (a * c) % MOD;      /* F(2n) */
    d = (a*a + b*b) % MOD;  /* F(2n + 1) */
    if(n%2 == 0)
    {
        ans[0] = c;
        ans[1] = d;
    }
    else
    {
        ans[0] = d;
        ans[1] = c+d;
    }
}

int main()
{
    long long int n;        /* nth value to be found */
    scanf("%lld",&n);
    long long int ans[2]={0};
    fast_fib(n,ans);
    printf("%lld\n",ans[0]);
    return 0;
}

This code has a complexity of O(log n) which is way too faster than previously discussed function.