
C++ Programming for Anagram Substring Search (Or Search for all permutations)

October 15, 2017 2 Min Read


Given a text txt[0..n-1] and a pattern pat[0..m-1], write a function search(char pat[], char txt[]) that prints all occurrences of pat[] and its permutations (or anagrams) in txt[]. You may assume that n > m.
The expected time complexity is O(n).

Examples:

1) Input:  txt[] = "BACDGABCDA"  pat[] = "ABCD"
   Output:   Found at Index 0
             Found at Index 5
             Found at Index 6
2) Input: txt[] =  "AAABABAA" pat[] = "AABA"
   Output:   Found at Index 0
             Found at Index 1
             Found at Index 4

This problem differs slightly from the standard pattern searching problem because we must also find anagrams of the pattern. Therefore, we cannot directly apply standard pattern searching algorithms such as KMP, Rabin-Karp, or Boyer-Moore.

A simple idea is to modify the Rabin-Karp algorithm. For example, we can keep the hash value as the sum of the ASCII values of all characters in the window, modulo a big prime number. For every new character of the text, we add the current character to the hash value and subtract the first character of the previous window. This solution looks good, but as with standard Rabin-Karp, its worst-case time complexity is O(mn). The worst case occurs when every window's hash matches the pattern's hash, so every window must be verified character by character.
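
As a rough sketch of this additive-hash idea, the code below keeps the window hash as a sum of character codes modulo a prime and updates it in O(1) per shift. The names `sumHash`, `sumHashSearch` and the modulus `Q` are illustrative assumptions, not part of the original article. Verification here sorts the window for brevity, which costs more than a plain character-by-character check, so treat it only as a demonstration of why a hash match cannot be trusted on its own.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative prime modulus (an assumption; any large prime would do).
const long long Q = 1000000007LL;

// Sum of the character codes in s[pos, pos + len), modulo Q.
long long sumHash(const std::string& s, std::size_t pos, std::size_t len) {
    long long h = 0;
    for (std::size_t i = pos; i < pos + len; ++i)
        h = (h + static_cast<unsigned char>(s[i])) % Q;
    return h;
}

// Returns the start indices of windows of txt that are anagrams of pat.
std::vector<int> sumHashSearch(const std::string& pat, const std::string& txt) {
    std::vector<int> found;
    const std::size_t m = pat.size(), n = txt.size();
    if (m == 0 || m > n) return found;

    std::string sortedPat = pat;
    std::sort(sortedPat.begin(), sortedPat.end());

    const long long hp = sumHash(pat, 0, m);
    long long hw = sumHash(txt, 0, m);
    for (std::size_t i = 0; ; ++i) {
        if (hw == hp) {
            // Equal sums do not imply equal multisets, so verify the window.
            std::string w = txt.substr(i, m);
            std::sort(w.begin(), w.end());
            if (w == sortedPat) found.push_back(static_cast<int>(i));
        }
        if (i + m >= n) break;
        // Slide the window: drop txt[i], add txt[i + m].
        hw = (hw - static_cast<unsigned char>(txt[i])
                 + static_cast<unsigned char>(txt[i + m]) + Q) % Q;
    }
    return found;
}
```

For instance, "AD" and "BC" have the same sum (65 + 68 = 66 + 67 = 133), so their hashes match even though neither is an anagram of the other. Collisions like this are why every hash match needs a verification pass.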


We can achieve O(n) time complexity under the assumption that the alphabet size is fixed. This is typically true: an 8-bit character has at most 256 possible values (plain ASCII defines 128 of them). The idea is to use two count arrays:

1) The first count array stores the frequencies of characters in the pattern.
2) The second count array stores the frequencies of characters in the current window of text.
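
To make the two count arrays concrete, here is a minimal sketch that builds a 256-entry count array for a slice of a string and compares two such arrays. The names `Counts`, `countChars` and `isAnagramWindow` are hypothetical helpers introduced for illustration. This version recounts the window from scratch, so it costs O(m) per window; it only shows the comparison, not the sliding update.

```cpp
#include <array>
#include <string>

// One counter per possible 8-bit character value.
using Counts = std::array<int, 256>;

// Counts the characters in s[pos, pos + len).
Counts countChars(const std::string& s, std::size_t pos, std::size_t len) {
    Counts c{};  // value-initialized: all counters start at zero
    for (std::size_t i = pos; i < pos + len; ++i)
        ++c[static_cast<unsigned char>(s[i])];
    return c;
}

// True if the window of txt starting at pos is an anagram of pat.
bool isAnagramWindow(const std::string& pat, const std::string& txt,
                     std::size_t pos) {
    if (pos + pat.size() > txt.size()) return false;
    // std::array's operator== compares all 256 counters element by element.
    return countChars(pat, 0, pat.size()) == countChars(txt, pos, pat.size());
}
```

With txt = "BACDGABCDA" and pat = "ABCD", the window at index 0 ("BACD") compares equal, while the window at index 1 ("ACDG") does not.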

The important thing to note is that comparing two count arrays takes O(1) time, because the number of elements in them is fixed (independent of the pattern and text sizes). The steps of the algorithm are as follows.
1) Store the character frequencies of the pattern in the first count array, countP[]. Also store the character frequencies of the first window of text in the array countTW[].

2) Now run a loop from i = M to N-1. Do the following in the loop.
…..a) If the two count arrays are identical, we have found an occurrence.
…..b) Increment the count of the current character of the text in countTW[].
…..c) Decrement the count of the first character of the previous window in countTW[].

3) The last window is not checked by the above loop, so check it explicitly.


The following is a C++ implementation of the above algorithm.

```cpp
// C++ program to search all anagrams of a pattern in a text
#include <iostream>
#include <cstring>
#define MAX 256
using namespace std;

// Returns true if the contents of arr1[] and arr2[] are the same,
// otherwise false.
bool compare(const int arr1[], const int arr2[])
{
    for (int i = 0; i < MAX; i++)
        if (arr1[i] != arr2[i])
            return false;
    return true;
}

// Searches for all permutations of pat[] in txt[]
void search(const char *pat, const char *txt)
{
    int M = (int)strlen(pat), N = (int)strlen(txt);
    if (M > N)
        return; // no window of length M fits in the text

    // countP[]:  counts of all characters of the pattern
    // countTW[]: counts of characters in the current window of text
    // int counters avoid overflow for long patterns, and indexing
    // through unsigned char keeps every index within 0..MAX-1.
    int countP[MAX] = {0}, countTW[MAX] = {0};
    for (int i = 0; i < M; i++)
    {
        countP[(unsigned char)pat[i]]++;
        countTW[(unsigned char)txt[i]]++;
    }

    // Traverse through the remaining characters of the text
    for (int i = M; i < N; i++)
    {
        // Compare counts of the current window of text with
        // counts of the pattern
        if (compare(countP, countTW))
            cout << "Found at Index " << (i - M) << endl;

        // Add the current character to the current window
        countTW[(unsigned char)txt[i]]++;

        // Remove the first character of the previous window
        countTW[(unsigned char)txt[i - M]]--;
    }

    // Check the last window in the text
    if (compare(countP, countTW))
        cout << "Found at Index " << (N - M) << endl;
}

/* Driver program to test the above function */
int main()
{
    char txt[] = "BACDGABCDA";
    char pat[] = "ABCD";
    search(pat, txt);
    return 0;
}
```

Output:

Found at Index 0
Found at Index 5
Found at Index 6
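
To check the second example from the problem statement as well, one option is a variant that collects the indices instead of printing them. The sketch below follows the same three steps; the name `findAnagrams` and the use of `std::string`, `std::array` and `std::vector` are assumptions made for illustration, not the article's own interface.

```cpp
#include <array>
#include <string>
#include <vector>

// Returns the start index of every window of txt that is an anagram of pat.
std::vector<int> findAnagrams(const std::string& pat, const std::string& txt) {
    std::vector<int> found;
    const std::size_t m = pat.size(), n = txt.size();
    if (m == 0 || m > n) return found;

    // Step 1: count the pattern and the first window of the text.
    std::array<int, 256> countP{}, countTW{};
    for (std::size_t i = 0; i < m; ++i) {
        ++countP[static_cast<unsigned char>(pat[i])];
        ++countTW[static_cast<unsigned char>(txt[i])];
    }

    // Step 2: compare, then slide the window one character to the right.
    for (std::size_t i = m; i < n; ++i) {
        if (countP == countTW) found.push_back(static_cast<int>(i - m));
        ++countTW[static_cast<unsigned char>(txt[i])];
        --countTW[static_cast<unsigned char>(txt[i - m])];
    }

    // Step 3: the loop never checks the last window, so check it here.
    if (countP == countTW) found.push_back(static_cast<int>(n - m));
    return found;
}
```

Calling `findAnagrams("AABA", "AAABABAA")` yields the indices 0, 1 and 4, matching the second example, and `findAnagrams("ABCD", "BACDGABCDA")` yields 0, 5 and 6.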


Venkatesan Prabu
