Repeated String Match (Rabin Karp Algorithm)

Hero Image

Problem Overview

The LeetCode problem "Repeated String Match" presents an interesting challenge involving string operations. The goal is to determine the minimum number of times a string A has to be repeated such that another string B is a substring of the repeated string. If no such number exists, the function should return -1.

Understanding the Problem

Let's break down the problem:

  • You have two strings, A and B.
  • You need to determine how many times you need to concatenate A with itself so that B can be found as a substring in the concatenated result.
  • If B cannot be a substring, output -1.

Example: Given A = "abc" and B = "cabcabca", you need at least 4 repetitions of A to form "abcabcabca" which contains B as a substring.

Key DSA Concepts and Theory

String Operations

At the core of this problem are basic string operations, specifically:

  • Concatenation: Joining two strings end to end.
  • Substring Search: Checking if a string is part of another string.

Repetition and Substring Theory

When searching for B within a repeated A, understanding the length relationship is crucial:

  • If B is longer than A, it's a given that A must be repeated at least len(B) / len(A) times to potentially contain B.

Efficient Searching

This problem involves efficiently searching for a substring within a larger string. Despite the brute force approach of repeatedly concatenating A and checking for B, efficient substring search operations (such as those in C++ and Java's standard libraries) can significantly optimize this solution.

Solution Approach

To solve the problem, we will follow these steps:

  1. Estimate Number of Repeats: Initially calculate the minimum necessary repetitions based on the ratio of lengths (B length divided by A length).

  2. Construct and Search: Concatenate A the estimated number of times, plus an additional time to consider overlapping cases. Check if B is a substring of this concatenated result.

  3. Return the Result: Return the number of times A was repeated if B is found, otherwise return -1.

C++ Solution

#include <iostream>
using namespace std;

int repeatedStringMatch(string A, string B) {
    int count = 1;
    string repeated = A;
    
    // Continue adding A until it's length is at least as large as B
    while (repeated.size() < B.size()) {
        repeated += A;
        count++;
    }
    
    // Check if B is a substring of the current repeated
    if (repeated.find(B) != string::npos) return count;
    
    // Try once more with one more repetition
    repeated += A;
    count++;
    if (repeated.find(B) != string::npos) return count;
    
    return -1;
}

int main() {
    string A = "abc";
    string B = "cabcabca";
    cout << repeatedStringMatch(A, B) << endl; // Output: 4
    return 0;
}

Java Solution

public class Solution {
    public int repeatedStringMatch(String A, String B) {
        int count = 1;
        String repeated = A;
        
        // Keep appending A until repeated size is at least B's size
        while (repeated.length() < B.length()) {
            repeated += A;
            count++;
        }
        
        // Check if B is a substring of the current repeated
        if (repeated.indexOf(B) != -1) return count;
        
        // Try once more with an extra repetition
        repeated += A;
        count++;
        if (repeated.indexOf(B) != -1) return count;
        
        return -1;
    }

    public static void main(String[] args) {
        Solution solution = new Solution();
        String A = "abc";
        String B = "cabcabca";
        System.out.println(solution.repeatedStringMatch(A, B)); // Output: 4
    }
}

Time and Space Complexity Analysis

Time Complexity

The time complexity of the solution is O(n * m), where n is the number of characters in A and m is the number of characters in B. This is primarily due to the find (or indexOf) function, which needs to compare significant portions of the string.

Space Complexity

The space complexity is O(n * k), where k is the number of times A is concatenated. This complexity accounts for the space required to store the concatenated string.

Common Mistakes to Avoid

  • Insufficient Repetition: Not repeating A enough times to check all possible overlaps. Always consider one extra repetition after reaching the minimum necessary length.
  • Substring Search Overhead: Using inefficient methods to check substring existence can lead to timeouts, especially for large inputs.

Similar Problems on LeetCode

  1. Simplified String Search Problems
  2. Implement strStr() - LeetCode #28
  3. Rotate String - LeetCode #796

Additional Resources and References