Repeated String Match (Rabin Karp Algorithm)

Problem Overview
The LeetCode problem "Repeated String Match" presents an interesting challenge involving string operations. The goal is to determine the minimum number of times a string A
has to be repeated such that another string B
is a substring of the repeated string. If no such number exists, the function should return -1
.
Understanding the Problem
Let's break down the problem:
- You have two strings,
A
andB
. - You need to determine how many times you need to concatenate
A
with itself so thatB
can be found as a substring in the concatenated result. - If
B
cannot be a substring, output-1
.
Example:
Given A = "abc"
and B = "cabcabca"
, you need at least 4 repetitions of A
to form "abcabcabca"
which contains B
as a substring.
Key DSA Concepts and Theory
String Operations
At the core of this problem are basic string operations, specifically:
- Concatenation: Joining two strings end to end.
- Substring Search: Checking if a string is part of another string.
Repetition and Substring Theory
When searching for B
within a repeated A
, understanding the length relationship is crucial:
- If
B
is longer thanA
, it's a given thatA
must be repeated at leastlen(B) / len(A)
times to potentially containB
.
Efficient Searching
This problem involves efficiently searching for a substring within a larger string. Despite the brute force approach of repeatedly concatenating A
and checking for B
, efficient substring search operations (such as those in C++ and Java's standard libraries) can significantly optimize this solution.
Solution Approach
To solve the problem, we will follow these steps:
Estimate Number of Repeats: Initially calculate the minimum necessary repetitions based on the ratio of lengths (
B
length divided byA
length).Construct and Search: Concatenate
A
the estimated number of times, plus an additional time to consider overlapping cases. Check ifB
is a substring of this concatenated result.Return the Result: Return the number of times
A
was repeated ifB
is found, otherwise return-1
.
C++ Solution
#include <iostream>
using namespace std;
int repeatedStringMatch(string A, string B) {
int count = 1;
string repeated = A;
// Continue adding A until it's length is at least as large as B
while (repeated.size() < B.size()) {
repeated += A;
count++;
}
// Check if B is a substring of the current repeated
if (repeated.find(B) != string::npos) return count;
// Try once more with one more repetition
repeated += A;
count++;
if (repeated.find(B) != string::npos) return count;
return -1;
}
int main() {
string A = "abc";
string B = "cabcabca";
cout << repeatedStringMatch(A, B) << endl; // Output: 4
return 0;
}
Java Solution
public class Solution {
public int repeatedStringMatch(String A, String B) {
int count = 1;
String repeated = A;
// Keep appending A until repeated size is at least B's size
while (repeated.length() < B.length()) {
repeated += A;
count++;
}
// Check if B is a substring of the current repeated
if (repeated.indexOf(B) != -1) return count;
// Try once more with an extra repetition
repeated += A;
count++;
if (repeated.indexOf(B) != -1) return count;
return -1;
}
public static void main(String[] args) {
Solution solution = new Solution();
String A = "abc";
String B = "cabcabca";
System.out.println(solution.repeatedStringMatch(A, B)); // Output: 4
}
}
Time and Space Complexity Analysis
Time Complexity
The time complexity of the solution is O(n * m), where n
is the number of characters in A
and m
is the number of characters in B
. This is primarily due to the find
(or indexOf
) function, which needs to compare significant portions of the string.
Space Complexity
The space complexity is O(n * k), where k
is the number of times A
is concatenated. This complexity accounts for the space required to store the concatenated string.
Common Mistakes to Avoid
- Insufficient Repetition: Not repeating
A
enough times to check all possible overlaps. Always consider one extra repetition after reaching the minimum necessary length. - Substring Search Overhead: Using inefficient methods to check substring existence can lead to timeouts, especially for large inputs.
Similar Problems on LeetCode
- Simplified String Search Problems
- Implement strStr() - LeetCode #28
- Rotate String - LeetCode #796
Additional Resources and References
- C++ string::find Reference
- Java String.indexOf() Method
- LeetCode Discussion Forum for community-contributed insights on various approaches.