Index of the First Occurrence in a String (using KMP algo / LPS(pi) array)

Problem Overview
The problem at hand is the classic "Implement strStr()" which is a common question that tests understanding of string manipulation and searching algorithms. The challenge is to implement a function that finds the needle (substring) in the haystack (string) and returns the index of its first occurrence. It’s essentially the same as using the indexOf
method provided in most programming languages.
URL: Implement strStr() on LeetCode
Understanding the Problem
Given two strings, needle
and haystack
, the goal is to return the index of the first occurrence of needle
within haystack
. If needle
is an empty string, return 0
. If needle
does not occur in haystack
, return -1
.
Example 1:
Input: haystack = "hello", needle = "ll" Output: 2
Example 2:
Input: haystack = "aaaaa", needle = "bba" Output: -1
Key DSA Concepts and Theory
1. String Manipulation
Understanding how strings work in terms of indexes and operations like iteration, slicing, or accessing substrings is crucial.
2. Searching Algorithms
The problem can be tackled using a simple linear scan (brute force), but more efficient string-searching algorithms like the Knuth-Morris-Pratt (KMP) algorithm could be employed for better performance.
3. Edge Cases
Consider cases where needle
is an empty string, which by definition should return 0
, or when needle
's length is greater than haystack
.
Solution Approach
The problem can be solved using a straightforward linear scan approach. Although simple, it is effective unless optimization is necessary.
Step-by-Step Approach
Linear Scan
Initialization:
- First, handle the special case where
needle
is an empty string. In such cases, directly return0
.
- First, handle the special case where
Iterate Through Haystack:
- Start from the very first character of
haystack
and take substrings of the same length asneedle
.
- Start from the very first character of
Compare Substrings:
- For each substring, compare it to
needle
. If they match, return the starting index of this substring.
- For each substring, compare it to
Return Result:
- If the loop completes without finding
needle
, return-1
.
- If the loop completes without finding
Code Implementation
C++
class Solution {
public:
int strStr(string haystack, string needle) {
if (needle.empty()) return 0; // Special case
int m = haystack.length();
int n = needle.length();
for (int i = 0; i <= m - n; ++i) {
if (haystack.substr(i, n) == needle) {
return i;
}
}
return -1;
}
};
Java
class Solution {
public int strStr(String haystack, String needle) {
if (needle.isEmpty()) return 0; // Special case
int m = haystack.length();
int n = needle.length();
for (int i = 0; i <= m - n; i++) {
if (haystack.substring(i, i + n).equals(needle)) {
return i;
}
}
return -1;
}
}
Time and Space Complexity Analysis
- Time Complexity: O(m * n), where
m
is the length ofhaystack
andn
is the length ofneedle
. In the worst case, for each character inhaystack
, we might compare up ton
characters ofneedle
. - Space Complexity: O(1), excluding the space required to store input or output. The use of additional space is constant.
Common Mistakes to Avoid
- Forgetting to handle the empty
needle
case at the beginning can lead to incorrect results. - Not checking the bounds properly, especially when taking substrings, which might lead to
IndexOutOfBounds
errors, especially in languages like Java.
Similar Problems on LeetCode
Other questions that involve string searching and manipulation:
- LeetCode 796: Rotate String
- LeetCode 28: Find the Index of the First Occurrence in a String (variation of the current problem)
Additional Resources and References
This article delves into understanding and effectively implementing the "Implement strStr()" problem, ensuring a deep grasp of string manipulation and search algorithms, potentially broadening your understanding for more advanced algorithmic challenges.