Issue
I was introduced to an example interview question, where given a 2D array of characters and a word, find the word in the array, where the next character of the word is either to the right or below the previous character, and return a list of the co-ordinates of each character found.
I wrote the following code to solve this:
co_ords = []
def init_recursion(word, grid):
co_ords.clear()
return find_word(word, grid, 0, 0, True, 0)
def find_word(word, grid, x, y, explore, pos):
word_len = len(word) - pos
if word_len == 0:
return True
x_size = len(grid[0])
y_size = len(grid)
if word_len > (x_size - x) + (y_size - y):
return False
try:
char_on = grid[y][x]
if word[pos] == char_on:
co_ords.append((y, x))
if find_word(word, grid, x + 1, y, False, pos + 1):
return True
elif find_word(word, grid, x, y + 1, False, pos + 1):
return True
else:
co_ords.pop()
if word_len - 1 > (x_size - x) + (y_size - y):
return False
if explore and find_word(word, grid, x + 1, y, True, 0):
return True
else:
return explore and find_word(word, grid, x, y + 1, True, 0)
except IndexError:
return False
# Valid grid and word combinations:
grid1 = [
['c', 'c', 'x', 't', 'i', 'b'],
['c', 'c', 'a', 't', 'n', 'i'],
['a', 'c', 'n', 'n', 't', 't'],
['t', 'c', 's', 'i', 'p', 't'],
['a', 'o', 'o', 'o', 'a', 'o'],
['o', 'a', 'a', 'a', 'o', 'o'],
['k', 'a', 'i', 'c', 'k', 'i']
]
word1 = 'catnip' >>> [(1, 1), (1, 2), (1, 3), (2, 3), (3, 3), (3, 4)]
word2 = 'cccc' >>> [(0, 0), (0, 1), (1, 1), (2, 1)] (one choice of many)
word3 = 's' >>> [(3, 2)]
word4 = 'bit' >>> [(0, 5), (1, 5), (2, 5)]
My complexity analysis for this is as follows:
Where:
- R = number of rows
- C = number of columns
- W = characters in word
The find_word
function will loop through the entire 2D grid RC – W2 times, and for each exploration call, could recuse up to Wlog(W) times attempting to find a match. There would therefore be a total of Wlog(W)(RC – W2) recursive calls at the worst case.
Hence my questions:
- One, have I got the complexity analysis of the above correct?
- Two, in big O notation (worst case time complexity), how would this be represented?
I think that it could be one of the following:- O(RCW(log(W)) – W3)
- O(RCW(log(W)))
But I’m not sure which – normally the smaller term would be ignored in big O, however I’m not sure, as W is present in both the n3 and n3log(n) terms.
Solution
I disagree with your search depth being O(Wlog(W)). In the worst case, where every intermediate letter (except the last) matches the word, your recursion will have a branching factor of 2, and will explore 2^W
paths. Take, as an example, an extremely large grid, filled entirely with A
s, and a sole B
in the lower right corner, and a search word AAA...AB
. You can see how that would explode exponentially (with respect to W).
You are correct too, in that you’d ignore the smaller term, so overall I think the time complexity should be O(RC*2^W).
Answered By – Dillon Davis
Answer Checked By – Pedro (BugsFixing Volunteer)