Thursday, October 8, 2015

LeetCode - Wildcard Matching

Link
https://leetcode.com/problems/wildcard-matching/

Problem
Implement wildcard pattern matching with support for '?' and '*'.

'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "*") → true
isMatch("aa", "a*") → true
isMatch("ab", "?*") → true
isMatch("aab", "c*a*b") → false

Solution

Approaches to the Problem

#1

The general approach is to loop through each character in s and p and compare them.  Break out of the loop once the end is reached for either s or p.

- If the characters match, then move on to the next set of characters.

- If the pattern character is '?', then move on to the next set of characters.

- If the pattern character is a wildcard '*', then call the function recursively with the wildcard character removed.  If the recursive call returns true, then the candidate string and pattern matches.  If the recursive call returns false, then move on to the next character in the candidate string.

- If the characters don't match, then return false.

Finally, return true if the end has been reached for both s and p.  Otherwise, return false if the end has not been reached for either s or p.

#2

The first approach works, but has an incredibly inefficient runtime complexity of O(2^n).  This is caused by the fact that whenever a wildcard is encountered, it performs a recursive call and possibly recomputes the same exact s and p multiple times.  Due to the O(2^n) time complexity, LeetCode threw a 'Time limit exceeded" error.

This problem is analogous to the fibonacci problem where the same computation is performed many times.

To solve the issue of duplicate computations, we can use a technique called memoization.  By creating a lookup table to store failed matches, we can avoid going down the same rabbit hole by checking if s and p has already failed to match.

#3

The second approach significantly improves runtime complexity, but LeetCode still threw a 'Time limit exceeded" error on the inputs:

s = "aaaaabbbbbababbabbbbaaabbaabaaabaabbbbabaaaabaaabbbbbbabbaaaaabbbaaababbaaabaabbaabbbbaaabaabbbaabbbbabbabaabbabbbbaabbaabbabaababababbaabbabbbaaabaaababaaaabbaaaaabbbaaaabbbababbabaaaabbbaabaabbaaaabaaab"

p = "*aab***aaab**a*b**b*aa***baabababb*abbbbb**a**ba******b**b*a*ab*b***aa*****a*aba*a***aa*b****a*a*****"

Upon closer inspection, we can infer that the contiguous wildcards was causing the algorithm to branch out wildly.  To solve this issue, we can remove contiguous wildcards from p so it looks like

p = "*aab*aaab*a*b*b*aa*baabababb*abbbbb*a*ba*b*b*a*ab*b*aa*a*aba*a*aa*b*a*a*"

Conclusion

By combining the three approaches above, LeetCode accepted my submission.


This was one of the more challenging problems I have encountered, and justifiably so because the solution acceptance rate is 15.8% (13th lowest out of 273 problems on the site) and its difficulty is rated 'Hard'.



The real world application of the wildcard matching problem includes anything that uses text search, some of which are: Google search, Microsoft Word, text editors, etc.

Saturday, May 9, 2015

Computational Time Complexity of Algorithms

Computer science problems often have multiple solutions/algorithms for a given problem.  For example, what algorithm would you use to find the largest number in an array of integers?  You could sort the array and get the largest number in O(n log(n)) time, or simply iterate through the array and maintain a variable that keeps track of the largest number in O(n) time.

We usually want to use the optimal (most efficient) algorithm because it makes our software run faster.  In order to find the optimal algorithm, it helps to understand the major time complexity 'classes' and the common problems and algorithms that fit within each class.

O(1)
  • Single operation, e.g. setting a value or performing a math operation
  • Lookup/delete nth element from an array
  • Lookup/insert/delete element from a hash table (average case, where there are few collisions)
O(log n)
  • Continuously splitting a data set or number of operations in half
  • Binary search
  • Search/insert/delete element from a binary search tree (average case, where tree is balanced)
  • Search/insert/delete element from a min/max heap
O(n)
  • Iterating through an entire data set
  • Search/insert/delete from a linked list
O(n log n)
  • Merge sort, quick sort, heap sort
O(n^2)
  • Looking at all pairs of items
  • Selection sort, insertion sort, bubble sort
  • 0/1 knapsack problem using dynamic programming
O(2^n)
  • Continuously doubling a data set or number of operations
  • Naive fibonacci
  • Travelling salesman problem using dynamic programming
O(n!)
  • Generating all permutations of a set
  • Naive travelling salesman problem

Sunday, February 22, 2015

Objection Orientation: Polymorphism, Abstraction, Encapsulation, and Inheritance

Developers who are new to object-oriented programming often struggle to understand the long-winded explanations that are scattered across the interwebs.  The intent of this post is to ease the pain (hopefully) by summarizing the main concepts of object orientation in a concise and precise format by providing a definition, two examples, and a brief explanation of its benefits.

Polymorphism

Definition:  Polymorphism is the concept where a parent type can have many sub-types.

Example #1:  ArrayList and LinkedList are sub-types of the parent type List.

Example #2:  Car, Motorcyle, and Bus are sub-types of the parent type Vehicle.

Benefits:  Polymorphism allows you to use a single parent type to interact with all possible sub-types.  It also helps to decouple your code.  When you are working with a List object, you can freely swap out the underlying implementation from an ArrayList to a LinkedList and vice-versa.

Abstraction

Definition:  Abstraction is the generalization of common functionality across different objects.

Example #1:  List is an abstraction of ArrayList and LinkedList because it generalizes the common functionality of adding and removing elements.

Example #2:  Vehicle is an abstraction of Car, Motorcyle, and Bus because it generalizes the common functionality of accelerating and braking.

Benefits:  Abstraction allows you to interact with different objects via a common interface.

Encapsulation

Definition:  Encapsulation is the packing (hiding) of data and implementation details into a class or function.

Example #1:  Data and implementation details about a car are encapsulated (packed, hidden) into a Car class.  Access to its data can be controlled via public, protected, and private modifiers.

Example #2:  The functionality to accelerate and brake a car is encapsulated (packed, hidden) into functions.  Access to these functions can be controlled via public, protected, and private modifiers.

Benefits:  Encapsulation allows you to hide implementation details and protect against mis-use by other classes.  It also helps to decouple your code.  For example, I could modify the formula in the accelerate function without breaking any code dependencies on that function.

Inheritance

Definition:  Inheritance is the re-use of a base class's functionality by a sub-class.

Example #1:  Car is a sub-class that inherits the functionality of its base class Vehicle, including the data (currentHeading, velocity, weight) and functions (turn, accelerate, brake).

Example #2:  A Car can turn because it inherited the functionality from its base class Vehicle.

Benefits:  Inheritance allows you to re-use functionality and reduce duplicate code.

Polymorphism, abstraction, encapsulation, and inheritance are distinctly different concepts, but together they support the paradigm of object orientation.  I hope this post cleared up any misunderstandings of these four pillars object orientation!