Sunday, June 12, 2011

How to Build a Balanced Binary Search Tree From an Array

Problem
Create a balanced binary search tree from an array of n elements.

Solution
It's always easier to have an example to follow along with, so let the array below be our case study.


The first step is to sort the array.  The next step is to load the tree with these elements.  A simple way to do it is to use binary search with recursion, except that we're adding an element to the tree instead of 'searching' for it.

  /**
   * Create binary search tree.
   * @param array of elements
   * @return The root of the tree.
   */
  public Node makeTree(String[] array) {
    int low = 0;
    int high = array.length - 1;
    return makeTree(low, high, array);
  }

  /**
   * Create binary search tree.
   * @param low The lowest array position.
   * @param high The highest array position.
   * @param array of elements
   * @return The root of the tree.
   */
  private Node makeTree(int low, int high, String[] array) {
    if (low > high) {
      return null;
    }
    else {
      // Same as (low + high) / 2
      int mid = (low + high) >>> 1;
      Node node = new Node(array[mid]);
      node.left = makeTree(low, (mid - 1), array);
      node.right = makeTree((mid + 1), high, array);
      return node;
    }
  }

The function recursively adds all the elements in the left subtree and then all the elements in the right subtree.  Note that it works for edge cases where the input size is zero or one.  The resulting balanced binary search tree is


Now the next question is, how do we check that the solution is correct?  For this trivial example, we can verify it just by looking at the tree.  We can see that every child on the left is less than its parent, and every child on the right is greater than its parent.

For a large input with hundreds or thousands of elements, it would be too time consuming to draw and visually verify the tree like we did above.  The solution to this proceeding problem will be in a future blog post which will cover preorder, inorder, and postorder tree traversals.

Sunday, June 5, 2011

Why Every Student Should Take A Software Engineering Course

Software development is still a relatively young practice.  Unlike traditional engineering where things are built to spec most of the time, we are still figuring out how to write applications that do not irradiate people to death with high-powered electron beamscause wrong organs to be removed from donors, or plunge the stock market by 600 points in 15 minutes.  Software engineering attempts to solve these problems by adopting the traditional engineering practices of defining a systematic, disciplined, and quantifiable approach towards the development of software.


So why should a student majoring in computer science take a software engineering course before graduating?  As a student, you most likely blew through your lower level classes like data structures, discrete mathematics, and algorithms without a clue of why or how any of it is relevant.  The problem with traditional academia is that they teach a whole lot of theory without a whole lot of real-world application.  If your university offers a software engineering course like mine, take the opportunity to put theory into actual practice while you're still in school and before you join the workforce.  You will learn how to develop a high quality software system in a team-oriented environment, and all the tools and concepts that come with it: 

1. Version control
Version control is essential for the organization of multi-developer projects.  The benefits of version control are revision tracking and source code management.  Revision tracking records any changes to the project and attaches a timestamp to it.  This is useful for tracking down bugs and it offers a chance to revert back to a previous version in case of a critical error.

Source code management facilitates the merging of changes to a project.  When multiple developers are working on different sections of code, source control (also known as version control) automatically merges their work together, or solves the problem that arises when their work involves the same lines of code.

The two most popular version control systems today are Subversion and Git.

2. Testing
Would you buy a car without taking it for a test drive?  Would you purchase a cool new smartphone without reading reviews on it?  Would you test your next foothold if you were rock climbing without a rope?  I think you get the idea.  Testing does not guarantee that something works perfectly, but it is a good indicator of its reliability, especially if other people have already used it.  If many people have used ProductX, and many people have said ProductX was good, then by inference, ProductX is good.

In software, there are different types of testing.  The most important types are unit and usability testing.  Unit testing verifies that a function is working correctly by returning the expected value.  However, it does not prove correctness because the expected value may not be the correct value.  This is classified as a logical error on the programmer's part.  Two popular testing frameworks are JUnit for Java, and NUnit for C#.  There are other tools like Jacoco that show you which sections of code have been run, thereby testing those sections of code.


While unit testing verifies whether your code is up to par, usability testing verifies if you made good design choices.  Would-be users literally use your software, point out problems, and suggest improvements. This is arguably the most critical type of testing because if would-be users are happy with your pre-release software, then they will also be happy with your post-release software.

3. Build systems
Build systems automate a variety of tasks such as: compiling source code into binary code, packaging binary code into an executable, deploying executable into a production system, creating documentation, and most importantly, running automated tests.  A developer should be able to start a build in one step, or in other words, with one line in the terminal or one click of a button.  If the build passes all phases of the process without any errors, then the software can be considered to be in a good, clean, shippable condition.  Joel Spolsky explains more in depth the benefits of a build system here.

Two popular build systems are Apache Maven and Apache Ant.

4. Software development lifecycle
A software engineer is responsible for the entire life cycle of a software project.  This includes interacting with the client to gather requirements, designing a specification to meet those requirements, writing code that conforms to the specifications, testing the software, and maintaining the software post-release.

Requirements gathering and design tend to be the most difficult phases because it requires a software engineer to fully understand what the client is asking for, in addition to the technologies and skills required to produce a solution.
5. Project management
Contrary to popular belief, most software developers are not socially inept Asperger's geeks who sit in front of a computer and code all day.  In a sufficiently large and challenging project, there are usually three or more developers working together as a team.  This in itself requires project management, which is a concerted effort by team members to communicate and distribute tasks such that project goals can be successfully accomplished.

Two popular project management platforms are Google's Project Hosting and Microsoft's Team Foundation Server.  Both offer source control and task tracking to facilitate collaboration between developers.



Why should every student take a software engineering course in college?  Because it will teach students the fundamentals of developing a high quality software system and introduce them to the tools that professional developers use everyday.  Not to mention that software engineering was rated the best job of 2011.