Sunday, September 25, 2011

Multithreading with Java and Swing

It was roughly seven years ago that Intel released its first dual-core processor for the consumer market.  Back then, consumer-level concurrency was new territory, and anyone who knew how to program for multiple cores was considered a hotshot programmer.  Physical limitations (mainly heat and power leakage) capped individual cores at operating frequencies of roughly 3.5 GHz.  The chip makers' solution?  Die shrinks and multiple cores.


Today, dual-core and quad-core processors have become mainstream in mobile electronics and home computers.  As a result, concurrency skills are no longer considered a luxury, but a requirement for any serious developer.  The rest of this post details my recent first experience with multithreaded programming using Java and Swing.

Applications that have graphical user interfaces (GUIs) are naturally multithreaded.  Usually you have one thread that draws the GUI and another that runs the logic behind the scenes.  By splitting these two tasks into separate threads, you can get a nice performance boost if each thread runs on a separate core.  The GUI stays quick and snappy.  The user is happy.  And the developer is happy too, for the most part, provided the job gets done correctly (race conditions, deadlocks, and starvation may be covered in a future post).

Moving on, Java SE 6 provides two convenient building blocks for getting our hands dirty with GUI-based multithreading: the Swing toolkit and the Thread class.  I used them to whip up an application that spawns multiple threads, assigns a priority to each thread, performs a computationally intensive calculation in a loop, and reports the average time for each thread to complete a calculation.

Screenshot of Swing application

The workflow for my Swing application is:
  1. Create the GUI elements like the frame, start button, dropdown lists, panels, checkboxes, and labels.
  2. Attach an event listener to the start button.
  3. Set the frame to visible with frame.setVisible(true).
  4. Handle an event in the actionPerformed method when the user clicks the start button.  The event handler checks the number of threads that the user wants to start.  Then it creates the threads with the user-selected thread priority.
  5. On each thread, run a loop for 3 minutes; each iteration computes the sum of the first 1,000 cosines 120 times.
  6. After each iteration of the loop, calculate the average amount of time it took (in milliseconds) to perform the calculation and update the GUI.
So how does this relate to GUI-based multithreading?  Swing incorporates a dedicated thread called the Event Dispatch Thread (EDT), which is responsible for handling all events and updating the GUI.  This means that you, as a programmer, cannot update the GUI directly from the main thread or from a thread that you have created yourself.  To update a GUI component properly, you must either place your code in the actionPerformed method (because it runs on the EDT), or wrap it in a Runnable and pass it to SwingUtilities.invokeLater (which schedules it to run on the EDT).
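
To make that concrete, here is a minimal sketch (not the actual ManyPanels code) of a background thread handing a GUI update back to the EDT with SwingUtilities.invokeLater:

import javax.swing.JLabel;
import javax.swing.SwingUtilities;

public class WorkerExample {
  // Hypothetical label created on the EDT along with the rest of the GUI.
  private final JLabel statusLabel = new JLabel("Waiting...");

  public void startWork() {
    // The slow calculation runs on its own thread, never on the EDT.
    Thread worker = new Thread(new Runnable() {
      public void run() {
        double sum = 0;
        for (int i = 0; i < 1000; i++) {
          sum += Math.cos(i);
        }
        final double result = sum;
        // Only the EDT may touch Swing components, so the GUI update
        // is scheduled with invokeLater instead of being done here.
        SwingUtilities.invokeLater(new Runnable() {
          public void run() {
            statusLabel.setText("Sum of cosines: " + result);
          }
        });
      }
    });
    worker.start();
  }
}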

The main idea is to break the program into threads that perform background computation and thread(s) that update the GUI (the EDT does this for us).  Since you want to keep the GUI responsive, and the EDT is responsible for updating the GUI, you don't want to have long-running pieces of code on the EDT.  Instead, have the EDT spawn off new threads and run your (slower) code in these new threads.  Proper usage of these concepts can be seen in my program and source code here.  Compile with "javac ManyPanels.java" and run with "java ManyPanels 0".

I ran an experiment to see how thread priority scheduling works on different operating systems.  On a Windows machine with 2 cores and a Mac machine with 2 cores, I ran 10 threads with priorities ranging from 1 to 10.  The first number under each panel indicates the average amount of time (in milliseconds) it took to get through one iteration, and the second number in parentheses indicates the number of iterations that the thread was able to execute in 3 minutes.
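
For reference, each worker thread in this experiment boils down to something like the sketch below.  The loop body, timing fields, and priority value are simplified stand-ins rather than the actual ManyPanels source, and the GUI updates are omitted.

public class PriorityDemo {
  public static void main(String[] args) {
    Thread worker = new Thread(new Runnable() {
      public void run() {
        long iterations = 0;
        long totalMillis = 0;
        long deadline = System.currentTimeMillis() + 3 * 60 * 1000;  // run for 3 minutes
        while (System.currentTimeMillis() < deadline) {
          long start = System.currentTimeMillis();
          double sum = 0;
          for (int rep = 0; rep < 120; rep++) {      // 120 passes per iteration
            for (int i = 0; i < 1000; i++) {         // sum of the first 1000 cosines
              sum += Math.cos(i);
            }
          }
          totalMillis += System.currentTimeMillis() - start;
          iterations++;
        }
        System.out.println("avg ms per iteration: " + (totalMillis / iterations)
            + " (" + iterations + " iterations)");
      }
    });
    worker.setPriority(8);  // any value from Thread.MIN_PRIORITY (1) to Thread.MAX_PRIORITY (10)
    worker.start();
  }
}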


Windows.  Thread priorities are all over the place.  A thread with priority 1 should not run faster than a thread with priority 8.
Mac.  Thread priorities are in order, with 1 running the slowest and 10 running the fastest.
The results suggest that thread priority scheduling with the JVM on Windows is unreliable, while thread priority scheduling with the JVM on the Mac works as expected.  This is food for thought for developers who program with thread priorities.

Monday, July 25, 2011

How to Get SharePoint ASP.NET Auto Generated Control ID Using JavaScript

One of the most irksome problems I have encountered while working with ASP.NET is auto-generated server control IDs.  My particular issue was figuring out how to get a handle on the Hour and Minute fields in the DateTime control shown below.

SharePoint's version of the DateTimePicker

Every time a client loads the page, the server generates its own IDs for the Hour and Minute elements.

HTML for the Hour dropdown
HTML for the Minute dropdown

As you can see in the screenshots above, ASP.NET mauled the client-side IDs with a "ctl00..." prefix.

Google searching yielded one possible solution: <%= YourControlID.ClientID %>.  I tried this and got "An error occurred during the processing of Page.aspx.  Code blocks are not allowed in this file."  This error meant that my SharePoint environment was configured to block inline server-side code in page markup.  Asking my SharePoint administrator to relax the security settings would have been a bad idea, so it was not an option.

After five additional hours of futile Google searching, I got lucky and found this post on Marc Anderson's blog.  Although it was not a direct solution, it was something I could work with.  I modified the function to return the control's client-side generated ID and used it with jQuery to gain a handle on the Hour and Minute fields.

$(document).ready(function() {
  // Get the control IDs of the DateTimePicker dropdowns
  var startHourID = getTimeID('ff3_1', 'DateTimeFieldDateHours');
  var startMinuteID = getTimeID('ff3_1', 'DateTimeFieldDateMinutes');
  
  // Get the hour and minute value from their IDs using jQuery
  var startHour = $("[id='" + startHourID + "'] :selected").text();
  var startMinute = $("[id='" + startMinuteID + "'] :selected").text();
 
  // Display the hours and minutes
  alert("Hour: " + startHour);
  alert("Minute: " + startMinute);
});

/** Get the client-side ID of the Hours or Minutes control in the DateTimePicker field.
 *  @param fieldID    The control's ID before being mauled client-side
 *  @param fieldType  Use DateTimeFieldDateHours or DateTimeFieldDateMinutes
 *  @return  The client-side ID
 */
function getTimeID(fieldID, fieldType) {
  // Get all dropdown elements in the page
  var tags = document.getElementsByTagName('select');
  var controlID;
  for (var i = 0; i < tags.length; i++) {
    // alert(' tags[' + i + '].id=' + tags[i].id);
    // Find the element with the matching fieldID and fieldType
    if (tags[i].id.indexOf(fieldID) > 0 && tags[i].id.indexOf(fieldType) > 0) {
      controlID = tags[i].id;
    }
  }
  return controlID;
}

To get the control's client-side generated ID, simply call getTimeID with the appropriate parameters.  The caveat is that fieldID and fieldType must always be present in the generated ID, e.g. ctl00_PlaceHolderMain_g_dcc91698_d7a9_43a3_baf2_91d2ae764f94_ff3_1_ctl00_ctl00_DateTimeField_DateTimeFieldDateHours.  This function will not work if a completely random ID is generated each time the page loads, because there will be nothing static for you to latch onto.

To test whether I hooked on correctly to the Hour and Minute fields, I alerted their values.


I hope this saves many people hours of anguish.  Happy coding!

Sunday, June 12, 2011

How to Build a Balanced Binary Search Tree From an Array

Problem
Create a balanced binary search tree from an array of n elements.

Solution
It's always easier to have an example to follow along with, so let the array below be our case study.


The first step is to sort the array.  The next step is to load the tree with these elements.  A simple way to do it is to use binary search with recursion, except that we're adding an element to the tree instead of 'searching' for it.

  /**
   * Create binary search tree.
   * @param array The sorted array of elements.
   * @return The root of the tree.
   */
  public Node makeTree(String[] array) {
    int low = 0;
    int high = array.length - 1;
    return makeTree(low, high, array);
  }

  /**
   * Create binary search tree.
   * @param low The lowest array position.
   * @param high The highest array position.
   * @param array The sorted array of elements.
   * @return The root of the tree.
   */
  private Node makeTree(int low, int high, String[] array) {
    if (low > high) {
      return null;
    }
    else {
      // Same as (low + high) / 2
      int mid = (low + high) >>> 1;
      Node node = new Node(array[mid]);
      node.left = makeTree(low, (mid - 1), array);
      node.right = makeTree((mid + 1), high, array);
      return node;
    }
  }
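
For completeness, the snippet above assumes a Node class along these lines; the field names are my guess at the minimal shape it needs, and a small usage example follows in the comments.

  public class Node {
    String value;   // the element stored at this node
    Node left;      // subtree of smaller elements
    Node right;     // subtree of larger elements

    public Node(String value) {
      this.value = value;
    }
  }

  // Usage: sort the array first, then build the tree.
  //   String[] array = { "delta", "alpha", "echo", "bravo", "charlie" };
  //   java.util.Arrays.sort(array);
  //   Node root = makeTree(array);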

The function recursively adds all the elements in the left subtree and then all the elements in the right subtree.  Note that it works for edge cases where the input size is zero or one.  The resulting balanced binary search tree is


Now the next question is, how do we check that the solution is correct?  For this trivial example, we can verify it just by looking at the tree.  We can see that every child on the left is less than its parent, and every child on the right is greater than its parent.

For a large input with hundreds or thousands of elements, it would be too time-consuming to draw and visually verify the tree like we did above.  The solution to this problem will come in a future blog post covering preorder, inorder, and postorder tree traversals.

Sunday, June 5, 2011

Why Every Student Should Take A Software Engineering Course

Software development is still a relatively young practice.  Unlike traditional engineering, where things are built to spec most of the time, we are still figuring out how to write applications that do not irradiate people to death with high-powered electron beams, cause the wrong organs to be removed from donors, or plunge the stock market by 600 points in 15 minutes.  Software engineering attempts to solve these problems by adopting the traditional engineering practice of defining a systematic, disciplined, and quantifiable approach to the development of software.


So why should a student majoring in computer science take a software engineering course before graduating?  As a student, you most likely blew through your lower-level classes like data structures, discrete mathematics, and algorithms without a clue of why or how any of it is relevant.  The problem with traditional academia is that it teaches a whole lot of theory without a whole lot of real-world application.  If your university offers a software engineering course like mine does, take the opportunity to put theory into practice while you're still in school and before you join the workforce.  You will learn how to develop a high-quality software system in a team-oriented environment, along with all the tools and concepts that come with it: 

1. Version control
Version control is essential for organizing multi-developer projects.  Its benefits are revision tracking and source code management.  Revision tracking records every change to the project and attaches a timestamp to it.  This is useful for tracking down bugs, and it offers the chance to revert to a previous version in case of a critical error.

Source code management facilitates the merging of changes to a project.  When multiple developers are working on different sections of code, source control (also known as version control) automatically merges their work together, or flags a conflict to be resolved when their work touches the same lines of code.

The two most popular version control systems today are Subversion and Git.

2. Testing
Would you buy a car without taking it for a test drive?  Would you purchase a cool new smartphone without reading reviews on it?  Would you test your next foothold if you were rock climbing without a rope?  I think you get the idea.  Testing does not guarantee that something works perfectly, but it is a good indicator of its reliability, especially if other people have already used it.  If many people have used ProductX, and many people have said ProductX was good, then by inference, ProductX is good.

In software, there are different types of testing.  The most important types are unit testing and usability testing.  Unit testing verifies that a function works correctly by checking that it returns the expected value.  However, it does not prove correctness, because the expected value may not be the correct value; that would be a logical error on the programmer's part.  Two popular testing frameworks are JUnit for Java and NUnit for C#.  There are also coverage tools like JaCoCo that show which sections of code your tests actually exercise.
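
As a quick illustration, a JUnit 4 unit test for a hypothetical add method might look like the sketch below; the Calculator class is made up so that the example stands on its own.

import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CalculatorTest {

  /** Verifies that add returns the expected value for a simple case. */
  @Test
  public void testAdd() {
    Calculator calculator = new Calculator();
    assertEquals(5, calculator.add(2, 3));
  }

  /** Minimal class under test, included only to keep the example self-contained. */
  static class Calculator {
    int add(int a, int b) {
      return a + b;
    }
  }
}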


While unit testing verifies whether your code is up to par, usability testing verifies whether you made good design choices.  Would-be users actually use your software, point out problems, and suggest improvements.  This is arguably the most critical type of testing, because if would-be users are happy with your pre-release software, actual users are far more likely to be happy with the released product.

3. Build systems
Build systems automate a variety of tasks: compiling source code into binaries, packaging binaries into an executable, deploying the executable to a production system, generating documentation, and, most importantly, running automated tests.  A developer should be able to start a build in one step, in other words, with one line in the terminal or one click of a button.  If the build passes every phase of the process without errors, then the software can be considered to be in a good, clean, shippable condition.  Joel Spolsky explains the benefits of a build system in more depth here.

Two popular build systems are Apache Maven and Apache Ant.

4. Software development lifecycle
A software engineer is responsible for the entire life cycle of a software project.  This includes interacting with the client to gather requirements, designing a specification to meet those requirements, writing code that conforms to the specifications, testing the software, and maintaining the software post-release.

Requirements gathering and design tend to be the most difficult phases because they require a software engineer to fully understand what the client is asking for, in addition to the technologies and skills required to produce a solution.

5. Project management
Contrary to popular belief, most software developers are not socially inept loners who sit in front of a computer and code all day.  In a sufficiently large and challenging project, there are usually three or more developers working together as a team.  This in itself requires project management, which is a concerted effort by team members to communicate and distribute tasks so that project goals can be accomplished.

Two popular project management platforms are Google's Project Hosting and Microsoft's Team Foundation Server.  Both offer source control and task tracking to facilitate collaboration between developers.



Why should every student take a software engineering course in college?  Because it will teach them the fundamentals of developing a high-quality software system and introduce them to the tools that professional developers use every day.  Not to mention that software engineering was rated the best job of 2011.

Thursday, April 14, 2011

Career Plan

It's that time again when a computer science student needs to figure out what to do for the summer.  Coincidentally, one of my professors made it an assignment to have each student reflect on and present their career plans to everyone else in the class.

Monday, February 7, 2011

API Design

As a software developer, I have used a number of APIs for programming languages, documentation writing, web frameworks, etc.  Like many who are starting out in the field, I did not think I would ever need to write an API... until now.  Before delving into the details of my experience in API design, let us start with the basics.

API stands for Application Programming Interface; an example is the Java SE 6 API.  An API documents the rules that define how software systems can interact with each other and the boundaries of what a programmer can do.  For an easy analogy, an LCD TV manual lists all of the possible ways to use the TV, from turning it on via the power button to changing color temperatures via a combination of buttons.  Likewise, an API is the programmer's manual that describes the functions that can be used to do something useful.

APIs are not written only for programming languages, but for a wide variety of technologies including applications, libraries, operating systems, and more.  The term is sometimes used interchangeably with "reference guide" or "reference manual".  The technology for which I have designed and written an API is the iHale system.  The top-level architecture can be visualized as follows:

Diagram of the iHale system.

The iHale API serves as a point of interaction between the house systems, web user interface, repository, and other optional interfaces.  I designed a REST API and a Java API.  The REST API specifies how the different house systems and optional interfaces communicate with the iHale system through HTTP requests and commands.  The Java API specifies how information is stored and retrieved in the repository based on the HTTP requests it receives.

Designing the APIs was not particularly challenging because I had already written user stories and had a sense of what the possible requirements were.  For example, the Aquaponics system would need to support functions that return information about water volume, temperature, pH, oxygen level, and the time of data procurement.  However, this may not fully describe all the fields and functions required for the aquaponics system.
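
As an illustration only, an early sketch of the aquaponics portion of the Java API might look like the interface below.  The names and units are my guesses, not the actual iHale API.

/** Illustrative read-only view of aquaponics state (not the final iHale API). */
public interface AquaponicsState {
  double getWaterVolume();   // liters
  double getTemperature();   // degrees Celsius
  double getPh();
  double getOxygenLevel();   // dissolved oxygen, in mg/L
  long getTimestamp();       // time of data procurement, in milliseconds since the epoch
}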

As we make progress on the Solar Decathlon project in the following weeks, my team and I will gather requirements in greater detail by speaking with the engineers and referring to specification sheets for the sensors and meters that will be used in the house.  As we gain a better understanding of what is possible and what isn't, we will revise and update our API.

Sunday, January 30, 2011

Berkeley DB

Berkeley DB is a Java-based, non-relational database that is scalable and supports access from multiple threads and processes.  This makes it ideal for use in the Solar Decathlon home management system, now renamed iHale.  The main idea is shown in the diagram below:

Client-server-database communication

On the client side, Wicket provides the front-end user interface via a webpage.  Through this webpage, a user may request a GET, PUT, or DELETE operation.  The client communicates with the server via HTTP using the Restlet framework, where the request is interpreted and executed by the server.  Information in the database is persisted to a file saved on disk.

To get familiar with Berkeley DB, I worked on two code katas that dealt with the storage and retrieval of contacts.  These were modified versions of the katas I completed on Restlet in the last two blog posts.  More details can be seen here.

Kata 1: Timestamp as a secondary index
The primary index for a contact is its unique ID.  This means that entries can be added to or looked up in the database via a unique ID.  The task was to add a timestamp as a secondary index; the timestamp is attached to a contact when it is created.  The most time-consuming parts of Kata 1 were writing unit tests and learning Berkeley DB's API for querying the database.  It was fairly straightforward and took about 3 hours to complete.
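
To show the idea, here is a sketch of a timestamp secondary index using Berkeley DB's Direct Persistence Layer.  The entity and field names are illustrative, and the kata itself may well have used the base API instead.

import com.sleepycat.persist.model.Entity;
import com.sleepycat.persist.model.PrimaryKey;
import com.sleepycat.persist.model.Relationship;
import com.sleepycat.persist.model.SecondaryKey;

@Entity
public class ContactEntity {
  @PrimaryKey
  private String uniqueId;                                // primary index

  @SecondaryKey(relate = Relationship.MANY_TO_ONE)
  private long timestamp = System.currentTimeMillis();    // secondary index

  private String name;

  // Getters and setters omitted.  Given an EntityStore named store, the
  // secondary index can then be opened and queried like this:
  //   PrimaryIndex<String, ContactEntity> byId =
  //       store.getPrimaryIndex(String.class, ContactEntity.class);
  //   SecondaryIndex<Long, String, ContactEntity> byTimestamp =
  //       store.getSecondaryIndex(byId, Long.class, "timestamp");
  //   ContactEntity contact = byTimestamp.get(someTimestamp);
}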

Command line usage of all the available operators: put, get, get-timestamp, get-range.

Kata 2: Wicket, REST, and Berkeley DB: A match made in heaven
This kata is a combination of all the technologies I have learned so far: Wicket, REST, and Berkeley DB.  It combines the use of Wicket from Restlet Part I, the persistence concepts from Restlet Part II, and Berkeley DB from Kata 1 above.  The difference is that contact information is now persisted to a file on disk via Berkeley DB instead of in memory.

Webpage for adding entries and querying the database.

The overall layout of Wicket, REST, and Berkeley DB is readily understood, but getting it all working properly together is a whole different story.  So far, this kata has been a debug-fest.  It took about 5 hours to code the core Wicket pages, resource server, database, and Ant build files, but debugging has taken over 7 hours.

Some of the problems I encountered were trivial, such as forgetting to attach the timestamp to a contact XML document.  These resulted in exceptions when trying to get or put a contact into the database.  Other problems were due to communication errors between Wicket and the resource server, which resulted in odd errors such as HTTP 500 (internal server error).  I was able to resolve these by desk-checking my code several times.

The lesson I learned from this exercise is that one must be extra careful when copying and pasting code from working modules in past projects; the practice is extremely error-prone.  It is also better to import small portions of code at a time and test frequently, rather than importing huge blocks of code all at once and testing everything afterwards.


The final distribution of my katas can be downloaded here.

Saturday, January 22, 2011

Restlet Part II

In continuation of Restlet Part I, Part II includes three new katas that exercised the use of the GET, PUT, and DELETE operators.  It was many times more difficult to implement because it required six different components to interact with each other: a client, a contact record, an in-memory persistent database, a server for accessing the database, a server for handling requests for an individual contact, and a server for handling requests for all contacts in the database.  More details can be found here.

Kata 5: Contacts resource
The task was to add a new resource that returns an XML representation of all the contacts in the database.  In other words, the contacts were stored in a Java collection and the goal was to transform it into an XML document.  At first I felt overwhelmed, but I identified the steps required to get the job done in about 4 hours:
  1. Create an XML document.
  2. Iterate through each contact in the collection, then attach it to the document.
  3. Return the XML document when the server asks for this resource (a code sketch follows the screenshot below).
XML representation of contact resources in the database
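
Concretely, steps 1 and 2 above boil down to something like the sketch below, using the standard DOM API.  The Contact class here is a stand-in for the kata's actual contact record.

import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class ContactsXmlBuilder {

  /** Stand-in contact record so the example is self-contained. */
  static class Contact {
    String uniqueId;
    String name;
    Contact(String uniqueId, String name) {
      this.uniqueId = uniqueId;
      this.name = name;
    }
  }

  /** Builds an XML document with one child element per contact. */
  public static Document toXml(List<Contact> contacts) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
    Element root = doc.createElement("contacts");
    doc.appendChild(root);
    for (Contact contact : contacts) {
      Element element = doc.createElement("contact");
      element.setAttribute("id", contact.uniqueId);
      element.setTextContent(contact.name);
      root.appendChild(element);
    }
    return doc;
  }
}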

Kata 6: Command line client manipulation of Contacts resource
The task was to extend the command line client functionality to include the additional operators "get-all" and "delete-all".  The get-all operation retrieves all contacts in the database, resolves their respective URLs, and then prints out each contact's information.  The steps required were similar to Kata 5 but in reverse: an XML representation of a contact had to be transformed back into its Java representation (see the sketch below).  This kata took about 5 hours to complete because I had to look online and learn how to work with XML nodes.
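
A rough sketch of that reverse transformation, again with made-up element names:

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ContactsXmlReader {

  /** Walks the contact nodes of a document and prints each id and name. */
  public static void printContacts(Document doc) {
    NodeList nodes = doc.getElementsByTagName("contact");
    for (int i = 0; i < nodes.getLength(); i++) {
      Element element = (Element) nodes.item(i);
      System.out.println(element.getAttribute("id") + ": " + element.getTextContent());
    }
  }
}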

Printout of contact information after calling the get-all operator

Kata 7: Add a telephone number to the Contact resource
The task was to extend the Contact resource to include a person's phone number.  In addition, the server checks that the number is in the correct ###-#### format; if not, it responds with an error.  This kata was particularly easy and took about 1 hour to complete.
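
The format check itself comes down to a one-line regular expression.  Outside of any Restlet-specific plumbing, a sketch might look like this:

public class PhoneValidator {
  /** Returns true if the phone number matches the ###-#### format, e.g. "555-1234". */
  public static boolean isValid(String phone) {
    return phone != null && phone.matches("\\d{3}-\\d{4}");
  }
}
// In the resource itself, an invalid number is rejected with an HTTP 417
// status instead of being stored.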

Error code 417 when the phone number does not adhere to the xxx-xxxx format

By completing these katas, I learned how to implement a RESTful client and server system that uses an in-memory persistent database.  I also improved my skills in performing meaningful unit testing to meet quality assurance standards.

A distribution of my work can be downloaded here.

Tuesday, January 18, 2011

Restlet Part I


Representational State Transfer (REST) is a software architectural style designed for the web.  Whereas the web is mainly used for interaction between a client and a server, REST also allows for interaction between servers themselves.  It relies on HTTP URLs to identify resources and on verbs such as GET, PUT, DELETE, and POST to act on them.  For example, sending a GET request to http://hawaii.com/news acts on the news resource identified by that URL; the action is to retrieve news from Hawaii.

A RESTful system can be implemented with a Java framework called Restlet, which is the topic of today's blog post.  To introduce myself to this new framework, I practiced on four Restlet code katas.  Additional information on these katas can be found here.

Kata 1: Time resources
The task was to add three new resources that return the current hour, minute, and second.  I completed it in about 10 minutes.  Since it was the first, introductory kata, the most difficult part was figuring out the actual resource URLs and verb, which were GET requests to localhost:8111/dateservice/[hour/minute/second].
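
In Restlet 2.x terms, each of these resources is a small ServerResource subclass.  A hypothetical hour resource might look like the sketch below; the class name and routing path are my own, not necessarily the kata's.

import java.util.Calendar;
import org.restlet.resource.Get;
import org.restlet.resource.ServerResource;

/** Hypothetical resource that would be routed to /dateservice/hour. */
public class HourResource extends ServerResource {

  /** Returns the current hour of the day as plain text. */
  @Get
  public String represent() {
    return String.valueOf(Calendar.getInstance().get(Calendar.HOUR_OF_DAY));
  }
}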

The current month is January so the output corresponds to 0.
Kata 2: Logging
The task was to log Restlet's messages into a log file instead of having them display in the console.  The program was to read configuration settings from a properties file and use a handler to record all messages into a text file.  I completed this in about 2 hours.  The bulk of the time was spent trying to understand the whole process and figuring out how to get it done.  A really handy explanation of logging can be found here.
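
The kata did this through a properties file, but the programmatic equivalent with java.util.logging is roughly the sketch below; the "org.restlet" logger name and log file name are assumptions.

import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class LogSetup {
  /** Sends messages from the (assumed) "org.restlet" logger to a text file. */
  public static void redirectToFile() throws IOException {
    Logger logger = Logger.getLogger("org.restlet");
    FileHandler handler = new FileHandler("restlet.log", true);  // append mode
    handler.setFormatter(new SimpleFormatter());                 // plain text instead of XML
    logger.addHandler(handler);
    logger.setUseParentHandlers(false);                          // stop echoing to the console
  }
}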

Console output from Restlet is recorded into this log file.

Kata 3: Authentication
The task was to authenticate a user by asking for a username and password and verifying them against locally stored credentials.  Similar to Kata 2, this took me about 2 hours to complete, and nearly all of that time was spent on research.  The solution can be found here.  Most of the problems I had came from finding only outdated articles; one article recommended using the Guard class, which dated back to Java 1.4.

[Update (Jan. 24): I discovered that this feature is buggy.  The application does not always challenge the user with a username and password even after the server has been restarted.  I will update this section with an appropriate screenshot and reupload a new .zip distribution file once I figure out the problem.]

Kata 4: Wicket
The task was to create a client that could access the server via a webpage instead of the command line.  This required the use of the Wicket web framework.  I was able to complete this kata with about 6 hours of effort.  Several hours were wasted debugging a client-to-server connection issue.  At first I thought it might have been due to authentication, but it turned out to be a trivial problem: a simple mismatch in ports.  The server was being hosted on port 8111, but the client was trying to connect to 8112.

Basic webpage with buttons for requesting the date.

By completing these four katas, I learned the concepts of the Restlet framework.  I also learned the basics of networking by setting up a client/server system.

A distribution of my files is available here.