Since I became a professional software engineer in 2006, I've spent a fair amount of time reading books and online articles about software. Most article were related to object orientation, web technologies, software design, testing, automation, Agile and Lean, etc. Last year, I decided to take a break from such topics on purpose. So, I cleared all my RSS subscriptions.
Instead, I thought I would revisit some of the algorithms that I learned during my undergraduate courses. It’s been a few years since I studied algorithms and I thought I was more seasoned to appreciate and solve some of the algorithm problems than in the past. So, I started with the dynamic programming problems such as: Longest Common Subsequence (LCS) and Knapsack Problem and found this solution in Wikipedia:Source code for finding the length of the longest common subsequence:
function LCSLength(X[1..m], Y[1..n]) C = array(0..m, 0..n) for i := 0..m C[i,0] = 0 for j := 0..n C[0,j] = 0 for i := 1..m for j := 1..n if X[i] = Y[j] C[i,j] := C[i-1,j-1] + 1 else C[i,j] := max(C[i,j-1], C[i-1,j]) return C[m,n]
As a software developer, when I’m reading an article, if there is accompanying source code, my eyes automatically scroll into it skipping any textual blurb. It was the same in this case, and I must say, I noticed a few things here:
- The single literal variable names need some love.
- The code could use some logical grouping and naming to easily communicate what’s happening here.
I found this to be a UI problem. The algorithm itself is quite complicated for an average person to understand. However, we can probably reduce some noise with better naming or grouping. Here’s an alternate version of the same code.Modified source code, with descriptive names:
function LCSLength(sequence1[1..sequence1Size], sequence2[1..sequence2Size]) table = GetTableWithZerosInFirstRowAndColumn(sequence1Size, sequence2Size) for sequence1Index := 1..sequence1Size for sequence2Index := 1..sequence2Size if sequence1[sequence1Index] = sequence2[sequence2Index] IncrementLength(table, sequence1Index, sequence2Index) else UseCurrentLength(table, sequence1Index, sequence2Index) return table[sequence1Size, sequence2Size] function GetTableWithZerosInFirstRowAndColumn(columns, rows) table = array(0..column, 0..rows) InitializeFirstRowWithZeros(table) InitializeFirstColumnWithZeros(table) return table function InitializeFirstRowWithZeros(table[columns x rows]) for columnIndex := 0..columns table[columnIndex, 0] = 0 function InitializeFirstColumnWithZeros(table[columns x rows]) for rowIndex := 1..rows table[rowIndex, 0] = 0 function IncrementLength(table, columnIndex, rowIndex) table[columnIndex,rowIndex] := table[columnIndex-1,rowIndex-1] + 1 function UseCurrentLength(table, columnIndex, rowIndex) leftCell = table[columnIndex-1,rowIndex] topCell = table[columnIndex,rowIndex-1] table[columnIndex,rowIndex] := max(leftCell, topCell)
I know this modified code is verbose, but I find it self-explanatory. Using descriptive names is not a new concept in software engineering at all. I wish we had our algorithm books with annotated source code like this, where it’s readable by humans.
Let’s use uglifiers and minifiers to do the machinification for us.