I have been in the software development industry since 2010, working on enterprise product development using ADF. I am usually keen on learning about software design and emerging technologies. You can find me hanging around in the JavaRanch Forums, where I am one of the moderators. Apart from Java, I am fascinated by the ease of use and simplicity of Ruby and Rails. Mohamed is a DZone MVB and is not an employee of DZone and has posted 57 posts at DZone.

Redundancy: An Open Enemy to Writing Good Code

03.27.2012

Those who work on highly available systems or databases consider redundancy one of the ways to achieve high availability. Redundancy in this case helps. But consider the other side of it: in a highly available database system, the data is replicated across different nodes. Any change at one node has to be propagated to all the other nodes, and a failure during replication can leave the data mismatched between nodes. So redundancy is both good and bad. But why am I writing about highly available systems when the title says “good code”? Because when it comes to redundancy in code, there is NO goodness in it.

The most common tendency of every programmer is to copy-paste. I remember reading somewhere that programmers have to be lazy, but wherever I read it, the author didn’t mean that being lazy is to copy-paste. Instead, it is to write code that is concise and clear. One possible reason why we tend to copy-paste is that the design of the existing code doesn’t facilitate reuse. From then on, any change to the code in one place has to be repeated in every copy. This is one of the most common ways of introducing redundancy.

The other possible reason is the use of the same algorithm in different places, with a different set of data each time or with a slight change in the order of its steps. The first reaction to this would be: how can this be redundant when no code is the same? The redundancy here is of the idea or logic. If the algorithm has to change, the change has to be made in every place where it is used.
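As a minimal sketch of this kind of redundancy (all function and field names here are hypothetical, made up for illustration), the two functions below share no lines of code, yet both encode the same idea: average a numeric field and skip the records where it is missing. A single parameterized function can carry that idea once.

```python
# Two hypothetical functions that look different but implement the same idea.

def average_order_value(orders):
    total = 0.0
    count = 0
    for order in orders:
        if order["amount"] is not None:
            total += order["amount"]
            count += 1
    return total / count if count else 0.0


def average_session_minutes(sessions):
    minutes = [s["minutes"] for s in sessions if s["minutes"] is not None]
    if not minutes:
        return 0.0
    return sum(minutes) / len(minutes)


# One shared implementation of the idea, parameterized by the field name.
def average_of(records, field):
    values = [r[field] for r in records if r[field] is not None]
    return sum(values) / len(values) if values else 0.0


orders = [{"amount": 10.0}, {"amount": None}, {"amount": 30.0}]
sessions = [{"minutes": 5}, {"minutes": 15}]

print(average_of(orders, "amount"))     # same result as average_order_value(orders)
print(average_of(sessions, "minutes"))  # same result as average_session_minutes(sessions)
```

If the rule ever changes, say missing values should count as zero, only average_of needs to be touched.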

How is redundancy going to affect the code?

  • The code becomes fragile – A developer might not be aware of all the copies of the code, maybe because they are new to the system, and hence might miss fixing the code in a few places, which can lead to broken functionality. So every change has to be made very carefully.
  • The code is hard to maintain and extend – With no option of reuse, copy-pasting code becomes the norm, and every copy adds to the number of lines of code.
  • The code becomes hard to read – Especially in the case of redundant algorithms or logic. The person reading the code would have no idea why the same algorithm is written in two different ways in two different places. It creates a grey area in the code.

These are some of the ill effects of redundancy that I could think of.
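The fragility point can be sketched in a few lines. This is an invented example, not code from a real system: the same normalization rule was copy-pasted into two hypothetical handlers, and a later fix (trimming whitespace) reached only one of them.

```python
# Hypothetical handlers that started out as copies of each other.

def register_user(email):
    # Fixed copy: trims whitespace before lower-casing.
    return email.strip().lower()


def login_user(email):
    # Forgotten copy: still the old logic.
    return email.lower()


raw = "  Alice@Example.com "
print(register_user(raw) == login_user(raw))  # False: the two copies have drifted apart


# With one shared helper, a fix cannot be applied to only one caller.
def normalize_email(email):
    return email.strip().lower()


print(normalize_email(raw))  # alice@example.com
```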

I would like to give an example of redundancy in SQL statements:

IF something IS NOT NULL THEN
  SELECT
    somecolumn, someothercolumn, onemorecolumn
  FROM
    sometable
  WHERE
    somecolumn = something
    AND condition1
    AND condition2;
ELSIF someotherthing IS NOT NULL THEN
  SELECT
    somecolumn, someothercolumn, onemorecolumn
  FROM
    sometable
  WHERE
    someothercolumn = someotherthing
    AND condition1
    AND condition2
    AND condition3;
END IF;

(I did copy-paste the first query to create the second one, and then edited the condition in the WHERE clause.)

Suppose I need to change condition1 or condition2; I would then have to change it in two places. This example looks obvious and the change looks easy, but imagine the query being more complicated, with four or five such IF … ELSIF branches. It would take some time to understand what each of those queries does, only to find out that they are all the same apart from a few changes in the WHERE clause. Let me try to remove the redundancy:

SELECT
  somecolumn, someothercolumn, onemorecolumn
FROM
  sometable
WHERE
  (something IS NULL OR somecolumn = something)
  AND condition1
  AND condition2
  AND (someotherthing IS NULL
      OR ( someothercolumn = someotherthing AND condition3 ));

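I pulled the common elements together and grouped the differing elements, with their use decided by the values of something and someotherthing. Note that the combined query matches the original only when exactly one of the two values is set: the original runs just the first branch when both are non-null and runs no query when both are null, whereas the combined query applies both filters in the first case and neither in the second.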
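To see the combined query run, here is a small sketch using Python's built-in sqlite3 module. The table contents and the concrete expressions standing in for condition1, condition2 and condition3 are assumptions made up for this illustration. The first two calls set exactly one of the two values, which is the case the original IF … ELSIF handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sometable (somecolumn TEXT, someothercolumn TEXT, onemorecolumn INTEGER)"
)
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?, ?)",
    [("a", "x", 1), ("a", "y", 5), ("b", "x", 7), ("b", "y", 9)],
)

# Stand-ins (assumptions) for the article's placeholders:
#   condition1 -> onemorecolumn > 0
#   condition2 -> onemorecolumn < 100
#   condition3 -> onemorecolumn > 4
COMBINED = """
SELECT somecolumn, someothercolumn, onemorecolumn
FROM sometable
WHERE (:something IS NULL OR somecolumn = :something)
  AND onemorecolumn > 0
  AND onemorecolumn < 100
  AND (:someotherthing IS NULL
       OR (someothercolumn = :someotherthing AND onemorecolumn > 4))
ORDER BY onemorecolumn
"""


def find_rows(something=None, someotherthing=None):
    # Python None is bound as SQL NULL, which switches the matching filter off.
    params = {"something": something, "someotherthing": someotherthing}
    return conn.execute(COMBINED, params).fetchall()


print(find_rows(something="a"))       # [('a', 'x', 1), ('a', 'y', 5)]
print(find_rows(someotherthing="x"))  # [('b', 'x', 7)]
```

The common conditions now live in one statement, so a change to either of them is made once.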

How can we eliminate redundancy?

  • Whenever you see something being repeated, it is better to refactor that code. The aim should be to improve the quality of the code with each check-in.
  • Refactor the common behavior out into a method or a class. Some refactoring moves are explained in the book Refactoring by Martin Fowler. There is also another book on refactoring databases.
  • Use design patterns to refactor the code. This is especially useful when the redundancy is due to a design flaw and not just due to copy-pasting code.
  • And never copy-paste code from a different place.
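As one possible illustration of the design pattern point, the sketch below is written in the spirit of the Strategy pattern. The business rules and names are hypothetical. Instead of keeping several copies of a checkout calculation that differ only in the discount rule, the shared steps are written once and the varying rule is passed in.

```python
def checkout_total(prices, discount):
    """Shared skeleton: sum the items, subtract the varying discount, round."""
    subtotal = sum(prices)
    return round(subtotal - discount(subtotal), 2)


# Each function below is one interchangeable discount rule (a "strategy").
def no_discount(subtotal):
    return 0.0


def ten_percent(subtotal):
    return subtotal * 0.10


def flat_five_over_fifty(subtotal):
    return 5.0 if subtotal > 50 else 0.0


prices = [20.0, 40.0]
print(checkout_total(prices, no_discount))           # 60.0
print(checkout_total(prices, ten_percent))           # 54.0
print(checkout_total(prices, flat_five_over_fifty))  # 55.0
```

Adding a new rule means adding one small function; the summing and rounding logic is not copied again.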

Now there can be a few concerns: what if I refactor and end up breaking the functionality? This usually happens when the features are not backed by corresponding tests. If each feature has automated tests, one can easily tell whether a particular refactoring is harmful or not. That is why writing automated tests is so important.
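A minimal sketch of such a safety net, assuming a made-up shipping rule: the old implementation is kept around while a test checks that the refactored one returns the same results for a range of inputs.

```python
def shipping_cost_legacy(weight_kg):
    # Original branchy version (hypothetical business rule).
    if weight_kg <= 1:
        return 5.0
    else:
        return 5.0 + (weight_kg - 1) * 2.0


def shipping_cost(weight_kg):
    # Refactored version: one expression instead of two branches.
    return 5.0 + max(0, weight_kg - 1) * 2.0


def test_refactoring_preserves_behavior():
    # Fails with the offending weight if the two versions ever disagree.
    for weight in [0, 0.5, 1, 1.5, 3, 10]:
        assert shipping_cost(weight) == shipping_cost_legacy(weight), weight


test_refactoring_preserves_behavior()
print("refactoring preserved behavior")
```

Once the test passes and the callers have moved over, the legacy function can be deleted so that it does not become a new source of redundancy.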

A related popular principle is: Don’t Repeat Yourself (DRY).

This was a short write-up based on my limited experience and exposure. Please feel free to add your comments and concerns, and share your ideas about redundancy and how it limits your productivity.

Published at DZone with permission of Mohamed Sanaulla, author and DZone MVB. (source)

Comments

Roger Lindsjö replied on Tue, 2012/03/27 - 4:18am

I see DRY mentioned a lot, but I also see it misinterpreted time after time. There is a burden in reusing code if the code is not very local. I have seen this happen when a local formatting utility (used to format the output for taglibs) was "found" and reused in another project, which then broke when the original component needed to change the output.

Imagine the extreme: you must never copy-and-paste code found by googling; instead you must find the source project and refactor it so the code is reusable. Imagine the overhead and what kind of code that would produce.

This is a situation where I think copy-and-paste is a form of reuse, where the knowledge gained is reused (hopefully) but the actual code is copied.

 
