Joachim Hofer is a long-time senior consultant and lead developer on the Java platform who has recently become a fan of Scala. He regularly contributes to Open Source community projects and shares his experience in his blog and talks. Joachim is a DZone MVB, is not an employee of DZone, and has posted 12 posts at DZone.

Detecting the Copy/Paste Antipattern from within SBT

I’m in the middle of porting the build process of a few (Java) projects from old and complicated Ant builds to SBT. Along the way, I’m trying to fill a few gaps in SBT tooling, especially the ones that affect those of us sentenced to Java.

I’ve already done the FindBugs integration, which you can find here. FindBugs works on bytecode, so in theory it can analyze Scala code, too. However, I’ve heard that it reports a lot of false positives there, since in practice it’s really tailored to Java.

Then, there’s the very useful Copy/Paste Detector (CPD) from the PMD code analysis suite. I wrote an SBT plug-in for CPD yesterday, too.

Now, if you ask me, Java is basically Copy/Paste Paradise: Without higher-order functions or closures, there’s always the temptation to just copy all the boilerplate again and again. I had a Java code review project recently where about 40% of the whole codebase was generated by copy/paste!

But of course, this is not Java-specific. You can copy/paste Scala code, too, although in Scala, you should really never have to…

CPD can help you avoid this pitfall, particularly when coupled with the nice DRY plug-in for the Hudson Continuous Integration Server, which gives you a pretty good overview of CPD’s results, especially in brownfield projects.

And now, thanks to me (yay for me!), you can use CPD from within SBT, too.

There’s one drawback, though: CPD doesn’t really support Scala as a language, either. I think it may work sufficiently well using the “Any” language tokenizer it ships with. But if you want good results, there’s some work for you to do. And judging by the existing tokenizers for the other languages CPD supports, it’s not too much work.
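To give an idea of why the tokenizer matters, here is a minimal, self-contained sketch of token-based duplicate detection. It is not CPD’s tokenizer API and not its matching algorithm: the class `NaiveDuplicateFinder`, its methods, and the regular expression are all hypothetical names made up for this illustration, and a real tool uses a far more efficient matching strategy. The tokenizer is also deliberately crude, assuming only line comments, simple string literals, identifiers, and numbers, so multi-line comments and other Scala specifics are not handled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a hypothetical helper, not part of CPD or PMD. */
final class NaiveDuplicateFinder {

    // Rough token shapes: line comment, string literal, identifier/keyword,
    // number, or any other single non-whitespace character.
    private static final Pattern TOKEN = Pattern.compile(
        "//[^\\n]*|\"(?:\\\\.|[^\"\\\\])*\"|[A-Za-z_][A-Za-z0-9_]*|\\d+|\\S");

    /** Splits source text into tokens, dropping whitespace and line comments. */
    public static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            String t = m.group();
            if (!t.startsWith("//")) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /**
     * Returns the start indices (into the token list) of every window of
     * minTokens consecutive tokens that already occurred earlier in the list.
     */
    public static List<Integer> findDuplicates(List<String> tokens, int minTokens) {
        Map<String, Integer> firstSeen = new HashMap<>();
        List<Integer> duplicates = new ArrayList<>();
        for (int i = 0; i + minTokens <= tokens.size(); i++) {
            // Join with a separator that cannot appear inside a token.
            String window = String.join("\u0000", tokens.subList(i, i + minTokens));
            if (firstSeen.putIfAbsent(window, i) != null) {
                duplicates.add(i);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        String source =
            "val a = xs.map(x => x + 1) // first\n"
          + "val b = ys.filter(_ > 0)\n"
          + "val c = xs.map(x => x + 1) // pasted\n";
        List<String> tokens = tokenize(source);
        // Prints the token positions where an 8-token run repeats earlier code.
        System.out.println(findDuplicates(tokens, 8));
    }
}
```

Because the comparison happens on tokens rather than on raw text, differences in whitespace and comments don’t hide a pasted fragment. That is also why a language-aware tokenizer should give better results than a generic one: the better it understands comments, literals, and operators, the less noise ends up in the token stream.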

So, is anyone up to adding Scala support to CPD?



Published at DZone with permission of Joachim Hofer, author and DZone MVB.
