Alec is a Content Curator at DZone. He lives in Raleigh and spends his free time writing and programming. Alec is a DZone Zone Leader and has published 494 posts at DZone.

The Best of the Week (Feb. 14): Big Data Zone

02.23.2014
Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Feb. 14 to Feb. 20). Here they are, in order of popularity:

1. Dev of the Week: Rafał Kuć

Every week here and in our newsletter, we feature a new developer/blogger from the DZone community to catch up and find out what he or she is working on now and what's coming next. This week we're talking to Rafał Kuć, software architect and Solr and Lucene specialist.

2. Designing Map/Reduce Algorithms: In-Mapper Combiner

The author recently read a book on Map/Reduce algorithms by Lin and Dyer, which offers deep insight into designing efficient M/R algorithms. In this post, he discusses the in-mapper combining algorithm and walks through a sample M/R program that uses it.

3. Clean and Optimize the ElasticSearch Indexes of Logstash

ElasticSearch index files grow large quickly, and one of the most common questions about them is how to optimize and clean them, getting rid of old records you're no longer interested in. This post presents two scripts that make both tasks easy.

4. Introduction to Apache Avro

Apache Avro is a popular data serialization format that is gaining users because many Hadoop-based tools natively support Avro for serialization and deserialization. This post covers some of the basics of Avro.

5. Eclipse's BIRT: Scripted Data Set

If you want to use Java objects as a data source and data set in Eclipse's BIRT, you need to use a scripted data source and a scripted data set. This article presents the usage of a scripted data set in Eclipse's BIRT.