Alec is a Content Curator at DZone and lives in Raleigh, North Carolina. He is interested in Java and Android programming, and databases of all types. When he's not writing for the NoSQL and IoT Zones, you might find him playing bass guitar, writing short stories where nothing happens, or making stuff in Java. Alec is a DZone Zone Leader and has posted 572 posts at DZone.

The Best of the Week (Feb. 14): Big Data Zone

02.23.2014

Make sure you didn't miss anything with this list of the Best of the Week in the Big Data Zone (Feb. 14 to Feb. 20). Here they are, in order of popularity:

1. Dev of the Week: Rafał Kuć

Every week here and in our newsletter, we feature a new developer/blogger from the DZone community to catch up and find out what he or she is working on now and what's coming next. This week we're talking to Rafał Kuć, software architect and Solr and Lucene specialist.

2. Designing Map/Reduce Algorithms: In-Mapper Combiner

The author recently read a book on Map/Reduce algorithms by Lin and Dyer, which offers deep insight into designing efficient M/R algorithms. In this post, he discusses the in-mapper combining algorithm and presents a sample M/R program that uses it.

3. Clean and Optimize the ElasticSearch Indexes of Logstash

ElasticSearch index files grow large quickly, and one of the most common questions about them is how to optimize and clean them, getting rid of old records you're no longer interested in. This post presents two scripts that make both tasks easy.

4. Introduction to Apache Avro

Apache Avro is a popular data serialization format that is gaining users because many Hadoop-based tools natively support Avro for serialization and deserialization. This post covers some of the basics of Avro.

5. Eclipse's BIRT: Scripted Data Set

If you want to use Java objects as a data source and data set in Eclipse's BIRT, you need to do so with a scripted data source and a scripted data set. This article presents the usage of a scripted data set in Eclipse's BIRT.