For nearly two years,
I've been trying to branch out and add another programming language to
my brain. I read and blogged about Seven Languages in Seven Weeks,
by Bruce Tate, an excellent book that I blasted through in seven days
to save a little time. If you read my blog, you'll know that I finally
settled on Haskell, started posting about my experience as an
object-oriented programmer writing in a functional language, and then
things kind of fizzled out.
I really like Haskell.
However, I think I'm one of those people who tend to learn better when
under pressure. Since I didn't have a job requirement to learn Haskell
or an otherwise motivating situation, I never really quite got into it.
I still plan to, some day.
But, I have finally picked the "new" language I want to learn, and that is
R (I say "new" because of course R is not a new language). I had a number of reasons to do so:
- Big Data is all the buzzword-rage right now, and R figures prominently in many big-data scenarios.
- I'm taking MOOCs at Coursera,
and the ones I'm taking use R as the programming platform, ensuring
that I must have more than a superficial understanding of the language.
I had actually looked at R once before and never stuck with it for the
same reasons I did not stick with Haskell -- no looming deadlines!
- As I learn more
about R, I become more impressed by how handily it performs tasks that
require a lot of boilerplate code in any other language I've used, so
that experience provides me more motivation to keep learning.
- I am currently
working at a bank, and I'm already starting to use R not only to greatly
speed up some tasks that I need to perform, but also to perform
analyses that would have required so much Java code that they would have
gone on the "back burner."
I'm also happy to report
there has been some convergence, for me, among big data, R, Haskell and
my recent exposure to functional programming. R is an interesting
language. I don't have an especially formal computer-science background
(instead, I'm from physics, math, and electrical engineering), so I
probably would not be the best person to articulate how R checks (and
does not check) boxes for functional and object-oriented languages. But
all that Haskell investigation helped a lot when I started learning
MapReduce,
and seeing functional features in R that also fit well into the
MapReduce paradigm makes me feel - as all curious types should - that
all that investigation was worthwhile.
I'll still blog about
Java occasionally, but my posts for the near future will be focused on
my self-training to fill in gaps in my skill set related to big data. I
have started a new blog on this topic, called
Data Scientist in Training. If you read me on
DZone,
you don't have to do much to find me, as my posts from both blogs will
continue to find their way to DZone (the big-data posts go to a
microzone called
Big Data/BI Zone).
If you read me directly on Blogger, then please bookmark the link
above if you're interested in what I'm doing. At the least, please
check out my
Welcome!
post, where I explain my path and reference some resources that you
may want to check out if you want to learn more
about big data, too.
My posts about R on
Data Scientist in Training
will not explicitly say anything in the title like "Java developer
struggles with R data frames", but it will still be obvious that my
approach to R is that of a developer who has used Java for about 90% of
his coding for the last 15 years. If you're a Java developer and are
learning R, I hope there will be some content there of special use to
you. As I've searched online while learning R, I've noticed helpful
responders trying to explain how to move from the "use a for-loop to
iterate and then build your model in rows" approach to "use a mapping
function to create your new column of data, then add it to your data
frame". (In fact, this brings up another feature I like about R -- its
data frames remind me of tables in the column-oriented databases used
extensively in big data.) I'm going to blog in near-real-time so I
don't forget those dead ends I encountered as I was trying to map Java
onto R, and that perspective is the one I think will be most helpful to
fellow Java/OO developers.
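For Java readers, the shift from "loop and build" to "map a column" can be sketched without leaving Java. This is only a minimal illustration, not anything taken from the R answers I found: the class name, the method names, and the Celsius-to-Fahrenheit data are all made up for the example, and a plain List stands in for a single data-frame column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example: a List<Double> plays the role of one column.
public class ColumnMapSketch {

    // Loop style: step through the values by index and append each
    // converted value to a result list, one element at a time.
    static List<Double> withLoop(List<Double> celsius) {
        List<Double> fahrenheit = new ArrayList<>();
        for (int i = 0; i < celsius.size(); i++) {
            fahrenheit.add(celsius.get(i) * 9.0 / 5.0 + 32.0);
        }
        return fahrenheit;
    }

    // Mapping style: state the per-element transformation once and
    // apply it to the whole column to get a new column.
    static List<Double> withMap(List<Double> celsius) {
        return celsius.stream()
                .map(c -> c * 9.0 / 5.0 + 32.0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        List<Double> celsius = Arrays.asList(0.0, 100.0, -40.0);
        System.out.println(withLoop(celsius));
        System.out.println(withMap(celsius));
    }
}
```

Both methods return the same values; the difference is only in how the work is described. The second form is closer in spirit to the R advice, where the new column is computed from the old one in a single expression and then attached to the data frame.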
There are a few posts on
Data Scientist in Training already. The next one will be specifically about R -- I hope you check it out when it arrives!
Comments
Ken Wong replied on Thu, 2013/01/31 - 1:46pm
Thanks. I saw that two of the four posts on "Data Scientist in Training" are about Apache Pig.
I am curious about why they chose "pig" as the name, since even the swine flu was renamed to "h1n1 flu" to avoid unpleasant feelings.
Wayne Adams replied on Sun, 2013/03/24 - 10:43pm
in response to:
Ken Wong
Hi Ken: I was hoping if I waited long enough, someone who knew the answer would reply, but no luck. I was unable to find your answer, but you have to admit: naming a framework "Pig" opens up some enjoyable possibilities, such as naming the command-shell "Grunt", and so on. :)