Google App Engine - Where Does It Fit?
This is yet another blog post about the recent release of Java support
for Google App Engine (GAE). Naturally, as an interested party, I rushed to
deploy my sample GridGain app on it, but immediately realized that you
cannot deploy any sort of clustering or grid application on Google
App Engine. This is simply not something they wish to support, happily
leaving this portion of the market to Amazon EC2.
A good way to think about GAE for Java is as plain vanilla servlet container hosting.
If you have a standard J2EE app that simply processes web requests and
accesses DB data, then GAE is an ideal solution for you and will be
much easier to use than Amazon EC2. But if your application needs to do
anything beyond retrieving and showing DB data, then with GAE you will
run into a brick wall. So, if GAE fits, it fits like a glove, but if it
doesn't - then there is nothing you can do to help it.
Here is a list of some limitations you may wish to consider when starting with GAE:
1. You have no control over the number of deployment instances
This may be a non-issue for trivial use cases, but you immediately hit a
wall if your application requires any knowledge of clustering or even
of threads. For example, let's say you need to generate a web report
that would normally take about 1 minute to generate in a single thread,
which is normally too long for a web-request (GAE will automatically
time it out after 30 secs). Ideally, to make it run faster, I would
split it into multiple sub-parts, run it in parallel on several nodes,
aggregate the results and produce a response to user within, say, 5
seconds.
However, there is nothing you can do to speed it up
on GAE. Firstly, you have no idea how many parts to split the report
into, and even if you did, there is no way to tell GAE that every part
should run on a separate CPU or on a separate server instance
altogether. On top of that, what if 9 out of 10 sub-parts completed
successfully and 1 failed? Again, you cannot instruct GAE to retry just
the part that failed - the whole request will have to be retried.
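To make the report example concrete, here is a minimal sketch of the split, run-in-parallel, aggregate and retry-one-part approach described above, written as plain Java for an environment where you do control threads (so not GAE, per the limitation just described). The class name `ParallelReport`, the method names and the trivial worker are all hypothetical, and a local thread pool stands in for the "several nodes" of a real grid.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntUnaryOperator;

public class ParallelReport {

    // Runs one sub-part, retrying only that sub-part if it fails.
    static int runWithRetry(int part, int maxAttempts, IntUnaryOperator worker) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return worker.applyAsInt(part);
            } catch (RuntimeException e) {
                lastFailure = e; // remember the failure and try this part again
            }
        }
        throw lastFailure; // every attempt for this part failed
    }

    // Splits the report into `parts` sub-parts, runs them in parallel,
    // and aggregates the partial results into one total.
    static int generate(int parts, int maxAttempts, IntUnaryOperator worker)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(parts);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < parts; i++) {
                final int part = i;
                Callable<Integer> task = () -> runWithRetry(part, maxAttempts, worker);
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // blocks until that sub-part is done
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical worker: each sub-part simply returns its index times 10.
        int total = generate(10, 3, part -> part * 10);
        System.out.println("Report total: " + total);
    }
}
```

The point of the sketch is that both the number of parts and the retry policy are decisions the application makes itself, which is exactly the control this limitation is about.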
2. You have no control over load balancing
Not every application is a website and not every application can be
measured in number of page hits. The underlying notion here is that if
you are not a website, then GAE is not for you. However, even if you are
a website, not being able to control load-balancing can be quite
limiting. If you take my report generation example above, GAE again
would not be able to load balance it at all.
3. You cannot use any of the existing clustering infrastructure you have
If you already make use of clustering within your application (like
discovery of different app server nodes, exchanging messages between
app servers, using existing caching or compute grid products, etc.),
you cannot do it in GAE. You are limited to the set of caching and data
storage libraries provided by GAE only.
4. You cannot use the full set of JDK classes or services
Google has published a JRE White List
of the JDK classes supported by App Engine. The main
limitation there is the lack of AWT and Swing packages. Although not often
used in web apps, this may be quite limiting for apps that use AWT for
purposes other than thick UI, like dynamic image generation, etc. I
should mention that GAE does offer its own image manipulation service
to compensate for this.
5. You cannot access or store files
The only way to access a file is to put it in the WAR and then get it off the
classpath. Creation of files is prohibited - you can only store data
using GAE datastore services. This alone can become a show-stopper for
many applications, as even if your application does not access or store
any files, it may very well depend on the libraries that do.
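As a rough illustration of "getting a file off the classpath", the sketch below reads a bundled resource through the class loader instead of the file system. It uses only standard JDK calls. The class name `ClasspathResource` and the resource name `report-template.txt` are hypothetical, and the sketch assumes the file was packaged somewhere on the WAR's classpath (for example under WEB-INF/classes).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ClasspathResource {

    // Returns the resource's contents as UTF-8 text,
    // or null if no such resource is on the classpath.
    static String read(String resourceName) throws IOException {
        ClassLoader loader = ClasspathResource.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in == null) {
                return null; // not packaged with the application
            }
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical resource name; replace with a file bundled in your WAR.
        String text = read("report-template.txt");
        System.out.println(text == null ? "resource not found" : text);
    }
}
```

Note that this only covers reading. Anything your code or its libraries would normally write to disk has to go to the datastore instead.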
6. You don't get any access to the box your app is deployed on
From GAE's standpoint, this limitation is justified. GAE is running
multiple WARs from different users on the same app server, so by giving
you direct access to the box, they will be giving you access to the
code that does not belong to you. On the other hand, not being able to
access the box will scare many traditional sys admins who generally
like to poke around production environments, look at OS-specific logs,
do tuning, and other environment and runtime geeky stuff.
I will stop here as far as listing limitations goes. I think it is quite clear that
GAE is not targeting enterprises as its customers. On the contrary,
it looks like they are willingly leaving this portion of the market to
Amazon EC2 and are concentrating on simple web application hosting
which is a huge market on its own. The way they are beating Amazon here
is by being elegantly simpler to use and deploy - there is no image
creation, nor should the user be concerned about the number of images that are
started. GAE will automatically manage web load and add CPUs to your
app as load changes.
So, if you are a small business owner and need a quick and cheap hosting solution - go with GAE!
(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)




Comments
Neeraj Vora replied on Fri, 2009/04/17 - 8:08am
1. You have no control over number of deployment instances.
You don't have to, as long as your app functions.
2. You have no control over load balancing.
You don't have to, as long as your app scales.
3. You cannot use any of the existing clustering infrastructure you have.
If you have figured out scalability you don't need GAE.
4. You cannot use full set of the JDK classes or services.
Yes, this is something under debate, but it's not a straightforward black and white thing.
5. You cannot access or store files.
Again something valid
6. You don't get any access to the box your app is deployed on.
You don't need to as long as your app functions and scales.
You are right in your concluding paragraph about elegance and simplicity and I think even enterprise customers can find it appealing as long as the application domain fits into the hammer GAE is providing.
Alessandro Santini replied on Fri, 2009/04/17 - 8:22am
Barev Dmitriy,
I understand you have - being part of GridGain - a personal interest in dismissing GAE as a toy for developers and not ready for the enterprise.
The reality is that many high-performance, clustered applications do not need to know that they are operating (and thus scaling) in a clustered environment. GAE is also providing a distributed cache facility (with some quota limitations) and a persistent storage using JDO (the choice of JDO still slips from my understanding).
I instead accept your point when talking about *distributed* applications by design (e.g. queue networks).
In conclusion, I personally think that not being aware of a cloud/clustered environment is an advantage rather than a defect.
Fabrizio Giudici replied on Fri, 2009/04/17 - 10:46am
in response to: killerloop
Alessandro, GAE/J supports JPA too in addition to JDO (see http://groups.google.com/group/google-appengine-java/web/will-it-play-in-app-engine).
While some of the points by Dmitriy are questionable (as discussed in the first comment), I don't think he's "dismissing" GAE/J as a toy. Clearly GAE/J serves a much smaller scope than EC2 and all the other similar things. For some it could be still interesting, if they just need a simple infrastructure. For others it could be considered as a "toy" if they can't do something because of the runtime restrictions. For me it's useless because of the lack of the imaging stuff.
BTW, somebody says that Google provides an "alternate" engine for imaging - I don't think I'll ever use it (yet another!) even though all my projects rely on a meta-imaging framework that could probably be adapted to GAE/J stuff. But, for the sake of curiosity, can somebody please give me a pointer? Thanks.
Alessandro Santini replied on Fri, 2009/04/17 - 12:22pm
in response to: fabriziogiudici
Ciao Fabrizio,
thanks for your comment and for the pointer to JPA. I want to share this one also, http://code.google.com/appengine/docs/java/datastore/usingjpa.html, very informative as to limitations when using JPA.
Yes, the verb to dismiss may be inappropriate (hopefully won't offend anyone) but I still read that AppEngine does not scale because neither your apps nor yourself are in control of the clustering/cloud infrastructure; I still believe that this point is highly debatable, especially if it is not supported by a benchmark of some kind.
My conclusions are:
Hope this clarifies my intent.
P.S.: as to the imaging API, the only link I know is http://code.google.com/appengine/docs/java/images/ but I am sure you have this already.
Neeraj Vora replied on Fri, 2009/04/17 - 1:07pm
in response to: killerloop
Very, very true. EC2 is renting you land on which you build a playground, play in it and then tear it down. You are charged for the time you rent the land, whether or not you play in it all the time (it has a robot which you can instruct on how to build the playground each time). GAE/J gives you a ready custom playground to play in (you need to like the playground). It then charges you by the rides. If you rest all day in this playground, then you have a free pass.
Andy Jefferson replied on Fri, 2009/04/17 - 1:31pm
>> and a persistent storage using JDO (the choice of JDO still slips from my understanding).
Perhaps because it is not an RDBMS datastore behind it and that JPA was designed solely with RDBMS in mind, whereas JDO was and still is designed to be datastore agnostic.
Perhaps because JDO offers way more capabilities to the user's plate than what JPA (1 or "2") does.
At the end of the day the user has both (as well as now also having a REST API to their datastore) and can choose for themselves. Regarding the limitations they currently have in their plugin, some of those may be removed in future revisions since it is early access.
--Andy (DataNucleus)
Alessandro Santini replied on Fri, 2009/04/17 - 4:15pm
in response to: jpox1
Hi Andy,
thanks a lot for your comment. You are indeed right - that is the most probable reason.
Alex(JAlexoid) ... replied on Sun, 2009/04/19 - 2:00am
I am sorry, but you are exactly 1 year off with your overview. Since day one of GAE with Python, it was obvious what Google will provide for you and what they will limit for your own sake. All the same points are valid about Python, therefore valid about GAE as a whole.
GAE is a PaaS, while Amazon EC2 is an IaaS. Please know the distinctions before comparing the two.