Cloud computing is understandably a hot topic. It refers to the ability to deploy your application on infrastructure that you do not necessarily own or manage yourself, and that can be easily scaled up to handle demand and provide resilience to failure. The hardware infrastructure can be composed of clusters of many cheap machines, or it may be high-end hardware which is virtualized into many nodes. In either scenario, the cost of ownership is dramatically lower than in the traditional model.
Additionally, your application executes in an environment where the services it consumes can be transparently provided for you and scale along with the application. Think services like databases, messaging and servlet session clustering for example.
Software as a service is an increasingly popular paradigm for webapps so let’s look at how Jetty is used in the cloud and some of the types of technologies involved.
Cloud Platform: Morph
Let’s look briefly at a cloud platform solution for scalable webapps. Morph allows you to upload your war file and have it automatically deployed to as many virtual nodes as you need. Morph handles the load balancing for you and allows you to add or subtract virtual servers elastically as your load dictates. Moreover, your webapp immediately has access to some of the most commonly needed resources like relational databases, mail servers and soon also a JMS service! These services are provisioned, configured, backed up and monitored by Morph 24×7. Better still, Morph ensures high availability of your webapp by configuring a fail-over pair as a matter of course. And at the heart of this great service, what do we find? Yes, that’s right – Jetty! Jetty is the servlet container into which webapps are deployed and has been especially configured for a cloud-hosted environment.
Cloud Technology: Terracotta
Terracotta provides a shared memory model. It is mostly unobtrusive in code, generally just requiring good synchronization boundaries around the objects to be shared to enable it to efficiently disseminate updates amongst nodes. You can use Terracotta to implement cloud-type facilities for your webapp when running in Jetty. In fact, Jetty can already make use of Terracotta as its distributed sessions mechanism. We’ve recently been collaborating with the Terracotta guys to really hone the performance of the Jetty/Terracotta session clustering and we’re getting some very pleasing results, which will be the subject of another blog.
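To give a feel for what "good synchronization boundaries" might look like, here is a minimal sketch in plain Java. The class name `SharedHitCounter` and its methods are made up for illustration, and no Terracotta API or configuration is shown. The only assumption is that every read and write of the shared state happens inside a synchronized method, so a clustering layer has well-defined points at which to pick up changes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical shared object; the name and fields are illustrative only
// and are not part of Jetty or Terracotta.
final class SharedHitCounter {
    private final Map<String, Integer> hitsByPage = new HashMap<>();

    // All mutation happens inside one synchronized method, so each
    // update has a clear beginning and end.
    public synchronized int recordHit(String page) {
        return hitsByPage.merge(page, 1, Integer::sum);
    }

    // Reads take the same lock, so they never see a half-applied update.
    public synchronized int hitsFor(String page) {
        return hitsByPage.getOrDefault(page, 0);
    }
}
```

The point of the sketch is the shape of the code rather than the counter itself: the shared state is private, and it is only ever touched while holding the object's lock.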
Cloud Infrastructure: Hadoop
Hadoop is an open-source implementation of the MapReduce algorithm for breaking computational problems into smaller blocks that can be distributed over a cluster so that they may execute in parallel. Hadoop uses Jetty in two ways: to help distribute the jobs amongst the nodes, and also to monitor and report job execution status. FYI, Hadoop recently broke the terabyte sorting record – well done guys!
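The idea of breaking a problem into blocks that are processed in parallel and then combined can be sketched in a few lines of plain Java. This is not the Hadoop API: the class `WordCountSketch` and its methods are hypothetical, and a parallel stream on a single machine stands in for a cluster of nodes.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Illustrative only: an in-process imitation of the map and reduce phases.
final class WordCountSketch {
    // Map phase: turn one block of text into a partial word count.
    static Map<String, Integer> mapBlock(String block) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : block.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Reduce phase: merge the partial counts into one result.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((word, n) -> total.merge(word, n, Integer::sum));
        }
        return total;
    }

    // Each block is mapped independently (here on local threads, standing
    // in for cluster nodes), then the partial results are reduced.
    static Map<String, Integer> countWords(List<String> blocks) {
        List<Map<String, Integer>> partials = blocks.parallelStream()
                .map(WordCountSketch::mapBlock)
                .collect(Collectors.toList());
        return reduce(partials);
    }
}
```

Because each block is mapped without reference to any other block, the map calls can run anywhere and in any order, which is what makes this style of computation easy to spread over many nodes.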
Cloud Infrastructure: Gigaspaces
Gigaspaces provides a space-based infrastructure to scale out applications. The space’s job is to ensure that data can be made available in the most efficient way possible to whichever node requires it. There are different options for configuring the space, including partitioning based on characteristics of the data, or data persistence via an RDBMS. A number of different API facades onto the space (JMS, JDBC, Map and Space) are provided, so you can pick the semantics appropriate to your application. Jetty itself uses the Space API in the implementation of the Jetty/Gigaspaces clustered sessions module (session clustering is explained in the note at the end of this post).
However, the space can be used for more than just distributing data, and can also be used to scale applications themselves, more akin to grid computing. In this scenario, nodes in the space (or grid) called processing units execute application logic and can be added on demand to handle load. Webtide has been collaborating with the Gigaspaces guys and we’ve put a Jetty instance into each and every processing unit. This means that a webapp can be instantly scaled simply by deploying it to more processing units in the grid.
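As a rough illustration of the partitioning option mentioned above, the following sketch routes each piece of data to a partition based on one of its characteristics. It is not the Gigaspaces API: `PartitionRouterSketch` and the idea of a "routing key" are assumptions made for the example, and a real space may choose partitions quite differently.

```java
// Illustrative only: picks a partition from a hypothetical routing key.
final class PartitionRouterSketch {
    private final int partitionCount;

    PartitionRouterSketch(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
    }

    // The same key always maps to the same partition. floorMod keeps the
    // result in [0, partitionCount) even when hashCode() is negative.
    int partitionFor(Object routingKey) {
        return Math.floorMod(routingKey.hashCode(), partitionCount);
    }
}
```

With a router like this, all data sharing a key (say, a customer id) lands on the same partition, so a node that needs that data knows where to look for it.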
Session clustering refers to the ability of more than one node to access the servlet session established between a client and the servlet container. In a non-clustered environment, the servlet session exists only in the memory of the servlet container on the node hosting the container. Thus, if that process or node fails, that user’s session and its associated data are lost. With clustered sessions, another node can take over for the failed one, accessing the established session and allowing the user to continue using the site. So, in a cloud environment, where the physical nodes hosting your webapp may not be permanently allocated to you and may change over time, or where nodes fail and are replaced by others, your site can continue to be available to all your users.
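The failover behaviour described here can be imitated in a few lines of plain Java. This is only a sketch: `SharedSessionStore` and `NodeSketch` are hypothetical names, an in-process map stands in for whatever distributed store a real cluster would use, and none of this is Jetty's session management API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for a clustered session store that every node can reach.
final class SharedSessionStore {
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Returns the attributes for a session id, creating them on first use.
    Map<String, Object> sessionFor(String sessionId) {
        return sessions.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>());
    }
}

// Stand-in for one servlet container node. It keeps no session state of
// its own; everything lives in the shared store.
final class NodeSketch {
    private final SharedSessionStore store;

    NodeSketch(SharedSessionStore store) {
        this.store = store;
    }

    void setAttribute(String sessionId, String name, Object value) {
        store.sessionFor(sessionId).put(name, value);
    }

    Object getAttribute(String sessionId, String name) {
        return store.sessionFor(sessionId).get(name);
    }
}
```

If one `NodeSketch` stores an attribute under a session id and is then discarded, a second `NodeSketch` built on the same store can still read that attribute, which is the essence of another node taking over an established session.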