The expert group for JSR 315 (servlet-3.0) has come to a bit of an impasse regarding some new features for auto discovery of servlets and filters. Some members of the EG have security and flexibility concerns regarding these features, but others do not think the concerns are significant enough to warrant additional complexity in configuration options.

In order to resolve this impasse, the EG has decided to solicit more community feedback. So this is my biased blog soliciting that feedback. I say biased, because I am a strong advocate FOR some additional flexibility in these new features. I understand that those AGAINST will also be making their case to the community and I will link to them from here once they become available. Thus I’m looking for community support of my views, corrections to my representation of the situation, or just people telling me to chill and not worry so much about such things.

The Requirement

It can be difficult, confusing and error prone to configure a web application that is built from many components, frameworks and web tools. The problem is that the current monolithic web.xml must contain all the configuration for all the components, frameworks and tools. This means that using a web framework is not as simple as just dropping a jar file into WEB-INF/lib. Currently snippets of web.xml need to be taken from the framework (either from templates or documentation) and merged into the main web.xml for the web application. The web.xml file then becomes a mix of structure declarations, application configuration and framework configuration.

The requirement given to JSR-315 was to come up with a way to simplify deployment of frameworks and to allow modular or decomposed configuration.  There are already some features that partially address this in servlet 2.5, specifically:

  • Annotations on servlet classes can be used to add additional configuration to the servlets declared in web.xml. The current support is only for the @PostConstruct, @PreDestroy, @RunAs and @Resource annotations, but this is the start of decentralized configuration.
  • Jar files in WEB-INF/lib are scanned for TLD descriptors that can instantiate Listeners.

3.0 Framework Pluggability

The early review draft of Servlet-3.0 (out soon) will contain several new features to further meet the requirement for decentralized, drop-in style configuration:

  1. Additional annotations such as @Servlet, @Filter and @FilterMapping have been defined with sufficient parameters (e.g. urlPattern and initParams) to be able to configure filters and servlets entirely from annotations of classes contained within WEB-INF/lib or WEB-INF/classes.
  2. Support for web.xml fragments to be included in the /META-INF directory of jar files within WEB-INF/lib. These web fragments are combined with the web.xml using well-defined merging rules (already mostly defined when arbitrary element ordering was supported in 2.5 web.xml).
  3. Programmatic configuration of Filters and Servlets via new methods on ServletContext.  These methods are only active when called from a Listener class defined either in a web.xml fragment or discovered TLD file.
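To make the first of these features concrete, here is a minimal Java sketch of what annotation-driven discovery amounts to. It does not use the servlet API at all: the @Servlet annotation below is a local, hypothetical stand-in with an assumed urlPattern parameter, and ChatServlet, PlainHelper and discover are names invented for the illustration. A real container would inspect class files found in WEB-INF/lib and WEB-INF/classes rather than a hand-built list.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types come from the servlet API.
class AnnotationDiscoverySketch {

    // Local stand-in for a draft-style @Servlet annotation (assumed shape).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Servlet {
        String urlPattern();
    }

    // An annotated class that would be picked up without any web.xml entry.
    @Servlet(urlPattern = "/chat/*")
    static class ChatServlet { }

    // An ordinary class in the same jar; it carries no annotation and is ignored.
    static class PlainHelper { }

    // Maps each discovered urlPattern to the simple name of the annotated class.
    static Map<String, String> discover(List<Class<?>> candidates) {
        Map<String, String> mappings = new LinkedHashMap<>();
        for (Class<?> candidate : candidates) {
            Servlet annotation = candidate.getAnnotation(Servlet.class);
            if (annotation != null) {
                mappings.put(annotation.urlPattern(), candidate.getSimpleName());
            }
        }
        return mappings;
    }

    public static void main(String[] args) {
        System.out.println(discover(List.<Class<?>>of(ChatServlet.class, PlainHelper.class)));
    }
}
```

The point of the sketch is that nothing outside the class itself declares the mapping, which is exactly what makes drop-in deployment possible and what makes it hard to see from web.xml what has been deployed.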

The intent of these features is that a web framework can have all of its default configuration baked into its jar as either annotated servlets, web.xml fragments or as code triggered by a TLD-defined listener. Thus it should be possible to simply drop a web framework jar into WEB-INF/lib and have that framework available without any editing of the web.xml. It is proposed that these features are turned on by default when a 3.0 web.xml is present or there is no web.xml at all. Some (all?) of these features (at least discovery of annotated servlets) can be turned off by using the metadata-complete attribute within a web.xml.

The Automagic Discovery Problems

I really like these new features. I especially like web.xml fragments and programmatic creation of servlets and filters. I can also appreciate why those who like annotations would like the ability to completely configure a servlet in annotations.

However I have several significant concerns about the security and flexibility aspects of the automatic discovery mechanism implicit in these proposals:

  1. Accidental Deployment: Web applications can contain many third party jars. I have seen several web applications that have over 100 jars that have been pulled in by their frameworks and their dependencies. Apart from the performance cost of scanning all these jars at startup, there is a real risk of accidental deployment of features, debugging aids, admin UIs or hacker attacks. The developers and deployers must be aware of all the web features and facilities in all the jars they use! Maintainers that update a jar within a webapp will have to perform due diligence that they are not adding new web features unintentionally. Tools will not be able to greatly assist with this process, as the analysis of the programmatic configuration is an NP-complete problem, so you will need to deploy a jar to see what it defines, and even then you don’t know whether it may later decide to define new filters and/or servlets.
  2. All or Nothing: If there is just one servlet defined under WEB-INF that a developer does not want automagically deployed, then there is no mechanism to select included and/or excluded jars. The only options are:
    • to modify the jar to remove the unwanted configuration;
    • to turn off automagic discovery and define every other filter, servlet and listener in web.xml;
    • to let the unwanted servlet deploy and try to block it with security constraints.
  3. Parameterization: The jars with auto configured frameworks will contain a good default configuration, most probably set up for development. There is currently no mechanism available to parameterize the configuration within a jar, other than by overriding it in the main web.xml. This will lead either to configuration in two places, to configuration cut-and-pasted out of the jar, or to the all-or-nothing options above.
  4. Ordering: The ordering of auto discovered configuration has yet to be defined. Ordering is important as this can affect the order of filters and which configuration may be overridden. If the order (when it is defined) is not the desired order, then there is no mechanism to change the order and the all-or-nothing options above will have to be used.
  5. Disabling: The metadata-complete attribute will disable automatic scanning for annotations in all jars. It may also disable checking for web.xml fragments (under discussion). But there is currently no mechanism in 3.0 to disable the scanning for TLD listeners with their new capability for deploying arbitrary filters and servlets. Deployment of closed source jars will become an even greater exercise in trust, as only decompilation will reveal what may be deployed.

The Proposed Solution

Joe Walker (of DWR fame) proposed a simple solution to these problems, which I embellished with some additional ideas. This proposal has also evolved a little as a result of telephone and email discussions with the EG.

The main idea is to allow web.xml to have optional <include> elements to guide the automagic discovery of configuration. Without a web.xml, or with a 3.0 web.xml that does not list any inclusions, the default would be to search all of WEB-INF for annotated servlets and filters, TLD listeners and web.xml fragments as currently proposed. If, however, a web.xml contained <include> elements, then the discovery process would be modified as the following examples illustrate:

<include src="WEB-INF/lib/dwr.jar"/>
<include src="WEB-INF/lib/cometd.jar"/>
<include src="WEB-INF/classes"/>

These includes would cause only dwr.jar and cometd.jar to be scanned for annotations, TLD descriptors and web.xml fragments, and the WEB-INF/classes directory to be scanned for annotated servlets. No other jars or classes would be scanned unless listed in their own include elsewhere in the web.xml. The ordering between the includes is well defined, and these elements could be placed in the web.xml with other listener/servlet/filter/include declarations before, between or after them.
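The selection rule described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not how any container implements it: the class and method names are hypothetical, other.jar is an invented example, and the sketch handles only includes that name a whole jar or directory (not the jar!entry form shown in the later examples).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of how <include> elements could narrow discovery.
class IncludeDiscovery {

    /**
     * Returns the locations to scan, in scan order.
     *
     * @param includes  src values of the <include> elements, in web.xml order
     * @param available every jar or directory found under WEB-INF
     */
    static List<String> locationsToScan(List<String> includes, List<String> available) {
        if (includes.isEmpty()) {
            // No <include> elements: fall back to scanning everything.
            return new ArrayList<>(available);
        }
        List<String> selected = new ArrayList<>();
        for (String src : includes) {
            // Only listed locations that actually exist are scanned, once each,
            // and in the order the includes appear in web.xml.
            if (available.contains(src) && !selected.contains(src)) {
                selected.add(src);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> available = List.of(
                "WEB-INF/lib/other.jar",   // invented jar that is never listed
                "WEB-INF/lib/cometd.jar",
                "WEB-INF/lib/dwr.jar",
                "WEB-INF/classes");
        List<String> includes = List.of(
                "WEB-INF/lib/dwr.jar",
                "WEB-INF/lib/cometd.jar",
                "WEB-INF/classes");
        System.out.println(locationsToScan(includes, available));
    }
}
```

With the three includes from the example, other.jar is never scanned, and dwr.jar is processed before cometd.jar because that is the order in which the includes are declared.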

<include src="WEB-INF/lib/dwr.jar!META-INF/web.xml"/>

This include would use the web.xml fragment within the dwr.jar.  Similar includes could be used to scan for differently named web.xml fragments and TLD descriptors either within jars or as files within WEB-INF.

<include src="WEB-INF/lib/dwr.jar!org/dwr/ReverseAjaxServlet.class"/>

Scan the specified class within the DWR jar for servlet or filter annotations.  Note that this clause is effectively the same as just a <servlet> or <filter> element, as that would cause the class to be scanned and any annotations for mappings respected.  In essence this proposal just extends the current ability to nominate a servlet or filter for auto configuration to jar files, TLD files and web.xml fragments.

<include src="WEB-INF/lib/cometd.jar!dojox/cometd/CometdServlet">
  <init-param>
    <param-name>maxIntervalMs</param-name>
    <param-value>3000</param-value>
  </init-param>
</include>

This include element would deploy the annotated CometdServlet from the cometd.jar and would apply the init-param as an override to any default init-params specified in annotations. Similarly, init-params could be set on web.xml fragments or even for listeners discovered in TLD files.
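The override rule is simple enough to sketch in Java. This is a hedged illustration rather than a proposed API: the class and method names are hypothetical, and the default values shown (including the logLevel parameter) are invented for the example, not taken from cometd.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of init-param overriding for an included servlet.
class InitParamOverrides {

    /**
     * Starts from the defaults declared in the jar (for example via annotations)
     * and lets any init-param given inside the <include> element win.
     */
    static Map<String, String> effectiveInitParams(Map<String, String> defaults,
                                                   Map<String, String> overrides) {
        Map<String, String> merged = new LinkedHashMap<>(defaults);
        merged.putAll(overrides); // entries from web.xml replace the jar defaults
        return merged;
    }

    public static void main(String[] args) {
        // Invented defaults standing in for whatever the jar declares.
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("maxIntervalMs", "10000");
        defaults.put("logLevel", "1");

        // The single override from the <include> example.
        Map<String, String> overrides = Map.of("maxIntervalMs", "3000");

        System.out.println(effectiveInitParams(defaults, overrides));
    }
}
```

Parameters that are not mentioned in the include keep the values shipped in the jar, so the web.xml only has to state what differs from the defaults.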

An earlier form of this proposal included wild-card support for the partial URIs passed to the include elements.  While this may be useful, it does increase the complexity and I believe the proposal works well enough for most cases without it.  A web application with 100 jars is still only likely to include a few web toolkits.

The Case Against?

Due to my declared bias, I am not the best one to make the case against.  But I will paraphrase it as best I can and will link to the blogs of others when they become available.

The case against the <include> element is that it adds complexity and confusion that can be done without, because the majority of servlet users are either unconcerned about the possibility of accidental deployment or happy to restrict themselves to business as usual with a single main web.xml.

Rebuttal

So I’m debating myself now…  I think this is called a straw man.

I don’t see this proposal as complex, especially now that I have removed wild-carding. While the list of include elements may sometimes be long, it will still be far more compact, readable and maintainable than copying all the configuration into a single web.xml.

I do find many servlet users that are very concerned both about security and ease of configuration and would at least like the option to explicitly list which components are auto configured. Please tell me if you are one or not!

 

JSR-315 Needs YOU!

28 thoughts on “JSR-315 Needs YOU!”

  • April 25, 2008 at 12:23 am

    I would worry that Servlets, Filters, Listeners that are in various libraries would automatically get turned on when I don’t want them (or redo how they get invoked).  Spring, Quartz, etc. I can see lots of these being a problem and then you have to split all of these out into separate jars so that users of those frameworks get what they want.  I like the include option but I also like to be fully in control AND have ease of deployment of jars that come from other folks.

    I’d almost make a maven plugin that would generate this stuff from a pom and then include them all in a generated file (can an included file call another include?)

    Just some thoughts…

  • April 25, 2008 at 2:43 am

    I should say that while I’m at slight odds with the current form of this proposal from JSR-315, I’m not at odds with the process or the group itself.  For those that remember the bad old days of tantrums in JSR-154 with resignations etc…. this is not at all like that.  The group jointly decided that we needed more community input to help weigh up the differing pros and cons.

    My  straw-man paraphrasing of the case against is not intended to belittle that point of view.  Indeed the proposal for include elements has been improved and simplified as a result of their valid concerns.

  • April 25, 2008 at 5:42 am

    If it works as you explained, then the majority of users would never need to know about the include element, yes? In which case, I can’t see what the objection is. On top of a complete auto-discovery mechanism, this seems like a relatively small amount of work.

    Instead, I’m more nervous about the notion that the default is to auto-crawl every included jar and put it up on the web. I’d rather see the auto-discovery be done only when requested in the web.xml. Generally I like secure defaults, and Java’s poor library management means people are inclined to throw in all sorts of jars without looking too closely at them.

    I hope that helps, and many thanks for getting community input!

  • April 25, 2008 at 2:25 pm

    I am a big fan of convention over configuration, it’s a key tenet in the design and execution of maven, so I am quite pleased with the general direction this is taking.

    However I have similar reservations to yours, Greg: it’s not quite detailed enough and, perhaps worse, it transparently forces convention on a community that hasn’t necessarily dealt with it to a large degree. Unless there is a deterministic lifecycle involved in the discovery process, all manner of bad things are possible: things being loaded out of the intended order, even unintended things being loaded.

    Even in maven, where there is a strong convention base, there is a need to make sure that X runs before Y in any given phase of a lifecycle, so I agree there ought to be some sort of middle ground here that balances convention and configuration in a declarative way. Being able to specifically state that jar X should have a discovery process run over it, followed by jar Y having that same discovery process run, is a good place to start. Furthermore, being able to inject property changes into that discovery process is a vital component as well.

    Seems to me that I am restating your arguments now, but I guess phrasing it in another way can’t hurt.

    cheers!

  • April 25, 2008 at 7:23 pm

    I mostly side with you.

    I do not like the idea of things being automagically turned on. Especially from annotations in classes! Those are very hard for me to change/control/maintain. I would do away with auto-wiring from annotations.

    I like the cleanliness of including web.xml pieces, though you will have to include ordering options which could complicate matters ( "after:*", "before:*" ).

    The compromise is for me to explicitly turn on features using something like the include option you mention.  And being able to pass in parameters to those web.xml fragments would be perfect!  So if there is a way to add fragment specific parameters.. (like an init-param that only applied to that web.xml fragment ).

    But then if you go that far, I like the idea of doing away with XML and moving to straight Java (programmatic construction of webapp ); as much of the app configuration/management is moving towards ( spring, guice, etc ).

    If you have time, take a look at Tapestry IoC, it’s inspired from Guice, and I think is a fine implementation to draw further inspiration from.. (for programmatic configuration/glueing, with pluggable/modular support.)

  • April 26, 2008 at 10:27 am

    What about having a couple of different dirs: one whose jars (‘s contents) should be auto-included (in its many ways), called e.g. WEB-INF/autodeploy, and one for additional jars that should just be on the classpath, at the current WEB-INF/lib? That would lead to a very conscious decision, IMO. The autodeploy could contain not only jars, but web.xml fragments.

    I do however think that overrides must be possible, and for that the suggested include notation could be employed, maybe just changed to "override" instead of "include"?

    There could be a third dir for overrides, WEB-INF/config: all xml-snippets in here would be read, and override the ones from autodeploy. This would mean that the config in WEB-INF/autodeploy were only defaults, with the user config residing in WEB-INF/config.

    Any element in web.xml would override all.

    A specific value for init-param could denote that this element was NOT_CONFIGURED. This could be set in the autodeploy part (the defaults). Then one had to override it in a WEB-INF/config xml snippet (or in web.xml proper) for the servlet container to start up (it’d refuse to start if there was any init-param or others that had the value NOT_CONFIGURED (or javax.servlet.NOT_CONFIGURED, or whatever) at startup time).

    This would enable a simple "add module" approach: stash in the jar into WEB-INF/autodeploy. However, the module requires some parameters – e.g. the mail server hostname – and it won’t work unless this is configured. When trying to start the server, one would get an error message "the init-param com.example.mailserver is NOT_CONFIGURED, and needs to be set in either a WEB-INF/config snippet, or in web.xml proper. The parameter was defined in mailtool.jar!web.xml:line123" – and then one could go in there, and read the comments on this parameter – or the error message could even dump the preceding comment along with the error.

  • April 27, 2008 at 3:48 am

    I (partially) agree with you,  my disagreement regards this part:

    “Without a web.xml or with a 3.0 web.xml that does not list any inclusions, the default would be to search all of WEB-INF for annotated servlets and filters, TLD listeners and web.xml fragments as currently proposed”

    The default behavior should be the exact opposite, do not autoload anything if it is not specified, and you could allow include to use regex in case one needs to load everything, like

    <include src="WEB-INF/lib/*.jar"/>

  • April 27, 2008 at 7:56 am

    Mohamed,

    I agree with you that the automagic discovery should be off by default and need to be explicitly turned on. However, there is a lot of support in the EG and from the EE JSRs for zero web.xml configuration, so I can’t see that being accepted.

    Besides, the servlet spec already has security constraints the wrong way around (everything allowed except for things explicitly denied), so this is no worse than that 🙁

    Note that it will not be turned on by default if the web.xml is a 2.x descriptor.

  • April 27, 2008 at 8:02 am

    Thanks for the further info, Greg.

    If automagic discovery will be the default, then they must, must, must provide some easy way for people to know what has been discovered and where it came from. I don’t know quite what the right thing is, but definitely some sort of API to figure out what is being served and why. Plus sample code and an easy-to-install servlet that displays that on a web page to anybody on localhost. And a simple way to dump out that info as well.

    I’m all for magic, but if there will be an easy way for people to accidentally open security holes, there should be equally easy ways to discover and block the holes.

  • April 27, 2008 at 6:56 pm

    Hmmm, I would NOT want any of this auto discovery on by default. In fact, I would find that a step back.

    Maybe it is just me, but is the big, unstructured web.xml file a real problem or one merely perceived by the EG? I for one, having built lots of large web applications, have never found this to be a problem. True, there is some configuration to do in web.xml, but it is just configuration. And most of it, I want to set for my case.

    And when talking about annotations. I like annotations, but not for configuration data, like what web.xml expresses. XDoclet has had the feature of auto-generating a web.xml from annotations for a long time already, but it was not anything, that really helped anyone (says I).

    I think, the web.xml file is really nice as it is. The place where *I* wire up what *I* include.

    But then again, I also never saw web.xml as a problem.

    ?

  • April 28, 2008 at 10:10 am

    Could we please, please, have the possibility to get the ServletContext from a ServletRequest? All we can do now is go through the HttpSession (if it is created), or register a custom ServletFilter that "remembers" the current ServletContext in a ThreadLocal var.
    This alone would be a fantastic addition to the Servlet API!

  • April 28, 2008 at 10:52 am

    Automagic lookup needs to be turned off. There are lots of scenarios, say Spring configuration, where the framework has servlet, listener and Struts plugin ways to create the WebApplicationContext. There are also application-server-specific classes as well: WebLogicXZY vs WebSphere or JBoss (or Jetty). These cannot be automatically turned on.

    I have no problem with auto discovery in WEB-INF/classes but NOT in WEB-INF/lib.  I also like building up the web.xml information from snippets in WEB-INF/conf – that would have been very very handy for an application that is deployed over and over (ASP model).

    Please do NOT have it be auto configured in WEB-INF/lib – that is a huge mistake.

  • April 28, 2008 at 2:53 pm

    If you are going to change the way web applications are loaded surely it should be configurable and backwardly compatible.

    For example you could have e.g. a configurator tag which specifies a class to be used for configuration.  (A bit like the listener tag) This would give you the options to do lots of things e.g.  If the Configurator Class implements the ServletConfiguration interface then it is a servlet configuration and is passed a class by the servlet container to allow it to register new servlets.  By default this could be a web.xml servlet configurator which XML parses the web.xml for its defined servlets.

    This gives you lots of possibilities for tag libs, filters and whatever else could be added in the future, as well as servlet container specific configurations that may be required (can’t really think of a good example). Just implement the relevant interface and place it in your web.xml, and it is passed the objects relevant to allow you to do the configuration.

    If you want to do auto-deploy, then require servlet containers to implement the autodeploy mechanism, e.g. a WebXmlFragmentServletConfigurator would scan the classpath for fragments representing servlet configurations and deploy them. You could have two methods, one for just WEB-INF/classes and one for WEB-INF/lib, assuming the servlet container can provide the relevant classloader to the implementation, or use a different classloader depending on the location of the Configurator itself. (The normal class loading rules for servlet containers would need to be considered of course.)

  • April 28, 2008 at 5:23 pm

    I think that tools could be perfectly useful for analyzing your configuration: what makes you think that this is an NP-complete problem? As far as I can see, all this requires is a simple search of:

    • a bunch of WARs/JARs/folders for files called WEB-INF/web.xml, and
    • each .class file for one of three annotations

    This is a simple, easy and linearly fast search problem, with only a few hundred bytes of each class required reading. This is only difficult if you search for those annotations on classes that are loaded dynamically, which doesn’t have any real benefits (and so hopefully isn’t proposed). Right?

  • April 28, 2008 at 7:12 pm

    Complete reasoning on the TSS-post:
    Please make it possible to completely turn off those automatisms. Make it possible to just define one single web.xml with NO other possibilities for additions. Important for the corporate world.
    If this switch is available, then automatism can be put into the spec… else: stay away from it.
    Give the user the choice.

  • April 28, 2008 at 9:03 pm

    I don’t see how the proposed solution can handle order dependency of annotated servlet-mapping or filter-mapping. For example, incorporating multiple filters in a chain, one might want to order them in a different order than auto-discovered, even when specifically included.

  • April 29, 2008 at 5:33 am

    Sarah,
    scanning can find annotations and web.xml fragments. However it cannot discover what filters and servlets a Listener may configure. Listeners may call arbitrary code on the classpath to determine what filters and servlets they will deploy via the new APIs. It is not possible to analyze code to see what the eventual configuration will be. The only option will be to run the code, and then you still don’t know if it will behave differently when run in a different environment (perhaps a framework may try to phone home and, if it can, deploy some sort of license servlet; if it can’t, it may run in some limited test mode).

    So either you will have to trust the Listeners totally – or you are stuffed, because currently there is no way of stopping them other than cracking open the jars and removing them from the TLD descriptors.

  • April 29, 2008 at 5:35 am

    The proposed include mechanism can specify individual classes, web.xml fragments and/or TLD descriptors.   So long as those are internally consistent, then ordering should be able to be done at a very fine level of granularity.

  • April 30, 2008 at 1:49 am

    Never mind web.xml… please, please give me programmatic login and remember me!

  • May 1, 2008 at 4:52 am

    The servlet spec needs to handle login and user attributes, not just delegate the problem to JAAS.  The EG really needs to look long and hard at JSR 286 and figure out what they can do to make the next Portlet Spec better. Many of its deficiencies are a direct result of having to kludge around issues in the servlet spec.

  • May 1, 2008 at 8:12 am

    If there will be no way of saying which jars should be used for autodiscovery, then I will be forced to define everything in web.xml as in the old days. We can’t afford not knowing what is actually used. This new feature would be useless for us. So I definitely agree with you, and I hope it will be possible to define the include tag in web.xml.

  • May 3, 2008 at 7:37 am

    I prefer the current configuration of filters instead of auto discovery. It’s clean and elegant, and most importantly, manageable. Usually, a web app does not have many filters. I would like to spend time on configuring filters instead of figuring out which filters are used.

  • August 11, 2008 at 7:54 pm

    I looked at the early draft JavaDoc. The "addFilter() / addServlet()" on ServletCtx looks good. But why is getServlets() still deprecated? Wouldn’t it be good to check if a servlet is already added (or not)?
