Saturday, October 8, 2011

Servlet

URI And URL Difference

Before going into URL and URI,
you need to know some background. Do you ever thought about, who decides what is URL? and what is URI? or who is the authority for URL, URI and such naming conventions?

W3C and IETF

There are two separate bodies W3C and IETF. The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. The specifications for URI and URL are defined by W3C.
It was founded and headed by Sir Tim Berners-Lee. He is one of the greatest scientist living now. He created this www model of server and client architecture, a web server serving web pages through network and client browsers reading it. He did it first when he was with CERN. He created the world’s first web page http://info.cern.ch/ . He was also in HTML 2.0 working group of IETF. So it is very appropriate for W3C to define URI and URL.
Internet Engineering Task Force (IETF) is an open international community working on Internet related standards. In general it addresses issues of Internet protocols.
In particular W3C defines the web, html specifications and related information. IETF defines IP, TCP, or DNS, for security at any of these levels; with SMTP or NNTP protocols.
So the stake for definition for URI and URL is with W3C. But as there is only a thin line between these organisation’s work they tend to cross each other. In some place IETF gives dissimilar definition for URL and URI.
If you read through the huge volume of journals available in web for this topic, you can sense that experts :-( are using URI and URL synonymously. Which is causing all these confusion among the web community about URI, URL and URN.

URI

An URI identifies a resource. It is a locator. It includes a URI scheme, authority, path, query and fragment by syntax. For example, http: is a URI scheme.
Syntax of URI based on RFC 3986
foo://example.com:8042/over/there?name=ferret#nose
\_/ \______________/\_________/ \_________/ \__/
| | | | |
scheme authority path query fragment
| _____________________|__
/ \ / \
urn:example:animal:ferret:nose

URL

The term “Uniform Resource Locator” (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network “location”).

URN

The term “Uniform Resource Name” (URN) is used to identify a resource independent of its location. Example urn:ISBN:1-23-432536-5

Summary of differences between URI and URL

  • A URI was either a URL or a URN.
  • URL is a subset of URI. It identifies a resource using one of the URI schemes.
  • URN is a subset of URI. It identifies a resource independent of its location.
Whenever you have a doubt that, whether something is a URL or URI then use URI as a term to identify it. Since URI is a super set of URL.

Session Life Cycle

When I say life cycle, I can hear you murmur “Oh no not again, how many life cycles I have to deal with”! In real world everything has life cycle, then why not in programming, after all, software is all about mimicking real life.  In a previous article I discussed about methods used for session tracking.
Frog Life Cycle
Frog Life Cycle
It has fundamental information about what a session is and how to manage it. At the end of that article I have given a preview about “5. Session tracking API”. Now we are going to dive deep into it.
Just to recap, session is a conversion between a server and a client. An elite way to manage the session in servlets is to use API. Any web server supporting servlets will eventually have to implement the servlet API. It may or may not provide with more features of luxury but the minimum is guaranteed. Servlet specification ensures that, the minimum features provided make the session management job easier. Servlet API will use one of the underlying traditional mechanisms like cookies, URL rewriting, but that will happen behind the scenes and you need not worry about it!

How to access HttpSession object?

Every request is associated with an HttpSession object. It can be retrieved using getSession(boolean create) available in HttpServletRequest. It returns the current HttpSession associated with this request or, if there is no current session and create is true, and then returns a new session. A session can be uniquely identified using a unique identifier assigned to this session, which is called session id. getId() gives you the session id as String.
isNew() will be handy in quite a lot of situations. It returns true if the client does not know about the session or if the client chooses not to join the session. getCreationTime() returns the time when this session was created. getLastAccessedTime() returns the last time the client sent a request associated with this session.

How to store data in session?

Once you have got access to a session object, it can be used as a HashTable to store and retrieve values. It can be used to transport data between requests for the same user and session. setAttribute(String name, Object value) adds an object to the session, using the name specified. Primitive data types cannot be bound to the session.
An important note to you, session is not a bullock cart. It should be used sparingly for light weight objects. If you are in a situation where you have to store heavy weight objects in session, then you are in for a toss. Now it’s time to consult a software doctor. Your software design is having a big hole in it. HttpSession should be used for session management and not as a database.
Follow a proper naming convention to store data in session. Because it will overwrite the existing object if the name is same. One more thing to note is your object needs to implement Serializable interface if you are going to store it in session and carry it over across different web servers.

How to retrieve data from session?

getAttribute(String name) returns the object bound with the specified name in this session. Be careful while using this, most programmers fell into a deeply dug pit NullPointerException. Because it returns null if no object is bound under the name. Always ensure to handle null. Then, removeAttribute(String name) removes the object bound with the specified name from the session. Note a point; be cautious not to expose the session id to the user explicitly. It can be used to breach into another client’s session unethically.

How to invalidate a session object?

By default every web server will have a configuration set for expiry of session objects. Generally it will be some X seconds of inactivity. That is when the user has not sent any request to the server for the past X seconds then the session will expire. What do I mean by expire here. Will the browser blowup into colorful pieces? When a session expires, the HttpSession object and all the data it contains will be removed from the system. When the user sends a request after the session has expired, server will treat it as a new user and create a new session.
Apart from that automatic expiry, it can also be invalidated by the user explicitly. HttpSession provides a method invalidate() this unbinds the object that is bound to it. Mostly this is used at logout. Or the application can have an absurd logic like after the user logins he can use the application for only 30 minutes. Then he will be forced out. In such scenario you can use getCreationTime().
Generally session object is not immortal because of the default configuration by the web server. Mostly these features are left to the imagination of web server implementers. If you take Apache Tomcat 5.5, there is an attribute maxInactiveInterval. A negative value for this will result in sessions never timing out and will be handy in many situations.
Long live the sessions!

An example servlet program to demonstrate the session API

import java.io.IOException;
import java.io.PrintWriter;
import java.util.Date;
import java.util.Enumeration;
 
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;
 
public class SessionExample extends HttpServlet {
 
public void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
response.setContentType("text/html");
PrintWriter out = response.getWriter();
 
// getting current HttpSession associated with this request or, if there
// is no current session and create is true, returns a new session.
HttpSession session = request.getSession(true);
 
// Calculating Hit Count
Integer count = (Integer) session
.getAttribute("SessionExample.HitCount");
if (count == null)
count = new Integer(1);
else
count = new Integer(count.intValue() + 1);
session.setAttribute("SessionExample.HitCount", count);
 
out.println("<HTML><HEAD><TITLE>SessionExample</TITLE></HEAD>");
out.println("<BODY><H1>Example session servlet to "
+ "demostrate session tracking and life cycle</H1>");
 
// Displaying the hit count
out.println("Hit count for your current session is " + count);
 
out.println("<H2>Some basic session information:</H2>");
out.println("Session ID: " + session.getId() + "</BR>");
out.println("Is it a new session: " + session.isNew() + "</BR>");
out.println("Session Creation time: " + session.getCreationTime());
out.println("(" + new Date(session.getCreationTime()) + ")</BR>");
out.println("Last accessed time: " + session.getLastAccessedTime());
out.println("(" + new Date(session.getLastAccessedTime()) + ")</BR>");
out.println("Max in active time interval: "
+ session.getMaxInactiveInterval() + "</BR>");
// Checks whether the requested session ID came in as a cookie
out.println("Session ID came in as a cookie: "
+ request.isRequestedSessionIdFromCookie() + "</BR>");
 
out.println("<H2>Iteratively printing all the values "
+ "associated with the session:</H2>");
Enumeration names = session.getAttributeNames();
while (names.hasMoreElements()) {
String name = (String) names.nextElement();
String value = session.getAttribute(name).toString();
out.println(name + " = " + value + "</BR>");
}
 
out.println("</BODY></HTML>");
}
}
Output of the example session servlet
Output of the example session servlet

Servlet JSP Communication

getServletConfig().getServletContext().getRequestDispatcher(“jspfilepathtoforward”).forward(request, response);
The above line is essence of the answer for “How does a servlet communicate with a JSP page?”
When a servlet jsp communication is happening, it is not just about forwarding the request to a JSP from a servlet. There might be a need to transfer a string value or on object itself.
Following is a servlet and JSP source code example to perform Servlet JSP communication. Wherein an object will be communicated to a JSP from a Servlet.

Following are the steps in Servlet JSP Communication:

1. Servlet instantiates a bean and initializes it.
2. The bean is then placed into the request
3. The call is then forwarded to the JSP page, using request dispatcher.
Example Servlet Source Code: (ServletToJSP.java)
public class ServletToJSP extends HttpServlet {
  public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
 
    //communicating a simple String message.
    String message = "Example source code of Servlet to JSP communication.";
    request.setAttribute("message", message);
 
    //communicating a Vector object
    Vector vecObj = new Vector();
    vecObj.add("Servlet to JSP communicating an object");
    request.setAttribute("vecBean",vecObj);
 
    //Servlet JSP communication
    RequestDispatcher reqDispatcher = getServletConfig().getServletContext().getRequestDispatcher("/jsp/javaPapers.jsp");
    reqDispatcher.forward(request,response);
  }
}
Example JSP Source Code: (javaPapers.jsp)
<html>
<body>
<%
  String message = (String) request.getAttribute("message");
  out.println("Servlet communicated message to JSP: "+ message);
 
  Vector vecObj = (Vector) request.getAttribute("vecBean");
  out.println("Servlet to JSP communication of an object: "+vecObj.get(0));
%>
</body>
</html>

Session Tracking Methods

Following answer is applicable irrespective of the language and platform used. Before we enter into session tracking, following things should be understood.

What is a session?

A session is a conversation between the server and a client. A conversation consists series of continuous request and response.

Why should a session be maintained?

When there is a series of continuous request and response from a same client to a server, the server cannot identify from which client it is getting requests. Because HTTP is a stateless protocol.
When there is a need to maintain the conversational state, session tracking is needed. For example, in a shopping cart application a client keeps on adding items into his cart using multiple requests. When every request is made, the server should identify in which client’s cart the item is to be added. So in this scenario, there is a certain need for session tracking.
Solution is, when a client makes a request it should introduce itself by providing unique identifier every time. There are five different methods to achieve this.

Session tracking methods:

  1. User authorization
  2. Hidden fields
  3. URL rewriting
  4. Cookies
  5. Session tracking API
The first four methods are traditionally used for session tracking in all the server-side technologies. The session tracking API method is provided by the underlying technology (java servlet or PHP or likewise). Session tracking API is built on top of the first four methods.

1. User Authorization

Users can be authorized to use the web application in different ways. Basic concept is that the user will provide username and password to login to the application. Based on that the user can be identified and the session can be maintained.

2. Hidden Fields

<INPUT TYPE=”hidden” NAME=”technology” VALUE=”servlet”>
Hidden fields like the above can be inserted in the webpages and information can be sent to the server for session tracking. These fields are not visible directly to the user, but can be viewed using view source option from the browsers. This type doesn’t need any special configuration from the browser of server and by default available to use for session tracking. This cannot be used for session tracking when the conversation included static resources lik html pages.

3. URL Rewriting

Original URL: http://server:port/servlet/ServletName
Rewritten URL: http://server:port/servlet/ServletName?sessionid=7456
When a request is made, additional parameter is appended with the url. In general added additional parameter will be sessionid or sometimes the userid. It will suffice to track the session. This type of session tracking doesn’t need any special support from the browser. Disadvantage is, implementing this type of session tracking is tedious. We need to keep track of the parameter as a chain link until the conversation completes and also should make sure that, the parameter doesn’t clash with other application parameters.

4. Cookies

Cookies are the mostly used technology for session tracking. Cookie is a key value pair of information, sent by the server to the browser. This should be saved by the browser in its space in the client computer. Whenever the browser sends a request to that server it sends the cookie along with it. Then the server can identify the client using the cookie.
In java, following is the source code snippet to create a cookie:
Cookie cookie = new Cookie(“userID”, “7456″);
res.addCookie(cookie);
Session tracking is easy to implement and maintain using the cookies. Disadvantage is that, the users can opt to disable cookies using their browser preferences. In such case, the browser will not save the cookie at client computer and session tracking fails.

5. Session tracking API

Session tracking API is built on top of the first four methods. This is inorder to help the developer to minimize the overhead of session tracking. This type of session tracking is provided by the underlying technology. Lets take the java servlet example. Then, the servlet container manages the session tracking task and the user need not do it explicitly using the java servlets. This is the best of all methods, because all the management and errors related to session tracking will be taken care of by the container itself.
Every client of the server will be mapped with a javax.servlet.http.HttpSession object. Java servlets can use the session object to store and retrieve java objects across the session. Session tracking is at the best when it is implemented using session tracking api.

What happens if you call destroy() from init() in java servlet?

destroy() gets executed and the initialization process continues. It is a trick question in servlets interview.
In java servlet, destroy() is not supposed to be called by the programmer. But, if it is invoked, it gets executed. The implicit question is, will the servlet get destroyed? No, it will not. destroy() method is not supposed to and will not destroy a java servlet. Don’t get confused by the name. It should have been better, if it was named onDestroy().
The meaning of destroy() in java servlet is, the content gets executed just before when the container decides to destroy the servlet. But if you invoke the destroy() method yourself, the content just gets executed and then the respective process continues. With respective to this question, the destroy() gets executed and then the servlet initialization gets completed.
Have a look at this java servlet interview question: Servlet Life Cycle – Explain, it might help you to understand better.

How to avoid IllegalStateException in java servlet?

The root cause of IllegalStateException exception is a java servlet is attempting to write to the output stream (response) after the response has been committed.
It is always better to ensure that no content is added to the response after the forward or redirect is done to avoid IllegalStateException. It can be done by including a ‘return’ statement immediately next to the forward or redirect statement.
Example servlet source code snippet:

public void doGet(HttpServletRequest request,  HttpServletResponse response) throws ServletException, IOException {
if("success".equals(processLogin())) {
response.sendRedirect("menu.jsp");
return; // <-- this return statement ensures that no content is adedd to the response further
}

Note: This same scenario of IllegalStateException is applicable in JSP also.
/*
other servlet code that may add to the response….
*/
}

What is servlet mapping?

Servlet mapping specifies the web container of which java servlet should be invoked for a url given by client. It maps url patterns to servlets. When there is a request from a client, servlet container decides to which application it should forward to. Then context path of url is matched for mapping servlets.

How is servlet mapping defined?

Servlets should be registered with servlet container. For that, you should add entries in web deployment descriptor web.xml. It is located in WEB-INF directory of the web application.
Entries to be done in web.xml for servlet-mapping:
<servlet-mapping>
<servlet-name>milk</servlet-name>
<url-pattern>/drink/*</url-pattern>
</servlet-mapping>
servlet-mapping has two child tags, url-pattern and servlet-name. url-pattern specifies the type of urls for which, the servlet given in servlet-name should be called. Be aware that, the container will use case-sensitive for string comparisons for servlet matching.

Syntax for servlet mapping as per servlet specification SRV.11.2:

A string beginning with a ‘/’ character and ending with a ‘/*’ suffix is used for path mapping.
A string beginning with a ‘*.’ prefix is used as an extension mapping.
A string containing only the ‘/’ character indicates the “default” servlet of the application. In this case the servlet path is the request URI minus the context path and the path info is null.
All other strings are used for exact matches only.

Rule for URL path mapping:

It is used in the following order. First successful match is used with no further attempts.
1. The container will try to find an exact match of the path of the request to the path of the servlet. A successful match selects the servlet.
2. The container will recursively try to match the longest path-prefix. This is done by stepping down the path tree a directory at a time, using the ’/’ character as a path separator. The longest match determines the servlet selected.
3. If the last segment in the URL path contains an extension (e.g. .jsp), the servlet container will try to match a servlet that handles requests for the extension. An extension is defined as the part of the last segment after the last ’.’ character.
4. If neither of the previous three rules result in a servlet match, the container will attempt to serve content appropriate for the resource requested. If a “default” servlet is defined for the application, it will be used.

What is implicit mapping?

A servlet container can have a internal JSP container. In such case, *.jsp extension is mapped to the internal container. This mapping is called implicit mapping. This implicit mapping allows ondemand execution of JSP pages. Servlt mapping defined in web application has high precedence over the implicit mapping.

Example code for java servlet mapping:

<servlet>
<servlet-name>milk</servlet-name>
<servlet-class>com.javapapers.Milk</servlet-class>
</servlet>
<servlet>
<servlet-name>points</servlet-name>
<servlet-class>com.javapapers.Points</servlet-class>
</servlet>
<servlet>
<servlet-name>controller</servlet-name>
<servlet-class>com.javapapers.ControllerServlet</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>milk</servlet-name>
<url-pattern>/drink/*</url-pattern>
</servlet-mapping>
<servlet-mapping>
<servlet-name>points</servlet-name>
<url-pattern>/pointlist</url-pattern>
</servlet-mapping>
<servlet-mapping>
<servlet-name>controller</servlet-name>
<url-pattern>*.do</url-pattern>
</servlet-mapping>

What is Servlet Invoker?

As defined by Apache Tomcat specification, the purpose of Invoker Servlet is to allow a web application to dynamically register new servlet definitions that correspond with a <servlet> element in the /WEB-INF/web.xml deployment descriptor.By enabling servlet invoker the servlet mapping need not be specified for servlets. Servlet ‘invoker’ is used to dispatch servlets by class name.
Enabling the servlet invoker can create a security hole in web application. Because, Any servlet in classpath even also inside a .jar could be invoked directly. The application will also become not portable. Still if you want to enable the servlet invoker consult the web server documentation, because every server has a different method to do it.
In Tomcat 3.x, by default the servlet invoker is enabled. Just place the servlets inside /servlet/ directory and access it by using a fully qualified name like http://[domain]:[port]/[context]/servlet/[servlet.
This mapping is available in web application descriptor (web.xml), located under $TOMCAT_HOME/conf.
/servlet/ is removed from Servlet 2.3 specifications.
In Tomcat 4.x, by defaul the servlet invoker id disabled. The <servlet-mapping> tag is commented inside the default web application descriptor (web.xml), located under $CATALINA_HOME/conf. To enable the invoker servlet uncomment the following two blocks.
<!– The mapping for the invoker servlet –>
<!–
<servlet-mapping>
<servlet-name>invoker</servlet-name>
<url-pattern>/servlet/*</url-pattern>
</servlet-mapping>
–>


<!–
<servlet>
<servlet-name>invoker</servlet-name>
<servlet-class>
org.apache.catalina.servlets.InvokerServlet
</servlet-class>
<init-param>
<param-name>debug</param-name>
<param-value>0</param-value>
</init-param>
<load-on-startup>2</load-on-startup>
</servlet>
–>

Difference between HttpServlet and GenericServlet

javax.servlet.GenericServlet
Signature: public abstract class GenericServlet extends java.lang.Object implements Servlet, ServletConfig, java.io.Serializable
  • GenericServlet defines a generic, protocol-independent servlet.
  • GenericServlet gives a blueprint and makes writing servlet easier.
  • GenericServlet provides simple versions of the lifecycle methods init and destroy and of the methods in the ServletConfig interface.
  • GenericServlet implements the log method, declared in the ServletContext interface.
  • To write a generic servlet, it is sufficient to override the abstract service method.
javax.servlet.http.HttpServlet
Signature: public abstract class HttpServlet extends GenericServlet implements java.io.Serializable
  • HttpServlet defines a HTTP protocol specific servlet.
  • HttpServlet gives a blueprint for Http servlet and makes writing them easier.
  • HttpServlet extends the GenericServlet and hence inherits the properties GenericServlet.

What is preinitialization of a java servlet?

In the java servlet life cycle, the first phase is called ‘Creation and intialization’.
The java servlet container first creates the servlet instance and then executes the init() method. This initialization can be done in three ways. The default way is that, the java servlet is initialized when the servlet is called for the first time. This type of servlet initialization is called lazy loading.

The other way is through the <load-on-startup>non-zero-integer</load-on-startup> tag using the deployment descriptor web.xml. This makes the java servlet to be loaded and initialized when the server starts. This process of loading a java servlet before receiving any request is called preloading or preinitialization of a servlet.
Servlet are loaded in the order of number(non-zero-integer) specified. That is, lower(example: 1) the load-on-startup value is loaded first and then servlet with higher values are loaded.
Example usage:
<servlet>
       <servlet-name>Servlet-URL</servlet-name>
       <servlet-class>com.javapapers.Servlet-Class</servlet-class>
       <load-on-startup>2</load-on-startup>
</servlet>

0 comments:

Post a Comment