The Locomotive is an all-Java web application server, whose source may be viewed and edited by the public. This manual is a guide for those who are interested in how the Locomotive works, and for those who may be interested in altering or adding to its internal architecture. In this document, we'll talk about the major classes that comprise the Locomotive, and we'll also describe what actually happens when the Locomotive starts up, shuts down, and receives a request. We'll describe the classes first; if it helps you to get a perspective on how the classes are used before you learn about them, then you should read the second part first.
To get everyone up to speed, here's a brief description of what the Locomotive is. First, the Locomotive is a server application that listens on a port for requests. The kinds of requests it listens for are HTTP requests, whose information is sent via a special protocol by a tunnel program that's connected to a web server. Each request it serves gets a different thread, so it can handle a large number of requests at once. Once it receives a request, it does some internal parsing and preparation, then it hands the request off to different modules based on the URL requested. Technically, it actually hands the request off to a particular service routing table, which then figures out which module should handle the request. The Locomotive currently has support for two different services: Servlets and Handlers. Each service provides an API that a module must implement in order to work with the Locomotive, and the services each manage their own set of modules, independent of each other.
Now the request has a number of resources available to it that are worth mentioning. First, it has a database connection, which it can use to retrieve and store information in the database. Secondly, it has access to a template insertion system driven by an insertion language we call Steam. The templates it can access are cached, so the performance penalty involved in loading a page from the file system and parsing it is minimal.
Finally, the request can set HTTP headers and write its response back to the web server via the tunnel. Once the request is finished, the database connection is returned to a pool for reuse, and the thread the request ran on is returned to the thread pool. That's a simple introduction to the Locomotive. Below we'll go into a lot more detail, and talk about each of the main classes that do the work we just described.
Introduction to the Locomotive Classes and Package Descriptions
The Locomotive consists of a fairly small number of classes, each of which tends to do a lot of work. Generally, we've tried to separate the functionality of the classes into three different parts, and our package structure reflects this:
In addition to those packages, we also have two more packages, org.locomotive.loco.servlet and org.locomotive.loco.handler, that contain classes specific to each service. Finally, we have the various modules, which are described elsewhere, and the Permissions package, org.locomotive.loco.perm. In this document we'll stick to the three main packages and the two service packages, as they form the fundamental base of the Locomotive. Next we'll introduce the major classes, broken out by functional group, to give you an idea of what each class does.
The central core of the Locomotive Server is contained in three classes: Loco, RequestManager, and ServiceRoutingTable. These classes handle just about all the work, from starting up the Locomotive and shutting it down, to receiving, preparing, and routing requests.
org.locomotive.loco.Loco
When you start the Locomotive, it is the main method of the Loco class that gets everything up and running. The class also provides access methods to much of the Locomotive's functionality. The main method actually contains the code that listens on the socket and passes requests off to the other classes for handling, so it's the best place to start if you want to trace a request right from the beginning. When the Loco class gets a connection on the socket, it pulls a thread out of the thread pool, instantiates a new RequestManager, and sends it off; it doesn't do any of the work involved in preparing the request for the modules. The Loco class also owns most of the central Locomotive resources, like the Logs, the Database Connection Pool, and the Thread Pool, so if you're interested in the way the Locomotive handles its resource management, look here.
View the javadocs for this class
org.locomotive.loco.RequestManager
The RequestManager is the class that contains most of the dirty details. It handles the initialization of the different services, and sets up many of the internal variables the Locomotive draws on during each request. The initialization is all static, and is generally completely separate from the RequestManager's central role: it is the main conduit that takes a request, prepares the various objects for the modules, and passes the request off to one of the service routing tables. So this is the class that parses the information sent by the tunnels, in whichever protocol each tunnel uses. This is also the class that creates the objects each module requires, like the session, the user, and the response, as well as the CGI variables. Finally, this class also parses out the browser stamp cookie, manages virtual sessions, and handles Session expiration. Because this class does so much, it's definitely a candidate for modularization. In terms of the methods inside the class, it's broken up into four parts--
org.locomotive.server.ServiceRoutingTable
The ServiceRoutingTable class is never actually instantiated during the Locomotive life cycle. It is the superclass of two classes, the HandlerRoutingTable and the ServletRoutingTable, that handle the loading and routing of the different kinds of modules. This class provides the module caching architecture, and also handles the loading and the parsing of the service config file. It also loads and maintains the various config files for the different modules.
Modules should be invoked via the routeRequest() method, which must be implemented by each subclass. The routeRequest() method is called by the RequestManager once it figures out which service has been requested. The routeRequest() method can get ServiceEntries out of the table via one of two methods: routeToClass() and routeToService(). Then, the routeRequest() method should set up the required state for the module, and invoke it.
The structure of this class is a little funky. It uses instances of an internal class, ServiceRoutingTable.ServiceEntry, to represent each service entry. What each ServiceEntry does is specific to each subclass of ServiceRoutingTable, so each subclass actually requires another class that subclasses the ServiceRoutingTable.ServiceEntry. That's pretty unorthodox Java -- you don't usually subclass inner classes -- but that's how it ended up and it seems to be working okay. It's certainly open for revision.
The ServiceEntries are kept in a Hash table keyed by the token used to invoke the module. There's also another Hash table of the properties for each entry, keyed by the entry's class name, to allow efficient access to module properties without requiring knowledge of the token used to identify the class. Nothing from a ServiceEntry ever needs to be accessed by any class other than ServiceRoutingTable. The ServiceRoutingTable class provides access to anything anyone should normally want from a ServiceEntry.
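The two-table layout and the subclassed entry class described above can be sketched roughly as follows. This is only an illustration under stated assumptions: RoutingTableSketch, HandlerTableSketch, and every method name and signature here are hypothetical stand-ins, not the actual ServiceRoutingTable code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for ServiceRoutingTable; names and signatures are illustrative only.
abstract class RoutingTableSketch {
    // Nested entry type; each table subclass extends it with service-specific behavior.
    static class Entry {
        final String token;
        final String className;
        final Properties props = new Properties();

        Entry(String token, String className) {
            this.token = token;
            this.className = className;
        }
    }

    // Entries keyed by the URL token used to invoke the module.
    private final Map<String, Entry> entriesByToken = new HashMap<>();
    // Module properties keyed by class name, so callers don't need to know the token.
    private final Map<String, Properties> propsByClassName = new HashMap<>();

    protected void register(Entry entry) {
        entriesByToken.put(entry.token, entry);
        propsByClassName.put(entry.className, entry.props);
    }

    protected Entry lookup(String token) {
        return entriesByToken.get(token);
    }

    Properties propertiesFor(String className) {
        return propsByClassName.get(className);
    }

    // Each subclass decides how a module is actually invoked.
    abstract String routeRequest(String token);
}

class HandlerTableSketch extends RoutingTableSketch {
    // The subclass of the table brings its own subclass of the nested entry.
    static class HandlerEntry extends RoutingTableSketch.Entry {
        HandlerEntry(String token, String className) {
            super(token, className);
        }

        String invoke() {
            return "invoked " + className;
        }
    }

    void add(String token, String className, String key, String value) {
        HandlerEntry entry = new HandlerEntry(token, className);
        entry.props.setProperty(key, value);
        register(entry);
    }

    @Override
    String routeRequest(String token) {
        Entry entry = lookup(token);
        if (entry == null) {
            return "no module for " + token;
        }
        return ((HandlerEntry) entry).invoke();
    }
}
```

Callers only ever talk to the table: they pass a token to routeRequest(), or a class name to propertiesFor(), and never touch an entry directly.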
The ServiceEntries can be reloaded during runtime via the reloadService() method, which just re-reads the service entry's config file and re-initializes it. Since there's presently no way to enumerate through the entries, the best bet for reloading the whole table is to shut down the table and start up a new one. There are static methods in the Loco class that do that for the two subclasses of ServiceRoutingTable.
View the javadocs for this class
There are a number of crucial resources the Locomotive relies on which provide efficient management of many of the objects that get used and re-used for each request. The instantiations of all of these resource objects are controlled by the Locomotive. Access to them and manipulation of them by modules is generally done through static calls like Loco.getConfig().
org.locomotive.loco.LocoConfig
The LocoConfig object handles loading and checking the loco.conf file. It is a subclass of org.locomotive.server.Config, which we wrote because, first, we wanted type checking for our config variables, and second, because we were using JDK1.0.2 and there was no Properties object we could use. This object requires that all the configs that should exist in the loco.conf file be hard coded into the class, with default values. If it senses any configs that are not part of its default set, it will throw an error and the Locomotive will not start up. Presently, there is no means in the Locomotive to change the value of config vars after startup, but the necessary method in LocoConfig has already been written. All that needs to be added is a user interface.
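To make the "hard-coded defaults, reject anything unknown" idea concrete, here is a minimal sketch. It is not the real LocoConfig: the class name StrictConfigSketch, the variable names PORT and LOG_DIR, and their default values are all invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the real LocoConfig; the variable names below are invented.
class StrictConfigSketch {
    private final Map<String, Object> values = new LinkedHashMap<>();

    StrictConfigSketch() {
        // Every known variable is declared here, with a default whose type is enforced later.
        values.put("PORT", 9000);
        values.put("LOG_DIR", "logs");
    }

    // Applies one NAME/VALUE pair read from the config file.
    void set(String name, String raw) {
        Object current = values.get(name);
        if (current == null) {
            // Unknown variable: refuse it, which would abort startup.
            throw new IllegalArgumentException("Unknown config variable: " + name);
        }
        if (current instanceof Integer) {
            // Type check: throws NumberFormatException if the value is not an integer.
            values.put(name, Integer.valueOf(raw.trim()));
        } else {
            values.put(name, raw);
        }
    }

    int getInt(String name) {
        return (Integer) values.get(name);
    }

    String getString(String name) {
        return (String) values.get(name);
    }
}
```

With this shape, a typo in a variable name fails loudly at load time instead of silently falling back to a default.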
View the javadocs for this class
org.locomotive.steam.CachedPageLoader
This class acts as both the cache and the evaluator for Steam embedded templates. The cache is static, and instantiation is lightweight, so presently we instantiate a new CachedPageLoader for each template evaluation. When a template is requested that is not in the cache, the CachedPageLoader will load the file, then parse the Steam commands into a tree of expressions and cache that. So the load-heavy string parsing of the file only occurs once.
The functionality of each of the commands of Steam is implemented through a series of classes which each handle a different expression. All the classes are contained in the org.locomotive.steam package. Each of the classes subclasses MixedExpr and overrides two methods to implement the functionality for each expression:
The parsing of the template file into an expression tree is handled by the inner TemplateCacheEntry class. This class makes wide use of the GrowableChar class, which operates more or less like a StringBuffer. Converting the code to use StringBuffer would make it a fair amount more readable, as the buffer overflow checking could be removed. The routing of expressions to classes is handled in the parseSpecialExpr() method. Right now this goes through a giant if-else chain covering all the supported expressions. It can probably be made more efficient by turning it into a Hash table lookup instead. Doing that would pave the way for dynamically loaded Steam extensions, too.
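The Hash table lookup suggested above might look something like the following sketch. Everything in it is an assumption made for illustration: the Expr interface, the register() method, and the keywords used in the usage note are not Steam's real classes or commands.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: the Expr type and the registration API are invented, not Steam's real classes.
class ExprDispatchSketch {
    interface Expr {
        String describe();
    }

    // Keyword -> factory that builds the expression object from the command body.
    private final Map<String, Function<String, Expr>> factories = new HashMap<>();

    // New expressions can be registered at runtime, which is what would allow extensions.
    void register(String keyword, Function<String, Expr> factory) {
        factories.put(keyword, factory);
    }

    // One table lookup replaces the chain of if-else comparisons.
    Expr parseSpecialExpr(String keyword, String body) {
        Function<String, Expr> factory = factories.get(keyword);
        if (factory == null) {
            throw new IllegalArgumentException("Unsupported expression: " + keyword);
        }
        return factory.apply(body);
    }
}
```

A caller would register each keyword once at startup, for example `table.register("IF", body -> () -> "if(" + body + ")")`, and every later parse is a single lookup regardless of how many expressions exist.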
View the javadocs for this class
org.locomotive.server.DBConnectionPool
The DBConnectionPool class is a subclass of the ObjectPool class that maintains a set of JDBC Connections. The minimum and maximum number of connections to maintain in the pool are set in the constructor, and the minimum number of connections is opened in the initialize() method. When a request for a connection is made, it will check its pool and return a connection, if there is one available. If not, it will open a new connection, as long as the number of open connections is less than the maximum number to open. If a connection cannot be opened, or if the maximum number of connections are already open, the DBConnectionPool will make the requesting thread wait indefinitely. When we've had hanging problems with the Locomotive in the past, it's almost always been because of this. Adding a timeout or some other safety catch would certainly add to the robustness of the Locomotive.
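As a rough idea of what the suggested timeout could look like, here is a small generic pool whose checkout gives up after a deadline instead of waiting forever. It is a sketch under assumptions, not the DBConnectionPool code: TimedPoolSketch, checkOut(), and checkIn() are hypothetical names, and returning null on timeout is just one possible policy.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical sketch of a bounded pool whose checkout gives up after a timeout.
class TimedPoolSketch<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int max;
    private int open;

    TimedPoolSketch(int min, int max, Supplier<T> factory) {
        this.max = max;
        this.factory = factory;
        // Open the minimum number of objects up front.
        for (int i = 0; i < min; i++) {
            idle.push(factory.get());
            open++;
        }
    }

    // Returns a pooled object, or null if none became available within timeoutMillis.
    synchronized T checkOut(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (idle.isEmpty()) {
            if (open < max) {
                // Below the maximum: create a new object instead of waiting.
                T created = factory.get();
                open++;
                return created;
            }
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return null; // Timed out; the caller decides how to report it.
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        }
        return idle.pop();
    }

    // Returns an object to the pool and wakes up any waiting threads.
    synchronized void checkIn(T item) {
        idle.push(item);
        notifyAll();
    }
}
```

The loop re-checks the pool after every wakeup, so a spurious wakeup or a race with another waiting thread simply leads to another bounded wait rather than a hang.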
View the javadocs for this class
org.locomotive.server.ThreadPoolManager
The ThreadPoolManager opens up a set of threads during construction, and
feeds them out to the Locomotive when they are requested. If there are no
threads available for a request, a new one is created, up to the maximum
set in the constructor. This class also requires requesting classes to wait
until threads are available, though this is not so much of a worry as
Thread creation is internal to the VM, unlike Connection creation, and so
is much more reliable.
View the javadocs for this class
org.locomotive.server.Log
The Log class provides simple logging to a file. The Log uses the notion of log levels. Calls are made to the log with a certain number, and if the number is lower than the log level, then whatever was sent in the method will be written to the file. Writing to the log is synchronized, so that lines won't get munged by simultaneous requests. The directory the log lives in and the log file name are specified in the constructor. If the log file name contains a '%', then that character is replaced with the current date, in 'yyyy-MM-dd HH:mm:ss z' form. If the day rolls over while the log is still active, then the log will close the file it started with and open a new one with the current date in its filename, which takes care of log rolling nicely. If log rolling is not desired, then the '%' character can be removed and the log will continue to write to the file it opened during construction. This class is used by both the main server log and the event log.
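The level check and the '%' substitution can be sketched in a few lines. This is an illustration only: LogNameSketch and its method names are hypothetical, and the date pattern is passed in as a parameter rather than fixed, so the sketch makes no claim about the exact format the Log class uses internally.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of the level check and '%' file-name substitution; not the real Log class.
class LogNameSketch {
    // A message is written only when its number is lower than the configured log level.
    static boolean shouldWrite(int messageLevel, int logLevel) {
        return messageLevel < logLevel;
    }

    // Replaces the first '%' in the template with the formatted date; no '%' means no rolling.
    static String expandFileName(String template, ZonedDateTime now, String pattern) {
        int marker = template.indexOf('%');
        if (marker < 0) {
            return template;
        }
        String stamp = DateTimeFormatter.ofPattern(pattern).format(now);
        return template.substring(0, marker) + stamp + template.substring(marker + 1);
    }

    // The log needs to roll when the name computed for "now" differs from the open file's name.
    static boolean needsRoll(String openFileName, String template, ZonedDateTime now, String pattern) {
        return !openFileName.equals(expandFileName(template, now, pattern));
    }
}
```

Comparing file names rather than dates keeps the rolling decision in one place: a template without '%' always expands to the same name, so it never rolls.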
View the javadocs for this class
These are the classes most often used by modules during execution. The Locomotive handles creation of each of these objects, and each is passed to modules in ways specific to the service the module implements.
org.locomotive.loco.Session
The Session object handles Session management for the Locomotive. This includes both database storage and retrieval, as well as creation of new sessions and expiration of old ones. Session implements javax.servlet.http.HttpSession, and provides all the required session access and manipulation methods required for Servlets.
Sessions are tracked by the Locomotive using cookies; we don't currently have a way to track sessions using URLs. The Session cookie is by default a temporary cookie that is not saved to the browser's cookie file, and expires after one hour of inactivity. The cookie is called LOCO_SID_SYSTEM_TAG, where SYSTEM_TAG is the value set for that config variable in the loco.conf file. The value of the cookie looks like SID_SRID, where SID is the session id, a unique long identification number, and SRID is a large random integer generated when the session is created and used as a security precaution against session spoofing techniques. You can change config variables to make the session cookie permanent, and to set how long sessions last before they expire. By default, when the Locomotive senses a session should be expired, it flushes the session cookie and sends a 'session expired' error page. If you'd prefer to let the modules handle their own session expiration, you can turn off auto session expiration by setting the config variable LOCO_SESSION_AUTOHANDLE to "NO". The code for expiring sessions unfortunately lives in the RequestManager, in the handleSessionExpiration() method.
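The cookie naming and the SID_SRID check can be illustrated with a short sketch. The class SessionCookieSketch and its methods are hypothetical, and treating both numbers as non-negative long values is an assumption; only the LOCO_SID_ prefix and the SID_SRID layout come from the description above.

```java
// Hypothetical sketch of the session cookie layout; not the real Session code.
class SessionCookieSketch {
    // Cookie name: LOCO_SID_ followed by the configured system tag.
    static String cookieName(String systemTag) {
        return "LOCO_SID_" + systemTag;
    }

    // Cookie value: session id and random id joined by an underscore.
    static String encode(long sid, long srid) {
        return sid + "_" + srid;
    }

    // Returns {sid, srid}, or null if the value is malformed.
    static long[] decode(String value) {
        int sep = value.indexOf('_');
        if (sep <= 0 || sep == value.length() - 1) {
            return null;
        }
        try {
            long sid = Long.parseLong(value.substring(0, sep));
            long srid = Long.parseLong(value.substring(sep + 1));
            return new long[] { sid, srid };
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // A cookie is accepted only if both the id and the random number match the stored session.
    static boolean matches(String cookieValue, long storedSid, long storedSrid) {
        long[] parts = decode(cookieValue);
        return parts != null && parts[0] == storedSid && parts[1] == storedSrid;
    }
}
```

The point of the second number is visible in matches(): guessing a valid session id is not enough, because the random value stored with the session has to match as well.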
Currently, the Session class interacts with two tables in the database:
LOCO_SESSIONS and LOCO_SESSION_OBJECTS. LOCO_SESSIONS contains all the
information specific to each session: the id, the expiration time, the
userid to be associated with, if any, etc. The LOCO_SESSION_OBJECTS table
contains serialized Hash tables of the objects associated with each session.
For details on the schema for these tables, see below.
Right now we provide users the option of writing the session objects on
every request, or only on startup or shutdown. We think there are better
ways to do this, so that's liable to change.
View the javadocs for this class
org.locomotive.loco.User
The User object allows modules to interact with a persistent user across requests. Our user object is pretty bare bones: all it contains is a username, userid, password, a flags field, and some timestamp information. We've found that user information changes enough from application to application that it makes more sense to keep only the required fields in the core Locomotive table and let developers create their own user table to use on top of it, than to try to figure out all the fields developers would want and add them. We moved most of the information we thought would be needed for users into the User Management System, which provides another table and administration access to it.
New users are created by registering with the Locomotive. To register, you have to make sure the username is unique, and then you use one of the constructors, which will automatically add the user to the database.
Once people have created a user and want to return to the site and access that user, they must log in. Logging in requires two steps:
If a user has been already associated with a session, it is retrieved
from the database by the Session object during request initialization. The
user is stored in the session objects hash table with the key
'locomotive.user'. If session objects caching is turned on, the user
object is cached along with the session objects table.
View the javadocs for this class
org.locomotive.loco.Response
The Response object is used to send information back to the web server, which is in turn sent back to the client. The Response class has a number of constants that represent each of the kinds of protocols used for the various tunnels. The reply_type instance variable is set to one of these values by the RequestManager during request preparation, so the Response object knows how to encode the response it sends. Response headers are set using the addHTTPHeader() method, with a couple of notable exceptions, like addCookie() and setContentType(), for example. The response body is set using the addString() or addBytes() methods.
This class uses an internal OutputStream class which caches the response body, to allow both setting headers after addString() is called, and to send the response in one large chunk for efficiency. You can flush the response using flush(); this will automatically send the set headers before the body is sent.
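Here is a minimal sketch of why buffering the body makes late headers possible. It is not the Response class: BufferedResponseSketch is a hypothetical name, only addHTTPHeader(), addString(), and flush() echo the method names mentioned above (with assumed signatures), and the HTTP-style "Name: value" wire format is an assumption, since the real encoding depends on the tunnel protocol.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a response that buffers its body so headers can still be set late.
class BufferedResponseSketch {
    private final Map<String, String> headers = new LinkedHashMap<>();
    private final ByteArrayOutputStream body = new ByteArrayOutputStream();
    private final OutputStream out;
    private boolean headersSent;

    BufferedResponseSketch(OutputStream out) {
        this.out = out;
    }

    void addHTTPHeader(String name, String value) {
        if (headersSent) {
            throw new IllegalStateException("Headers already sent");
        }
        headers.put(name, value);
    }

    // Body text is only buffered here; nothing reaches the socket yet.
    void addString(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        body.write(bytes, 0, bytes.length);
    }

    // Sends the headers first (once), then everything buffered so far.
    void flush() {
        try {
            if (!headersSent) {
                StringBuilder head = new StringBuilder();
                for (Map.Entry<String, String> h : headers.entrySet()) {
                    head.append(h.getKey()).append(": ").append(h.getValue()).append("\r\n");
                }
                head.append("\r\n"); // Assumed HTTP-style blank line between headers and body.
                out.write(head.toString().getBytes(StandardCharsets.UTF_8));
                headersSent = true;
            }
            body.writeTo(out);
            body.reset();
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a module can call addString() first and addHTTPHeader() afterwards, and the header still goes out ahead of the body; once flush() has run, further header changes are rejected.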
View the javadocs for this class
org.locomotive.loco.handler.Handler
The Handler interface is a very simple set of methods classes must implement to be executed as handlers by the Locomotive. Currently, there are only three methods to implement: init(), handleRequest(), and shutdown(). The Locomotive does not keep instantiated handlers between requests; instead, it simply creates a new handler each time one of these methods is to be called. Because of this, you don't have to worry about making instance variables multi-thread safe; however, you also can't conserve data between requests in instance variables.
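The consequence of creating a new handler for every call is easy to demonstrate. The sketch below uses a hypothetical HandlerSketch interface with made-up signatures (the real Handler methods take different arguments; see the javadocs), plus an invented dispatcher, purely to show that instance state does not survive from one request to the next.

```java
import java.util.function.Supplier;

// Hypothetical interface; the real Handler method signatures are not reproduced here.
interface HandlerSketch {
    void init();
    String handleRequest(String path);
    void shutdown();
}

class CountingHandler implements HandlerSketch {
    // Instance state: safe from other threads, but gone after this request.
    private int hits;

    public void init() {
    }

    public String handleRequest(String path) {
        hits++;
        return path + " hit #" + hits;
    }

    public void shutdown() {
    }
}

// Invented dispatcher that mimics "a new handler each time a method is called".
class PerRequestDispatcher {
    private final Supplier<HandlerSketch> factory;

    PerRequestDispatcher(Supplier<HandlerSketch> factory) {
        this.factory = factory;
    }

    String dispatch(String path) {
        // A fresh instance per request, so no instance field carries over.
        return factory.get().handleRequest(path);
    }
}
```

Dispatching the same path twice through `new PerRequestDispatcher(CountingHandler::new)` reports hit #1 both times, which is exactly why anything that must persist belongs in the session, the database, or a static cache.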
View the javadocs for this class
org.locomotive.loco.handler.HandlerData
The HandlerData object is a wrapper object for all the things we could
conceive one would need during a request. It holds essential locomotive
resources like the Session and User objects, and more rarely used objects
like the Socket connection to the web server and the CGI environment. We
left all these objects as publicly accessible variables so developers
wouldn't have to go through the trouble of using accessor methods to get
and set them.
View the javadocs for this class
org.locomotive.loco.handler.HandlerRoutingTable
The HandlerRoutingTable is a subclass of the ServiceRoutingTable. There isn't much to it except for the routeRequest() method, which does the following:
routeToClass() method to access the Handler class to invoke.
The Locomotive supports version 2.1 of the Servlet API, which includes a couple of new features, like support for Servlet includes and forwarding, and request attributes. The kinks are still being worked out for some of these features, so don't be surprised if you see some bugs pop up here and there.
org.locomotive.loco.servlet.ServletRoutingTable
The ServletRoutingTable is the center of the Servlet support for the Locomotive. It handles the loading of Servlets and routing of requests to Servlets. It also implements the ServletContext and HttpSessionContext interfaces of the Servlet API.
Servlets are instantiated when each entry in the ServletRoutingTable is loaded. If the Servlet implements SingleThreadModel, then a ServletPool is created with a bunch of those Servlets, so that no single Servlet will be used for more than one request concurrently.
The routing is done inside the aptly named routeRequest() method. Here's what it does:
routeToService() method. If the Servlet to be invoked implements SingleThreadModel, then the object returned is a ServletPool, and the Servlet to be invoked is pulled from the pool. Otherwise, an instance of the Servlet is returned.
service() method.
The ServletContext implementation has a number of areas that are worth talking about. First of all, under the new 2.1 version of the Servlet API, a number of methods have been deemed security risks and are now supposed to return null or empty enumerations. These are the getServlet(), getServletNames(), and getServlets() methods, as well as the HttpSessionContext methods getIds() and getSession().
The implementations of a number of methods of the ServletContext class are not as robust as they should be. Methods like getMimeType(), getResource(), getResourceAsStream(), and getRealPath() should be used with care. Please check out the Servlet Project Page for the latest information.
View the javadocs for this class
org.locomotive.loco.servlet.LocoServletRequest
This class provides all the methods required for both the ServletRequest and HttpServletRequest objects. It also provides access to the various objects the Locomotive provides for each request, like a database connection, tokenized URL, and browser. You can access those via the getAttribute() method, passing in the appropriate key.
Form data is parsed whenever getParameter() or getParameterNames() is called. The Servlet API's HttpUtils are used to parse the form unless it is a multipart form, in which case org.locomotive.server.FormParser is used.
If a RequestDispatcher is created, then the RequestDispatcher will call the generateForward() method, which creates a HandlerData object with all this request's information. This object is passed by the RequestDispatcher object to the Loco.redirect() method, which is how forwards are presently handled.
As of now, the getRealPath() and getPathTranslated() methods don't return anything useful. They should probably return the document root for the web server, with the tunnel token and service token stripped out. Any suggestions on this would be great.
View the javadocs for this class
org.locomotive.loco.servlet.LocoServletResponse
This subclasses the Response class to implement the ServletResponse and HttpServletResponse interfaces. It provides just about everything any given Servlet would need, with the following exception: encodeURL() and encodeRedirectURL() don't put session information in the URL. This should be added as soon as possible, as URL-based state information is part of the Servlet API.
The generateInclude() method is called by RequestDispatcher.include(). It returns an instance of the inner CrippledResponse class which contains all the response information, but won't let the Servlet set headers.
View the javadocs for this class
org.locomotive.loco.servlet.ServletPool
The ServletPool is a basic subclass of org.locomotive.util.ObjectPool that creates a pool of instances of Servlets. The ServletRoutingTable will create a ServletPool for a particular Servlet entry if that Servlet implements the SingleThreadModel interface.
View the javadocs for this class
Here we'll run through the order and logic that drive the creation of the Locomotive objects during startup. We'll attempt to list a method call for each step, so you can follow easily in the code. Most of the startup code is in the Loco class; we'll note steps where that's not the case. Since the startup process is fairly linear and without any major branching, we'll just provide a list:
That's the general gist of the startup procedure. Note the dependencies: the database connection pool is started up before the modules in case modules want to access the database to load caches or set up other state. Everything in the startup is single threaded. Only after everything else is initialized do we create the ThreadPool. So there shouldn't be any concerns about accessing shared resources during this portion of the Locomotive's life cycle.
A request for shutting down a Locomotive can be received in one of two ways:
srv?shutdown URL
If the request comes in from the URL, then the Loco.setStopAccepting() method is called. This will set a flag that will cause the Locomotive to begin shutting down when the next request is received.
When the stoploco application sends a shutdown request, it connects to the Locomotive via a special protocol. The RequestManager recognizes the protocol at the beginning of the readCGIEnvars() method. If a shutdown request is detected, then the attemptShutdown() method is called. This determines where the request came from, and matches it against the list of IP addresses and subnets loaded at startup. If it matches, then, as above, the Loco.setStopAccepting() method is called.
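The address check could be sketched as below. This is only an illustration: ShutdownAllowListSketch and its methods are hypothetical, it handles IPv4 only, and writing subnets in CIDR notation (such as 10.1.0.0/16) is an assumption, since the format of the Locomotive's own list isn't described here.

```java
import java.util.List;

// Hypothetical sketch of matching a caller's address against allowed IPs and subnets (IPv4 only).
class ShutdownAllowListSketch {
    // Packs a dotted-quad address into a 32-bit integer.
    static int toInt(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + dotted);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Bad octet in: " + dotted);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    // A rule is either a single address or an assumed CIDR subnet like 10.1.0.0/16.
    static boolean matches(String rule, String address) {
        int slash = rule.indexOf('/');
        if (slash < 0) {
            return toInt(rule) == toInt(address);
        }
        int bits = Integer.parseInt(rule.substring(slash + 1));
        if (bits < 0 || bits > 32) {
            throw new IllegalArgumentException("Bad prefix length in: " + rule);
        }
        // A shift by 32 would be a no-op in Java, so /0 is handled separately.
        int mask = bits == 0 ? 0 : -1 << (32 - bits);
        return (toInt(rule.substring(0, slash)) & mask) == (toInt(address) & mask);
    }

    static boolean isAllowed(List<String> rules, String address) {
        for (String rule : rules) {
            if (matches(rule, address)) {
                return true;
            }
        }
        return false;
    }
}
```

Only when isAllowed() returns true would the equivalent of Loco.setStopAccepting() be called; any other caller's shutdown request is simply ignored.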
After the Locomotive detects that the shutdown flag has been set by Loco.setStopAccepting(), it closes the socket and calls its shutdownAll() method, which does the following:
And that's it. The shutdown procedure is designed to allow the Locomotive to finish handling any requests that are currently being handled when the shutdown request is received. Notice, however, that requests which take longer than 15 seconds to handle may be cut short. We may want to change this in the future.
Here we'll start with a general description of what the tunnel does when it receives a request from the web server. Then, we'll talk about what happens behind the scenes to get everything ready for the module to execute, and run through how modules are chosen by the routing classes and executed. After that, we'll spend a little time going over the oft-used methods modules call while handling requests, and finally, we'll talk about how the response gets sent back to the tunnel and what resources get collected.
The Tunnel
Though each of the tunnels handles the request somewhat differently internally, they all follow a standard pattern. First, the CGI environment is collected into a table. Then a Locomotive is chosen based on virtual session state and a Locomotive list supplied to the tunnel. The CGI tunnel requires that the list be hard coded into the tunnel and recompiled. The Apache and NSAPI tunnels make use of a config file, whose format and description can be found in the Tunnel Documentation.
Next, a socket is opened to the Locomotive. The first thing that gets sent to the Locomotive is the protocol version the tunnel is using. The CGI tunnel and the Apache tunnel both use a protocol where each environment entry is concatenated into a string with the format 'NAME=VALUE'. For each entry, the length of the string is sent, then the string itself. The NSAPI tunnel uses a slightly different protocol, where the name and the value of each entry are sent separately, each preceded by its length. All the tunnels use a buffer for efficiency.
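To illustrate the length-prefixed 'NAME=VALUE' framing used by the CGI and Apache tunnels, here is a small round-trip sketch. Several details are assumptions made so the example is self-contained, not facts about the real protocol: the class TunnelFramingSketch, the leading entry count, the 4-byte big-endian lengths, and the UTF-8 encoding. The protocol version exchange is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of length-prefixed NAME=VALUE framing; the real wire details may differ.
class TunnelFramingSketch {
    static byte[] encode(Map<String, String> env) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(env.size()); // Assumption: an entry count is sent first.
            for (Map.Entry<String, String> e : env.entrySet()) {
                byte[] entry = (e.getKey() + "=" + e.getValue()).getBytes(StandardCharsets.UTF_8);
                out.writeInt(entry.length); // Length of the string, then the string itself.
                out.write(entry);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    static Map<String, String> decode(byte[] wire) {
        Map<String, String> env = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                byte[] entry = new byte[in.readInt()];
                in.readFully(entry);
                String pair = new String(entry, StandardCharsets.UTF_8);
                // Split on the first '=' only, since values may contain '=' themselves.
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    throw new IllegalArgumentException("Entry without '=': " + pair);
                }
                env.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return env;
    }
}
```

Because every string carries its own length, the reader never has to scan for delimiters, and values such as a QUERY_STRING containing '=' or '&' pass through unchanged.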
If there is any form input, then that is written to the socket next, using the CONTENT_LENGTH HTTP header as the length. Then the tunnel waits for a response from the Locomotive.
Request Preparation
When the Locomotive gets an accept on the socket it's listening on, the first thing it does is pull a thread out of the pool, instantiate a new RequestManager object, and have the thread start, which calls the RequestManager's run() method. The run() method goes through the following steps:
Request Routing
Both the HandlerRoutingTable and the ServletRoutingTable use the module URL token to find the correct module to invoke. Beyond this, they handle things fairly differently. You can read about what they do by looking at the ServletRoutingTable or HandlerRoutingTable descriptions above.
Request Cleanup
Request cleanup presently consists of only a few steps:
The first three are handled by the RequestManager class in the doCleanup() method. The last one is done by the PooledThread class upon completion of its runnable's run() method.