Design

Starting From Scratch

I have a friend who recently joined a startup in San Francisco, where they are looking to start development on a product in addition to their consulting business. This prompted a good conversation on what technologies to use when starting a product from scratch with a wide-open playing field of choices. There's definitely no one-size-fits-all answer here; the technology choices will be influenced both by the needs and goals of the business and by the nature of the application.

The interesting part of the conversation was identifying the questions and criteria to use in guiding the decision making. The answers to these questions definitely vary depending on the growth plans of the business (e.g., is the goal to scale to a big development shop? Is this a small auxiliary effort to a core business like consulting?). Technology choice is about investment. It's the foundation for building something that must be evolved and supported. In my friend's case, time-to-market, cost, productivity, and agility were necessities of the startup environment.

The following questions seem pretty helpful in guiding the decision making, regardless of the particulars of the technology:

  • What skills do I have available right now?
  • Do I have confidence that my technology choices will be able to meet the evolution of the business?
    • Will the application be able to scale?

Pull Based APIs with Iterators and Streams

The use of iterators and streams in the design of an API/interface can be a powerful way to structure APIs for improved system performance, while maintaining simplicity for code that uses the API. These can be used to implement a design methodology called "Pull Based" or "Pull" APIs. The basic idea behind a pull methodology is that, rather than presenting the application code with an object model or a collection of state to be navigated or processed by the application, the application code asks for the data a piece at a time. Some variants of this design methodology allow the application to specify what kind of data it is interested in, which can be used to push filtering into lower layers to improve performance.

StAX (the Streaming API for XML) is an example of a pull based API, and it can realize significant performance gains over approaches that build the whole document in memory before the application sees it. In the Telecom Web Services Server (TWSS) product, a pull based API was used to implement SOAP attachment extensions to the WebSphere ESB product. This approach was modeled after the pull based SAAJ SOAP attachment APIs.
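To make the pull style concrete, here is a minimal sketch using the standard javax.xml.stream API. The element name "title" and the sample document are made up for illustration; nothing here is taken from TWSS or WebSphere ESB.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxPullExample {
    /** Pulls parse events one at a time and collects the text of every <title> element. */
    static List<String> titles(String xml) {
        List<String> titles = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // The application asks for the next event; nothing is pushed to it.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "title".equals(reader.getLocalName())) {
                        titles.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped so callers of this sketch need not handle a checked exception.
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return titles;
    }

    public static void main(String[] args) {
        String xml = "<posts><post><title>Pull APIs</title></post>"
                + "<post><title>Threading</title></post></posts>";
        System.out.println(titles(xml)); // prints [Pull APIs, Threading]
    }
}
```

The loop only ever holds the current event, so elements the application does not care about are skipped without being materialized as objects.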

The basic idea behind improving performance is to hide the I/O (network or disk) behind the abstraction of the iterator and stream objects. Application code can begin processing in parallel with reading from the network; the overlap is provided by the implementation of the API or the underlying platform, not by the application. Each call from the application code to the iterator blocks until the data becomes available. As soon as a chunk of data is available, the application can process it while the next chunk is being read. If the application code is also writing, this mode of processing forms a pipeline: output can be written as soon as input has been processed. For network based applications, this can be a useful means of processing large amounts of data with limited memory requirements, by pushing the stream of data into the network.
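The pipeline idea can be sketched roughly as follows. LineIterator and PullPipeline are hypothetical names invented for this example, and the sketch reads lazily on the calling thread only; any read-ahead that overlaps I/O with processing would live inside the iterator or the platform and is omitted here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.Locale;
import java.util.NoSuchElementException;

/** Hypothetical pull-style source: hands out one line per call, reading lazily. */
class LineIterator implements Iterator<String> {
    private final BufferedReader in;
    private String next;   // line already read but not yet handed out
    private boolean done;

    LineIterator(Reader source) {
        this.in = new BufferedReader(source);
    }

    @Override
    public boolean hasNext() {
        if (next == null && !done) {
            try {
                next = in.readLine();   // blocks until a line arrives or the stream ends
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            done = (next == null);
        }
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String line = next;
        next = null;
        return line;
    }
}

class PullPipeline {
    /** Reads, transforms, and writes one line at a time; returns the number of lines. */
    static int upperCase(Iterator<String> lines, Writer out) {
        int count = 0;
        try {
            while (lines.hasNext()) {
                // Output is written as soon as each piece of input has been processed.
                out.write(lines.next().toUpperCase(Locale.ROOT));
                out.write('\n');
                count++;
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        int n = upperCase(new LineIterator(new StringReader("alpha\nbeta\n")), out);
        System.out.println(n + " lines processed");
    }
}
```

Only one line is held in memory at a time, regardless of how large the input is; swapping the StringReader for a reader over a socket would not change the application code.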

Patterns for Proper Web Service Interface Design

Using proper Web Service interface design can greatly improve application responsiveness and simplify integration logic. Poor design will not only slow the application down, but actually increase the complexity of both client and server-side implementation by requiring both to handle a larger number of possible error cases. This post attempts to capture three key patterns for designing/improving Web Service interfaces. While this article approaches the problem from the perspective of Web Services, many of these principles apply to remote interface design in general.

Appropriate Granularity:

The traditional interface design paradigms that folks are used to using in OOP and functional languages are often inappropriate for remote interfaces. They favor having many different methods that can be used to query information or state about an object's internals. This is particularly true for data objects. Since each query results in a remote interface invocation, the cost of querying information can quickly add up. A better approach is to use the Data Transfer Object (DTO) pattern that is heavily talked about in the context of remote EJBs. The DTO is a data object that contains all queryable information in a single object, which is transferred once. The cost of transferring more data in a single shot is marginal compared to the latency cost of going back and forth. This is particularly true for interfaces that allow you to create/update/query/delete data objects.
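The difference in round trips can be sketched as follows. All of the types here (CustomerDto, the two service interfaces, and the counting stand-in for a remote endpoint) are hypothetical and exist only to illustrate the pattern; a real service would make an actual network call where this sketch increments a counter.

```java
import java.io.Serializable;

/** Hypothetical DTO: carries every queryable field in a single transfer. */
final class CustomerDto implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final String email;
    final String city;

    CustomerDto(String name, String email, String city) {
        this.name = name;
        this.email = email;
        this.city = city;
    }
}

/** Fine-grained style: every getter is a separate remote invocation. */
interface FineGrainedCustomerService {
    String getName(String id);
    String getEmail(String id);
    String getCity(String id);
}

/** Coarse-grained style: one invocation returns the whole DTO. */
interface CoarseGrainedCustomerService {
    CustomerDto getCustomer(String id);
}

/** In-memory stand-in for a remote endpoint that counts round trips. */
class CountingCustomerService
        implements FineGrainedCustomerService, CoarseGrainedCustomerService {
    int roundTrips;

    public String getName(String id)  { roundTrips++; return "Ada"; }
    public String getEmail(String id) { roundTrips++; return "ada@example.com"; }
    public String getCity(String id)  { roundTrips++; return "London"; }

    public CustomerDto getCustomer(String id) {
        roundTrips++;
        return new CustomerDto("Ada", "ada@example.com", "London");
    }
}

class GranularityDemo {
    static int fineGrainedRoundTrips() {
        CountingCustomerService service = new CountingCustomerService();
        String label = service.getName("42") + " <" + service.getEmail("42") + ">, "
                + service.getCity("42");
        return label.isEmpty() ? 0 : service.roundTrips;
    }

    static int coarseGrainedRoundTrips() {
        CountingCustomerService service = new CountingCustomerService();
        CustomerDto c = service.getCustomer("42");
        String label = c.name + " <" + c.email + ">, " + c.city;
        return label.isEmpty() ? 0 : service.roundTrips;
    }

    public static void main(String[] args) {
        // prints "fine-grained: 3, coarse-grained: 1"
        System.out.println("fine-grained: " + fineGrainedRoundTrips()
                + ", coarse-grained: " + coarseGrainedRoundTrips());
    }
}
```

Both clients build the same label, but the fine-grained version pays the network latency three times.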

Using AJAX for JSR168 Inter-Portlet Communication

Most articles about the use of AJAX with JSR168 Portlets tend to focus on how AJAX techniques can be used to provide more dynamic visual display capabilities to Portlets. However, AJAX techniques (particularly the use of XMLHttpRequest calls) can be used effectively to provide inter-Portlet communication in a portable manner. Let me explain what I mean by portable: Full Portlet containers, such as WebSphere Portal Server, provide additional APIs beyond the JSR168 Portlet standard to enable inter-Portlet communication. These containers offer a lot of function, but come at the cost of heavier weight infrastructure. Increasingly, support for JSR168 Portlets is appearing in lighter-weight containers. For example, the release of WebSphere Application Server V6.1 last year included support for JSR168 Portlets embedded in web modules (WAR files) in J2EE applications. WAS supports a bare-bones Portlet container, sticking only to the JSR168 specification, and providing no means for inter-Portlet communication. Portlets that are written for WAS 6.1 are considered forward compatible, in that they can be deployed in WebSphere Portal. The converse is typically not true, since WebSphere Portal contains many features that go beyond the JSR. Use of AJAX techniques can provide a design that is forward compatible between lighter and heavier weight Portlet containers, and should allow for portability between different Portlet containers.

Why It's Worth Optimizing Local Web Service Invocations

When people think of Web Services, they typically think of remote calls that cross system boundaries and usually networks. Web Services is also typically used with the design philosophy of Service-Oriented Architecture (SOA), which suggests encapsulating a bunch of business logic and function as a callable service with a more granular interface. A huge advantage of Web Services is how it supports a loosely coupled componentization model. This can be a boon for developing flexible, scalable systems that support dynamic selection of function at runtime; Web Services can effectively be used to bind together internal components of the system as well as to expose more granular services. This technique of binding together the system also supports a very flexible packaging model. In J2EE parlance, components can be bundled as multiple WAR files in a single EAR or as a set of EARs, where each EAR contains a single and very self-contained component.

The usual concern with this approach of moving one step down in granularity from the service level is one of performance: the CPU cost and latency of making the Web Service invocation. This is where local Web Services optimization comes in. Within a cluster of application servers, all components can be distributed to each application server; each application server contains a copy of, and executes all the code for, each component. In this example, each component is packaged in its own EAR; this comes with advantages discussed below. When calling shared components, the fronting service or component is configured with a local endpoint that is given to the Web Service runtime: http://localhost/foo/endpoint. Each copy of the service code makes its invocation with the localhost endpoint, thereby hitting local code.
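As a small illustration of the endpoint configuration, the sketch below derives a localhost endpoint from a shared one. LocalEndpoints and toLocal are hypothetical helpers, and services.example.com is a placeholder host; how the resulting address is handed to a particular Web Service runtime is product specific and not shown.

```java
import java.net.URI;
import java.net.URISyntaxException;

class LocalEndpoints {
    /** Returns the same endpoint with its host replaced by localhost. */
    static URI toLocal(URI endpoint) {
        try {
            return new URI(endpoint.getScheme(), endpoint.getUserInfo(), "localhost",
                    endpoint.getPort(), endpoint.getPath(),
                    endpoint.getQuery(), endpoint.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot localize " + endpoint, e);
        }
    }

    public static void main(String[] args) {
        URI shared = URI.create("http://services.example.com/foo/endpoint");
        System.out.println(toLocal(shared)); // prints http://localhost/foo/endpoint
    }
}
```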

What's Your Threading Model?

I've found time and time again, when developing on top of application servers and frameworks, that many developers do not fully understand the threading model they are working in or its implications. In a larger development shop, this can be difficult to catch until late in the development process. Typically, the design of the component in question is sound (and goes through the review process), but the implementation architecture (the structuring of the code) does not consider these details. The problem is too close to the code, and typical unit testing does not simulate a multi-threaded environment. These problems tend to be discovered either through code review or at the start of system test (functional test also typically exercises a single thread of execution), both of which tend to happen later in the cycle, particularly on projects with tight schedules and resources (and which project doesn't fall into that category?).

Once discovered, these problems are fixable, but typically the solution is of lower code quality than if the code had been written with multi-threading in mind in the first place. By lower code quality, I mean that the code tends to exhibit more defects and oftentimes lower performance. For example, a developer keeping multi-threading in mind would try to reduce the number of locks acquired and the time the locks are held by consolidating updates to state in a single location in the code in a particular path of execution. A developer unaware of the cost of synchronization (or that synchronization was even needed at all at first) might spread out the updates throughout the code. When the time comes to add locking, locks must be distributed throughout the code, typically resulting in inefficient acquiring and releasing of the same lock. In addition to being more error prone, this code is also more difficult to maintain. When another developer takes over the code, it can be hard to surmise the concurrency strategy in the code and the effects of change.
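The contrast between scattered and consolidated updates can be sketched as follows. OrderStats and LockingDemo are hypothetical classes written for this example, and the acquisition counter exists only to make the difference visible.

```java
import java.util.concurrent.locks.ReentrantLock;

class OrderStats {
    private final ReentrantLock lock = new ReentrantLock();
    private int acquisitions;   // guarded by lock; counts how often record* took the lock
    private long orders;
    private long totalCents;
    private long largestCents;

    private void acquire() {
        lock.lock();
        acquisitions++;
    }

    /** Updates spread along the code path: the same lock is taken three times. */
    void recordScattered(long cents) {
        acquire();
        try { orders++; } finally { lock.unlock(); }
        acquire();
        try { totalCents += cents; } finally { lock.unlock(); }
        acquire();
        try { largestCents = Math.max(largestCents, cents); } finally { lock.unlock(); }
    }

    /** All state changes in one place: one acquisition, held briefly. */
    void recordConsolidated(long cents) {
        acquire();
        try {
            orders++;
            totalCents += cents;
            largestCents = Math.max(largestCents, cents);
        } finally {
            lock.unlock();
        }
    }

    int acquisitions() {
        lock.lock();
        try { return acquisitions; } finally { lock.unlock(); }
    }

    long orders() {
        lock.lock();
        try { return orders; } finally { lock.unlock(); }
    }
}

class LockingDemo {
    /** Runs the chosen variant on several threads and returns the resulting stats. */
    static OrderStats run(boolean consolidated, int threads, int perThread) {
        OrderStats stats = new OrderStats();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    if (consolidated) {
                        stats.recordConsolidated(100);
                    } else {
                        stats.recordScattered(100);
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        System.out.println("scattered acquisitions:    " + run(false, 4, 1000).acquisitions());
        System.out.println("consolidated acquisitions: " + run(true, 4, 1000).acquisitions());
    }
}
```

Both variants end with correct totals, but the scattered one takes the lock three times per update and lets other threads observe the state between steps, for example an incremented order count whose total has not yet been updated.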
