Saturday, September 24, 2011

How to build Large Web Applications, part I

I've been busy the last two years building fairly intricate web apps for my customers and while it's a fun job it's also by necessity not public in the same way an open-source project is. The only comments I get and discussion I'm involved in are generally from my customers, so it's a bit of a closed loop.

When I met Paul Irish in San Francisco in May this year he asked me, 'But what do you actually do?', which was a fair question but took me by surprise. I'm mostly known for... being that Swedish guy. With some luck people know I prefer Dojo as well, but I guess that's about it.

So what do I do? I build web apps as an independent consultant. Sure, I arrange conferences and user group meetings too, but that's where my beer comes from, so to speak.

This series of posts is written both to share and discuss what I feel are best practices for building large web apps, from a developer's perspective, and to let people know what I do. Possibly with the secretly evil intent of getting new customers if anything I write seems to be useful :)

Let's begin with defining what a large web app is.

A large web app (according to me, but please challenge and/or add to the list of items);

  1. Has composite views (showing both a static status area and a changing subview of items and lists, for example).
  2. Has widgets that are variants of each other.
  3. Has widgets that need to show up in dynamic configurations (e.g. a list of n, or each a always has a b except on Saturdays, ...).
  4. Initializes communication with the server or service and caches some or all information.
  5. Has widgets that need to change state, remember state and react to state changes in other widgets.

This list could go on and on, probably, but these are the most important things off the top of my head.

One of my customers had an existing, successful product that was built in .Net C#. It was a client-server system where the client had the following features;

  • Personal view of things going on, of various sorts.
  • List of personal things, arranged per time period in a tree structure, which showed a grid with the selected things.
  • Library of things of other people, arranged in a tree by both time period, user names and project names.
  • A *very* complex search view where a list of search queries could be built up dynamically and combined, using things like date and project names but also other domain-specific (and visual) properties.
  • A thing editor which let the user compose a thing that was composed of various parts which could be dragged from a tab to the left and then changed according to their type and rearranged. There were parts like annotatable image, rich text, project reference, and so on.
  • Some of the parts needed access to the system clipboard for reading and writing, and some parts needed to start with and communicate with Excel. Several parts needed file access.
  • Lots of other, minor stuff.

And now, two years later, the Dojo/HTML5 version of the client has about 80% of the functionality of the old client (and some that it didn't have), and it is being sold to clients and everything. It's been quite a ride!

I realize that I want to cover both the story of the program's creation and 'how to make large web apps' at the same time. It probably is a bad idea, but let's do it anyway.

First of all, before digging into specific implementation I would like to talk about bare necessities. Stuff you just can't do without, stuff that, if you are indeed left without it, you will have to reinvent, poorly and too late, botching things up completely. Yes, I have, and probably you have too. But I didn't on this project(s) :)

Anyhoo, if we take the first list of things that define a large web app, the items lead to certain demands on your tooling and libraries of choice. First of all you need to have a clean widget abstraction to work with, one that clearly modularizes your code and your markup. It also helps tremendously if you have a widget lifecycle metaphor, which is handled for you and that you can plug in to when and if you need to.

Widget Abstraction

The widget abstraction should, at minimum, do the following (lots of lists, sorry);

  1. Separate code and markup, so you have a separate file or string for the markup of the widget.
  2. Make it possible to create widgets in classes that derive from each other, so you can have one grid baseclass, then a librarygrid that inherits from the basegrid, etc.
  3. Completely hide id allocation to elements and sub-widgets. By this I mean that the abstraction needs to let you use symbolic names inside the markup (like the dojoAttachPoint="mybutton" attribute in the example further down) that are then used inside the widget class in lieu of id lookups.
  4. Use markup replacement, so that parts of the markup are replaced with data in the widget class (a ${username} placeholder, for instance, will be replaced with the value of the widget's username property before the widget is placed in the DOM).
  5. Have a lifecycle so that when you create a widget, the widget "subsystem" / superclasses will call a number of functions on it that correspond to different events like a) init, b) before markup creation, c) after markup creation, d) before the widget is inserted into the DOM, etc.
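
To make that list a little less abstract, here is a rough sketch in plain JavaScript (no library at all) of items 2, 4 and 5: a base class that fills in ${...} placeholders from its own properties and calls lifecycle hooks in a fixed order, plus a variant that inherits from it. This is not how Dojo implements any of it, and every name in it (BaseWidget, LibraryGrid, the hook names, the calls array) is made up for the illustration.

```javascript
// Hypothetical sketch, not Dojo's implementation: all names here are made up.
function BaseWidget(props) {
    // Copy constructor arguments onto the instance (overriding class defaults).
    for (var key in props) {
        if (Object.prototype.hasOwnProperty.call(props, key)) {
            this[key] = props[key];
        }
    }
    this.calls = [];                   // records the order the hooks ran in
    this.init();                       // a) init
    this.beforeMarkup();               // b) before markup creation
    this.markup = this.buildMarkup();
    this.afterMarkup();                // c) after markup creation
}

BaseWidget.prototype.template = "<div>${title}</div>";
BaseWidget.prototype.title = "base";

// Default hooks only record that they ran; subclasses override what they need.
BaseWidget.prototype.init = function () { this.calls.push("init"); };
BaseWidget.prototype.beforeMarkup = function () { this.calls.push("beforeMarkup"); };
BaseWidget.prototype.afterMarkup = function () { this.calls.push("afterMarkup"); };

// Markup replacement: every ${name} in the template is replaced with this[name].
BaseWidget.prototype.buildMarkup = function () {
    var self = this;
    return this.template.replace(/\$\{(\w+)\}/g, function (match, name) {
        return String(self[name]);
    });
};

// A variant widget: inherits everything, changes the template and one hook.
function LibraryGrid(props) {
    BaseWidget.call(this, props);
}
LibraryGrid.prototype = Object.create(BaseWidget.prototype);
LibraryGrid.prototype.constructor = LibraryGrid;
LibraryGrid.prototype.template = "<table><caption>${title}</caption></table>";
LibraryGrid.prototype.afterMarkup = function () {
    BaseWidget.prototype.afterMarkup.call(this);
    this.calls.push("libraryGridReady");
};

var grid = new LibraryGrid({ title: "Library" });
console.log(grid.markup);
console.log(grid.calls.join(", "));
```

The point of the sketch is only that the variant never has to repeat the substitution or the call order; it overrides the one hook it cares about and the base class does the rest.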

Dojo does all of this (as do several of its peers: JavaScriptMVC, SproutCore, ExtJS, etc.), but since the 'everything in a box' approach is a conceptual hard sell, people generally want to use smaller, separate libraries that do just one thing, but do it well.

A good, if slightly dated, list of a couple of markup libraries can be found here. But markup libraries (like Mustache.js) only solve one of the problems.


I'm going to cover a bit of how Dojo works here, but please let me know in the comments if you know about stand-alone libraries for these functions. I would think that some are probably part of a larger library's class system, but still.

When I create a widget using Dojo, I create two files. One is the JavaScript class (obv) which inherits from another custom widget or directly from dijit._Widget (widget subsystem) and dijit._Templated (markup replacement and symbolic element/subwidget naming). It could look like this;

In the directory "my/custom" the file "widget.js";

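dojo.provide("my.custom.widget");
dojo.require("dijit._Widget");
dojo.require("dijit._Templated");
dojo.declare("my.custom.widget", [dijit._Widget, dijit._Templated], {
    templateString: dojo.cache("my.custom", "widget.html"), // dojo.cache() returns the template text itself
    username: "nisse",
    postCreate: function() {
        dojo.connect(this.mybutton, "onclick", this, function(e) { // this.mybutton is the attach point node
            console.log("Someone pressed me button, eh?");
        });
    }
});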

The file "widget.html";

<div>
    <button dojoAttachPoint="mybutton">${username}</button>
</div>

The 'postCreate' function on a widget is called (if present) just before the widget is placed in the DOM, but after all markup replacement has been done. We now have markup that contains a div with a button in it, and the content of the button reads 'nisse', since that was what the class variable 'username' contained.

To use the widget you just have the following things in your code;

var foo = new my.custom.widget();
// Not mentioning argument passing or automatic placement in page stuff. it's just a widget in the wind right now

There are lots of other things you can do, like automatic event wiring, but I hope this small example is just enough to make you think 'Hmm.. that could actually be kind of useful'.

I am not really touching on the namespacing either, which is a simple thing that has had a ridiculously positive effect on development speed and stability. We'll get to that later.

What the widget will look like in the actual page is something like this;

<div>
    <button id="mycustomwidget_0_mybutton">nisse</button>
</div>

What's actually in the id is not important; the important thing is that the id is never visible to you (unless you really want it to be, but you don't. Why would you?). The only thing you need is to get a reference to the node in question (in this case, to attach an event handler to it).

Also, what may be less apparent is that when your customer wants another widget just like it beside the first, you don't have to worry about widget id lookup code. You know that your abstraction will handle those details, so you just create a new one and smack it up - no problem.
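
If you wonder what 'handle those details' could amount to, here is a small stand-alone sketch in plain JavaScript. It is an assumption for illustration only, not Dojo's internals (a real widget system hands you a reference to the node itself rather than an id string), and the names allocateWidgetId and instantiate are made up. It keeps one counter per widget type and turns each symbolic name in the template into an id nobody ever has to type.

```javascript
// Hypothetical sketch of hidden id allocation; not how Dojo does it internally.
var counters = {};   // one running counter per widget type

function allocateWidgetId(typeName) {
    var n = counters[typeName] || 0;
    counters[typeName] = n + 1;
    return typeName + "_" + n;
}

// Replaces every dojoAttachPoint="name" in the template with a generated id
// and returns the final markup together with a name -> id lookup table.
function instantiate(typeName, template) {
    var widgetId = allocateWidgetId(typeName);
    var points = {};
    var markup = template.replace(/dojoAttachPoint="(\w+)"/g, function (match, name) {
        points[name] = widgetId + "_" + name;
        return 'id="' + points[name] + '"';
    });
    return { id: widgetId, markup: markup, points: points };
}

var template = '<div><button dojoAttachPoint="mybutton">nisse</button></div>';
var first = instantiate("mycustomwidget", template);
var second = instantiate("mycustomwidget", template);

console.log(first.markup);
console.log(second.points.mybutton);
```

Two instances built from the same template end up with different ids, and the widget code only ever refers to 'mybutton'.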

There are tons more things I'd like to fit in here, but I have to clean the house while parts of the family are playing Deus Ex (the original) and others are off gallivanting with horses. Oh well..

Oh, right. I'll be doing a series of free, half-day Dojo-training events at Valtech in Sweden beginning the 13/10 2011. I'll post here (and on twitter etc) if you live nearby and want to learn the basics of creating custom Dojo widgets.

See you soon!


spartan, the said...

>> A large web app (according to me, but please challenge and/or add to the list of items)...

Obs, something is missing here. Have you seen web apps with one edit box and one button? And they are very popular indeed. I'm talking about search engines.

According to your definition they don't fit into "large web apps" category. It's because you pay attention at client side only leaving server side a dumb part of the application.

Such applications in my opinion are called "web apps" just because a web is a fashionable platform to deploy programs. If "download free app" would be a king then everybody shall discuss how to create yet another OnClickImpl killer app. As it was in VB6 times.

Peter Svensson said...

@spartan; Definitions are hard, especially in a space so crowded with meaning around same or similar words.

When I talk about web apps or web applications in the above post, I refer to programs written in JavaScript that live their lives inside just one page load.

What 'large web apps' means then is just the same kind of applications but with lots of functionality, which translates into lots of functions and lines of code, which either have to be managed, or result in a disaster sooner or later.

And I guess you could say that I'm writing about how to manage that.

You could call the google search page (for example) a web app, but in that case we give different meanings to that name.

spartan, the said...

I see. Such apps are mapped to SPA keyword (single-page app) in my mind. I agree, there is a big challenge to manage them properly... Especially without much of help from compiler, eh... you are a fan of now fashionable dynamic typed languages, so it is not an issue for you ;)

Peter Svensson said...

@spartan - Well, the fact that people have hundreds of files lying about in heaps is the reason I wrote this post.

Also, yes, the static typing is good when I know exactly what I want to do. Mostly, I don't and so I'm forced either to lots and lots of copy-paste and typing - or relying on IDE and XML magic which tend to break down out of my control.

My favorite development environment is actually any of the Smalltalk-derived Squeaks or Pharo where there is no build, there is only the image.

Actually, having had a whiskey, I would argue that every time a file is created, the developer loses. Each file has to be maintained. Either by the developer directly (and then he knows about it) or by a byzantine build system (and then he doesn't). Dynamic languages only use one file per whatever and that file is only text - everything else is generated and/or consumed under the sheets by the VM. No need to manage that. So one can focus on programming rather than massaging the system

spartan, the said...

Squeaks or Pharo: could you compare them with Aptana? I use it when there is a lot of JS stuff to do.

dojo actually helps to organize stuff into at least "dijits". When people get this system, they will more likely use packages as well. After they will start asking themselves: where should I put that coolest little thing now? And so the code organization will be born. Good or bad, it is matter of taste, but at least an attempt to organize it, split into pieces, think about relationships.

Peter Svensson said...

@spartan: You can download pharo and see for yourself here;

But the main difference between the Smalltalk environment and an IDE like Aptana is that the former allow you to change and interact directly with the code, or in other words, everything is 'live' both running and as source code (if you want to look at it) at the same time.

I actually always organize my code in namespaces, since dojo encourages it and shows good examples in its source tree.

For example, at one customer I have the namespace 'customer.common.ui.dialogs' where a dozen custom dialogs and their superclasses are put, whereas 'customer.navigation.ui.widget.library' contains the five widgets that make up a 'library' view, and then of course 'customer.common.model' which houses a hierarchy of models which are created out of server data, and so on.


spartan, the said...

Pharo in Smalltalk-80 :) I wish I were 30 years younger to get it as cool thing :)

Seriously, I see Smalltalk as the most influential language, the one that made OOP so popular. It must be in the top ten of most important programming languages in this early age of IT. Not the first one, because FORTRAN will be above.

But come on... It's year 2011 now. Developers will soon be coding in web-based IDEs.

Peter Svensson said...

@spartan; I was referring to a richer interactivity with the Smalltalk image model vs. the classical IDE. The language could really be any dynamic language. Call it developer experience, if you want.

And, yes, we now have for example, but also Amber and the Lively Kernel as counter examples of dynamic web-based environments.

And speaking of 2011, it's really depressing that development has turned back towards the waterfall FORTRAN-esque static compile-source dreariness from the epic beauty that the Lisp machines and Smalltalk environments pioneered.

Essentially, though, that model has won if you look at web development.

With firebug (or chrome debugger, depending on taste) open, deguffing the live web page has a similar feel to what Smalltalk, etc. environments give.

spartan, the said...

Peter, I meant coderun as an example; just found cloud9ide looks promising, too. Also ShiftEdit, etc.

Just a short note about web IDEs. Will return to static vs. dynamic stuff some time later.

And yes, I like FFox (with some plugin) and Chrome "live edit" features too.

Peter Svensson said...

@spartan: Yes, there have been web-based IDEs for some time, I agree. The cloud9 approach takes it a bit further though, with good github integration and live collaboration services.

spartan, the said...

Actually, whether source control system integration is a good thing or a bad thing for a web IDE is a (good) question.

Sébastien Argobast in his "We don’t need an online IDE!" post puts it under question. It's right to question it because the easiest way to go to a new technology is by copying all the odd baggage with you (i.e. Oracle Web Forms was an unsuccessful attempt to convert a desktop application to a web application).
But I disagree with Sébastien, thinking that SCS is needed in web IDEs too. A developer's commit should be "transactional", not just "on every save". The SCS should just work behind the curtain, far away from the local machine.