Programming Opa: Web development, reimagined

MLstate's Opa streamlines Web app development with a single language for client and server, but the bright promise is not without pitfalls

By Rick Grehan

InfoWorld

Building a Web application today means using a variety of different software technologies, each executing in a different domain: JavaScript, HTML, and CSS in the browser; PHP, Python, Java, Ruby, or the like on the server; and MySQL, PostgreSQL, SQL Server, MongoDB, or any of a growing list of database servers as your persistent storage back end. With Opa, an open source Web development technology from the French company MLstate, building a Web application tomorrow could be much more straightforward -- and safer.

Not only are today's diverse technologies difficult to master, they complicate security. Each boundary between the different domains requires a communication mechanism that passes data between those technologies. And each of these communication conduits can be exploited, as attackers can intercept data or inject damaging information.

Opa tackles these issues from a fresh angle. With Opa, you write your Web application as though it were a single-tier program, and the compiler handles the knotty details of partitioning your program and deploying the resulting components to their proper domains. The compiler also builds the communication infrastructure among application components, and that infrastructure is invisibly managed by the runtime. The security weaknesses inherent in today's Web applications are virtually eliminated.

This apparent magic is possible because Opa is not just a Web framework, but an entirely new language, engineered specifically to let you create Web applications as though they were monolithic applications. You program the entire application in the Opa language; you don't soil your hands with JavaScript or SQL. Opa creates the client code, the server code, and the database code under the covers. (Actually, Opa doesn't soil its hands with SQL, either. Its internal database is not SQL-based.)

In addition, an Opa application's monolithic structure is not merely conceptual. All the pieces of your Web application -- the HTML files, JavaScript source, image files, CSS files, and so on -- that would ordinarily be placed in separate directories are bundled by the compiler into a single executable. This makes deployment simple; you need only copy one executable file to the deployment destination.

It also improves security significantly. Your code thinks that the application's directories and files are actual directories and files out on a file system somewhere. But, from the perspective of the outside world, the files and directories aren't there. Someone who manages to access your Web application's home directory cannot manipulate the constituent HTML, CSS, or JavaScript files because they simply don't exist. In addition, the Opa compiler runs a security audit as it builds your application, minimizing the likelihood that you might have inadvertently introduced a client-to-client or client-to-server code injection security weakness. The compiler will not allow foreign code to be inserted into the execution flow at runtime.

Test Center Scorecard

Criterion weight	25%	25%	20%	20%	10%	Overall score
MLstate Opa 1.0 S3.5	6	8	6	7	9	7.0 (Good)

Sounds wonderful, but doesn't this all-in-one structure complicate development? Suppose that, upon executing your application, you discover that you've put the wrong PNG file on one of the Web pages. If everything is stuffed into a single executable, you have to replace that PNG file, then rebuild the whole project, right? Happily, no. Simply execute your application with one of Opa's debug options enabled, and Opa creates an "opa-debug" directory -- a directory that contains all of the application's modifiable files. Replace the incorrect PNG file with the correct version -- which you can do even as the application executes -- and Opa will use the copy in the "opa-debug" directory, rather than the version embedded in the application.
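The override behavior described above can be pictured with a short Python sketch. It is only an illustration of the lookup order (debug directory first, embedded copy second), not Opa's actual mechanism; the function name `load_resource` and the `embedded` dictionary are invented for this example.

```python
from pathlib import Path
from typing import Dict


def load_resource(name: str, embedded: Dict[str, bytes], debug_dir: Path) -> bytes:
    """Prefer an on-disk override; fall back to the copy baked into the program."""
    override = debug_dir / name
    if override.is_file():
        # A file dropped into the debug directory wins, even mid-run.
        return override.read_bytes()
    # Otherwise serve the version bundled at build time (hypothetical dict).
    return embedded[name]
```

Because the check happens on every lookup, replacing a file in the debug directory takes effect on the next request without a rebuild, which is the convenience the debug option is said to provide.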

Inside Opa

Opa is written in OCaml, which has influenced the design of the Opa language. Programmers from imperative or object-oriented language backgrounds will face a steep learning curve and will need to grapple with new terminology. For example, a sum type is a type whose values can take one of several distinct forms, each carrying its own data; it serves roughly the function of a tagged union in C/C++. The term "pattern matching" refers to the mechanism used in Opa's branching structures, which are roughly analogous to switch() statements but far more powerful, because a branch can test the shape of a value and extract its parts in one step. And all this is on top of Opa's demand that you jettison the habit of explicitly considering which part of an application executes in the browser and which part executes on the server.
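For readers coming from imperative languages, the two ideas can be approximated in Python. This is an analogy only, not Opa syntax; the names `Circle`, `Rect`, `Shape`, and `area` are invented for illustration, and Python checks the cases at runtime rather than at compile time.

```python
from dataclasses import dataclass
from typing import Union
import math


@dataclass(frozen=True)
class Circle:
    radius: float


@dataclass(frozen=True)
class Rect:
    width: float
    height: float


# A "sum type": a Shape is exactly one of the listed variants.
Shape = Union[Circle, Rect]


def area(shape: Shape) -> float:
    """Branch on which variant was supplied and use that variant's fields."""
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Rect):
        return shape.width * shape.height
    # A language with real sum types can rule this case out before the program runs.
    raise TypeError(f"not a Shape: {shape!r}")
```

Unlike a plain switch() on an integer, each branch here both identifies the variant and gains access to the data that only that variant carries.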

Opa is a typed language, and its data types extend beyond the primitives of int, string, and so on, and even beyond complex types like list and record. CSS, for example, is a data type, as is XHTML, and these designations are more than mere language conveniences. The compiler automatically escapes values inserted into XHTML, which avoids injection attacks. The compiler can also perform type inference, in which it deduces a value's data type from the value itself. So you don't have to declare a value as type XHTML; just store XHTML into it, and the compiler will figure it out.

As you might already have guessed, Opa cannot completely avoid all aspects of Web programming. You still have to work with HTML and CSS to construct your Web application's views. The connection between declarations in an executing Opa application and generated XHTML delivered to the browser is provided by what Opa calls an insert. Place the string {x.author} into your XHTML, and Opa understands this to mean that the value defined by the author member of the argument x is to be poured into the resulting XHTML (properly escaped, of course). Handling events -- user actions in the browser -- is equally straightforward. Using a notation similar to inserts, you specify which Opa function is called when, say, an ondblclick (on double-click) event is triggered on an element in a Web page. From your application's point of view, the function that executes as a result of the event is an Opa function rather than a JavaScript function.
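The effect of an escaped insert can be imitated in Python with the standard library's `html.escape`. This is a rough sketch of the idea, not Opa's implementation; the `Xhtml` wrapper and the `insert` and `render_post` helpers are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class Xhtml:
    """Markup that is already safe to send to the browser."""
    markup: str


def insert(value) -> str:
    """Imitate an insert: trusted markup passes through, anything else is escaped."""
    if isinstance(value, Xhtml):
        return value.markup
    return escape(str(value))


def render_post(author: str, body: str) -> Xhtml:
    # Plays the role of placing {x.author} and a body insert inside markup.
    return Xhtml(f"<div><b>{insert(author)}</b>: {insert(body)}</div>")
```

Calling `render_post("Ann", "<script>alert(1)</script>")` yields markup in which the script tag arrives as inert text. The point of making XHTML a distinct type is that the decision to escape follows from the type of the value, not from the programmer remembering to call a function.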

An Opa application's initial runtime component -- the element of the application that actually receives HTTP requests and dispatches execution to the proper handler -- is a server. This corresponds roughly to the Web server in a traditional Web application, and it's invoked by calling the server() function. You can call server() more than once in a single Opa application. This can be useful, for example, if your application provides both secure (HTTPS) and unsecured (HTTP) access. Your application can invoke two servers, each to manage a separate port.

Similarly, you might invoke multiple servers, each responsible for different sets of requests. When an incoming request arrives, the receiving server inspects the URL and determines whether it's responsible for handling that request. If not, that server can use Opa's internal communication systems to hand the request off to whichever of the available servers is responsible.
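The hand-off between servers might be pictured as follows. This Python sketch assumes each server owns a URL prefix and substitutes a direct method call for Opa's internal communication system; the `Server` class and its fields are invented for illustration.

```python
from typing import Callable, List

Handler = Callable[[str], str]


class Server:
    """A toy server that owns every URL under one path prefix (an assumption)."""

    def __init__(self, name: str, prefix: str, handler: Handler) -> None:
        self.name = name
        self.prefix = prefix
        self.handler = handler
        self.peers: List["Server"] = []

    def owns(self, path: str) -> bool:
        return path.startswith(self.prefix)

    def handle(self, path: str) -> str:
        if self.owns(path):
            return self.handler(path)
        # Not ours: hand the request to whichever peer claims it.
        for peer in self.peers:
            if peer.owns(path):
                return peer.handler(path)
        return "404 not found"
```

With two such servers registered as each other's peers, a request for a URL under the second server's prefix is answered correctly no matter which of the two receives it.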

Opa also allows multiple instances of an application to run in parallel, thereby providing application scaling. Requests are load-balanced among the application's instances, and the Opa load balancer ensures that requests from the same client are sent to the same server. However, although the components that handle scaling are distributed among the application's instances, they are not replicated. Consequently, a distributed Opa application is not fault-tolerant; if one instance dies, the session state of clients served by that instance is lost. As you might imagine, this is an area of intense development for the Opa engineers.
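One common way to keep a client pinned to the same instance is to hash a client identifier. How Opa's load balancer actually does this is not documented here, so the Python sketch below is an assumption that merely shows the "same client, same server" property; `pick_instance` and the instance names are hypothetical.

```python
import hashlib
from typing import List


def pick_instance(client_id: str, instances: List[str]) -> str:
    """Map a client to one instance, deterministically."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then wrap to the list size.
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]
```

The sketch also makes the fault-tolerance gap concrete: if an instance disappears from the list, its clients are simply mapped somewhere else, and any session state that lived only on the lost instance is gone.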

Opa database options

Opa's built-in database works somewhat like a persistent key-value store. The key is a path that leads down the database tree to a leaf (the value). So, for example, to fetch the value associated with the message001 leaf of a blog database, you might use code that looks like this:

blogvar = /blog["message001"]

Because the database uses a tree structure, it is schemaless. No data definition functions are required (or even available) for specifying what data types appear where in the database. Nor are there any restrictions placed on the data types that can be stored in the leaves of the database tree; primitives are handled as easily as complex data structures.
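A path-addressed tree of this kind is easy to mimic in memory. The Python sketch below is a toy model with no persistence, intended only to show how a path selects a leaf and how a leaf can hold any kind of value; `PathStore`, `write`, and `read` are invented names, not part of Opa.

```python
from typing import Any, Dict, Tuple


class PathStore:
    """A toy in-memory tree: interior nodes are dicts, leaves are any value."""

    def __init__(self) -> None:
        self.root: Dict[str, Any] = {}

    def write(self, path: Tuple[str, ...], value: Any) -> None:
        node = self.root
        # Walk (and create) interior nodes down to the leaf's parent.
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value

    def read(self, path: Tuple[str, ...]) -> Any:
        node: Any = self.root
        for key in path:
            node = node[key]
        return node
```

In this model, `db.read(("blog", "message001"))` plays the role of `/blog["message001"]`, and nothing had to be declared in advance about what the leaf contains.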

The database's snapshot feature is particularly powerful. Preceding a path with a ! takes a snapshot of that path, and this snapshot can be saved in a variable. Subsequent store operations to values on the path will not affect the snapshot, which remains "alive" as long as the associated variable remains in scope.
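The snapshot semantics can be illustrated in Python with a deep copy. This is a sketch of the observable behavior only; a real database would likely avoid copying the whole subtree, and the `snapshot` helper and sample data are hypothetical.

```python
import copy
from typing import Any, Dict


def snapshot(tree: Dict[str, Any], key: str) -> Any:
    """Return a frozen-in-time copy of the subtree stored under key."""
    return copy.deepcopy(tree[key])


db = {"blog": {"message001": {"text": "first draft"}}}

snap = snapshot(db, "blog")                   # stands in for reading !/blog
db["blog"]["message001"]["text"] = "edited"   # a later write to the live path
```

After the write, `snap` still holds the original text while the live tree holds the edited text, which is the isolation the ! prefix is described as providing.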

However, as flexible as Opa's internal database system is, it is meant primarily to support either prototype applications or applications that do not require data-intensive operations. The database does not scale, and scalability is supposed to be one of Opa's outstanding characteristics. For applications that must manage large databases (or that anticipate database scaling issues) the Opa engineers recommend the use of MongoDB, a well-known NoSQL database system that stores data in the form of BSON (binary JSON) documents.

The current release of Opa provides an evaluation API for MongoDB that includes functions for translating between BSON and Opa data types. Opa's MongoConnection library manages connections with a MongoDB server, handling database cursors as well as automatic reconnection and failover on disconnect. Meta operations -- deleting an entire database, for example -- are available via Opa's MongoCommands module.

Full MongoDB support, as well as support for the well-known CouchDB database, is expected in the upcoming S4 release. Note that the MongoDB and CouchDB modules do not replace Opa's existing database API. You can use the Opa path-oriented database side-by-side with either MongoDB or CouchDB.

Long live Opa

Currently, Opa runs on 64-bit x86 Linux systems; Windows and 32-bit versions are in the works. Build an Opa application today, and you could deploy it on just about any cloud provider that supports 64-bit Linux. You could, for example, host your Opa application on Amazon EC2. However, MLstate has partnered with DotCloud to provide specific Opa support. (Information is available at DotCloud's website.) Note that, at the time of this writing, the Opa service provided by DotCloud was considered to be in beta and did not support scaling.

Licensing of Opa is free until revenues from an Opa-based application reach $1 million. At that point, you must contact MLstate for pricing information.

Opa's documentation is still under construction. This makes learning the language particularly difficult, to say nothing of the work involved in finding your way through the functions of the runtime APIs. As an example, while working through the code snippets provided in the short tutorials sprinkled throughout the online manual, I discovered that there are at least three ways to define a function. One is straightforward and very C-like, but the others are more complex. I could find nothing in the documentation (at the time of this writing) that covers this topic, or even tells whether there is any advantage to defining a function one way rather than another.

Possibly Opa's greatest weakness is its lack of a robust IDE and, as a result, its lack of a good debugger. Because Opa's server code runs in OCaml, you can use OCaml facilities to debug server-side code. And given that Opa's client-side code is JavaScript, you could use, say, Firebug to debug that. But considering Opa's central promise is that you don't have to treat a Web application as a client/server creation, having to use separate debuggers for client code and server code feels like a violation. An Eclipse plug-in is available, but it's described as being in development. So, currently, the preferred technique for debugging your Opa application is by using the tried-and-true "debug-by-printf" method.

Opa's fundamental principle -- that building a Web application shouldn't require a developer to wrestle with multiple languages and technologies -- is a worthy goal that I hope MLstate ultimately achieves. Nevertheless, as worthy as that objective is, it demands that you swallow at least one bitter pill: You have to learn another programming language, and you have to hope that the effort you exert in freeing yourself from the tangle of different technologies will be worth it in the long run. In short, you have to hope that Opa will survive as a technology so that the effort you've expended on it isn't wasted. Here's hoping Opa succeeds.

Opa Web framework at a glance

	Pros	Cons
MLstate Opa 1.0 S3.5	Powerful, terse, expressive language; All components packaged into a single executable; Support for MongoDB, with support for CouchDB underway; Free until you make $1 million in revenue	Incomplete documentation; Steep learning curve; Linux only (64-bit x86); No IDE nor debugging facilities

This article, "Programming Opa: Web development, reimagined," originally appeared at InfoWorld.com.
