Is the sky falling and the end near for open source?

Microsoft's vision for XML isn't XML; it's another Microsoft attempt at a Web-based RPC framework

(LinuxWorld) — Does what Microsoft is doing with XML spell the end for open-source office applications in general and OpenOffice.org in particular?

Gary Edwards, a design consultant for Web applications and OpenOffice.org's representative on the OASIS Open Office XML Format Technical Committee, seems to think so. Certainly, he has repeatedly expressed his concern that next-generation Microsoft Office Suites will force an even greater degree of Microsoft lock-in than current releases do.

Edwards is undoubtedly right to be worried. Microsoft's use of XML is sufficiently Microsoft-centric that the price of use will almost certainly include platform consistency, meaning that all users must be at the same release level for both the Microsoft operating system and Microsoft Office. Because Microsoft Office documents operate a bit like Internet worms, spreading Microsoft Office wherever they land simply because there's no other reasonable way for the recipient to access them, the concern is that early adopters will force their business partners to follow suit and eventually lock out open-source products like OpenOffice.org.

Of course, the open-source community could follow Microsoft's lead to add XML functionality to its products, thus maintaining interoperability, but there are some difficult issues that may make this technically impractical and legally impossible.

The impracticality arises because it's actually much harder to maintain bad software than good, and what Microsoft is promising to do with XML is bad software. Microsoft can throw a few thousand people at keeping the resulting house of cards more or less standing; open source cannot. The legal issue is more difficult: Microsoft has become increasingly willing to use litigation to prevent open-source programmers from matching its technologies, and matching some of what it is promising will almost inevitably require using the same algorithms, exposing the programmer to the threat of legal action.

Doomsayers therefore conclude that the sky is falling and the end is near for open source.

I don't think that perception or the arguments behind it are right; on the contrary, Office 1X and Microsoft's vision for XML spell opportunity, not Armageddon, for groups like OpenOffice.org.

To start with, Microsoft is not using XML as the rest of us understand it. Get trapped into thinking that XML is XML, and you're halfway to a "follow blindly" decision; understand the differences and you'll quickly realize that there are better options.

Standard XML is a metalanguage: it defines how to build markup vocabularies, each described by a document type definition (DTD) or schema. An XML vocabulary such as SAML (Security Assertion Markup Language) is therefore fundamentally passive, acting merely as an information container whose known structure makes its contents easily accessible to a rendering engine or other consuming application.

Unfortunately, this isn't how Microsoft sees XML. In fact, Microsoft's vision is about as related to standard XML as ActiveX was to the X Window System.

Microsoft & XML

Microsoft sees XML as a Web-programming language and not as a framework for markup. A Microsoft XML-document processor can actively trigger applications that replace content in the current document with the output of programs running locally or on other machines. That change makes it possible to see XML as a Web-programming language — something that will, when implemented, make it possible to write documents that contain truly active, self-updating components. For example, companies could send out financial statements as Office 1X documents that continually update themselves from the production database behind the source company's financials.
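To make the idea of an "active" document concrete, here is a minimal sketch in Python. It is not Microsoft's format or API: the `live` element, its `source` attribute and the `PROVIDERS` registry are all hypothetical names invented for illustration. The point is only that the processor stops treating the markup as passive data and starts calling programs to fill it in.

```python
import xml.etree.ElementTree as ET


def ledger_total() -> str:
    """Hypothetical stand-in for a query against a production database."""
    return "1250"


# Hypothetical registry mapping a source name in the document to a program.
PROVIDERS = {"ledger.total": ledger_total}


def resolve_live_content(root, providers) -> int:
    """Fill every <live> element with its provider's output.

    Returns the number of elements filled. An unknown source raises KeyError.
    """
    filled = 0
    for node in root.iter("live"):
        provider = providers[node.get("source")]
        node.text = provider()
        filled += 1
    return filled


doc = ET.fromstring(
    '<statement><revenue><live source="ledger.total"/></revenue></statement>'
)
resolve_live_content(doc, PROVIDERS)
print(ET.tostring(doc, encoding="unicode"))
```

Every time the document is opened, the processor has to be trusted to run whatever the document names, which is exactly where the trust problems discussed below come from.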

This, of course, is the promise the Unix community responded to with the development of RPCs (remote procedure calls) and distributed file systems in the early 1980s. RPCs turned out to depend on mutual trust. As Internet access spread to the masses, these ideas had to morph into more secure forms, eventually giving rise to the ideas behind specifications for things like CORBA object brokers at the Internet level and JavaBeans, enterprise or otherwise, at the local level.

Even within the Unix community, these secured forms of RPC have been much less widely accepted than expected, in part because the PC got in the way, in part because this stuff isn't easy, but mainly because most companies turned out not to want to do this most of the time.

Recognize that Microsoft's vision for XML isn't XML but yet another Microsoft attempt at a Web-based RPC framework, and you'll probably agree that it's likely to eventually suffer the fate of its predecessors. In the short term, the security issues are going to prove insurmountable. In the medium term, most customers will find they really don't want to do this. In the longer term, Microsoft will be forced to adopt the ideas that the Unix community evolved to effectively manage and use RPCs.

Meanwhile, however, it can do a lot of damage to the open-source movement if enough customers are befuddled into thinking both that it will work and that many other people will want to use it. After all, it is marketing success, not technical success, that counts for Microsoft.

Hype & security

The pre-marketing push appears to be focusing on document security and related applications. Thus, customers are promised the ability to distribute documents that:

  1. Cannot be copied or forwarded to third parties by the recipient
  2. Authenticate the user before allowing themselves to be displayed
  3. Keep their own contents current through live links to remote databases

There are many users in business and government to whom this sounds like hot stuff because it promises to combine the security of paper documents with the flexibility of network-distributed electronic documents. In reality, however, some of these things cannot be done, and there are far better ways (read: open source) of achieving the others.

It is not possible, for example, to make a document that can be read but not copied. You can make copying harder, but you can't make it impossible as long as the person doing the reading is free to leave the building and talk to others.

It appears that the Microsoft way to make it harder to copy or distribute documents will be based on embedding active XML-controls in the documents. The application reading the document will then first read the authorization component, perform the appropriate online checks, download the decryption key for the text itself and then decrypt and process the remaining XML to produce the document.

For this to work, the user needs the right tools and the online verification process has to proceed smoothly, which means every required service must be online at the moment the document is opened. The document could not be opened, for example, if the remote server were offline, had its name or address changed or had its software changed substantively. Document life would therefore be limited to that of the shortest-lived library or external resource called from within it.
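The open sequence described above (read the authorization component, check online, fetch the key, decrypt) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `KeyServer` and `ProtectedDocument` classes, the server name, and above all the toy cipher, which is not real cryptography. What the sketch shows is the dependency chain: the text is unreachable unless the named server exists, is online and agrees.

```python
import hashlib
from dataclasses import dataclass


class ServerUnavailable(Exception):
    """The authorization server named in the document could not be reached."""


def toy_cipher(data: bytes, key: bytes) -> bytes:
    """XOR with a SHA-256 keystream. Illustration only, NOT real cryptography."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass
class KeyServer:
    """Hypothetical verification server holding keys and an access list."""
    keys: dict          # doc_id -> decryption key
    authorized: set     # users allowed to read
    online: bool = True

    def fetch_key(self, doc_id: str, user: str) -> bytes:
        if not self.online:
            raise ServerUnavailable("server is offline")
        if user not in self.authorized:
            raise PermissionError(f"{user} may not read {doc_id}")
        return self.keys[doc_id]


@dataclass
class ProtectedDocument:
    server_name: str    # authorization component embedded in the document
    doc_id: str
    ciphertext: bytes


def open_document(doc: ProtectedDocument, user: str, network: dict) -> str:
    """Look up the named server, obtain the key, then decrypt the body."""
    server = network.get(doc.server_name)
    if server is None:
        # A renamed or re-addressed server looks the same as a missing one.
        raise ServerUnavailable(f"{doc.server_name} not found")
    key = server.fetch_key(doc.doc_id, user)
    return toy_cipher(doc.ciphertext, key).decode("utf-8")
```

Take the server offline, or register it under a new name, and `open_document` fails for every copy of the document ever distributed, which is the lifetime limit the paragraph above describes.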

There could be advantages to putting out documents knowing that inevitable hardware or software change at the source server will leave them unreadable in the hands of remote users. There are risks, too. For example, the next SQL-Slammer worm could render internal documents unusable for the duration of the external network emergency.

The underlying reality here is that putting the authentication client in the document and then using RPCs to authorize each reader isn't very smart. In contrast, the open-source solution, regenerating documents at source on an as-authorized basis, is both elegant and effective.

In my Nichievo project (described in a four-part series), for example, I use Cocoon to do both things. Documents are assembled on request at the Web server, marked up to meet user needs, given unique internal identifiers, formatted, encrypted and dispatched to authorized requesters identified via a key exchange. This approach takes some minor risks with document integrity but has strong controls on document identification that should make it impossible to pass off a forgery as the real thing. Someone with the right skills and access could steal documents, but could not exploit them against the company except through publicity and the resulting minor embarrassment.
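The identification half of that approach can be sketched as follows. This is not the Nichievo code and not Cocoon (which is a Java framework); it is a Python illustration under stated assumptions: `SERVER_SECRET`, `generate_document` and `is_genuine` are hypothetical names, and encryption and the key exchange are left out entirely. Each copy gets a unique identifier and a keyed tag that only the issuing server can produce, so a forged or altered copy fails the check.

```python
import hashlib
import hmac
import json
import uuid

# Hypothetical secret known only to the issuing server.
SERVER_SECRET = b"replace-with-a-real-secret"


def _tag(doc_id: str, requester: str, body: str, secret: bytes) -> str:
    """Keyed digest over the identifier, the requester and the content."""
    payload = json.dumps([doc_id, requester, body]).encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def generate_document(body: str, requester: str, secret: bytes = SERVER_SECRET) -> dict:
    """Assemble one copy of a document for one authorized requester."""
    doc_id = uuid.uuid4().hex  # unique per generated copy
    return {
        "id": doc_id,
        "requester": requester,
        "body": body,
        "tag": _tag(doc_id, requester, body, secret),
    }


def is_genuine(doc: dict, secret: bytes = SERVER_SECRET) -> bool:
    """Recompute the tag at the source and compare in constant time."""
    expected = _tag(doc["id"], doc["requester"], doc["body"], secret)
    return hmac.compare_digest(expected, doc["tag"])
```

Because verification happens at the source rather than inside the reader's software, nothing about the recipient's platform has to be trusted or kept in step.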

Microsoft's big idea

It is hard to believe, however, that the people in charge at Microsoft don't know this. Look at Microsoft's attempt to extend XML into a Web-programming language from a technical perspective, and it looks naive at best and suicidal at worst. This is not a combination of characteristics usually associated with the people who run Microsoft. Why do they think this is a great idea?

I've heard a number of opinions, some of which start out from the premise that Microsoft is fundamentally dishonest. For example, people tell me Microsoft is doing this because:

  1. The open-source movement depends on people motivated by technical excellence. They won't be willing to follow Microsoft down this path, thus giving Microsoft back its desktop monopoly.
  2. So far it's just a puppet show — Microsoft issues press releases, which results in unquestioning media acceptance that spreads fear, uncertainty and doubt in the open-source community. Meanwhile, Microsoft doesn't actually need to do anything.
  3. Microsoft has no idea what it is getting into. Remember, this is a company that's never invented anything of substance. Microsoft is run by brilliant marketers who hire some of the best technical people coming out of the schools just to keep them away from companies like Sun. Then Microsoft gives them enough money to keep them hanging in while their potential as technical contributors is ground down by the bureaucracy. Microsoft's history is riddled with ideas like this, superficially attractive but technically stupid things that can be hyped and then dropped. Why do you think it takes 50 million lines of code to deliver the Windows interface?
  4. Microsoft XML isn't part of any deep strategy. It is just another quick fix, like DLLs, that has gotten so far out of hand that Microsoft has to call it a "strategy."

Those kinds of answers have their attractions, but I don't think either malice or incompetence need be invoked when a better-fitting answer is at hand. I think some people at Microsoft see an opportunity here to address what they see as a critical problem in licensing and copy protection.

Problems, problems

Microsoft has a number of serious structural problems. For example:

  • Microsoft's continual failure to expand outside its core markets
  • The open-source assault on its former government monopolies outside the U.S.
  • The fundamental incompatibilities between its home entertainment and enterprise markets
  • Revenue and expectations management, its most important daily management problem

Each time peripheral investment failures and low core-market growth force a price increase to meet revenue needs, that price increase has two unhappy side effects:

  1. It increases the rewards for theft.
  2. It lowers psychological barriers to theft.

Asking people to donate $20 if they like your shareware works in part because people don't want to rip off the little guy and in part because $20 is, for most people, less than the price at which greed overwhelms conscience.

Justifying a few shady copies of Office XP on Windows XP is a lot easier. Not only can Microsoft be portrayed as the giant corporate bad guy, but a simple "don't ask" policy can make taking home a PC with unaccounted-for software installed under your organization's general license seem like a perfectly reasonable employment benefit.

To you this may not seem like a big deal. In reality, the U.S. numbers are probably quite small, but it doesn't look that way to Microsoft. To Microsoft, even failure to upgrade on its schedule represents a form of revenue theft to be fought tooth-and-nail.

One of those teeth consists of making common cause with the recording industry. Beyond that, however, I think they see Passport and Palladium as among the first nails and Microsoft XML as the hammer that drives them home.

The big issue in licensing is ensuring that every working copy of the software can be matched to a revenue receipt. Extend the file format used to store Office documents to include mandatory active links that verify licensing and every document transferred from one PC to another becomes a license-enforcement tool for Microsoft.

The attraction

That's the central attraction. Many people buy their Microsoft Windows operating system just so they can run Microsoft Office. Take that franchise away and the company stands revealed as a conglomerator of poor investments; strengthen it, and the company can report further revenue increases.

So that's the vision: Office applications that embed source and destination-license verification into the document-loading process.

Implement that vision via a hypothetical MLVML (Microsoft License Verification Markup Language) embedded as one of the default DTDs available to the XML parser and you get a miraculous bit of synergy.

<?mlvml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet">
  <registration description="E2KXPSP2" progid="XP802AC54C" version="1.09" classid="{71f81b28-4695-4220-bd77-c21abaca02cb}" rightsid="{62228809749892649614109586926888399956309606359249805529046}">
  </registration>
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <LocationOfComponents HREF="file:///H:\2KApps\MSOffice%20XP"/>
  </DocumentProperties>
</Workbook>

With this, Microsoft gets an unbeatable framework for Microsoft-license enforcement — and gets customers to pay for it by selling the controls needed to make it work as a fine-grained method for controlling document-distribution.

With it, user organizations can issue multiple levels of document-reading and reproduction rights to their users and enforce them through a pair of closely related XML/DTDs — one of them on a Microsoft document-verification server and one under the user organization's own control. Users whose access is limited to the same Windows technology used to impose the limits can't escape from them. Provided only that the organization has perfect platform consistency, this approach delivers the otherwise mythical non-forwardable, non-printable, read-once document.

The opportunities for synergy here are dazzling. The sales force can honestly trumpet the technology as a security first for Microsoft and probably make money selling Microsoft document verification servers into "Enterprise 6.0" organizations — while using the documents people work with and exchange to force upgrades and license concurrency down to the last PC in the network.

If Microsoft genuinely had no competition in office software, the security failures, complexities and costs associated with trying to make this scheme work wouldn't matter. The bottom line for people who have no choice is that they have no choice.

Microsoft's road ahead

In reality, Microsoft does have competition, and problems "getting there" therefore have significance.

If you were advising Microsoft on this, what would you suggest? I think that the obvious thing for them to do is exactly what they are doing — use a little bit of more-or-less-standard XML in some products while trying to leverage market perception of their inevitable dominance in Office applications to force the open-source community into following them down this road. If the OpenOffice.org people follow Microsoft's lead, they'll consign themselves to a poor second place, and if they don't, Microsoft can just reshuffle the deck and pretend the whole thing never happened.

On the other hand, if you were advising the open-source movement, what would you suggest? In my case, I'd suggest that trying to be a cheaper substitute for Microsoft Office is a losing strategy. In the end, people don't care about cheap. People want better options built around better ideas. That's why Linux won the battle for hearts and minds in the Intel-server arena — not because it's cheaper, but because it's better.

The necessary pieces are in place. The Liberty Alliance defined a framework for document authentication, user identification and access authorization that can be used with SAML now. The Unix smart-display architecture offers highly secure document routing and control to the organizations that need it. Product sets such as Cocoon point the way to secure, on-the-fly document generation and delivery without imposing proprietary restrictions on the user's hardware or software.

One thing that's missing is an easily deployable format gateway. In firewall mode, these gateways would translate all incoming Microsoft Office documents to OpenOffice.org formats on the way in and optionally reverse that transformation on the way out, automatically handling any needed cryptographic functions and third-party server calls in both directions.

In Web-server mode — like the TOM document conversion server at CMU — these would accept documents in one format and forward or return them in the alternative format, except that Microsoft Office formats would not be forwarded to addresses inside the organization.

Format gateways would let organizations with thousands of users participate in Microsoft document exchanges while using open-source products internally and limiting their Microsoft exposure to a couple of racks used to offload conversion processing at the gateway.
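The routing policy of such a gateway is simple enough to sketch. The Python below is an illustration only: `gateway_target` is a hypothetical name, the extension table covers just three common Microsoft Office and OpenOffice.org 1.x file types, and the actual conversion, cryptographic handling and server calls are deliberately left out. It shows the decision the gateway makes for each file: convert on the way in, and optionally reverse the conversion on the way out.

```python
from pathlib import PurePosixPath

# Illustrative pairs: Microsoft Office extension -> OpenOffice.org 1.x extension.
MS_TO_OPEN = {".doc": ".sxw", ".xls": ".sxc", ".ppt": ".sxi"}
OPEN_TO_MS = {open_ext: ms_ext for ms_ext, open_ext in MS_TO_OPEN.items()}


def gateway_target(filename: str, direction: str, convert_outbound: bool = True) -> str:
    """Return the name under which the gateway would deliver a file.

    direction is "inbound" (entering the organization) or "outbound".
    Files the gateway does not recognize pass through unchanged.
    """
    path = PurePosixPath(filename)
    suffix = path.suffix.lower()
    if direction == "inbound":
        new_suffix = MS_TO_OPEN.get(suffix)
    elif direction == "outbound":
        new_suffix = OPEN_TO_MS.get(suffix) if convert_outbound else None
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    return str(path.with_suffix(new_suffix)) if new_suffix else filename
```

For example, `gateway_target("report.doc", "inbound")` yields `report.sxw`, so no Microsoft Office format ever reaches an internal desktop, while outbound files can be converted back for partners who insist on them.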

Put these pieces together and what open source offers is a better option, thus making the real bottom line on Microsoft's use of XML in Office 1X a simple one: Follow and lose, or continue to diverge and win by offering a smarter alternative that also happens to be cheaper.

More Stories By Paul Murphy

Paul Murphy wrote and published 'The Unix Guide to Defenestration'. Murphy is a 20-year veteran of the IT consulting industry.