On Google Wave – Part 1: Architecture

by Steven Devijver on June 6, 2009

If you haven’t done so already, watch the announcement and demo of Google Wave, which runs about 1 hour and 20 minutes. Wave is Google’s attempt at world domination and I welcome it.

I’ll write down my thoughts on the unique features of Google Wave in a few blog posts. I haven’t used Google Wave myself; I’m basing these posts on what’s available online today.

In this post I’ll talk about Wave’s underlying architecture based on what is known today. Architecture is probably the least accessible topic I intend to discuss, yet it is probably Wave’s most compelling feature. For me, Wave’s architecture definitely deserves closer inspection.

Wave has an associative memory architecture which is characterized by two important features:

  1. Content residing in the memory is seen by all nodes in the architecture that have to see it as if that content is local.
  2. Any modification to memory content by any node is guaranteed to be seen correctly by all other nodes, even in the case of concurrent updates.

Associative memory is actually distributed memory, meaning that nodes connected to each other by a network can access the distributed memory as if it were local memory. In other words: associative memory is memory in the cloud.

The best illustration of this associative memory can be seen in the demo: as soon as an e-mail is being written it can already be seen by its addressees, before the author has decided that e-mail is ready to be sent. In fact, in Wave there is no notion of sending a message. The message exists and is seen by all addressees as soon as it is created.

Addressees can see live updates as the authors are typing. This is another illustration of associative memory: what addressees are seeing is their local memory being updated as the incremental update messages arrive. Messages do not have to be sent because they already reside where they have to be: in the local memories of the addressees’ browsers.
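The idea of local copies that converge as incremental updates arrive can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Insert` operation, the `Replica` class and the `broadcast` function are hypothetical names, not part of the Wave protocol, and the network is replaced by a simple loop.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Insert:
    """Hypothetical incremental update: insert `text` at `position`."""
    position: int
    text: str


@dataclass
class Replica:
    """One participant's local copy of a message (a plain string here)."""
    content: str = ""

    def apply(self, op: Insert) -> None:
        # Updating the local copy is all a participant has to do to "receive" text.
        self.content = self.content[:op.position] + op.text + self.content[op.position:]


def broadcast(op: Insert, replicas: List[Replica]) -> None:
    """Stand-in for the network: deliver the same operation to every local copy."""
    for replica in replicas:
        replica.apply(op)


# The author types "Hi!" one keystroke at a time; every keystroke is an update.
author, alice, bob = Replica(), Replica(), Replica()
for position, char in enumerate("Hi!"):
    broadcast(Insert(position, char), [author, alice, bob])

print(alice.content, bob.content)
```

There is no "send" step anywhere in this sketch: the addressees' copies are complete as soon as the last keystroke has been delivered.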

Associative memories have been around for many years. The best known implementation is probably the open-source Terracotta (Java). There have been JavaScript implementations around as well, in particular Google Web Toolkit implementations.

However, what the Google Wave team has done is remarkable. They have taken the XMPP protocol and have extended it with their own associative memory protocol. You may not have heard of XMPP before, but you’ve probably heard of Jabber or Google Talk. That’s right, XMPP is an instant messaging protocol!

Google Wave’s associative memory implementation actually consists of two integrated architectures: a client-server architecture and a server-server architecture. The difference between the two architectures is that in the server-server architecture updates are sent between servers on behalf of users. The official name for a Wave server is a Wave provider.

It’s not entirely clear from the Wave protocol documentation whether XMPP is used between client and server, although it’s probably the case. Regardless, each Wave provider holds one copy of each wave, including the wave’s history. Each addressee also holds a local copy of the wave.

This is a walk-through of what happens when a client updates a wave in the case where all addressees are hosted on the same Wave provider:

  1. The client sends its Wave provider an update message that holds the operation that happened on the client. For example, if one letter was removed at position 10 the client would send this message: “Remove the character at position 10 in wave id X”.
  2. The Wave provider queues this message for later processing. The client doesn’t send any further updates until it receives an acknowledgement from the Wave provider for this message.
  3. The Wave provider processes the message as it pops out of the queue. The provider’s copy of the wave is updated. The Wave provider also broadcasts the same operation to all other clients and Wave providers which have to see updates to the affected wave (the clients of the addressees of the wave).
  4. The Wave provider sends an acknowledgement to the client where the change originated. That client can now continue to send updates.
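The four steps above can be sketched as follows. This is a simplified model under my own assumptions, not the actual Wave implementation: `RemoveChar`, `Client` and `Provider` are hypothetical names, messages are passed by direct method calls instead of a network, and only one client edits, so no concurrent operations have to be reconciled.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional, Tuple


@dataclass
class RemoveChar:
    """Hypothetical operation: remove the character at `position` in wave `wave_id`."""
    wave_id: str
    position: int


def apply_op(text: str, op: RemoveChar) -> str:
    return text[:op.position] + text[op.position + 1:]


@dataclass
class Client:
    name: str
    local: Dict[str, str] = field(default_factory=dict)       # local copies of waves
    outbox: Deque[RemoveChar] = field(default_factory=deque)  # updates not yet sent
    awaiting_ack: bool = False

    def edit(self, op: RemoveChar) -> None:
        # The operation happens on the client first and is then queued for sending.
        self.local[op.wave_id] = apply_op(self.local[op.wave_id], op)
        self.outbox.append(op)

    def next_to_send(self) -> Optional[RemoveChar]:
        # Step 2: nothing is sent while an earlier update is unacknowledged.
        if self.awaiting_ack or not self.outbox:
            return None
        self.awaiting_ack = True
        return self.outbox.popleft()

    def on_ack(self) -> None:
        self.awaiting_ack = False

    def on_broadcast(self, op: RemoveChar) -> None:
        self.local[op.wave_id] = apply_op(self.local[op.wave_id], op)


@dataclass
class Provider:
    waves: Dict[str, str]
    clients: List[Client] = field(default_factory=list)
    queue: Deque[Tuple[Client, RemoveChar]] = field(default_factory=deque)

    def receive(self, sender: Client, op: RemoveChar) -> None:
        self.queue.append((sender, op))              # step 2: queue for later

    def process_next(self) -> None:
        sender, op = self.queue.popleft()            # step 3: pop and apply
        self.waves[op.wave_id] = apply_op(self.waves[op.wave_id], op)
        for client in self.clients:
            if client is not sender:
                client.on_broadcast(op)              # step 3: broadcast to the others
        sender.on_ack()                              # step 4: acknowledge


provider = Provider(waves={"X": "Hello, world"})
ann = Client("ann", {"X": "Hello, world"})
ben = Client("ben", {"X": "Hello, world"})
provider.clients += [ann, ben]

ann.edit(RemoveChar("X", 10))                 # "Remove the character at position 10 in wave id X"
ann.edit(RemoveChar("X", 5))
provider.receive(ann, ann.next_to_send())     # step 1
blocked = ann.next_to_send()                  # None: still waiting for the acknowledgement
provider.process_next()                       # steps 3 and 4
provider.receive(ann, ann.next_to_send())     # the second update can now be sent
provider.process_next()
```

After both updates have been processed, the provider’s copy and both clients’ local copies hold the same text.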

When the Wave provider processes update messages the affected wave is locked. This means that updates to the same wave never happen concurrently on the Wave provider. On top of that, broadcast messages are processed immediately by clients. This means that a client may process updates from other participants while its own updates are still pending acknowledgement by the Wave provider.
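A minimal sketch of per-wave locking, assuming a multi-threaded provider. The source only says that the affected wave is locked; the `WaveStore` class and the use of Python threads are my own assumptions for illustration.

```python
import threading


class WaveStore:
    """Hypothetical provider-side store that holds one lock per wave."""

    def __init__(self):
        self._texts = {}
        self._locks = {}
        self._guard = threading.Lock()   # protects the lock table itself

    def _lock_for(self, wave_id):
        with self._guard:
            return self._locks.setdefault(wave_id, threading.Lock())

    def apply(self, wave_id, text):
        # The read-modify-write below is atomic per wave: two updates to the
        # same wave are serialized, updates to different waves are not.
        with self._lock_for(wave_id):
            current = self._texts.get(wave_id, "")
            self._texts[wave_id] = current + text

    def read(self, wave_id):
        with self._lock_for(wave_id):
            return self._texts.get(wave_id, "")


def worker(store, wave_id, count):
    for _ in range(count):
        store.apply(wave_id, "a")


store = WaveStore()
threads = [threading.Thread(target=worker, args=(store, "X", 1000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(store.read("X")))
```

Because every update to wave "X" takes the same lock, no update is lost even though four threads write at the same time.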

Critically, each Wave provider holds its own copy of a wave. That means that Wave providers have to process broadcast messages originating from other Wave providers. Broadcast messages from other Wave providers are handled in almost the same way as updates originating from clients:

  1. A broadcast message arrives from another Wave provider and is stored in a queue for later processing.
  2. The Wave provider processes the broadcast message as it pops out of the queue. The provider’s copy of the wave is updated. The provider also broadcasts the same operation to clients it hosts which have to see updates to the affected wave.
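The server-server case can be sketched in the same style. Again this is a simplified model under my own assumptions: `Append`, `HostedClient` and `Provider` are hypothetical names, every hosted client is treated as an addressee of every wave, and the originating client is updated by the broadcast rather than before it.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List


@dataclass
class Append:
    """Hypothetical operation: append `text` to wave `wave_id`."""
    wave_id: str
    text: str


@dataclass
class HostedClient:
    address: str
    local: Dict[str, str] = field(default_factory=dict)

    def on_broadcast(self, op: Append) -> None:
        self.local[op.wave_id] = self.local.get(op.wave_id, "") + op.text


@dataclass
class Provider:
    domain: str
    waves: Dict[str, str] = field(default_factory=dict)
    hosted: List[HostedClient] = field(default_factory=list)
    peers: List["Provider"] = field(default_factory=list)
    inbox: Deque[Append] = field(default_factory=deque)

    def _apply_and_notify(self, op: Append) -> None:
        # Update this provider's own copy, then the clients it hosts.
        self.waves[op.wave_id] = self.waves.get(op.wave_id, "") + op.text
        for client in self.hosted:
            client.on_broadcast(op)

    def submit_local(self, op: Append) -> None:
        """An update from a hosted client: apply it, then forward it to peers."""
        self._apply_and_notify(op)
        for peer in self.peers:
            peer.receive_remote(op)

    def receive_remote(self, op: Append) -> None:
        self.inbox.append(op)                          # step 1: queue for later

    def process_inbox(self) -> None:
        while self.inbox:
            # Step 2: apply and notify hosted clients only. The operation is
            # not forwarded to other providers again.
            self._apply_and_notify(self.inbox.popleft())


acme = Provider("acme.example", hosted=[HostedClient("ann@acme.example")])
globex = Provider("globex.example", hosted=[HostedClient("bob@globex.example")])
acme.peers.append(globex)

acme.submit_local(Append("X", "hello"))
pending = len(globex.inbox)    # the update waits in the remote provider's queue
globex.process_inbox()
```

The only difference from the client-server walk-through is the last hop: a provider that receives a broadcast from another provider updates its own copy and its own clients, and stops there.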

Through this architecture, updates between addressees on the same Wave provider never leave that server. This is critical for organizations that want to prevent sensitive content from leaving their private networks: these organizations can host their own Wave provider for their own users. On top of that, messages between servers are encrypted to ensure the authenticity of update messages.

Each user is hosted on one Wave provider. This means that the user connects to the user interface of that server and depends on that server to propagate updates across the Wave space and receive updates from the Wave space. User names are like e-mail addresses: <username>@<domain>.
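Since the domain part of an address identifies the user’s Wave provider, it also determines which other providers have to receive an update. A small sketch, where `parse_address` and `remote_domains` are hypothetical helpers of my own and the addresses are made up:

```python
def parse_address(address):
    """Split '<username>@<domain>' into (username, domain)."""
    username, sep, domain = address.rpartition("@")
    if not sep or not username or not domain:
        raise ValueError("not a valid address: %r" % address)
    return username, domain.lower()


def remote_domains(author, addressees):
    """Return the domains, other than the author's own, that must receive updates."""
    _, home = parse_address(author)
    return {parse_address(a)[1] for a in addressees} - {home}


# All addressees on the author's own provider: nothing leaves that server.
internal = remote_domains("ann@acme.example", ["ben@acme.example", "cat@acme.example"])

# One addressee elsewhere: only that provider receives the update.
mixed = remote_domains("ann@acme.example", ["ben@acme.example", "bob@globex.example"])

print(internal, mixed)
```

An empty result for the first wave is exactly the case where updates between addressees on the same Wave provider never leave that server.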

Google has announced that the Wave provider software as well as client software will be open-sourced. This means that organizations can run their own Wave providers as they now run their own mail servers.

The Google Wave architecture provides a powerful federation protocol that, together with the other Google Wave features, will prove to be highly valuable for its users. Interestingly, Google has not abandoned the old e-mail architecture where anybody can choose to run their own e-mail server. This makes the federation principle even more powerful since organizations and users can now choose to protect their sensitive content in the security of their own networks.

Google Wave solves an old Google Apps headache: users and organizations didn’t want to store their own content on Google servers. Since Google Wave will eventually also offer spreadsheet and probably also database functionality, Google now has a very powerful, federated productivity suite offering.

In the next post on Google Wave I’ll look at Wave’s unified messaging features.

