How Firefox Contacts Auto-Discovery Works


In version 0.3 of Firefox Contacts, I wanted to explore how the web browser could gather more information about a person to help create a personalized contact page.

What the software now does is this: When a contact page is loaded, either through the "Contacts" management screen, or by loading a person: URL, Firefox invokes a set of discovery modules on it. Each of these discovery modules inspects the person and tries to decide whether it can add anything to the person; in many cases, this fires off secondary network requests, which, when they complete, can start the whole ball rolling again. So what you see is little bits of information coming into the browser as more links are found.
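To make the "ball rolling again" behavior concrete, here is a minimal sketch of that loop. It is not the add-on's actual code: the `Person` class, the `discover` function, and the two fake modules are hypothetical names invented for illustration, and the canned lookup tables stand in for the asynchronous network requests the real modules make.

```python
class Person:
    """A contact record: a bag of multi-valued properties (emails, urls, ...)."""

    def __init__(self, **props):
        self.props = {key: list(values) for key, values in props.items()}

    def add(self, prop, value):
        """Store a value; return True only if it was not already known."""
        values = self.props.setdefault(prop, [])
        if value in values:
            return False
        values.append(value)
        return True


def discover(person, modules):
    """Re-run every module until a full pass adds nothing new.

    Returns the number of passes made over the module list.
    """
    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        for module in modules:
            for prop, value in module(person):
                if person.add(prop, value):
                    changed = True
    return rounds


# Hypothetical modules: canned tables replace real network lookups.
def fake_webfinger(person):
    table = {"alice@example.com": "http://example.com/alice"}
    return [("urls", table[e]) for e in person.props.get("emails", []) if e in table]


def fake_hcard(person):
    table = {"http://example.com/alice": ("name", "Alice Example")}
    return [table[u] for u in person.props.get("urls", []) if u in table]


if __name__ == "__main__":
    alice = Person(emails=["alice@example.com"])
    discover(alice, [fake_hcard, fake_webfinger])
    print(alice.props)
```

The HCard stand-in finds nothing on the first pass because no URL is known yet; only after the Webfinger stand-in contributes one does a later pass pick up the name. That is the same cascade you see in the browser, just without the waiting.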

To get specific, here's the list of discovery modules that are currently supported:

Webfinger
Webfinger works by converting an email address to an HTTP URL, loading that URL, and looking for Links to other resources. It can be supported on any website by the simple addition of a metadata file at the root of the website. Contacts runs through the list of links it finds through Webfinger and adds them to the "urls" property of the person.
Google Social Graph
Google has a little-known project called the Social Graph API. It is a specialized index of the web that identifies links between "people pages". Their page does a more thorough job of explaining it than I can, but the basic idea is that it identifies links between profile pages and provides a way to search them. It works on public pages only.
HCard
The HCard importer works by loading any page that Contacts thinks might contain personal data marked up in the HCard microformat. This could be because it was a Link with a matching rel, discovered through Webfinger, or because it is in the "urls" list of the record and is known to be an HCard provider. Right now, I've got digg, twitter, blogger, and linkedin tagged, but that list could certainly grow. The HCard importer is built in to Firefox and should be able to parse all of the metadata defined in the spec, though not all of it is visualized on the contact page yet.
Proprietary interfaces: Gravatar, Flickr, Yelp, and Amazon
With these importers, I wanted to demonstrate how a proprietary interface could work. In most cases, they simply traverse the list of email addresses and ask the site whether there is an account with that address. In the case of Yelp, the search only works if you are logged in, because Yelp requires an authenticated session to perform email-based discovery. That is an important point: In every case, the agent performing the discovery is you, not some search engine or social networking site. The browser is the ultimate point of trust for any user-to-user display of data, and by connecting directly from the browser, we can avoid a bunch of privacy-eroding disclosures.
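As an illustration of the Webfinger step described in the list above, here is a rough sketch of the email-to-URL conversion and the link harvesting. Several details are my assumptions rather than anything stated in this post: the metadata file is taken to live at `/.well-known/host-meta` (the location later standardized in RFC 6415), the lookup URL is taken to come from a Link with rel `lrdd` and a `{uri}` template, and the function names and sample documents are made up for the example. Network fetching is left out, so the functions work on XML text you have already downloaded.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Namespace used by XRD documents.
XRD_NS = "{http://docs.oasis-open.org/ns/xri/xrd-1.0}"


def host_meta_url(email):
    """Assumed location of the site-wide metadata file for an address."""
    domain = email.rsplit("@", 1)[1]
    return "http://" + domain + "/.well-known/host-meta"


def lookup_url(host_meta_xml, email):
    """Fill the site's lookup template with the account URI, or return None."""
    root = ET.fromstring(host_meta_xml)
    for link in root.iter(XRD_NS + "Link"):
        template = link.get("template")
        if link.get("rel") == "lrdd" and template:
            return template.replace("{uri}", quote("acct:" + email, safe=""))
    return None


def profile_links(xrd_xml):
    """Return (rel, href) pairs for every Link in a user's XRD document."""
    root = ET.fromstring(xrd_xml)
    return [(link.get("rel"), link.get("href"))
            for link in root.iter(XRD_NS + "Link") if link.get("href")]


# Made-up sample documents for a site at example.com.
HOST_META = """
<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Link rel="lrdd" template="http://example.com/webfinger?q={uri}"/>
</XRD>
"""

USER_XRD = """
<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Subject>acct:alice@example.com</Subject>
  <Link rel="http://webfinger.net/rel/profile-page" href="http://example.com/alice"/>
  <Link rel="http://microformats.org/profile/hcard" href="http://example.com/alice/hcard"/>
</XRD>
"""

if __name__ == "__main__":
    print(host_meta_url("alice@example.com"))
    print(lookup_url(HOST_META, "alice@example.com"))
    for rel, href in profile_links(USER_XRD):
        print(rel, href)
```

Every href returned by `profile_links` is a candidate for the "urls" property of the person, which is what lets the other modules pick up where Webfinger leaves off.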

Want to hack on it?

If you would like to experiment with automatic discovery of data, you can start right away. Here are a few ideas:

Create an HCard-capable page
Google Profiles, among other services, are HCard-enabled today. If you create an account on an HCard-enabled service and associate it with your record (e.g. by adding a URL field to your contact), the information on that profile will automatically be added to your record.
Webfinger-enable your site
If you operate a website and host an email address at that site, you can webfinger-enable yourself by simply creating a host-meta file that points to your user directory and placing an XRD file there. You can put anything you want into the XRD file. (This is an interesting enough idea that I'll have to write a HOWTO for it - unless somebody else out there has already done it?) You can password-protect the XRD file if you only want to disclose it to your friends.
Create links between your pages
If you have, say, a Facebook page and a public page somewhere else, you can add a link on each of them that points to the other. That will allow people who can view one page to reach the other automatically, so, for example, your Facebook friends can find your Yelp or Netflix reviews.
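If you want to check what an importer might see on a page you control, here is a small sketch of reading an HCard. It is not the parser built in to Firefox: the `HCardParser` class, the `parse_hcard` helper, and the sample page are hypothetical, and the sketch only understands three properties from the microformat (`fn`, `url`, and `email`). It also assumes a single card per page and does not track where the card ends.

```python
from html.parser import HTMLParser


class HCardParser(HTMLParser):
    """Collect fn, url, and email values once a 'vcard' element has been seen."""

    def __init__(self):
        super().__init__()
        self.card = {"fn": "", "urls": [], "emails": []}
        self._in_vcard = False   # set when the card's root element is seen
        self._fn_tag = None      # tag name of the 'fn' element being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if "vcard" in classes:
            self._in_vcard = True
        if not self._in_vcard:
            return
        href = attrs.get("href") or ""
        if "fn" in classes:
            self._fn_tag = tag
        if "url" in classes and href:
            self.card["urls"].append(href)
        if "email" in classes and href.startswith("mailto:"):
            self.card["emails"].append(href[len("mailto:"):])

    def handle_data(self, data):
        # Text is kept only while inside the 'fn' element.
        if self._fn_tag:
            self.card["fn"] += data

    def handle_endtag(self, tag):
        if tag == self._fn_tag:
            self._fn_tag = None


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.card


# A made-up page: one card whose links point at two pages for the same person.
PAGE = """
<p>Welcome to my <a class="url" href="http://example.com/ignored">site</a>.</p>
<div class="vcard">
  <a class="fn url" href="http://example.com/alice">Alice Example</a>
  <a class="email" href="mailto:alice@example.com">email me</a>
  <a class="url" href="http://example.org/~alice">my other page</a>
</div>
"""

if __name__ == "__main__":
    print(parse_hcard(PAGE))
```

The two `url` links in the sample card are the kind of cross-links suggested above: once one page is known, the other can be added to the record and inspected in turn.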

Our goal is to create rich representations of people on the web that protect privacy and enhance our ability to connect and share with each other. If you have other ideas for how we can do that, please share them here, or on our discussion list.
