Contacts in the Browser video
We put together a video where I explain how Contacts works with your browser. Enjoy!
In version 0.3 of Firefox Contacts, I wanted to explore how the web browser could gather more information about a person to help create a personalized contact page.
What the software now does is this: When a contact page is loaded, either through the "Contacts" management screen, or by loading a person: URL, Firefox invokes a set of discovery modules on it. Each of these discovery modules inspects the person and tries to decide whether it can add anything to the person; in many cases, this fires off secondary network requests, which, when they complete, can start the whole ball rolling again. So what you see is little bits of information coming into the browser as more links are found.
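To make that loop concrete, here is a minimal sketch of how repeated discovery passes might work. The discoverer shape (an object with an async discover() that returns new field values or null), the pass limit, and the example discoverers are all assumptions for illustration; this is not the extension's actual code.

```javascript
// Hypothetical sketch: every discoverer looks at the person and resolves to
// { field: [values] } or null. Any new value triggers another pass, so one
// discovery can feed the next.
async function runDiscovery(person, discoverers, maxPasses = 5) {
  for (let pass = 0; pass < maxPasses; pass++) {
    const results = await Promise.all(discoverers.map((d) => d.discover(person)));
    let changed = false;
    for (const found of results) {
      for (const [field, values] of Object.entries(found || {})) {
        const known = (person[field] = person[field] || []);
        for (const value of values) {
          if (!known.includes(value)) {
            known.push(value);
            changed = true;
          }
        }
      }
    }
    if (!changed) break; // nothing new was learned, so stop
  }
  return person;
}

// Two made-up discoverers: the second can only fire after the first succeeds.
const exampleDiscoverers = [
  {
    name: "email-to-profile",
    discover: async (p) =>
      (p.emails || []).includes("john@example.com")
        ? { urls: ["http://example.com/jdoe"] }
        : null,
  },
  {
    name: "profile-to-photo",
    discover: async (p) =>
      (p.urls || []).includes("http://example.com/jdoe")
        ? { photos: ["http://example.com/jdoe.png"] }
        : null,
  },
];
```

Calling runDiscovery({ emails: ["john@example.com"] }, exampleDiscoverers) finds the profile URL on the first pass and the photo on the second, which is the "little bits of information coming in" effect.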
To get specific, here's the list of discovery modules that are currently supported:
http://microformats.org/profile/hcard, discovered through webfinger, or because it is in the "urls" list of the record and is known to be an HCard provider. Right now, I've got digg, twitter, status.net, blogger, and linkedin tagged, but that list could certainly grow. The HCard importer is built into Firefox and should be able to parse all of the metadata defined in the spec, though not all of it is visualized on the contact page yet.
Want to hack on it?
If you would like to experiment with automatic discovery of data, you can start right away. Here are a couple of ideas:
Our goal is to create rich representations of people on the web that protect privacy and enhance our ability to connect and share with each other. If you have other ideas for how we can do that, please share them here, or on our discussion list.
I just finished a first cut at integrating the new Facebook Graph API with Firefox Contacts!
The integration had these steps:
An aside: The OAuth 2.0 flow involves an "application secret", which makes sense if I'm proving that I am some web service. When I am acting as a user-agent, it doesn't make sense at all, and involves putting something called a "secret" into client-side code, which means that it's not a secret in any sense. Unifying the user's authentication context with the user-agent's authentication context (that is, letting the browser do what the user can do) would fix this.
All in all, it took about seven hours of steady hacking to get it all working. That's much faster than I've been able to do anything with the old Facebook Platform API. Congrats to Bret and the team for that!
Released yet?
I'll be releasing 0.3 tomorrow morning, with Facebook and Yahoo! support, as well as people-in-the-awesomebar and the much improved discovery experience.
Until then, sources are available at hg.mozilla.org/labs/people, as always!
Firefox Contacts 0.2 has been released! This is mostly an infrastructure upgrade release; the database schema was changed to support a more dynamic way of interacting with contact services. I also added support for Portable Contacts, Webfinger and HCard import, which bring the "web of contacts" much closer to reality.
We made a big change to the database schema for this release. In 0.1, we combined data from multiple services into a single document and saved that into the database. While that made it simple to read, we had no ability to remove data from just one service, or to see where data had come from.
In 0.2, we save all the data from each service, with a service name label. These documents are flattened into a single record when the object is read, instead of when it is written. That means we can remove and refresh data on a per-service basis.
Old system:
{
documents: {
default: {
displayName: "John Doe",
name: {
given: "John",
family: "Doe"
},
emails: [
{type: "work", value: "john@work.com"},
{type: "home", value: "john@home.com"}
]
}
}
}
New system:
{
documents: {
gmail: {
name: {
given: "J",
family: "Doe"
},
emails: [
{type: "work", value: "john@work.com"},
]
},
native: {
displayName: "John Doe",
emails: [
{type: "home", value: "john@home.com"},
]
}
}
}
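The read-time flattening could look roughly like the sketch below. The function names and the merge rules (plural fields are concatenated, singular fields come from the first service that has them) are my assumptions for illustration, not the extension's actual behavior.

```javascript
// Hypothetical sketch: merge the per-service documents into one flat record
// when the person is read, leaving the stored documents untouched.
function flattenPerson(record) {
  const flat = {};
  for (const doc of Object.values(record.documents)) {
    for (const [field, value] of Object.entries(doc)) {
      if (Array.isArray(value)) {
        // Plural fields (emails, urls, ...) are concatenated, skipping exact repeats.
        flat[field] = flat[field] || [];
        for (const item of value) {
          const key = JSON.stringify(item);
          if (!flat[field].some((seen) => JSON.stringify(seen) === key)) {
            flat[field].push(item);
          }
        }
      } else if (!(field in flat)) {
        // Singular fields: the first service to provide one wins (arbitrary choice).
        flat[field] = value;
      }
    }
  }
  return flat;
}

// Because each service keeps its own document, dropping one is a single delete.
function removeService(record, service) {
  delete record.documents[service];
  return record;
}
```

With the "new system" record above, flattenPerson() yields both email addresses plus the displayName from native and the name from gmail; after removeService(record, "gmail") only the native data remains.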
With this new capability in the database, we were able to add these new features:
Two new importers were finished for this release: LinkedIn and Plaxo.
The LinkedIn importer is a bit complicated since LinkedIn has a CAPTCHA for contact downloads. We spent some time trying to get the CAPTCHA flow into the Contacts user experience but didn't get it working -- so we are taking advantage of the fact that LinkedIn uses a server-side validity timer for the CAPTCHA. So the user has to answer the question, then import the link again. Messy and not very satisfying but it does work.
Plaxo is a technically interesting case because Plaxo is a pure Portable Contacts provider. We implemented a Generic Portable Contacts import module, and the Plaxo module is a trivial subclass of that.
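As a sketch of what "trivial subclass" means here: the class names, the injected fetchJson function, and the placeholder endpoint below are all hypothetical. The only thing taken from the Portable Contacts format is that a response carries its contacts in an entry array.

```javascript
// Hypothetical sketch of a generic Portable Contacts importer.
class PortableContactsImporter {
  constructor(serviceName, endpoint) {
    this.serviceName = serviceName;
    this.endpoint = endpoint;
  }

  // fetchJson is injected so the sketch can run without a network connection.
  async import(fetchJson) {
    const response = await fetchJson(this.endpoint);
    // Each entry is already in PoCo form, so it is stored as-is under the
    // service's label.
    return (response.entry || []).map((entry) => ({
      service: this.serviceName,
      document: entry,
    }));
  }
}

// The service-specific subclass only supplies a label and an endpoint.
class PlaxoImporter extends PortableContactsImporter {
  constructor() {
    // Placeholder URL, not Plaxo's real endpoint.
    super("plaxo", "https://poco.example.invalid/contacts");
  }
}
```

Any other pure PoCo provider would need the same two lines of constructor and nothing else.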
We fixed some international character issues in the Mac Native address book importer by going to UTF-16 everywhere.
The Discoverer system was completed for this release. A discoverer has a simple API that takes a person and returns a record of new data about that person. We wrote a number of discoverers to demo the system:
http://microformats.org/profile/hcard and loads those.
Relatively simple changes to the contacts view in this release. The most interesting new bit is the "Where did this data come from?" view, which just exposes the data structure we create in the database schema change. Also, lots of fiddly CSS changes to get the vertical heights right everywhere.
Other than changes to support the new database, the only change in the auto-completion module was support for more than one email address per contact, which was a much-requested feature.
I'm happy to announce the launch of the Contacts in the Browser project! You can read about it at the Mozilla Labs blog. I've been working on the extension for the last couple months and am glad to get it out in public where people can play with it.
I'll use this space to talk a bit more about the technical underpinnings of the extension.
In essence, Contacts is a local database with some specialized logic to handle duplicate detection, an API for importing records, a browser overlay for form autocompletion, and a security-limited API to query records from web content. I'll take each piece in turn.
The database
The internal database is a JavaScript wrapper on the Firefox Storage Service. The Firefox service, in turn, wraps a SQLite embedded database library. We program the database using standard SQL, so we have your standard model of tables, indexes, and queries in there.
Our data model is effectively schema-less. Each person is represented as a GUID and a JSON blob. The people table is declared like this:
CREATE TABLE people (id INTEGER PRIMARY KEY, guid TEXT UNIQUE NOT NULL, json TEXT NOT NULL);
We then have a data-driven scheme to create index tables from the JSON representation, so we have some additional tables and indexes that look like this:
CREATE TABLE displayName (id INTEGER PRIMARY KEY, person_id INTEGER NOT NULL, val TEXT NOT NULL COLLATE NOCASE);
CREATE INDEX displayName_person_id ON displayName (person_id);
CREATE INDEX displayName_val ON displayName (val);
It's the responsibility of the application logic to update the index tables, but we have some helper methods for that. Right now we're indexing on displayName, givenName, familyName, and emails. Callers (who must have chrome-level privileges; i.e. be an extension, not web content) can insert records with the add() method, or update records with update().
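Here is a rough idea of what "data-driven" index maintenance can look like. The field map and function name are hypothetical, and the name.given / name.family paths follow the example records shown in this blog; the real helper methods may differ.

```javascript
// Hypothetical sketch: one extractor per index table. Each returns the list
// of values that should get a row in that table for a given person.
const INDEXED_FIELDS = {
  displayName: (p) => [p.displayName],
  givenName: (p) => [p.name && p.name.given],
  familyName: (p) => [p.name && p.name.family],
  emails: (p) => (p.emails || []).map((e) => e.value),
};

// Build the rows to insert after a person record is added or updated.
function indexRowsFor(personId, person) {
  const rows = [];
  for (const [table, extract] of Object.entries(INDEXED_FIELDS)) {
    for (const val of extract(person)) {
      if (val) rows.push({ table, person_id: personId, val });
    }
  }
  return rows;
}
```

Adding a new index is then a matter of adding one entry to the map (plus the matching CREATE TABLE), rather than writing new update logic.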
Working with SQLite is sometimes great and sometimes a bit of extra work. We learned early on that we had to be careful with our transactions, because committing a transaction on a laptop hard drive can take 10 milliseconds or more. In an early version, we were opening a transaction for each new person record, so importing an address book with 1000 contacts was taking upwards of 20 seconds. We eventually combined the import into a single transaction, which cut the runtime by about 100x.
The most interesting feature of a contact database isn't bulk import, though -- it's de-duplication. It is very common for users to have many repeated copies of contact data scattered all over their computers and the web. We would like Contacts to ultimately help with this problem, which means we need to be able to merge and combine data from multiple sources.
Our current implementation has a pretty trivial union algorithm. It simply compares the email addresses and full displayName of each person against the whole list, and merges the records if it finds a match. This has several problems:
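For reference, the union rule described above might be sketched like this. I'm assuming here that a match on either a shared email address or an identical displayName triggers a merge, and that email comparison ignores case; the function names are made up for the example.

```javascript
// Hypothetical sketch of the naive union rule.
function sameContact(a, b) {
  const emailsA = new Set((a.emails || []).map((e) => e.value.toLowerCase()));
  const sharesEmail = (b.emails || []).some((e) =>
    emailsA.has(e.value.toLowerCase())
  );
  const sameName = Boolean(a.displayName) && a.displayName === b.displayName;
  return sharesEmail || sameName;
}

// Compare each incoming person against everything merged so far.
function mergeContacts(people) {
  const merged = [];
  for (const person of people) {
    const match = merged.find((existing) => sameContact(existing, person));
    if (!match) {
      // Copy so the caller's objects are not modified.
      merged.push({ ...person, emails: [...(person.emails || [])] });
      continue;
    }
    for (const email of person.emails || []) {
      const seen = match.emails.some(
        (e) => e.value.toLowerCase() === email.value.toLowerCase()
      );
      if (!seen) match.emails.push(email);
    }
    if (!match.displayName) match.displayName = person.displayName;
  }
  return merged;
}
```

Note that every new person is compared against the whole merged list, so the cost grows quadratically with the size of the address book.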
About the JSON blob
We needed to pick a representation format for the user data that we put into our JSON blob. We settled on Portable Contacts. We have a bunch of thoughts about how to construct a representation system that allows multiple schemas to co-exist, but we needed to pick one to start with, and PoCo (as it is known by its fans) hits all the right points of openness, adoption, and extensibility.
That said, our current use of PoCo is probably not quite right. We have a notion of multiple documents inside the JSON blob, but we don't have a clean mapping for which service provided what. The next release of Contacts will contain a revision to the schema to handle service attribution better. (Current work on the Mozilla wiki, here)
The Importer system
The second major piece of Contacts is the Importer system. A generic ImporterBackend object is provided as the parent class for implementations of an Importer, which is registered with the PeopleImporterSvc service.
The contract of Importer.import() is pretty simple: it takes a progress function and a completion callback, and does whatever is necessary to get some contacts into the database. Callers are encouraged to call People.add() only once, since it runs much faster that way, and should provide feedback to the user through the callback functions. In our implementation, the messages passed through those functions are rendered into the Contact Manager user interface.
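A minimal sketch of that contract follows. The argument order, the percentage-based progress value, the error-first completion callback, and the in-memory People stand-in are all assumptions made so the example can run on its own; only the names ImporterBackend, import(), and People.add() come from the extension.

```javascript
// Stand-in for the People service, just enough to count add() calls.
const People = {
  records: [],
  addCalls: 0,
  add(batch) {
    this.addCalls++;
    this.records.push(...batch);
  },
};

class ImporterBackend {
  import(progress, done) {
    throw new Error("subclasses must implement import()");
  }
}

// Hypothetical importer that "imports" from a fixed in-memory list.
class StaticListImporter extends ImporterBackend {
  constructor(contacts) {
    super();
    this.contacts = contacts;
  }

  import(progress, done) {
    try {
      const batch = [];
      this.contacts.forEach((contact, i) => {
        batch.push(contact);
        progress(Math.round(((i + 1) / this.contacts.length) * 100));
      });
      People.add(batch); // one call for the whole import, not one per contact
      done(null, batch.length);
    } catch (err) {
      done(err);
    }
  }
}
```

The importer gathers everything first and hands the batch to People.add() in one go, while the progress function keeps the user interface informed along the way.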
We did a couple of importer implementations to get a feel for how they would work.
We also did a Gravatar importer. Since we wrote this one, I've come to think that this is actually our first instance of a new object, which I'm calling a Discoverer. This is an object that, given a person or a piece of personal data, retrieves some other chunk of personal data. In the case of Gravatar, we examine the list of email addresses to determine if any of them are associated with a Gravatar, and, if so, we return the image URL of the Gravatar to the People service.
Discoverers will perform their discover() method on a person and return one or more records containing new data about that person, which can then be merged into the person record. We can do this automatically on all contacts, or interactively when we view a single contact.
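Here is a sketch of what a Gravatar discoverer could look like under that model. Gravatar image URLs are built from the MD5 hash of the trimmed, lowercased email address, and the d=404 parameter asks Gravatar to answer 404 when no image exists. Everything else (the object shape, the injected urlExists check, the returned photos record, and the use of Node's crypto module instead of browser internals) is an assumption for illustration.

```javascript
const nodeCrypto = require("crypto");

// Build the Gravatar image URL for one email address.
function gravatarUrl(email) {
  const hash = nodeCrypto
    .createHash("md5")
    .update(email.trim().toLowerCase())
    .digest("hex");
  return "https://www.gravatar.com/avatar/" + hash + "?d=404";
}

// Hypothetical discoverer: urlExists(url) is injected and should resolve to
// true when the URL answers successfully.
const GravatarDiscoverer = {
  async discover(person, urlExists) {
    const photos = [];
    for (const email of person.emails || []) {
      const url = gravatarUrl(email.value);
      if (await urlExists(url)) {
        photos.push({ type: "gravatar", value: url });
      }
    }
    // Return a record of new data, or null when nothing was found.
    return photos.length ? { photos } : null;
  },
};
```

The discoverer never touches the database itself; it only returns a record, and merging that record into the person is left to the caller.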
There are a lot of interesting possibilities for discovery. We can do service-specific discovery, like Gravatar and Flickr, or generic discovery protocols like WebFinger and the Google Social Graph API. Because the discovery object can run in the user's web context, it can be used for search into restricted social networks, such as to discover a Facebook page for a contact. Coming soon.
The form autocompletion overlay
The extension includes the PeopleAutoCompleteSearch object, which implements the Firefox Autocomplete interface to provide form autocompletion.
One limitation of that interface is that we only get access to the form name. We don't get the type or the rel attribute, so we can't detect email fields in every case. But, since those fields are frequently named "email", "e-mail", "recipient", or "recipients", we look for those names, and pop up the autocompletion if we find a match.
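The name check amounts to something like the sketch below. The function names, the case-insensitive comparison, the prefix matching, and the "Name <address>" suggestion format are assumptions; the list of field names is the one given above.

```javascript
const EMAIL_FIELD_NAMES = new Set(["email", "e-mail", "recipient", "recipients"]);

// Guess whether a form field wants an email address, using only its name.
function looksLikeEmailField(fieldName) {
  return EMAIL_FIELD_NAMES.has(String(fieldName || "").trim().toLowerCase());
}

// Hypothetical suggestion step: one suggestion per matching email address,
// so contacts with several addresses produce several entries.
function suggestEmails(fieldName, typed, people) {
  if (!looksLikeEmailField(fieldName)) return [];
  const needle = typed.toLowerCase();
  const suggestions = [];
  for (const person of people) {
    const name = person.displayName || "";
    for (const email of person.emails || []) {
      if (
        name.toLowerCase().startsWith(needle) ||
        email.value.toLowerCase().startsWith(needle)
      ) {
        suggestions.push(name + " <" + email.value + ">");
      }
    }
  }
  return suggestions;
}
```

A field named "username" gets no suggestions at all, even if it actually expects an email address, which is exactly the limitation described above.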
The Content API
The last piece of the extension is a JavaScript API. We dynamically extend the navigator JavaScript object by using a XUL injector technique, which is an advanced bit of Firefox hackery. What we do is watch for a state that indicates that the page is loaded, and inject a new function into the page before the rest of the page runs. The function is created dynamically and returned as a closure, so we can restrict the scope of access to internal data.
What this means is that when the page calls navigator.people.find(), we can check with the Permission Manager to see whether the user has granted privileges to the Contacts system, and then check the internal Contacts database to see if field-level permissions have been saved. If the permissions are all there, we run the query and return. If they're not, we can pop up a XUL-based modal dialog that puts the user through a permission flow to grant (or deny) access to the contact database.
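The closure idea and the permission flow can be sketched as follows. This is a simplified stand-in, not the extension's code: the find() signature, the in-memory store, the per-field grant set, and the synchronous askUser prompt (playing the role of the modal dialog) are all assumptions.

```javascript
// Hypothetical sketch: the page only ever receives the returned object. The
// store, the grants, and the prompt stay hidden inside the closure.
function makePeopleApi(origin, store, askUser) {
  const grantedFields = new Set(); // fields this origin may read

  return Object.freeze({
    find(fields, onSuccess, onDenied) {
      const missing = fields.filter((f) => !grantedFields.has(f));
      if (missing.length > 0) {
        // Stand-in for the modal permission dialog.
        if (!askUser(origin, missing)) {
          onDenied("permission denied");
          return;
        }
        missing.forEach((f) => grantedFields.add(f));
      }
      // Hand back only the fields that were asked for and granted.
      onSuccess(
        store.map((person) => {
          const view = {};
          for (const f of fields) {
            if (f in person) view[f] = person[f];
          }
          return view;
        })
      );
    },
  });
}
```

A page that was granted displayName gets names back without ever seeing email addresses, and a second call for the same fields does not prompt again.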
What's Next?
I think there is a huge range of exciting applications that are enabled by getting people into the browser.
Address book functionality is only the beginning, though it is an exciting step. I imagine email, phone, and physical address auto-completion and hotlinking everywhere. I want websites to stop asking for my credentials, or even an OAuth ticket, at other sites, simply to get access to my friend list.
There are lots of basic address book capabilities that need doing: groups, multiple values for image, disambiguation based on common nicknames, per-service refresh (with timestamps to keep track of how stale data is), bulk edits.
In a future release, I will be adding support for hashed email access. In nearly every social networking setup task, there is no need for me to disclose my peers' real email addresses -- instead, I just need to disclose a stable, unique token for them. If a site could simply retrieve the hash of every email address in my friends group, they could discover all the people that have an account on the website, without actually knowing their names or addresses.
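To illustrate the idea, here is a small sketch of both sides of that exchange. The choice of SHA-256, the normalization (trim and lowercase), and all function names are assumptions; the feature does not exist yet, so this only shows the general shape.

```javascript
const nodeCrypto = require("crypto");

// A stable token for an email address (hash choice is an assumption).
function emailToken(email) {
  return nodeCrypto
    .createHash("sha256")
    .update(email.trim().toLowerCase())
    .digest("hex");
}

// Browser side: disclose tokens only, never names or addresses.
function hashedFriendList(people) {
  const tokens = new Set();
  for (const person of people) {
    for (const email of person.emails || []) {
      tokens.add(emailToken(email.value));
    }
  }
  return [...tokens];
}

// Site side: hash the addresses of existing accounts and intersect.
function matchAccounts(tokens, accountEmails) {
  const wanted = new Set(tokens);
  return accountEmails.filter((email) => wanted.has(emailToken(email)));
}
```

The site learns which of its existing members are in my friends group, but for everyone else it sees nothing more than an opaque token.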
The discovery system has some exciting potential applications: by creating a user interface for web-based people discovery systems, we can help decrease the isolation of personal data into islands scattered across the web.
Much work remains to do! Thanks to everybody that's provided positive feedback on the initial release.