Disabling (or enabling) geotagging of photos in iPhone camera

Some people may not be aware that photos taken with the iPhone camera have the GPS coordinates of your location embedded in the EXIF metadata of the file. This means that when you post the picture online or send it to someone, you’re telling them exactly where you were and when. If you’re OK with that, then there’s no problem. But if you want to disable it, it’s pretty simple.

The procedure is basically the same as the one I described in this post about enabling Facebook Places. The difference is that once you’re in Location Services you’ll want to make sure “Camera” is disabled. Photos taken from that point on won’t have GPS coordinates embedded in their metadata. If you want to turn geotagging back on, just set Camera back to “On” in Location Services.
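
For a sense of what is actually stored: EXIF keeps latitude and longitude as degrees, minutes and seconds, plus a hemisphere letter (N/S/E/W). Here is a minimal Python sketch of turning those into the signed decimal coordinates a map would use. The function name and the sample values are made up for illustration; they are not read from a real photo.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative in decimal notation.
    return -value if ref in ("S", "W") else value


# Hypothetical coordinates, not taken from any actual image.
lat = dms_to_decimal(40, 45, 0, "N")
lon = dms_to_decimal(73, 59, 0, "W")
print(round(lat, 4), round(lon, 4))
```

Anyone who can read the file's metadata can do the same conversion and drop the result onto a map.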

Using SSH tunnel & Squid to create a private encrypted proxy for true private browsing (mostly)

I once worked at this place where I got a stern talking-to for viewing non-work-related pages. It was around Christmas and I was doing my shopping online (since I left the house at 7 AM and got home at 8 PM). It’s not like I was farting around all the time. Anyway, the idea that I was being proactively watched by someone with an axe to grind pissed me off, so I decided I wouldn’t give him anything to read.

I don’t have that problem anymore, but I do frequently connect to open wifi points where my traffic can be viewed. I use SSL for things like email, but why even let them see that I’ve gone to nytimes.com?

My solution to both problems was the same: on my Linux box at home, run a proxy server, and pipe all my traffic to it via an SSH tunnel.

Step 1: Install Squid

Since I use CentOS, I just ran:

yum install squid

Step 2: Configure Squid

Well, the default squid config (/etc/squid/squid.conf) was pretty much fine, although I needed to add an ACL clause so I could actually use the proxy. The LAN in my house is 192.168.1.0/24, so I put these lines in my squid.conf:

acl subnet_192 src 192.168.1.0/255.255.255.0
http_access allow subnet_192

Then start Squid.
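
To see which clients that acl line covers, here is a small Python sketch using the standard ipaddress module. It only mirrors the address/netmask match; it is not how Squid itself evaluates rules, and the two test addresses are made up.

```python
import ipaddress

# Same network as the squid.conf acl line, written in address/netmask form.
lan = ipaddress.ip_network("192.168.1.0/255.255.255.0")


def allowed(client_ip):
    """Return True if client_ip falls inside the LAN the acl describes."""
    return ipaddress.ip_address(client_ip) in lan


print(lan)                      # 192.168.1.0/24
print(allowed("192.168.1.42"))  # True
print(allowed("10.0.0.5"))      # False
```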

Step 3: Create the SSH tunnel

I run Linux, so that’s the syntax I can provide (you can use PuTTY to do this from a Windows machine):
ssh -f evan@public-hostname-of-proxy-server -L 3128:private-ip-of-proxy-server:3128 -N
This opens an SSH connection from your local machine (port 3128) to the remote server’s private IP on port 3128 (3128 being the default port on which squid listens). So connections to localhost:3128 will be forwarded over the SSH tunnel to port 3128 on the other machine’s private IP.
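
If you want to script this, the pieces of that command can be assembled like so. This is just a sketch: the function name is mine, and the hostname and IP in the example call are placeholders, not real machines. It builds the argument list without running anything.

```python
import shlex


def tunnel_command(user, public_host, private_ip, port=3128):
    """Build the argv for an ssh local port forward like the one above."""
    return [
        "ssh",
        "-f",                                 # go to the background after authenticating
        f"{user}@{public_host}",
        "-L", f"{port}:{private_ip}:{port}",  # local port -> private_ip:port via the server
        "-N",                                 # no remote command, forwarding only
    ]


# Placeholder host and address, for illustration only.
argv = tunnel_command("evan", "proxy.example.com", "192.168.1.10")
print(shlex.join(argv))
```

Passing argv to subprocess.run(argv, check=True) would actually open the tunnel.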

Step 4: Set your browser to point to localhost:3128 as proxy server

Well, that’s pretty self-explanatory. In the browser’s options (lots of other apps support HTTP proxies as well – AIM, etc), find the section about proxy settings and set the HTTP and HTTPS proxies to “localhost” and port 3128.

That’s it. To test if it’s working, try going to geoiptool.com and confirm that it shows you as coming from the home machine’s IP.
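
The same check can be scripted. Here is a rough Python sketch that routes requests through localhost:3128 the way the browser would; the function names are my own, and which "what's my IP" page you fetch with it is up to you.

```python
import urllib.request


def proxied_opener(host="localhost", port=3128):
    """Return a urllib opener that sends HTTP and HTTPS requests via the proxy."""
    proxy = f"http://{host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, timeout=10):
    """Fetch url through the tunnel; fails with URLError if the tunnel is down."""
    with proxied_opener().open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


opener = proxied_opener()
```

Call fetch_via_proxy() on any page that echoes your IP address; if the tunnel and Squid are both up, it should report the home machine's address rather than the one you're sitting behind.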

If you have a strict network admin who’s locked down outbound SSH, you can just have sshd listen on port 80 or 443, which almost everyone allows. A really nosy admin may notice encrypted traffic going to the server and kill it, but… well, I never said it was foolproof. 🙂

Facebook for iPhone – "Places" hangs on "Locating you…"

I decided to see how “Places” stacked up against Foursquare. I reactivated my Facebook account and reinstalled the iPhone app, went to “Places” and tapped “Check In,” and… nothing. It mentioned something about turning on Location Services. I knew I already had that enabled because other apps were using it without a problem. Turns out you need to enable Location Services for the Facebook app explicitly:

First, go to the Settings app and select General:

Settings -> General

Select Location Services:

Location Services

Make sure Facebook is enabled (On) if you want to use Places. If you want to disable Places, make sure this is set to Off.

A less insidious way to use Facebook?

I deactivated my Facebook account a couple of months ago. I just kind of got tired of seeing silly updates from friends and “friends” – people I’d friended but wasn’t really friends with. I was also frustrated by the privacy implications of using such a service: you tell it about yourself, you tell it about who you know and how you know them, you keep adding more information about you and your friends to its huge brain that it’s free to use or abuse however it wants.

I don’t know if I’m anti-“Social” or just antisocial but most of the info streaming into my Facebook feed was just not interesting to me. I could have hidden those people, but then it seemed like it would make more sense simply to remove the connection to them, if I didn’t want to see their updates. I actually went through my list of connections and started removing people – people I knew from high school and hadn’t spoken to since then until they added me on Facebook, and then continued not talking to them, and other people who I knew but didn’t really interact with, online or offline. I didn’t really care about what they had to say and it occurred to me that they didn’t care what I had to say. Why did we even friend each other in the first place? Well, the friend suggester (suggestor?) makes it easy to friend people who are only tangentially related, since its whole purpose is to find new people for you to add.

I remember there was one person from school whom I hadn’t spoken to since probably 4th grade. This person attempted to friend me 5 times on FB (Soandso wants to be your friend…) and each time I clicked “Ignore,” but on the 6th time I finally relented. After 2 weeks of inane updates I unfriended the person. Within a month I was getting requests to refriend. Why? I don’t know you, you don’t know me, what’s to be gained by us pretending to be e-friends?

So I had some fundamental problems with Facebook. In addition to the friending of barely-friends, the feeding of so much information into the Facebook brain was starting to bother me. This is pretty similar to my worries about Google’s reach; basically every bit of information you post to Facebook to share with “friends” is also being added to Facebook’s marketing profile about you and your friends. The more you use the service, the more they know about you. And all those “Like” buttons all over the internet – a way for you to inform your Facebook friends that you like a blog post or news story – those are just a way for Facebook to know what sites you’re visiting. Whether you click the “Like” button or not, your browser is loading the button off their servers, which means Facebook is reading your cookie and knows that YOU visited the page. This annoyed me so much that I edited my /etc/hosts file to point www.facebook.com at 127.0.0.1 (my own computer), where I’m running Apache, so the Like buttons just render as 404 errors now.

But I’m fine with that. I’ve also set my browser to reject all cookies from *.facebook.com. I realize this is just a drop in the ocean of data for Facebook, but screw them. Even with my account disabled they were collecting data about me, and that just pissed me off. But much like Google, Facebook’s tracking ability transcends browsers and computers, since in order to use their service you need to log in, and thus your movements around the internet can be tracked regardless of which computer or device you’re using.
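
For anyone wanting to try the hosts-file trick, this is roughly what the entry looks like and how you might check for it. It's a Python sketch with made-up helper names; it works on a string rather than touching the real /etc/hosts.

```python
def hosts_lines(hostnames, target="127.0.0.1"):
    """Format /etc/hosts lines that point each hostname at target."""
    return [f"{target}\t{name}" for name in hostnames]


def is_redirected(hosts_text, hostname, target="127.0.0.1"):
    """Return True if hosts_text maps hostname to target."""
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if fields and fields[0] == target and hostname in fields[1:]:
            return True
    return False


text = "\n".join(hosts_lines(["www.facebook.com"]))
print(text)
print(is_redirected(text, "www.facebook.com"))
```

Hosts entries match exact names only, with no wildcards, so facebook.com and any other subdomains would each need their own line.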

Facebook wasn’t a completely worthless service for me. I found the photo album feature very useful. It was a great way to upload pictures and share them instantly with whomever wanted to see them. In my case this was usually my family plus a few friends. I doubt anything will top Facebook for this because these people are already on Facebook, and for something to come along that’s better at this than Facebook, these people would need to move to the new platform, which as of today doesn’t seem likely.

Photo sharing is the one thing I miss. I haven’t stopped taking pictures but it’s a much clumsier process now to share them with people. I put them in an album in Picasa, upload it to PicasaWeb, set the permissions on the album, send out the invitations. The recipients then have to click on a private link to get to the pictures, and if they want to see them again in the future, they need to dig through their inbox to find the link and click on it again. Not everybody uses Gmail, and even for those who do, this is just a clunky process. With Facebook albums, if the album is shared with someone, all they have to do is click on me and then click on my list of albums to see the pictures. Easy. I’m considering returning to Facebook just to get the photo album back.

So I was thinking that if I could restrict myself to using only the Facebook iPhone app, I’d still be able to take the occasional picture with the phone, upload it for people to see, and not fall prey to the tracking cookie problems I described above, since (I’m assuming) the Facebook app and Safari don’t share data. At least, not yet.

That idea prompted me to write this post in the first place, but as I’ve been writing, it has occurred to me that it’s not really a workable plan. If I’m using it I’ll eventually feel the need to log in via a browser, meaning I’ll have to tear down all the walls I’ve erected – the hosts file entry, the cookie blocking – and I’ll be right back where I was, feeding them all my info and letting them track me everywhere I go. So I guess it’s going to come down to a question of whether or not the costs outweigh the benefits, as it always does.

Unless I can just write a browser plugin to strip the “Like” button from non-Facebook websites. Maybe AdBlock can do this. Hmm… The dog woke me up early today and everyone else is asleep still, and this all sounded a lot better in my head before I started writing it down.

More thoughts on Google's tracking abilities

It all comes down to the cookie.

The Wall Street Journal recently began a series of articles called What They Know, detailing the different pieces of data that online marketing companies have about people as they traverse the web. None of this is really new, especially not to me, since I work in that industry. But I was surprised at some of the data that was present in the cookies right in plaintext:

Now, I don’t know whether the cookie in the above image was shown that way because the reporters didn’t realize that all it took to “decode” it was a few runs through PHP’s urldecode(), which would convert those %25255Es from their percent-encoded hex codes to plain old ASCII: %25255E0 -> %255E0 -> %5E0 -> ^0 (caret). Maybe they didn’t know, or maybe they knew but they left it all computery so it looked “scarier” to readers… that green text on black background is usually reserved for movies like The Matrix. Anyway, like I said, what was surprising to me wasn’t that there was that much data being collected, but rather that the data was right there in the body of the cookie, readable by anyone. Even a simple base64_encode would have hidden the contents of the cookie from the casual snooper.
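
The decoding chain above is easy to reproduce. Here it is sketched in Python rather than PHP, with urllib.parse.unquote standing in for urldecode(); the helper name is mine, and the input is just the fragment quoted above, not the full cookie.

```python
import base64
from urllib.parse import unquote


def decode_fully(value, max_rounds=10):
    """Percent-decode repeatedly until the value stops changing."""
    steps = [value]
    for _ in range(max_rounds):
        decoded = unquote(steps[-1])
        if decoded == steps[-1]:
            break
        steps.append(decoded)
    return steps


steps = decode_fully("%25255E0")
print(" -> ".join(steps))  # %25255E0 -> %255E0 -> %5E0 -> ^0

# Base64 is not encryption, but it does hide the text from a casual glance.
print(base64.b64encode(steps[-1].encode("ascii")).decode("ascii"))
```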

For a while I’ve been thinking about Google’s vast troves of data that go far, far beyond what the average marketer knows about the average web user. Let’s assume you’re… me. You use Gmail, Google, and YouTube on a pretty frequent basis. Google has single sign-on — as it should — so to use any of these services you can (and in many cases, have to) be logged in with your Google Account. This is logical and convenient for the user, but it unlocks huge amounts of information about you to Google. By having you sign in to any of their services, Google’s ability to track you online transcends cookies.

Cookies are small bits of data set by the server on your browser to allow information to persist between sessions. Since they’re stored in the browser, it’s implicitly impossible for cookies set in one browser to be used in another browser. This means that if you start Firefox and click around the internet for a while, you’ll accumulate some cookies. If you then exit Firefox and start Safari, and click around to those same sites, you’ll get completely different cookies than those you got in Firefox – from a “tracking” perspective, the person using Firefox and the person using Safari are different people (even though they both happen to be you) [1]. Also, because cookies are tied to browsers, cookies set on one computer are bound to that browser on that computer – i.e., cookies in Firefox on computer A have no bearing on what happens in Firefox (or any other browser) on computer B.

Single sign-on knocks down these implicit privacy walls. Assume, again, that you’re me, and you have a Linux laptop at work. At home you have a Linux desktop, a Mac mini hooked up to the TV in the living room, and a Windows laptop. You also have an iPhone. Single sign-on enables Google to track what you’re doing across all of these devices. It’s really quite simple: on each machine you use, if you want to read your email (Gmail) you log in with your Google Account. At that point, Google knows that it’s you using the browser. The value inside the cookie they set in your particular browser may differ, but they know that you’re you. They know what you’re searching for in Google; where you go (by IP address or, if you allow it, by GPS on most modern smartphones – Google’s Latitude service lets you relay your GPS coordinates to your friends); what kind of email you receive; and who you correspond with. And let’s not forget that Google has plastered the internet with ads – over 90% of their revenue comes from advertising, and they bought DoubleClick a few years ago, so any time you go to a site with Google ads on it (which is pretty much all of them), they know it. They own YouTube, so they know every video you’ve watched on YouTube, which ones you’ve “Liked” and which ones you’ve “Favorited.” And, as I mentioned in my previous crazy-guy post, Google is amassing a huge facial-recognition database, so they’ll know everything about you – interests, income, travel habits, friends, what you look like, likes & dislikes. They can probably give a pretty good guess as to where your home is and where your office is just by seeing that between 9:00 AM and 6:00 PM you commonly access the internet from IP 1.2.3.4 and the rest of the time you usually come from IP 2.3.4.5, and simple IP-geo databases can tell them where those IPs are (admittedly, with widely varying accuracy).
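
Here’s a toy model of that in Python. It shows how separate per-browser cookies collapse into one history once each browser logs in to the same account. The class, the cookie IDs and the account name are all invented for illustration; this is not a description of how Google’s systems are actually built.

```python
from collections import defaultdict


class Tracker:
    """Toy model: page views are keyed by cookie until a login links them to an account."""

    def __init__(self):
        self.views_by_cookie = defaultdict(list)
        self.account_for_cookie = {}

    def page_view(self, cookie_id, url):
        self.views_by_cookie[cookie_id].append(url)

    def login(self, cookie_id, account):
        # From here on, everything seen with this cookie belongs to the account.
        self.account_for_cookie[cookie_id] = account

    def history(self, account):
        """All URLs seen from any browser that ever logged in as account."""
        urls = []
        for cookie_id, views in self.views_by_cookie.items():
            if self.account_for_cookie.get(cookie_id) == account:
                urls.extend(views)
        return urls


tracker = Tracker()
tracker.page_view("firefox-work", "nytimes.com")
tracker.page_view("safari-home", "youtube.com")
before = tracker.history("me@example.com")  # nothing linked yet

tracker.login("firefox-work", "me@example.com")
tracker.login("safari-home", "me@example.com")
after = tracker.history("me@example.com")   # both browsers now map to one person
print(before, after)
```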

The trove of information they have on the average person is actually frightening. The only thing keeping them from completely exploiting this data (assuming they aren’t, for argument’s sake) is their “Don’t Be Evil” philosophy and the shitstorm of bad press (and, one would assume, legal action) that would ensue if they were to do so. I’m not really convinced they aren’t already using all of this data, probably to make ultra-targeted advertising decisions, which seems relatively benign on its face. But the real risk comes when this all falls into someone else’s hands. Google could get hax0red – it’s already happened. Google could get subpoenaed – I’m sure it’s happened hundreds of times already. A new batch of idiots in the Senate could just redefine terrorism and require all Google’s data be handed over daily.

This isn’t strictly a problem with Google, but there aren’t many companies I can think of that have massive ad platforms that also provide services you’re willing to log in to, and the logging in is what allows them to track you across browsers, across computers, across devices, and ultimately in real life.

Oh well. Whatever. I’m a big hypocrite because I can’t imagine not using Gmail or any of Google’s services that I use daily. Sucks to be me, I guess. Even if you “trust” Google, you may not trust what Google becomes 10 years from now, but by then they already know all about you.

[1] This isn’t completely accurate, because even without cookies there are pieces of data that will be the same regardless of your browser, for example your IP address, which in general is a pretty good proxy for uniqueness, but I’m just thinking about cookies for now.

The sinister side of Google's Picasa face tagging

So, let me start by saying that I love Picasa, Google’s photo organization tool. It automatically finds new photos as you add them to your hard drive. It lets you crop pictures, remove red-eye, adjust colors and make a few other basic edits that cover probably 95% of what most people need to do when editing photos. It lets you select a few photos from your library and email them to anyone with just a couple of clicks. It also integrates with Google Earth and Google Maps to show you on a map where a particular photo was taken (for those unaware, GPS-enabled cameras, including many mobile phone cameras [e.g., iPhone], embed your GPS coordinates within the EXIF metadata of the photo, so any person, program or website with access to the image will know the location at which it was taken).

It also has a nifty feature called face tagging. How this works, basically, is Picasa analyzes all of the photos in your library and looks for faces. There’s some algorithm in the program that can recognize that two eyes, a nose, a mouth and maybe some hair is a face. So if you use the face-tagging feature, Picasa shows you a page of faces extracted from your photo library. Initially these photos have no names, but Picasa does some basic grouping of them. For example, it doesn’t know who your Uncle Bob is, but it does know that these 14 photos are all of the same person. The grouping feature isn’t perfect, but it is very helpful when you decide to apply a name to the group of photos – tagging 14 photos at once instead of one at a time is a great time-saver.

This feature only really becomes useful if you start tagging faces with real names – i.e. if you tag the photos of Uncle Bob by telling Picasa “these are photos of Uncle Bob.” If you facetag enough photos, Picasa will start “guessing” the name for a particular face, and tagging it automatically. This feature is also not perfect, but I imagine they’re working on improving it all the time.

So, this all happens on your computer, within Picasa. I’m not so much of a tinfoil hat type as to suggest Google’s doing anything in particular with the data on your computer itself. The “problem” as I see it is that when you tag a photo of Uncle Bob, Picasa pulls Uncle Bob’s contact info out of your Gmail contacts. So essentially, you’re tying a face to an email address. As I said, I don’t think Google’s surreptitiously going to use the info that resides on your computer.

But in addition to Picasa, the photo organization tool you run on your computer, Google offers an online photo album service called Picasa Web Albums. This is similar to other services, Flickr being the largest, that offer a simple way to upload photos and share them with others. All users get 1 GB of free storage, and you can buy more pretty cheaply (as of today you can get 20 GB for $5/year). As you might expect from the names, Picasa and Picasa Web Albums integrate very well. If you create an album within Picasa, all you have to do to upload it to Picasa Web Albums is click the “Sync This Album” button. It will then upload all the photos in the album to Picasa Web Albums.

Here’s where the potentially creepy part starts. Let’s say you have a photo in Picasa that you took on August 4th, 2010, at 10:00 AM, and you’ve tagged 2 faces in it: Aunt Alice (alice@gmail.com) and Uncle Bob (bob@gmail.com). Let’s further say that you took this photo with your iPhone, so the GPS coordinates are embedded in the photo metadata. You upload the photo to Picasa Web Albums. Well, now you’ve just told Google the following:

  • What alice@gmail.com looks like.
  • What bob@gmail.com looks like.
  • Where alice@gmail.com and bob@gmail.com physically were (via GPS coordinates) on 8/4/2010 at 10:00 AM.

There’s lots of other information you’ve probably also told them, but these are the data that are creeping me out lately. If your album has 20 or 30 photos of Alice and Bob that you’ve tagged with their contact info then Google’s got a pretty good idea what they look like – if the Picasa desktop app is able to guess who people in your photos are based on some algorithm inside it, imagine what Google’s billion-dollar datacenters can do.

In all likelihood, you aren’t the only one uploading photos of Alice and Bob. Other people at other events tag photos of Alice and Bob and upload them to Google, further “teaching” this massive computer brain what Alice and Bob look like (since email addresses are basically internet-wide unique IDs, two photos tagged with the same email address can generally be assumed to be the same person). Alice and Bob may never use Picasa, may not even own a camera themselves, and may not even use Google at all. But at this point Google knows what they look like and where they’ve gone – completely apart from their computer-based activities.
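
To make the aggregation concrete, here’s a Python sketch of how uploads from different people could be joined on the tagged email address. Everything in it is hypothetical: the record layout, the coordinates and the second uploader’s photo are invented, and I have no idea how (or whether) Google actually stores this.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaggedPhoto:
    uploader: str
    taken_at: str         # timestamp from the photo metadata
    lat: float            # GPS coordinates from EXIF
    lon: float
    tagged_emails: tuple  # email addresses attached to the face tags


def sightings_by_person(photos):
    """Group (time, lat, lon) sightings under each tagged email address."""
    sightings = defaultdict(list)
    for photo in photos:
        for email in photo.tagged_emails:
            sightings[email].append((photo.taken_at, photo.lat, photo.lon))
    return dict(sightings)


# Made-up uploads from two different people who both know Bob.
photos = [
    TaggedPhoto("you", "2010-08-04T10:00", 40.75, -73.98,
                ("alice@gmail.com", "bob@gmail.com")),
    TaggedPhoto("someone-else", "2010-08-07T19:30", 41.88, -87.63,
                ("bob@gmail.com",)),
]
result = sightings_by_person(photos)
for email, seen in result.items():
    print(email, seen)
```

Bob ends up with two sightings from two different uploaders without ever having uploaded anything himself.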

I think facial recognition is going to become huge for marketers over the next decade or so. Picasa offers users a useful feature that seems like it has this sinister other side to it – basically building an enormous crowdsourced facial recognition database, so they’ll be able to identify millions of people right out of the gate. If New York City ever gives Google access to its street cams, Google will be able to track the activities of millions more people without their knowledge or consent. Combine that with the existing knowledge Google has – if your iPhone checks your Gmail account, they know your general location at any given time anyway, just based on IP address – and they can create a pretty accurate (in advertising terms) picture of you. And with facial recognition, it will actually BE a picture of you.

Much is made of Google’s “Don’t Be Evil” motto (and I couldn’t write this without throwing those 3 words in), and I tend to be somewhat of a Google fanboy myself. However, much like government, what you have to worry about isn’t always what the current regime is doing with its power, but what the regime 10 or 20 or 100 years from now will do with it. I’m sure Google has rules about how these data are used, but rules change; rules are broken. If there’s one rule that seems inviolate throughout human history it’s that power corrupts. Knowledge is power. Or something.

Well, whatever. I still love Picasa, it just gives me this creepy feeling sometimes. This stuff is all completely voluntary, nobody is being forced to use any of these features, but like I said, Uncle Bob and Aunt Alice were tagged in a photo by someone else – you don’t need to do anything to have your face added to the Great Google Face Database In The Sky. This is something I’ve been thinking about for a while, but I was prompted to write it down based on Eric Schmidt’s recent comment, “Show us 14 photos of yourself and we can identify who you are.”

The sinister side of Google’s Picasa face tagging

So, let me start by saying that I love Picasa, Google’s photo organization tool. It automatically finds new photos as you add them to your hard drive. It lets you crop pictures, remove red-eye, adjust colors and make a few other basic edits that cover probably 95% of what most people need to do when editing photos. It lets you select a few photos from your library and email them to anyone with just a couple of clicks. It also integrates with Google Earth and Google Maps to show you on a map where a particular photo was taken (for those unaware, GPS-enabled cameras, including many mobile phone cameras [e.g., iPhone] embed your GPS coordinates within the EXIF metadata of the photo, so any person, program or website with access to the image will know the location at which it was taken).

It also has a nifty feature called face tagging. How this works, basically, is Picasa analyzes all of the photos in your library and looks for faces. There’s some algorithm in the program that can recognize that two eyes, a nose, a mouth and maybe some hair is a face. So if you use the face-tagging feature, Picasa shows you a page of faces extracted from your photo library. Initially these photos have no names, but Picasa does some basic grouping of them. For example, it doesn’t know who your Uncle Bob is, but it does know that these 14 photos are all of the same person. The grouping feature isn’t perfect, but it is very helpful when you decide to apply a name to the group of photos – tagging 14 photos instead of one is a great time-saver.

This feature only really becomes useful if you start tagging faces with real names — i.e. if you tag the photos of Uncle Bob by telling Picasa “these are photos of Uncle Bob.” If you facetag enough photos, Google will start “guessing” the name for a particular face, and tagging it automatically. This feature is also not perfect, but I imagine they’re working on improving it all the time.

So, this all happens on your computer, within Picasa. I’m not so much of a tinfoil hat type as to suggest Google’s doing anything in particular with the data on your computer itself. The “problem” as I see it is that when you tag a photo of Uncle Bob, Picasa pulls Uncle Bob’s contact info out of your Gmail contacts. So essentially, you’re tying a face to an email address. As I said, I don’t think Google’s surreptitiously going to use the info that resides on your computer.

But in addition to Picasa, the photo organization tool you run on your computer, Google offers an online photo album service called Picasa Web Albums. This is similar to other services, Flickr being the largest, that offer a simple way to upload photos and share them with others. All users get 1 GB of free storage, and you can buy more pretty cheaply (as of today you can get 20 GB for $5/year). As you might expect from the names, Picasa and Picasa Web Albums integrate very well. If you create an album within Picasa, all you have to do to upload it to Picasa Web Albums is click the “Sync This Album” button. It will then upload all the photos in the album to Picasa Web Albums.

Here’s where the potential creepy part starts. Let’s say you have a photo in Picasa that you took on August 4th, 2010, at 10:00 AM, and you’ve tagged two faces in it: Aunt Alice (alice@gmail.com) and Uncle Bob (bob@gmail.com). Let’s further say that you took this photo with your iPhone, so the GPS coordinates are embedded in the photo metadata. You upload the photo to Picasa Web Albums. Well, now you’ve just told Google the following:

  • What alice@gmail.com looks like.
  • What bob@gmail.com looks like.
  • Where alice@gmail.com and bob@gmail.com physically were (via GPS coordinates) on 8/4/2010 at 10:00 AM.
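Put another way, one tagged, geotagged photo flattens out into a "who was where, when" record for every face in it. Here's a minimal Python sketch of that idea. The `photo` dictionary, its field names, the coordinates and the `sightings` function are all hypothetical; I have no idea what Google's actual data structures look like.

```python
from datetime import datetime

# Hypothetical representation of one uploaded photo (coordinates invented).
photo = {
    "taken_at": datetime(2010, 8, 4, 10, 0),
    "gps": (40.75, -73.98),
    "face_tags": {
        "Aunt Alice": "alice@gmail.com",
        "Uncle Bob": "bob@gmail.com",
    },
}

def sightings(photo):
    """Return one (email, latitude, longitude, timestamp) record per tagged face."""
    lat, lon = photo["gps"]
    return [
        (email, lat, lon, photo["taken_at"])
        for email in photo["face_tags"].values()
    ]

for record in sightings(photo):
    print(record)
```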

There’s lots of other information you’ve probably also told them, but these are the data that are creeping me out lately. If your album has 20 or 30 photos of Alice and Bob that you’ve tagged with their contact info, then Google’s got a pretty good idea what they look like – if the Picasa desktop app is able to guess who people in your photos are based on some algorithm inside it, imagine what Google’s billion-dollar datacenters can do.

In all likelihood, you aren’t the only one uploading photos of Alice and Bob. Other people at other events tag photos of Alice and Bob and upload them to Google, further “teaching” this massive computer brain what Alice and Bob look like (since email addresses are basically internet-wide unique IDs, two photos tagged with the same email address can generally be assumed to be the same person). Alice and Bob may never use Picasa, may not even own a camera themselves, and may not even use Google at all. But at this point Google knows what they look like and where they’ve gone – completely apart from their computer-based activities.
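The reason the email address matters so much is that it works as a join key. Here's a toy Python sketch of uploads from several unrelated people collapsing into one profile per address. Everything in it (the `uploads` list, the uploaders, the places and dates, the `build_profiles` function) is invented for illustration; it's a sketch of the general idea, not a description of anything Google is known to run.

```python
from collections import defaultdict

# Hypothetical uploads: (uploader, tagged email, place, date). All invented.
uploads = [
    ("you", "alice@gmail.com", "New York", "2010-08-04"),
    ("you", "bob@gmail.com", "New York", "2010-08-04"),
    ("a cousin", "alice@gmail.com", "Boston", "2010-09-12"),
    ("a coworker", "bob@gmail.com", "Chicago", "2010-10-01"),
]

def build_profiles(uploads):
    """Group tagged photos by email address, regardless of who uploaded them."""
    profiles = defaultdict(
        lambda: {"face_samples": 0, "uploaders": set(), "places": []}
    )
    for uploader, email, place, date in uploads:
        profile = profiles[email]
        profile["face_samples"] += 1      # one more example of this face
        profile["uploaders"].add(uploader)
        profile["places"].append((date, place))
    return dict(profiles)

profiles = build_profiles(uploads)
for email, profile in profiles.items():
    print(email, profile["face_samples"], profile["places"])
```

Notice that Alice and Bob never appear as uploaders. Their profiles get built entirely out of other people's photos.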

I think facial recognition is going to become huge for marketers over the next decade or so. Picasa offers users a useful feature that seems like it has this sinister other side to it – basically building an enormous crowdsourced facial recognition database, so they’ll be able to identify millions of people right out of the gate. If New York City ever gives Google access to its street cams, Google will be able to track the activities of millions more people without their knowledge or consent. Combine that with the existing knowledge Google has – if your iPhone checks your Gmail account, they know your general location at any given time anyway, just based on IP address – and they can create a pretty accurate (in advertising terms) picture of you. And with facial recognition, it will actually BE a picture of you.

Much is made of Google’s “Don’t Be Evil” motto (and I couldn’t write this without throwing those 3 words in), and I tend to be somewhat of a Google fanboy myself. However, much like government, what you have to worry about isn’t always what the current regime is doing with its power, but what the regime 10 or 20 or 100 years from now will do with it. I’m sure Google has rules about how these data are used, but rules change; rules are broken. If there’s one rule that seems inviolate throughout human history it’s that power corrupts. Knowledge is power. Or something.

Well, whatever. I still love Picasa; it just gives me this creepy feeling sometimes. This stuff is all completely voluntary and nobody is being forced to use any of these features, but like I said, Uncle Bob and Aunt Alice were tagged in a photo by someone else – you don’t need to do anything to have your face added to the Great Google Face Database In The Sky. This is something I’ve been thinking about for a while, but I was prompted to write it down by Eric Schmidt’s recent comment, “Show us 14 photos of yourself and we can identify who you are.”