Paul Kedrosky points to a new feature on Flickr which exposes statistics for metadata associated with members of the service. The graphs below, for example, show popular point-and-shoot cameras and camera phones. Each graph actually plots the percentage of members who have uploaded at least one photo to the service using that make of camera.
These kinds of stats, however, always need to be treated with appropriate statistical caution. They have to be interpreted literally - as statistics describing the percentage of Flickr members who have uploaded photos which the Flickr system has identified as being taken by a particular camera or other image capture device. With that description, one needs to ask:
- How accurate is the classification?
- How does this relate to the total population of camera consumers, of which Flickr is a biased sample?
The first question is a potential problem given that Flickr states:
The graphs are only accurate to the extent that we can automatically detect the camera used to take the photo (about 2/3rds of the time). That is not usually possible with cameraphone photos and cameraphones are therefore under-represented.
This seems to indicate that recall is about 66%, i.e. the camera is identified for roughly two thirds of photos (I'm assuming that a structured field is available and that precision is 100%).
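To see why uneven recall matters, here is a minimal Python sketch. All of the numbers and the function name `observed_shares` are hypothetical, picked only so that overall recall lands near Flickr's "about 2/3"; they are not Flickr data. It assumes perfect precision, as above, and a lower recall for cameraphones than for point-and-shoot cameras.

```python
def observed_shares(true_counts, recall):
    """Share of each category among the items that were detected."""
    detected = {k: true_counts[k] * recall[k] for k in true_counts}
    total = sum(detected.values())
    return {k: v / total for k, v in detected.items()}


# Hypothetical population and per-category recall (illustration only).
true_counts = {"point_and_shoot": 600, "cameraphone": 400}
recall = {"point_and_shoot": 0.8, "cameraphone": 0.4}

# Overall recall is the detected fraction of the whole population.
overall_recall = (
    sum(true_counts[k] * recall[k] for k in true_counts)
    / sum(true_counts.values())
)

true_share = true_counts["cameraphone"] / sum(true_counts.values())
shares = observed_shares(true_counts, recall)

print(f"overall recall:             {overall_recall:.2f}")
print(f"true cameraphone share:     {true_share:.2f}")
print(f"observed cameraphone share: {shares['cameraphone']:.2f}")
```

With these made-up inputs, overall recall is close to two thirds, yet the cameraphone share in the detected data is well below its true share. The point is that a single overall recall figure says little about how individual categories are distorted.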



Thank you for providing a voice of reason on this one. Looks like the Digg and Reddit user groups lapped this up as if true.
That second source of error you mentioned sounds like the biggie to me - that Flickr users are a far from representative sample of camera owners.
Not to mention the fact that those users who take (or choose to upload) more pictures will be overrepresented.
Posted by: Lee McEwan | December 02, 2006 at 05:08 PM