8 Responses to “Capture v. Derive”

  1. Long ago Y! Says:

    First of all:

    “impossible to derive them ipso facto.”

    post facto, not ipso facto…  [corrected, thanks.]

    Also:

There’s a lot to be learned in observing how information is manipulated by both its creator and subsequent consumers: who saves it, who forwards it to whom, who deletes it, how long they view it, how frequently they view it, and so on.

  2. joe lazarus Says:

Cool concept. Can’t wait to give ZoneTag a spin.

Nice job on the new blog. The posts are great so far. Not sure if you realized, but there seems to be some problem with your RSS feed: old posts keep appearing as new content in my blog reader, Bloglines.

    thanks again for sharing your thoughts!

  3. LULOP.org [opensource] » Capture vs Derive Says:

[…] Metadata should be added to the video by the capture device rather than derived from the video itself with computer vision analysis, because it’s enormously easier for the camera to capture things relating to the video with appropriate sensors than for any algorithm to derive them from the video, if not impossible. Read this beautiful post on Capture versus Derive. Oh, and the weblog comes from one Bradley Horowitz of Yahoo, formerly of Virage… […]

  4. LULOP.org [opensource] » Capture vs Derive Says:

[…] Metadata should be added to the video by the capture device rather than derived from the video itself with computer vision analysis, because it’s enormously easier for the camera to capture things relating to the video with appropriate sensors than for any algorithm to derive them from the video, if not impossible. Read this beautiful post on Capture versus Derive. Oh, and the weblog comes from one Bradley Horowitz of Yahoo, formerly of Virage… […]

  5. Abu Hurayrah Says:

How much are we willing to share about ourselves, though, when it comes to the data we’re contributing? Granted, I realize that I am leaving my browser type, version, OS, time of access, IP address, and so on wherever I go on the Internet (though with certain Firefox extensions I can change the User-Agent string and whatnot), but the idea that uploading an image to, say, Flickr is going to record more than I may realize I’m sharing is somewhat disconcerting.

However, I also realize that I am giving up more information now than I was 5 years ago, and even more than before that. Still, it would seem this rate of private-information exchange is accelerating (that is, its second derivative is greater than zero), and we need to simultaneously develop ways to guard what we want guarded.

I find the concept of metadata very interesting, because it helps us put more data into a form that is easier for our computers to manage, search, and index. But is hidden automation of all of these components really the best and/or only way? Can we do it as a sort of opt-in method, where those with more technical skills can manually tag and index their own content?

    I am referring to the near future, and not so much to the now.
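A hypothetical sketch of the opt-in idea above: the capture device records everything at the sensor, but only fields the user has explicitly whitelisted travel with the upload. The field names and values here are invented for illustration, not any real EXIF or Flickr schema.

```python
# Hypothetical opt-in metadata filter. The device captures a rich record,
# but the uploader attaches only the fields the user chose to share.
captured = {
    "timestamp": "2006-03-01T14:02:11",
    "gps": (37.7749, -122.4194),
    "device": "Nokia 6630",
    "cell_id": "310-410-1234-5678",       # illustrative cell-tower ID
    "nearby_bluetooth": ["phone-a", "phone-b"],
}

def opt_in_filter(metadata, shared_fields):
    """Keep only the fields the user explicitly opted in to sharing."""
    return {k: v for k, v in metadata.items() if k in shared_fields}

# The user shares time and device, withholding location and proximity data.
public = opt_in_filter(captured, {"timestamp", "device"})
print(public)
```

The interesting design question is the default: capture-time metadata is cheap to record, so the privacy decision moves from "can we get it?" to "what do we attach before it leaves the device?".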

  6. Pete Cashmore Says:

    This is an excellent post – I’m really enjoying your blog.

    In response to Abu’s comment, did anyone see the Slashdot post where the commenters figured out the location of a botnet creator based on the metadata of an image in the Washington Post? More here:

    http://www.techdirt.com/articles/20060221/0318222_F.shtml

  7. Neela Jacques Says:

Your blog started me thinking about metadata. It does seem clear that as we capture and attach more metadata, the underlying content can become exponentially more useful (the REAL long tail is generated).

It may be useful, however, to think about classifying types of metadata. It appears to me that metadata (for media) might be broken up into 3 separate categories:
1. Fact (it’s George Bush, taken at 2pm)
2. Preponderance (most people, but not all, would agree on the tag, e.g. “it’s a joke”)
3. Subjective/context-dependent (VERY funny, well-written article, conservative)

Computers (using regressions, etc.) can do a very good job at figuring out the first if given a few data points. The second requires more data points (and statistics), but again is solvable.
This last category is IMHO the most interesting, and a significant challenge. Amazon first, then so many others, added a ton of value by providing places to find others’ opinions of things we might be interested in (using “metadata” in the broadest sense of the word here). The problem is that (using the Amazon example) I don’t read every book people recommend to me, but there are some people I know think similarly to me whose book suggestions have a much stronger impact on me (e.g. buying Crossing the Chasm on Bradley’s suggestion in 1998). We care not about whether someone likes it, but whether someone LIKE ME likes it.

    Expanding that to metadata, some of the most useful metadata would be data that tells me what people most similar to ME think it is… This is much harder for a computer to do, but with enough data, for places where it really matters, definitely worth it. The obvious examples are places like NetFlix, TiVo, and Amazon, but also blog engines, joke pages, etc. Those that are able to add this third layer will have a HUGE advantage over their competitors, as they will be able to match people with the content they are looking for, interested in, and most likely to be surprised by.
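The “someone LIKE ME” idea is the core of user-based collaborative filtering; a minimal sketch, with made-up users, items, and ratings purely for illustration:

```python
# User-based collaborative filtering sketch: weight other users' opinions
# by how similar their past ratings are to mine. All data is invented.
from math import sqrt

# user -> {item: rating on a 1-5 scale}
ratings = {
    "me":    {"crossing_the_chasm": 5, "moon_harsh_mistress": 4, "dune": 2},
    "alice": {"crossing_the_chasm": 5, "moon_harsh_mistress": 5, "neuromancer": 4},
    "bob":   {"crossing_the_chasm": 1, "dune": 5, "neuromancer": 2},
}

def similarity(a, b):
    """Cosine similarity over the items both users rated."""
    shared = set(ratings[a]) & set(ratings[b])
    if not shared:
        return 0.0
    dot = sum(ratings[a][i] * ratings[b][i] for i in shared)
    na = sqrt(sum(ratings[a][i] ** 2 for i in shared))
    nb = sqrt(sum(ratings[b][i] ** 2 for i in shared))
    return dot / (na * nb)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    scores = [(similarity(user, other), r[item])
              for other, r in ratings.items()
              if other != user and item in r]
    total = sum(s for s, _ in scores)
    if total == 0:
        return None  # nobody comparable has rated it
    return sum(s * r for s, r in scores) / total

print(round(similarity("me", "alice"), 2))   # "alice" rates like "me"
print(round(predict("me", "neuromancer"), 2))
```

Because “alice” agrees with “me” far more than “bob” does, her rating of 4 dominates the prediction, which is exactly the “people LIKE ME” weighting the comment asks for.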

(In Heinlein’s The Moon is a Harsh Mistress, a computer seeking to grasp what is funny has the main character rate joke after joke… but does it really understand what is funny? Most humans don’t agree on this themselves….)

  8. thaddaeus brophy Says:

Most metadata really only passes for meta-blahblah, in my opinion. I think Sir Arthur Conan Doyle said something about the probability of meaningfully engineering metadata from primary sources without human intervention: “When you have eliminated the impossible, whatever remains, however improbable, must be the truth.” How Virage worked is a great clue to how it can be done, however; better hire some talented semioticians, since complex grammars aren’t really comprehensible ‘prima facie’ outside symbolic languages, or to anyone who hasn’t tortured their mind ( ;7 ) to work in them, IMHO.
