<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Directions in Search over Social Media</title>
	<atom:link href="http://probablyirrelevant.org/2008/11/directions-in-search-over-social-media/feed/" rel="self" type="application/rss+xml" />
	<link>http://probablyirrelevant.org/2008/11/directions-in-search-over-social-media/</link>
	<description>Information Retrieval Research and Development</description>
	<lastBuildDate>Tue, 22 Jun 2010 14:21:52 -0400</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Jon Elsas</title>
		<link>http://probablyirrelevant.org/2008/11/directions-in-search-over-social-media/comment-page-1/#comment-520</link>
		<dc:creator>Jon Elsas</dc:creator>
		<pubDate>Mon, 10 Nov 2008 14:39:15 +0000</pubDate>
		<guid isPermaLink="false">http://probablyirrelevant.org/?p=40#comment-520</guid>
		<description>0: no apologies needed!  I think your interpretation is accurate, modulo varying definitions of &quot;anyone&quot; and &quot;everyone&quot; :)

2: I only partially agree on this point.  As you seem to be implying, &quot;social media&quot;, especially the definition I cited, is overly broad.  Maybe intentionally so.  I like this high-recall (low-precision?) definition, as it prompts us to re-evaluate how we think about existing collections that may or may not generally be considered &quot;social&quot;.  

I do think current collections that are generally considered social media share common organizational patterns:  Individuals typically have a persistent identity that is tied to their contributions to the collection.  This allows us to treat authors as a unit of retrieval, for example.  Objects in these collections often support the attachment of commentary by the community -- tags, discussion threads, blog comments.  

These organizational patterns may be neither necessary nor sufficient to define a medium as social.  But I think collections that share these types of organizational idioms should benefit from the same IR techniques that leverage them.</description>
		<content:encoded><![CDATA[<p>0: no apologies needed!  I think your interpretation is accurate, modulo varying definitions of &#8220;anyone&#8221; and &#8220;everyone&#8221; :)</p>
<p>2: I only partially agree on this point.  As you seem to be implying, &#8220;social media&#8221;, especially the definition I cited, is overly broad.  Maybe intentionally so.  I like this high-recall (low-precision?) definition, as it prompts us to re-evaluate how we think about existing collections that may or may not generally be considered &#8220;social&#8221;.  </p>
<p>I do think current collections that are generally considered social media share common organizational patterns:  Individuals typically have a persistent identity that is tied to their contributions to the collection.  This allows us to treat authors as a unit of retrieval, for example.  Objects in these collections often support the attachment of commentary by the community &#8212; tags, discussion threads, blog comments.  </p>
<p>These organizational patterns may be neither necessary nor sufficient to define a medium as social.  But I think collections that share these types of organizational idioms should benefit from the same IR techniques that leverage them.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fernando</title>
		<link>http://probablyirrelevant.org/2008/11/directions-in-search-over-social-media/comment-page-1/#comment-492</link>
		<dc:creator>Fernando</dc:creator>
		<pubDate>Sun, 09 Nov 2008 01:48:43 +0000</pubDate>
		<guid isPermaLink="false">http://probablyirrelevant.org/?p=40#comment-492</guid>
		<description>0: I apologize for dwelling on definitions but I&#039;d rather avoid a &quot;you&#039;ll know it when you see it&quot; approach.

So when I hear,

“collective good produced through computer-mediated collective action”

This is how I interpret (text) collective goods and actions,

collective good: a corpus accessible to and of value to everyone
collective action: a corpus modifiable by anyone

Is this accurate?

2: Okay.  I&#039;m really hoping that if two corpora are referred to as &quot;social media&quot;, then they will share interesting retrieval techniques more so than a &quot;social media&quot; corpus and a news corpus.</description>
		<content:encoded><![CDATA[<p>0: I apologize for dwelling on definitions but I&#8217;d rather avoid a &#8220;you&#8217;ll know it when you see it&#8221; approach.</p>
<p>So when I hear,</p>
<p>“collective good produced through computer-mediated collective action”</p>
<p>This is how I interpret (text) collective goods and actions,</p>
<p>collective good: a corpus accessible to and of value to everyone<br />
collective action: a corpus modifiable by anyone</p>
<p>Is this accurate?</p>
<p>2: Okay.  I&#8217;m really hoping that if two corpora are referred to as &#8220;social media&#8221;, then they will share interesting retrieval techniques more so than a &#8220;social media&#8221; corpus and a news corpus.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Elsas</title>
		<link>http://probablyirrelevant.org/2008/11/directions-in-search-over-social-media/comment-page-1/#comment-485</link>
		<dc:creator>Jon Elsas</dc:creator>
		<pubDate>Sat, 08 Nov 2008 18:50:07 +0000</pubDate>
		<guid isPermaLink="false">http://probablyirrelevant.org/?p=40#comment-485</guid>
		<description>Fernando -- 

0: I&#039;m partial to &lt;a href=&quot;http://ir.mathcs.emory.edu/SSM2008/papers/ssm22p-smith.pdf&quot; rel=&quot;nofollow&quot;&gt;Marc Smith&#039;s definition of social media&lt;/a&gt;: &quot;collective good produced through computer-mediated collective action&quot;.  The volume of producers or consumers does not define what social media is, but these are certainly useful dimensions for characterizing it.  The &quot;collective good&quot; that I think is most interesting from the IR perspective is the long-lived text artifact.  Friend networks, &quot;tweets&quot;, link collections, and tags all certainly have value, but are (IMO) less interesting from the IR research perspective.

1: I don&#039;t know, but am working on it.  I&#039;ve been in contact with the people at &lt;a href=&quot;http://boardtracker.com&quot; rel=&quot;nofollow&quot;&gt;BoardTracker&lt;/a&gt;, and have been promised some interaction data.  I would love to get query logs from a service like &lt;a href=&quot;http://markmail.org&quot; rel=&quot;nofollow&quot;&gt;MarkMail&lt;/a&gt;, but haven&#039;t yet pursued it.  

I have looked at the AOL query log data a bit, and there are a few message boards that receive 1-2k clicks over the three months of that collection.  This gives a rough idea of the kinds of queries that may be served by data in online message boards, but not a very rich picture of the interaction.  

2: See (0) above.  The archived Q/A dialog is what I&#039;ve spent the most time looking at, but it is certainly not the only worthwhile social media to study from the IR perspective.  Wikipedia comes to mind, where edit history provides another dimension of the collection structure to work with.

I&#039;m attracted to online forums and mailing lists because they have a history of hosting an exchange with experts.  I also think existing tools to search them are lacking.</description>
		<content:encoded><![CDATA[<p>Fernando &#8212; </p>
<p>0: I&#8217;m partial to <a href="http://ir.mathcs.emory.edu/SSM2008/papers/ssm22p-smith.pdf" rel="nofollow">Marc Smith&#8217;s definition of social media</a>: &#8220;collective good produced through computer-mediated collective action&#8221;.  The volume of producers or consumers does not define what social media is, but these are certainly useful dimensions for characterizing it.  The &#8220;collective good&#8221; that I think is most interesting from the IR perspective is the long-lived text artifact.  Friend networks, &#8220;tweets&#8221;, link collections, and tags all certainly have value, but are (IMO) less interesting from the IR research perspective.</p>
<p>1: I don&#8217;t know, but am working on it.  I&#8217;ve been in contact with the people at <a href="http://boardtracker.com" rel="nofollow">BoardTracker</a>, and have been promised some interaction data.  I would love to get query logs from a service like <a href="http://markmail.org" rel="nofollow">MarkMail</a>, but haven&#8217;t yet pursued it.  </p>
<p>I have looked at the AOL query log data a bit, and there are a few message boards that receive 1-2k clicks over the three months of that collection.  This gives a rough idea of the kinds of queries that may be served by data in online message boards, but not a very rich picture of the interaction.  </p>
<p>2: See (0) above.  The archived Q/A dialog is what I&#8217;ve spent the most time looking at, but it is certainly not the only worthwhile social media to study from the IR perspective.  Wikipedia comes to mind, where edit history provides another dimension of the collection structure to work with.</p>
<p>I&#8217;m attracted to online forums and mailing lists because they have a history of hosting an exchange with experts.  I also think existing tools to search them are lacking.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jon Elsas</title>
		<link>http://probablyirrelevant.org/2008/11/directions-in-search-over-social-media/comment-page-1/#comment-482</link>
		<dc:creator>Jon Elsas</dc:creator>
		<pubDate>Sat, 08 Nov 2008 17:54:45 +0000</pubDate>
		<guid isPermaLink="false">http://probablyirrelevant.org/?p=40#comment-482</guid>
		<description>Michael -- thanks for the reference.  It&#039;s certainly pertinent to this discussion, and an interesting extension to the types of research I associate with INEX &amp; XML element retrieval.

I agree, many of these collections to which I&#039;m referring could lend themselves to faceted browsing interfaces, but I don&#039;t think those types of interfaces &quot;solve&quot; the problems of information access.  They provide users with control over how to display the data along different (orthogonal?) dimensions, but when there are 200k different authors, does providing filtering or sorting on that attribute really help?  I would argue that for faceted interfaces to be successful on large data sets such as these, we still need automatic methods for ranking authors, topics, threads, etc.</description>
		<content:encoded><![CDATA[<p>Michael &#8212; thanks for the reference.  It&#8217;s certainly pertinent to this discussion, and an interesting extension to the types of research I associate with INEX &#038; XML element retrieval.</p>
<p>I agree, many of these collections to which I&#8217;m referring could lend themselves to faceted browsing interfaces, but I don&#8217;t think those types of interfaces &#8220;solve&#8221; the problems of information access.  They provide users with control over how to display the data along different (orthogonal?) dimensions, but when there are 200k different authors, does providing filtering or sorting on that attribute really help?  I would argue that for faceted interfaces to be successful on large data sets such as these, we still need automatic methods for ranking authors, topics, threads, etc.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fernando</title>
		<link>http://probablyirrelevant.org/2008/11/directions-in-search-over-social-media/comment-page-1/#comment-481</link>
		<dc:creator>Fernando</dc:creator>
		<pubDate>Sat, 08 Nov 2008 16:37:00 +0000</pubDate>
		<guid isPermaLink="false">http://probablyirrelevant.org/?p=40#comment-481</guid>
		<description>A few things:

0. Please define social media.  Does it have to do with the ratio of users generating content to those querying content?  Or something else?

1. There are existing forum/mailing list/newsgroup search engines.  Do we know anything about how users interact with them?

2. You seem to be focusing on forum/mailing list/newsgroup, which all have the same discussion/qa/threaded format.  Now what&#039;s the relationship to other social media?  What about these other media makes them &quot;uninteresting&quot;?  Is there a formal way to predict this?

Nice post.</description>
		<content:encoded><![CDATA[<p>A few things:</p>
<p>0. Please define social media.  Does it have to do with the ratio of users generating content to those querying content?  Or something else?</p>
<p>1. There are existing forum/mailing list/newsgroup search engines.  Do we know anything about how users interact with them?</p>
<p>2. You seem to be focusing on forum/mailing list/newsgroup, which all have the same discussion/qa/threaded format.  Now what&#8217;s the relationship to other social media?  What about these other media makes them &#8220;uninteresting&#8221;?  Is there a formal way to predict this?</p>
<p>Nice post.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael B.</title>
		<link>http://probablyirrelevant.org/2008/11/directions-in-search-over-social-media/comment-page-1/#comment-456</link>
		<dc:creator>Michael B.</dc:creator>
		<pubDate>Fri, 07 Nov 2008 20:56:00 +0000</pubDate>
		<guid isPermaLink="false">http://probablyirrelevant.org/?p=40#comment-456</guid>
		<description>Nice post!

The &quot;orthogonal axes of retrieval&quot; discussed here bring to mind faceted retrieval, which has been discussed for quite a while, but suffers from a lack of collections to allow rigorous evaluation --- but this issue probably deserves its own post  :)

Interestingly, the research on book retrieval tries to address similar problems to the ones addressed here: multiple levels of granularity and organization of results by authorship, topic and date.  For example, this poster: http://www2007.org/posters/poster901.pdf</description>
		<content:encoded><![CDATA[<p>Nice post!</p>
<p>The &#8220;orthogonal axes of retrieval&#8221; discussed here bring to mind faceted retrieval, which has been discussed for quite a while, but suffers from a lack of collections to allow rigorous evaluation &#8212; but this issue probably deserves its own post  :)</p>
<p>Interestingly, the research on book retrieval tries to address similar problems to the ones addressed here: multiple levels of granularity and organization of results by authorship, topic and date.  For example, this poster: <a href="http://www2007.org/posters/poster901.pdf" rel="nofollow">http://www2007.org/posters/poster901.pdf</a></p>
]]></content:encoded>
	</item>
</channel>
</rss>
