<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.3.3" -->
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	>
<channel>
	<title>Comments on: Wisdom of small crowds, part 3: another worker visualization</title>
	<link>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/</link>
	<description></description>
	<pubDate>Tue, 06 Jan 2009 23:50:27 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.3.3</generator>
		<item>
		<title>By: brendano</title>
		<link>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-437</link>
		<dc:creator>brendano</dc:creator>
		<pubDate>Thu, 14 Aug 2008 21:54:00 +0000</pubDate>
		<guid>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-437</guid>
		<description>Oh, as for HITs that take longer -- haven't looked at that too much.  I've only done this timing analysis for a task that's really easy for all HITs.</description>
		<content:encoded><![CDATA[<p>Oh, as for HITs that take longer &#8212; haven&#8217;t looked at that too much.  I&#8217;ve only done this timing analysis for a task that&#8217;s really easy for all HITs.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: brendano</title>
		<link>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-436</link>
		<dc:creator>brendano</dc:creator>
		<pubDate>Thu, 14 Aug 2008 19:24:51 +0000</pubDate>
		<guid>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-436</guid>
		<description>oops, that code uses a function ("dfagg") from the utility file http://github.com/brendano/dlanalysis/tree/master/util.R</description>
		<content:encoded><![CDATA[<p>oops, that code uses a function (&#8220;dfagg&#8221;) from the utility file <a href="http://github.com/brendano/dlanalysis/tree/master/util.R" rel="nofollow">http://github.com/brendano/dlanalysis/tree/master/util.R</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: brendano</title>
		<link>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-435</link>
		<dc:creator>brendano</dc:creator>
		<pubDate>Thu, 14 Aug 2008 19:21:59 +0000</pubDate>
		<guid>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-435</guid>
		<description>Thanks!  The code is all in R and pretty minimal, so I'll just put it inline here.

This is working off of the new CSV format from the new AMT interface, which has one row per assignment:
&lt;pre&gt;
a = read.csv("amt_assignments_file.csv")
# some lame datetime cleanups - amazon uses strftime("%c"), totally dumb...
lame_convert &lt;- function(x)  strptime(x, "%A %b %d %T")
for (c in c('AcceptTime','AutoApprovalTime','CreationTime','Expiration','SubmitTime'))
 a[,c] = as.POSIXct( lame_convert(a[,c]) )
&lt;/pre&gt;


Here's the parallelism plot:

&lt;pre&gt;
worker_parallelism_plot &lt;- function(a, w_pos=NULL, ...) {
  if (is.null(w_pos)) {
    w_starts = (dfagg(a,a$WorkerId, function(x) min(x$SubmitTime - x$WorkTimeInSeconds)))
    w_pos = rank(w_starts, ties='first')
  }
  
  plot(a$SubmitTime - a$WorkTimeInSeconds, w_pos[a$WorkerId],  type='p', ...)
  segments(a$SubmitTime - a$WorkTimeInSeconds, w_pos[a$WorkerId],   a$SubmitTime, w_pos[a$WorkerId])
  # text(sort(w_starts), 1:length(w_pos), sprintf("%s", 1:length(w_pos)), pos=2)
}


&lt;/pre&gt;


The one-box-per-worker plot in the other post is just

&lt;pre&gt;
library(lattice)
xyplot(WorkTimeInSeconds ~ SubmitTime &#124; WorkerId, data=a)
&lt;/pre&gt;</description>
		<content:encoded><![CDATA[<p>Thanks!  The code is all in R and pretty minimal, so I&#8217;ll just put it inline here.</p>
<p>This is working off of the new CSV format from the new AMT interface, which has one row per assignment:</p>
<pre>
a = read.csv("amt_assignments_file.csv")
# some lame datetime cleanups - amazon uses strftime("%c"), totally dumb...
lame_convert <- function(x)  strptime(x, "%A %b %d %T")
for (c in c('AcceptTime','AutoApprovalTime','CreationTime','Expiration','SubmitTime'))
 a[,c] = as.POSIXct( lame_convert(a[,c]) )
</pre>
<p>Here&#8217;s the parallelism plot:</p>
<pre>
worker_parallelism_plot <- function(a, w_pos=NULL, ...) {
  if (is.null(w_pos)) {
    w_starts = (dfagg(a,a$WorkerId, function(x) min(x$SubmitTime - x$WorkTimeInSeconds)))
    w_pos = rank(w_starts, ties='first')
  }

  plot(a$SubmitTime - a$WorkTimeInSeconds, w_pos[a$WorkerId],  type='p', ...)
  segments(a$SubmitTime - a$WorkTimeInSeconds, w_pos[a$WorkerId],   a$SubmitTime, w_pos[a$WorkerId])
  # text(sort(w_starts), 1:length(w_pos), sprintf("%s", 1:length(w_pos)), pos=2)
}
</pre>
<p>The one-box-per-worker plot in the other post is just</p>
<pre>
library(lattice)
xyplot(WorkTimeInSeconds ~ SubmitTime | WorkerId, data=a)
</pre>
]]></content:encoded>
	</item>
	<item>
		<title>By: Panos Ipeirotis</title>
		<link>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-433</link>
		<dc:creator>Panos Ipeirotis</dc:creator>
		<pubDate>Thu, 14 Aug 2008 14:44:39 +0000</pubDate>
		<guid>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-433</guid>
		<description>Also, can you post the code for these visualizations? They are pretty cool and very revealing at the same time.</description>
		<content:encoded><![CDATA[<p>Also, can you post the code for these visualizations? They are pretty cool and very revealing at the same time.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Panos Ipeirotis</title>
		<link>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-432</link>
		<dc:creator>Panos Ipeirotis</dc:creator>
		<pubDate>Thu, 14 Aug 2008 14:20:49 +0000</pubDate>
		<guid>http://blog.doloreslabs.com/2008/08/wisdom-of-small-crowds-part-3-another-worker-visualization/#comment-432</guid>
		<description>Excellent demonstration of worker times (both this one and the previous post).

Have you thought of examining more closely the HITs that tend to take longer to complete than the rest? I am wondering if they are "more difficult" than the rest, or if they fall into some specific category.</description>
		<content:encoded><![CDATA[<p>Excellent demonstration of worker times (both this one and the previous post).</p>
<p>Have you thought of examining more closely the HITs that tend to take longer to complete than the rest? I am wondering if they are &#8220;more difficult&#8221; than the rest, or if they fall into some specific category.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
