complex grab bag
A week or so ago, Ethan Zuckerman posted an entry discussing aggregator blogs that use whole entries from Creative-Commons-licensed sources. He raises the issue of commercial blogs that take content that’s licensed under share-alike licenses. To use the content, these commercial blogs should also use a Creative Commons license, but, in the case he described, there was no license declaration. He wondered how enforceable these licenses are.
He was talking about a site called usmediaweb.net, which reblogs entries dealing with media issues. This site reposted an entry of mine on YouTube a few days ago, and, although I’d read Ethan’s post, I was too befuddled by the site to worry too much about it, and I was a little flattered to see someone actually reads this thing. I’m using a non-commercial attribution license, so if usmediaweb.net is a commercial site, unlike reblog or unmediated, I should have popped them an email and told them to take it down. It looks like they’re in the process of rebuilding the whole site, so it’s a moot issue at this point.
I’m running into a related, but more pragmatic, issue. In the past few days, I’ve had a few mysterious sites republish posts whole-hog, then track back to my blog. Unlike usmediaweb.net, which republished a thoughtful post people might genuinely find interesting, these posts were lame linkdump entries. When I go to the linking site, I don’t see much other content, and, when I look, there doesn’t appear to be a Creative Commons license either. I haven’t bothered to ask them to take down the post because I suspect this is a form of trackback spam. While nothing on the sites reeks of spam today, I’m wondering if they’re using trackbacks to get links to blank pages and, after getting enough inbound links, turning the pages into link farms. I haven’t seen anyone else discuss this phenomenon, but maybe I should be savvy enough to know trackback spam when I see it.


Hi there,
Yes, the first attempt at usmediaweb was an ill-conceived attempt to recreate a sort of Google News format. I wasn’t using anyone’s feed directly; instead, I was only using a few generic tags from Icerocket (politics & tech). I was using the feeds mostly as a testing ground for something I had planned later. As far as the Creative Commons problem, that was an issue involved in some legal jargon. If you read the actual language in the creed, you will see it plainly says what you can do with the content. Nowhere does it say anything about the use of feeds. We live in a time where traditional legal jargon is a thing of the past, and organizations such as Creative Commons need to start addressing those new needs for in-depth legal language. I had heard nothing but bad things about the use of the Creative Commons licensing terms, so I decided to tear the whole thing down (again) and start fresh.
I feel better now that I have done that. I feel like I have control of the site again and can put what I want on there again. I will be doing a write-up on the site soon about the whole thing (the explanations were on the old, now gone, site).
In short, I had never meant any harm to anyone’s material, and never thought about it as stealing, but sometimes aggregation is just that: stealing.
Hey Bryce, thanks for the response. This is frankly an issue that I’m not particularly worried about, and, since I’m not a lawyer, I can only speculate about the legal issues related to aggregation and Creative Commons. However, one thing I do not want people to do with my content is re-use it on an ad-supported site. I choose not to run ads on my blog, and I don’t want others to make ad money by re-using my content.
Anyway, you say “if you read the actual language in the creed, you will see it plainly says what you can do with the content. Nowhere does it say anything about the use of feeds.” I’m not sure what you mean about the creed, but I’m thinking specifically about the license. If the author uses a non-commercial license, publishers with commercial blogs do not have the license to re-use the content. Secondly, I don’t understand the distinction you draw between feeds and content. From my perspective, feeds are content. Maybe you were only republishing the content from a feed on your site, but it looked to me that you had cut-and-pasted the entire entry.
Yes, Bloglines is a commercial service that displays the content of my feeds on its site, but they’re in the business of displaying the feeds users want to see, rather than aggregating other people’s content. They’re more about distribution and discovery than content itself. A second facet to the feed-license issue is that feeds can include a Creative Commons license, and I thought that I was running a WordPress plugin that included the license in the feed, but I apparently forgot to include it when I migrated my blog to a different server.
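To make the feed-license point concrete, here is a minimal sketch of how an aggregator could check a feed for a declared Creative Commons license before reposting anything. It assumes the license is declared either with the `creativeCommons:license` element of the Creative Commons RSS module or with an Atom `<link rel="license">` element; other conventions exist and are not covered. The function names are made up for illustration, and treating a missing license as "do not reuse" is my assumption about sensible default behavior, not legal advice.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespaces for the two license conventions this sketch understands.
CC_RSS_NS = "http://backend.userland.com/creativeCommonsRssModule"
ATOM_NS = "http://www.w3.org/2005/Atom"


def find_feed_licenses(feed_xml: str) -> List[str]:
    """Return the license URLs declared anywhere in an RSS or Atom feed."""
    root = ET.fromstring(feed_xml)
    found = []
    # RSS 2.0 with the Creative Commons module: <creativeCommons:license>URL</...>
    for el in root.iter("{%s}license" % CC_RSS_NS):
        if el.text and el.text.strip():
            found.append(el.text.strip())
    # Atom: <link rel="license" href="URL"/>
    for el in root.iter("{%s}link" % ATOM_NS):
        if el.get("rel") == "license" and el.get("href"):
            found.append(el.get("href"))
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(found))


def allows_commercial_reuse(license_url: str) -> bool:
    """Hypothetical helper: True only for a CC license URL without an 'nc' term."""
    parts = license_url.rstrip("/").split("/")
    if "creativecommons.org" not in parts or "licenses" not in parts:
        return False  # not a recognizable CC license URL: assume no permission
    idx = parts.index("licenses")
    if idx + 1 >= len(parts):
        return False
    code = parts[idx + 1]  # e.g. "by-nc" or "by-sa"
    return "nc" not in code.split("-")


def may_repost_on_commercial_site(feed_xml: str) -> bool:
    """A feed with no license declaration is treated as all rights reserved."""
    licenses = find_feed_licenses(feed_xml)
    return bool(licenses) and all(allows_commercial_reuse(u) for u in licenses)
```

A feed like mine after the server migration, with no license element at all, would come back as "do not repost" under this sketch, which is the conservative reading.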
Anyway, I’m in no way angry about your reuse of my content, but I think this is a discussion free culture advocates and non-commercial bloggers need to have.
Hello again,
I would just like to start by saying that when I was aggregating only CC-licensed feeds, none of them were of the non-commercial use type.
I then realized I was wasting my time aggregating only 6-7 or so blogs (since it is hard to find a bunch of CC-licensed blogs with good content). After that, I decided that maybe I could aggregate not feeds, but tags. This is where you come in.
This was my setup:
I pulled only 2 feeds from Icerocket (the tags were ‘tech’ and ‘politics’, a pretty broad selection, right?).
From there, I used the factory settings for my XML-RPC pinging. Of course, I wouldn’t want to ping Icerocket, would I? Of course not, because the XML schema is exactly the same, and it would recognize the post as already submitted. So I pinged everyone else (Technorati, weblogs, including all the Asian and European ones I could find). Now at that point, I had no control over what was being ‘reblogged’, so a lot of websites, such as yours, got caught in the mix of CC-licensed versus copyrighted versus non-commercial use licenses. The whole thing was a mess. It was only after receiving an email from Dick Costolo at FeedBurner, requesting that I remove the feed circulation for a certain post after a number of complaints, that I realized not only what I was doing wrong, but that a lot of it was copyright infringement.
So what do people do when they are on their last legs? Take a day to reflect on other options, then act on those decisions. I decided that the entire website was full of copyright abuse, so the entire website had to be killed. Rather than picking through 2,500 posts, it would be easier to slowly disassemble it, and then gut the database, so that even if I wanted to use an old post, I couldn’t.
I was using a WordPress plugin, which you may know of, that autoblogged anything you wanted from a feed. One of the technical problems with it is that it always attributed the original post using a LiveJournal format, which was the wrong way. 99% of the time, the WordPress authors were attributed properly, but only if they used one tag in the post. There was nothing but problems from the moment of conception. I had well over 2,000 categories, many of which were the same one, my bandwidth skyrocketed (which wasn’t a big deal), and at over 300 posts per day on average, I was killing my database (making the website slower every day).
I will discuss the ethical side next, but this comment is turning into a novel right now, so I will leave you with that for now.
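For readers unfamiliar with pinging, the setup described above can be sketched roughly as follows. This is not the plugin's actual code. It assumes the ping services accept the common `weblogUpdates.ping` XML-RPC call with a site name and a site URL, and the function names and any endpoint URLs passed in are placeholders for illustration.

```python
import xmlrpc.client
from typing import Dict, List


def build_ping_payload(site_name: str, site_url: str) -> str:
    """Build the XML body of a weblogUpdates.ping request without sending it."""
    return xmlrpc.client.dumps((site_name, site_url), methodname="weblogUpdates.ping")


def send_pings(site_name: str, site_url: str, endpoints: List[str]) -> Dict[str, object]:
    """Ping each endpoint in turn and collect the response or the error.

    `endpoints` is a caller-supplied list of XML-RPC URLs (an assumption here:
    the list of services is whatever the blog software is configured with).
    """
    results = {}
    for endpoint in endpoints:
        try:
            server = xmlrpc.client.ServerProxy(endpoint)
            results[endpoint] = server.weblogUpdates.ping(site_name, site_url)
        except Exception as exc:  # network failures, XML-RPC faults, bad URLs
            results[endpoint] = exc
    return results
```

Note that the ping itself only announces that a site has new posts. It carries no information about where the content came from or how it is licensed, which is part of why reblogged entries spread so widely once the pings went out.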