@hugh ...which I realised I didn't actually boost at any point. 🤦‍♀️
@virtualwolf I must admit I'm having a hard time cranking up the rage machine on this. Mastodon has multiple user-level tools for people who don't want their toots scraped.
I mean, yes, you should expect public posts to be accessed by the public in ways you don't anticipate, but that's no reason not to call out anyone who's being a dickhead with your stuff.
@hugh @fortescue @virtualwolf I only skimmed the paper this morning (and it's now (shock!) been pulled due to privacy issues), but at first glance they seemed to be using API queries to pull the instance timelines directly. A reputable search engine would view the pages as a real user would, and those pages include things like, oh, for instance, a user's "do not index this content" directives, if present.
They seemed to think this was fine because they did a tiny bit of scrubbing of usernames.
While academic institutions have pretty broad fair use rules in their favour, those rules aren't absolute, and there was no indication this group even considered that possibility.
Just because you CAN download a given thing quite legitimately via a website, it's not automatically public domain. There's still ownership.
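For anyone wondering what honouring a "do not index this content" directive might look like in practice, here's a rough sketch. It assumes the directive shows up as a standard `<meta name="robots">` tag in the page HTML, which is the usual convention; the names `RobotsMetaParser` and `may_index` are made up for illustration, and this says nothing about what the paper's authors actually ran.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def may_index(html: str) -> bool:
    """Return False if the page asks not to be indexed (hypothetical helper)."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return not ({"noindex", "none"} & parser.directives)
```

A scraper that checked something like `may_index(page_html)` before keeping a post would at least see the opt-out that a well-behaved crawler sees. Pulling timelines straight from the API skips the page HTML entirely, so a check like this never gets a chance to run.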