Tumgik
#floss
Text
Palantir’s NHS-stealing Big Lie
Tumblr media
I'm on tour with my new, nationally bestselling novel The Bezzle! Catch me in TUCSON (Mar 9-10), then SAN FRANCISCO (Mar 13), Anaheim, and more!
Tumblr media
Capitalism's Big Lie in four words: "There is no alternative." Looters use this lie for cover, insisting that they're hard-nosed grownups living in the reality of human nature, incentives, and facts (which don't care about your feelings).
The point of "there is no alternative" is to extinguish the innovative imagination. "There is no alternative" is really "stop trying to think of alternatives, dammit." But there are always alternatives, and the only reason to demand that they be excluded from consideration is that these alternatives are manifestly superior to the looter's supposed inevitability.
Right now, there's an attempt underway to loot the NHS, the UK's single most beloved institution. The NHS has been under sustained assault for decades – budget cuts, overt and stealth privatisation, etc. But one of its crown jewels has been stubbournly resistant to being auctioned off: patient data. Not that HMG hasn't repeatedly tried to flog patient data – it's just that the public won't stand for it:
https://www.theguardian.com/society/2023/nov/21/nhs-data-platform-may-be-undermined-by-lack-of-public-trust-warn-campaigners
Patients – quite reasonably – do not trust the private sector to handle their sensitive medical records.
Now, this presents a real conundrum, because NHS patient data, taken as a whole, holds untold medical insights. The UK is a large and diverse country and those records in aggregate can help researchers understand the efficacy of various medicines and other interventions. Leaving that data inert and unanalysed will cost lives: in the UK, and all over the world.
For years, the stock answer to "how do we do science on NHS records without violating patient privacy?" has been "just anonymise the data." The claim is that if you replace patient names with random numbers, you can release the data to research partners without compromising patient privacy, because no one will be able to turn those numbers back into names.
It would be great if this were true, but it isn't. In theory and in practice, it is surprisingly easy to "re-identify" individuals in anonymous data-sets. To take an obvious example: we know which two dates former PM Tony Blair was given a specific treatment for a cardiac emergency, because this happened while he was in office. We also know Blair's date of birth. Check any trove of NHS data that records a person who matches those three facts and you've found Tony Blair – and all the private data contained alongside those public facts is now in the public domain, forever.
Not everyone has Tony Blair's reidentification hooks, but everyone has data in some kind of database, and those databases are continually being breached, leaked or intentionally released. A breach from a taxi service like Addison-Lee or Uber, or from Transport for London, will reveal the journeys that immediately preceded each prescription at each clinic or hospital in an "anonymous" NHS dataset, which can then be cross-referenced to databases of home addresses and workplaces. In an eyeblink, millions of Britons' records of receiving treatment for STIs or cancer can be connected with named individuals – again, forever.
Re-identification attacks are now considered inevitable; security researchers have made a sport out of seeing how little additional information they need to re-identify individuals in anonymised data-sets. A surprising number of people in any large data-set can be re-identified based on a single characteristic in the data-set.
Given all this, anonymous NHS data releases should have been ruled out years ago. Instead, NHS records are to be handed over to the US military surveillance company Palantir, a notorious human-rights abuser and supplier to the world's most disgusting authoritarian regimes. Palantir – founded by the far-right Trump bagman Peter Thiel – takes its name from the evil wizard Sauron's all-seeing orb in Lord of the Rings ("Sauron, are we the baddies?"):
https://pluralistic.net/2022/10/01/the-palantir-will-see-you-now/#public-private-partnership
The argument for turning over Britons' most sensitive personal data to an offshore war-crimes company is "there is no alternative." The UK needs the medical insights in those NHS records, and this is the only way to get at them.
As with every instance of "there is no alternative," this turns out to be a lie. What's more, the alternative is vastly superior to this chumocratic sell-out, was Made in Britain, and is the envy of medical researchers the world 'round. That alternative is "trusted research environments." In a new article for the Good Law Project, I describe these nigh-miraculous tools for privacy-preserving, best-of-breed medical research:
https://goodlawproject.org/cory-doctorow-health-data-it-isnt-just-palantir-or-bust/
At the outset of the covid pandemic Oxford's Ben Goldacre and his colleagues set out to perform realtime analysis of the data flooding into NHS trusts up and down the country, in order to learn more about this new disease. To do so, they created Opensafely, an open-source database that was tied into each NHS trust's own patient record systems:
https://timharford.com/2022/07/how-to-save-more-lives-and-avoid-a-privacy-apocalypse/
Opensafely has its own database query language, built on SQL, but tailored to medical research. Researchers write programs in this language to extract aggregate data from each NHS trust's servers, posing medical questions of the data without ever directly touching it. These programs are published in advance on a git server, and are preflighted on synthetic NHS data on a test server. Once the program is approved, it is sent to the main Opensafely server, which then farms out parts of the query to each NHS trust, packages up the results, and publishes them to a public repository.
This is better than "the best of both worlds." This public scientific process, with peer review and disclosure built in, allows for frequent, complex analysis of NHS data without giving a single third party access to a a single patient record, ever. Opensafely was wildly successful: in just months, Opensafely collaborators published sixty blockbuster papers in Nature – science that shaped the world's response to the pandemic.
Opensafely was so successful that the Secretary of State for Health and Social Care commissioned a review of the programme with an eye to expanding it to serve as the nation's default way of conducting research on medical data:
https://www.gov.uk/government/publications/better-broader-safer-using-health-data-for-research-and-analysis/better-broader-safer-using-health-data-for-research-and-analysis
This approach is cheaper, safer, and more effective than handing hundreds of millions of pounds to Palantir and hoping they will manage the impossible: anonymising data well enough that it is never re-identified. Trusted Research Environments have been endorsed by national associations of doctors and researchers as the superior alternative to giving the NHS's data to Peter Thiel or any other sharp operator seeking a public contract.
As a lifelong privacy campaigner, I find this approach nothing short of inspiring. I would love for there to be a way for publishers and researchers to glean privacy-preserving insights from public library checkouts (such a system would prove an important counter to Amazon's proprietary god's-eye view of reading habits); or BBC podcasts or streaming video viewership.
You see, there is an alternative. We don't have to choose between science and privacy, or the public interest and private gain. There's always an alternative – if there wasn't, the other side wouldn't have to continuously repeat the lie that no alternative is possible.
Tumblr media
Name your price for 18 of my DRM-free ebooks and support the Electronic Frontier Foundation with the Humble Cory Doctorow Bundle.
Tumblr media
If you'd like an essay-formatted version of this post to read or share, here's a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:
https://pluralistic.net/2024/03/08/the-fire-of-orodruin/#are-we-the-baddies
Tumblr media
Image: Gage Skidmore (modified) https://commons.m.wikimedia.org/wiki/File:Peter_Thiel_(51876933345).jpg
CC BY-SA 2.0 https://creativecommons.org/licenses/by-sa/2.0/deed.en
516 notes · View notes
badassbishova · 2 months
Text
Tumblr media Tumblr media
Who else is in a chokehold because of this woman? 🙋🏼‍♀️🥵
345 notes · View notes
pincushionsam · 6 months
Text
Tumblr media Tumblr media Tumblr media
Venus Fly Trap 🪴
I made this one for a friend the other week and absolutely love how it turned out! I also hand sewed a little patch with a happy birthday so it would always be there! 💕
Pattern: @sirithre (https://www.etsy.com/listing/1291762632/)
187 notes · View notes
nixcraft · 6 months
Text
FLOSS dev rocks :)
Tumblr media
169 notes · View notes
wojakgallery · 15 days
Text
Tumblr media
Title/Name: Zoomer Quick Floss Gif Wojak Series: Zoomer (Variant) Image by: Unknown Main Tag: Zoomer Gif Wojak
26 notes · View notes
rat-at-heart · 3 months
Text
Tumblr media
Was wondering if you possibly had some extra floss on you (he just had leftover hyena)
29 notes · View notes
Text
Tumblr media
22 notes · View notes
j4ckme · 1 month
Text
Tumblr media
22 notes · View notes
very-gay-alkyrion · 8 months
Text
:wq and Rest in Peace to Bram Moolenaar, the creator of vim, and Open-Source heavyweight.
55 notes · View notes
Photo
Tumblr media
(via GIPHY)
154 notes · View notes
thecomicoalman · 9 months
Text
Tumblr media
Floss!!
63 notes · View notes
Text
"Open" "AI" isn’t
Tumblr media
Tomorrow (19 Aug), I'm appearing at the San Diego Union-Tribune Festival of Books. I'm on a 2:30PM panel called "Return From Retirement," followed by a signing:
https://www.sandiegouniontribune.com/festivalofbooks
Tumblr media
The crybabies who freak out about The Communist Manifesto appearing on university curriculum clearly never read it – chapter one is basically a long hymn to capitalism's flexibility and inventiveness, its ability to change form and adapt itself to everything the world throws at it and come out on top:
https://www.marxists.org/archive/marx/works/1848/communist-manifesto/ch01.htm#007
Today, leftists signal this protean capacity of capital with the -washing suffix: greenwashing, genderwashing, queerwashing, wokewashing – all the ways capital cloaks itself in liberatory, progressive values, while still serving as a force for extraction, exploitation, and political corruption.
A smart capitalist is someone who, sensing the outrage at a world run by 150 old white guys in boardrooms, proposes replacing half of them with women, queers, and people of color. This is a superficial maneuver, sure, but it's an incredibly effective one.
In "Open (For Business): Big Tech, Concentrated Power, and the Political Economy of Open AI," a new working paper, Meredith Whittaker, David Gray Widder and Sarah B Myers document a new kind of -washing: openwashing:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4543807
Openwashing is the trick that large "AI" companies use to evade regulation and neutralizing critics, by casting themselves as forces of ethical capitalism, committed to the virtue of openness. No one should be surprised to learn that the products of the "open" wing of an industry whose products are neither "artificial," nor "intelligent," are also not "open." Every word AI huxters say is a lie; including "and," and "the."
So what work does the "open" in "open AI" do? "Open" here is supposed to invoke the "open" in "open source," a movement that emphasizes a software development methodology that promotes code transparency, reusability and extensibility, which are three important virtues.
But "open source" itself is an offshoot of a more foundational movement, the Free Software movement, whose goal is to promote freedom, and whose method is openness. The point of software freedom was technological self-determination, the right of technology users to decide not just what their technology does, but who it does it to and who it does it for:
https://locusmag.com/2022/01/cory-doctorow-science-fiction-is-a-luddite-literature/
The open source split from free software was ostensibly driven by the need to reassure investors and businesspeople so they would join the movement. The "free" in free software is (deliberately) ambiguous, a bit of wordplay that sometimes misleads people into thinking it means "Free as in Beer" when really it means "Free as in Speech" (in Romance languages, these distinctions are captured by translating "free" as "libre" rather than "gratis").
The idea behind open source was to rebrand free software in a less ambiguous – and more instrumental – package that stressed cost-savings and software quality, as well as "ecosystem benefits" from a co-operative form of development that recruited tinkerers, independents, and rivals to contribute to a robust infrastructural commons.
But "open" doesn't merely resolve the linguistic ambiguity of libre vs gratis – it does so by removing the "liberty" from "libre," the "freedom" from "free." "Open" changes the pole-star that movement participants follow as they set their course. Rather than asking "Which course of action makes us more free?" they ask, "Which course of action makes our software better?"
Thus, by dribs and drabs, the freedom leeches out of openness. Today's tech giants have mobilized "open" to create a two-tier system: the largest tech firms enjoy broad freedom themselves – they alone get to decide how their software stack is configured. But for all of us who rely on that (increasingly unavoidable) software stack, all we have is "open": the ability to peer inside that software and see how it works, and perhaps suggest improvements to it:
https://www.youtube.com/watch?v=vBknF2yUZZ8
In the Big Tech internet, it's freedom for them, openness for us. "Openness" – transparency, reusability and extensibility – is valuable, but it shouldn't be mistaken for technological self-determination. As the tech sector becomes ever-more concentrated, the limits of openness become more apparent.
But even by those standards, the openness of "open AI" is thin gruel indeed (that goes triple for the company that calls itself "OpenAI," which is a particularly egregious openwasher).
The paper's authors start by suggesting that the "open" in "open AI" is meant to imply that an "open AI" can be scratch-built by competitors (or even hobbyists), but that this isn't true. Not only is the material that "open AI" companies publish insufficient for reproducing their products, even if those gaps were plugged, the resource burden required to do so is so intense that only the largest companies could do so.
Beyond this, the "open" parts of "open AI" are insufficient for achieving the other claimed benefits of "open AI": they don't promote auditing, or safety, or competition. Indeed, they often cut against these goals.
"Open AI" is a wordgame that exploits the malleability of "open," but also the ambiguity of the term "AI": "a grab bag of approaches, not… a technical term of art, but more … marketing and a signifier of aspirations." Hitching this vague term to "open" creates all kinds of bait-and-switch opportunities.
That's how you get Meta claiming that LLaMa2 is "open source," despite being licensed in a way that is absolutely incompatible with any widely accepted definition of the term:
https://blog.opensource.org/metas-llama-2-license-is-not-open-source/
LLaMa-2 is a particularly egregious openwashing example, but there are plenty of other ways that "open" is misleadingly applied to AI: sometimes it means you can see the source code, sometimes that you can see the training data, and sometimes that you can tune a model, all to different degrees, alone and in combination.
But even the most "open" systems can't be independently replicated, due to raw computing requirements. This isn't the fault of the AI industry – the computational intensity is a fact, not a choice – but when the AI industry claims that "open" will "democratize" AI, they are hiding the ball. People who hear these "democratization" claims (especially policymakers) are thinking about entrepreneurial kids in garages, but unless these kids have access to multi-billion-dollar data centers, they can't be "disruptors" who topple tech giants with cool new ideas. At best, they can hope to pay rent to those giants for access to their compute grids, in order to create products and services at the margin that rely on existing products, rather than displacing them.
The "open" story, with its claims of democratization, is an especially important one in the context of regulation. In Europe, where a variety of AI regulations have been proposed, the AI industry has co-opted the open source movement's hard-won narrative battles about the harms of ill-considered regulation.
For open source (and free software) advocates, many tech regulations aimed at taming large, abusive companies – such as requirements to surveil and control users to extinguish toxic behavior – wreak collateral damage on the free, open, user-centric systems that we see as superior alternatives to Big Tech. This leads to the paradoxical effect of passing regulation to "punish" Big Tech that end up simply shaving an infinitesimal percentage off the giants' profits, while destroying the small co-ops, nonprofits and startups before they can grow to be a viable alternative.
The years-long fight to get regulators to understand this risk has been waged by principled actors working for subsistence nonprofit wages or for free, and now the AI industry is capitalizing on lawmakers' hard-won consideration for collateral damage by claiming to be "open AI" and thus vulnerable to overbroad regulation.
But the "open" projects that lawmakers have been coached to value are precious because they deliver a level playing field, competition, innovation and democratization – all things that "open AI" fails to deliver. The regulations the AI industry is fighting also don't necessarily implicate the speech implications that are core to protecting free software:
https://www.eff.org/deeplinks/2015/04/remembering-case-established-code-speech
Just think about LLaMa-2. You can download it for free, along with the model weights it relies on – but not detailed specs for the data that was used in its training. And the source-code is licensed under a homebrewed license cooked up by Meta's lawyers, a license that only glancingly resembles anything from the Open Source Definition:
https://opensource.org/osd/
Core to Big Tech companies' "open AI" offerings are tools, like Meta's PyTorch and Google's TensorFlow. These tools are indeed "open source," licensed under real OSS terms. But they are designed and maintained by the companies that sponsor them, and optimize for the proprietary back-ends each company offers in its own cloud. When programmers train themselves to develop in these environments, they are gaining expertise in adding value to a monopolist's ecosystem, locking themselves in with their own expertise. This a classic example of software freedom for tech giants and open source for the rest of us.
One way to understand how "open" can produce a lock-in that "free" might prevent is to think of Android: Android is an open platform in the sense that its sourcecode is freely licensed, but the existence of Android doesn't make it any easier to challenge the mobile OS duopoly with a new mobile OS; nor does it make it easier to switch from Android to iOS and vice versa.
Another example: MongoDB, a free/open database tool that was adopted by Amazon, which subsequently forked the codebase and tuning it to work on their proprietary cloud infrastructure.
The value of open tooling as a stickytrap for creating a pool of developers who end up as sharecroppers who are glued to a specific company's closed infrastructure is well-understood and openly acknowledged by "open AI" companies. Zuckerberg boasts about how PyTorch ropes developers into Meta's stack, "when there are opportunities to make integrations with products, [so] it’s much easier to make sure that developers and other folks are compatible with the things that we need in the way that our systems work."
Tooling is a relatively obscure issue, primarily debated by developers. A much broader debate has raged over training data – how it is acquired, labeled, sorted and used. Many of the biggest "open AI" companies are totally opaque when it comes to training data. Google and OpenAI won't even say how many pieces of data went into their models' training – let alone which data they used.
Other "open AI" companies use publicly available datasets like the Pile and CommonCrawl. But you can't replicate their models by shoveling these datasets into an algorithm. Each one has to be groomed – labeled, sorted, de-duplicated, and otherwise filtered. Many "open" models merge these datasets with other, proprietary sets, in varying (and secret) proportions.
Quality filtering and labeling for training data is incredibly expensive and labor-intensive, and involves some of the most exploitative and traumatizing clickwork in the world, as poorly paid workers in the Global South make pennies for reviewing data that includes graphic violence, rape, and gore.
Not only is the product of this "data pipeline" kept a secret by "open" companies, the very nature of the pipeline is likewise cloaked in mystery, in order to obscure the exploitative labor relations it embodies (the joke that "AI" stands for "absent Indians" comes out of the South Asian clickwork industry).
The most common "open" in "open AI" is a model that arrives built and trained, which is "open" in the sense that end-users can "fine-tune" it – usually while running it on the manufacturer's own proprietary cloud hardware, under that company's supervision and surveillance. These tunable models are undocumented blobs, not the rigorously peer-reviewed transparent tools celebrated by the open source movement.
If "open" was a way to transform "free software" from an ethical proposition to an efficient methodology for developing high-quality software; then "open AI" is a way to transform "open source" into a rent-extracting black box.
Some "open AI" has slipped out of the corporate silo. Meta's LLaMa was leaked by early testers, republished on 4chan, and is now in the wild. Some exciting stuff has emerged from this, but despite this work happening outside of Meta's control, it is not without benefits to Meta. As an infamous leaked Google memo explains:
Paradoxically, the one clear winner in all of this is Meta. Because the leaked model was theirs, they have effectively garnered an entire planet's worth of free labor. Since most open source innovation is happening on top of their architecture, there is nothing stopping them from directly incorporating it into their products.
https://www.searchenginejournal.com/leaked-google-memo-admits-defeat-by-open-source-ai/486290/
Thus, "open AI" is best understood as "as free product development" for large, well-capitalized AI companies, conducted by tinkerers who will not be able to escape these giants' proprietary compute silos and opaque training corpuses, and whose work product is guaranteed to be compatible with the giants' own systems.
The instrumental story about the virtues of "open" often invoke auditability: the fact that anyone can look at the source code makes it easier for bugs to be identified. But as open source projects have learned the hard way, the fact that anyone can audit your widely used, high-stakes code doesn't mean that anyone will.
The Heartbleed vulnerability in OpenSSL was a wake-up call for the open source movement – a bug that endangered every secure webserver connection in the world, which had hidden in plain sight for years. The result was an admirable and successful effort to build institutions whose job it is to actually make use of open source transparency to conduct regular, deep, systemic audits.
In other words, "open" is a necessary, but insufficient, precondition for auditing. But when the "open AI" movement touts its "safety" thanks to its "auditability," it fails to describe any steps it is taking to replicate these auditing institutions – how they'll be constituted, funded and directed. The story starts and ends with "transparency" and then makes the unjustifiable leap to "safety," without any intermediate steps about how the one will turn into the other.
It's a Magic Underpants Gnome story, in other words:
Step One: Transparency
Step Two: ??
Step Three: Safety
https://www.youtube.com/watch?v=a5ih_TQWqCA
Meanwhile, OpenAI itself has gone on record as objecting to "burdensome mechanisms like licenses or audits" as an impediment to "innovation" – all the while arguing that these "burdensome mechanisms" should be mandatory for rival offerings that are more advanced than its own. To call this a "transparent ruse" is to do violence to good, hardworking transparent ruses all the world over:
https://openai.com/blog/governance-of-superintelligence
Some "open AI" is much more open than the industry dominating offerings. There's EleutherAI, a donor-supported nonprofit whose model comes with documentation and code, licensed Apache 2.0. There are also some smaller academic offerings: Vicuna (UCSD/CMU/Berkeley); Koala (Berkeley) and Alpaca (Stanford).
These are indeed more open (though Alpaca – which ran on a laptop – had to be withdrawn because it "hallucinated" so profusely). But to the extent that the "open AI" movement invokes (or cares about) these projects, it is in order to brandish them before hostile policymakers and say, "Won't someone please think of the academics?" These are the poster children for proposals like exempting AI from antitrust enforcement, but they're not significant players in the "open AI" industry, nor are they likely to be for so long as the largest companies are running the show:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4493900
Tumblr media Tumblr media
I'm kickstarting the audiobook for "The Internet Con: How To Seize the Means of Computation," a Big Tech disassembly manual to disenshittify the web and make a new, good internet to succeed the old, good internet. It's a DRM-free book, which means Audible won't carry it, so this crowdfunder is essential. Back now to get the audio, Verso hardcover and ebook:
http://seizethemeansofcomputation.org
Tumblr media
If you'd like an essay-formatted version of this post to read or share, here's a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:
https://pluralistic.net/2023/08/18/openwashing/#you-keep-using-that-word-i-do-not-think-it-means-what-you-think-it-means
Tumblr media
Image: Cryteria (modified) https://commons.wikimedia.org/wiki/File:HAL9000.svg
CC BY 3.0 https://creativecommons.org/licenses/by/3.0/deed.en
246 notes · View notes
Text
20 notes · View notes
thackeroy · 20 days
Text
Tumblr media
Finally got around to putting the last stitches into this cute little cake I stitched up as part of celebrating @chelystitch's birthday last year!
15 notes · View notes
captain-acab · 4 months
Note
Do you have a more up to date link for a hacked spotify for android? I found your post from 2022 but the link doesn't have a version that's up to date with the version of spotify I have.
Thanks 💚
Not exactly. I actually don't use the hacked Spotify app anymore. I prefer Spotube, a free open-source app that interfaces directly with the Spotify web API. It also let's you download songs and albums directly, which the hacked app didn't do!
Overall Spotube is more stable, trustworthy (because the code source is open and community-audited), and will never be at risk Spotify patching the hack or disabling your account for violating terms of service, because it uses 100% legal software.
(P.S. the first link leads to F-Droid, an alternative app store for free, community-driven, open-source apps. That's where you can also download NewPipe, my favorite ad-free youtube app.)
21 notes · View notes
david-boomstick · 1 year
Video
"Ash Williams is not the same person since he returned from Fortnite"
160 notes · View notes