
What on Earth is Open Source AI?

I want to talk about a recent conversation on the Open Source AI definition, but before that I want to make an acknowledgement. My position on the uptake of “AI” is that it is morally unconscionable, short-sighted, and frankly, just stupid. In a time of snowballing climate crisis and impending environmental doom, not only are we diverting limited resources away from climate justice, we’re routing them to contribute to the crisis.

Not only that, the utility and societal relevance of LLMs and neural networks have been vastly overstated. They perform consistently worse than traditional computing and people doing the same jobs, and are advertised to replace jobs and professions that don’t need replacing. Furthermore, we’ve been assaulted with a PR campaign of highly polished plagiarizing mechanical turks that hide the human labor involved and shift the costs in a way that furthers wealth inequality, and we’ve been promised that they will only get better (are they? And better for whom?).

However, since the world seems to have lost the plot, and until all the data centers are under sea water, some of us have to engage with “AI” seriously, whether to do some unintentional whitewashing under the illusion of driving the conversation, or for much needed harm reduction work, or simply for good old fashioned opportunism.

The modern tale of machine learning is intertwined with openwashing, where companies try to mislead consumers by associating their products with open source without actually being open or transparent. Within that context, and as legislation comes for “AI”, it makes sense that an organization like the Open Source Initiative (OSI) would try and establish a definition of what constitutes Open Source “AI”. It’s certainly not an easy task to take on.

The conversation that I would like to bring to your attention was started by Julia Ferraioli in this thread (noting that the thread got a bit large, so the weekly summaries posted by Mia Lykou Lund might be easier to follow). Julia argues that a definition of Open Source “AI” that doesn’t include the data used for training the model cannot be considered open source. The current draft lists those data as optional.

Stefano Maffulli published an opinion piece explaining the position of those who want to keep training data optional. I’ve tried to stay abreast of the conversations, but there have been a lot of takes and a lot of platforms where these conversations are happening, so I will limit my take to that recently published piece.

Reading through it, I’m personally not convinced, and I fully support the position that Julia outlined in the original thread. I don’t dismiss the concerns that Stefano raised wholesale, but ultimately they are not compelling. Fragmented global data regulations and compliance challenges aren’t unique to Open Source “AI”, and should be addressed at that level to enable openness on a global scale.

Fundamentally, it comes down to this: Stefano argues that this open data requirement would put “Open Source at a disadvantage compared to opaque and proprietary AI systems.” Well, if the price of making Open Source “AI” competitive with proprietary “AI” is to break the openness that is fundamental to the definition, then why are we doing it? Is this about protecting Open Source from openwashing or accidentally enabling it because the right thing is hard to do? And when has Open Source not been at a disadvantage to proprietary systems?

I understand that OSI is navigating a complicated topic and trying to come up with an alternative that pleases everyone, but the longer this conversation goes on, the clearer it becomes that at some point a line needs to be drawn, and OSI has to decide which side of the line it wants to be on.

EDIT (June 15th, 17:20 CET): I may be a bit behind on this. I just read a post by Tom Callaway from two weeks ago that makes lots of the same points much more eloquently and goes deeper into it. I highly recommend reading that.

Hello W- nah just messing with you 🤣

It’s been a long time since my last blog post, and it feels so fucking good. While it does feel so incredibly good to be writing again, there is something so unfamiliar about my relationship to this space, my blog, and the internet in general. Which leads us to the first question I will answer today:

Where did all the old blog posts go?

They’re all happy and alive, frolicking in a server farm far far away. In reality, the internet has changed, and so have I. In fact, the internet I used to write about never existed in the first place. It was fiction, almost naive fiction, presented as reality, and as we know, reality shows never age well.

I had to take the archive down because I couldn’t draw a line between the person I was in the 2010s and the story I want to tell now. They’re not purged. I want to curate a few of them and present them within context when I have the time, but until then, the only way to access them would be a web archive or something.

Story you want to tell?

Yes, that’s what blogging is, you silly pants! I’m just in a very interesting period of my life, in a very interesting period of time, and both I and time are in a very interesting position. I’ve just left OTF after a very interesting five years of supporting people who build great tools to save those most vulnerable online, and now I’ve joined Techcultivation and am looking to do more of that and beyond. Not to mention great projects being set up like the SVT, which I really want to tell you about. Those are all stories, from the past, the present and the future, that I want to tell.

That I need to tell really.

Surviving a World in Crisis

[Image: Ron Burgundy saying "Well, that escalated quickly"]

Not gonna sugar coat it folks, since the last time I wrote a blog post, things have been rapidly becoming shittier. It was partially why I stopped. I called my older posts “almost naive” earlier, and they totally were. I’ve been disillusioned for as long as I can remember, and angry for even longer than that. I’ve also been tired. But one side effect of the disillusionment was that it made me feel embarrassed by the naive fiction I used to peddle pre-2016.

I will not belabor the point today. I’ll keep that for later blog posts, but here is why I’m writing again. Was I wrong about things in the past? Yeah, I was. Was I naive? Almost adorably so. Did my politics evolve since then? I hope so. Is there a danger of me spewing more naive fiction that I might be embarrassed about in the future? Well, that’s actually my plan, and it’s almost crazy enough that it might work.

When times are hard, do something. If it works, do it some more. If it does not work, do something else. But keep going.

Audre Lorde

Not writing has not been working for me. Writing things that turned out to be naive worked for me at the time. Crises robbed us of our imagination. But we don’t all have the luxury or privilege of being doom preppers or nihilists. Just as the climate crisis will hit the poor, the queer and those in the larger world first, it will come for their imaginations first.

I want to write again, and maybe encourage you all to start blogging again, because we need to save our imagination. It’s the only way we can keep going. So expect more wonderful stories on this website, both the ones I promised above and more, about how we’re gonna get through this and make things better.