
The Future is Meaningless and I Hate It

I graduated as a Computer Engineer in the late 2000s, and at that time I was convinced that the future would be so full of meaning, almost literally. Yup, I’m talking about the “Semantic Web,” for those who remember. It was the big thing on everyone’s minds while machine learning was but a murmur. The Semantic Web was the original promise of digital utopia where everything would interconnect, where information would actually understand us, and where asking a question didn’t just get you a vague answer but actual insight.

The Semantic Web knew that “apple” could mean both a fruit and an overbearing tech company, and it would parse out which one you meant based on **technology**. I was so excited for that that even my university graduation project was a semantic web engine. I remember the thrill when I indexed 1/8 of Wikipedia, and my mind was blown when a search for Knafeh gave Nablus in the results (sorry, Damascenes).

And now here we are in 2024, and all of that feels like a hazy dream. What we got instead was a sea of copyright-stealing, forest-burning AI models playing guessing games with us and using math to cheat. And we are satisfied enough by that to call it intelligence.

When Tim Berners-Lee and other boffins imagined the Semantic Web, they weren’t just imagining smarter search engines. They were talking about a leap in internet intelligence. Metadata, relationships, ontologies—the whole idea was that data would be tagged, organized, and woven together in a way that was actually meaningful. The Semantic Web wouldn’t just return information; it would actually deliver understanding, relevance, context.
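For anyone who never touched this stuff, here is roughly what “tagged, organized, and woven together” looks like in practice. This is only a minimal sketch in plain Python, not real RDF or OWL tooling: the triples, the predicate names (`label`, `is_a`, `associated_with`) and the helper functions are all made up for illustration.

```python
# Toy knowledge graph: every fact is a (subject, predicate, object) triple.
# All identifiers and predicates below are hypothetical, for illustration only.
TRIPLES = {
    ("apple_fruit", "label", "apple"),
    ("apple_fruit", "is_a", "Fruit"),
    ("apple_inc", "label", "apple"),
    ("apple_inc", "is_a", "Company"),
    ("knafeh", "label", "knafeh"),
    ("knafeh", "is_a", "Dessert"),
    ("knafeh", "associated_with", "nablus"),
    ("nablus", "is_a", "City"),
}


def match(subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in TRIPLES
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )


def resolve(label, expected_type):
    """Find the entities carrying `label` that are also of `expected_type`."""
    candidates = [s for (s, _, _) in match(predicate="label", obj=label)]
    return [s for s in candidates if match(s, "is_a", expected_type)]


def related(entity):
    """Follow every outgoing edge of an entity except its label."""
    return [(p, o) for (_, p, o) in match(subject=entity) if p != "label"]


if __name__ == "__main__":
    print(resolve("apple", "Company"))  # the tech company, not the fruit
    print(related("knafeh"))            # typed edges, including the link to nablus
```

The point of the sketch is that the answer comes from explicit, typed relationships that somebody wrote down, so “apple the company” and “apple the fruit” are different things by construction rather than by guesswork.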

What did we end up with instead? A patchwork of services where context doesn’t matter and connections are shallow. Our web today is just brute-force AI models parsing keywords, throwing probability-based answers at us, or trying to convince us that paraphrasing a Wikipedia entry qualifies as “knowing” something. Everything about this feels cheap and brutish and offensive to my information science sensibilities. And what’s worse, our overlords have decreed that this is our future.

Nothing illustrates this madness more than Google Jarvis and Microsoft Copilot. These multi-billion dollar companies, which can build whatever the hell they want, decided to take OCR technology (aka converting screenshots into text) and pipe that text into a large language model, which produces a plausible-sounding response by stitching together bits and pieces of language patterns it’s seen before. Wow.

It’s the stupid leading the stupid. OCR sees shapes, patterns, guesses at letters, and spits out words. It has no idea what any of those words mean. It doesn’t know what the text is about, only that it can recognize it. Then it throws that text to an LLM, which doesn’t see words either; it only knows tokens. The LLM takes a couple of plausible guesses and throws something out. The whole system is built on probability, not meaning.
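To make “tokens and probability” concrete, here is a deliberately tiny sketch. It is a toy bigram counter over a made-up corpus, nowhere near how a real LLM is built (those use subword tokenizers and neural networks), and every name in it is hypothetical. The underlying move is the same, though: turn text into integer IDs, then pick whatever usually comes next.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model trains on vastly more text.
CORPUS = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."


def tokenize(text):
    """Map each whitespace-separated word to an integer ID."""
    vocab = {}
    ids = [vocab.setdefault(word, len(vocab)) for word in text.split()]
    return ids, vocab


def train_bigrams(ids):
    """Count how often each token ID is followed by each other token ID."""
    counts = defaultdict(Counter)
    for current, following in zip(ids, ids[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, token_id):
    """Return the most frequent follower and its estimated probability."""
    followers = counts[token_id]
    best_id, best_count = followers.most_common(1)[0]
    return best_id, best_count / sum(followers.values())


if __name__ == "__main__":
    ids, vocab = tokenize(CORPUS)
    counts = train_bigrams(ids)
    words = {token_id: word for word, token_id in vocab.items()}
    best_id, probability = predict_next(counts, vocab["the"])
    print(words[best_id], round(probability, 2))
```

Nothing in there knows what a cat is. It only knows that one ID tends to follow another.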

It’s a cheap workaround that gets us “answers” without comprehension, without accuracy, without depth. The big tech giants, armed with all the data, money and computing power, have decided that brute force is good enough. So, instead of meaningful insights, we’re getting quick-fix solutions that barely scratch the surface of what we need. And to afford it we’ll need to bring defunct nuclear plants back online.

But how did we get here? Because let’s be real—brute force is easy, relatively fast, and profitable for someone I’m sure. AI does have some good applications. Let’s say you don’t want to let people into your country but don’t want to be overtly racist about it. Obfuscate that racism behind statistics!

Deep learning models don’t need carefully tagged, structured data because they don’t really need to be accurate, just accurate enough to convince us that they are accurate sometimes. And for that measly goal, all they need is a lot of data and enough computing power to grind through it. Why go through the hassle of creating an interconnected web of meaning when you can throw rainforests and terabytes of text at the problem and get results that look good enough?

I know this isn’t fair to the folks currently working on Semantic Web stuff, but it’s fair to say that as a society, we have essentially given up on the arduous, meticulous work of building a true Semantic Web because we got something else now. But we didn’t get meaning, we got approximation. We got endless regurgitation, shallow summarization, probability over purpose. And because humans are inherently terrible at understanding math, and because we overestimate the uniqueness of the human condition, we let those statistical echoes of human outputs bluff their way into our trust.

It’s hard not to feel like I’ve been conned. I used to be excited about technology. The internet could have become a universe of intelligence, but what I have to look forward to now is just an endless AI centipede of meaningless content and recycled text. We’re settling for that because, I dunno, it kinda works and there’s lots of money in it? Don’t these fools see that we’re giving up something truly profound? An internet that truly connects, informs, and understands us, a meaningful internet, is just drifting out of reach.

But it’s gonna be fine, because instead of protecting Open Source from AI, some people decided it’s wiser to open-wash it instead. Thanks, I hate it. I hate all of it.

I Was Wrong About the Open Source Bubble

This is a follow-up to my previous post, “Is the Open Source Bubble about to Burst?”, where I discussed some factors indicating an imbalance in the open source ecosystem. I was very happy to see some of the engagement with the blog post, even if some people seemed like they didn’t read past the title and were offended by my characterizing open source as a bubble, or assumed that simply because I’m talking about the current state of FOSS, or how some companies use it, this somehow reflects my position on free software vs. open source.

Now, I wasn’t even the first or only person to suggest an Open Source bubble might exist. The first mention of the concept that I could find was by Simon Phipps, similarly asking “Is the Open Source bubble over?” all the way back in 2010, and I believe it’s an insightful framing for the time that we see culminate in all the pressures I alluded to in my post.

The second mention I could find is from Baldur Bjarnason, who wrote about Open Source Software and compared it to the blogging bubble. It’s a great blog post, and Baldur even wrote a newer article in response to mine talking about “Open Source surplus”, which is a framing I like a lot. I would recommend reading both. I’m very thankful for the thoughtful article.

Last week as well, Elastic announced it’s returning to open source, reversing one of the trends I talked about. Obviously, they didn’t want to admit they were wrong, saying it was the right move at the time. I have some thoughts about that, but I’ll keep them to myself. If that’s the excuse they need to tell themselves to end up open source again, then I won’t look a gift horse in the mouth. I hope more “source-open” projects follow.

Finally, the article was mentioned in my least favorite tech tabloid, The Register. Needless to say, there aren’t and won’t be any open source AI wars, since there won’t be AI to worry about soon. An industry that is losing billions of dollars a year and is so energy intensive that it would accelerate our climate doom won’t last. OSI has a decision to make: either protect the open source definition and their reputation, or risk both.

P.S. I will continue to ignore any AI copium so save us both some time.

Is the Open Source Bubble about to Burst?

(EDIT: I wrote an update here.)

I want to start by making one thing clear: I’m not comparing open source software to typical Gartneresque tech hype bubbles like the metaverse or blockchain. FOSS as both a movement and as an industry has long standing roots and has established itself as a critical part of our digital world and is part of a wider movement based on values of collaboration and openness.

So it’s not a hype bubble, but it’s still a “real bubble” of sorts in terms of the adoption of open source and our reliance on it. GitHub, which hosts many open source projects, has been consistently reporting around 2 million first-time contributors to OSS each year since 2021, and the number is trending upwards. Harvard Business School has estimated in a recent working paper that the supply-side value of widely used OSS, meaning the cost to recreate it once, is 4.15 billion USD.

There are far more examples out there but you see the point. We’re increasingly relying on OSS, but the underlying conditions of how OSS is produced have not fundamentally changed, and that is not sustainable. Furthermore, just as open source becomes more valuable itself, for lack of a better word, the brand of “open source” starts to have its own economic value and may attract attention from parties that aren’t necessarily interested in the values of openness and collaboration that were fundamental to its success.

I want to talk about three examples I see of cracks that are starting to form which signal big challenges in the future of OSS.

1. The “Open Source AI” Definition

I’m not very invested in AI, and I’m convinced it’s on its way out. Big Tech is already losing money over their gambles on it and it won’t be long till it’s gone the way of the dodo and the blockchain. I am very invested in open source, however, and I worry that the debate over the open-source AI definition will have a lasting negative impact on OSS.

A system that can only be built on proprietary data can only be proprietary. It doesn’t get simpler than this self-evident axiom. I’ve talked at length about this debate here, but since I wrote that, OSI has released a new draft of the definition. Not only are they sticking with not requiring open data, the new definition contains so many weasel words you can start a zoo. Words like:

  • “sufficiently detailed information about the data”
  • “skilled person”
  • “substantially equivalent system”

These words provide a barn-sized backdoor for what are essentially proprietary AI systems to call themselves open source.

I appreciate the community-driven process OSI is adopting, and there are things about the definition that I like. If only it weren’t called “open source AI”. If it was called anything else, it might still be useful, but the fact that it associates itself with open source is the issue.

It erodes the fundamental values of what makes open source what it is to users: the freedom to study, modify, run and distribute software as they see fit. AI might go silently into the night but this harm to the definition of open source will stay forever.

2. The Rise of “Source-Available” Licenses

Another concerning trend is the rise of so-called “source-available” licenses. I will go into depth on this in a later article, but the gist of it is this. Open source software doesn’t just mean that you get to see the source code in addition to the software. It’s well agreed that for software to qualify as open source or free software, one should be able to use, study, modify and distribute it as they see fit. That also means that the source is available for free and open source software.

But “source-available” licenses are licenses that may allow some of these freedoms, but have additional restrictions disqualifying them from being open source. These licenses have existed in some form since the early 2000s, but recently we’ve seen a lot of high-profile, formerly open source projects switch to these restrictive licenses, from MongoDB and Elasticsearch adopting the Server Side Public License (SSPL) in 2018 and 2021 respectively, to Terraform, Neo4J and Sentry adopting similar licenses just last year.

I will go into more depth in a future article on why they have made these choices, but for the purposes of this article, these licenses are harmful to FOSS not only because they create even more fragmentation, but also because they cause confusion about what is or isn’t open source, further eroding the underlying freedoms and values.

3. The EU’s Cut to Open Source Funding

Perhaps one of the most troubling developments is the recent decision by the European Commission to cut funding for the Next Generation Internet (NGI) initiative. The NGI initiative supported the creation and development of many open source projects that wouldn’t exist without this funding, such as decentralized solutions, privacy-enhancing technologies, and open-source software that counteract the centralization and control of the web by large tech corporations.

The decision to cancel its funding is a stark reminder that despite all the good news, the FOSS ecosystem is still very fragile and reliant on external support. Programs like NGI not only provide vital funding, but also resources, and guidance to incubate newer projects or help longer standing ones become established. This support is essential for maintaining a healthy ecosystem in the public interest.

It’s troubling to lose some critical funding when the existing funding is already not enough. This long-term undersupply has already plagued the FOSS community with many challenges that it struggles with to this day. FOSS projects find it difficult to attract and retain skilled developers, implement security updates, and introduce new features, which can ultimately compromise their relevance and adoption.

Additionally, a lack of support can lead to burnout among maintainers, who often juggle multiple roles with insufficient compensation or none at all. This creates a precarious situation where essential software that underpins much of the digital infrastructure is at risk or may be replaced by proprietary alternatives.

And if you don’t think that’s bad, I want to refer to that Harvard Business School study from earlier: while the estimated value of FOSS to the economy is around 4.15 billion USD, the cost to replace all this software we rely upon is 8.8 trillion USD. A 25 million investment into that ecosystem seems like a no-brainer to me. I think it’s insane that the EC is cutting this funding.
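If you want to see how lopsided those numbers are, here is the back-of-the-envelope math as a quick Python sketch. It only uses the three figures quoted above. The one assumption is that the 25 million is in a currency roughly comparable to USD, since no currency is stated for it.

```python
# Figures as quoted in this post; the variable names are just for illustration.
VALUE_USD = 4.15e9              # estimated value of FOSS
REPLACEMENT_COST_USD = 8.8e12   # estimated cost to replace it
INVESTMENT = 25e6               # funding amount; currency assumed comparable to USD

# How many times larger the replacement cost is than the estimated value.
ratio = REPLACEMENT_COST_USD / VALUE_USD

# The investment as a share of each figure.
share_of_value = INVESTMENT / VALUE_USD
share_of_replacement = INVESTMENT / REPLACEMENT_COST_USD

print(f"replacement cost is about {ratio:,.0f}x the estimated value")
print(f"25 million is {share_of_value:.2%} of 4.15 billion")
print(f"25 million is {share_of_replacement:.5%} of 8.8 trillion")
```

In other words, the investment is well under one percent of even the smaller figure, and a rounding error next to the replacement cost.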

It Does and It Doesn’t Matter if the Bubble Bursts

FOSS has become so integral and critical due to its fundamental freedoms and values. Time and time again, we’ve seen openness and collaboration triumph against obfuscation and monopolies. It will surely survive these challenges and many more. But the harms that these challenges pose should not be underestimated, since they strike at the core of these values and, particularly in the case of the last one, at the crucial people doing the work.

If you care about FOSS like I do, I suggest you make your voices heard and resist the trends to dilute these values. As we stand at this critical juncture, it’s up to all of us, developers, users, and decision makers alike, to recommit to the freedoms and values of FOSS and work together to build a digital world that is fair, inclusive, and just.