I graduated as a Computer Engineer in the late 2000s, and at that time I was convinced that the future would be so full of meaning, almost literally. Yup, I’m talking about the “Semantic Web,” for those who remember. It was the big thing on everyone’s minds while machine learning was but a murmur. The Semantic Web was the original promise of digital utopia where everything would interconnect, where information would actually understand us, and where asking a question didn’t just get you a vague answer but actual insight.
The Semantic Web knew that “apple” could mean both a fruit and an overbearing tech company, and it would parse out which one you meant based on **technology**. I was so excited for that, even my university graduation project was a semantic web engine. I remember the thrill when I indexed an eighth of Wikipedia, and my mind was blown when a search for Knafeh returned Nablus in the results (sorry, Damascenes).
And now here we are in 2024, and all of that feels like a hazy dream. What we got instead was a sea of copyright-stealing, forest-burning AI models playing guessing games with us and using math to cheat. And we’re satisfied enough by that to call it intelligence.
When Tim Berners-Lee and other boffins imagined the Semantic Web, they weren’t just imagining smarter search engines. They were talking about a leap in internet intelligence. Metadata, relationships, ontologies—the whole idea was that data would be tagged, organized, and woven together in a way that was actually meaningful. The Semantic Web wouldn’t just return information; it would actually deliver understanding, relevance, context.
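To make that concrete, here is a minimal sketch of what “tagged, organized, and woven together” looks like in practice, written with Python’s rdflib. The example.org URIs and the Fruit/Company classes are made up for illustration, not part of any real ontology; the point is that the “apple” ambiguity from earlier gets resolved by the structure of the data, not by statistical guessing.

```python
# A minimal sketch of structured, interconnected data using rdflib.
# The URIs and properties are illustrative, not a real ontology.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")
g = Graph()

# Two distinct resources that happen to share the label "Apple".
g.add((EX.Apple_fruit, RDF.type, EX.Fruit))
g.add((EX.Apple_fruit, RDFS.label, Literal("Apple")))
g.add((EX.Apple_fruit, EX.growsOn, EX.AppleTree))

g.add((EX.Apple_Inc, RDF.type, EX.Company))
g.add((EX.Apple_Inc, RDFS.label, Literal("Apple")))
g.add((EX.Apple_Inc, EX.industry, EX.ConsumerElectronics))

# Ask for "the Apple that is a company": disambiguation comes from the
# structure of the graph, not from guessing based on surrounding keywords.
results = g.query(
    """
    SELECT ?thing WHERE {
        ?thing rdfs:label "Apple" .
        ?thing a ex:Company .
    }
    """,
    initNs={"ex": EX, "rdfs": RDFS},
)

for row in results:
    print(row.thing)  # -> http://example.org/Apple_Inc
```

That is the whole pitch in a dozen triples: the machine doesn’t infer what you probably meant, it knows which resource you asked for.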
What did we end up with instead? A patchwork of services where context doesn’t matter and connections are shallow. Our web today is just brute-force AI models parsing keywords, throwing probability-based answers at us, or trying to convince us that paraphrasing a Wikipedia entry qualifies as “knowing” something. Everything about this feels cheap and brutish and offensive to my information science sensibilities. And what’s worse, our overlords have decreed that this is our future.
Nothing illustrates this madness more than Google’s Project Jarvis and Microsoft Copilot. These multi-billion-dollar companies, which can build whatever the hell they want, decided to take OCR technology (i.e., converting screenshots into text), pipe that text into a large language model, and have it produce a plausible-sounding response by stitching together bits and pieces of language patterns it has seen before. Wow.
It’s the stupid leading the stupid. OCR sees shapes and patterns, guesses at letters, and spits out words. It has no idea what any of those words mean; it doesn’t know what the text is about, only that it can recognize it. It then throws the text at an LLM, which doesn’t see words either, only tokens. The model makes a few plausible guesses and throws something out. The whole system is built on probability, not meaning.
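For the curious, the whole pipeline is roughly the sketch below, assuming pytesseract for the OCR step and a small Hugging Face text-generation model standing in for the LLM; the real products obviously use bigger models and more plumbing, and the file name and prompt are hypothetical, but the shape is the same: pixels in, token probabilities out, no model of meaning anywhere.

```python
# A rough sketch of the screenshot-to-"answer" pipeline, assuming pytesseract
# for OCR and a small text-generation model standing in for the LLM.
from PIL import Image
import pytesseract
from transformers import pipeline

# Step 1: OCR. Pattern-matches glyph shapes into characters;
# it has no idea what the recovered text says.
screenshot_text = pytesseract.image_to_string(Image.open("screenshot.png"))

# Step 2: LLM. Tokenizes the text and predicts a statistically likely continuation.
generator = pipeline("text-generation", model="gpt2")
answer = generator(
    f"Summarize the following screen:\n{screenshot_text}\n\nSummary:",
    max_new_tokens=60,
)[0]["generated_text"]

print(answer)  # Plausible-sounding text, assembled by probability, not comprehension.
```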
It’s a cheap workaround that gets us “answers” without comprehension, without accuracy, without depth. The big tech giants, armed with all the data, money, and computing power, have decided that brute force is good enough. So, instead of meaningful insights, we’re getting quick-fix solutions that barely scrape the surface of what we need. And to afford it we’ll need to bring defunct nuclear plants back online.
But how did we get here? Because let’s be real—brute force is easy, relatively fast, and profitable for someone, I’m sure. AI does have some good applications. Let’s say you don’t want to let people into your country but don’t want to be overtly racist about it. Obfuscate that racism behind statistics!
Deep learning models don’t need carefully tagged, structured data because they don’t really need to be accurate, just convincing enough, often enough. And for that measly goal, all they need is a lot of data and enough computing power to grind through it. Why go through the hassle of creating an interconnected web of meaning when you can throw rainforests and terabytes of text at the problem and get results that look good enough?
I know this isn’t fair to the folks currently working on Semantic Web stuff, but it’s fair to say that as a society we have essentially given up on the arduous, meticulous work of building a true Semantic Web because we got something else instead. But we didn’t get meaning, we got approximation. We got endless regurgitation, shallow summarization, probability over purpose. And because humans are inherently terrible at understanding math, and because we overestimate the uniqueness of the human condition, we let those statistical echoes of human output bluff their way into our trust.
It’s hard not to feel like I’ve been conned. I used to be excited about technology. The internet could have become a universe of intelligence, but what I have to look forward to now is just an endless AI centipede of meaningless content and recycled text. We’re settling for that because, I dunno, it kinda works and there’s lots of money in it? Don’t these fools see that we’re giving up something truly profound? An internet that truly connects, informs, and understands us, a meaningful internet, is just drifting out of reach.
But it’s gonna be fine, because instead of protecting Open Source from AI, some people decided it’s wiser to open-wash it. Thanks, I hate it. I hate all of it.