Is the Open Source Bubble about to Burst?

(EDIT: I wrote an update here.)

I want to start by making one thing clear: I’m not comparing open source software to typical Gartneresque tech hype bubbles like the metaverse or blockchain. FOSS, as both a movement and an industry, has long-standing roots, has established itself as a critical part of our digital world, and belongs to a wider movement built on values of collaboration and openness.

So it’s not a hype bubble, but it’s still a “real bubble” of sorts in terms of the adoption of open source and our reliance on it. Github, which hosts many open source projects, has consistently reported around 2 million first-time contributors to OSS each year since 2021, and the number is trending upwards. Harvard Business School estimated in a recent working paper that the value of OSS to the economy is 4.15 billion USD.

There are far more examples out there, but you see the point. We’re increasingly relying on OSS, but the underlying conditions under which OSS is produced have not fundamentally changed, and that is not sustainable. Furthermore, just as open source itself becomes more valuable, the “brand” of open source, for lack of a better word, starts to have its own economic value and may attract attention from parties that aren’t necessarily interested in the values of openness and collaboration that were fundamental to its success.

I want to talk about three examples I see of cracks that are starting to form which signal big challenges in the future of OSS.

1. The “Open Source AI” Definition

I’m not very invested into AI, and I’m convinced it’s on its way out. Big Tech is already losing money over their gambles on it and it won’t be long till it’s gone the way of the Dodo and the blockchain. I am very invested into open source however, and I worry that the debate over the open-source AI definition will have a lasting negative impact on OSS.

A system that can only be built on proprietary data can only be proprietary. It doesn’t get simpler than this self-evident axiom. I’ve talked at length about this debate here, but since I wrote that, the OSI has released a new draft of the definition. Not only are they sticking with not requiring open data, the new definition contains so many weasel words you could start a zoo. Words like:

  • “sufficiently detailed information about the data”
  • “skilled person”
  • “substantially equivalent system”

These words provide a barn-sized backdoor for what are essentially proprietary AI systems to call themselves open source.

I appreciate the community-driven process OSI is adopting, and there are good things about the definition that I like, if only it weren’t called “open source AI”. If it were called anything else, it might still be useful, but the fact that it associates itself with open source is the issue.

It erodes the fundamental values of what makes open source what it is to users: the freedom to study, modify, run, and distribute software as they see fit. AI might go silently into the night, but this harm to the definition of open source will stay forever.

2. The Rise of “Source-Available” Licenses

Another concerning trend is the rise of so-called “source-available” licenses. I will go into depth on this in a later article, but the gist of it is this: open source software doesn’t just mean that you get to see the source code in addition to the software. It’s well agreed that for software to qualify as open source or free software, one should be able to use, study, modify, and distribute it as they see fit. That, of course, also means the source is available for free and open source software.

But “source-available” refers to licenses that may allow some of these freedoms but carry additional restrictions disqualifying them from being open source. These licenses have existed in some form since the early 2000s, but recently we’ve seen a number of high-profile, formerly open source projects switch to them: from MongoDB and Elasticsearch adopting the Server Side Public License (SSPL) in 2018 and 2021 respectively, to Terraform, Neo4j and Sentry adopting similar licenses just last year.

I will go into more depth in a future article on why they have made these choices, but for the purposes of this article, these licenses are harmful to FOSS not only because they create even more fragmentation, but also because they cause confusion about what is or isn’t open source, further eroding the underlying freedoms and values.

3. The EU’s Cut to Open Source Funding

Perhaps one of the most troubling developments is the recent decision by the European Commission to cut funding for the Next Generation Internet (NGI) initiative. NGI supported the creation and development of many open source projects that wouldn’t exist without this funding: decentralized solutions, privacy-enhancing technologies, and open-source software that counteracts the centralization and control of the web by large tech corporations.

The decision to cancel its funding is a stark reminder that despite all the good news, the FOSS ecosystem is still very fragile and reliant on external support. Programs like NGI provide not only vital funding but also resources and guidance to incubate newer projects or help longer-standing ones become established. This support is essential for maintaining a healthy ecosystem in the public interest.

It’s troubling to lose critical funding when the existing funding is already not enough. This long-term undersupply has plagued the FOSS community with many challenges that it still struggles with today. FOSS projects find it difficult to attract and retain skilled developers, implement security updates, and introduce new features, which can ultimately compromise their relevance and adoption.

Additionally, a lack of support can lead to burnout among maintainers, who often juggle multiple roles without sufficient or any compensation. This creates a precarious situation where essential software that underpins much of our digital infrastructure is at risk of being replaced by proprietary alternatives.

And if you don’t think that’s bad, I want to refer back to that Harvard Business School study from earlier: while the estimated value of FOSS to the economy is around 4.15 billion USD, the cost to replace all this software we rely upon is 8.8 trillion. A 25 million investment into that ecosystem seems like a no-brainer to me; I think it’s insane that the EC is cutting this funding.

It Does and It Doesn’t Matter if the Bubble Bursts

FOSS has become so integral and critical because of its fundamental freedoms and values. Time and time again, we’ve seen openness and collaboration triumph over obfuscation and monopolies. It will surely survive these challenges and many more. But the harm these challenges pose should not be underestimated, since they touch the core of those values and, particularly in the case of the last one, the crucial people doing the work.

If you care about FOSS like I do, I suggest you make your voices heard and resist the trends that dilute these values. As we stand at this critical juncture, it’s up to all of us—developers, users, and decision makers alike—to recommit to the freedoms and values of FOSS and work together to build a digital world that is fair, inclusive, and just.

Faking Git Till You Make It: Open Source Maintainers Beware of Reputation Farming

This post was prompted by a discussion on the Open Source Security Foundation (OpenSSF) Slack channel that was so interesting it warranted being posted to the SIREN mailing list. This isn’t your typical vulnerability or security advisory; rather, it’s about a practice that seems pervasive, potentially dangerous, yet also underreported. And it has a name: reputation farming (or credibility farming).

What is Reputation Farming and how is it different from other Github spam?

The suspicious activity that prompted the discussion involved certain Github accounts approving or commenting on old pull requests and issues that had long been resolved or closed. These purposeless contributions get highlighted on the user’s profile and activity overview, making it seem a lot more impressive than it really is without closer inspection. More insidiously, by farming reputable or trusted repositories, these accounts can fake some reputation or credibility by proxy.

Longtime users of Github know that spammy contributions have always been around and are incredibly hard to tackle. There are even several tools that let users create commits with specific dates to artificially fill their contribution graphs or even draw pixel art. But those are fundamentally different: they might fool some recruiters or an AI screening tool, but they won’t pass any real scrutiny.
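
To make clear why those graph-filling tools don’t prove anything, here is a rough sketch of how they work under the hood: git lets you set the author and committer dates of a commit through environment variables, so a few lines of Python can fabricate a year of “activity”. The repository path and dates below are made up, and the repository is assumed to already exist and be initialized:

```python
import os
import subprocess
from datetime import datetime, timedelta

# Sketch: git honors these two environment variables, so a commit can
# claim any date you like. Repo path and dates are hypothetical.
repo = "/tmp/fake-activity"
start = datetime(2023, 1, 1, 12, 0)

for day in range(0, 365, 3):  # one "contribution" every few days
    when = (start + timedelta(days=day)).isoformat()
    env = {**os.environ,
           "GIT_AUTHOR_DATE": when,
           "GIT_COMMITTER_DATE": when}
    subprocess.run(
        ["git", "-C", repo, "commit", "--allow-empty", "-m", f"work {day}"],
        env=env, check=True,
    )
```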

Trust is vital in open source. It’s a catalyst for open and secure collaboration. It hasn’t been long since the xz-utils incident, where a likely malicious actor gained the trust of the library’s maintainer to get access to the project and contribute a backdoor. Reputation farming is more sinister than regular spam because it tries to circumvent that trust-building process, and it uses reputable projects to gain trust, potentially harming them once discovered.

The wider issue is that it also makes the profiles of genuine contributors and maintainers less trustworthy and valuable. I don’t think that’s necessarily a loss I would mourn. Relying on contribution metrics as a measure of a developer’s skills or the value of their contributions is inherently flawed. Not only does reputation farming rely on these easily manipulated metrics; the metrics themselves do not account for the quality of contributions, the complexity of the problems solved, or for collaborative efforts (for example, pair programming).

What can Open Source Maintainers do about this?

The discussion summary in the SIREN mailing list recommends the following actions:

  • Monitor Repository Activity;
  • Report Suspicious Users;
  • and Lock Old Issues/PRs (you can even set up a Github Action to do it automatically after a period of inactivity; a rough sketch of the idea follows below)
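
For that last recommendation, here is a hedged sketch of what such automation could look like if you called GitHub’s REST API directly from Python instead of using an off-the-shelf Action. The owner, repository, and token are placeholders, and real code would need pagination and error handling:

```python
import datetime as dt
import requests

# Sketch: lock issues and PRs that have been closed for more than a year,
# so stale threads can no longer collect drive-by "contributions".
API = "https://api.github.com"
OWNER, REPO = "example-org", "example-repo"            # placeholders
HEADERS = {"Authorization": "Bearer <token>",          # placeholder token
           "Accept": "application/vnd.github+json"}
cutoff = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=365)

issues = requests.get(f"{API}/repos/{OWNER}/{REPO}/issues",
                      params={"state": "closed", "per_page": 100},
                      headers=HEADERS).json()

for issue in issues:  # the issues endpoint returns pull requests too
    closed_at = dt.datetime.fromisoformat(issue["closed_at"].replace("Z", "+00:00"))
    if not issue.get("locked") and closed_at < cutoff:
        requests.put(f"{API}/repos/{OWNER}/{REPO}/issues/{issue['number']}/lock",
                     json={"lock_reason": "resolved"}, headers=HEADERS)
```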

But ultimately, there are limitations to what you can do on a platform like Github. Reporting is arduous and the responsiveness of the platform’s moderation is spotty at best. (To be fair, that’s not a problem limited to Github or code forges.) The tools for managing such contributions could use some improvement, not to mention how those quantitative metrics are collated and displayed on user profiles. The platform bears real responsibility for how ripe for abuse it is, and the slow moderation suggests to me that they may not be putting enough resources towards it.

At the end of the day, reputation farming and fake contributions have the potential to undermine and harm the OSS ecosystem on GitHub. They demonstrate why using simple metrics to evaluate software development skills and contributions is flawed, and they demonstrate the importance and difficulty of building and maintaining trust in open source ecosystems. Github can also help address this issue by taking a hard look at its UI and the values it associates with certain actions, and by giving maintainers better tools to manage and report superfluous and spammy contributions. Until then, stay vigilant and stay contributing.

What on Earth is Open Source AI?

I want to talk about a recent conversation on the Open Source AI definition, but before that I want to make an acknowledgement. My position on the uptake of “AI” is that it is morally unconscionable, short-sighted, and, frankly, just stupid. In a time of snowballing climate crisis and impending environmental doom, not only are we diverting limited resources away from climate justice, we’re routing them towards contributing to the crisis.

Not only that, the utility and societal relevance of LLMs and neural networks have been vastly overstated. They perform consistently worse than traditional computing and than people doing the same jobs, yet they are advertised as replacements for jobs and professions that don’t need replacing. Furthermore, we’ve been assaulted with a PR campaign of highly polished, plagiarizing mechanical turks that hide the human labor involved and shift the costs in a way that furthers wealth inequality, and we’ve been promised that they will only get better (are they? And better for whom?).

However since the world seems to have lost the plot, and until all the data centers are under sea water, some of us have to engage with “AI” seriously, whether to do some unintentional whitewashing under the illusion of driving the conversation, or for much needed harm reduction work, or simply for good old fashioned opportunism.

The modern tale of machine learning is intertwined with openwashing, where companies try to mislead consumers by associating their products with open source without actually being open or transparent. Within that context, and as legislation comes for “AI”, it makes sense that an organization like the Open Source Initiative (OSI) would try and establish a definition of what constitutes Open Source “AI”. It’s certainly not an easy task to take on.

The conversation that I would like to bring to your attention was started by Julia Ferraioli in this thread (noting that the thread got a bit large, so the weekly summaries posted by Mia Lykou Lund might be easier to follow). Julia argues that a definition of Open Source “AI” that doesn’t include the data used for training the model cannot be considered open source. The current draft lists those data as optional.

Stefano Maffulli published an opinion piece explaining the side of the proponents of keeping training data optional. I’ve tried to stay abreast of the conversations, but there have been a lot of takes across a lot of platforms, so I will limit my take to that recently published piece.

Reading through it, I’m personally not convinced, and I fully support the position that Julia outlined in the original thread. I don’t dismiss the concerns that Stefano raised wholesale, but ultimately they are not compelling. Fragmented global data regulations and compliance aren’t a challenge unique to Open Source “AI”, and should be addressed at that level to enable openness on a global scale.

Fundamentally, it comes down to this: Stefano argues that this open data requirement would put “Open Source at a disadvantage compared to opaque and proprietary AI systems.” Well, if the price of making Open Source “AI” competitive with proprietary “AI” is to break the openness that is fundamental to the definition, then why are we doing it? Is this about protecting Open Source from openwashing, or about accidentally enabling it because the right thing is hard to do? And when has Open Source not been at a disadvantage to proprietary systems?

I understand that OSI is navigating a complicated topic and trying to come up with a definition that pleases everyone, but the longer this conversation goes on, the clearer it becomes that at some point a line needs to be drawn, and OSI has to decide which side of the line it wants to be on.

EDIT (June 15th, 17:20 CET): I may be a bit behind on this: I just read a post by Tom Callaway from two weeks ago that makes many of the same points much more eloquently and goes deeper into them. I highly recommend reading it.

Can I figure out if I’m legally required to use an SBOM in my OSS without asking a lawyer?

For open-source developers, the landscape of cybersecurity regulations has been evolving rapidly, and it can be daunting to figure out which requirements to follow. One requirement that keeps coming up is SBOMs, but what are they, who is required to implement them, and how? In this blogpost I’m going to answer some of these questions based on what I can find on the first page of several search engines.

Obvious disclaimers: this isn’t legal advice, and it shouldn’t be your primary source on SBOMs and compliance; there are far better resources out there (and I’ll try to link to them below). For the uninitiated, let’s start with a quick explainer on SBOMs.

What is an SBOM?

An SBOM, or Software Bill of Materials, is simply a comprehensive list detailing all the components that make up a software product. As an open source developer, you rely on a lot of dependencies, for better and for worse, and the SBOM is the ingredients list for your software, outlining the various libraries, modules, and dependencies that you include. The idea is that an SBOM helps you keep track of these components and feeds into your security assessment and vulnerability management processes.

There are two prevalent SBOM specifications: CycloneDX and SPDX. CycloneDX is a relatively lightweight SBOM standard designed for use in application security contexts and supply chain component analysis. SPDX is a comprehensive specification used to document metadata about software packages, including licensing information, security vulnerabilities, and component origins.

Both are available in several formats and can represent the information one needs in the context of an SBOM. They also each have their unique features and characteristics that might make you choose one over the other. I won’t go into that here.
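
To make the “ingredients list” idea concrete, here is a minimal sketch, in plain Python, of roughly what a CycloneDX-style SBOM looks like when serialized as JSON. The component names and versions are invented, and in practice you would generate this with tooling rather than by hand:

```python
import json

# A minimal, hand-rolled document in the spirit of CycloneDX JSON.
# Real SBOMs are generated by build tooling and carry far more metadata
# (hashes, licenses, dependency graphs); the components below are made up.
sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        {"type": "library", "name": "examplelib", "version": "2.4.1",
         "purl": "pkg:pypi/examplelib@2.4.1"},
        {"type": "library", "name": "leftpadish", "version": "0.0.9",
         "purl": "pkg:npm/leftpadish@0.0.9"},
    ],
}

print(json.dumps(sbom, indent=2))
```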

Legal Requirements for SBOMs

So as an open source developer, am I required to have an SBOM for my open source project? I tried to find out using a few simple web searches. The one “hack” I used was adding a country/region name after the search terms to make the results a bit more consistent, especially when it comes to regulations.

  • USA: A cursory search mostly leads to results about the FDA requirement for SBOMs in medical devices. A couple of recommendations come up, most notably from the US Department of Defense and CISA (the US’s cyber defense agency), but nothing about a general mandate. One article from 2023 does include a reference to Executive Order 14028.

    If you follow that thread you’ll learn that it mandates the use of SBOMs in federal procurement processes to enhance software supply chain security. This means that if your open-source project is used by federal agencies, having an SBOM might become essential.
  • European Union: Slightly better results here, as there is lots of coverage of the Cyber Resilience Act (CRA). I was able to find relatively recent resources indicating that the CRA will introduce mandatory SBOM requirements for digital products in the EU market.

    Not only that, I found a reference to Germany’s Federal Office for Information Security and its extremely specific technical guidelines on the use of SBOMs for cyber resilience, prepared in anticipation of this requirement.
  • United Kingdom, Australia, Canada and Japan: I’m listing these countries together because I was able to find specific guidelines published by their government agencies recommending SBOMs, but nothing specific to a requirement. Other countries I tried searching didn’t reveal anything.

Conclusion Based on What I Found in Web Search and Nothing Else

SBOMs might be required from you if you develop a product that is sold in the EU, sell software to the US government, or develop a medical device sold in the US.

(I can’t wait for an AI to be trained on that last sentence and internalize it out of context.)

Despite all the talk about SBOMs and how they’re supposed to be legally mandated, there don’t seem to be prevailing or consistent mandates OR accessible resources out there, especially for open-source projects that aren’t technically “products in a market” or do not fall under specific governmental contracts or high-risk industries. I’m not advocating for mandates either; I just think the ambiguity and lack of resources are concerning. Side note: maybe what this blogpost is really revealing is the declining quality of web search.

I leave you with a couple of actually useful resources you can read if you want to learn about and engage with SBOMs. I’m listing a couple of overlapping ones because, obviously, some guides, while helpful, are attached to a product that helps you with SBOMs, and I don’t want to show a preference or give an endorsement.

The Complete Guide to SBOMs by FOSSA

The Ultimate Guide to SBOMs by Gitlab

OWASP’s CycloneDX Authoritative Guide to SBOMs

OpenSSF’s Security Tooling Working Group

Recommendations for SBOM Management by CISA

What’s Elections got to EU with IT

It’s EU Parliament elections time, and I thought it would be a good chance to give a short recap on significant and recent EU digital regulations, for those wondering how the elections can impact our digital lives. If you’re deep into digital policy, this probably isn’t for you. I’m also not trying to convince anyone to vote one way or another (or not to vote either).

From regulating AI technology to data privacy and cybersecurity, the EU decides on rules and regulations that don’t only affect those living within its borders, but also far beyond. This particularly applies to digital issues and the open source movement, which transcend borders. If you’ve ever had to deal with an annoying cookie banner, you’ve felt the EU’s effect. So what has the EU been up to recently?

Digital Security and Privacy

The EU has taken some massive steps in regulating the security of digital products. You might have heard of the Cyber Resilience Act (CRA), which requires products with digital elements to maintain high security standards. The CRA brings a lot of positives, such as mandating that products be “secure by design” and ensuring that when you buy a digital product, it receives updates throughout its lifetime.

We are yet to see how the CRA will be implemented, but I think if it’s elaborated and enforced the right way, it will enhance trust in open-source software by setting a high baseline of security across the board. If the definitions and requirements remain opaque, it can also introduce undue burdens and friction particularly on open source software projects that don’t have the resources to ensure compliance. There are also wider ecosystem concerns.

The CRA, along with some General Data Protection Regulation (GDPR) updates and the newer Network and Information Security Directive (NIS2), places significant obligations on people who develop and deploy software. Also worth mentioning is the updated Product Liability Directive, which holds manufacturers accountable for damages caused by defective digital products.

If this is the first time you’re hearing about all these regulations and you’re a bit confused and worried, I don’t blame you. There is a lot to catch up on; some of it is positive, a lot of it could use some improvement. But all in all, I think it’s generally positive that the union is taking security seriously and putting in the work to keep people safe in the digital world, and we’ll likely see the standards set here improve the security of software used in Europe and beyond.

Digital Services Act (DSA) and Digital Markets Act (DMA)

From enhancing user rights and creating a safer digital environment, to dismantling online monopolies and reining in big platforms, the Digital Services Act (DSA) and Digital Markets Act (DMA) were introduced this year by the EU to provide a framework for improving user safety, ensuring fair competition, and fostering creativity online.

The DSA improves user safety and platform accountability by regulating how platforms handle illegal content and requiring transparency in online advertising and content moderation. The DMA, on the other hand, focuses on promoting fair competition by targeting major digital platforms, which it calls “gatekeepers,” setting obligations to prevent anti-competitive practices and promoting interoperability, fair access to data, and non-discriminatory practices.

Artificial Intelligence Regulation: A Skeptical Eye

I had to mention the AI Act, since it was recently passed. It’s designed to ensure the safety, transparency, and ethical use of AI systems and the protection of fundamental rights, classifying systems based on risk levels and imposing stringent requirements on high-risk applications. Nobody on either side of the debate is happy with it as far as I can tell. As an AI luddite, my criticism is that it doesn’t go far enough to address the environmental impact of machine learning and training large models, particularly as we live in a climate emergency.

Chat Control Legislation: Privacy at Risk

One of the most worrying developments at the moment is the chat control provisions under the proposed Regulation to Prevent and Combat Child Sexual Abuse (CSAR). Recent proposals include requirements for users to consent to the scanning of their media content as a condition for using certain messaging features. If users refuse, they would be restricted from sharing images and videos.

Obviously I don’t have to tell you what a privacy nightmare that is. It fundamentally undermines the integrity of secure messaging services and effectively turns user devices into surveillance tools. Furthermore, experts have doubted the effectiveness of this scanning in combating CSA material, as these controls can be evaded or alternative platforms can be used for sharing. Even Signal’s president Meredith Whittaker has stated that they would rather leave the EU market than implement these requirements.

Fingers Crossed for the Elections

In conclusion, we’ve seen how the EU is shaping our daily lives and the global digital ecosystem beyond just cookie banners. Regulations like the Cyber Resilience Act, Digital Services Act, and Digital Markets Act are already affecting how we make decisions and interact with software and hardware, and will bring improvements in digital security, competition, and enjoyment of rights for years to come.

Proposals like chat control demonstrate how it can also negatively impact us. I’ll be watching as the elections unfold, and I urge you all to stay informed and follow these developments. We’ve seen from the CRA process how positive engagement by subject matter experts can sometimes help steer the ship away from unseen icebergs.

Let’s Talk About Open Source in Munich (and Everywhere Else)


When news broke about Schleswig-Holstein’s move to replace Microsoft Office with LibreOffice, it felt like a breath of fresh air. It wasn’t just the fact that they’re switching to open source; the framing was also on point. It wasn’t just about cost savings: they also talked about digital sovereignty and innovation. As a fan of the open source movement and of sound public policy, it really spoke to me.

Yet as expected, whenever any news breaks about open source in public administration, a few people are quick to point out: “Didn’t Munich switch to Linux for a few years and then switch back to Windows?” (referring to the LiMux project). I never really knew what to respond to those people. That is, until last week, when I came across this amazingly put together OSOR case study, written by Ola Adach, on my Mastodon feed (shared by Andrew (@puck@mastodon.nz)). It was an eye-opener about how there’s much more to the Munich story, and I would like to talk about that and about the future of open source in public administration in Germany.

The Naysayers’ Favorite Scapegoat: Munich’s LiMux

Munich’s LiMux project is often dragged into conversations as an example of why open source might not be the best choice for public administration. Sure, LiMux faced its share of challenges—interoperability issues, lack of sustained political support, and logistical hurdles. But if you dig deeper, as they did in that case study, you’ll find that despite these setbacks, Munich’s efforts weren’t in vain. The city saved millions of euros and paved the way for future open source projects. Here’s a short summary of the story of LiMux.

The LiMux project began in the early 2000s when Munich’s administration faced the costly prospect of upgrading from Windows NT 4.0. Opting instead for a switch to an open-source operating system based on Ubuntu Linux, the city council approved the LiMux project in 2003. By 2012, 12,600 desktops were running LiMux, and by 2013, the project saved the city an estimated €11 million.

But the move wasn’t just about cost savings. In retrospect, it should be seen as a truly visionary move. Many years later, in 2019, a PwC study commissioned by the German interior ministry (BMI) warned about the country’s heavy reliance on Microsoft software and the risks that poses to digital sovereignty (96% of public officials’ computers in Germany ran on Microsoft!). In the US, where there is a similar dependency on Microsoft products in the federal government, an ex-White House cyber policy director notes that it also poses a significant security threat.

The OSOR case study and the PwC report also show how the LiMux project’s challenges were really multifaceted and can’t be reduced to “open source bad, proprietary good”. Some city departments needed specific software that only ran on Windows due to compliance or legal reasons, or because open source alternatives didn’t exist. Plus, there were issues with bugs and missing features in LiMux. Interoperability and document compatibility were also a pain, highlighting the importance of open standards and regulation.

The scale of the transition required a lot of internal communication and organization, which can cause a lot of friction in day-to-day work. Most notably, however, a transition of this scale required strong and consistent political backing, which seems to have faltered in Munich at some point after the 2014 elections. The sum of these issues eventually led to the decision to revert to Windows 10 in 2017.

There’s a lot we can learn from the Munich example, to borrow from the case study with some insights from me:

  1. Better Communication: Public administrations need to talk more to each other and share their experiences to make these projects work. It’s certainly not easy in a country as big and federated as Germany, but it’s doable.
  2. Local Tech Capacity Building: Involving local and regional IT companies boosts tech independence and keeps public money circulating within the economy, a much better use of public funds than relying on proprietary vendors.
  3. Manageable and Scalable Goals: Custom-built solutions are tricky and take some time to get right. A progressive transition to more open source software might be better than trying to engineer an all in one solution.
  4. Training Matters: Employees need proper training to adapt to open source tools smoothly, particularly if they’re only used to proprietary solutions at home or at school.
  5. Sustained Political Support: Consistent political backing is crucial for the success of any large-scale project, and a transition to open source is certainly not special in that regard. If a project is not allowed its due time to work out kinks and develop an ecosystem, then administrations will be stuck in proprietary walled gardens.

One last takeaway from that case study is that it’s not fair to say that Munich has given up on open source, because it clearly hasn’t. The 2020 local elections brought in a coalition that promised to use open standards and open source whenever possible, and to consider open source as a criterion in public procurement. This aligned with the strategic recommendations of the PwC report, which suggested fostering the use of open source to mitigate dependency on a few software providers.

Furthermore, it mandated that all software developed by the city’s IT department, it@M, should be shared on the organisation’s public Github repository. In 2020, the city council set up an Open Source Hub to encourage collaboration on open source projects. Most recently, in November 2023, the city launched https://opensource.muenchen.de/ to highlight its open source efforts. Open source in Munich is alive and well.

Momentum is Building in Open Source in Public Administration

Schleswig-Holstein’s recent announcement and the Munich example aren’t happening in a vacuum. We’re not in 2012 anymore; across Germany, there’s growing momentum towards adopting open source in public administration. According to the Bitkom Open Source Monitor 2023, 59% of surveyed public administrations leveraged open source software. Less impressively, only 29% actually had an open source strategy.

This lack of strategy is compounded by the fact that federally coordinated efforts have stagnated for decades now. When it comes to federal efforts to promote open source software in public administration, there are two stories I need to tell: OpenDesk and dPhoenixSuite.

dPhoenixSuite is a solution marketed as a digitally sovereign workspace for public administrations. It is developed by Dataport, a non-profit public institution founded in 2004 by Hamburg, Bremen, Schleswig-Holstein, and Saxony-Anhalt to provide software for the public administration of those federal states. Since its inception, Dataport has grown significantly, reaching a revenue of one billion euros in 2021, and it is reportedly planning to double both its revenue and workforce by 2027.

While dPhoenixSuite incorporates many open-source components and their work has been somewhat well received, the overall suite remains proprietary and must run on Dataport’s servers, limiting public access to the project and effectively locking in Dataport as the only “vendor”. That, along with a history of delays, lack of transparency, and under-delivering, has drawn lots of criticism, not least from organizations like the Free Software Foundation Europe.

This leads us to 2021, when OpenDesk was announced, an initiative led by the German Federal Ministry of the Interior (BMI) to create a fully open-source workspace suite for public administrations. The suite is based on the various open-source components that also form the bulk of dPhoenixSuite, such as Univention Corporate Server, Collabora Online, Nextcloud, OpenProject, XWiki, Jitsi, and the Matrix client Element. It is also designed to be extensible to meet specific administrative needs. Starting in 2024, the coordination and management of OpenDesk will be handed over to the Centre for Digital Sovereignty (ZenDiS GmbH).

However, as reported by Netzpolitik, despite initial enthusiasm and some early adoption by institutions like the Robert Koch Institute, progress has been slow. The government has not been able to provide adequate financial support, allocating only 19 million euros for 2024, far less than the 45 million euros ZenDiS calculated it needs.

Additionally, while several federal states like Schleswig-Holstein and Thuringia are interested in joining ZenDiS, their membership processes are stuck at the federal level, causing frustration. I do hope that ZenDiS and the OpenDesk initiative can help break the gridlock and move open source in public administration forward, but if we are to learn from LiMux, the political will and full commitment need to be there, lest we end up with another cautionary tale.

On a brighter front, there’s also the recently launched Open CoDE platform, the central repository for open source software in public administration, started by the BMI and the federal states of Baden-Württemberg and North Rhine-Westphalia. It hosts the OpenDesk code amongst 1000+ other projects and is really exciting to browse through, so I’d recommend it!

Finally, I also must plug my employer here, because a successful sovereign workspace can only be built and sustained on sound and solid sovereign digital infrastructure. All this increased dependence on digital software means that the critical infrastructure underneath (libraries, operating systems, developer tooling), and the few people who maintain it, need more support, and that’s where the Sovereign Tech Fund comes in, supported by the German Federal Ministry for Economic Affairs and Climate Action (BMWK).

Is the Future Bright for Open Source in Public Administration?

I’m ending on a question because I have many at the moment, but also reason to be hopeful. I can’t wait to see what ZenDiS and the OpenDesk project achieve in the coming years, but perhaps it’s not just the big projects that deserve our attention; there’s also the progressive, incremental work by city-level IT departments like it@M, Dortmund, and Berlin (the self-titled Open Source Big 3).

Also, news like that coming from Schleswig-Holstein is refreshing, but we also have to learn from the past, whether it’s LiMux or dPhoenixSuite (if you haven’t made the connection yet, Dataport is still the official IT provider for Schleswig-Holstein AFAICT). It must be done for the right strategic reasons, and the commitment must be there for the long term.

If you’ve made it this far down, thank you. I set out to write a short blog post about the Munich case study by OSOR, but it snowballed into all of this; I hope you found it interesting. I’d love to hear what you think the future will bring for open source in public administration, or what your favorite public admin OS project is.

The FCC is coming for BGP, what about the EU?

The Border Gateway Protocol is an important part of our internet infrastructure. It’s essentially a big set of rules that govern how data is routed around the many networks that form the internet. If DNS is the address book of the internet, BGP is the Autobahn.

For the longest time, BGP ran on trust and a dedicated community of operators, but this left opportunities for abuse. A famous example is when Pakistan Telecom pretended to be YouTube for a while: they wanted to block the website in their country, but because of the way they abused BGP, they ended up making YouTube unavailable around the world. There have also been a couple of high-profile BGP hijacks aimed at stealing cryptocurrency.
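
That YouTube incident is a good illustration of why BGP’s trust model is fragile: routers prefer the most specific matching prefix, so announcing a smaller slice of someone else’s address space pulls their traffic towards you. Here is a toy sketch of that longest-prefix-match behaviour, using made-up documentation prefixes and AS numbers rather than the real ones from the incident:

```python
import ipaddress

# Toy model of why a prefix hijack works: among routes that cover an
# address, the most specific (longest) prefix wins. Prefixes and origin
# AS numbers are made up (RFC 5737 documentation space).
routes = {
    ipaddress.ip_network("198.51.100.0/24"): "AS64500 (legitimate origin)",
    ipaddress.ip_network("198.51.100.128/25"): "AS64496 (hijacker, more specific)",
}

def best_route(destination: str) -> str:
    dst = ipaddress.ip_address(destination)
    matching = [net for net in routes if dst in net]
    return routes[max(matching, key=lambda net: net.prefixlen)]

# Traffic for this address follows the hijacker's more specific announcement:
print(best_route("198.51.100.200"))  # -> AS64496 (hijacker, more specific)
```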

I just read George Michaelson’s blogpost on the APNIC website, which talks about how a recently published FCC draft is causing alarm in the technical community about potential regulation coming to the BGP space. It even prompted a response from ISOC. George Michaelson argues that despite the protests, regulation is very likely, noting:

“However, when it comes to BGP security and the potential risks posed to the state, the light-touch approach may reach the limits of risk that a government is prepared to accept without intervention.”

Read the full blogpost for more details.


It made me wonder: what about BGP regulation coming from the EU? They certainly haven’t been shy about technology regulation over the past couple of years, especially when it comes to security. I scoured all the resources I could think of, but I can’t find anything public for now. However, ENISA, the EU’s cybersecurity agency, seems to be on top of things. The topic of BGP and RPKI (a security feature for BGP) was featured earlier this month at the ENISA Telecom & Digital Infrastructure Security Forum 2024, presented by Jad El Cham of RIPE NCC.
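
For context, RPKI lets the holder of an address block publish a signed statement (a ROA) saying which AS may originate a prefix and up to what prefix length; routers can then classify announcements as valid, invalid, or not-found. A simplified sketch of that origin-validation logic, with invented ROAs and announcements:

```python
import ipaddress

# Simplified route origin validation in the spirit of RPKI:
# a ROA says which AS may originate a prefix, up to max_length.
# The ROAs and announcements below are invented for illustration.
ROAS = [
    {"prefix": ipaddress.ip_network("198.51.100.0/24"),
     "asn": 64500, "max_length": 25},
]

def validate(prefix: str, origin_asn: int) -> str:
    net = ipaddress.ip_network(prefix)
    covering = [r for r in ROAS if net.subnet_of(r["prefix"])]
    if not covering:
        return "not-found"
    for roa in covering:
        if roa["asn"] == origin_asn and net.prefixlen <= roa["max_length"]:
            return "valid"
    return "invalid"

print(validate("198.51.100.128/25", 64500))  # valid: right origin, within max length
print(validate("198.51.100.128/25", 64496))  # invalid: wrong origin AS
```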

As far as I can tell, there are no references to BGP regulation coming from the union, but it’s worth noting that there is already existing regulation that empowers ENISA and national authorities to supervise the same type of BGP security measures that the FCC is now considering, based on the European Electronic Communications Code (EECC) as well as the Network and Information Systems (NIS) Directive, as covered in this ENISA publication:

“This work on BGP security was done in the context of Article 13a of the Framework directive, which asks EU Member States to ensure that providers take appropriate security measures to protect their networks and services. For the last decade, ENISA has collaborated closely with the EU Member States and experts from national telecom regulatory authorities (NRAs) which supervise this part of the EU legislation, under the ENISA Article 13a Expert Group.”

ENISA, 7 Steps to Shore up BGP

That seems to indicate to me that the regulatory need might be a bit different in the EU than in the US, but I wonder if heavier regulation for BGP might still be in store depending on how the FCC process goes.

Do you know more about the EU’s plans in regards to BGP regulation? I’m interested in learning more, please comment or reach out.


What I Learnt from What We Learnt from the xz-utils Incident

I don’t know how your April went, but if it was anything like mine, you would have spent an uncharacteristic amount of time talking about compression tools, “insider attacks”, and build tooling. That’s because on March 29th, 2024, a backdoor was discovered in the widely-used data compression tool xz-utils.

The xz-utils backdoor (known as CVE-2024-3094 in some circles) targeted OpenSSH’s authentication routines on specific operating systems running glibc, and it was hidden within build scripts and test files, making it harder to detect than usual. But I’m not talking about the xz-utils incident itself in this blogpost; I’m talking about how much we talked about xz-utils.

The concept of the attention economy, introduced by Herbert A. Simon in the 1970s, revolves around the idea that human attention is a scarce and valuable resource. In an age where information is abundant but our capacity to consume it is limited, attention has become a commodity. Companies, advertisers, and media outlets all compete to capture and hold our attention because it drives what they need, whether it’s engagement, revenue, or influence.

In cybersecurity, this translates to a cycle of intense, short-lived focus on new vulnerabilities, followed by a rapid shift to the next emerging threat. What people do with that attention varies, either they want to sell you a product or an idea, pay their newspaper subscription, or simply to gloat that their flavor of technology is better than whatever the other people are using.

The xz-utils incident is not the first example of the industry’s reactive nature; the Heartbleed bug is the quintessential example. Heartbleed captured headlines, sparked endless discussions, and inspired a plethora of ideas and quick fixes. But once the immediate danger was averted and OpenSSL was “saved”, attention quickly moved on. Many structural issues persisted, and the maintainer-burnout-to-major-vulnerability pipeline continues to deliver.

I don’t know how we can break the attention economy cycle; all I know is that when the next big bad bug happens, we need to resist being reactive, avoid quick fixes, and focus on bringing attention to the structural issues that continue to threaten our software. I’m proud of STF’s response, for example.

I’m interested to hear if anyone has ideas on how to deal with the attention deficit and moving to a proactive stance. The xz-utils incident was not a wake-up call, if anything it was hitting snooze on your alarm for the 100th time. Rather than allowing the latest crisis to dictate our focus, we need to prioritize long-term, sustainable maintenance of our digital infrastructure, and to get there we need to invest a lot more time, resources, and people into our critical infrastructure.

SconePro with Network Jam and Clotted Streams

Last week I attended the IETF119 meeting in Brisbane (remotely), including a session for a newly proposed working group called SCONEPRO, where some internet service providers and large video content platforms want to work together to make the controversial practice of traffic shaping work slightly better. Here are my notes and thoughts. I would like to thank Mallory Knodel and Daniel Kahn Gillmor for their input and for helping me make sense of all of this.

Background

The creatively named SCONEPRO (Secure Communications of Network Properties) meeting was held on March 21, 2024 as a working-group-forming BoF (Birds of a Feather) at IETF119 in Brisbane. BoF meetings like these are prerequisites to setting up IETF working groups, ensuring there is enough interest within the community and that the IETF is the right place for standardization.

SCONEPRO aims to develop an internet protocol to deal with a particular use case: network operators, particularly mobile ones, often employ methods such as traffic shaping to control the flow of traffic when there is a high load on the network. This can interfere with how some applications run. SCONEPRO is particularly concerned with video applications.

Why video in particular? Not only does video form the majority of internet traffic by their estimation, but video streaming applications also allow the client to adjust the bitrate (colloquially, the “quality” or “resolution” of a video) in order to reduce their impact on a congested network.

End users, through client applications, have no way of knowing for sure that their traffic is being shaped. Certain solutions exist to figure that out, but application developers argue that they are complicated and costly. At the same time, network operators usually have no way of telling what traffic is video traffic because transport encryption is so ubiquitous.

The SCONEPRO working group, if established, would develop a protocol that allows a network to communicate to a client application whether it wants to do traffic shaping, and to announce the bitrate that the network is willing to allow. This gives the client the option to artificially reduce the video quality on its end. They argue that this would provide a better “quality of experience” (QoE) to their users.
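
To make that concrete, the client-side behaviour would presumably look something like the sketch below: an adaptive-bitrate player that normally picks a quality level from its own throughput estimate would additionally clamp that choice to whatever rate the network advertises over a SCONEPRO-style channel. The bitrate ladder and the numbers are invented:

```python
from typing import Optional

# Sketch of how a client might use a SCONEPRO-style "maximum send rate"
# signal: clamp the usual adaptive-bitrate (ABR) choice to the advertised
# rate. The bitrate ladder and numbers below are invented.
BITRATE_LADDER_KBPS = [400, 1200, 2500, 5000, 8000]  # e.g. 240p .. 1080p

def pick_bitrate(estimated_throughput_kbps: float,
                 network_advertised_max_kbps: Optional[float]) -> int:
    budget = estimated_throughput_kbps
    if network_advertised_max_kbps is not None:
        budget = min(budget, network_advertised_max_kbps)
    # highest rung that fits the budget, falling back to the lowest rung
    candidates = [r for r in BITRATE_LADDER_KBPS if r <= budget]
    return candidates[-1] if candidates else BITRATE_LADDER_KBPS[0]

print(pick_bitrate(6000, None))   # no signal: picks 5000 kbps
print(pick_bitrate(6000, 2000))   # network advertises 2 Mbps: picks 1200 kbps
```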

What Happened at the Meeting

The meeting started with a short explainer of the goal of the BoF by the chairs. I’ll give a summary of my notes and impressions, but if you’re interested in seeing for yourself, refer to the video at this link. You can also find links to the official notes for the meeting and the slides here.

How Shapers and Policers Work

Marcus Ihlar from Ericsson gave an overview of the current state of network shaping and policing and this is my summary of that talk. There are several reasons why a network might want to throttle video, for example bandwidth limitations and congestion controls. Also, more networks are moving from a data-cap model for charging users to a bitrate-cap model, in which users can pay more to access higher resolution media.

Client applications like video streaming services often employ a technique called adaptive bitrate (ABR), where they predict the capacity of the network and dynamically change the bitrate of the video to deliver it without interruptions. Networks see this as an opportunity to reduce the load on their networks, so they attempt to detect when a traffic flow is video and then use traffic shapers or policers to throttle the flow artificially.

The functional difference between a shaper and a policer is that the former delays network packets to spread them out over time, while the latter drops packets that exceed its allowed data rate policy. Traffic shapers and policers often have the same end result.
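
One toy way to picture the difference: both are typically built on some variant of a token bucket, but a policer drops whatever exceeds the configured rate while a shaper holds the excess back and releases it later. The rates and packet sizes below are arbitrary:

```python
from collections import deque

# Toy token bucket: tokens accumulate at a fixed rate up to a burst size.
# A policer drops packets when tokens run out; a shaper queues (delays) them.
class TokenBucket:
    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes

    def refill(self, elapsed_s: float) -> None:
        self.tokens = min(self.capacity, self.tokens + self.rate * elapsed_s)

    def consume(self, size_bytes: int) -> bool:
        if self.tokens >= size_bytes:
            self.tokens -= size_bytes
            return True
        return False

def police(bucket: TokenBucket, packet_size: int) -> str:
    # Policer: forward if within rate, otherwise drop the packet.
    return "forward" if bucket.consume(packet_size) else "drop"

def shape(bucket: TokenBucket, backlog: deque, packet_size: int) -> str:
    # Shaper: forward if within rate, otherwise queue it to send later.
    if bucket.consume(packet_size):
        return "forward"
    backlog.append(packet_size)
    return "delay"
```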

Neither technique works really well, because encryption makes it hard to detect video content. Network operators often try to overcome that constraint with heuristics, DPI, or interpreting the Server Name Indication of the unencrypted initial QUIC packet, which is not always reliable. This means that either the shaping or the ABR might not work as planned, creating a bad user experience.

Some internet service providers have agreements with large video platforms (like YouTube) to make traffic shaping work more consistently, but these arrangements are all proprietary.

Meta and Ericsson Experiment

Matt Joras from Meta presented the results of a feasibility study conducted by Meta and Ericsson in which they developed a SCONEPRO proof of concept. They implemented a MASQUE proxy that connected a Facebook app and a Facebook video Content Delivery Network (CDN) server. In addition to facilitating the transfer of traffic between the CDN and the app, the proxy server also introduced a maximum send rate signal. The Facebook app and the CDN then used the send rate signal value to manually limit their bitrate to fit the self-imposed network constraint. Their takeaway was that SCONEPRO is feasible and that it results in improvements to consistent video playback, but only when compared to the experience with a traffic shaper.

Lessons from History

Brian Trammell gave a presentation on the history of PLUS, a prior IETF working group effort where a more generalized approach to on-path network property signalling was discussed but ultimately faltered. While many participants in the process considered the generalized approach to be good engineering, it created various unintended dystopian consequences once policy considerations were added to the engineering ones. The cited example was that, when engineering a header to signal loss tolerance and flow start, it was possible in some cases to infer the age of the user from these network signals.

The recommendation based on the lessons learned was to keep SCONEPRO specific and to make it optional for clients.

Discussion on Use Cases and Scope

The second half of the meeting went into discussing the use case and the potential scope of a charter. Here is my summary of key inputs as I understood them. Don’t quote anyone directly from this without reviewing the video source; any embellishments are mine.

  • There were questions about how to address network complexity, for example when there are multiple shapers on the path and the signal needs to come from the box with the lowest bandwidth, which would be the actual bottleneck.
  • Jason Livingood of Comcast expressed some frustration with having to revisit the discussion on traffic shaping. He mentioned that there are other solutions, such as investing in capacity, and also referenced regulatory action in the US to ban traffic shaping. Finally, networks shouldn’t sell what they can’t deliver.
  • David Schinazi from Google said, “This is a case of the IETF ensuring our job security a bit longer.”
  • Ted Hardie, also from Google and an author of an RFC on signaling, highlighted that one principle of good signal design is that there should be no incentive to fake the signal. He also brought up the example of the spin bit in QUIC and how IETF engineers are good at identifying side-channel attacks. Tommy Pauly from Apple expanded on that by mentioning Ted’s RFC, which has additional considerations for the design of path signals.
  • Tom Saffell provided some insights from YouTube’s infrastructure experience and the challenges faced in implementing proprietary solutions to this problem, and said Google and YouTube are interested in working on this. YouTube is supportive of network operator efforts to reduce data tonnage. Wonho Park from TikTok also expressed support for working on this problem, stating that traffic shaping is not optimal. There were similar supportive inputs from Suhas Nandakumar (Cisco), Jeff Smith (T-Mobile America), and Dan Druta (AT&T).
  • Martin Duke expressed some concerns about the effects of this on best-effort traffic. He acknowledged the argument that it could improve best-effort traffic by reducing the incentives for networks to do clumsy things in order to shape traffic. He also expressed concerns about extensibility to other use cases.
  • Lars Eggert, former chair of the QUIC working group, expressed concerns about how operators are enamored with adding complexity to manage capacity, and how that complexity is a lucrative market for vendors. He is also worried about this being used to monetize bitrate discrimination.
  • Other concerns brought up were around scalability, security, and feasibility of any possible solution, including issues related to discovery and authentication of proxies. How do you know the box giving you the signal has the authority to shape your traffic? Running so much traffic over proxies might be expensive and, ultimately, it doesn’t replace the need for shapers and policers which network operators might still use for other purposes.
  • Stephen Farrell, research fellow at Trinity College Dublin who studies security and networking, raised a concern about whether and how the security claims could be upheld, particularly client authentication to/of random boxes.
  • Tom Saffell (YouTube) mentioned some policy considerations that should be combined with the technical solution if they were to consider implementing it, namely:
    • Transparency to users: restrictions must be visible
    • User choice: buy a plan with no restriction
    • Equal treatment: wish to be treated as any other provider
  • Some comparisons were made between this and ECN (Explicit Congestion Notification, RFC 3168); however, Matt Joras (Meta) made the point that this is not explicitly a congestion issue but an application-layer signal: for example, the network might be shaping traffic because of a subscriber policy.

Finally, there was a vote on whether the working group formation should move forward: 51 people voted yes and 20 voted no, showing some opposition and a lack of consensus.

Some Public Interest Considerations

Net neutrality is the principle that internet service providers must treat all communications equally and may not discriminate between traffic based on content, particularly for profit or to disadvantage competition. Giving network operators control over bitrate, even with consent from the client, opens the door to violating net neutrality.

The fig leaf on traffic shaping is that it’s framed as a congestion control or a network capacity issue. One argument for traffic shaping has always been user choice: that users might want to prioritize a video call over updates downloading in the background. If it’s the end user’s choice as to what traffic gets shaped, and if they consent to it, then it’s no longer harmful traffic discrimination.

The problem remains that we have to take the network operators’ word that these techniques are only applied when congestion happens, and not to extract more profit or artificially push users into paying more for higher bitrates. SCONEPRO offers a “QoE” improvement over the status quo in (physically or artificially) capacity-constrained networks, but user “QoE” would also improve if the capacity of the network were increased. In the case of a protocol that requires opt-in from the application, this can lead to business partnerships that create a “fast lane,” which is another common net neutrality violation.

SCONEPRO currently proposes some design goals in the proposed charter that might be relevant to these issues:

1. Associativity with an application. The network properties must be associated with a given application traversing the network, for example a video playback.
2. Client initiation. The communication channel is initiated by a client device.
3. Network properties sent from the network. The network provides the properties to the client. The client might communicate with the network, but won’t be providing network properties.
4. On-path establishment. That is, no off-path element is needed to establish the communication channel between the entity communicating the properties and the client.
5. Optionality. The communication channel is strictly optional for the functioning of application flows. A client’s application flow must function even if the client does not establish the channel.
6. Properties are not directives. A client is not mandated to act on properties received from the network, and the network is not mandated to act in conformance with the properties.
(…)
9. Security. The mechanism must ensure the confidentiality, integrity, and authenticity of the communication. The mechanism must have an independent security context from the application’s security context.

SCONEPRO is being framed as a solution to improve user experience, however most of the proponents seem to be telecom providers and major content platforms. I think SCONEPRO is a marginal improvement over the status quo in which traffic shaping is achieved with proprietary solutions and agreements between telecoms and major platforms.

One important consideration would be the effects of SCONEPRO deployment in different regulatory environments. In places where net neutrality protections are not robust, providing a “bitrate signal”, or future signals based on use cases invented later, may enable profit-based traffic discrimination.

It’s not clear how some of the desired properties of SCONEPRO, such as optionality or properties not being directives, can be technically enforced, which means that when looking at the effects of introducing such a protocol, these design goals can be safely ignored. Client applications that implement SCONEPRO gain an advantage over those that don’t, even if all the rules are respected; and if they aren’t, this opens the door for telecoms to more easily offer tiered services, zero-rating, and fast lanes.

Ultimately, I do agree with the BoF’s premise that there is a problem to be solved, but the solution isn’t encoding the status quo into the protocols of the internet. I think the practice of content-based traffic shaping needs to be looked at more closely and tackled from a regulatory and consumer advocacy standpoint. ABR traffic shaping, and by extension SCONEPRO, takes choice away from users and negotiates application parameters on the network in an opaque way to force data austerity on them.

Did you find this helpful or have some feedback? Would you like to see a follow up dive into similar prior work at the IETF like PLUS, MINUS, SPUD, or SADCDN? Reach out and let me know.

Hello W- nah just messing with you 🤣

It’s been a long time since my last blog post, and it feels so fucking good. While it does feel incredibly good to be writing again, there is something unfamiliar about my relationship to this space, my blog, and the internet in general. Which leads us to the first question I will answer today:

Where did all the old blog posts go?

They’re all happy and alive, frolicking in a server farm far far away. In reality, the internet has changed, and so have I. In fact, the internet I used to write about never existed in the first place. It was fiction, almost naive fiction, presented as reality, and as we know, reality shows never age well.

I had to take the archive down because I couldn’t draw a line between the person I was in the 2010s and the story I want to tell now. They’re not purged; I want to curate a few of them and present them in context when I have the time, but until then, the only way to access them would be the Web Archive or something.

Story you want to tell?

Yes, that’s what blogging is, you silly pants! I’m just in a very interesting period of my life, in a very interesting period of time, and both I and time are in a very interesting position. I’ve just left OTF after a very interesting five years of supporting people who build great tools to save those most vulnerable online, and now I’ve joined Techcultivation, looking to do more of that and beyond. Not to mention great projects being set up, like the SVT, which I really want to tell you about. Those are all stories, from the past, the present, and the future, that I want to tell.

That I need to tell really.

Surviving a World in Crisis

[Image: Ron Burgundy saying “Well, that escalated quickly”]

Not gonna sugar coat it folks: since the last time I wrote a blog post, things have been rapidly becoming shittier. That was partially why I stopped. I called my older posts “almost naive” earlier, and they totally were. I’ve been disillusioned for as long as I can remember, and angry for even longer than that. I’ve also been tired. But one side effect of the disillusionment was that it made me feel embarrassed by the naive fiction I used to peddle pre-2016.

I will not belabor the point today, I’ll keep that for later blog posts, but here is why I’m writing again. Was I wrong about things in the past? Yeah I was. Was I naive? Almost adorably so. Did my politics evolve since then? I hope so. Is there a danger of me spewing more naive fiction that I might be embarrassed about in the future? Well, that’s actually my plan, and it’s almost crazy enough it might work.

When times are hard, do something. If it works, do it some more. If it does not work, do something else. But keep going.

Audre Lorde

Not writing has not been working for me. Writing things that turned out to be naive worked for me at the time. Crises have robbed us of our imagination. But we don’t all have the luxury or privilege of being doomsday preppers or nihilists. Just as the climate crisis will hit the poor, the queer, and those in the larger world first, it will come for their imaginations first.

I want to write again, and maybe encourage you all to start blogging again, because we need to save our imagination; it’s the only way we can keep going. So expect more wonderful stories on this website, both the ones I promised above and more, about how we’re gonna get through this and make things better.