In January 2021, Microsoft knowingly left customers exposed to an active cyber attack for at least two months while it fixed flaws in its Exchange software. A month earlier, SolarWinds admitted that an attacker was actively victimizing customers using malicious code that had been inserted into its software ten months prior. Despite the substantial harm caused by each event, the power the companies wield over customers will likely help them deflect significant accountability and reinforce a system of privilege that is steadily eroding global cyber defenses.
There are many other examples, but one 2018 disclosure serves as a foundational case study in cybersecurity inequity. It was then, six months after discovering the worst computer bugs in history, that a secret response effort coordinated across seven technology industry titans collapsed catastrophically a week sooner than planned. Its failure to effectively shield technology customers from harm clearly exposed a system of privilege that continues to stymie hardware and software supply chain security today.
Jann Horn, a 22-year-old security researcher on Google’s Project Zero vulnerability hunting team, reported potential flaws in computer processors to major manufacturers on June 1, 2017. Horn’s discovery that a hacker could potentially steal data from every computer, phone, and smart device on the planet was the technology industry’s worst nightmare come true.
The Central Processing Unit (CPU) is the brain that determines the result of any computer function. As a physical piece of hardware, it operates by breaking down the millions of instructions and calculations at the heart of each function into electrons that it channels through an extraordinarily complex maze of conductive pathways. Each time a pathway carries an electron, it generates a minuscule amount of heat that rapidly accumulates.
One way that manufacturers keep CPUs cool enough to remain stable while also satisfying the world’s insatiably growing appetite for computing power is to find ways to use software to trick a CPU into doing more in less time. For example, if the CPU can speculate multiple likely outcomes for a function in advance and begin working on subsequent functions based on those possible outcomes, it will be able to jump ahead a bit after completing the first function.
Saving time is great, but speculation comes at some cost. For the CPU, acting on multiple possible outcomes creates surreptitious pathways for attackers to bypass security controls, execute privileged functions, and gain access to sensitive data. Academic researchers had theorized for years that speculative execution, a common CPU architectural design feature with roots dating back to 1995, could be vulnerable to attack. Since no one had previously succeeded in proving the theories in practice, manufacturers continued to incorporate the feature in new CPUs.
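To make that mechanism concrete, the following is a deliberately simplified Python sketch of how work done on a wrong guess can leave a trace behind. It is a toy model, not the exploit code from any of the vulnerability reports, and it assumes that the only side effect of speculation is an entry in a cache that an attacker can later observe. Every name in it (ToyCPU, SECRET, PUBLIC, MEMORY, read_public, infer_from_cache) is hypothetical and chosen purely for illustration.

```python
# Toy model only: real CPUs, caches, and timing measurements are far more complex.
SECRET = 42                 # hypothetical privileged value that must never be returned
PUBLIC = [1, 2, 3, 4]       # data the caller is allowed to read
MEMORY = PUBLIC + [SECRET]  # the secret happens to sit just past the public data


class ToyCPU:
    def __init__(self):
        # Values whose "cache line" was touched, standing in for a fast-to-access line.
        self.cache = set()

    def read_public(self, index):
        """Bounds-checked read that runs ahead before the check resolves."""
        # Speculative step: guess that the bounds check will pass and do the work early.
        speculative_value = None
        if 0 <= index < len(MEMORY):
            speculative_value = MEMORY[index]
            self.cache.add(speculative_value)  # this side effect is never rolled back

        # Architectural step: the bounds check finally resolves.
        if 0 <= index < len(PUBLIC):
            return speculative_value  # the guess was right, so the result is kept
        return None                   # the guess was wrong, so the result is discarded


def infer_from_cache(cpu, candidates=range(256)):
    """Attacker's view: list the candidate values that appear to be cached."""
    return [value for value in candidates if value in cpu.cache]


cpu = ToyCPU()
refused = cpu.read_public(4)    # out of bounds, so the read itself returns None
leaked = infer_from_cache(cpu)  # the leftover cache footprint still exposes the value
print(refused, leaked)          # None [42]
```

In this sketch the bounds check does its job and the secret is never returned, yet the leftover cache state is enough to recover it. That gap between what a program is allowed to see and what its side effects reveal is the kind of surreptitious pathway described above.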
Hardware flaws are the most subversive supply chain vulnerabilities because they can only be completely fixed by replacing the damaged physical components. After validating Horn’s report, manufacturers faced the herculean challenge of redesigning, producing, and replacing hundreds of millions of CPUs. It would normally take several years and hundreds of millions of dollars to just begin manufacturing a new CPU. Rather than wait and hope the decades-old vulnerabilities stayed hidden, the manufacturers sought help from other key companies to hide the defective features.
Secrecy is a universal pillar for how the technology industry addresses adverse events. Shielded by use agreements that disallow customers from peering too deeply into how products work and licensing contracts that indemnify against damage caused by defects, technology companies have little incentive to be transparent when problems are found. True to the inherent protectionist desire to restrict information flow about adverse discoveries, the CPU manufacturers recruited a very small cabal of experts from the most prominent technology companies to help address the vulnerabilities.
Led by Intel, the largest manufacturer of vulnerable processors, the coordinated vulnerability disclosure (CVD) group included two other CPU manufacturers (AMD and Arm) and at least four other companies that develop operating system software used by devices powered by affected CPUs (Amazon, Apple, Google, and Microsoft). Together representing an estimated $1T in 2020 revenue, those organizations roughly account for 20% of the estimated $5T technology industry despite being barely more than a handful of the approximately 500K technology companies in the US alone. That elite group of experts collaborated under a responsible disclosure agreement to keep their activities quiet until reaching a deadline that lifted the embargo.
Horn and Google Project Zero would ultimately grant the manufacturers six months to address the vulnerabilities, agreeing to wait until January 9, 2018 before announcing them to the world.
Maintaining the secrecy of such a coordination effort requires a shared proprietary sense of self-preservation. That directly contradicts the development ethos of one other critical supply chain component: the Linux operating system.
Though rarely used on personal computers, Linux may be the most pervasive operating system for powering websites and cloud services. Linux itself also powers many smart devices, providing the software code foundation for Google Android devices and Chromebooks, Amazon Kindle electronic readers and Fire devices, and even Tesla vehicles. Originally written in the early 1990s, Linux is an example of open source software developed by a public community of contributors who fiercely champion openness and transparency as intentional subversions of the more proprietary technology business culture. The CVD response could not succeed without also updating Linux, but direct engagement would put any secret repair effort at risk.
Dave Hansen, a Linux hacker and engineer at Intel’s Open Source Technology Center, attempted to leverage the community’s openness to quietly introduce a way to cover the CPU vulnerabilities in an October 31, 2017 code submission. Based on a proof-of-concept developed by researchers at the Graz University of Technology to prevent operating system memory attacks against hardware, Hansen’s work represented a major change to the main Linux hardware interface component, its kernel. The effort proved too conspicuous.
Though pleased at the sudden strong interest in research they published in February 2017, the Graz researchers grew suspicious when Hansen played down the significant performance impact caused by the new kernel component. Subsequent digging led the Graz team to repeat Horn’s vulnerability report on December 3, 2017.
Four independent researchers or research teams would eventually be recognized for separately discovering three related CPU vulnerabilities in the six months prior to their public disclosure. Having so many researchers reporting the same years-old vulnerabilities within just a few months of each other indicated broad access to information that pointed to the inherent flaws. Though the manufacturers still had no knowledge of active exploits against the vulnerabilities, time was running out before sophisticated agents would also be able to follow the breadcrumbs (assuming that they hadn’t already).
Linux community openness predictably proved to be the weak link in the carefully choreographed coordinated response. Two days following the December 18 Linux kernel update that included the new “Kernel Page Table Isolation (KPTI)” code that had evolved from Hansen’s initial submission, a developer posted that the KPTI insertion had “all the markings of a security patch being readied under pressure from a deadline.” Curiously, an AMD software engineer then elevated the issue, instructing users not to enable KPTI on computers using AMD processors. Arguing that attacks targeting “speculative references” did not apply to the company’s processors, the message marked the first official utterance of the low-level CPU feature in the context of the code that Hansen had introduced.
Since the associated research targeting speculative execution applied at a level of computer science beyond the expertise of most developers, it’s doubtful that many would have made the connection on their own. But the apparently inadvertent AMD disclosure opened the floodgates and demonstrated the power of collaborative vulnerability detection. Two days later, a security researcher tweeted a briefing that described a potentially related design vulnerability in how processors use computer memory. Then, on January 2, 2018, technology news site The Register reported a severe Intel processor design flaw based on the subsequent Linux email list chatter. Shortly thereafter, a tweet from the aforementioned security researcher demonstrated a live exploit of that vulnerability.
Cybersecurity is an unforgivingly parasitic companion for modern businesses and the technology industry has a long history of failing to properly address major weaknesses. Companies routinely take advantage of technology complexity to redirect blame for their own failures and redistribute accountability onto others. The inherent narcissism and arrogance at the heart of that systemic gaslighting drives a prolific exclusionary culture woven throughout the industry. Though the most prominently derivative biases relate to the lack of workforce diversity and algorithmic stereotyping, exclusionary corporate behavior also impedes the healthy information sharing relationships and multi-organizational collaborations crucial for addressing supply chain weaknesses. Instead, the technology industry operates under a system of privilege through class hierarchies forged from economic status, market position, and history.
Though privilege is most commonly associated with race and gender, social psychologists have developed a more generalized understanding of it over the past 20 years. One review article approaches that broader examination as pertaining to how membership in a dominant group promotes elements of power and oppression. Market dominance elicits a groupthink perspective that what serves an industry leader’s needs automatically serves the needs of its respective market segment. In the technology industry, that means large companies routinely leverage their power to establish rules and constructs that favor their business contexts as most representative of, and therefore important to, their market spaces. That business class privilege extends to cybersecurity by empowering those better positioned organizations to lead the industry’s response to large-scale incidents and events.
Knowledge is currency in the cybersecurity battle space. Since the probability that an attacker can exploit a vulnerability increases the longer a system is exposed, knowing about that vulnerability before other affected organizations implicitly promotes inequity. The advance notice afforded the four CVD group software companies reinforced their positions of privilege by enabling them to more rapidly defend their own interests while smaller competitors and other organizations scrambled to match months of preparation. Just being included in the group granted extraordinary benefits regardless of whether the effort succeeded or failed.
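The relationship between exposure time and risk can be sketched with a few lines of Python. This is a back-of-the-envelope illustration under loud assumptions, not a measurement: it assumes a constant, independent chance of successful exploitation on each day a system remains unpatched after public disclosure, and the daily risk and day counts below are hypothetical values rather than figures from the incident.

```python
def compromise_probability(daily_risk, exposed_days):
    """Chance of at least one successful exploit during the exposure window,
    assuming a constant and independent daily risk (an illustrative assumption)."""
    return 1 - (1 - daily_risk) ** exposed_days


DAILY_RISK = 0.01    # hypothetical: 1% chance of exploitation per exposed day
INSIDER_DAYS = 0     # hypothetical: patches are ready the day the embargo lifts
OUTSIDER_DAYS = 60   # hypothetical: the same preparation only starts at disclosure

insider = compromise_probability(DAILY_RISK, INSIDER_DAYS)
outsider = compromise_probability(DAILY_RISK, OUTSIDER_DAYS)
print(f"insider: {insider:.2f}, outsider: {outsider:.2f}")
```

Under those assumed numbers, the organization with advance notice carries essentially no post-disclosure exposure, while the one that starts from nothing faces a risk approaching even odds. The specific values matter far less than the shape of the curve: every additional day of exposure compounds the disadvantage.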
Coordination group membership beyond the CPU manufacturers prompted anti-competition issues in two primary market areas: operating systems and cloud hosting services.
Just about every computing device includes an operating system to manage how it interfaces with the environment around it. While the most familiar are those that consumers use every day, such as Microsoft Windows or Apple iOS, the market includes dozens of operating systems that support the vulnerable processors. The Linux kernel itself represents a key source of technology supply chain complexity since many other companies use it to develop custom “distributions,” or versions, of the operating system. For example, Google built its Android and ChromeOS systems from a modified Linux kernel, as did Amazon and countless other companies. Updating the kernel prior to the public disclosure was a logical first step, but that update would need to propagate across all of the active distributions to effectively mitigate the CPU vulnerabilities. Calling the result an example of “selective disclosure,” the leader of one major Linux distribution argued that the effort put smaller competitors at an extraordinary disadvantage to those coordination group members that were able to begin working on and deploying patches in advance.
Likely even broader in its reach and complexity, the cloud hosting services market includes hundreds of companies worldwide that most notably compete with Amazon Web Services (AWS), Google Cloud Platform, and Microsoft Azure Cloud. Cloud services center on making more efficient use of technology infrastructure by having customers share physical computing resources. Oftentimes, that includes hosting a “virtual” server by carving out a portion of computational power and configuring it for customer access from elsewhere on the Internet. In practice, those virtual installations are standard operating systems nested within other operating systems designed to protect one customer’s computing resources from another. A sophisticated attacker could potentially exploit the CPU vulnerabilities to “bridge the gap” after successfully attacking one customer to infiltrate other customer systems hosted on the same physical hardware. Customer trust in that market depends on the provider’s ability to carefully choreograph patch installation and system reboots with minimal impact to service availability.
Again, exclusion from the coordination group put cloud hosting services competitors at a severe disadvantage as they scrambled in response to the public disclosure while the market leaders executed against predefined response plans. In a blog post responding to the disclosure, the Chief Security Officer of DigitalOcean explained that “the strict embargo placed by Intel has significantly limited our ability to establish a comprehensive understanding of the potential impact.” Ars Technica quoted a blunt statement from the chief executive of another provider in its deep report on the initial disclosure aftermath. “The big guys — Google, Amazon, and Microsoft — have had 60 days at least of prep time, and we’ve had negative prep time.”
Public officials were blindsided by the disclosure because the CVD group also excluded relevant government authorities such as the US Computer Emergency Response Team Coordination Center (CERT/CC). In the absence of vulnerability notification regulations or requirements, technology companies may engage the CERT/CC as a courtesy to mitigate damage from upcoming disclosures, especially for issues that potentially impact critical infrastructure.
Forced into action without any prior knowledge or CVD engagement, the US CERT recommended CPU replacement as the primary solution for affected systems. Its initial notice explained that the “underlying vulnerability is primarily caused by CPU architecture design choices.” Adding that “removing the vulnerability requires replacing vulnerable CPU hardware,” the notification fomented confusion and prompted early backlash against the coordination group efforts to manage panic. Despite deemphasizing replacement in a subsequent update, the US CERT maintained that customers should consider their options, eventually noting that though “replacing existing CPUs in already deployed systems is not practical, organizations acquiring new systems should evaluate their CPU selection” to address the likely long-term impact of the vulnerabilities on hardware.
After coordination group members appeared to bungle their own vulnerability responses, congressional leaders took notice. On January 24, 2018, the US House of Representatives Committee on Energy and Commerce sent letters to each of the CVD group members asserting that the “situation has shown the need for additional scrutiny regarding multi-party coordinated vulnerability disclosures.” Arguing that effective responses to cybersecurity incidents “require extensive collaboration not only between individual companies, but also across sectors traditionally siloed from one another,” the letters questioned the companies on the impact the disclosure embargoes have on building healthy collaborations.
Predictably, coordination group members responded by deflecting questions on embargo agreements, redirecting disclosure decisions to other members, and disavowing any individual need to notify the US CERT or the CERT/CC.
Microsoft’s response leveraged commonly understood responsible disclosure practices to best convey the group structure while effectively diminishing its own role in group decision-making. It noted that one company, Google, met the CVD protocol’s definition of the “finder,” while each of the CPU companies was a vulnerability “owner.” Since the owner holds responsibility for “determining how best to address the vulnerability in its product,” each manufacturer could have reasonably conducted its own individual CVD process. Instead, the group decided to form a most unusual joint coordination team. Though a collaborative mitigation process likely eliminated the potential for varying disclosure timelines that would prompt competition between them, it also likely distributed responsibility such that no one manufacturer would be accountable for leading the group.
Lost in the responses, however, was that while Project Zero could legitimately dictate the disclosure timeline as the “finder,” Google was not responsible for formalizing the disclosure embargo, an action that would fall to the vulnerability owner. Google explained that its standard 90-day disclosure period “was extended over time, in consultation with the affected developers and given the complex nature of the vulnerability and the mitigations.” With three owners, such an embargo would require a consensus agreement, one that the manufacturers either withheld from their responses to Congress or chose not to legally execute.
In fact, most of the responses conspicuously sidestepped the embargo discussion. Amazon provided the most clarity, briefly noting in its response, “all information Intel provided Amazon was subject to a Non-Disclosure Agreement. The companies that discovered and initially disclosed the vulnerabilities determined the application of the Non-Disclosure Agreement.” Apple passively addressed its adherence to what it called “standard industry practice,” explaining, “Apple (like other notified vendors) was required to agree not to disclose the vulnerability for ninety days.” Still left unanswered is whether the companies executed that agreement in response to the vulnerability reports or as part of some other extended formal relationship.
All of the responses were much more consistent in disclaiming any need or assumed responsibility for coordinating disclosure with the US CERT. Microsoft best stated its own position and that of the other supporting organizations by pointing to the manufacturers. “CVD assigns to the owners of a vulnerability the authority to decide whether to notify others, including whether to notify government agencies.” Though Intel and Arm demurred, AMD’s response directly rejected the implication that it held any responsibility to engage, tersely arguing, “While federal civilian agencies are required to report cybersecurity incidents to US-CERT, there is no similar requirement for any entity, including private companies such as AMD, to report vulnerabilities to either US CERT or CERT/CC.” AMD’s response also attempted to subversively delegitimize the US CERT as an authoritative government institution in such a coordinated disclosure, dismissing it as “a private, non-governmental organization at Carnegie Mellon University.”
When Congress granted the US CERT an opportunity to address the CVD effort, its response directly pushed back against the premise that responsible disclosure empowered the exclusion of government stakeholders. “The focus on producing patches should not come at the expense of notifying organizations responsible for critical infrastructure protection and public safety — typically government organizations like DHS NCCIC [Department of Homeland Security National Cybersecurity and Communications Integration Center] (including US-CERT and [Industrial Control System]-CERT). We consider the vendors’ lack or inadequacy of such notification to have been in error.” Doubling down, the US CERT described the implication that it could leak an advance vulnerability disclosure as “specious,” countering that the organization was “not aware of any embargoed vulnerabilities reported to DHS NCCIC having been leaked prematurely.”
The US CERT had good reason to interpret the corporate responses as hostile and asserting exclusive privilege over the coordinated disclosure effort. Amazon previously justified the group’s actions by explaining its understanding that “the vulnerabilities were disclosed to a limited number of companies best-positioned to develop broad, industry-wide remediation while maintaining confidentiality during the extended period required to develop countermeasures.” In its earlier response to Congress, Intel confirmed that understanding, declaring that the company had “disclosed information … only to companies who could assist Intel in enhancing the security of technology users.” The US CERT attacked the resulting ineffectiveness of that exclusionary posture. “When that advanced notification does not occur sufficiently early, … coordinators may be in a rush to understand the issue while preparing their advisories, leading to erroneous and inadequate advice to their constituencies.”
Though withholding disclosure to prepare a sufficient response is a reasonable strategy in most cases, the effectiveness of allowing vulnerability owners to determine for themselves what other organizations to include in a coordinated response may approach a limit as the vulnerability impact extends through many supply chain layers. Once potential competitive concerns and business liabilities become catastrophic, a company will naturally shift its response to defending its own interests over those of its customers. In this case, the CVD group executed a response that favored its members’ needs as leaders of their respective ecosystems, leveraging their positions of power to reinforce an inappropriate advantage over others. Perhaps the resulting chaos and lingering impact across the technology supply chain were unavoidable, but the CPU manufacturers subverted the response and undermined the spirit of responsible disclosure when forming an extraordinarily exclusive coordination group that only granted the opportunity of advance preparation to their top stakeholders.
Sadly, rather than hold the companies accountable for creating an anti-competitive vulnerability response environment or even seek to clarify what the public should expect from future CVD efforts, Congress chose to blame technology complexity and negative publicity for collaboration failures. In a white paper that detailed conclusions from its investigation, the Committee on Energy and Commerce instead recommended the creation of a new legal framework to “encourage more organizations and third-parties to leverage CVD and its attendant benefits.” Further suggesting that Congress “explore ways to encourage federal agencies and private sector stakeholders to address and minimize the negative public responses to CVDs,” the Committee ultimately disregarded the very real possibility that organizations are susceptible to varying levels of harm from cybersecurity incidents based more on their market positions than the inherent danger posed by operating on the Internet.
On January 3, 2018, six days prior to the planned coordinated disclosure, Horn broke the secrecy agreement and posted a detailed explanation of the Meltdown and Spectre vulnerabilities. The now familiar chaos ensued as companies, government agencies, and consumers scrambled to defend themselves without any significant assistance. Hundreds of millions of devices around the world will be susceptible to exploits for years to come.
Since then, large technology companies have continued to wield their positions of power to make cybersecurity decisions that favor their own interests while keeping customers and smaller competitors many steps behind. Without receiving some new incentive or motivation that mitigates the pervasive power imbalance making customers subservient to the companies that they license products and services from, BigTech can keep pushing the boundaries of how little responsibility it has for defending organizations against both product vulnerabilities and sophisticated threat actors.