<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Software Engineering at Google</title>
<script type="text/javascript" src="https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"> </script>
<link rel="stylesheet" type="text/css" href="theme/html/html.css">
</head>
<body data-type="book">
<section xmlns="http://www.w3.org/1999/xhtml" data-type="chapter" id="large-scale_changes">
<h1>Large-Scale Changes</h1>
<p class="byline">Written by Hyrum Wright</p>
<p class="byline">Edited by Lisa Carey</p>
<p>Think for a moment about your own codebase.<a contenteditable="false" data-primary="large-scale changes" data-type="indexterm" id="ix_LSC">&nbsp;</a> How many files can you reliably update in a single, simultaneous commit? What are the factors that constrain that number? Have you ever tried committing a change that large? Would you be able to do it in a reasonable amount of time in an emergency? How does your largest commit size compare to the actual size of your codebase? How would you test such a change? How many people would need to review the change before it is committed? Would you be able to roll back that change if it did get committed? The answers to these questions might surprise you (both what you <em>think</em> the answers are and what they actually turn out to be for your organization).</p>
<p>At Google, we long ago abandoned the idea of making sweeping changes across our codebase as large atomic changes. Our observation has been that, as a codebase and the number of engineers working in it grow, the largest atomic change possible counterintuitively <em>decreases</em>: running all affected presubmit checks and tests becomes difficult, to say nothing of ensuring that every file in the change is up to date before submission. As it has become more difficult to make sweeping changes to our codebase, and given our general desire to continually improve underlying infrastructure, we’ve had to develop new ways of reasoning about large-scale changes and how to implement them.</p>
<p>In this chapter, we’ll talk about the techniques, both social and technical, that enable us to keep the large Google codebase flexible and responsive to changes in underlying infrastructure. We’ll also provide some real-life examples of how and where we’ve used these approaches. Although your codebase might not look like Google’s, understanding these principles and adapting them locally will help your development organization scale while still being able to make broad changes across your codebase.</p>
<section data-type="sect1" id="what_is_a_large-scale_changequestion_ma">
<h1>What Is a Large-Scale Change?</h1>
<p>Before going much further, we should dig into what qualifies as a large-scale change (LSC). <a contenteditable="false" data-primary="large-scale changes" data-secondary="qualities of" data-type="indexterm" id="id-aguVHXskS6">&nbsp;</a>In our experience, an LSC is any set of changes that are logically related but cannot practically be submitted as a single atomic unit.<a contenteditable="false" data-primary="LSCs" data-see="large-scale changes" data-type="indexterm" id="id-J6u6sDs8SO">&nbsp;</a> This might be because it touches so many files that the underlying tooling can’t commit them all at once, or it might be because the change is so large that it would always have merge conflicts. In many cases, an LSC is dictated by your repository topology: if your organization uses a collection of distributed or federated repositories,<sup><a data-type="noteref" id="ch01fn220-marker" href="ch22.html#ch01fn220">1</a></sup> making atomic changes across them might not even be technically possible.<sup><a data-type="noteref" id="ch01fn221-marker" href="ch22.html#ch01fn221">2</a></sup> We’ll look at potential barriers to atomic changes in more detail later in this chapter.<a contenteditable="false" data-primary="changes to code" data-secondary="large-scale" data-see="large-scale changes" data-type="indexterm" id="id-wgu4cMsxSn">&nbsp;</a></p>
<p>LSCs at Google are almost always generated using automated tooling. Reasons for making an LSC vary, but the changes themselves generally fall into a few basic <span class="keep-together">categories:</span></p>
<ul>
<li>
<p>Cleaning up common antipatterns using codebase-wide analysis tooling</p>
</li>
<li>
<p>Replacing uses of deprecated library features</p>
</li>
<li>
<p>Enabling low-level infrastructure improvements, such as compiler upgrades</p>
</li>
<li>
<p>Moving users from an old system to a newer one<sup><a data-type="noteref" id="ch01fn222-marker" href="ch22.html#ch01fn222">3</a></sup></p>
</li>
</ul>
<p>The number of engineers working on these specific tasks in a given organization might be low, but it is useful for their customers to have insight into the LSC tools and process. By their very nature, LSCs will affect a large number of customers, and the LSC tools easily scale down to teams making only a few dozen related changes.</p>
<p>There can be broader motivating causes behind specific LSCs. For example, a new language standard might introduce a more efficient idiom for accomplishing a given task, an internal library interface might change, or a new compiler release might require fixing existing problems that would be flagged as errors by the new release. The majority of LSCs across Google actually have near-zero functional impact: they tend to be widespread textual updates for clarity, optimization, or future compatibility. But LSCs are not theoretically limited to this behavior-preserving/refactoring class of change.</p>
<p>In all of these cases, on a codebase the size of Google’s, infrastructure teams might routinely need to change hundreds of thousands of individual references to the old pattern or symbol. In the largest cases so far, we’ve touched millions of references, and we expect the process to continue to scale well. Generally, we’ve found it advantageous to invest early and often in tooling to enable LSCs for the many teams doing infrastructure work. We’ve also found that efficient tooling helps engineers performing smaller changes. The same tools that make changing thousands of files efficient also scale down to tens of files reasonably well.</p>
</section>
<section data-type="sect1" id="who_deals_with_lscsquestion_mark">
<h1>Who Deals with LSCs?</h1>
<p>As just indicated, the infrastructure teams that build and manage our systems are responsible for much of the work of performing LSCs, but the tools and resources are available across the company.<a contenteditable="false" data-primary="large-scale changes" data-secondary="responsibility for" data-type="indexterm" id="ix_LSCresp">&nbsp;</a> If you skipped <a data-type="xref" href="ch01.html#what_is_software_engineeringquestion_ma">What Is Software Engineering?</a>, you might wonder why infrastructure teams are the ones responsible for this work. Why can’t we just introduce a new class, function, or system and dictate that everybody who uses the old one move to the updated analogue? Although this might seem easier in practice, it turns out not to scale very well for several reasons.</p>
<p>First, the infrastructure teams that build and manage the underlying systems are also the ones with the domain knowledge required to fix the hundreds of thousands of references to them. Teams that consume the infrastructure are unlikely to have the context for handling many of these migrations, and it is globally inefficient to expect them to each relearn expertise that infrastructure teams already have. Centralization also allows for faster recovery when faced with errors because errors generally fall into a small set of categories, and the team running the migration can have a playbook—formal or informal—for addressing them.</p>
<p>Consider the amount of time it takes to do the first of a series of semi-mechanical changes that you don’t understand. You probably spend some time reading about the motivation and nature of the change, find an easy example, try to follow the provided suggestions, and then try to apply that to your local code. Repeating this for every team in an organization greatly increases the overall cost of execution. By making only a few centralized teams responsible for LSCs, Google both internalizes those costs and drives them down by making it possible for the change to happen more efficiently.</p>
<p>Second, nobody likes unfunded mandates.<sup><a data-type="noteref" id="ch01fn223-marker" href="ch22.html#ch01fn223">4</a></sup> Even though a new system might be categorically better than the one it replaces, those benefits are often diffused across an organization and thus unlikely to matter enough for individual teams to want to update on their own initiative. If the new system is important enough to migrate to, the costs of migration will be borne somewhere in the organization. Centralizing the migration and accounting for its costs is almost always faster and cheaper than depending on individual teams to organically migrate.</p>
<p>Additionally, having teams that own the systems requiring LSCs helps align incentives to ensure the change gets done. In our experience, organic migrations are unlikely to fully succeed, in part because engineers tend to use existing code as examples when writing new code. Having a team that has a vested interest in removing the old system responsible for the migration effort helps ensure that it actually gets done. Although funding and staffing a team to run these kinds of migrations can seem like an additional cost, it is actually just internalizing the externalities that an unfunded mandate creates, with the additional benefits of economies of scale.</p>
<aside data-type="sidebar" id="filling_potholes">
<h5>Case Study: Filling Potholes</h5>
<p>Although the LSC systems at Google are used for high-priority migrations, we’ve also discovered that just having them available opens up opportunities for various small fixes across our codebase, which just wouldn’t have been possible without them.<a contenteditable="false" data-primary="small fixes across the codebase with LSCs" data-type="indexterm" id="id-Qbu4HEsRSYt7">&nbsp;</a> Much like transportation infrastructure tasks consist of building new roads as well as repairing old ones, infrastructure groups at Google spend a lot of time fixing existing code, in addition to developing new systems and moving users to them.</p>
<p>For example, early in our history, a template library emerged to supplement the C++ Standard Template Library. Aptly named the Google Template Library, this library consisted of several header files’ worth of implementation. For reasons lost in the mists of time, one of these header files was named <em>stl_util.h</em> and another was named <em>map-util.h</em> (note the different separators in the file names). In addition to driving the consistency purists nuts, this difference also reduced productivity: engineers had to remember which file used which separator, and discovered that they had gotten it wrong only after a potentially lengthy compile cycle.</p>
<p>Although making this single-character change might seem pointless, particularly across a codebase the size of Google’s, the maturity of our LSC tooling and process enabled us to do it with just a couple of weeks’ worth of background-task effort. Library authors could find and apply this change en masse without having to bother end users of these files, and we were able to quantitatively reduce the number of build failures caused by this specific issue. The resulting increases in productivity (and happiness) more than paid for the time to make the change.</p>
<p>As the ability to make changes across our entire codebase has improved, the diversity of changes has also expanded, and we can make some engineering decisions knowing that they aren’t immutable in the future. Sometimes, it’s worth the effort to fill a few potholes.<a contenteditable="false" data-primary="large-scale changes" data-secondary="responsibility for" data-startref="ix_LSCresp" data-type="indexterm" id="id-0OuJHNc5SDtO">&nbsp;</a></p>
</aside>
</section>
<section data-type="sect1" id="barriers_to_atomic_changes">
<h1>Barriers to Atomic Changes</h1>
<p>Before we<a contenteditable="false" data-primary="atomic changes, barriers to" data-type="indexterm" id="ix_atom">&nbsp;</a> discuss the process<a contenteditable="false" data-primary="large-scale changes" data-secondary="barriers to atomic changes" data-type="indexterm" id="ix_LSCbarr">&nbsp;</a> that Google uses to actually effect LSCs, we should talk about why many kinds of changes can’t be committed atomically. In an ideal world, all logical changes could be packaged into a single atomic commit that could be tested, reviewed, and committed independent of other changes. Unfortunately, as a repository (and the number of engineers working in it) grows, that ideal becomes less feasible. It can be completely infeasible even at small scale when using a set of distributed or federated repositories.</p>
<section data-type="sect2" id="technical_limitations">
<h2>Technical Limitations</h2>
<p>To begin with, most Version Control Systems (VCSs) have operations that scale linearly with the size of a change. <a contenteditable="false" data-primary="large-scale changes" data-secondary="barriers to atomic changes" data-tertiary="technical limitations" data-type="indexterm" id="id-wguNHMsVfahG">&nbsp;</a><a contenteditable="false" data-primary="centralized version control systems (VCSs)" data-secondary="operations scaling linearly with size of a change" data-type="indexterm" id="id-dXuZs1sXf3ha">&nbsp;</a>Your system might be able to handle small commits (e.g., on the order of tens of files) just fine, but might not have sufficient memory or processing power to atomically commit thousands of files at once. In centralized VCSs, commits can block other writers (and in older systems, readers) from using the system as they process, meaning that large commits stall other users of the system.</p>
<p>In short, it might not be just “difficult” or “unwise” to make a large change atomically: it might simply be impossible with a given infrastructure. Splitting the large change into smaller, independent chunks gets around these limitations, although it makes the execution of the change more complex.<sup><a data-type="noteref" id="ch01fn224-marker" href="ch22.html#ch01fn224">5</a></sup></p>
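<p>As a rough illustration of splitting a change into independent chunks, the following Python sketch groups the files touched by a change by directory and caps the size of each shard. The function name <code>shard_by_directory</code>, the directory-based grouping, and the default of 25 files per shard are assumptions made for this example; they do not describe Google's actual tooling.</p>

```python
from collections import defaultdict
from pathlib import PurePosixPath


def shard_by_directory(paths, max_files_per_shard=25):
    """Split file paths into shards that never span two directories.

    Hypothetical helper: grouping by parent directory is an assumption
    standing in for whatever boundary (project, owner) suits a codebase.
    """
    by_dir = defaultdict(list)
    for path in sorted(paths):
        by_dir[str(PurePosixPath(path).parent)].append(path)

    shards = []
    for directory in sorted(by_dir):
        files = by_dir[directory]
        # Cut each directory's files into chunks of bounded size.
        for start in range(0, len(files), max_files_per_shard):
            shards.append(files[start:start + max_files_per_shard])
    return shards
```

<p>Each shard can then be tested, reviewed, and committed on its own, at the cost of tracking many small changes instead of one large one.</p>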
</section>
<section data-type="sect2" id="merge_conflicts">
<h2>Merge Conflicts</h2>
<p>As the size of a change grows, the<a contenteditable="false" data-primary="large-scale changes" data-secondary="barriers to atomic changes" data-tertiary="merge conflicts" data-type="indexterm" id="id-dXuvH1s8C3ha">&nbsp;</a> potential for merge conflicts also increases.<a contenteditable="false" data-primary="merge conflicts, size of changes and" data-type="indexterm" id="id-Gku8sEs4Cnhq">&nbsp;</a> Every version control system we know of requires updating and merging, potentially with manual resolution, if a newer version of a file exists in the central repository. As the number of files in a change increases, the probability of encountering a merge conflict also grows and is compounded by the number of engineers working in the <span class="keep-together">repository.</span></p>
<p>If your company is small, you might be able to sneak in a change that touches every file in the repository on a weekend when nobody is doing development. Or you might have an informal system of grabbing the global repository lock by passing a virtual (or even physical!) token around your development team. At a large, global company like Google, these approaches are just not feasible: somebody is always making changes to the repository.</p>
<p>With fewer files in a change, the probability of merge conflicts shrinks, so smaller changes are more likely to be committed without problems. This property also holds for the areas that follow.</p>
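<p>A back-of-the-envelope model shows how quickly this effect compounds. The sketch below assumes, purely for illustration, that every file in a change has the same small chance of being modified by someone else before the change lands, and that files conflict independently of one another. Neither assumption holds exactly in a real repository, and the rate used in the usage note is invented.</p>

```python
def conflict_probability(files_in_change, per_file_conflict_rate):
    """Chance that at least one file in a change hits a merge conflict.

    Assumes independent, identically likely conflicts per file, which is
    a simplification made for this example.
    """
    return 1.0 - (1.0 - per_file_conflict_rate) ** files_in_change
```

<p>With an illustrative per-file rate of 0.001, a 25-file change conflicts roughly 2.5% of the time under this model, whereas a 10,000-file change is all but certain to conflict.</p>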
</section>
<section data-type="sect2" id="no_haunted_graveyards">
<h2>No Haunted Graveyards</h2>
<p>The SREs who run Google’s production services have a mantra: “No Haunted Graveyards.” A haunted<a contenteditable="false" data-primary="large-scale changes" data-secondary="barriers to atomic changes" data-tertiary="no haunted graveyards" data-type="indexterm" id="id-Gku5HEszcnhq">&nbsp;</a> graveyard in this<a contenteditable="false" data-primary="haunted graveyards" data-type="indexterm" id="id-QbubsEswczh7">&nbsp;</a> sense is a system that is so ancient, obtuse, or complex that no one dares enter it. Haunted graveyards are often business-critical systems that are frozen in time because any attempt to change them could cause the system to fail in incomprehensible ways, costing the business real money. They pose a real existential risk and can consume an inordinate amount of resources.</p>
<p>Haunted graveyards don’t just exist in production systems, however; they can be found in codebases. Many organizations have bits of software that are old and unmaintained, written by someone long off the team, and on the critical path of some important revenue-generating functionality. These systems are also frozen in time, with layers of bureaucracy built up to prevent changes that might cause instability. Nobody wants to be the network support engineer II who flipped the wrong bit!</p>
<p>These parts of a codebase are anathema to the LSC process because they prevent the completion of large migrations, the decommissioning of other systems upon which they rely, or the upgrade of compilers or libraries that they use. From an LSC perspective, haunted graveyards prevent all kinds of meaningful progress.</p>
<p>At Google, we’ve found the counter to this to be good, ol’-fashioned testing. When software is thoroughly tested, we can make arbitrary changes to it and know with confidence whether those changes are breaking, no matter the age or complexity of the system. Writing those tests takes a lot of effort, but it allows a codebase like Google’s to evolve over long periods of time, consigning the notion of haunted software graveyards to a graveyard of its own.</p>
</section>
<section data-type="sect2" id="heterogeneity">
<h2>Heterogeneity</h2>
<p>LSCs really work only when the bulk of the effort for them can be done by computers, not humans.<a contenteditable="false" data-primary="large-scale changes" data-secondary="barriers to atomic changes" data-tertiary="heterogeneity" data-type="indexterm" id="id-Qbu4HEsjIzh7">&nbsp;</a> As good as humans can be with ambiguity, computers rely upon consistent environments to apply the proper code transformations to the correct places. If your organization has many different VCSs, Continuous Integration (CI) systems, project-specific tooling, or formatting guidelines, it is difficult to make sweeping changes across your entire codebase. Simplifying the environment to add more consistency will help both the humans who need to move around in it and the robots making automated transformations.</p>
<p class="pagebreak-before">For example, many projects at Google have presubmit tests configured to run before changes are made to their codebase. Those checks can be very complex, ranging from checking new dependencies against a whitelist, to running tests, to ensuring that the change has an associated bug. Many of these checks are relevant for teams writing new features, but for LSCs, they just add irrelevant complexity.</p>
<p>We’ve decided to embrace some of this complexity, such as running presubmit tests, by making it standard across our codebase. For other inconsistencies, we advise teams to omit their special checks when parts of LSCs touch their project code. Most teams are happy to help given the benefit these kinds of changes are to their projects.</p>
<div data-type="note" id="id-znUZcZIMhM"><h6>Note</h6>
<p>Many of the benefits of consistency for humans mentioned in <a data-type="xref" href="ch08.html#style_guides_and_rules">Style Guides and Rules</a> also apply to automated tooling.</p>
</div>
</section>
<section data-type="sect2" id="testing-id00096">
<h2>Testing</h2>
<p>Every change should be tested (a process we’ll talk about more in just a moment), but the<a contenteditable="false" data-primary="large-scale changes" data-secondary="barriers to atomic changes" data-tertiary="testing" data-type="indexterm" id="id-AAuZHKsNSyhn">&nbsp;</a> larger <a contenteditable="false" data-primary="testing" data-secondary="as barrier to atomic changes" data-type="indexterm" id="id-zgugsos9Sahy">&nbsp;</a>the change, the more difficult it is to actually test it appropriately. Google’s CI system will run not only the tests immediately impacted by a change, but also any tests that transitively depend on the changed files.<sup><a data-type="noteref" id="ch01fn225-marker" href="ch22.html#ch01fn225">6</a></sup> This means a change gets broad coverage, but we’ve also observed that the farther away in the dependency graph a test is from the impacted files, the less likely a failure is to have been caused by the change itself.</p>
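<p>To make the idea of transitively affected tests concrete, here is a minimal Python sketch that walks a reverse dependency graph breadth-first from the changed files and records how far away each affected test is. The graph representation (a dictionary from a target to the targets that depend on it), the <code>is_test</code> predicate, and the function name are assumptions for illustration, not a description of how Google's CI system is built.</p>

```python
from collections import deque


def affected_tests(changed_files, reverse_deps, is_test):
    """Return {test: distance} for every test reachable from the change.

    reverse_deps maps a target to the targets that depend on it directly.
    distance counts dependency edges to the nearest changed file.
    """
    distance = {f: 0 for f in changed_files}
    queue = deque(changed_files)
    while queue:
        node = queue.popleft()
        for dependent in reverse_deps.get(node, ()):
            if dependent not in distance:
                # First visit in a breadth-first walk is the shortest path.
                distance[dependent] = distance[node] + 1
                queue.append(dependent)
    return {n: d for n, d in distance.items() if is_test(n)}
```

<p>The recorded distance gives one simple way to triage failures: by the observation above, a failing test many edges away from the change is less likely to have been broken by it.</p>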
<p>Small, independent changes are easier to validate, because each of them affects a smaller set of tests, but also because test failures are easier to diagnose and fix. Finding the root cause of a test failure in a change of 25 files is pretty straightforward; finding 1 in a 10,000-file change is like the proverbial needle in a haystack.</p>
<p>The trade-off in this decision is that smaller changes will cause the same tests to be run multiple times, particularly tests that depend on large parts of the codebase. Because engineer time spent tracking down test failures is much more expensive than the compute time required to run these extra tests, we’ve made the conscious decision that this is a trade-off we’re willing to make. That same trade-off might not hold for all organizations, but it is worth examining what the proper balance is for yours.<a contenteditable="false" data-primary="atomic changes, barriers to" data-startref="ix_atom" data-type="indexterm" id="id-0OuJHbC5S9hO">&nbsp;</a><a contenteditable="false" data-primary="large-scale changes" data-secondary="barriers to atomic changes" data-startref="ix_LSCbarr" data-type="indexterm" id="id-9auAsqC4SxhQ">&nbsp;</a></p>
<aside data-type="sidebar" id="testing_lscs">
<h5>Case Study: Testing LSCs</h5>
<p class="byline">Adam Bender</p>
<p>Today it is common<a contenteditable="false" data-primary="testing" data-secondary="of large-scale changes" data-type="indexterm" id="ix_tstLSC">&nbsp;</a> for a double-digit percentage (10% to 20%) of <a contenteditable="false" data-primary="large-scale changes" data-secondary="testing" data-type="indexterm" id="ix_LSCtst">&nbsp;</a>the changes in a project to be the result of LSCs, meaning a substantial amount of code is changed in projects by people whose full-time job is unrelated to those projects. Without good tests, such work would be impossible, and Google’s codebase would quickly atrophy under its own weight. LSCs enable us to systematically migrate our entire codebase to newer APIs, deprecate older APIs, change language versions, and remove popular but dangerous practices.</p>
<p>Even a simple one-line signature change becomes complicated when made in a thousand different places across hundreds of different products and services.<sup><a data-type="noteref" id="ch01fn226-marker" href="ch22.html#ch01fn226">7</a></sup> After the change is written, you need to coordinate code reviews across dozens of teams. Lastly, after reviews are approved, you need to run as many tests as you can to be sure the change is safe.<sup><a data-type="noteref" id="ch01fn227-marker" href="ch22.html#ch01fn227">8</a></sup> We say “as many as you can,” because a good-sized LSC could trigger a rerun of every single test at Google, and that can take a while. In fact, many LSCs have to plan time to catch downstream clients whose code backslides while the LSC makes its way through the process.</p>
<p>Testing an LSC can be a slow and frustrating process. When a change is sufficiently large, your local environment is almost guaranteed to be permanently out of sync with head as the codebase shifts like sand around your work. In such circumstances, it is easy to find yourself running and rerunning tests just to ensure your changes continue to be valid. When a project has flaky tests or is missing unit test coverage, it can require a lot of manual intervention and slow down the entire process. To help speed things up, we use a strategy called the TAP (Test Automation Platform) train.</p>
<h3>Riding the TAP Train</h3>
<p>The core <a contenteditable="false" data-primary="large-scale changes" data-secondary="testing" data-tertiary="riding the TAP train" data-type="indexterm" id="id-6WuYHKSqcMSkhV">&nbsp;</a>insight to LSCs is that they rarely interact with one another, and most affected tests are going to pass for most LSCs.<a contenteditable="false" data-primary="Test Automation Platform (TAP)" data-secondary="train model and testing of LSCs" data-type="indexterm" id="id-nguQsWSwc8S2hY">&nbsp;</a> As a result, we can test more than one change at a time and reduce the total number of tests executed. The train model has proven to be very effective for testing LSCs.</p>
<p>The TAP train takes advantage of two facts:</p>
<ul>
<li>
<p>LSCs tend to be pure refactorings and therefore very narrow in scope, preserving local semantics.</p>
</li>
<li>
<p>Individual changes are often simpler and highly scrutinized, so they are correct more often than not.</p>
</li>
</ul>
<p>The train model also has the advantage that it works for multiple changes at the same time and doesn’t require that each individual change ride in isolation.<sup><a data-type="noteref" id="ch01fn228-marker" href="ch22.html#ch01fn228">9</a></sup></p>
<p>The train has five steps and is started fresh every three hours:</p>
<ol>
<li>
<p>For each change on the train, run a sample of 1,000 randomly selected tests.</p>
</li>
<li>
<p>Gather up all the changes that passed their 1,000 tests and create one uber-change from all of them: “the train.”</p>
</li>
<li>
<p>Run the union of all tests directly affected by the group of changes. Given a large enough (or low-level enough) LSC, this can mean running every single test in Google’s repository. This process can take more than six hours to complete.</p>
</li>
<li>
<p>For each nonflaky test that fails, rerun it individually against each change that made it into the train to determine which changes caused it to fail.</p>
</li>
<li>
<p>TAP generates a report for each change that boarded the train. The report describes all passing and failing targets and can be used as evidence that an LSC is safe to submit.</p>
</li>
</ol>
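<p>The five steps above can be sketched as a small simulation. Everything named here is hypothetical: <code>ride_train</code>, the callbacks <code>affected</code>, <code>run_test</code>, and <code>is_flaky</code>, and the shape of the report are stand-ins invented for this example, and the sample in step 1 is assumed to be drawn from all known tests. The report is also simplified to list only failing tests.</p>

```python
import random


def ride_train(changes, all_tests, affected, run_test, is_flaky,
               sample_size=1000, seed=0):
    """Simulate one train and return {change: report}.

    run_test(change_set, test) returns True when the test passes with the
    given frozenset of changes applied together.
    """
    rng = random.Random(seed)
    reports = {}

    # Step 1: each change must first pass a random sample of tests.
    boarded = []
    for change in changes:
        sample = rng.sample(all_tests, min(sample_size, len(all_tests)))
        if all(run_test(frozenset([change]), t) for t in sample):
            boarded.append(change)
        else:
            reports[change] = {"boarded": False, "failing": []}

    # Step 2: combine the survivors into one change, "the train".
    train = frozenset(boarded)

    # Step 3: run the union of tests affected by any change on the train.
    union = set()
    for change in boarded:
        union.update(affected(change))
    failed = [t for t in sorted(union) if not run_test(train, t)]

    # Step 4: rerun each nonflaky failure against every change in isolation.
    culprits = {change: [] for change in boarded}
    for test in failed:
        if is_flaky(test):
            continue
        for change in boarded:
            if not run_test(frozenset([change]), test):
                culprits[change].append(test)

    # Step 5: one report per change that boarded the train.
    for change in boarded:
        reports[change] = {"boarded": True, "failing": culprits[change]}
    return reports
```

<p>The saving comes from step 3: when most changes are correct, the expensive union of tests runs once for the whole train, and the per-change reruns in step 4 are limited to the tests that actually failed.</p>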
</aside>
</section>
<section data-type="sect2" id="code_review">
<h2>Code Review</h2>
<p>Finally, as we mentioned in <a data-type="xref" href="ch09.html#code_review-id00002">Code Review</a>, all changes need to be reviewed before submission, and this policy applies even for LSCs. <a contenteditable="false" data-primary="large-scale changes" data-secondary="testing" data-tertiary="code reviews" data-type="indexterm" id="id-0Ou2sMsdt9hO">&nbsp;</a><a contenteditable="false" data-primary="code reviews" data-secondary="for large-scale changes" data-type="indexterm" id="id-9auLfRsZtxhQ">&nbsp;</a>Reviewing large commits can be tedious, onerous, and even error prone, particularly if the changes are generated by hand (a process you want to avoid, as we’ll discuss shortly). In just a moment, we’ll look at how tooling can often help in this space, but for some classes of changes, we still want humans to explicitly verify they are correct. Breaking an LSC into separate shards makes this much easier.</p>
<aside data-type="sidebar" id="scoped_ptr_to_stdunique_ptr">
<h5>Case Study: scoped_ptr to std::unique_ptr</h5>
<p>Since its<a contenteditable="false" data-primary="C++" data-secondary="scoped_ptr to std::unique_ptr" data-type="indexterm" id="id-9aupHRs3f4tPhW">&nbsp;</a> earliest days, Google’s C++ codebase has had a self-destructing smart pointer for wrapping <a contenteditable="false" data-primary="scoped_ptr in C++" data-type="indexterm" id="id-YDujsds9f8tZhm">&nbsp;</a>heap-allocated C++ objects and ensuring that they are destroyed when the smart pointer goes out of scope.<a contenteditable="false" data-primary="large-scale changes" data-secondary="testing" data-tertiary="scoped_ptr to std::unique_ptr" data-type="indexterm" id="id-Knuafas2fwtMh4">&nbsp;</a> This type was called <code>scoped_ptr</code> and was used extensively throughout Google’s codebase to ensure that object lifetimes were appropriately managed. It wasn’t perfect, but given the limitations of the then-current C++ standard (C++98) when the type was first introduced, it made for safer programs.</p>
<p>In C++11, the language<a contenteditable="false" data-primary="std::unique_ptr in C++" data-type="indexterm" id="id-YDu2Hzf9f8tZhm">&nbsp;</a> introduced a new type: <code>std::unique_ptr</code>. It fulfilled the same function as <code>scoped_ptr</code>, but also prevented other classes of bugs that the language now could detect. <code>std::unique_ptr</code> was strictly better than <code>scoped_ptr</code>, yet Google’s codebase had more than 500,000 references to <code>scoped_ptr</code> scattered among millions of source files. Moving to the more modern type required the largest LSC attempted to that point within Google.</p>
<p>Over the course of several months, several engineers attacked the problem in parallel. Using Google’s large-scale migration infrastructure, we were able to change references to <code>scoped_ptr</code> into references to <code>std::unique_ptr</code> as well as slowly adapt <code>scoped_ptr</code> to behave more closely to <code>std::unique_ptr</code>. At the height of the migration process, we were consistently generating, testing, and committing more than 700 independent changes, touching more than 15,000 files <em>per day</em>. Today, we sometimes manage 10 times that throughput, having refined our practices and improved our tooling.</p>
<p>Like almost all LSCs, this one had a very long tail of tracking down various nuanced behavior dependencies (another manifestation of Hyrum’s Law), fighting race conditions with other engineers, and uses in generated code that weren’t detectable by our automated tooling. We continued to work on these manually as they were discovered by the testing infrastructure.</p>
<p><code>scoped_ptr</code> was also used as a parameter type in some widely used APIs, which made small independent changes difficult. We contemplated writing a call-graph analysis system that could change an API and its callers, transitively, in one commit, but were concerned that the resulting changes would themselves be too large to commit <span class="keep-together">atomically.</span></p>
<p>In the end, we were able to finally remove <code>scoped_ptr</code> by first making it a type alias of <code>std::unique_ptr</code> and then performing the textual substitution between the old alias and the new, before eventually just removing the old <code>scoped_ptr</code> alias. Today, Google’s codebase benefits from using the same standard type as the rest of the C++ ecosystem, which was possible only because of our technology and tooling for LSCs.<a contenteditable="false" data-primary="testing" data-secondary="of large-scale changes" data-startref="ix_tstLSC" data-type="indexterm" id="id-43u4CwSmfjtgh7">&nbsp;</a><a contenteditable="false" data-primary="large-scale changes" data-secondary="testing" data-startref="ix_LSCtst" data-type="indexterm" id="id-Z3ugc0SRfWtDh6">&nbsp;</a></p>
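<p>Once the old name is only an alias for the new type, the remaining step is a mechanical text rewrite. The Python sketch below shows one naive way such a substitution could look; it is not the tooling Google used. It assumes <code>scoped_ptr</code> appears unqualified in source text, and it would also rewrite matches inside comments and string literals, so the file that defines the alias itself would need to be excluded.</p>

```python
import re

# Match the bare identifier scoped_ptr, but not longer names that merely
# contain it (my_scoped_ptr, scoped_ptr_util) or namespace-qualified uses.
_SCOPED_PTR = re.compile(r"(?<![\w:])scoped_ptr(?!\w)")


def rewrite_scoped_ptr(source):
    """Return source text with bare scoped_ptr replaced by std::unique_ptr."""
    return _SCOPED_PTR.sub("std::unique_ptr", source)
```

<p>Because the alias makes both spellings name the same type, each rewritten file compiles on its own, which is what allows the substitution to be committed in small independent pieces.</p>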
</aside>
</section>
</section>
<section data-type="sect1" id="lsc_infrastructure">
<h1>LSC Infrastructure</h1>
<p>Google has invested in a <a contenteditable="false" data-primary="large-scale changes" data-secondary="infrastructure" data-type="indexterm" id="ix_LSCinfr">&nbsp;</a>significant amount of infrastructure to make LSCs possible. This infrastructure includes tooling for change creation, change management, change review, and testing. However, perhaps the most important support for LSCs has been the evolution of cultural norms around large-scale changes and the oversight given to them. Although the sets of technical and social tools might differ for your organization, the general principles should be the same.</p>
<section data-type="sect2" id="policies_and_culture">
<h2>Policies and Culture</h2>
<p>As we’ve described in <a data-type="xref" href="ch16.html#version_control_and_branch_management">Version Control and Branch Management</a>, Google stores the bulk<a contenteditable="false" data-primary="large-scale changes" data-secondary="infrastructure" data-tertiary="policies and culture" data-type="indexterm" id="id-Gku8sEsZfMUq">&nbsp;</a> of its source code in a single monolithic repository (monorepo), and every engineer has visibility into almost all of this code. This high degree of openness means that any engineer can edit any file and send those edits for review to those who can approve them. However, each of those edits has costs, both to generate and to review.<a contenteditable="false" data-primary="policies for large-scale changes" data-type="indexterm" id="id-Qbu7fEs3fwU7">&nbsp;</a><sup><a data-type="noteref" id="ch01fn229-marker" href="ch22.html#ch01fn229">10</a></sup></p>
<p>Historically, these costs have been somewhat symmetric, which limited the scope of changes a single engineer or team could generate. As Google’s LSC tooling improved, it became easier to generate a large number of changes very cheaply, and it became equally easy for a single engineer to impose a burden on a large number of reviewers across the company. Even though we want to encourage widespread improvements to our codebase, we want to make sure there is some oversight and thoughtfulness behind them, rather than indiscriminate tweaking.<sup><a data-type="noteref" id="ch01fn230-marker" href="ch22.html#ch01fn230">11</a></sup></p>
<p>The end result is a lightweight approval process for teams and individuals seeking to make LSCs across Google. This process is overseen by a group of experienced engineers who are familiar with the nuances of various languages, as well as invited domain experts for the particular change in question. The goal of this process is not to prohibit LSCs, but to help change authors produce the best possible changes, which make the best use of Google’s technical and human capital. Occasionally, this group might suggest that a cleanup just isn’t worth it: for example, cleaning up a common typo without any way of preventing recurrence.</p>
<p>Related to these policies was a shift in cultural norms surrounding LSCs.<a contenteditable="false" data-primary="culture" data-secondary="changes in norms surrounding LSCs" data-type="indexterm" id="id-AAuZHjcbfzUn">&nbsp;</a> Although it is important for code owners to have a sense of responsibility for their software, they also needed to learn that LSCs were an important part of Google’s effort to scale our software engineering practices. Just as product teams are the most familiar with their own software, library infrastructure teams know the nuances of the infrastructure, and getting product teams to trust that domain expertise is an important step toward social acceptance of LSCs. As a result of this culture shift, local product teams have grown to trust LSC authors to make changes relevant to those authors’ domains.</p>
<p>Occasionally, local owners question the purpose of a specific commit being made as part of a broader LSC, and change authors respond to these comments just as they would other review comments. Socially, it’s important that code owners understand the changes happening to their software, but they also have come to realize that they don’t hold a veto over the broader LSC. Over time, we’ve found that a good FAQ and a solid historic track record of improvements have generated widespread endorsement of LSCs throughout Google.</p>
</section>
<section data-type="sect2" id="codebase_insight">
<h2>Codebase Insight</h2>
<p>To do LSCs, we’ve found it invaluable<a contenteditable="false" data-primary="large-scale changes" data-secondary="infrastructure" data-tertiary="codebase insight" data-type="indexterm" id="id-Gku5HEs4CMUq">&nbsp;</a> to be able to do <a contenteditable="false" data-primary="codebase" data-secondary="analysis of, large-scale changes and" data-type="indexterm" id="id-QbubsEsMCwU7">&nbsp;</a>large-scale analysis of our codebase, both on a textual level using traditional tools and on a semantic level. <a contenteditable="false" data-primary="Kythe" data-type="indexterm" id="id-AAumfKsECzUn">&nbsp;</a>For example, Google’s use of the semantic indexing tool <a href="https://kythe.io">Kythe</a> provides a complete map of the links between parts of our codebase, allowing us to ask questions such as “Where are the callers of this function?” or “Which classes derive from this one?” Kythe and similar tools also provide programmatic access to their data so that they can be incorporated into refactoring tools. (For further examples, see Chapters <a data-type="xref" data-xrefstyle="select:labelnumber" href="ch17.html#code_search">Code Search</a> and <a data-type="xref" data-xrefstyle="select:labelnumber" href="ch20.html#static_analysis-id00082">Static Analysis</a>.)</p>
<p>We also use compiler-based indices to run abstract syntax tree-based analysis and transformations over our codebase. Tools such as <a href="https://oreil.ly/c6xvO">ClangMR</a>, JavacFlume, or <a href="https://oreil.ly/Er03J">Refaster</a>, which can perform transformations in a highly parallelizable way, depend on these insights as part of their function. For smaller changes, authors can use specialized, custom tools, <code>perl</code> or <code>sed</code>, regular expression matching, or even a simple shell script.</p>
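<p>For the simplest class of change, a regular expression is often enough. The following is a minimal sketch rather than any of the tools named above; the rename from <code>fetch_user</code> to <code>get_user</code> is a hypothetical example, and the word boundary keeps the pattern from touching longer names that merely contain the old one.</p>

```python
import re

# Hypothetical rename for illustration: fetch_user( becomes get_user(.
OLD_CALL = re.compile(r"\bfetch_user\(")


def rewrite_source(text):
    """Return (new_text, number_of_edits) for one file's contents."""
    return OLD_CALL.subn("get_user(", text)


before = "u = fetch_user(uid)\nv = prefetch_user(uid)\n"
after, edits = rewrite_source(before)
print(after, edits)
```

<p>Returning the edit count alongside the new text makes it easy to skip untouched files and to compare the total against the number of call sites an index reports.</p>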
<p>Whatever tool your organization uses for change creation, it’s important that its human effort scale sublinearly with the codebase; in other words, it should take roughly the same amount of human time to generate the collection of all required changes, no matter the size of the repository. The change creation tooling should also be comprehensive across the codebase, so that an author can be assured that their change covers all of the cases they’re trying to fix.</p>
<p>As with other areas in this book, an early investment in tooling usually pays off in the short to medium term. As a rule of thumb, we’ve long held that if a change requires more than 500 edits, it’s usually more efficient for an engineer to learn and use our change-generation tools than to make those edits manually. For experienced “code janitors,” that number is often much smaller.</p>
</section>
<section data-type="sect2" id="change_management">
<h2>Change Management</h2>
<p>Arguably the most <a contenteditable="false" data-primary="large-scale changes" data-secondary="infrastructure" data-tertiary="change management" data-type="indexterm" id="id-Qbu4HEswcwU7">&nbsp;</a>important piece of large-scale change<a contenteditable="false" data-primary="change management for large-scale changes" data-type="indexterm" id="id-AAu4sKsGczUn">&nbsp;</a> infrastructure is the set of tooling that shards a master change into smaller pieces and manages the process of testing, mailing, reviewing, and committing them independently. <a contenteditable="false" data-primary="Rosie tool" data-type="indexterm" id="id-zguofoskcKUy">&nbsp;</a>At Google, this tool is called Rosie, and we discuss its use more completely in a few moments when we examine our LSC process. In many respects, Rosie is not just a tool, but an entire platform for making LSCs at Google scale. It provides the ability to split the large sets of comprehensive changes produced by tooling into smaller shards, which can be tested, reviewed, and submitted independently.</p>
</section>
<section data-type="sect2" id="testing">
<h2>Testing</h2>
<p>Testing is another important piece<a contenteditable="false" data-primary="large-scale changes" data-secondary="infrastructure" data-tertiary="testing" data-type="indexterm" id="id-AAuZHKs3IzUn">&nbsp;</a> of large-scale-change–enabling infrastructure.<a contenteditable="false" data-primary="testing" data-secondary="in large-scale change infrastructure" data-type="indexterm" id="id-zgugsos7IKUy">&nbsp;</a> As discussed in <a data-type="xref" href="ch11.html#testing_overview">Testing Overview</a>, tests are one of the important ways that we validate our software will behave as expected. This is particularly important when applying changes that are not authored by humans. A robust testing culture and infrastructure means that other tooling can be confident that these changes don’t have unintended effects.</p>
<p>Google’s testing strategy for LSCs differs slightly from that of normal changes while still using the same underlying CI infrastructure. Testing LSCs means not just ensuring the large master change doesn’t cause failures, but that each shard can be submitted safely and independently. Because each shard can contain arbitrary files, we don’t use the standard project-based presubmit tests. Instead, we run each shard over the transitive closure of every test it might affect, which we discussed earlier.</p>
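<p>One way to picture this test selection is as a walk over the reverse dependency graph. The sketch below assumes a hypothetical in-memory map from each target to its direct dependents; a real build system would supply this data, and the target names are invented for illustration.</p>

```python
from collections import deque


def affected_tests(changed_files, reverse_deps, tests):
    """Return every test that transitively depends on any changed file.

    reverse_deps maps a target to the targets that directly depend on it.
    """
    seen = set(changed_files)
    queue = deque(changed_files)
    while queue:
        node = queue.popleft()
        for dependent in reverse_deps.get(node, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen.intersection(tests)


# Hypothetical build graph, not real targets.
REVERSE_DEPS = {
    "base/strings.h": ["net/url.cc", "base/strings_test"],
    "net/url.cc": ["net/url_test", "app/server.cc"],
    "app/server.cc": ["app/server_test"],
}
TESTS = {"base/strings_test", "net/url_test", "app/server_test", "ui/button_test"}

print(sorted(affected_tests(["net/url.cc"], REVERSE_DEPS, TESTS)))
```

<p>A change to a widely used header reaches far more tests than a change to a leaf file, which is why the cost of a shard depends on where it sits in the graph rather than on how many lines it edits.</p>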
</section>
<section data-type="sect2" id="language_support">
<h2>Language Support</h2>
<p>LSCs at Google are typically done on a per-language basis, and some languages support them much more easily than others.<a contenteditable="false" data-primary="programming languages" data-secondary="support for large-scale changes" data-type="indexterm" id="id-zguVHos9SKUy">&nbsp;</a><a contenteditable="false" data-primary="large-scale changes" data-secondary="infrastructure" data-tertiary="language support" data-type="indexterm" id="id-0Ou2sMs5S4UO">&nbsp;</a> We’ve found that language features such as type aliasing and forwarding functions are invaluable for allowing existing users to continue to function while we introduce new systems and migrate users to them non-atomically. For languages that lack these features, it is often difficult to migrate systems incrementally.<sup><a data-type="noteref" id="ch01fn231-marker" href="ch22.html#ch01fn231">12</a></sup></p>
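<p>To make the idea concrete, here is a small sketch of both features in Python. All of the names (<code>UniqueHandle</code>, <code>ScopedHandle</code>, <code>open_handle</code>, <code>make_handle</code>) are hypothetical; the point is only that old call sites keep working while they are migrated one shard at a time.</p>

```python
import warnings


class UniqueHandle:
    """Stand-in for the new type that callers should migrate to."""

    def __init__(self, value):
        self.value = value


# Type alias: code that still names ScopedHandle keeps working unchanged.
ScopedHandle = UniqueHandle


def open_handle(value):
    """New API."""
    return UniqueHandle(value)


def make_handle(value):
    """Old API, kept as a forwarding function during the migration."""
    warnings.warn(
        "make_handle is deprecated; use open_handle",
        DeprecationWarning,
        stacklevel=2,
    )
    return open_handle(value)
```

<p>After every caller has moved to the new names, the alias and the forwarding function can be deleted in a final, small change.</p>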
<p>We’ve also found that statically typed languages are much easier to perform large automated changes in than dynamically typed languages. Compiler-based tools along with strong static analysis provide a significant amount of information that we can use to build tools to effect LSCs and reject invalid transformations before they even get to the testing phase. The unfortunate result of this is that languages like Python, Ruby, and JavaScript that are dynamically typed are extra difficult for maintainers. Language choice is, in many respects, intimately tied to the question of code lifespan: languages that tend to be viewed as more focused on developer productivity tend to be more difficult to maintain. Although this isn’t an intrinsic design requirement, it is where the current state of the art happens to be.</p>
<p>Finally, it’s worth pointing out that automatic language formatters are a crucial part of the LSC infrastructure. Because we work toward optimizing our code for readability, we want to make sure that any changes produced by automated tooling are intelligible to both immediate reviewers and future readers of the code. All of the LSC-generation tools run the automated formatter appropriate to the language being changed as a separate pass so that the change-specific tooling does not need to <span class="keep-together">concern</span> itself with formatting specifics. Applying automated formatting, such as <a href="https://github.com/google/google-java-format">google-java-format</a> or <a href="https://clang.llvm.org/docs/ClangFormat.html">clang-format</a>, to our codebase means that automatically produced changes will “fit in” with code written by a human, reducing future development friction. Without automated formatting, large-scale automated changes would never have become the accepted status quo at Google.</p>
</section>
<aside data-type="sidebar" id="operation_rosehub">
<h5>Case Study: Operation RoseHub</h5>
<p>LSCs have become a large part of Google’s internal<a contenteditable="false" data-primary="large-scale changes" data-secondary="infrastructure" data-tertiary="Operation RoseHub" data-type="indexterm" id="id-0OuJHMsdt4UO">&nbsp;</a> culture, but they are starting to have implications in the broader world. <a contenteditable="false" data-primary="Operation RoseHub" data-type="indexterm" id="id-9auAsRsZt7UQ">&nbsp;</a>Perhaps the best known case so far was “<a href="https://oreil.ly/txtDj">Operation RoseHub</a>.”</p>
<p>In early 2017, a vulnerability in the Apache Commons library allowed any Java application with a vulnerable version of the library in its transitive classpath to become susceptible to remote code execution. This bug became known as the Mad Gadget. Among other things, it allowed an avaricious hacker to encrypt the San Francisco Municipal Transportation Agency’s systems and shut down its operations. Because the only requirement for the vulnerability was having the wrong library somewhere in its classpath, anything that depended on even one of many open source projects on GitHub was vulnerable.</p>
<p>To solve this problem, some enterprising Googlers launched their own version of the LSC process. By using tools such as <a href="https://cloud.google.com/bigquery">BigQuery</a>, volunteers identified affected projects and sent more than 2,600 patches to upgrade their versions of the Commons library to one that addressed Mad Gadget. Instead of automated tools managing the process, more than 50 humans made this LSC work.<a contenteditable="false" data-primary="large-scale changes" data-secondary="infrastructure" data-startref="ix_LSCinfr" data-type="indexterm" id="id-KnuvsmCDtKUj">&nbsp;</a></p>
</aside>
</section>
<section data-type="sect1" id="the_lsc_process">
<h1>The LSC Process</h1>
<p>With these pieces of infrastructure in place, we can now talk about the process for actually making an LSC. <a contenteditable="false" data-primary="large-scale changes" data-secondary="process" data-type="indexterm" id="ix_LSCproc">&nbsp;</a>This roughly breaks down into four phases (with very nebulous boundaries between them):</p>
<ol>
<li>
<p>Authorization</p>
</li>
<li>
<p>Change creation</p>
</li>
<li>
<p>Shard management</p>
</li>
<li>
<p>Cleanup</p>
</li>
</ol>
<p>Typically, these steps happen after a new system, class, or function has been written, but it’s important to keep them in mind during the design of the new system. At Google, we aim to design successor systems with a migration path from older systems in mind, so that system maintainers can move their users to the new system <span class="keep-together">automatically.</span></p>
<section data-type="sect2" id="authorization">
<h2>Authorization</h2>
<p>We ask potential authors to fill out a brief document explaining the reason for a proposed change, its estimated impact across<a contenteditable="false" data-primary="large-scale changes" data-secondary="process" data-tertiary="authorization" data-type="indexterm" id="id-AAuZHKsGcNun">&nbsp;</a> the codebase (i.e., how many smaller shards the large change would generate), and answers to any questions potential reviewers might have.<a contenteditable="false" data-primary="authorization for large-scale changes" data-type="indexterm" id="id-zgugsoskcNuy">&nbsp;</a> This process also forces authors to think about how they will describe the change to an engineer unfamiliar with it in the form of an FAQ and proposed change description. Authors also get “domain review” from the owners of the API being refactored.</p>
<p>This proposal is then forwarded to an email list with about a dozen people who have oversight over the entire process. After discussion, the committee gives feedback on how to move forward. For example, one of the most common changes made by the committee is to direct all of the code reviews for an LSC to go to a single “global approver.” Many first-time LSC authors tend to assume that local project owners should review everything, but for most mechanical LSCs, it’s cheaper to have a single expert understand the nature of the change and build automation around reviewing it properly.</p>
<p>After the change is approved, the author can move forward in getting their change submitted. Historically, the committee has been very liberal with their approval,<sup><a data-type="noteref" id="ch01fn232-marker" href="ch22.html#ch01fn232">13</a></sup> and often gives approval not just for a specific change, but also for a broad set of related changes. Committee members can, at their discretion, fast-track obvious changes without the need for full deliberation.</p>
<p>The intent of this process is to provide oversight and an escalation path, without being too onerous for the LSC authors. The committee is also empowered as the escalation body for concerns or conflicts about an LSC: local owners who disagree with the change can appeal to this group who can then arbitrate any conflicts. In practice, this has rarely been needed.</p>
</section>
<section data-type="sect2" id="change_creation">
<h2>Change Creation</h2>
<p>After getting the required approval, an LSC author will begin to produce the actual code edits. <a contenteditable="false" data-primary="changes to code" data-secondary="change creation in LSC process" data-type="indexterm" id="id-zguVHos7INuy">&nbsp;</a><a contenteditable="false" data-primary="large-scale changes" data-secondary="process" data-tertiary="change creation" data-type="indexterm" id="id-0Ou2sMsGIEuO">&nbsp;</a>Sometimes, these can be generated comprehensively into a single large global change that will be subsequently sharded into many smaller independent pieces. Usually, the size of the change is too large to fit in a single global change, due to technical limitations of the underlying version control system.</p>
<p>The change generation process should be as automated as possible so that the parent change can be updated as users backslide into old uses<sup><a data-type="noteref" id="ch01fn233-marker" href="ch22.html#ch01fn233">14</a></sup> or textual merge conflicts occur in the changed code. Occasionally, for the rare case in which technical tools aren’t able to generate the global change, we have sharded change generation across humans (see <a data-type="xref" href="ch22.html#operation_rosehub">Case Study: Operation RoseHub</a>). Although much more labor intensive than automatically generating changes, this allows global changes to happen much more quickly for time-sensitive applications.</p>
<p>Keep in mind that we optimize for human readability of our codebase, so whatever tool generates changes, we want the resulting changes to look as much like human-generated changes as possible. This requirement leads to the necessity of style guides and automatic formatting tools (see <a data-type="xref" href="ch08.html#style_guides_and_rules">Style Guides and Rules</a>).<sup><a data-type="noteref" id="ch01fn234-marker" href="ch22.html#ch01fn234">15</a></sup></p>
</section>
<section data-type="sect2" id="sharding_and_submitting">
<h2>Sharding and Submitting</h2>
<p>After a global change has <a contenteditable="false" data-primary="sharding and submitting in LSC process" data-type="indexterm" id="ix_shrd">&nbsp;</a>been generated, the author then starts <a contenteditable="false" data-primary="large-scale changes" data-secondary="process" data-tertiary="sharding and submitting" data-type="indexterm" id="ix_LSCprocsh">&nbsp;</a>running <a contenteditable="false" data-primary="Rosie tool" data-secondary="sharding and submitting in LSC process" data-type="indexterm" id="ix_Rosie">&nbsp;</a>Rosie. Rosie takes a large change and shards it based upon project boundaries and ownership rules into changes that <em>can</em> be submitted atomically. It then puts each individually sharded change through an independent test-mail-submit pipeline. Rosie can be a heavy user of other pieces of Google’s developer infrastructure, so it caps the number of outstanding shards for any given LSC, runs at lower priority, and communicates with the rest of the infrastructure about how much load it is acceptable to generate on our shared testing infrastructure.</p>
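<p>A much simplified sketch of sharding along ownership boundaries follows. It is not how Rosie is implemented; it assumes ownership is declared per directory, and the <code>max_files</code> cap and the example paths are hypothetical.</p>

```python
from collections import defaultdict
import posixpath


def nearest_owned_dir(path, owned_dirs):
    """Walk up from the file's directory to the closest directory with owners."""
    directory = posixpath.dirname(path)
    while directory and directory not in owned_dirs:
        directory = posixpath.dirname(directory)
    return directory  # "" stands for the repository root


def shard_change(changed_files, owned_dirs, max_files=2):
    """Group files by owning directory, then cap the size of each shard."""
    by_owner = defaultdict(list)
    for path in sorted(changed_files):
        by_owner[nearest_owned_dir(path, owned_dirs)].append(path)
    shards = []
    for owner_dir in sorted(by_owner):
        files = by_owner[owner_dir]
        for start in range(0, len(files), max_files):
            shards.append(files[start:start + max_files])
    return shards


FILES = ["net/http/a.cc", "net/http/b.cc", "net/http/c.cc", "ui/x.cc", "README"]
print(shard_change(FILES, owned_dirs={"net", "ui"}))
```

<p>Keeping each shard inside one ownership boundary means a single reviewer can approve it, and capping the size keeps any one rollback or merge conflict cheap.</p>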
<p>We talk more about the specific test-mail-submit process for each shard below.</p>
<aside data-type="sidebar" id="cattle_versus_pets">
<h5>Cattle Versus Pets</h5>
<p>We often use the “cattle and pets” analogy when referring to individual machines in a distributed computing environment, but the same principles can apply to changes within a codebase.<a contenteditable="false" data-primary="cattle versus pets analogy" data-secondary="applying to changes in a codebase" data-type="indexterm" id="id-KnuWHaspCbSGu4">&nbsp;</a></p>
<p>At Google, as at most organizations, typical changes to the codebase are handcrafted by individual engineers working on specific features or bug fixes. Engineers might spend days or weeks working through the creation, testing, and review of a single change. They come to know the change intimately, and are proud when it is finally committed to the main repository. The creation of such a change is akin to owning and raising a favorite pet.</p>
<p>In contrast, effective handling of LSCs requires a high degree of automation and produces an enormous number of individual changes. In this environment, we’ve found it useful to treat specific changes as cattle: nameless and faceless commits that might be rolled back or otherwise rejected at any given time with little cost unless the entire herd is affected. Often this happens because of an unforeseen problem not caught by tests, or even something as simple as a merge conflict.</p>
<p>With a “pet” commit, it can be difficult to not take rejection personally, but when working with many changes as part of a large-scale change, it’s just the nature of the job. Having automation means that tooling can be updated and new changes generated at very low cost, so losing a few cattle now and then isn’t a problem.</p>
</aside>
<section data-type="sect3" id="testing-id00098">
<h3>Testing</h3>
<p>Each independent shard is tested by running it through TAP, Google’s CI framework. <a contenteditable="false" data-primary="Test Automation Platform (TAP)" data-secondary="testing LSC shards" data-type="indexterm" id="id-kgu2HzsqcVS8uz">&nbsp;</a>We run every test that depends on the files in a given change transitively, which often creates high load on our CI system.</p>
<p>This might sound computationally expensive, but in practice, the vast majority of shards affect fewer than one thousand tests, out of the millions across our codebase. For those that affect more, we can group them together: first running the union of all affected tests for all shards, and then for each individual shard running just the intersection of its affected tests with those that failed the first run. Most of these unions cause almost every test in the codebase to be run, so adding additional changes to that batch of shards is nearly free.</p>
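<p>The two-pass grouping can be sketched as follows. This is an illustration under stated assumptions, not TAP’s interface: <code>run_tests</code> is a hypothetical callable that runs a set of tests with all shards in the batch applied and returns the subset that failed.</p>

```python
def batch_test_shards(shard_tests, run_tests):
    """Run the union of affected tests once, then pick per-shard reruns.

    shard_tests maps a shard id to the set of tests that shard affects.
    Returns, for each suspect shard, the tests to rerun against it alone.
    """
    union = set().union(*shard_tests.values())
    failed = run_tests(union)
    reruns = {}
    for shard, tests in shard_tests.items():
        overlap = tests.intersection(failed)
        # A shard that touches no failed test needs no individual rerun.
        if overlap:
            reruns[shard] = overlap
    return reruns
```

<p>When the first pass is clean, every shard in the batch is cleared at the cost of a single run; only shards that overlap a failure pay for a second, much smaller run.</p>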
<p>One of the drawbacks of running such a large number of tests is that independent low-probability events are almost certainties at large enough scale. Flaky and brittle tests, such as those discussed in <a data-type="xref" href="ch11.html#testing_overview">Testing Overview</a>, which often don’t harm the teams that write and maintain them, are particularly difficult for LSC authors. Although fairly low impact for individual teams, flaky tests can seriously affect the throughput of an LSC system. Automatic flake detection and elimination systems help with this issue, but it can be a constant effort to ensure that teams that write flaky tests are the ones that bear their costs.</p>
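<p>The arithmetic behind “almost certainties” is short enough to write down. Assuming, purely for illustration, that each test flakes independently with the same probability, the chance that at least one test in a run flakes is one minus the chance that none do. The 0.1 percent rate below is a made-up figure, not a measurement.</p>

```python
def prob_any_flake(per_test_flake_rate, num_tests):
    """Chance that at least one of num_tests independent tests flakes."""
    return 1 - (1 - per_test_flake_rate) ** num_tests


for n in (100, 1000, 10000):
    print(n, round(prob_any_flake(0.001, n), 4))
```

<p>With these assumed numbers, a shard that triggers a thousand tests sees a spurious failure in roughly 63 percent of runs, and at ten thousand tests a clean run becomes the exception.</p>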
<p>In our experience with LSCs as semantic-preserving, machine-generated changes, we are now much more confident in the correctness of a single change than a test with any recent history of flakiness, so much so that recently flaky tests are now ignored when submitting via our automated tooling. In theory, this means that a single shard can cause a regression that is detected only by a flaky test going from flaky to failing. In practice, we see this so rarely that it’s easier to deal with it via human communication rather than automation.</p>
<p>For any LSC process, individual shards should be committable independently. This means that they don’t have any interdependence or that the sharding mechanism can group dependent changes (such as to a header file and its implementation) together. Just like any other change, large-scale change shards must also pass project-specific checks before being reviewed and committed.</p>
</section>
<section data-type="sect3" id="mailing_reviewers">
<h3>Mailing reviewers</h3>
<p>After Rosie has validated that a change is safe through testing, it mails the change to an appropriate reviewer. In a company as large as Google, with thousands of engineers, reviewer discovery itself is a challenging problem. Recall from <a data-type="xref" href="ch09.html#code_review-id00002">Code Review</a> that code in the repository is organized with OWNERS files, which list users with approval privileges for a specific subtree in the repository. Rosie uses an owners detection service that understands these OWNERS files and weights each owner based upon their expected ability to review the specific shard in question. If a particular owner proves to be unresponsive, Rosie adds additional reviewers automatically in an effort to get a change reviewed in a timely manner.</p>
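<p>The weighting and escalation described here might look something like the sketch below. The signals used (recent reviews in the area and current review queue length) are assumptions chosen for illustration; the text does not say which signals the owners detection service actually uses.</p>

```python
def rank_reviewers(owners, recent_reviews, pending_reviews):
    """Order owners: most recent reviews in this area first, lighter queue next."""

    def score(owner):
        return (-recent_reviews.get(owner, 0), pending_reviews.get(owner, 0), owner)

    return sorted(owners, key=score)


def next_reviewer(ranked, unresponsive):
    """Return the best-ranked owner who has not already timed out, if any."""
    for owner in ranked:
        if owner not in unresponsive:
            return owner
    return None


# Hypothetical owners and counts.
RANKED = rank_reviewers(
    ["ana", "bo", "cy"],
    recent_reviews={"ana": 5, "bo": 5, "cy": 1},
    pending_reviews={"ana": 9, "bo": 2},
)
print(RANKED, next_reviewer(RANKED, unresponsive={"bo"}))
```

<p>Separating ranking from escalation keeps the policy easy to adjust: the ranking can change without touching the logic that adds the next reviewer when the first one does not respond.</p>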
<p>As part of the mailing process, Rosie also runs the per-project precommit tools, which might perform additional checks. For LSCs, we selectively disable certain checks such as those for nonstandard change description formatting. Although useful for individual changes on specific projects, such checks are a source of heterogeneity across the codebase and can add significant friction to the LSC process. This heterogeneity is a barrier to scaling our processes and systems, and LSC tools and authors can’t be expected to understand special policies for each team.</p>
<p>We also aggressively ignore presubmit check failures that preexist the change in question. When working on an individual project, it’s easy for an engineer to fix those and continue with their original work, but that technique doesn’t scale when making LSCs across Google’s codebase. Local code owners are responsible for having no preexisting failures in their codebase as part of the social contract between them and infrastructure teams.</p>
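<p>Ignoring preexisting failures amounts to a set difference between two presubmit runs: one at the base revision and one with the shard applied. The sketch below assumes each run yields a collection of failing check identifiers; the identifiers are hypothetical.</p>

```python
def new_failures(failures_at_base, failures_with_change):
    """Presubmit failures introduced by the change, ignoring preexisting ones."""
    return sorted(set(failures_with_change) - set(failures_at_base))


base = ["lint:ui/button.cc", "format:net/url.cc"]
changed = ["lint:ui/button.cc", "format:net/url.cc", "lint:net/url.cc"]
print(new_failures(base, changed))
```

<p>Only the remaining entries block the shard; everything that was already failing stays the responsibility of the local code owners.</p>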
</section>
<section data-type="sect3" id="reviewing">
<h3>Reviewing</h3>
<p>As with other changes, changes generated by Rosie are expected to go through the standard code review process.<a contenteditable="false" data-primary="code reviews" data-secondary="for large-scale changes" data-type="indexterm" id="id-ygu1H6s6SDSAun">&nbsp;</a> In practice, we’ve found that local owners don’t often treat LSCs with the same rigor as regular changes: they trust the engineers generating LSCs too much. Ideally, these changes would be reviewed like any other, but local project owners have come to trust infrastructure teams to the point where these changes often receive only a cursory review. As a result, we now send changes to local owners only when their review is required for context, not just for approval permissions. All other changes can go to a “global approver”: someone who has ownership rights to approve <em>any</em> change throughout the repository.</p>
<p>When using a global approver, all of the individual shards are assigned to that person, rather than to individual owners of different projects. Global approvers generally have specific knowledge of the language and/or libraries they are reviewing and work with the large-scale change author to know what kinds of changes to expect. They know what the details of the change are and what potential failure modes for it might exist and can customize their workflow accordingly.</p>
<p>Instead of reviewing each change individually, global reviewers use a separate set of pattern-based tooling to review each of the changes and automatically approve ones that meet their expectations. Thus, they need to manually examine only a small subset that are anomalous because of merge conflicts or tooling malfunctions, which allows the process to scale very well.</p>
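<p>As a rough illustration of pattern-based review, the sketch below approves a shard only when every edited line is exactly the old line with the expected substitution applied, and routes everything else to a human. The shard representation as pairs of old and new lines and the <code>fetch_user</code> rename are assumptions made for the example, not a description of the tooling global approvers actually use.</p>

```python
import re

# Hypothetical expected edit for this LSC: fetch_user( becomes get_user(.
OLD = re.compile(r"\bfetch_user\(")


def is_expected_edit(old_line, new_line):
    """True when new_line is old_line with only the expected substitution."""
    rewritten, count = OLD.subn("get_user(", old_line)
    return count > 0 and rewritten == new_line


def triage(shards):
    """Split shards into auto-approvable ones and ones needing a human look.

    shards maps a shard id to a list of (old_line, new_line) pairs.
    """
    approved, manual = [], []
    for shard_id, pairs in sorted(shards.items()):
        if pairs and all(is_expected_edit(old, new) for old, new in pairs):
            approved.append(shard_id)
        else:
            manual.append(shard_id)
    return approved, manual
```

<p>Treating an empty or unexpected shard as anomalous is deliberate: when in doubt, the sketch prefers to spend a reviewer’s time rather than approve something the pattern cannot explain.</p>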
</section>
<section data-type="sect3" id="submitting">
<h3>Submitting</h3>
<p>Finally, individual changes are committed. As with the mailing step, we ensure that the change passes the various project precommit checks before it is actually committed to the repository.</p>
<p>With Rosie, we are able to effectively create, test, review, and submit thousands of changes per day across all of Google’s codebase and have given teams the ability to effectively migrate their users. Technical decisions that used to be final, such as the name of a widely used symbol or the location of a popular class within a codebase, no longer need to be final.<a contenteditable="false" data-primary="Rosie tool" data-secondary="sharding and submitting in LSC process" data-startref="ix_Rosie" data-type="indexterm" id="id-nguaHwfxt8SvuY">&nbsp;</a><a contenteditable="false" data-primary="sharding and submitting in LSC process" data-startref="ix_shrd" data-type="indexterm" id="id-43uWsQfktKSMu7">&nbsp;</a><a contenteditable="false" data-primary="large-scale changes" data-secondary="process" data-startref="ix_LSCprocsh" data-tertiary="sharding and submitting" data-type="indexterm" id="id-Z3uAf2f6tRSqu6">&nbsp;</a></p>
</section>
</section>
<section data-type="sect2" id="cleanup">
<h2>Cleanup</h2>
<p>Different LSCs have different definitions of “done,” which can<a contenteditable="false" data-primary="large-scale changes" data-secondary="process" data-tertiary="cleanup" data-type="indexterm" id="id-9aupHRsZtDuQ">&nbsp;</a> vary<a contenteditable="false" data-primary="cleanup in LSC process" data-type="indexterm" id="id-YDujsdspt2uP">&nbsp;</a> from completely removing an old system to migrating only high-value references and leaving old ones to organically disappear.<sup><a data-type="noteref" id="ch01fn235-marker" href="ch22.html#ch01fn235">16</a></sup> In almost all cases, it’s important to have a system that prevents additional introductions of the symbol or system that the large-scale change worked hard to remove. At Google, we use the Tricorder framework mentioned in Chapters <a data-type="xref" data-xrefstyle="select:labelnumber" href="ch19.html#critique_googleapostrophes_code_review">Critique: Google’s Code Review Tool</a> and <a data-type="xref" data-xrefstyle="select:labelnumber" href="ch20.html#static_analysis-id00082">Static Analysis</a> to flag at review time when an engineer introduces a new use of a deprecated object, and this <a contenteditable="false" data-primary="deprecation" data-secondary="preventing new uses of deprecated object" data-type="indexterm" id="id-ygu6I6sWtauj">&nbsp;</a>has proven an effective method to prevent backsliding. We talk more about<a contenteditable="false" data-primary="large-scale changes" data-secondary="process" data-startref="ix_LSCproc" data-type="indexterm" id="id-6WuoS1sbtNuO">&nbsp;</a> the entire deprecation process in <a data-type="xref" href="ch15.html#deprecation">Deprecation</a>.</p>
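<p>A review-time check of this kind can be approximated by scanning only the lines a change adds. The sketch below is not Tricorder’s API; it works on plain unified diff text, and the deprecated symbol table is a hypothetical example.</p>

```python
import re

# Hypothetical mapping of deprecated symbol to its replacement.
DEPRECATED = {"fetch_user": "get_user"}


def new_deprecated_uses(unified_diff):
    """Return (diff_line_number, symbol, replacement) for each added use."""
    findings = []
    for number, line in enumerate(unified_diff.splitlines(), start=1):
        # Only added lines count; "+++" marks the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for symbol, replacement in sorted(DEPRECATED.items()):
            if re.search(r"\b" + re.escape(symbol) + r"\(", line):
                findings.append((number, symbol, replacement))
    return findings


DIFF = (
    "--- a/app.py\n"
    "+++ b/app.py\n"
    "@@ -1,2 +1,3 @@\n"
    " x = 1\n"
    "-u = fetch_user(1)\n"
    "+u = get_user(1)\n"
    "+v = fetch_user(2)\n"
)
print(new_deprecated_uses(DIFF))
```

<p>Looking only at added lines means existing uses that have not yet been migrated do not produce noise, while any new use is flagged before it is committed.</p>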
</section>
</section>
<section data-type="sect1" id="conclusion-id00026">
<h1>Conclusion</h1>
<p>LSCs form an important part of Google’s software engineering ecosystem. At design time, they open up more possibilities, because engineers know that some design decisions don’t need to be as fixed as they once were. The LSC process also gives maintainers of core infrastructure the ability to migrate large swaths of Google’s codebase from old systems, language versions, and library idioms to new ones, keeping the codebase consistent, spatially and temporally. And all of this happens with only a few dozen engineers supporting tens of thousands of others.</p>
<p>No matter the size of your organization, it’s reasonable to think about how you would make these kinds of sweeping changes across your collection of source code. Whether by choice or by necessity, having this ability will allow greater flexibility as your organization scales while keeping your source code malleable over time.</p>
</section>
<section data-type="sect1" id="tlsemicolondrs-id00128">
<h1>TL;DRs</h1>
<ul>
<li>
<p>An LSC process makes it possible to rethink the immutability of certain technical decisions.</p>
</li>
<li>
<p>Traditional models of refactoring break at large scales.</p>
</li>
<li>
<p>Making LSCs means making a<a contenteditable="false" data-primary="large-scale changes" data-startref="ix_LSC" data-type="indexterm" id="id-AAuZH0Hbf6sNFL">&nbsp;</a> habit of making LSCs.</p>
</li>
</ul>
</section>
<div data-type="footnotes"><p data-type="footnote" id="ch01fn220"><sup><a href="ch22.html#ch01fn220-marker">1</a></sup>For some ideas about why, see <a data-type="xref" href="ch16.html#version_control_and_branch_management">Version Control and Branch Management</a>.</p><p data-type="footnote" id="ch01fn221"><sup><a href="ch22.html#ch01fn221-marker">2</a></sup>It’s possible in this federated world to say “we’ll just commit to each repo as fast as possible to keep the duration of the build break small!” But that approach really doesn’t scale as the number of federated repositories grows.</p><p data-type="footnote" id="ch01fn222"><sup><a href="ch22.html#ch01fn222-marker">3</a></sup>For a further discussion about this practice, see <a data-type="xref" href="ch15.html#deprecation">Deprecation</a>.</p><p data-type="footnote" id="ch01fn223"><sup><a href="ch22.html#ch01fn223-marker">4</a></sup>By “unfunded mandate,” we mean “additional requirements imposed by an external entity without balancing compensation.” Sort of like when the CEO says that everybody must wear an evening gown for “formal Fridays” but doesn’t give you a corresponding raise to pay for your formal wear.</p><p data-type="footnote" id="ch01fn224"><sup><a href="ch22.html#ch01fn224-marker">5</a></sup>See <a href="https://ieeexplore.ieee.org/abstract/document/8443579"><em class="hyperlink">https://ieeexplore.ieee.org/abstract/document/8443579</em></a>.</p><p data-type="footnote" id="ch01fn225"><sup><a href="ch22.html#ch01fn225-marker">6</a></sup>This probably sounds like overkill, and it likely is.
We’re doing active research on the best way to determine the “right” set of tests for a given change, balancing the cost of compute time to run the tests against the human cost of making the wrong choice.</p><p data-type="footnote" id="ch01fn226"><sup><a href="ch22.html#ch01fn226-marker">7</a></sup>The largest series of LSCs ever executed removed more than one billion lines of code from the repository over the course of three days. This was largely to remove an obsolete part of the repository that had been migrated to a new home; but still, how confident do you have to be to delete one billion lines of code?</p><p data-type="footnote" id="ch01fn227"><sup><a href="ch22.html#ch01fn227-marker">8</a></sup>LSCs are usually supported by tools that make finding, making, and reviewing changes relatively <span class="keep-together">straightforward.</span></p><p data-type="footnote" id="ch01fn228"><sup><a href="ch22.html#ch01fn228-marker">9</a></sup>It is possible to ask TAP for a single-change “isolated” run, but these are very expensive and are performed only during off-peak hours.</p><p data-type="footnote" id="ch01fn229"><sup><a href="ch22.html#ch01fn229-marker">10</a></sup>There are obvious technical costs here in terms of compute and storage, but the human costs in time to review a change far outweigh the technical ones.</p><p data-type="footnote" id="ch01fn230"><sup><a href="ch22.html#ch01fn230-marker">11</a></sup>For example, we do not want the resulting tools to be used as a mechanism to fight over the proper spelling of “gray” or “grey” in comments.</p><p data-type="footnote" id="ch01fn231"><sup><a href="ch22.html#ch01fn231-marker">12</a></sup>In fact, Go recently introduced these kinds of language features specifically to support large-scale refactorings (see <a href="https://talks.golang.org/2016/refactor.article"><em class="hyperlink">https://talks.golang.org/2016/refactor.article</em></a>).</p><p data-type="footnote" id="ch01fn232"><sup><a 
href="ch22.html#ch01fn232-marker">13</a></sup>The only kinds of changes that the committee has outright rejected have been those that are deemed dangerous, such as converting all <code>NULL</code> instances to <code>nullptr</code>, or extremely low-value, such as changing spelling from British English to American English, or vice versa. As our experience with such changes has increased and the cost of LSCs has dropped, the threshold for approval has as well.</p><p data-type="footnote" id="ch01fn233"><sup><a href="ch22.html#ch01fn233-marker">14</a></sup>This happens for many reasons: copy-and-paste from existing examples, committing changes that have been in development for some time, or simply reliance on old habits.</p><p data-type="footnote" id="ch01fn234"><sup><a href="ch22.html#ch01fn234-marker">15</a></sup>In actuality, this is the reasoning behind the original work on clang-format for C++.</p><p data-type="footnote" id="ch01fn235"><sup><a href="ch22.html#ch01fn235-marker">16</a></sup>Sadly, the systems we most want to organically decompose are those that are the most resilient to doing so. They are the plastic six-pack rings of the code ecosystem.</p></div></section>
</body>
</html>