diff options
author | Nathan Fritz <fritzy@github.com> | 2021-12-16 21:01:56 +0300 |
---|---|---|
committer | Nathan Fritz <fritzy@github.com> | 2021-12-16 21:05:19 +0300 |
commit | d7265045730555c03b3142c004c7438e9577028c (patch) | |
tree | 035d81b3124bdaa09c21854934bf2b2b50e52e44 /workspaces/arborist/docs | |
parent | d8aac8448e983692cacb427e03f4688cd1b62e30 (diff) |
Bring in all libnpm modules + arborist as workspaces (#4166)
Added libnpm workspaces and arborist
Diffstat (limited to 'workspaces/arborist/docs')
-rw-r--r-- | workspaces/arborist/docs/audit-bundled.md | 25 | ||||
-rw-r--r-- | workspaces/arborist/docs/diff.md | 52 | ||||
-rw-r--r-- | workspaces/arborist/docs/filtered-reify.md | 93 | ||||
-rw-r--r-- | workspaces/arborist/docs/global.md | 57 | ||||
-rw-r--r-- | workspaces/arborist/docs/ideal-tree.md | 132 | ||||
-rw-r--r-- | workspaces/arborist/docs/links.md | 137 | ||||
-rw-r--r-- | workspaces/arborist/docs/logo.svg | 44 | ||||
-rw-r--r-- | workspaces/arborist/docs/optional-deps.md | 49 | ||||
-rw-r--r-- | workspaces/arborist/docs/parent.md | 164 | ||||
-rw-r--r-- | workspaces/arborist/docs/reify.md | 275 | ||||
-rw-r--r-- | workspaces/arborist/docs/tree-types.md | 30 | ||||
-rw-r--r-- | workspaces/arborist/docs/workspace.md | 181 |
12 files changed, 1239 insertions, 0 deletions
diff --git a/workspaces/arborist/docs/audit-bundled.md b/workspaces/arborist/docs/audit-bundled.md new file mode 100644 index 000000000..c71b2c800 --- /dev/null +++ b/workspaces/arborist/docs/audit-bundled.md @@ -0,0 +1,25 @@ +# audit advisories on bundled dependencies + +``` +root -> (x) +x -> (BUNDLE y) +y -> !known vulnerability! +``` + +With a normal dep, any version of `x` would be considered vulnerable if its +dependency on `y` is entirely contained by the vulnerable range on the `y` +advisory. + +However, with `bundleDependencies`, any version of `x` whose dependency on +`y` _intersects_ with the vulnerable range on the `y` advisory should be +considered potentally vulnerable. + +When considering the meta-vulnerability of any version of `x`, thus: + +- If `x -> y` is gone, `x` is not vulnerable. +- If `x -> y` is not in `bundleDependencies`, treat as we normally do. +- If `x -> y` IS a in `bundleDependencies`, then treat `x` as a meta-vuln + if `semver.intersects(x's dep on y, y's vulnerability range)` + +From there, we just treat it as a regular meta-vulnerability, and report +the appropriate fix accordingly. diff --git a/workspaces/arborist/docs/diff.md b/workspaces/arborist/docs/diff.md new file mode 100644 index 000000000..b2e0aadc6 --- /dev/null +++ b/workspaces/arborist/docs/diff.md @@ -0,0 +1,52 @@ +# tree diffing + +For the tree: + +``` +a ++-- b +| +-- c +| +-- d +| | +-- e +| +-- f ++-- x +| +-- y +| +-- z ++-- p + +-- q +``` + +To turn it into: + +``` +a ++-- b' +| +-- c' +| +-- d +| | +-- e' +| +-- f ++-- x +| +-- y' +| +-- z ++-- i + +-- j +``` + +That is, update b, c, e, and y, but not d, x, or z, and remove p and q, and +add i and j. + +The Diff set looks like: + +``` ++-- b (change) +| +-- c (change) +| +-- e (change) ++-- y (change) ++-- p (remove) ++-- i (add) + +-- j (add) +``` + +Each Diff node has a reference to the actual and ideal node that it refers +to, and a list of each leaf `diff` object under its tree, and a list of +each ideal tree node that isn't changing. diff --git a/workspaces/arborist/docs/filtered-reify.md b/workspaces/arborist/docs/filtered-reify.md new file mode 100644 index 000000000..169f0a1f7 --- /dev/null +++ b/workspaces/arborist/docs/filtered-reify.md @@ -0,0 +1,93 @@ +# filtered reify + +Use case: call `Arborist.reify()` but _only_ reify nodes that are the +dependencies of one specific node. + +Examples: + +- Global reification: we don't want to have to read the entire global space + to install a new package globally. Anything that is not in the + `explicitRequests` list should be ignored. Since `global` triggers + `globalStyle` bundling, this means that effectively any other top-level + node should be untouched. Then if it is present in the idealTree but not + in the actualTree, we don't end up deleting it, which has been the source + of some very unfortunate bugs. + +- Workspace reification: we want to be able to say `npm i -w foo -w bar` to + only install specific workspaces and their (possibly hoisted/shared) + dependencies. + +The goal here would be that if we accidentally have more nodes in the +actualTree that are outside of the filter, we don't accidentally remove +them, or if we have nodes in the idealTree that are outside the filter, we +don't accidentally add/change them. + +## approach 1 - limit actualTree and idealTree to filtered subtree + +This is closest to the current behavior with respect to global installs. +However, rather than relying on starting with a filtered actualTree, and +trusting that we only built the idealTree correctly, we'd add a step where +we explicitly filter/prune both trees to only include the dependencies of +the starting Node. + +Advantage is that reify and diff can remain unaware of dependency graphs. +Diff continues to just calculate the difference between trees, and reify +just makes the changes required to turn the existing actual nodes into the +ideal nodes. + +Would avoid previous global install problems by adding an extra step to +prevent any accidental inclusion of nodes that are outside of the expected +set. + +This would still require that the idealTree be built up in its entirety for +workspace installs, because some dependencies might be hoisted or shared, +and the idealTree that is _saved_ must still reflect these. + +So, how to save the idealTree in its entirety, but only Diff the filtered +set? Some kind of flag to tell Diff to ignore it? The value of this +approach is that Diff doesn't have to know about the dependency +relationships, so if we have to put duct tape on it to communicate those +relationships out of band, then that's not great. Maybe keep two +idealTree's around, one with the full set, and one that's filtered down to +just the bits we want to diff/install? + +## approach 2 - make Diff dependency-graph-aware + +This avoids having to maintain two idealTree instances, or do a subsequent +step to prune the trees before letting reify/diff operate on them, and +preserves the ability for reify() to remain dependency-agnostic. + +Instead of starting from the root and walking `children` and `fsChildren`, +the Diff algorithm would start from the root node, and walk only to the +target filter node, and then walk through the `edgesOut` to find nodes that +need to be diffed. + +### one pass + +Instead of walking the physical `children`/`fsChildren` tree, _only_ walk +the `edgesOut` and build up the diff object that way. + +The challenge here will be that we may end up several layers deep in the +physical tree, and then have to add a node that is shallower. So that +makes the diff walking algorithm a bit more complicated, because we're not +just slapping a new object into a child set, but may find ourselves +back-tracking up the tree as the graph wanders around. + +### two pass + +Make a first pass to create the set of nodes based on the edgesOut +traversal. Then do the current physical-tree based differencing walk, but +skipping any nodes that are not in the set. + +This is potentially expensive if the trees are very large (eg, a project +with thousands of workspaces), but is safe and very low complexity. + +Start with this. + +## approach 3 - make Reify dependency-graph-aware + +Given the existing complexity of reify(), making it have to look at the +dependency graph is a pretty big paradigm shift. Currently, the one thing +that makes it reasonable to reason about, is that while the algorithm is +somewhat involved, it really only has one job (turn one representation of a +tree into another). diff --git a/workspaces/arborist/docs/global.md b/workspaces/arborist/docs/global.md new file mode 100644 index 000000000..e5f782ffa --- /dev/null +++ b/workspaces/arborist/docs/global.md @@ -0,0 +1,57 @@ +# global installs + +Global installs differ from local installs in the following ways: + +1. Only the explicit requests are relevant. Ignore everything that is not + in the add or rm set. +2. Rather than start from a full actual and ideal tree, the "actual" tree + is whatever versions of the add and rm set exist. +3. All deps are nested under the top level dependency, rather than being + deduped up to the root. +4. The `binLinks` package handles this entirely, but the bins and mans need + to be linked differently. +5. We don't save a package.json or package-lock.json + +In a "global style" (but not truly global) install: + +1. Only the things being added at the root level (ie, root-level where the + diff.action is `'ADD'`) are relevant. Whether this is an explicit + addition or not is irrelevant. (Implication: global installs can start + with an empty dep set, so if nothing is being added, there's nothing to + do.) +2. We _do_ have to get the actual and ideal trees, so that we can correct + any issues caused by clobbering deduped deps at the top level. + (Implication: global installs can do a filtered actualTree read that + only includes the items that are deps of the effective root node.) +3. This is identical between global-style and global. (In fact, it's what + makes it "global style" in the first place.) +4. `binLinks` will get `global: false`, and do the right thing. +5. We do save a package.json and package-lock.json. + +## performing global installs + +```js +arborist.reify({ + // path up to where the node_modules lives + // maybe consider making this $PREFIX, and adding the /lib on non-windows + // systems, so that we can just pass in path: npm.prefix? + path: '/usr/local/lib', + global: true, + add: { dependencies: [ pkgSpec ] }, + rm: [ pkgName ], +}).then(tree => { + // tree is an empty root that includes only the deps added/removed +}) +``` + +## Node `global` Field + +Set `global: true` in the Node ctor to set the internal `_global` property. + +`get global` returns `this.root[_global]`. + +## Node `globalTop` Field + +`globalTop` getter returns true if the node is global, and is parent is the +root node, to reflect that global nodes directly in the `node_modules` +folder need to have their bins linked differently. diff --git a/workspaces/arborist/docs/ideal-tree.md b/workspaces/arborist/docs/ideal-tree.md new file mode 100644 index 000000000..1faec5a3b --- /dev/null +++ b/workspaces/arborist/docs/ideal-tree.md @@ -0,0 +1,132 @@ +# building the ideal tree + +Options: + +- update: Object || boolean. If `true`, then treat as `{all:true}` + - all: Boolean. Ignore the lockfile, update everything everywhere as if + it was a clean install. + - names: Array. Update any dependencies in the tree by that name. +- add: Object. Provide dependencies, devDependencies, peerDependencies, + and/or peerDependenciesMeta objects, and/or a bundleDependencies array, + to add the metadata to the root node. +- rm: Array. List of names to remove from the root node's dependencies + objects. +- prune: Boolean, default true. Remove extraneous nodes from the tree + after building the ideal tree. +- preferDedupe: Boolean, default false. Prefer to deduplicate packages if + possible, rather than choosing a newer version of a dependency. Default + is to always try to get the highest possible version dependency, even if + that causes more duplication. +- legacyBundling: Boolean, default false. Always nest packages in their + dependent's `node_modules` folder, even if they could be moved further up + the tree, deduplicating only when the dependency is already met by a + shallower node in the tree. This is the layout algorithm of npm v1 and + v2. + +## to BUILD IDEAL TREE: + +1. Load the root node's package and (if present) shrinkwrap +2. Add and remove deps to/from the root node's package if requested +3. If any named updates specified, then for each name on the list: + 1. Find all nodes in the tree by that name that are not in a shrinkwrap + or bundle + 2. For each dependent in their edgesIn set, add the dependent to the + queue of packages to be checked for updates +4. Add the root node to the queue of packages to be checked for updates +5. Loop until queue is empty: + 1. Sort the queue so that shallower nodes are in front, and then + alphabetical + 2. Shift the next node off the list + 3. If node has already been visited, continue to next node (already + visited) + 4. If node has been moved out of the tree or replaced, continue to next + node (it's no longer relevant) + 5. If node has a shrinkwrap, continue to next node (its deps are + locked) + 6. For each dependency in node.edgesOut: + 1. If part of a bundle or shrinkwrap, continue to next dependency + 2. If edge is invalid, or name is on the update.names list: + 1. Fetch the manifest for the dependency, and create a new Node + 2. Add Node to virtual root node, also load Node's + peerDependencies (and meta-peerDependencies) + 3. Attempt to PLACE the dep in the tree + 3. Add each placed node to the queue to be checked for updates +6. If the shrinkwrap was loaded from disk, and the tree was mutated, reset + all dependency flags to true (dev, optional, devOptionl, extraneous) +7. If the shrinkwrap was not loaded from disk, or the tree was mutated, + calculate dependency flags appropriately (like for a `loadActual` walk) +8. If options.prune is not false, and we started from a shrinkwrap and then + mutated the tree, prune any deps that are now extraneous. + +## to PLACE a dep in the tree to satisfy an edge for a node: + +Starting from the original thing depending on the dep, walk up the tree +checking each spot until we find the shallowest spot in the tree where the +dependency can go without causing conflicts. + +1. If edge is valid, and dep name is not on the update list, do not place +2. If the node is not a top node, and the dep is a peer dep, then the + starting TARGET is node.parent, otherwise it's node +3. Do until CONFLICT: + 1. CHECK if dep can be placed at TARGET + 2. If not CONFLICT, set result in CAN PLACE + 3. set TARGET to TARGET parent +4. If no satisfying target found, throw Unresolvable Dep Tree error +5. set TARGET to last non-conclict target checked +6. If CAN PLACE is KEEP, do not place +7. add dep to placed list +8. If an existing child by that name at TARGET, + 1. Prune any nodes that are only needed because of the node being + replaced. (Ie, the set of dependencies that are not valid for the + new node being placed, and have no dependents outside of the set.) + + This is relevant because a cycle of peer deps can make it + impossible to update otherwise. + 2. Replace the old child node with the new dep +9. Else, set dep.parent = TARGET +10. If the original edge is valid, but does not resolve to the newly placed + dep, and the newly placed dep could replace the node that the edge + _does_ resolve to, then we've created an unnecessary duplicate. Prune + the node that the edge resolves to. +11. If the newly placed node has any invalid edges in, add their sources to + the update queue +12. For each peer dep in the virtual tree (the dep's former parent) + 1. If the dep's current dependency on that peer is met, then continue + to next peer dep. + 2. PLACE to satisfy the peer edge +13. Return list of all placed nodes + +## to CHECK if a dep can be placed at a target to satisfy an edge for a node + +At each level walking up the tree, try to determine whether it's acceptable +to place the dependency at that point in the tree. + +This can return: + +- CONFLICT: it is incorrect to place that dependency here. Note that there + _are_ cases where we allow a dependency to be placed such that it causes + problems, but only because we then clean up those problems. +- OK: no problem putting the dep here. +- KEEP: there is already a version at this spot that satisfies the + dependency as well (or better) than the dep being placed. +- REPLACE: there is already a version at this spot, but the dep being + placed is better, so put that here instead. + +1. If a child by that name in target: + 1. If integrity matches current dep in tree, KEEP + 2. If node is root, REPLACE if peers can be placed + 3. If target has edge not met by dep, CONFLICT + 3. If current version is less than new version, REPLACE if peers can be + placed + 4. If `preferDedupe` option is set, and current node can satisfy edge, + then KEEP if peers can be placed. + 5. CONFLICT +2. Else, no child by that name in target + 1. If target is not node, and target has a dependency on dep's name, + and dependency is not met by dep, then CONFLICT + 2. Find any descendants of target with a dependency on dep's name, + which resolve to a node HIGHER in the dep tree than target, and + which would be made invalid if this dep took its place (ie, would be + made invalid if their deduped dependency was masked by dep). IF any + exist, CONFLICT + 3. OK to place dep in target diff --git a/workspaces/arborist/docs/links.md b/workspaces/arborist/docs/links.md new file mode 100644 index 000000000..8eb69ce6d --- /dev/null +++ b/workspaces/arborist/docs/links.md @@ -0,0 +1,137 @@ +# links and dep resolution + +A link can be to a package in `node_modules`, or a package outside of +`node_modules`, which is either under the root folder, or not under the +root folder. + +Link targets are considered a top-of-tree node if they are not within the +`node_modules` folder of another traversed node. + +Missing dependencies of top nodes are searched for by checking each +parent dir's `node_modules` folder, as high as the common ancestor of the +project root and the top node itself. Thus, if a link target is _within_ +the project root, it'll only search as high as the project root; if it is +_outside_ the project root, it'll only search as high as the nearest common +ancestor. + +If the dependency is found, then a "dummy node" is created at the parent, +such that the parent/fsParent node resolution can find it, and the dep node +is rooted there as a child of the dummy node. The dummy node is not fully +loaded with its package and all dependencies, so if it's just a +`node_modules` folder containing a zillion packages, it will only load +those packages that are depended upon by something in the tree. + +## Link within the tree: node is subsumed in tree + +In all of these, root depends on link, b depends on c + +``` +root ++-- node_modules + +-- link => root/node_modules/a/node_modules/b + +-- a + | +-- node_modules + | +-- b + +-- c +``` + +Edge from `root/node_modules/a/node_modules/b` -> `root/node_modules/c` + +Loads fine, target is found in tree already. + +## Link outside of tree, but in root folder + +``` +root ++-- b ++-- node_modules + +-- link => root/b + +-- c +``` + +Edge from `root/b` to `root/node_modules/c` + +Loads fine, b has no edges out. + +## Link in tree, but not traversed node + +``` +root ++-- node_modules + +-- link => root/node_modules/a/b + +-- a + | +-- b + +-- c +``` + +Edge from `root/node_modules/a/b` to `root/node_modules/c` + +Loads fine, because link target has no edges out. + +## Link outside of root folder + +``` +root ++-- node_modules + +-- link => b + +b ++-- node_modules + +-- c +``` + +Edge from `b` to `b/node_modules/c` + +Loads fine, because link target has no edges out. + +## Link into other tree + +``` +root ++-- node_modules + +-- link => a/node_modules/b + +a ++-- node_modules + +-- b + +-- c +``` +Edge from `a/node_modules/b` to `a/node_modules/c` + +Load `b`'s deps as high as common ancestor of `b` and `root`, finding `c` +at `a/node_modules/c`. + +## Link into other root folder + +``` +root ++-- node_modules + +-- j => i/j + +i ++-- j ++-- node_modules + +-- k +``` + +Edge from `i/j` to `i/node_modules/k` + +Loads because `i` is under common ancestor of `root` and `i/j`. + +## Link into other tree folder + +``` +root ++-- node_modules + +-- o => m/node_modules/n/o + +m ++-- node_modules + +-- n + | +-- o + +-- p +``` + +Edge from `m/node_modules/n/o` to `m/node_modules/p` + +Loads because `m` is under common ancestor of `root` and `o`. diff --git a/workspaces/arborist/docs/logo.svg b/workspaces/arborist/docs/logo.svg new file mode 100644 index 000000000..7b3ea97cb --- /dev/null +++ b/workspaces/arborist/docs/logo.svg @@ -0,0 +1,44 @@ +<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 20010904//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd"> +<svg version="1.0" xmlns="http://www.w3.org/2000/svg" width="448px" height="600px" viewBox="0 0 4480 6000" preserveAspectRatio="xMidYMid meet"> +<style type="text/css"><![CDATA[ +.title { + font-weight: bold; + font-size: 800px; + font-family: Futura, sans-serif; + fill: #fff; +} +]]></style> +<g id="layer101" fill="#333"> + <path d="M0 3000 l0 -3000 2240 0 2240 0 0 3000 0 3000 -2240 0 -2240 0 0 -3000z"/> +</g> +<g id="layer102" fill="#ffffff" stroke="none"> + <path d="M0 3000 l0 -3000 2240 0 2240 0 0 3000 0 3000 -2240 0 -2240 0 0 -3000z m2828 2978 c-3 -7 -27 -26 -54 -44 -84 -54 -114 -88 -114 -126 0 -57 -41 -79 -62 -33 -14 31 -29 32 -54 4 -31 -35 -71 -50 -109 -43 -50 9 -79 -27 -102 -124 -14 -63 -17 -125 -18 -337 0 -284 7 -333 65 -460 60 -131 127 -188 260 -224 70 -19 317 -54 330 -46 3 2 75 8 160 14 364 27 428 33 485 47 33 8 69 23 80 34 21 20 264 120 291 120 30 0 8 -25 -47 -52 -29 -15 -83 -39 -119 -53 -69 -28 -100 -50 -100 -71 0 -16 64 -71 104 -91 31 -14 42 -12 244 40 117 30 212 51 212 47 0 -19 -38 -34 -162 -65 -73 -18 -141 -36 -150 -40 -36 -15 -18 -47 87 -147 119 -114 246 -256 335 -374 98 -129 105 -189 8 -67 -97 121 -297 313 -327 313 -25 0 -11 -38 100 -276 16 -34 29 -65 29 -69 0 -18 -23 13 -61 83 -23 42 -47 84 -54 92 -7 8 -31 51 -55 95 -78 148 -107 178 -228 238 -59 29 -140 63 -179 77 -100 34 -108 23 -52 -72 113 -189 120 -261 10 -87 -86 135 -113 153 -238 157 -95 4 -251 -12 -265 -26 -10 -10 81 -147 231 -352 117 -158 160 -196 316 -272 50 -24 117 -62 150 -84 61 -41 143 -77 250 -110 l60 -19 -62 -3 c-35 -2 -63 -7 -63 -11 0 -9 96 -145 149 -210 39 -49 98 -105 142 -136 16 -11 35 -36 43 -55 8 -19 48 -77 90 -128 77 -95 92 -121 57 -102 -24 12 -72 69 -149 175 -60 82 -98 112 -128 101 -23 -9 -13 -77 25 -176 15 -41 31 -88 35 -105 6 -28 5 -28 -14 8 -11 20 -41 92 -66 160 -78 209 -119 282 -204 367 -42 41 -102 90 -135 109 -33 19 -80 47 -104 63 -44 28 -71 35 -71 20 0 -5 22 -98 49 -208 52 -211 66 -276 77 -363 12 -97 38 -159 187 -442 42 -81 77 -150 77 -155 0 -12 -16 7 -38 46 -94 167 -191 320 -205 320 -26 0 14 -232 103 -605 48 -204 96 -475 79 -458 -4 4 -21 81 -39 171 -17 90 -46 224 -65 299 -32 125 -47 164 -60 152 -13 -14 -25 -207 -19 -316 5 -83 4 -123 -3 -123 -7 0 -13 147 -16 442 -5 458 -10 522 -57 769 -22 117 -77 312 -107 380 -18 40 -79 134 -108 165 -34 37 -87 98 -216 249 -200 232 -263 300 -275 292 -13 -8 -1 -118 22 -205 17 -66 44 -97 93 -108 54 -12 152 -47 158 -57 3 -5 -3 -7 -13 -4 -50 14 -145 18 -159 7 -12 -10 -12 -17 6 -52 12 -23 35 -62 53 -87 43 -62 89 -140 104 -176 l12 -30 -22 27 c-13 15 -38 47 -57 73 -19 25 -39 44 -45 42 -12 -4 -12 -13 -3 -142 11 -136 -10 -70 -27 84 -11 106 -19 139 -49 207 -19 45 -58 160 -86 255 -60 211 -63 219 -92 245 -32 28 -69 41 -159 54 -112 17 -222 56 -250 90 -17 19 -156 120 -166 120 -25 0 -23 -17 7 -71 112 -203 251 -546 293 -722 21 -91 72 -219 161 -402 70 -145 76 -154 159 -232 47 -45 124 -107 171 -139 76 -50 90 -56 122 -51 20 4 53 18 74 32 46 30 49 30 49 6 0 -26 -44 -59 -89 -66 -86 -14 -91 -42 -17 -89 31 -20 81 -52 111 -71 60 -38 70 -56 28 -49 -56 10 -190 102 -272 184 -55 56 -176 141 -218 154 -12 4 -24 3 -27 -2 -10 -16 29 -111 115 -282 162 -323 346 -599 501 -752 59 -59 108 -110 108 -113 0 -12 -20 -2 -70 35 -29 22 -59 40 -66 40 -13 0 -5 -53 26 -166 10 -39 26 -104 34 -144 10 -47 33 -106 67 -169 70 -129 101 -192 96 -197 -2 -2 -30 36 -63 84 -82 120 -90 117 -85 -33 2 -98 1 -118 -10 -108 -9 9 -15 70 -21 205 -9 227 -26 308 -94 448 -88 185 -231 400 -264 400 -35 0 26 -464 91 -693 17 -60 29 -110 27 -113 -3 -2 -17 18 -32 46 -15 27 -30 50 -33 50 -9 0 3 -110 57 -520 48 -362 60 -492 40 -444 -4 11 -11 56 -15 100 -16 180 -66 457 -85 469 -14 9 -18 -18 -24 -161 -8 -177 -16 -236 -26 -209 -4 11 -3 101 2 200 15 274 4 735 -22 935 -51 407 -76 501 -182 713 -77 154 -172 378 -238 565 -24 67 -46 122 -50 122 -9 0 -9 -164 0 -325 5 -82 23 -230 40 -329 17 -98 30 -186 28 -194 -2 -8 -18 46 -36 121 -30 120 -52 171 -67 155 -3 -2 -8 -34 -11 -69 -4 -35 -13 -75 -20 -89 -13 -23 -14 -20 -8 35 3 33 10 92 15 130 5 39 12 183 15 320 10 376 -15 505 -166 856 -93 217 -151 312 -179 295 -53 -33 -37 -717 25 -1066 22 -120 60 -249 106 -360 45 -107 266 -551 310 -621 77 -126 153 -371 173 -559 8 -74 26 -218 40 -320 14 -102 34 -309 45 -460 11 -151 24 -308 30 -348 5 -40 10 -83 10 -95 -1 -21 -1 -22 -15 -3 -8 11 -15 28 -15 38 0 27 -27 33 -50 13 -19 -18 -20 -17 -20 14 0 18 8 46 18 64 15 28 17 60 16 242 -1 219 -10 325 -49 585 -13 85 -26 184 -30 220 -14 138 -72 333 -142 477 -19 39 -41 74 -49 77 -51 20 -36 -171 22 -282 26 -49 30 -62 16 -57 -34 14 -46 10 -53 -20 -4 -16 -8 -192 -9 -390 -2 -306 -8 -429 -21 -405 -1 3 -5 160 -8 350 l-6 345 -27 49 c-15 26 -44 100 -63 163 -23 72 -45 122 -58 135 -39 35 -184 138 -194 138 -16 0 -21 -53 -34 -310 l-11 -235 30 -60 c53 -105 134 -278 163 -349 44 -105 34 -123 -14 -26 -57 113 -155 286 -171 299 -10 9 -15 -25 -26 -162 -16 -219 -12 -316 16 -354 18 -24 166 -121 186 -123 4 0 38 -20 75 -43 66 -42 150 -123 139 -134 -2 -3 -37 22 -76 56 -74 64 -77 66 -99 52 -19 -12 -7 -59 46 -169 44 -93 50 -130 13 -82 -21 27 -22 30 -70 158 -44 114 -113 178 -212 197 -31 6 -38 4 -49 -16 -7 -13 -13 -62 -13 -109 -1 -75 3 -92 30 -150 24 -52 143 -216 174 -240 15 -11 131 -163 141 -183 5 -12 -8 -2 -30 23 -34 39 -41 43 -47 27 -4 -11 -4 -55 -1 -99 4 -45 5 -79 2 -76 -3 2 -14 48 -25 101 -25 126 -51 179 -112 239 -91 88 -96 67 -39 -157 21 -82 37 -152 36 -153 -2 -2 -13 14 -25 37 -32 63 -46 54 -64 -37 -10 -48 -11 -31 -11 129 0 168 -3 194 -28 291 -15 59 -27 123 -27 141 0 18 -6 41 -13 51 -12 16 -15 13 -35 -31 -12 -26 -24 -69 -28 -94 -4 -34 -10 -46 -22 -45 -22 1 -46 -51 -67 -146 -9 -43 -19 -78 -22 -78 -15 0 9 146 37 230 l32 95 -6 180 c-8 235 -48 467 -76 439 -10 -10 -40 -143 -40 -179 0 -13 -5 -36 -12 -52 l-11 -28 -7 30 c-4 17 -8 48 -9 70 -2 55 -17 79 -26 39 -26 -112 -54 -178 -70 -162 -16 16 30 138 120 319 61 121 63 65 -11 435 -49 247 -68 316 -88 333 -12 10 -17 8 -26 -8 -29 -55 -65 -612 -55 -856 13 -341 39 -528 110 -800 40 -152 69 -322 58 -335 -10 -11 -22 29 -32 111 -17 130 -78 329 -101 329 -17 0 -37 -52 -76 -193 -20 -75 -39 -136 -40 -134 -8 7 28 213 51 297 26 94 27 98 21 330 -6 226 -19 445 -32 503 -4 21 -9 25 -19 17 -11 -9 -73 -130 -141 -275 -17 -35 -19 -37 -22 -17 -2 13 6 46 17 72 22 50 27 80 13 80 -12 0 -57 -45 -107 -109 -27 -34 -50 -59 -53 -57 -11 12 46 101 132 203 167 200 169 205 186 553 10 224 30 1014 30 1229 l0 124 -21 -19 c-47 -41 -114 -169 -236 -449 -95 -216 -110 -259 -128 -353 -20 -103 -42 -349 -50 -552 -18 -480 -72 -1309 -86 -1325 -6 -5 -9 -2 -9 8 0 20 -16 23 -23 5 -3 -7 -6 16 -6 52 -1 83 23 302 50 450 26 149 35 265 21 274 -18 11 -39 -33 -77 -164 -94 -322 -166 -541 -172 -524 -2 5 17 81 42 168 25 87 44 160 42 162 -2 2 -23 -17 -47 -41 -24 -25 -50 -45 -58 -45 -18 0 48 124 91 171 18 20 39 51 46 70 8 19 19 48 26 64 18 42 55 159 72 230 14 59 32 505 20 505 -3 0 -25 -38 -48 -85 -24 -46 -45 -83 -47 -80 -3 2 17 72 42 155 67 215 91 352 67 389 -14 23 -49 3 -90 -52 -88 -116 -162 -309 -181 -467 -7 -52 -29 -179 -51 -282 -21 -103 -45 -247 -53 -320 -15 -136 -37 -232 -43 -192 -2 11 0 63 6 115 9 98 7 117 -13 85 -41 -70 -109 -166 -114 -162 -8 9 65 150 104 202 26 34 42 71 59 140 26 102 56 285 48 297 -8 13 -21 -2 -100 -115 -42 -62 -79 -106 -81 -100 -2 7 23 58 56 115 33 56 68 118 79 137 37 67 82 185 85 225 l3 40 -52 3 c-54 3 -99 -12 -156 -51 -65 -44 -172 -247 -223 -420 -16 -54 -30 -93 -32 -88 -4 11 13 129 24 169 10 37 -11 26 -72 -38 -31 -33 -59 -58 -62 -55 -7 6 92 140 160 216 31 35 82 100 113 143 72 100 112 136 203 182 40 21 75 41 78 46 11 17 -24 53 -87 91 -73 43 -89 44 -161 17 -65 -25 -65 -25 -44 -1 22 25 99 61 128 61 12 0 39 -12 60 -27 50 -34 110 -35 161 -4 40 25 224 243 269 318 35 60 103 231 95 239 -3 3 -45 1 -94 -5 -128 -16 -213 -15 -213 3 0 14 33 28 99 41 l34 7 -29 23 c-64 50 -98 84 -92 90 7 7 13 5 134 -59 97 -50 157 -52 194 -8 14 18 62 105 107 194 44 90 100 192 125 228 24 36 54 83 66 105 45 82 102 255 102 307 0 24 -4 28 -27 28 -77 0 -344 -226 -438 -370 -12 -19 -33 -44 -45 -55 l-23 -20 18 35 c10 19 51 90 92 158 41 67 73 129 69 137 -3 8 -27 25 -53 39 -27 13 -59 33 -73 45 -35 29 -9 27 68 -4 34 -14 75 -25 91 -25 36 0 186 75 256 128 71 54 101 109 145 272 21 78 42 145 46 148 6 7 -1 -70 -13 -129 -11 -54 15 -30 36 34 28 86 34 128 23 157 -6 17 -5 52 4 105 20 125 18 162 -10 180 -28 18 -47 12 -136 -45 -160 -102 -310 -252 -434 -434 -99 -145 -145 -269 -205 -556 -47 -223 -52 -216 -11 18 32 186 37 252 19 252 -18 0 -144 -133 -173 -183 -28 -47 -86 -190 -132 -324 -93 -273 -201 -497 -276 -576 -107 -112 -164 -231 -180 -379 -6 -54 -15 -98 -20 -98 -13 0 -8 170 7 235 8 33 15 64 15 69 0 8 -52 6 -190 -9 -50 -6 -53 -5 -37 8 10 9 51 23 90 32 124 30 170 52 228 107 59 57 154 194 193 279 93 204 203 540 219 672 6 48 -12 49 -74 5 -82 -60 -300 -260 -339 -312 -67 -88 -134 -215 -189 -356 -69 -177 -71 -183 -75 -179 -2 2 0 27 4 54 12 77 3 81 -90 36 -33 -16 -60 -27 -60 -25 0 10 37 42 107 93 81 58 153 154 153 201 0 49 -89 97 -200 108 -36 4 -65 12 -65 17 0 21 89 16 171 -9 45 -14 89 -26 99 -26 16 0 289 267 327 320 40 56 21 78 -88 95 -122 20 -223 45 -259 64 l-30 16 38 3 c21 2 82 -7 135 -20 71 -16 125 -22 202 -22 120 1 161 11 301 78 80 37 106 56 154 108 31 35 62 74 68 87 16 34 -9 48 -97 56 -121 12 -131 32 -20 41 38 3 84 11 103 16 37 11 114 85 163 156 16 23 33 40 37 38 4 -3 8 0 8 7 0 7 17 39 37 72 32 53 35 61 21 74 -20 20 -135 26 -222 12 -150 -25 -232 -90 -359 -283 -121 -185 -138 -209 -148 -215 -10 -7 26 62 136 257 42 74 78 144 80 155 13 83 -281 -70 -450 -235 -35 -34 -66 -60 -69 -57 -2 3 16 37 42 77 58 89 57 107 -4 79 -60 -26 -290 -182 -358 -241 -50 -44 -73 -57 -73 -42 0 4 34 42 76 85 51 52 75 83 71 94 -4 8 -28 28 -54 45 -58 36 -94 65 -73 58 142 -45 182 -52 234 -41 61 13 126 41 215 95 34 21 65 38 68 38 16 0 63 44 63 60 0 22 -45 41 -182 75 -138 34 -248 74 -248 90 0 10 10 10 48 -2 158 -50 279 -71 311 -53 27 14 26 41 -4 115 -14 33 -25 61 -25 63 0 1 11 2 25 2 32 0 49 -27 76 -124 26 -96 37 -116 65 -116 12 0 95 16 184 35 216 46 249 68 163 110 -22 11 -69 45 -104 76 -72 62 -128 99 -153 99 -51 0 -196 106 -196 143 0 25 13 21 61 -19 46 -39 130 -83 194 -104 54 -18 80 -35 145 -95 35 -33 76 -60 113 -73 78 -29 228 -45 390 -40 115 3 138 7 177 27 145 76 329 248 378 351 45 97 54 154 59 387 9 361 -19 518 -101 557 -38 18 -88 21 -116 6 -19 -10 -40 -59 -40 -94 0 -31 -20 -49 -37 -35 -8 7 -13 34 -13 73 0 52 -4 66 -22 83 -33 29 -56 39 -112 48 l-49 7 21 21 c20 20 25 20 64 8 153 -46 184 -52 244 -47 71 7 87 -4 128 -94 36 -82 49 -190 49 -425 0 -279 -21 -444 -73 -584 -23 -63 -15 -54 -105 -114 -38 -26 -92 -67 -119 -90 l-49 -44 89 1 c102 1 97 7 111 -124 l8 -75 47 95 c49 101 74 170 84 234 12 80 25 15 34 -170 11 -212 27 -405 36 -428 3 -9 9 -16 14 -16 11 0 20 143 20 320 1 174 15 459 24 468 4 3 20 -4 36 -16 42 -31 64 -29 56 4 -4 15 -18 41 -30 58 -26 35 -56 142 -56 201 0 21 5 53 11 70 9 25 9 38 -4 66 -20 43 -38 141 -28 152 4 4 14 -12 22 -35 13 -41 14 -34 20 152 7 222 21 284 56 249 13 -13 34 -3 106 52 29 21 77 48 107 61 30 12 84 37 120 57 65 35 167 80 183 81 5 0 7 -5 5 -12z m-1731 -2715 c-4 -3 -7 0 -7 7 0 7 3 10 7 7 3 -4 3 -10 0 -14z" stroke="none" stroke-width="0"/> + <path d="M3546 4548 c-23 -32 -20 -47 10 -54 17 -4 31 -1 41 9 15 14 15 18 -1 41 -20 32 -30 32 -50 4z" /> + <path d="M3038 4506 c-6 -14 -7 -33 -4 -43 7 -16 10 -15 32 10 13 15 24 35 24 43 0 24 -39 16 -52 -10z" /> + <path d="M3110 4516 c0 -8 4 -18 10 -21 13 -8 52 14 45 26 -10 15 -55 11 -55 -5z" /> + <path d="M3197 4524 c-4 -4 -2 -13 3 -20 13 -14 52 -6 47 10 -4 13 -40 20 -50 10z" /> + <path d="M2601 4492 c-7 -6 -11 -14 -7 -20 8 -14 92 -41 136 -44 28 -3 36 1 38 17 2 12 -6 25 -19 34 -27 16 -129 25 -148 13z" /> + <path d="M1583 4441 c-10 -10 -39 -41 -64 -67 -50 -54 -125 -171 -115 -182 15 -14 226 224 226 256 0 18 -28 14 -47 -7z" /> + <path d="M2800 4446 c0 -19 14 -28 34 -20 25 10 19 34 -9 34 -15 0 -25 -6 -25 -14z" /> + <path d="M2927 4453 c-4 -3 -7 -18 -7 -31 0 -18 20 -45 68 -92 37 -37 105 -109 150 -159 85 -92 109 -110 96 -69 -16 52 -173 257 -216 283 -16 9 -28 24 -28 34 0 31 -43 55 -63 34z" /> + <path d="M2276 4418 c10 -34 109 -225 146 -283 41 -64 51 -38 15 38 -14 29 -31 70 -38 92 -14 47 -40 95 -76 142 -30 39 -58 45 -47 11z" /> + <path d="M2073 4418 c-30 -36 -185 -572 -242 -837 -57 -263 -65 -361 -66 -756 0 -277 3 -369 13 -398 18 -51 36 -42 39 19 10 201 63 355 231 673 l102 192 0 257 c0 321 -34 804 -59 850 -6 11 -9 11 -18 0z" /> + <path d="M1314 4105 c-60 -78 -80 -125 -55 -125 23 0 95 119 89 148 -2 12 -11 5 -34 -23z" /> + <path d="M2170 3935 c0 -25 5 -45 10 -45 6 0 10 20 10 45 0 25 -4 45 -10 45 -5 0 -10 -20 -10 -45z" /> + <path d="M1823 3895 c-21 -41 -37 -105 -48 -185 -12 -87 -46 -256 -62 -305 -9 -27 -15 -50 -13 -50 6 0 -53 -89 -96 -146 -46 -60 -98 -163 -91 -182 2 -6 32 29 68 77 35 49 76 100 91 114 15 13 34 42 43 64 33 84 142 611 132 641 -2 7 -13 -6 -24 -28z" /> + <path d="M2175 3787 c-9 -48 -2 -573 8 -583 4 -4 12 -3 18 3 15 15 8 571 -8 598 -10 18 -12 16 -18 -18z" /> + <path d="M2609 3658 c-3 -36 7 -76 56 -213 67 -189 124 -325 136 -325 22 0 -5 125 -45 214 -18 40 -36 88 -40 108 -3 19 -18 48 -31 64 -16 19 -25 41 -25 64 0 46 -48 127 -51 88z" /> + <path d="M2102 3133 c-121 -155 -206 -349 -231 -528 -16 -112 -13 -165 15 -310 27 -142 43 -332 27 -322 -5 3 -10 27 -12 54 -5 120 -38 260 -67 283 -12 11 -14 5 -14 -36 0 -47 43 -310 96 -587 13 -70 24 -136 24 -147 0 -37 15 -20 45 48 16 37 47 108 70 157 108 231 126 367 125 930 -1 361 -4 467 -16 487 -11 17 -33 7 -62 -29z" /> + <path d="M2224 3069 c-11 -23 -15 -49 -12 -91 5 -70 24 -98 32 -47 3 19 8 52 11 74 l7 40 19 -24 c27 -33 36 -18 19 32 -20 56 -52 62 -76 16z" /> + <path d="M2876 2903 c-7 -19 40 -123 56 -123 11 0 3 47 -19 103 -16 40 -27 46 -37 20z" /> + <path d="M2284 2869 c-14 -24 -27 -320 -17 -404 16 -126 45 -178 182 -318 64 -64 124 -117 133 -117 26 0 31 37 18 146 -18 157 -78 317 -208 552 -31 57 -63 115 -70 128 -14 26 -27 30 -38 13z" /> + <path d="M2225 2753 c-12 -28 -18 -241 -7 -223 10 16 32 162 32 212 0 31 -14 37 -25 11z" /> + <path d="M2960 2729 c0 -19 40 -81 57 -86 19 -7 4 32 -27 68 -20 23 -29 29 -30 18z" /> + <path d="M2217 2400 c-8 -18 -47 -432 -47 -501 0 -66 16 -70 29 -7 23 119 39 552 18 508z" /> + <path d="M1735 2270 c-11 -18 -8 -81 4 -93 19 -19 33 7 29 53 -3 43 -19 62 -33 40z" /> + <path d="M2540 1955 c0 -31 24 -90 40 -100 33 -20 27 48 -8 103 -19 29 -32 28 -32 -3z" /> + <path d="M2110 1733 c-12 -15 -47 -79 -78 -141 -53 -106 -57 -119 -57 -185 0 -40 12 -142 27 -227 15 -85 30 -201 34 -258 4 -56 10 -105 14 -107 18 -11 31 31 45 151 36 313 65 627 65 696 0 92 -16 114 -50 71z" /> +</g> +<rect width="98%" height="12%" fill="#080" x="1%" y="82%" fill-opacity="0.9"/> +<text x="50%" y="93%" dominant-baseline="bottom" text-anchor="middle" class="title">ARBORIST</text> +</svg> diff --git a/workspaces/arborist/docs/optional-deps.md b/workspaces/arborist/docs/optional-deps.md new file mode 100644 index 000000000..03c22f9a3 --- /dev/null +++ b/workspaces/arborist/docs/optional-deps.md @@ -0,0 +1,49 @@ +# handling optional dependencies failures + +If optional dependencies fail to install, we should remove the optional dep +from the disk, but keep them included in the lockfile. If they are +unresolveable, then they should be removed from the `idealTree`. + +Types of failure: + +1. Optional dependency metadata failure (404, etc.) `optional-dep-missing` +2. Optional dependency tgz download failure `optional-dep-tgz-missing` +3. Optional dependency version not satisfiable `optional-dep-enotarget` +4. Optional dependency build failure `optional-dep-postinstall-fail` +5. Optional dependency specifies unsupported engine or platform `optional-platform-specification` +6. Optional dependency depends on any failure above + 1. `optional-metadep-missing` + 2. `optional-metadep-tgz-missing` + 3. `optional-metadep-enotarget` + 4. `optional-metadep-postinstall-fail` + +These failures can occur in one of three places: + +1. buildIdealTree (unresolveable or failed metadata optional dep/meta-dep) + +2. reifyNode (failure to download/unpack tgz, engine/platform unsupported) + +3. lifecycle scripts + + + +After taking the diff, get the list of modules to remove `trashList` + +Also add the retired folders to the trash list when they're ready to be +removed + +If a dep fails, and it has `optional: true`, then we have to find the entry +point into the optional tree. Eg, `a -> b -OPT-> c -> d`, if `d` fails, +then we need to remove `c`, as well as any dependencies of `c` that are not +depended upon by any modules outside of the set. + + + +- Gather the set of nodes linking into the node that are not `optional: + true`. We know this won't be unbounded, because the node is optional in + the first place. + +- Add the set of dependencies from that root node that don't have any + edgesIn from outside the set. + +- Add the nodes in the set to the `trashList`. diff --git a/workspaces/arborist/docs/parent.md b/workspaces/arborist/docs/parent.md new file mode 100644 index 000000000..4398cbe72 --- /dev/null +++ b/workspaces/arborist/docs/parent.md @@ -0,0 +1,164 @@ +# parent/children, fsParent/fsChildren, target/linksIn, edges, root, and top + +Each Node object has a link to certain other nodes within the tree, +reflecting various different relationships between packages on disk. + +## root + +There is a fundamental difference between "the project where we're working" +and "a thing loaded as a dependency". When working in a project, we expect +metadata to be updated, new installs to be saved as new dependencies, +`peerDependencies` installed directly in `node_modules`, and so on. + +The root node represents the ultimate "client" of all of this package +resolution. Thus, it's important to keep metadata up to date in the root +node, so that the dependencies in use by the project can be kept +consistent. + +## parent + +If a package is located in a `node_modules` folder, then its `parent` +represents the the package that contains that `node_modules` folder. + +For example: + +``` +x ++-- node_modules + +-- y +``` + +In this tree, `x` is `y`'s parent. + +`y` can resolve its dependents either within its own `node_modules` folder +(ie, it's `children` nodes), or its parent's `node_modules` folder (ie, +it's parent's `children`), or its parent's parent's, and so on up the tree. + +Setting `node.parent` to another Node object will move it into that +location in the tree, automatically adding it to `parent.children`, +replacing an existing node, updating its path property, and so on. Set the +parent to `null` to remove a node fro the tree entirely. + +## children + +The `children` member is a Map of names to Node objects. In the example +above, `x.children.get('y')` returns the `y` Node. + +When resolving dependencies, `y` can look in its own `children` for a +match, or in the `children` of its parent, and so on up to the top of the +tree. + +## fsParent + +Package nodes are not only found in `node_modules` folders. They can be +symlinked into place from anywhere on the file system. + +If a package is underneath the folder of another Node, we call that it's +file system parent node, or `fsParent`. This is relevant when looking for +dependencies, because Node's `require()` lookup algorithm will walk up the +file system looking for resolutions. + +For example, consider this folder tree: + +``` +root ++-- packages +| +-- foo +| +-- bar ++-- node_modules + +-- asdf + +-- quux + +-- foo -> ../../packages/foo + +-- bar -> ../../packages/bar +``` + +When the code in `root/packages/foo` calls `require('asdf')`, it'll walk up +the folder tree, checking for a match first in +`root/packages/foo/node_modules`, then `root/packages/node_modules`, then +`root/node_modules`, and find the match at `root/node_modules/asdf`. + +Thus, the `foo` and `bar` nodes can find their dependencies in the children +of the `root` package. + +To reflect this relationship, the `foo` and `bar` nodes will have the +`root` node set as their `fsParent`. The `root` node will have `foo` and +`bar` in its `fsChildren` set. (This is a Set rather than a Map, because +the name is not relevant. See: path changes below for why this is a +bidirectional relationship at all.) + +## target/linksIn + +`Link` nodes represent a symbolic link to a package folder. + +The link's `target` property is the node representing the real path to the +package on disk. The targets of link nodes have a `linksIn` set of all +the known links to their package folder. + +This is relevant because Node's `require()` resolution algorithm walks from +the _realpath_ of a module, not the location where it was found. + +Links do not need to point to a package on disk that is underneath the root +node, like in the example above. The target of a link can be literally +anywhere. For example: + +``` +/home/isaacs/dev ++-- my-project <-- root node +| +-- node_modules +| +-- dep -> /home/isaacs/dev/dep <-- link node ++-- dep + +-- node_modules + +-- meta-dep +``` + +When we're working in `my-project`, that's the `root`. However, when +`my-project` calls `require('dep')`, it'll look in +`my-project/node_modules/dep` and find the symlink there. + +When the code in the `dep` package calls `require('meta-dep')`, it won't +look in `my-project/node_modules/meta-dep`, it'll start its search in +`dep/node_modules/meta-dep`. + +To reflect this, the `dep` node does not have a `parent` or `fsParent` +reference. All of its dependencies must be installed in its own +`node_modules` folder. + +## top + +A node without a parent is called a `top` node. The `root` node is +_typically_ a top node, but not necessarily. (Technically, the `root` node +is wherever we're doing work; there's nothing stopping us from also loading +a node above it in the fs hierarchy, and we may do so for workspaces.) + +## edgesIn, edgesOut + +Unlike file systems, dependencies are a _graph_ rather than a strict +hierarchical tree structure. (That is, there can be cycles.) + +An `Edge` represents a particular _dependency_ relationship, regardless of +the location on disk of the nodes in that relationship. + +Each edge has `from` and `to` fields referring to the nodes in question. +`from` is the node with the dependency, and `to` is node that satisfies the +dependency (or `null` if it's missing). Each edge also has a `type`, which +can be one of `prod`, `dev`, `optional`, `peer`, or `peerOptional`. + +The `edgesOut` property is a map of names to edge objects representing all +the dependencies of a given node. The `edgesIn` property is the set of all +edges that resolve to the node. + +Edges in and out are always kept in sync by refreshing whenever a change is +made to the tree that can affect dependency resolution. + +A node's `edgesOut` property is a Map of names to edges. A node's +`edgesIn` property is a Set of all edges resolving to the node. + +## refreshPath + +Whenever a node changes location on disk, several other fields are changed +to keep the tree valid. + +The paths of its `children` and `fsChildren` nodes are updated accordingly. + +If a Node has linksIn, then the Link nodes' `realpath` is updated to still +point to the same path on disk. diff --git a/workspaces/arborist/docs/reify.md b/workspaces/arborist/docs/reify.md new file mode 100644 index 000000000..15b047267 --- /dev/null +++ b/workspaces/arborist/docs/reify.md @@ -0,0 +1,275 @@ +# tree reification + +For the tree: + +``` +a ++-- b +| +-- c +| +-- d +| | +-- e +| +-- f ++-- x +| +-- y +| +-- z ++-- p + +-- q +``` + +Turn it into: + +``` +a ++-- b' +| +-- c' +| +-- d +| | +-- e' +| +-- f ++-- x +| +-- y' +| +-- z ++-- i + +-- j +``` + +That is, update b, c, e, and y, but not d, f, x, or z; delete p and q, and +add i and j. + +## Considerations + +- Windows will fail with EPERM when trying to move or delete a folder with + current (or even recent) file operations happening inside it. +- Nodes with `bundleDependencies` _also_ bundle the meta-deps of their + bundled deps. But, we don't know what those are from the manifest, and + it's possible that there's overlap between bundled metadeps and regular + dependencies, meaning that there can be extraneous nodes in the ideal + tree once all the tarballs are unpacked. +- This is exceedingly time-sensitive. +- Users generally expect that a failed install should not make their app + unusable. + +## Reify-A: Safe Rollback-able Process + +This algorithm avoids ever renaming a directory, or removing a directory +with recent writes (except in the case of failure rollbacks), so as to +minimize the chances of hitting Windows file-locking EPERM issues. + +It is very safe, and somewhat disk-inefficient. + +### step 1: retire shallowest nodes to be replaced + +Move aside the nodes that will be replaced. They'll be removed if all goes +well, but if there's an error, we'll move them back. + +(Note: `.b-hash` is actually something like `.b-<8 chars of base64 sha1>`. +We could probably do this _slightly_ faster if we didn't hash the folder, +and instead used a name like `.b-retired`, but not sure if it's worth +it? Seems like it _should_ be safe from collisions? If we're gonna hash +it to defend against concurrent reification commands, then it ought to +include the process id or something, which it currently doesn't, and limits +our options in the future.) + +``` +a ++-- b (.b-hash) +| +-- c +| +-- d +| | +-- e +| +-- f ++-- x +| +-- y (.y-hash) +| +-- z ++-- p (.p-hash) + +-- q +``` + +Fail: rename each retired `.${name}-hash` folder back to `${name}` + +### step 2: create sparse tree + +Now that the shallowest changing nodes are retired, `mkdirp` all leaf +nodes. + +``` +a ++-- b (.b-hash) +| +-- c +| +-- d +| | +-- e +| +-- f ++-- b (empty) +| +-- c (empty) +| +-- d (empty) +| +-- e (empty) ++-- x +| +-- y (.y-hash) +| | +-- z +| +-- y (empty) ++-- p (.p-hash) +| +-- q ++-- i (empty) + +-- j (empty) +``` + +Fail: rimraf sparse tree, fail step 1 + +### step 3: load shrinkwraps + +Unpack any shrinkwrapped nodes, and `loadVirtual` on them, and then update +the ideal tree with their virtual tree nodes. These will always be leaf +nodes, because any `hasShrinkwrap` nodes did not have their deps loaded in +the buildIdealTree step. + +If no shrinkwraps present, proceed to step 4. + +Else: + +- loadVirtual in the newly unpacked shrinkwrap node using the current ideal + node as the root of the virtual tree +- diff trees again +- create the sparse tree + +Fail: fail step 2 + +### step 4: unpack bundled deps into sparse tree and update + +1. Group all nodes with bundles by depth. +2. For depth = 0 to max, for all nodes at that depth with bundleDeps, + 1. If no nodes at that level with bundles still in tree, done + 2. extract all nodes with bundles at that depth level in parallel + 3. call `loadActual` on each one, and move each of the actual subtree's + children under the node in the ideal tree. +3. If any bundleDeps unpacked, prune tree + +Fail: fail step 2 + +### step 5: unpack sparse tree of nodes changing + +``` +a ++-- b (.b-hash) +| +-- c +| +-- d +| | +-- e +| +-- f ++-- b' +| +-- c' +| +-- d (empty) +| +-- e' ++-- x +| +-- y (.y-hash) +| | +-- z +| +-- y' ++-- p (.p-hash) +| +-- q ++-- i + +-- j +``` + +To maximize parallelization while minimizing unnecessary fetches for +bundled deps and meta-deps: + +For all diff nodes in parallel, `pacote.extract` if add/change. + +Fail: fail step 2 + +### step 6: move unchanging nodes into sparse tree + +``` +a ++-- b (.b-hash) +| +-- c +| +-- d (empty) +| +-- e ++-- b' +| +-- c' +| +-- d +| | +-- e' +| +-- f ++-- x +| +-- y (.y-hash) +| +-- y' +| +-- z ++-- p (.p-hash) +| +-- q ++-- i + +-- j +``` + +This actually means that we move each unchanging node's _contents_ (other +than `node_mdules`) into the new location. (Maybe we ought to _only_ ever +move files, not directories?) + +Fail: move unchanging nodes back to retired tree, fail step 2 + +**Windows Consideration!** Extremely easy for a failure in this step to +lead to EPERM in the rollback, if we try to rimraf the sparse tree before +we're fully moved out of it. + +### step 7: rimraf retired original (now sparse) nodes and removal nodes + +``` +a ++-- b' +| +-- c' +| +-- d +| | +-- e' +| +-- f ++-- x +| +-- y' +| +-- z ++-- i + +-- j +``` + +Fail: report failure as a warning and instruct user to `rm -rf` the garbage +directory. + +## Reify-B: Fast and Dirty Approach + +This is the fastest and most efficient way to proceed, but does not admit +any reasonable rollback approach, since we fully delete existing packages +_before_ unpacking the new ones. + +### step 1: clear away excess + +Scan actualTree inventory. For each node: + +- If not in idealTree, rimraf path, remove from actualTree (so we don't + check its children). +- If in ideal tree and identical resolved/integrity, leave it alone. +- If in ideal tree and different, and no unbundled children, rimraf path, + remove from actualTree +- If in ideal tree and different and has unbundled children, + - rimraf package contents + - rimraf bundled children and remove them from tree + +### step 2: lay the path + +`mkdir` every path in ideal tree that does not exist in actualTree + +### step 3: extract shrinkwrap modules + +Each node with a shrinkwrap is already a tip of the tree, because its child +nodes are not traversed in the ideal tree. + +Unpack them into place, and then call loadVirtual using the shrinkwrapped +node as the root, this filling out the ideal tree with the virtual nodes. + +Repeat step 2 if any shrinkwrap nodes were created. + +### step 4: extract bundlers + +1. Group all nodes with bundles by depth. +2. for depth = 0, for all nodes at that depth with bundleDependencies, + 1. If no nodes at that level with bundles still in the tree, done. + 2. extract all nodes with bundles at that depth level in parallel + 3. call `loadActual` with a new Arborist on each one, and move each + depth=1 child node into the ideal tree. +3. if any bundle deps unpacked, prune tree + +### step 5: extract all others + +Extract all remaining packages in parallel. + +Their folders already exist, so it's fine to just dump them all into place. diff --git a/workspaces/arborist/docs/tree-types.md b/workspaces/arborist/docs/tree-types.md new file mode 100644 index 000000000..5ffc6c261 --- /dev/null +++ b/workspaces/arborist/docs/tree-types.md @@ -0,0 +1,30 @@ +# Tree Types + +Arborist handles three different types of tree: + +- `arborist.actualTree` - This is the representation of the actual packages + on disk. It's loaded by calling `arborist.loadActual()`. + +- `arborist.virtualTree` - This is the package tree as captured in a + `package-lock.json` or `npm-shrinkwrap.json`. It's loaded by calling + `arborist.loadVirtual()`. + + This method _may_ be called with an argument + specifyig the node to use as the `root` of the tree, like + `arborist.loadVirtual({ root: nodeObject })`. If a root is not specified + then a missing shrinkwrap is treated as a failure. If a root is not + specified, then a shrinkwrap file must be present, or the virtual load + fails. + +- `arborist.idealTree` - This is the tree of package data that we intend to + install. It's built up based on the shrinkwrap/lockfile if present, the + dependencies in package.json, and any add/remove/update changes requested. + + This is loaded by calling `arborist.buildIdealTree(options)`. The + options object can specify what to add, remove, and/or update. + +During [reification](./reify.md), the `idealTree` is [diffed](./diff.md) +against the actual tree, and then the nodes from the ideal tree are +extracted onto disk. At the end of `arborist.reify()`, the ideal tree is +copied to `arborist.actualTree`, since then it reflects the actual state of +the `node_modules` folder. diff --git a/workspaces/arborist/docs/workspace.md b/workspaces/arborist/docs/workspace.md new file mode 100644 index 000000000..d8c09e937 --- /dev/null +++ b/workspaces/arborist/docs/workspace.md @@ -0,0 +1,181 @@ +# workspaces + +Conceptual install example: + +``` +root ++-- node_modules +| +-- app => app +| +-- a => pkgs/a +| +-- b => pkgs/b +| +-- c => pkgs/c +| +-- x +| +-- y +| +-- z +| +-- i +| +-- n ? +| +-- j +| +-- k ++-- app ++-- pkgs + +-- a (b, c, x, m@file:./m) + | +-- m (n) + | | +-- node_modules + | | +-- n ? + | +-- node_modules + | +-- n ? + | +-- m => pkgs/a/m + | +-- x + | +-- b => pkgs/b + | +-- c => pkgs/c + +-- b (a, c, y) + +-- c (a, b, z) + +-- w (nested workspace) + +-- pkgs + +-- j + +-- k +``` + +- `root/package.json`: + +```json +{ + "workspaces": [ + "pkgs/*", + "app" + ] +} +``` + +root `Node` + +- be aware it's a **Top-level workspace** +- 4 Edges in edgesOut with type=workspace, referencing workspace children + - that means we create a link node in root.children.get('app') + targetting `./app` Node, etc. +- during buildIdeal: + - need to know that app is in root's workspace + - app.wsParent = root + - root.wsChildren.add(app) + - if any dep CAN be satisfied by a named dep in the workspace, then + create a Link targetting that workspace child node + - resolving: _first_ check this.wsParent.get('dep-name'), and if + that's ok, then resolve with a link to that target. + - no hoisting by default: when doing `_canPlaceDep`, if target is + node.wsParent, AND we're not hoisting, THEN: return CONFLICT +- When setting Node.parent or Node.fsParent, set wsParent = parent.wsParent + +## Questions / Thoughts +- What a Workspace node class does? + - Add metadata to lock files + - Glob only relevant for loadActual +- maybe wsParent wsChildren +- when creating +- start with loadActual + - loadActual.loadFSNode + - on check for path is real add another check to make load from/as a workspace +- General impl ideas + - Do we support an opt-out flag? + - Do we want to add extra top-level commands to the cli? + - `npm workspaces info`: Retrieves workspaces locations (might be useful for tooling) + - What do we do with binaries? Do we only link bins for workspaces that are defined as a top-level dep? + - In order to support subsets of packages the way Wes been advocating for, instead of supporting adjacent folders with package.json defining different workspaces, what if we were to support a different workspaces config definition within the top-level package.json - maybe something that allows for definition of subsets? something in the likes of: + ``` + "workspaces": [ + "packages": { + "set-a": [ + "packages/*" + ], + "set-b": [ + "express/express-namespace", + "express/express" + ] + } + ] + ``` + - But then we'd need a way for commands to be aware of these subsets... + +## Registry vs workspace relationship: + +``` +given that registry has latest: a@1.1.0 + +workspace-root ++-- packages + +-- a (a@1.0.1) + +-- b (a@^1.0.0) + +npm install + +workspace-root ++-- packages + +-- a (a@1.0.1) + +-- b (a@^1.0.0) + +-- node_modules + +-- a -> ../../a + +workspaces always prefer to install a nested package if semver is satisfied +``` + +``` +given that registry has latest: a@2.0.0 + +workspace-root ++-- packages + +-- a@1.0.1 + +-- b (a@^2.0.0) + +npm install + +workspace-root ++-- packages + +-- a@1.0.1 + +-- b (a@^2.0.0) + +-- node_modules + +-- a@2.0.0 + +workspaces will try to install deps from registry if no satisfying semver version was found within its nested packages +``` + +## Implementation + +### Build Ideal Tree + +1. Read list of "workpaces" from `package.json` +2. Turn globs into actual locations, retrieve the final list of workspaces paths +3. Arborist needs to be made aware of the list of worskpaces paths + 1. Workspace info parsed (steps 1 and 2) needs to be attached before build ideal tree + 2. On building ideal tree, checks against existing workspaces to append them as child nodes + 3. Edge needs to support a `workspace` type + 4. Edge `satisfiedBy` and/or the `dep-valid` module needs to check against the available workspaces + 5. `edgesOut` with `type: workspace` +4. Represent a workspace node within a tree - is it just `node.workspace` ? + +NOTE: +- Now the build ideal tree is reading fs other than `package.json` and `package-lock` + +### Load Virtual + +1. How to figure out all the structure of workspaces form a pakcage-lock + 1. How it gets saved? + 2. How to build the virtual tree out of reading package-lock +2. maybe support a subset of glob? we need to optimize mapWorkspace regardless +3. lib/shrinkwrap.js has to be aware of workspaces and the structure + 1. Reading from lockfile might render nodes extraneous? + 2. include workspaces map into package-lock files + 3. loadVirtual reads from the lock file and put nodes in the right place + + +### Reify + +1. Correctly symlink workspaces to its dependants `node_modules` + +## Open Ended Questions + +- Maybe do not hoist workspace dependencies by default? + +## Missing: +- Nested workspaces + - Add support to `mapWorkspaces` in buildIdealTree `_linkFromSpec` + - Tests +- Test case with a mixed relationship between `file:` deps and workspaces |