Distributed Task Execution (DTE)

Lerna speeds up your average CI time with caching and the --since flag. But neither of these features helps in the worst-case scenario. When something at the core of your repo has been modified and every task needs to run in CI, the only way to improve performance is to add more agent jobs and parallelize the tasks efficiently.

The most obvious way to parallelize tasks is to split them up by type: running all tests on one job, all builds on another, and all lint tasks on a third. This strategy is called binning. It becomes difficult if some test tasks have build tasks as prerequisites, but assuming you find a way to handle that, a typical setup can look like the diagram below. Here the test tasks are delayed until all necessary build artifacts are ready, but the build and lint tasks can start right away.

CI using binning

The problem with the binning approach is that you end up with idle time on one or more jobs. Nx's distributed task execution minimizes that idle time by assigning each individual task to an agent job based on the task's average run time. Nx also guarantees that tasks are executed in the correct order and uses distributed caching to make sure that build artifacts from previous tasks are present on every agent job that needs them.
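To make the idea of assigning tasks by average run time concrete, here is a minimal sketch of one simple strategy: sort tasks from longest to shortest and give each one to the agent with the least work so far. This is only an illustration. Nx Cloud's actual scheduling algorithm is not described here, the function name assign_tasks is hypothetical, and the sketch ignores task ordering constraints and caching.

```shell
# assign_tasks AGENTS
# Hypothetical helper, not part of Lerna or Nx Cloud.
# Reads "task-name average-seconds" lines on stdin and greedily assigns
# each task (longest first) to the currently least-loaded agent.
assign_tasks() {
  sort -k2,2nr | awk -v agents="$1" '
    {
      # Find the agent with the smallest accumulated run time.
      best = 1
      for (i = 2; i <= agents; i++)
        if (load[i] + 0 < load[best] + 0) best = i
      load[best] += $2
      printf "agent %d: %s (%ds)\n", best, $1, $2
    }
    END {
      # The slowest agent determines the overall wall-clock time.
      max = 0
      for (i = 1; i <= agents; i++)
        if (load[i] + 0 > max) max = load[i]
      printf "makespan: %ds\n", max
    }'
}
```

For example, piping five tasks with made-up timings of 60, 50, 40, 30 and 20 seconds into assign_tasks 3 spreads them so that the busiest agent works for 70 seconds, close to the 200 / 3 ≈ 67 second lower bound for three agents.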

When you set up Nx's distributed task execution, your task graph will look more like this:

CI using DTE

Not only will CI finish faster, but the debugging experience is also the same as if you had run all of your CI on a single job. That's because Nx uses distributed caching to recreate all of the logs and build artifacts on the main job.

Find more information in the detailed guide to improving your worst-case CI times.

Set up

To distribute your task execution, you need to (1) connect to Nx Cloud and (2) enable DTE in your CI workflow. Each of these steps can be done with a single command:

1. Connect to Nx Cloud
nx connect-to-nx-cloud
2. Enable DTE in CI
nx generate @nrwl/workspace:ci-workflow --ci=github

The --ci flag can be github, circleci or azure. For more details on setting up DTE, read this guide.

CI Execution Flow

Distributed task execution can work on any CI provider. You are responsible for launching jobs in your CI system. Nx Cloud then coordinates the way those jobs work together. There are two kinds of jobs that you'll need to create in your CI system.

  1. One main job that controls what is going to be executed

  2. Multiple agent jobs that actually execute the tasks

The main job execution flow looks like this:

# Coordinate the agents to run the tasks
- npx nx-cloud start-ci-run
# Run any commands you want here
- lerna run lint --since=main & lerna run test --since=main & lerna run build --since=main
# Stop any runaway agents
- npx nx-cloud stop-all-agents

The agent job execution flow is very simple:

# Wait for tasks to execute
- npx nx-cloud start-agent

The main job looks more or less the same as it would without any distribution. The only thing you need to do is invoke npx nx-cloud start-ci-run at the beginning and, optionally, npx nx-cloud stop-all-agents at the end.

The agent jobs run long-running start-agent processes that execute all the tasks associated with a given CI run. The only thing you need to do to set them up is invoke npx nx-cloud start-agent. This process keeps running until Nx Cloud tells it to terminate.

Note that it's important for the main job and the agent jobs to have the same environment and the same source code. They start at around the same time, and once the main job completes, all the agents are stopped.

It's also important to note that an Nx Cloud agent isn't a machine but rather a long-running process that runs on a machine. That is, Nx Cloud doesn't manage your agents; you need to do that in your CI config (check out the CI examples below).

Nx Cloud is an orchestrator. The main job tells Nx Cloud what you want to run, and Nx Cloud distributes those tasks across the agents. Nx Cloud automatically moves files from one agent to another and from the agents to the main job.

The end result is that when, say, lerna run build --since=main completes on the main job, all the file artifacts created on the agents are copied over to the main job, as if the main job had built everything locally.

Running Things in Parallel

--concurrency is propagated to the agents. For example, npx lerna run build --since=main --concurrency=3 --dte tells Nx Cloud to run up to 3 build targets in parallel on each agent. So if you have, say, 10 agents, you will run up to 30 builds in parallel across all of them.
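The arithmetic above is simply the number of agents multiplied by the per-agent concurrency. As a quick sanity check when sizing a CI setup, it can be written as a tiny helper. The function name max_parallel_tasks is hypothetical and is not part of Lerna or Nx Cloud.

```shell
# max_parallel_tasks AGENTS CONCURRENCY
# Hypothetical helper: upper bound on tasks running at once across all
# agents when each agent runs up to CONCURRENCY tasks in parallel.
max_parallel_tasks() {
  echo $(( $1 * $2 ))
}
```

Calling max_parallel_tasks 10 3 prints 30, matching the example in the text. This is an upper bound; the number actually running at any moment depends on how many tasks are ready to execute.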

You also want to run as many commands in parallel as you can. For instance,

- lerna run lint --since=main 
- lerna run test --since=main
- lerna run build --since=main

is worse than

- lerna run lint --since=main & lerna run test --since=main & lerna run build --since=main

The latter schedules all three commands at the same time, so if an agent cannot find anything to build, it starts running tests and lint tasks. The result is better agent utilization and shorter CI time.
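One caveat when chaining commands with & in a plain shell step: the shell reports only the exit status of the last command and does not wait for the backgrounded ones. The sketch below shows one way to start several commands at once, wait for all of them, and fail if any of them failed. The function name run_parallel is hypothetical, and whether you need it depends on how your CI provider runs shell steps.

```shell
# run_parallel CMD...
# Hypothetical helper: starts every command in the background, waits for
# each one, and returns non-zero if any command exited with a failure.
run_parallel() {
  pids=""
  for cmd in "$@"; do
    sh -c "$cmd" &
    pids="$pids $!"
  done

  failed=0
  for pid in $pids; do
    # wait PID returns the exit status of that background command.
    wait "$pid" || failed=1
  done
  return "$failed"
}
```

In the main job this could be used as run_parallel "lerna run lint --since=main" "lerna run test --since=main" "lerna run build --since=main", which still schedules all three commands at the same time but makes the step fail when lint or test fails.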

CI/CD Examples

The examples below show how to set up CI with Nx and Nx Cloud using distributed task execution and distributed caching.

Every organization manages its CI/CD pipelines differently, so the examples don't cover org-specific aspects of CI/CD (e.g., deployment). They mainly focus on configuring Nx correctly.

Read the guides for more information on how to configure them in CI.

Note that only cacheable operations can be distributed, because their results have to be replayed on the main job.

Relevant Repositories and Examples