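Adding a new linter to the tree

Linter Requirements

For a linter to be integrated into the mozilla-central tree, it needs to have:

Any required dependencies installed as part of ./mach bootstrap.
A ./mach lint interface.
A passing ./mach lint run (note that linters can be disabled for individual directories).
Taskcluster/Treeherder integration.
In-tree documentation (under docs/code-quality/lint) that gives a basic summary, links and any other useful information.
Unit tests (under tools/lint/test) to make sure that the linter works as expected and does not regress.

The review group in Phabricator is #linter-reviewers.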

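Linter Basics

A linter is a YAML file with a .yml extension. Depending on the type of linter, there may be Python code alongside the definition, pointed to by the 'payload' attribute.

Here is a trivial example:

no-eval.yml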

EvalLinter:
    description: Ensures the string eval doesn't show up.
    extensions: ['js']
    type: string
    payload: eval

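Now no-eval.yml gets passed into LintRoller.read().

Linter Types

There are four types of linters, though more may be added in the future.

string - fails if the substring is found.
regex - fails if the regular expression matches.
external - fails if a Python function returns a non-empty result list.
structured_log - fails if a mozlog logger emits any lint_error or lint_warning log messages.

As the example above shows, string and regex linters are very easy to create, but they should be avoided if possible. It is much better to use a context-aware linter for the language you are trying to lint. For example, use eslint to lint JavaScript files, use ruff to lint Python files, and so on.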
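To make the difference between the two simplest types concrete, here is a minimal sketch of what a string check and a regex check amount to. The helper names string_linter_fails and regex_linter_fails are hypothetical and are not part of mozlint; the sketch only mirrors the behaviour described above and also shows why a check with no knowledge of the language can produce false positives.

```python
import re


def string_linter_fails(text, payload):
    """'string' type: fail if the payload occurs anywhere as a substring."""
    return payload in text


def regex_linter_fails(text, payload, ignore_case=False):
    """'regex' type: fail if the payload pattern matches anywhere in the text."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(payload, text, flags) is not None


real_use = "var x = eval('1 + 1');"
harmless = "var y = evaluate(model);"

# The substring check flags both lines, including the harmless one.
print(string_linter_fails(real_use, "eval"), string_linter_fails(harmless, "eval"))

# A tighter regex avoids that particular false positive, but it still knows
# nothing about comments, strings or scoping, unlike a real JavaScript linter.
print(regex_linter_fails(real_use, r"\beval\s*\("), regex_linter_fails(harmless, r"\beval\s*\("))
```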

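Which brings us to the third and most interesting type of linter: external. External linters call an arbitrary Python function that is responsible not only for running the linter, but also for ensuring the results are structured properly. For example, an external linter could shell out to a third-party linter, collect the output and format it into a list of Issue objects. The signature for this Python function is lint(files, config, **kwargs), where files is a list of files to lint and config is the linter definition from the .yml file.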
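As a rough illustration of that signature, the sketch below implements a hypothetical external linter that flags TODO comments. To stay self-contained it returns plain dictionaries as stand-ins for Issue objects; an in-tree linter would normally build real results through mozlint instead. The "level" lookup on config is likewise only an assumption made for this example.

```python
def lint(files, config, **kwargs):
    """Hypothetical external linter: report every line that contains 'TODO'."""
    results = []
    for path in files:
        with open(path, encoding="utf-8") as fh:
            for lineno, line in enumerate(fh, start=1):
                if "TODO" in line:
                    # Plain dict used here in place of a mozlint Issue object.
                    results.append(
                        {
                            "path": path,
                            "lineno": lineno,
                            "message": "Found a TODO comment",
                            "level": config.get("level", "error"),
                        }
                    )
    # An empty list means the linter passed.
    return results
```

The function does all of the work itself: it decides how to inspect the files and how to shape each result, which is what distinguishes the external type from the string and regex types.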

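Structured log linters are much like external linters, but are suited to cases where the linter code uses mozlog and emits lint_error or lint_warning log messages when the lint fails. This is the recommended approach for writing novel Gecko-specific lints. In this case the signature for lint functions is lint(files, config, logger, **kwargs).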
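The shape of such a function can be sketched as follows. RecordingLogger is a hypothetical stand-in written for this example so that it runs without mozlog; it is not the mozlog API. The keyword names passed to lint_warning (path, lineno, message) are assumptions modelled on the result fields used elsewhere in this document.

```python
class RecordingLogger:
    """Hypothetical stand-in for a mozlog logger; it only records the calls."""

    def __init__(self):
        self.records = []

    def lint_error(self, **data):
        self.records.append(("error", data))

    def lint_warning(self, **data):
        self.records.append(("warning", data))


def lint(files, config, logger, **kwargs):
    """Hypothetical structured_log linter: warn about tab characters."""
    for path in files:
        with open(path, encoding="utf-8") as fh:
            for lineno, line in enumerate(fh, start=1):
                if "\t" in line:
                    # Nothing is returned; failures are reported via the logger.
                    logger.lint_warning(
                        path=path, lineno=lineno, message="Tab character found"
                    )
```

Unlike the external type, the function returns nothing: the run fails purely because of the messages emitted on the logger.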

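Linter Definition

Each .yml file must have at least one linter defined in it. Here are the supported keys:

description - A brief description of the linter's purpose (required).
type - One of 'string', 'regex', 'external' or 'structured_log' (required).
payload - The actual linting logic, depends on the type (required).
include - A list of file paths that will be considered (optional).
exclude - A list of file paths or glob patterns that must not be matched (optional).
extensions - A list of file extensions to be considered (optional).
exclude_extensions - A list of file extensions to be excluded (optional).
setup - A function that sets up external dependencies (optional).
support-files - A list of glob patterns matching configuration files related to the running of the linter itself (optional).
find-dotfiles - If set to true, run on dot files (.*) (optional).
ignore-case - If set to true and type is regex, ignore case (optional).

Note that a linter may not specify both extensions and exclude_extensions.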
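The constraints above can be sketched as a small validation helper. This is only an illustration: validate_definition, REQUIRED_KEYS and KNOWN_TYPES are hypothetical names, not part of mozlint, and mozlint's own parser may check more than this or report problems differently.

```python
REQUIRED_KEYS = ("description", "type", "payload")
KNOWN_TYPES = ("string", "regex", "external", "structured_log")


def validate_definition(name, definition):
    """Return a list of human-readable problems found in one linter definition."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in definition:
            problems.append(f"{name}: missing required key '{key}'")
    if "type" in definition and definition["type"] not in KNOWN_TYPES:
        problems.append(f"{name}: unknown type '{definition['type']}'")
    if "extensions" in definition and "exclude_extensions" in definition:
        problems.append(
            f"{name}: 'extensions' and 'exclude_extensions' are mutually exclusive"
        )
    return problems


# The definition from no-eval.yml, written out as the dict a YAML parser would yield.
eval_linter = {
    "description": "Ensures the string eval doesn't show up.",
    "extensions": ["js"],
    "type": "string",
    "payload": "eval",
}
print(validate_definition("EvalLinter", eval_linter))
```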

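In addition to the above, some .yml files correspond to a single lint rule. For these, the following additional keys may be specified:

message - A string to print on infraction (optional).
hint - A string with a clue on how to fix the infraction (optional).
rule - An ID string for the lint rule (optional).
level - The severity of the infraction, either 'error' or 'warning' (optional).

For structured_log lints, the following additional key applies:

logger - A StructuredLog object to use for logging. If not supplied, one will be created (optional).

Example

Here is an example of an external linter that shells out to the Python ruff linter. Let's call the file ruff_lint.py (in-tree version).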

import json
import os
import subprocess
from shutil import which

from mozlint import result


RUFF_NOT_FOUND = """
Could not find ruff! Install ruff and try again.
""".strip()


def lint(paths, config, **lintargs):
    binary = which('ruff')
    if not binary:
        print(RUFF_NOT_FOUND)
        return 1

    cmd = ["ruff", "check", "--force-exclude", "--format=json"] + paths
    # CompletedProcess exposes the captured output as .stdout
    output = subprocess.run(cmd, stdout=subprocess.PIPE, env=os.environ).stdout

    # all passed
    if not output:
        return []

    try:
        issues = json.loads(output)
    except json.JSONDecodeError:
        # without parsed issues there is nothing to convert, so bail out
        print(f"Could not parse output: {output}")
        return 1

    results = []
    for issue in issues:
        # convert ruff's format to mozlint's format
        res = {
            "path": issue["filename"],
            "lineno": issue["location"]["row"],
            "column": issue["location"]["column"],
            "lineoffset": issue["end_location"]["row"] - issue["location"]["row"],
            "message": issue["message"],
            "rule": issue["code"],
            "level": "error",
        }

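        # "fix" may be absent or null when ruff has no suggestion
        if issue.get("fix"):
            res["hint"] = issue["fix"]["message"]

        results.append(result.from_config(config, **res))

    # this example never modifies files, so nothing is counted as fixed
    return {"results": results, "fixed": 0}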

Now here is the linter definition that would call it:

ruff:
    description: Python Linter
    include: ["."]
    extensions: ["py"]
    support-files:
        - "**/.ruff.toml"
        - "**/ruff.toml"
        - "**/pyproject.toml"
    type: external
    payload: py.ruff:lint

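Notice that the payload has two parts, delimited by ':'. The first is the module path, which mozlint will attempt to import. The second is the object path within that module (for example, the name of a function to call). It is up to consumers of mozlint to ensure the module is in sys.path. Structured log linters use the same import mechanism.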
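The "module:object" convention can be illustrated with the standard library alone. The sketch below is not mozlint's implementation; resolve_payload is a hypothetical helper that shows one way such a string can be turned into a callable, using modules that are always importable.

```python
import importlib


def resolve_payload(payload):
    """Resolve a 'module.path:object.path' string to the object it names."""
    module_path, sep, object_path = payload.partition(":")
    if not sep or not object_path:
        raise ValueError(f"payload must look like 'module:object', got {payload!r}")
    # The module has to be importable, i.e. reachable through sys.path.
    obj = importlib.import_module(module_path)
    # Walk the dotted object path one attribute at a time.
    for attr in object_path.split("."):
        obj = getattr(obj, attr)
    return obj


loads = resolve_payload("json:loads")
print(loads("[1, 2, 3]"))
```

If the module cannot be found on sys.path, importlib.import_module raises ImportError, which is why the caller has to make the module importable first.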

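The support-files key is used to list configuration files or other files related to the running of the linter itself. If --outgoing or --workdir is used and one of these files was modified, the entire tree will be linted instead of just the modified files.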
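The effect of support-files on path selection can be sketched as below. Both helpers are hypothetical and use fnmatch from the standard library as an approximation; mozlint's real pattern matching may behave differently, in particular for "**". The special case for a leading "**/" is an assumption made so that a file at the repository root also matches.

```python
from fnmatch import fnmatchcase


def matches_support_file(path, patterns):
    """Return True if a repository-relative path matches any support-files pattern."""
    for pattern in patterns:
        candidates = [pattern]
        # Assumption: "**/name" should also match "name" at the repository root.
        if pattern.startswith("**/"):
            candidates.append(pattern[3:])
        if any(fnmatchcase(path, candidate) for candidate in candidates):
            return True
    return False


def paths_to_lint(changed_files, support_patterns, root="."):
    """Lint the whole tree if a support file changed, otherwise only the changes."""
    if any(matches_support_file(path, support_patterns) for path in changed_files):
        return [root]
    return list(changed_files)


patterns = ["**/.ruff.toml", "**/ruff.toml", "**/pyproject.toml"]
print(paths_to_lint(["tools/lint/python/ruff.py"], patterns))
print(paths_to_lint(["tools/lint/python/ruff.py", "pyproject.toml"], patterns))
```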

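Result definition

When generating the list of results, the following values are available.

Name	Description	Optional
linter	Name of the linter that flagged this error.
path	Path to the file containing the error.
message	Text describing the error.
lineno	Line number that contains the error.
column	Column containing the error.
level	Severity of the error, either 'warning' or 'error' (default 'error').	Yes
hint	Suggestion for fixing the error.	Yes
source	Source code context of the error.	Yes
rule	Name of the rule that was violated.	Yes
lineoffset	Denotes that the error spans multiple lines, in the form (<lineno offset>, <num lines>).	Yes
diff	A diff describing the changes that need to be made to the code.	Yes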
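For readers who prefer code to tables, the fields can be written down as a small dataclass. IssueSketch is a hypothetical name used only here; it is not the class mozlint uses. Fields that the table marks as optional get a default value, and the default for level follows the table.

```python
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass
class IssueSketch:
    """Illustrative mirror of the result fields; not mozlint's own class."""

    # Fields the table does not mark as optional.
    linter: str
    path: str
    message: str
    lineno: int
    column: int
    # Fields the table marks as optional.
    level: str = "error"
    hint: Optional[str] = None
    source: Optional[str] = None
    rule: Optional[str] = None
    lineoffset: Optional[Tuple[int, int]] = None
    diff: Optional[str] = None


issue = IssueSketch(
    linter="ruff",
    path="tools/example.py",
    message="unused import",
    lineno=3,
    column=1,
    rule="F401",
)
print(asdict(issue))
```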

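Automated testing

Every new checker must have associated tests. If your linter is mylinter, then the test file should be named tools/lint/test/test_mylinter.py and any example files should be named like tools/lint/test/files/mylinter/my-example-file. Make sure that your test has been added as a section ["test_mylinter.py"] in the manifest tools/lint/test/python.toml.

They should be fairly easy to write, as most of the work is managed by the Mozlint framework. The key declaration is the LINTER variable, which must match the linter declaration.

As an example, the ruff test looks like the following snippet: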

import mozunit
LINTER = 'ruff'

def test_lint_ruff(lint, paths):
    results = lint(paths('bad.py'))
    assert len(results) == 2
    assert results[0].rule == 'F401'
    assert results[1].rule == 'E501'
    assert results[1].lineno == 5

if __name__ == '__main__':
    mozunit.main()

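As always with tests, please make sure that enough positive and negative cases are covered.

Run the tests:

$ ./mach python-test --subsuite mozlint

Run a specific test:

$ ./mach python-test --subsuite mozlint tools/lint/test/test_black.py

More tests can be found in-tree.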

Tracking fixed issues

All linters that provide fix support return a dict instead of a list.

{"results": results, "fixed": fixed}

results - All the lint errors it was not able to fix.
fixed - The number of fixed errors (for fix=False this is 0).
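A minimal sketch of a linter that follows this convention is shown below. It is a hypothetical trailing-whitespace linter written for illustration, not an in-tree linter, and it returns plain dictionaries where a real linter would return mozlint results. With fix=False it only reports; with fix=True it rewrites the file and counts each corrected line.

```python
def lint(paths, config, fix=False, **lintargs):
    """Hypothetical linter: report or strip trailing whitespace."""
    results = []
    fixed = 0
    for path in paths:
        # newline="" keeps the original line endings untouched.
        with open(path, encoding="utf-8", newline="") as fh:
            lines = fh.read().splitlines(keepends=True)

        new_lines = []
        for lineno, line in enumerate(lines, start=1):
            body = line.rstrip("\r\n")
            stripped = body.rstrip(" \t")
            if stripped == body:
                new_lines.append(line)
            elif fix:
                # Keep the line ending, drop only the trailing blanks.
                new_lines.append(stripped + line[len(body):])
                fixed += 1
            else:
                new_lines.append(line)
                results.append(
                    {"path": path, "lineno": lineno, "message": "Trailing whitespace"}
                )

        if fix and new_lines != lines:
            with open(path, "w", encoding="utf-8", newline="") as fh:
                fh.writelines(new_lines)

    return {"results": results, "fixed": fixed}
```

Because every problem this linter finds is also fixable, results is empty whenever fix=True; a linter that can only fix some of its findings would return the remainder in results.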

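Some linters (for example, codespell) might require two passes to count the number of fixed issues. Others might just need some tuning.

To add tests that check the fixed count, add a global variable fixed = 0 and write a test function as described in the Automated testing section.

Here is an example: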

fixed = 0


def test_lint_codespell_fix(lint, create_temp_file):
    # Typo has been fixed in the contents to avoid triggering warning
    # 'informations' ----> 'information'
    contents = """This is a file with some typos and information.
But also testing false positive like optin (because this isn't always option)
or stuff related to our coding style like:
aparent (aParent).
but detects mistakes like mozilla
""".lstrip()

    path = create_temp_file(contents, "ignore.rst")
    lint([path], fix=True)

    assert fixed == 2

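Bootstrapping Dependencies

Many linters, especially third-party ones, require a set of dependencies. This could be as simple as installing a binary from a package manager, or as complicated as pulling in an entire graph of tools, plugins and their dependencies.

Either way, to reduce the burden on users, linters should strive to bootstrap all of their dependencies automatically. To help with this, mozlint allows linters to define a setup config, which has the same path object format as an external payload. For example (in-tree version):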

ruff:
    description: Python linter
    include: ['.']
    extensions: ['py']
    type: external
    payload: py.ruff:lint
    setup: py.ruff:setup

The setup function takes a single positional argument, the root of the repository being linted. In the case of ruff, it might look like this:

import subprocess
from shutil import which

def setup(root, **lintargs):
    # This is a simple example. Please look at the actual source for better examples.
    if not which("ruff"):
        subprocess.call(["pip", "install", "ruff"])

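The setup function will be called implicitly before running the linter. This means it should return quickly and produce no output if there is no setup to be performed.

The setup function can also be called explicitly by running mach lint --setup. This will only perform setup and not perform any linting. It is mainly useful for other tools, such as mach bootstrap, to call into.

Adding the linter to the CI

First, the job has to be declared in Taskcluster.

This should be done in the mozlint Taskcluster configuration. You will need to define a symbol, how the job is executed and on what kind of changes it runs.

For example, for ruff, the configuration is the following: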

py-ruff:
    description: run ruff over the gecko codebase
    treeherder:
        symbol: py(ruff)
    run:
        mach: lint -l ruff -f treeherder -f json:/builds/worker/mozlint.json .
    when:
        files-changed:
            - '**/*.py'
            - '**/.ruff.toml'

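If the linter requires an external program, you will have to install it in the setup script and possibly install the necessary files in the Docker configuration.

Note

If the defect found by the linter is minor, make sure that it is logged as a warning by setting {"level": "warning"} in the Issue. This means the defect will not cause a backout if landed, but will still be surfaced by reviewbot at review time, or when using -W/--warnings locally.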