The DomCrawler Component

The DomCrawler component eases DOM navigation for HTML and XML documents.

Note

While possible, the DomCrawler component is not designed for manipulating the DOM or re-dumping HTML/XML.

Installation

    $ composer require symfony/dom-crawler
    

Note

If you install this component outside of a Symfony application, you must require the vendor/autoload.php file in your code to enable the class autoloading mechanism provided by Composer.
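
A minimal sketch of that bootstrap step, assuming the script sits in the project root next to the vendor/ directory that Composer creates. The HTML string is made up for illustration.

```php
<?php
// Assumption: this file lives beside Composer's vendor/ directory.
require_once __DIR__.'/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// With the autoloader registered, component classes load on first use.
$crawler = new Crawler('<html><body><p>Hello DomCrawler</p></body></html>');

// filterXPath() works without extra packages.
$text = $crawler->filterXPath('//p')->text();
echo $text, PHP_EOL;
```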

Forms

Special treatment is also given to forms. A selectButton() method is available on the Crawler. It returns another Crawler that matches <button>, <input type="submit">, or <input type="button"> elements (or an <img> element inside them). The string given as argument is looked for in the id, alt, name, and value attributes and in the text content of those elements.

This method is especially useful because you can use it to return a Form object that represents the form the button lives in:

    // button example: <button id="my-super-button" type="submit">My super button</button>

    // you can get button by its label
    $form = $crawler->selectButton('My super button')->form();

    // or by button id (#my-super-button) if the button doesn't have a label
    $form = $crawler->selectButton('my-super-button')->form();

    // or you can filter the whole form, for example a form has a class attribute:
    // <form class="form-vertical" method="POST">
    $crawler->filter('.form-vertical')->form();

    // or "fill" the form fields with data
    $form = $crawler->selectButton('my-super-button')->form([
        'name' => 'Ryan',
    ]);
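
To try these calls without a running site, a Crawler can be built from an HTML string. The following is a sketch under assumptions: the markup, the https://example.com/signup URI, and the vendor/autoload.php location are all invented for illustration. A URI is passed to the Crawler because the form's relative action needs a base to resolve against.

```php
<?php
require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical page markup containing one form and one submit button.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<body>
<form action="/register" method="post" class="form-vertical">
    <input type="text" name="name">
    <button id="my-super-button" type="submit">My super button</button>
</form>
</body>
</html>
HTML;

// The second argument is the (made-up) URI the document was loaded from.
$crawler = new Crawler($html, 'https://example.com/signup');

// Locate the button by its label, then build the Form and pre-fill a field.
$form = $crawler->selectButton('My super button')->form(['name' => 'Ryan']);

$method = $form->getMethod();
$uri = $form->getUri();
$values = $form->getValues();

echo $method, ' ', $uri, PHP_EOL;
print_r($values);
```

The button has no name attribute, so it does not show up among the form values; only the name field does.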
    

The Form object has lots of very useful methods for working with forms:

    $uri = $form->getUri();
    $method = $form->getMethod();
    $name = $form->getName();
    

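The getUri() method does more than return the action attribute of the form. If the form method is GET, it mimics the browser's behavior and returns the action attribute followed by a query string of all of the form's values.

Note

The optional formaction and formmethod button attributes are supported. The getUri() and getMethod() methods take those attributes into account, so they always return the right action and method for the button that was used to get the form.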
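
Both behaviors can be observed on a small, invented form with two buttons. This is a sketch only: the markup, the button ids, and the example.com base URI are assumptions made for the illustration.

```php
<?php
require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical GET form; the second button overrides action and method.
$html = <<<'HTML'
<form action="/search" method="get">
    <input type="text" name="q" value="symfony">
    <button type="submit" id="go">Go</button>
    <button type="submit" id="export" formaction="/export" formmethod="post">Export</button>
</form>
HTML;

$crawler = new Crawler($html, 'https://example.com/');

// Form obtained through the plain button: GET, values appended as a query string.
$search = $crawler->selectButton('go')->form();
$searchMethod = $search->getMethod();
$searchUri = $search->getUri();

// Form obtained through the button carrying formaction/formmethod.
$export = $crawler->selectButton('export')->form();
$exportMethod = $export->getMethod();
$exportUri = $export->getUri();

echo $searchMethod, ' ', $searchUri, PHP_EOL;
echo $exportMethod, ' ', $exportUri, PHP_EOL;
```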

You can virtually set and get values on the form:

    // sets values on the form internally
    $form->setValues([
        'registration[username]' => 'symfonyfan',
        'registration[terms]'    => 1,
    ]);

    // gets back an array of values - in the "flat" array like above
    $values = $form->getValues();

    // returns the values like PHP would see them,
    // where "registration" is its own array
    $values = $form->getPhpValues();
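
The difference between the flat and the PHP-style result is easier to see on a concrete form. The sketch below assumes an invented registration form and base URI; only the Crawler and Form calls come from the component.

```php
<?php
require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical form whose field names use PHP's bracket notation.
$html = <<<'HTML'
<form action="/register" method="post">
    <input type="text" name="registration[username]">
    <input type="checkbox" name="registration[terms]" value="1">
    <button type="submit">Register</button>
</form>
HTML;

$form = (new Crawler($html, 'https://example.com/'))
    ->selectButton('Register')
    ->form();

$form->setValues([
    'registration[username]' => 'symfonyfan',
    'registration[terms]'    => 1,
]);

// Flat: keys are the raw field names, brackets included.
$flat = $form->getValues();

// Nested: keys are expanded the way PHP would populate $_POST.
$nested = $form->getPhpValues();

print_r($flat);
print_r($nested);
```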
    

To work with multi-dimensional fields:

    <form>
        <input name="multi[]">
        <input name="multi[]">
        <input name="multi[dimensional]">
        <input name="multi[dimensional][]" value="1">
        <input name="multi[dimensional][]" value="2">
        <input name="multi[dimensional][]" value="3">
    </form>

Pass an array of values:

    // sets a single field
    $form->setValues(['multi' => ['value']]);

    // sets multiple fields at once
    $form->setValues(['multi' => [
        1             => 'value',
        'dimensional' => 'another value',
    ]]);

    // tick multiple checkboxes at once
    $form->setValues(['multi' => [
        'dimensional' => [1, 3] // it uses the input value to determine which checkbox to tick
    ]]);
    

This is great, but it gets better! The Form object allows you to interact with your form like a browser, selecting radio values, ticking checkboxes, and uploading files:

    $form['registration[username]']->setValue('symfonyfan');

    // checks or unchecks a checkbox
    $form['registration[terms]']->tick();
    $form['registration[terms]']->untick();

    // selects an option
    $form['registration[birthday][year]']->select(1984);

    // selects many options from a "multiple" select
    $form['registration[interests]']->select(['symfony', 'cookies']);

    // fakes a file upload
    $form['registration[photo]']->upload('/path/to/lucas.jpg');
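
A self-contained sketch of the field-level calls, run against an invented form. The markup and base URI are assumptions; the file upload is left out because it would need a real file on disk.

```php
<?php
require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical form with a text input, a checkbox and a select.
$html = <<<'HTML'
<form action="/register" method="post">
    <input type="text" name="registration[username]">
    <input type="checkbox" name="registration[terms]" value="1">
    <select name="registration[birthday][year]">
        <option value="1983">1983</option>
        <option value="1984">1984</option>
    </select>
    <button type="submit">Register</button>
</form>
HTML;

$form = (new Crawler($html, 'https://example.com/'))
    ->selectButton('Register')
    ->form();

// Array access on the Form returns the field object for that name.
$form['registration[username]']->setValue('symfonyfan');
$form['registration[terms]']->tick();
$form['registration[birthday][year]']->select('1984');

$php = $form->getPhpValues();
print_r($php);
```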
    

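Using the Form Data

What's the point of doing all of this? If you're testing internally, you can grab the information off of your form as if it had just been submitted, by using the PHP values:

    $values = $form->getPhpValues();
    $files = $form->getPhpFiles();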
    

If you're using an external HTTP client, you can use the form to grab all of the information you need to create a POST request for the form:

    $uri = $form->getUri();
    $method = $form->getMethod();
    $values = $form->getValues();
    $files = $form->getFiles();

    // now use some HTTP client and post using this information
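
One way to package that information for an arbitrary client is sketched below. formToRequest() is a hypothetical helper, not part of the component, and the form markup and URI are invented. File uploads are deliberately ignored because they would need a multipart body.

```php
<?php
require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

use Symfony\Component\DomCrawler\Crawler;
use Symfony\Component\DomCrawler\Form;

/**
 * Hypothetical helper: collects what a generic HTTP client would need.
 * Files are not handled here.
 */
function formToRequest(Form $form): array
{
    $method = $form->getMethod();
    $hasBody = in_array($method, ['POST', 'PUT', 'DELETE', 'PATCH'], true);

    return [
        'method' => $method,
        // For GET forms, getUri() already carries the values as a query string.
        'uri' => $form->getUri(),
        'headers' => $hasBody ? ['Content-Type' => 'application/x-www-form-urlencoded'] : [],
        // getPhpValues() nests bracketed names; http_build_query() re-encodes them.
        'body' => $hasBody ? http_build_query($form->getPhpValues(), '', '&') : '',
    ];
}

// Invented login form used to exercise the helper.
$html = <<<'HTML'
<form action="/login" method="post">
    <input type="text" name="user[name]" value="fan">
    <input type="password" name="user[pass]" value="s3cret">
    <button type="submit">Sign in</button>
</form>
HTML;

$form = (new Crawler($html, 'https://example.com/'))->selectButton('Sign in')->form();
$request = formToRequest($form);
print_r($request);
```

The returned array can then be handed to whichever client is in use. Its keys are this sketch's own convention, not a format any particular client expects.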
    

One great example of an integrated system that uses all of this is the HttpBrowser provided by the BrowserKit component. It understands the Symfony Crawler object and can use it to submit forms directly:

    use Symfony\Component\BrowserKit\HttpBrowser;
    use Symfony\Component\HttpClient\HttpClient;

    // makes a real request to an external site
    $browser = new HttpBrowser(HttpClient::create());
    $crawler = $browser->request('GET', 'https://github.com/login');

    // select the form and fill in some values
    $form = $crawler->selectButton('Sign in')->form();
    $form['login'] = 'symfonyfan';
    $form['password'] = 'anypass';

    // submits the given form
    $crawler = $browser->submit($form);
    

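Selecting Invalid Choice Values

By default, choice fields (select, radio) have internal validation activated to prevent you from setting invalid values. If you want to be able to set invalid values, you can use the disableValidation() method on either the whole form or specific field(s):

    // disables validation for a specific field
    $form['country']->disableValidation()->select('Invalid value');

    // disables validation for the whole form
    $form->disableValidation();
    $form['country']->select('Invalid value');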
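
The effect of the validation can be sketched on an invented form: selecting an unknown option first fails with an exception, then succeeds once validation is disabled on the field. The markup, option values, and URI are assumptions.

```php
<?php
require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

use Symfony\Component\DomCrawler\Crawler;

// Hypothetical form with a two-option select.
$html = <<<'HTML'
<form action="/profile" method="post">
    <select name="country">
        <option value="fr">France</option>
        <option value="de">Germany</option>
    </select>
    <button type="submit">Save</button>
</form>
HTML;

$form = (new Crawler($html, 'https://example.com/'))->selectButton('Save')->form();

// With validation on, a value that is not among the options is rejected.
$rejected = false;
try {
    $form['country']->select('xx');
} catch (\InvalidArgumentException $e) {
    $rejected = true;
    echo 'Rejected: ', $e->getMessage(), PHP_EOL;
}

// With validation off for this field, the same value is accepted.
$form['country']->disableValidation()->select('xx');
$values = $form->getValues();
print_r($values);
```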
    

Resolving a URI

The UriResolver class takes a URI (relative, absolute, fragment, etc.) and turns it into an absolute URI against another given base URI:

    use Symfony\Component\DomCrawler\UriResolver;

    UriResolver::resolve('/foo', 'https://localhost/bar/foo/'); // https://localhost/foo
    UriResolver::resolve('?a=b', 'https://localhost/bar#foo'); // https://localhost/bar?a=b
    UriResolver::resolve('../../', 'https://localhost/'); // https://localhost/
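
A typical use is turning href values collected from a page into absolute URLs. The sketch below assumes a made-up page URI and list of hrefs; only UriResolver::resolve() comes from the component.

```php
<?php
require_once __DIR__.'/vendor/autoload.php'; // assumed Composer autoloader path

use Symfony\Component\DomCrawler\UriResolver;

// Hypothetical page URL and href values, as they might be read from <a> elements.
$pageUri = 'https://example.com/docs/guide/intro';
$hrefs = ['install', '../img/logo.png', '/about', '#top', 'https://other.example/x'];

$resolved = [];
foreach ($hrefs as $href) {
    // Relative paths, absolute paths, fragments and full URLs are all accepted.
    $resolved[$href] = UriResolver::resolve($href, $pageUri);
}

print_r($resolved);
```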
    

Using a HTML5 Parser

If you need the Crawler to use an HTML5 parser, set its useHtml5Parser constructor argument to true:

    use Symfony\Component\DomCrawler\Crawler;

    $crawler = new Crawler(null, $uri, useHtml5Parser: true);

By doing so, the crawler will use the HTML5 parser provided by the masterminds/html5 library to parse the documents.

This work, including the code samples, is licensed under a Creative Commons BY-SA 3.0 license.
