Ansible: Handling Failed Tasks
Date: 2023-07-09

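I. Handling Task Failure in Ansible

1. Managing task errors in a play

1️⃣: Ansible evaluates the return code of each task to determine whether the task succeeded or failed.

2️⃣: Normally, when a task fails, Ansible immediately aborts the rest of the play on that host and skips all subsequent tasks. Sometimes, however, you may want the play to continue even when a task fails.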

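2. Ignoring task failure

1️⃣: By default, a play aborts when a task fails. You can override this behavior by ignoring the failed task, using the ignore_errors keyword on that task.

  • Demonstration: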

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      gather_facts: no
      tasks:
        - name: install httpd
          yum:
            name: packages          # no such package exists
            state: present
          ignore_errors: yes        # accepts yes or no

        - name: shoe some massage
          debug:
            msg: "hello word"

    # Run the play
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [install httpd] ******************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/libexec/platform-python"}, "changed": false, "failures": ["No package packages available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}
    ...ignoring    # the failure of this task has been ignored

    TASK [shoe some massage] **************************************************************************************************************************************************
    ok: [client.example.com] => {
    "msg": "hello word"
    }

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=1   
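
Ignoring a failure does not discard it: the result can still be captured with register and tested later. The following is a minimal sketch, not part of the original demonstration. The variable name install_result and the second task are assumptions chosen for illustration; the failed test is used to react to the ignored failure.

```yaml
- hosts: all
  gather_facts: no
  tasks:
    - name: install httpd
      yum:
        name: packages            # no such package exists, so this task fails
        state: present
      ignore_errors: yes          # keep going despite the failure
      register: install_result    # hypothetical variable name

    - name: report the ignored failure
      debug:
        msg: "install failed: {{ install_result.msg }}"
      when: install_result is failed   # runs only if the task above failed
```

With this pattern the play continues, but the failure remains visible and later tasks can branch on it.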

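3. Forcing handlers to run even when a task fails

1️⃣: If force_handlers: yes is set on a play, handlers that have already been notified are still called even when the play aborts because a later task fails.

  • Demonstration: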

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      force_handlers: yes           # accepts yes or no
      tasks:
        - name: install httpd
          # This command always succeeds and reports changed,
          # so the handler is guaranteed to be notified
          shell: ls
          notify:
            - massage

        - name: install httpd
          yum:
            name: packages          # no such package exists, so this task is sure to fail
            state: present

      handlers:
        - name: massage
          debug:
            msg: "hello word"

    # Run the play
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [Gathering Facts] ****************************************************************************************************************************************************
    ok: [client.example.com]

    TASK [install httpd] ******************************************************************************************************************************************************
    changed: [client.example.com]

    TASK [install httpd] *****************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"changed": false, "failures": ["No package packages available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}

    RUNNING HANDLER [massage] *************************************************************************************************************************************************
    ok: [client.example.com] => {
    "msg": "hello word"
    }

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=3 changed=1 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0

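2️⃣: Handlers are notified when a task reports a changed result; they are not notified when a task reports ok or failed.

4. Specifying failure conditions for a task

1️⃣: Use the failed_when keyword on a task to specify the conditions that indicate the task has failed. It is typically paired with command modules, which may run a command successfully even though the command's output indicates a failure.

  • Demonstration 1: using the failed_when keyword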

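    # View the script being used
    [root@localhost project]# cat files/test.sh
    #!/bin/bash
    cat /root               # this command is sure to fail
    echo "hello word"
    # Note: a script run from a playbook is judged by its exit status, which is that of
    # its last command; a failing command in the middle does not make the task fail

    # View the playbook and run it once to see whether it succeeds
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      tasks:
        - name: test
          script: files/test.sh
    [root@localhost project]# ansible-playbook playbook.yaml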

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [Gathering Facts] ****************************************************************************************************************************************************
    ok: [client.example.com]

    TASK [test] ***************************************************************************************************************************************************************
    changed: [client.example.com]

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
    # From this output there is no way to tell whether every command in the script succeeded

    # Add a failure condition to the task
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      tasks:
        - name: test
          script: files/test.sh
          register: result
          failed_when: "'Is a directory' in result['stdout']"
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [Gathering Facts] ****************************************************************************************************************************************************
    ok: [client.example.com]

    TASK [test] ***************************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"changed": true, "failed_when_result": true, "rc": 0, "stderr": "Shared connection to client.example.com closed.\r\n", "stderr_lines": ["Shared connection to client.example.com closed."], "stdout": "cat: /root: Is a directory\r\nhello word\r\n", "stdout_lines": ["cat: /root: Is a directory", "hello word"]}

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=1 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
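
When failed_when is set, its result replaces the module's own verdict, so a condition that only inspects stdout no longer catches a non-zero return code. The following is a hedged sketch of one way to keep both checks, reusing the script and registered variable from the demonstration above. The combined expression is an illustration, not the original author's playbook.

```yaml
- hosts: all
  tasks:
    - name: test
      script: files/test.sh
      register: result
      # Fail on a non-zero exit status OR on the error text in the captured output
      failed_when: "result['rc'] != 0 or 'Is a directory' in result['stdout']"
```

Writing the conditions as one expression joined with or keeps the intent explicit.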

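2️⃣: The fail module can also be used to force a task failure. Its main use is to replace cluttered error output with a message of your own, so that the failure is reported simply and clearly.

  • Demonstration 2: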

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      tasks:
        - name: test
          script: files/test.sh
          register: result

        - fail:
            msg: "There have a failed"
          when: "'Is a directory' in result['stdout']"

    # Run the play
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [Gathering Facts] ****************************************************************************************************************************************************
    ok: [client.example.com]

    TASK [test] ***************************************************************************************************************************************************************
    changed: [client.example.com]

    TASK [fail] ***************************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"changed": false, "msg": "There have a failed"}

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=2 changed=1 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0

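5. Specifying when a task reports a "changed" result

1️⃣: When a task makes a change on a managed host, it reports the changed state and notifies its handlers; if the task does not need to make a change, it reports ok and does not notify handlers.

2️⃣: The changed_when keyword controls when a task reports that it has made a change.

  • Demonstration 1: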

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      tasks:
        - name: test
          shell: echo "hello word"

    # Running it shows that the task reports changed every time
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [Gathering Facts] ****************************************************************************************************************************************************
    ok: [client.example.com]

    TASK [test] ***************************************************************************************************************************************************************
    changed: [client.example.com]

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

    # Add the changed_when keyword so that the task reports ok
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      tasks:
        - name: test
          shell: echo "hello word"
          changed_when: false       # accepts true or false
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [Gathering Facts] ****************************************************************************************************************************************************
    ok: [client.example.com]

    TASK [test] ***************************************************************************************************************************************************************
    ok: [client.example.com]

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=2 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0  

3️⃣: The changed state can also be reported based on module output collected in a registered variable.

  • Demonstration 2:

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      tasks:
        - name: test
          command: echo "hello word"
          register: result
          changed_when: "'hello word' in result['stdout']"

    # Run the play
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [Gathering Facts] ****************************************************************************************************************************************************
    ok: [client.example.com]

    TASK [test] ***************************************************************************************************************************************************************
    changed: [client.example.com]

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
    # Because result['stdout'] contains hello word, the condition evaluates to true, so the task reports changed
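
Because handlers are notified only on a changed result, changed_when also decides whether a handler fires. The following is a minimal sketch of that interaction; the handler name report change and the task names are hypothetical, chosen only for illustration.

```yaml
- hosts: all
  gather_facts: no
  tasks:
    - name: never reports changed, so the handler is not notified
      command: echo "hello word"
      changed_when: false
      notify: report change

    - name: reports changed when the output contains the expected text
      command: echo "hello word"
      register: result
      changed_when: "'hello word' in result['stdout']"
      notify: report change

  handlers:
    - name: report change         # hypothetical handler name
      debug:
        msg: "a task reported changed"
```

In this sketch only the second task notifies the handler, which then runs once at the end of the play.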

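6. Ansible blocks and error handling

1️⃣: In a playbook, a block is a clause that logically groups tasks and can be used to control how those tasks are executed.

2️⃣: Blocks can also be combined with rescue and always statements to handle errors. If any task in the block fails, the tasks in its rescue clause are run to recover.

3️⃣: After the tasks in the block clause have run, and the tasks in the rescue clause if there was a failure, the tasks in the always clause run.

4️⃣: Summary:

  • block: defines the main tasks to run
  • rescue: defines the tasks to run when a task defined in the block clause fails
  • always: defines tasks that always run, regardless of whether the tasks defined in the block and rescue clauses succeed or fail

5️⃣: Demonstrations:

  • Demonstration 1: with only block and rescue, when the block tasks succeed, only the block runs and rescue does not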

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      gather_facts: no
      tasks:
        - name: test
          block:
            - name: block
              shell: echo "hello word"
          rescue:
            - name: rescue
              shell: ls /root

    # Check the syntax, then run the play
    [root@localhost project]# ansible-playbook --syntax-check playbook.yaml

    playbook: playbook.yaml
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [block] **************************************************************************************************************************************************************
    changed: [client.example.com]

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
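    # As shown, only the block task ran; the rescue task did not

  • Demonstration 2: with only block and rescue, when a task in the block fails, the rest of the block is skipped and the rescue tasks run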

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      gather_facts: no
      tasks:
        - name: test
          block:
            - name: block
              command: cat /        # this command is sure to fail
          rescue:
            - name: rescue
              shell: ls /root

    # Run the play
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [block] **************************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/libexec/platform-python"}, "changed": true, "cmd": ["cat", "/"], "delta": "0:00:00.005350", "end": "2020-09-08 10:59:18.381699", "msg": "non-zero return code", "rc": 1, "start": "2020-09-08 10:59:18.376349", "stderr": "cat: /: Is a directory", "stderr_lines": ["cat: /: Is a directory"], "stdout": "", "stdout_lines": []}

    TASK [rescue] *************************************************************************************************************************************************************
    changed: [client.example.com]

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=1 changed=1 unreachable=0 failed=0 skipped=0 rescued=1 ignored=0
    # As shown, the block task failed and the rescue task ran

  • Demonstration 3: when block, rescue, and always are all present, the always tasks run whether or not the block fails

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      gather_facts: no
      tasks:
        - name: test
          block:
            - name: block
              command: cat /
          rescue:
            - name: rescue
              shell: ls /root
          always:
            - name: always
              debug:
                msg: "This is my test"

    # Run the play
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [block] **************************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/libexec/platform-python"}, "changed": true, "cmd": ["cat", "/"], "delta": "0:00:00.008993", "end": "2020-09-08 11:05:47.816489", "msg": "non-zero return code", "rc": 1, "start": "2020-09-08 11:05:47.807496", "stderr": "cat: /: Is a directory", "stderr_lines": ["cat: /: Is a directory"], "stdout": "", "stdout_lines": []}

    TASK [rescue] *************************************************************************************************************************************************************
    changed: [client.example.com]

    TASK [always] *************************************************************************************************************************************************************
    ok: [client.example.com] => {
    "msg": "This is my test"
    }

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=1 ignored=0
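
Inside a rescue clause, Ansible exposes the variables ansible_failed_task and ansible_failed_result, which describe the block task that failed. The following is a minimal sketch, assuming the same failing cat / command as in the demonstrations above. The wording of the debug message is illustrative only.

```yaml
- hosts: all
  gather_facts: no
  tasks:
    - name: test
      block:
        - name: block
          command: cat /          # fails because / is a directory
      rescue:
        - name: rescue
          debug:
            # Report which task failed and the message from its result
            msg: "Task '{{ ansible_failed_task.name }}' failed: {{ ansible_failed_result.msg }}"
      always:
        - name: always
          debug:
            msg: "This is my test"
```

This keeps the recovery step informative without parsing the raw fatal output.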

6️⃣: A when condition on a block also applies to its rescue and always clauses, if present.

  • Demonstration 1:

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      gather_facts: no
      tasks:
        - name: test
          block:
            - name: block
              command: echo "hello word"    # the command itself is fine
              # With gather_facts: no, ansible_facts has no 'distribution' key, so
              # evaluating this condition raises an error and the block task does not run
              when: ansible_facts['distribution'] == "CentOS"
          rescue:
            - name: rescue
              shell: ls /root
          always:
            - name: always
              debug:
                msg: "This is my test"

    # Run the play
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [block] **************************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"msg": "The conditional check 'ansible_facts['distribution'] == \"CentOS\"' failed. The error was: error while evaluating conditional (ansible_facts['distribution'] == \"CentOS\"): 'dict object' has no attribute 'distribution'\n\nThe error appears to be in '/root/project/playbook.yaml': line 7, column 11, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n block:\n - name: block\n ^ here\n"}

    TASK [rescue] *************************************************************************************************************************************************************
    changed: [client.example.com]

    TASK [always] *************************************************************************************************************************************************************
    ok: [client.example.com] => {
    "msg": "This is my test"
    }

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=1 ignored=0  

  • Demonstration 2:

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      gather_facts: no
      tasks:
        - name: test
          block:
            - name: block
              command: echo "hello word"
              when: ansible_facts['distribution'] == "CentOS"
          rescue:
            - name: rescue
              shell: ls /root
              # This when condition also fails to evaluate, so the rescue task does not run
              when: ansible_facts['distribution_major_version'] == "7"
          always:
            - name: always
              debug:
                msg: "This is my test"

    # Run the play
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [block] **************************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"msg": "The conditional check 'ansible_facts['distribution'] == \"CentOS\"' failed. The error was: error while evaluating conditional (ansible_facts['distribution'] == \"CentOS\"): 'dict object' has no attribute 'distribution'\n\nThe error appears to be in '/root/project/playbook.yaml': line 7, column 11, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n block:\n - name: block\n ^ here\n"}

    TASK [rescue] *************************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"msg": "The conditional check 'ansible_facts['distribution_major_version'] == \"7\"' failed. The error was: error while evaluating conditional (ansible_facts['distribution_major_version'] == \"7\"): 'dict object' has no attribute 'distribution_major_version'\n\nThe error appears to be in '/root/project/playbook.yaml': line 12, column 11, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n rescue:\n - name: rescue\n ^ here\n"}

    TASK [always] *************************************************************************************************************************************************************
    ok: [client.example.com] => {
    "msg": "This is my test"
    }

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=1 changed=0 unreachable=0 failed=1 skipped=0 rescued=1 ignored=0
    # As shown, neither the block command nor the rescue command was executed

  • Demonstration 3:

    # View the playbook
    [root@localhost project]# cat playbook.yaml
    - hosts: all
      gather_facts: no
      tasks:
        - name: test
          block:
            - name: block
              command: echo "hello word"
          rescue:
            - name: rescue
              shell: ls /root
          always:
            - name: always
              debug:
                msg: "This is my test"
              # An error in evaluating the when condition causes the always task to fail
              when: ansible_facts['distribution_version'] == "8"

    # Run the play
    [root@localhost project]# ansible-playbook playbook.yaml

    PLAY [all] ****************************************************************************************************************************************************************

    TASK [block] **************************************************************************************************************************************************************
    changed: [client.example.com]

    TASK [always] *************************************************************************************************************************************************************
    fatal: [client.example.com]: FAILED! => {"msg": "The conditional check 'ansible_facts['distribution_version'] == \"8\"' failed. The error was: error while evaluating conditional (ansible_facts['distribution_version'] == \"8\"): 'dict object' has no attribute 'distribution_version'\n\nThe error appears to be in '/root/project/playbook.yaml': line 15, column 11, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n always:\n - name: always\n ^ here\n"}

    PLAY RECAP ****************************************************************************************************************************************************************
    client.example.com : ok=1 changed=1 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0

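  • Note: in the demonstrations above, the block task fails and the rescue task runs even though the block's command itself would succeed. The cause is the when condition: with gather_facts: no, ansible_facts contains no distribution keys, so evaluating the condition raises an error, and a task whose condition cannot be evaluated fails rather than being skipped. A condition that evaluates cleanly to false only skips the task.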
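
The demonstrations above attach when to individual tasks and disable fact gathering. The following is a minimal sketch of the block-level form, assuming facts are gathered so that ansible_facts['distribution'] is defined. This playbook is an illustration, not one of the original demonstrations.

```yaml
- hosts: all
  gather_facts: yes               # facts are collected, so the condition can be evaluated
  tasks:
    - name: test
      # A when on the block itself is inherited by the tasks in block, rescue and always
      when: ansible_facts['distribution'] == "CentOS"
      block:
        - name: block
          command: echo "hello word"
      rescue:
        - name: rescue
          shell: ls /root
      always:
        - name: always
          debug:
            msg: "This is my test"
```

On hosts where the condition is true, the block and always tasks run, and rescue runs only if a block task fails. On other hosts the block and always tasks are skipped rather than failed, and rescue is never triggered.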