OCP alarm Body template has no alarm_duration value

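Thanks, learned something.

Alarms are pushed only from OCP to a third-party alerting platform that we purchased, the one from 擎创.

Try the approach I described above:

That is, use the ${message} approach. In the message configuration item, configure an alarm message template that contains ${alarm_duration}. In the channel configuration item, change the Body template so that the content field references the alarm message template through ${message}. ${alarm_duration} is then replaced inside the alarm message template, the duration ends up in content (via ${message}), and it displays correctly.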
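
The two-stage substitution described above can be simulated locally. This is only a sketch: it uses Python's `string.Template` as a stand-in for OCP's renderer (whose real implementation is not shown in this thread), and the alarm values and template texts are made up for illustration.

```python
import json
from string import Template

# Hypothetical alarm values; in OCP these are supplied when the alarm fires.
alarm = {"alarm_name": "demo_alarm", "alarm_duration": "421872"}

# Stage 1: the alarm message template (message configuration item)
# is where ${alarm_duration} gets replaced.
message_template = Template("[Alarm] ${alarm_name}, duration: ${alarm_duration}")
message = message_template.substitute(alarm)

# Stage 2: the Body template (channel configuration item) only references ${message}.
body_template = Template('{"msgtype": "text", "text": {"content": "${message}"}}')
body = body_template.substitute(message=message)

payload = json.loads(body)
print(payload["text"]["content"])
```

If the message text contained quotes or line breaks, it would have to be JSON-escaped before being placed into the Body; the sketch avoids that case.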


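I wrote a script, webhook_server.py, that runs a local HTTP server to verify whether ${alarm_duration} is resolved correctly once an alarm fires. The server prints:
1. The received request body
2. The parsed JSON
3. The replacement status of each field (whether it was replaced)
4. A dedicated check of the duration field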

webhook_server.py:


# -*- coding: utf-8 -*-
# print_function makes multi-argument print() calls behave the same on Python 2.7 and 3.x
from __future__ import print_function

import json
import sys

# Compatible with Python 2.7 and Python 3.x
if sys.version_info[0] == 2:
    from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
else:
    from http.server import HTTPServer, BaseHTTPRequestHandler

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # A missing Content-Length header is treated as an empty body
        content_length = int(self.headers.get('Content-Length') or 0)
        body = self.rfile.read(content_length)
        
        # Decode body for Python 3
        if sys.version_info[0] == 3:
            body_str = body.decode('utf-8')
        else:
            body_str = body
        
        print("=" * 50)
        print("Received POST request")
        print("Headers:", dict(self.headers))
        print("Body:", body_str)
        
        # Parse JSON
        try:
            data = json.loads(body_str)
            print("\nParsed JSON:")
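            # On Python 2.7, keep the default ASCII-escaped output: with
            # ensure_ascii=False, json.dumps may return a unicode object
            # that can fail to print on a non-UTF-8 terminal
            if sys.version_info[0] == 2:
                print(json.dumps(data, indent=2))
            else:
                print(json.dumps(data, indent=2, ensure_ascii=False))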
            
            # Check duration field in content
            if 'text' in data and 'content' in data['text']:
                content = data['text']['content']
                print("\n[INFO] Content field analysis:")
                if '${alarm_duration}' in content:
                    print("[ERROR] ${alarm_duration} found in content (NOT REPLACED)")
                # u'' literals keep the comparison Unicode-safe on Python 2.7 as well
                elif u'持续时间' in content or 'duration' in content.lower():
                    # Extract duration value from content (ASCII or full-width colon)
                    import re
                    duration_match = re.search(u'持续时间\\s*[::]\\s*(\\d+)', content)
                    if duration_match:
                        duration_value = duration_match.group(1)
                        # Braces are doubled so str.format() does not treat
                        # {alarm_duration} as a named field (that raises KeyError)
                        print("[SUCCESS] ${{alarm_duration}} replaced to: {} (seconds)".format(duration_value))
                        print("  Duration in seconds: {}".format(duration_value))
                        print("  Duration in minutes: {:.2f}".format(float(duration_value) / 60))
                        print("  Duration in hours: {:.2f}".format(float(duration_value) / 3600))
                    else:
                        print("[INFO] Duration information found in content, but format may be different")
                else:
                    print("[INFO] No duration information found in content")
            
            # Check duration field (if exists as separate field)
            if 'duration' in data:
                print("\n[INFO] duration field value: {}".format(data['duration']))
                if data['duration'] == '${alarm_duration}':
                    print("[ERROR] duration field shows original text, not replaced")
                else:
                    print("[SUCCESS] duration field replaced to: {}".format(data['duration']))
            
            # Check other fields
            print("\nField replacement status:")
            for key, value in data.items():
                # Python 2.7 uses basestring, Python 3 uses str
                if sys.version_info[0] == 2:
                    is_string = isinstance(value, basestring)
                else:
                    is_string = isinstance(value, str)
                
                if is_string and value.startswith('${') and value.endswith('}'):
                    print("  {}: [NOT REPLACED] {}".format(key, value))
                else:
                    print("  {}: [REPLACED] {}".format(key, value))
        except Exception as e:
            # Covers invalid JSON as well as unexpected payload shapes
            print("Error processing request body: {}".format(e))
        
        self.send_response(200)
        self.send_header('Content-Type', 'application/json; charset=utf-8')
        self.end_headers()
        # Return JSON format response that OCP expects
        response = json.dumps({"errcode": 0, "errmsg": "ok"})
        if sys.version_info[0] == 3:
            self.wfile.write(response.encode('utf-8'))
        else:
            self.wfile.write(response)
    
    def log_message(self, format, *args):
        # Suppress default logging
        pass

if __name__ == '__main__':
    httpd = HTTPServer(('0.0.0.0', 8080), WebhookHandler)
    print("Webhook server started on http://0.0.0.0:8080")
    print("Waiting for requests...")
    httpd.serve_forever()

a. Start the server: python webhook_server.py

b. Configure the Webhook URL in OCP
http://your-server-ip:8080/webhook

Header template
Content-Type:application/json; charset=utf-8

Body template
{
  "msgtype": "text",
  "text": {
    "content": "【告警】${alarm_name}\n触发时间:${alarm_active_at}\n集群:${ob_cluster}\n详情:${alarm_url}\n持续时间:${alarm_duration}"
  }
}

Response validation
{"errcode":0,"errmsg":"ok"}

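The alarm message template and the alarm recovery message template need no special configuration.

c. Trigger an alarm

d. Inspect and analyze the output printed when the request arrives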
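
To check the server itself without waiting for a real alarm, a request can also be sent by hand. The sketch below is Python 3 only; `build_payload` and `post_json` are hypothetical helper names, and the sample alarm text is invented. It assumes webhook_server.py is listening on port 8080 of the local machine.

```python
import json
from urllib.request import Request, urlopen

def build_payload(duration):
    """Build a body shaped like the Body template, with a sample duration filled in."""
    content = "【告警】demo_alarm\n持续时间:{}".format(duration)
    return {"msgtype": "text", "text": {"content": content}}

def post_json(url, payload):
    """POST the payload as UTF-8 JSON and return the decoded JSON reply."""
    data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    request = Request(url, data=data,
                      headers={"Content-Type": "application/json; charset=utf-8"})
    with urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))

# Usage, with webhook_server.py running locally (uncomment to send):
# print(post_json("http://127.0.0.1:8080/webhook", build_payload("421872")))
# print(post_json("http://127.0.0.1:8080/webhook", build_payload("${alarm_duration}")))
```

The first call imitates a replaced variable and the second imitates the unreplaced case, so both branches of the server's content check can be observed.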


The output shows that ${alarm_duration} was correctly replaced with 421872 (seconds) rather than appearing as the literal text ${alarm_duration}.
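
A raw value such as 421872 is hard to read at a glance. Assuming the value is a number of seconds, as observed in this test, a small helper (hypothetical, not part of OCP) can turn it into days, hours, minutes and seconds:

```python
def format_duration(seconds):
    """Convert a duration in seconds into a 'Nd Nh Nm Ns' string."""
    seconds = int(seconds)
    days, rem = divmod(seconds, 86400)   # 86400 seconds per day
    hours, rem = divmod(rem, 3600)       # 3600 seconds per hour
    minutes, secs = divmod(rem, 60)
    return "{}d {}h {}m {}s".format(days, hours, minutes, secs)

print(format_duration(421872))  # 4d 21h 11m 12s
```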


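The test shows that ${alarm_duration} is replaced normally, so the literal text appearing on the third-party platform is more likely caused by one of the following:

  1. Configuration problem (most likely)

The Body template was not saved correctly

The configuration is in the wrong place (it belongs in the channel configuration item)

The JSON is malformed (it is reported to be correct, but a detail may still be off)

  2. Influence of the alarm message template

If an alarm message template is configured for the third-party platform, it may affect variable replacement in the Body template

The alarm message template configuration needs to be checked

  3. Special handling by the third-party platform

The third-party platform may treat certain fields specially

It may require a specific way of configuring the template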
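
The JSON-format cause can be ruled out offline. The following sketch (`check_body_template` is a hypothetical helper, and both sample templates are invented) lists the ${...} variables found in a Body template and checks that the template still parses as JSON once each variable is swapped for a dummy value:

```python
import json
import re

PLACEHOLDER = re.compile(r"\$\{(\w+)\}")

def check_body_template(template):
    """Return (is_valid_json, placeholder_names) for a Body template string."""
    names = PLACEHOLDER.findall(template)
    # Replace each placeholder with a harmless sample value, then parse.
    rendered = PLACEHOLDER.sub("x", template)
    try:
        json.loads(rendered)
        return True, names
    except ValueError:
        return False, names

good = '{"msgtype": "text", "text": {"content": "duration: ${alarm_duration}"}}'
bad = '{"msgtype": "text", "text": {"content": "duration: ${alarm_duration}",}}'
print(check_body_template(good))  # (True, ['alarm_duration'])
print(check_body_template(bad))   # (False, ['alarm_duration'])
```

The second template fails only because of a trailing comma, the kind of detail that is easy to miss when reading the configuration by eye.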

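Suggested troubleshooting steps:

  1. Check the actual configuration used for the third-party platform

Confirm that the Body template is set in the channel configuration item

Confirm that the configuration was saved correctly

Compare the complete DingTalk configuration with the third-party platform configuration

  2. Check the alarm message template

If an alarm message template is configured, try clearing the related fields

Test with the Body template only

  3. Verify with the test server

Apply the same test method to the third-party platform channel

Look at the request content that is actually received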
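
For the comparison between the DingTalk and third-party configurations, a line-by-line diff makes small differences visible. This is a sketch with Python's standard `difflib`; the two template texts are invented placeholders, so paste the real ones exported from OCP in their place.

```python
import difflib

def diff_templates(name_a, text_a, name_b, text_b):
    """Return unified-diff lines between two template texts."""
    lines = difflib.unified_diff(
        text_a.splitlines(), text_b.splitlines(),
        fromfile=name_a, tofile=name_b, lineterm="")
    return list(lines)

# Hypothetical template contents, for illustration only.
dingtalk = '{\n  "msgtype": "text",\n  "text": {"content": "${message}"}\n}'
third_party = '{\n  "msgtype": "text",\n  "text": {"content": "${alarm_duration}"}\n}'

for line in diff_templates("dingtalk", dingtalk, "third_party", third_party):
    print(line)
```

Lines prefixed with - exist only in the first template and lines prefixed with + only in the second.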


Nice, learned a lot.