Got it, thanks.
We only push from OCP to a purchased third-party alerting platform, the one from 擎创.
Try the approach I described above:
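That is, use ${message}. In the message settings, configure an alert message template that contains ${alarm_duration}. In the channel settings, edit the Body template so that the content field references the alert message template through ${message}. ${alarm_duration} is then substituted inside the alert message template, the duration reaches content by way of ${message}, and it displays correctly.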
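The two-stage substitution described here can be sketched in Python. This is only an illustration of the idea, not OCP's actual rendering code: `render_body`, the sample templates, and the sample values are hypothetical, and whether OCP escapes the rendered message before inserting it into the Body template is an assumption.

```python
import json
from string import Template


def render_body(body_template, message_template, variables):
    """Render the message template first, then inject it as ${message}."""
    # Stage 1: fill the alert variables into the alert message template.
    message = Template(message_template).safe_substitute(variables)
    # Stage 2: expose the rendered text to the Body template as ${message}.
    # json.dumps(...)[1:-1] escapes quotes and newlines so the body stays valid JSON.
    escaped = json.dumps(message, ensure_ascii=False)[1:-1]
    return Template(body_template).safe_substitute(message=escaped)


# Hypothetical templates and values, for illustration only.
message_template = "Alert: ${alarm_name}\nDuration: ${alarm_duration}"
body_template = '{"msgtype": "text", "text": {"content": "${message}"}}'
variables = {"alarm_name": "example_alarm", "alarm_duration": "421872"}

body = render_body(body_template, message_template, variables)
payload = json.loads(body)
print(payload["text"]["content"])
```

Because ${alarm_duration} is resolved in stage 1, the Body template only ever sees plain text, which is why this route can work even if a channel does not substitute every variable directly.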
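I wrote a script, webhook_server.py, that runs a local HTTP server to verify whether ${alarm_duration} is resolved correctly once an alert fires. The server prints:
1. The request Body as received
2. The parsed JSON
3. The substitution status of each field (replaced or not)
4. A dedicated check of the duration field
webhook_server.py: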
# -*- coding: utf-8 -*-
import json
import re
import sys

# Compatible with Python 2.7 and Python 3.x
if sys.version_info[0] == 2:
    from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler
else:
    from http.server import HTTPServer, BaseHTTPRequestHandler


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # A missing Content-Length header is treated as an empty body
        content_length = int(self.headers.get('Content-Length') or 0)
        body = self.rfile.read(content_length)
        # Decode body for Python 3
        if sys.version_info[0] == 3:
            body_str = body.decode('utf-8')
        else:
            body_str = body
        print("=" * 50)
        print("Received POST request")
        print("Headers:", dict(self.headers))
        print("Body:", body_str)
        # Parse JSON
        try:
            data = json.loads(body_str)
            print("\nParsed JSON:")
            # On Python 2, keep the default ASCII-escaped output to avoid
            # encoding errors when printing non-ASCII text
            if sys.version_info[0] == 2:
                print(json.dumps(data, indent=2))
            else:
                print(json.dumps(data, indent=2, ensure_ascii=False))
            # Check duration field in content
            if 'text' in data and 'content' in data['text']:
                content = data['text']['content']
                print("\n[INFO] Content field analysis:")
                if '${alarm_duration}' in content:
                    print("[ERROR] ${alarm_duration} found in content (NOT REPLACED)")
                # u'' literals keep the comparison valid on Python 2 as well
                elif u'持续时间:' in content or 'duration' in content.lower():
                    # Extract duration value from content
                    duration_match = re.search(u'持续时间[::]\\s*(\\d+)', content)
                    if duration_match:
                        duration_value = duration_match.group(1)
                        # Braces are doubled so str.format does not treat
                        # {alarm_duration} as a named field
                        print("[SUCCESS] ${{alarm_duration}} replaced to: {} (seconds)".format(duration_value))
                        print(" Duration in seconds: {}".format(duration_value))
                        print(" Duration in minutes: {:.2f}".format(float(duration_value) / 60))
                        print(" Duration in hours: {:.2f}".format(float(duration_value) / 3600))
                    else:
                        print("[INFO] Duration information found in content, but format may be different")
                else:
                    print("[INFO] No duration information found in content")
            # Check duration field (if exists as separate field)
            if 'duration' in data:
                print("\n[WARNING] duration field value: {}".format(data['duration']))
                if data['duration'] == '${alarm_duration}':
                    print("[ERROR] duration field shows original text, not replaced")
                else:
                    print("[SUCCESS] duration field replaced to: {}".format(data['duration']))
            # Check other top-level fields
            print("\nField replacement status:")
            for key, value in data.items():
                # Python 2.7 uses basestring, Python 3 uses str
                if sys.version_info[0] == 2:
                    is_string = isinstance(value, basestring)
                else:
                    is_string = isinstance(value, str)
                if is_string and value.startswith('${') and value.endswith('}'):
                    print(" {}: [NOT REPLACED] {}".format(key, value))
                else:
                    print(" {}: [REPLACED] {}".format(key, value))
        except Exception as e:
            print("Error parsing JSON: {}".format(e))
        self.send_response(200)
        self.send_header('Content-Type', 'application/json; charset=utf-8')
        self.end_headers()
        # Return JSON format response that OCP expects
        response = json.dumps({"errcode": 0, "errmsg": "ok"})
        if sys.version_info[0] == 3:
            self.wfile.write(response.encode('utf-8'))
        else:
            self.wfile.write(response)

    def log_message(self, format, *args):
        # Suppress default logging
        pass


if __name__ == '__main__':
    httpd = HTTPServer(('0.0.0.0', 8080), WebhookHandler)
    print("Webhook server started on http://0.0.0.0:8080")
    print("Waiting for requests...")
    httpd.serve_forever()
a. Run the service: python webhook_server.py
b. Configure the Webhook URL in OCP
http://your-server-ip:8080/webhook
Header template
Content-Type:application/json; charset=utf-8
Body template
{
  "msgtype": "text",
  "text": {
    "content": "【告警】${alarm_name}\n触发时间:${alarm_active_at}\n集群:${ob_cluster}\n详情:${alarm_url}\n持续时间:${alarm_duration}"
  }
}
Response validation
{"errcode":0,"errmsg":"ok"}
The alert message template and the alert recovery message template need no special configuration.
c. Trigger an alert
d. Inspect and analyze the output printed when the request arrives
The output shows that ${alarm_duration} was correctly replaced with 421872 (a number of seconds) and did not appear as the literal text ${alarm_duration}.
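For readability, the raw value can be broken down into days, hours, minutes, and seconds. A minimal sketch, assuming the value is a count of seconds as observed in the test; `format_duration` is a hypothetical helper, not part of OCP or of webhook_server.py:

```python
def format_duration(seconds):
    """Break a duration in seconds into days, hours, minutes, seconds."""
    days, rest = divmod(int(seconds), 86400)   # 86400 seconds per day
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return "{}d {}h {}m {}s".format(days, hours, minutes, secs)


print(format_duration(421872))  # 4d 21h 11m 12s
```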
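The test shows that ${alarm_duration} can be substituted normally, so the literal text on the third-party platform is more likely caused by one of the following:
- A configuration problem (most likely)
  The Body template was not saved correctly
  The template is configured in the wrong place (it belongs in the channel settings)
  The JSON is malformed (it may look correct but still be wrong in some detail)
- Interference from the alert message template
  If an alert message template is configured for the third-party platform, it may affect variable substitution in the Body template
  Check the alert message template configuration
- Special handling on the third-party platform
  The platform may treat certain fields specially
  A specific configuration method may be required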
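The "malformed JSON" possibility can be checked offline before saving a Body template. The sketch below assumes placeholders follow the ${name} form used in this thread; `check_body_template` and the two sample templates are hypothetical:

```python
import json
import re

# Matches placeholders of the form ${name}
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def check_body_template(template_text):
    """Return (placeholders, error); error is None when the JSON parses."""
    placeholders = PLACEHOLDER.findall(template_text)
    try:
        # Swap each placeholder for a harmless token, then parse as JSON.
        json.loads(PLACEHOLDER.sub("X", template_text))
    except ValueError as exc:
        return placeholders, str(exc)
    return placeholders, None


good = '{"msgtype": "text", "text": {"content": "Duration: ${alarm_duration}"}}'
bad = '{"msgtype": "text", "text": {"content": "Duration: ${alarm_duration}",}}'

print(check_body_template(good))
print(check_body_template(bad))
```

A placeholder that is mistyped, for example with parentheses instead of braces, will simply be missing from the returned list, which makes that kind of slip easy to spot.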
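Suggested troubleshooting steps:
- Check the actual configuration for the third-party platform
  Confirm the Body template is set in the channel settings
  Confirm the configuration was saved correctly
  Compare the complete DingTalk configuration with the third-party platform's
- Check the alert message template
  If an alert message template is configured, try clearing the related fields
  Test with the Body template only
- Verify with the test server
  Apply the same test method to the third-party platform's configuration
  Look at the request content that is actually received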
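Before pointing a channel at the test server, it can help to confirm the server is reachable and answers as expected. A minimal Python 3 sketch, assuming webhook_server.py is listening on 127.0.0.1:8080; `build_request` and `send_test_alert` are hypothetical helpers:

```python
import json
from urllib.request import Request, urlopen


def build_request(url, content):
    """Build a POST request shaped like the Body template in this thread."""
    payload = {"msgtype": "text", "text": {"content": content}}
    data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    headers = {"Content-Type": "application/json; charset=utf-8"}
    return Request(url, data=data, headers=headers, method="POST")


def send_test_alert(url, content, timeout=5):
    """Send the request and return the server's parsed JSON reply."""
    with urlopen(build_request(url, content), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call, with the test server already running:
# send_test_alert("http://127.0.0.1:8080/webhook", "持续时间:421872")
```

Sending one body with a numeric duration and one with the literal ${alarm_duration} exercises both branches of the server's content check.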
Great, that works. Learned a lot.
