OCP fails to take over a server, reporting unknown error COM10999

【 Environment 】Production
【 OB or other component 】OCP
【 Version 】4.2.2-20240315150922
【 Problem description 】
Taking over the server fails.


Corresponding log:
subtask_8003832.log (90.8 KB)

However, the OCP Agent can be started normally on the server itself.

Does clicking Retry help?

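Retrying several times did not help. Also, after starting the OCP Agent service on the server and skipping the failed step, the host shows as Offline.

I also tried reinstalling the OCP Agent, and it still reports the error.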

Please share the operating system type and version of the server to be taken over.

CentOS x64. The other machines in the same batch all joined normally; only this one fails.
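
Since the exact OS release was asked for, here is a minimal sketch for collecting it on the affected host and on a working one for comparison. It assumes `/etc/os-release` exists (true on CentOS 7 and later); the function name `os_summary` is hypothetical and not part of OCP.

```shell
#!/bin/sh
# os_summary: hypothetical helper, not part of OCP or ocp_agentctl.
# Prints "<ID> <VERSION_ID> <arch>" from an os-release style file.
os_summary() {
  file="${1:-/etc/os-release}"
  (
    # os-release files are written as shell variable assignments,
    # so they can be sourced; the subshell keeps the variables local.
    . "$file" || exit 1
    printf '%s %s %s\n' "${ID:-unknown}" "${VERSION_ID:-unknown}" "$(uname -m)"
  )
}

# Usage on a host (left commented out):
# os_summary
```

Running it on both the failing host and one that joined successfully makes it easy to spot a release or architecture mismatch.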

Error message:

2024-12-26 15:53:04.473  INFO 231614 --- [pool-manual-subtask-executor14,6ae4efbbd23e4521,4a53e0d2e24c] c.o.o.e.internal.template.SshTemplate    : SSH execute end: sudo bash -c ''"'"'/home/admin/ocp_agent'"'"'/bin/ocp_agentctl -c '"'"'/home/admin/ocp_agent'"'"'/conf/agentctl.yaml config --update '"'"'agent.http.basic.auth.username=ocp_agent,agent.http.basic.auth.password=xxx on 192.168.10.172,result:SshResult(host=192.168.10.172, username=root, command=sudo bash -c ''"'"'/home/admin/ocp_agent'"'"'/bin/ocp_agentctl -c '"'"'/home/admin/ocp_agent'"'"'/conf/agentctl.yaml config --update '"'"'agent.http.basic.auth.username=ocp_agent,agent.http.basic.auth.password=xxx, out=, err=, extOut=null, exitStatus=0)

2024-12-26 15:53:04.478 ERROR 231614 --- [pool-manual-subtask-executor14,6ae4efbbd23e4521,4a53e0d2e24c] c.o.o.executor.internal.util.JsonUtils   : failed to convert to  object

com.fasterxml.jackson.databind.exc.MismatchedInputException: No content to map due to end-of-input
 at [Source: (String)""; line: 1, column: 0]
	at com.fasterxml.jackson.databind.exc.MismatchedInputException.from(MismatchedInputException.java:59)
	at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:4821)
	at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4723)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3677)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3645)
	at com.oceanbase.ocp.executor.internal.util.JsonUtils.fromJson(JsonUtils.java:23)
	at com.oceanbase.ocp.executor.executor.SshExecutor.updateConfig(SshExecutor.java:274)
	at com.oceanbase.ocp.service.compute.AgentInstallationTaskService.configOcpAgent(AgentInstallationTaskService.java:379)
	at com.oceanbase.ocp.service.compute.AgentInstallationTaskService$$FastClassBySpringCGLIB$$f7a6037f.invoke(<generated>)
	at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
	at org.springframework.aop.framework.CglibAopProxy.invokeMethod(CglibAopProxy.java:386)
	at org.springframework.aop.framework.CglibAopProxy.access$000(CglibAopProxy.java:85)
	at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:704)
	at com.oceanbase.ocp.service.compute.AgentInstallationTaskService$$EnhancerBySpringCGLIB$$7e9f6ac1.configOcpAgent(<generated>)
	at com.oceanbase.ocp.service.task.business.host.InstallOcpAgentTask.run(InstallOcpAgentTask.java:65)
	at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.execute(JavaSubtaskRunner.java:64)
	at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.doRun(JavaSubtaskRunner.java:32)
	at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.run(JavaSubtaskRunner.java:26)
	at com.oceanbase.ocp.core.task.engine.runner.RunnerFactory.doRun(RunnerFactory.java:76)
	at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.doRun(SubtaskExecutor.java:203)
	at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.redirectConsoleOutput(SubtaskExecutor.java:197)
	at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.lambda$submit$2(SubtaskExecutor.java:134)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
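
Reading the two log entries together: the SSH command finished with `exitStatus=0` but `out=` was empty, and `JsonUtils.fromJson` then failed with "No content to map due to end-of-input" on an empty string. This suggests OCP expected output on stdout from the `config --update` call and received none. Below is a minimal sketch for checking that by hand. The function name `classify_output` is hypothetical, and the "JSON-like" test only looks at the first character; it is a rough heuristic, not what OCP does internally.

```shell
#!/bin/sh
# classify_output: hypothetical helper, not part of OCP or ocp_agentctl.
# Classifies captured stdout as "empty", "json-like" or "non-json".
classify_output() {
  out="$1"
  if [ -z "$out" ]; then
    echo "empty"          # the case seen in the log: out= with nothing after it
  else
    case "$out" in
      '{'*|'['*) echo "json-like" ;;   # starts like a JSON object or array
      *)         echo "non-json" ;;
    esac
  fi
}

# Usage on the affected host (left commented out; note that config --update
# changes the agent configuration, here to the value already seen in the log):
# out=$(sudo /home/admin/ocp_agent/bin/ocp_agentctl -c /home/admin/ocp_agent/conf/agentctl.yaml config --update 'agent.http.basic.auth.username=ocp_agent')
# classify_output "$out"
```

If the result is "empty" on this host but not on a host that joined successfully, that difference is worth reporting along with the agent logs.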

Try running this command on its own on the machine to be taken over:

sudo bash -c ''"'"'/home/admin/ocp_agent'"'"'/bin/ocp_agentctl -c '"'"'/home/admin/ocp_agent'"'"'/conf/agentctl.yaml config --update 

The command doesn't seem right.

I extracted this command from the log; it looks like the problem is with this symbol.
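
For context on the quoting: `'"'"'` is a common shell idiom for embedding a single quote inside a single-quoted string (close the quote, add a double-quoted `'`, reopen the quote). The fragment copied from the log stops after `--update`, which leaves a single-quoted string open, so the shell cannot run it as pasted. The sketch below decodes a balanced version of the same prefix to show what `bash -c` would receive. The variable name `inner` is illustrative, and the closing quote after `--update` is an assumption added only to balance the string.

```shell
#!/bin/sh
# Same quoting as in the log, with one closing quote added after --update
# (an assumption) so that the string is balanced and can be inspected.
inner=''"'"'/home/admin/ocp_agent'"'"'/bin/ocp_agentctl -c '"'"'/home/admin/ocp_agent'"'"'/conf/agentctl.yaml config --update'

# Print the decoded string, i.e. what bash -c would be asked to run.
printf '%s\n' "$inner"
```

The decoded string is `'/home/admin/ocp_agent'/bin/ocp_agentctl -c '/home/admin/ocp_agent'/conf/agentctl.yaml config --update`, so when typing the command by hand the nested quotes can be dropped and the `key=value` list from the log passed as a single quoted argument after `--update`.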

It seems the part after that also belongs to the statement. Which password does this password refer to?


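Starting the OCP Agent manually on the server reports an error.

We are working with the OCP experts to analyze this issue and will reply as soon as there is progress.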

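What is the agent version on this machine? Is it the same as on the others? Also, were they all deployed as the root user, or as an admin user with sudo privileges?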

sudo /home/admin/ocp_agent/bin/ocp_agentctl status

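The agent version on this machine is the same as on the other nodes: ocp-agent-ce-4.2.2-20240315150922.el7.x86_64
The agent cannot be started through OCP, but it can be started manually.

This password is the authentication password for the API exposed by the agent. It is randomly generated and cannot be viewed.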

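So starting the agent from OCP reports an error, starting it manually on the server succeeds, and after a successful manual start the host still shows as Offline in OCP. Is that right?

Yes, it seems the OCP takeover is failing.

Does taking over this host fail consistently?

Try upgrading OCP to 433bp1 and see whether that helps.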