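【 Environment 】Production
【 OB or other component 】OCP
【 Version 】4.2.2-20240315150922
【 Problem description 】
Taking over the server fails.
Corresponding log:
subtask_8003832.log (90.8 KB)
However, the ocp agent can be started normally on the server itself.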
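Does clicking Retry help?
I have retried many times without success. Also, if I start the ocp agent service on the server and skip the failed step, the host shows as offline.
I also tried reinstalling ocp agent, and it still reports the error.
Please send the operating system type and version of the server you are trying to take over.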
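For the OS question above, a small sketch of what could be run on the target host to collect the distribution, kernel and architecture. It assumes the host ships the standard /etc/os-release file; `os_summary` is a hypothetical helper name used only for illustration, not part of OCP.

```shell
# Print the distribution name and version from an os-release style file.
# The path is a parameter (hypothetical helper) so it can be tried on a copy.
os_summary() {
    if [ -r "${1:-/etc/os-release}" ]; then
        # PRETTY_NAME is a standard os-release field.
        sed -n 's/^PRETTY_NAME=//p' "${1:-/etc/os-release}" | tr -d '"'
    else
        echo "unknown (no ${1:-/etc/os-release})"
    fi
}

os_summary   # distribution and version
uname -r     # kernel release
uname -m     # CPU architecture
```

Pasting the three output lines into the thread should be enough to answer the question.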
Error message:
2024-12-26 15:53:04.473 INFO 231614 --- [pool-manual-subtask-executor14,6ae4efbbd23e4521,4a53e0d2e24c] c.o.o.e.internal.template.SshTemplate : SSH execute end: sudo bash -c ''"'"'/home/admin/ocp_agent'"'"'/bin/ocp_agentctl -c '"'"'/home/admin/ocp_agent'"'"'/conf/agentctl.yaml config --update '"'"'agent.http.basic.auth.username=ocp_agent,agent.http.basic.auth.password=xxx on 192.168.10.172,result:SshResult(host=192.168.10.172, username=root, command=sudo bash -c ''"'"'/home/admin/ocp_agent'"'"'/bin/ocp_agentctl -c '"'"'/home/admin/ocp_agent'"'"'/conf/agentctl.yaml config --update '"'"'agent.http.basic.auth.username=ocp_agent,agent.http.basic.auth.password=xxx, out=, err=, extOut=null, exitStatus=0)
2024-12-26 15:53:04.478 ERROR 231614 --- [pool-manual-subtask-executor14,6ae4efbbd23e4521,4a53e0d2e24c] c.o.o.executor.internal.util.JsonUtils : failed to convert to object
com.fasterxml.jackson.databind.exc.MismatchedInputException: No content to map due to end-of-input
at [Source: (String)""; line: 1, column: 0]
at com.fasterxml.jackson.databind.exc.MismatchedInputException.from(MismatchedInputException.java:59)
at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:4821)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4723)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3677)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3645)
at com.oceanbase.ocp.executor.internal.util.JsonUtils.fromJson(JsonUtils.java:23)
at com.oceanbase.ocp.executor.executor.SshExecutor.updateConfig(SshExecutor.java:274)
at com.oceanbase.ocp.service.compute.AgentInstallationTaskService.configOcpAgent(AgentInstallationTaskService.java:379)
at com.oceanbase.ocp.service.compute.AgentInstallationTaskService$$FastClassBySpringCGLIB$$f7a6037f.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
at org.springframework.aop.framework.CglibAopProxy.invokeMethod(CglibAopProxy.java:386)
at org.springframework.aop.framework.CglibAopProxy.access$000(CglibAopProxy.java:85)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:704)
at com.oceanbase.ocp.service.compute.AgentInstallationTaskService$$EnhancerBySpringCGLIB$$7e9f6ac1.configOcpAgent(<generated>)
at com.oceanbase.ocp.service.task.business.host.InstallOcpAgentTask.run(InstallOcpAgentTask.java:65)
at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.execute(JavaSubtaskRunner.java:64)
at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.doRun(JavaSubtaskRunner.java:32)
at com.oceanbase.ocp.core.task.engine.runner.JavaSubtaskRunner.run(JavaSubtaskRunner.java:26)
at com.oceanbase.ocp.core.task.engine.runner.RunnerFactory.doRun(RunnerFactory.java:76)
at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.doRun(SubtaskExecutor.java:203)
at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.redirectConsoleOutput(SubtaskExecutor.java:197)
at com.oceanbase.ocp.core.task.engine.coordinator.worker.subtask.SubtaskExecutor.lambda$submit$2(SubtaskExecutor.java:134)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Try running this command on its own on the machine you are trying to take over:
sudo bash -c ''"'"'/home/admin/ocp_agent'"'"'/bin/ocp_agentctl -c '"'"'/home/admin/ocp_agent'"'"'/conf/agentctl.yaml config --update
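The SSH result in the log shows `out=` empty with `exitStatus=0`, and the JSON parser then fails on an empty string, so when running the command by hand it may help to look at stdout, stderr and the exit status separately. A minimal sketch: `run_and_report` is a hypothetical helper, not an ocp_agentctl feature, and the demo call uses `true` rather than the real command.

```shell
# Run a command, keep stdout, stderr and the exit status apart,
# and report whether stdout was empty.
run_and_report() {
    out_file=$(mktemp)
    err_file=$(mktemp)
    "$@" >"$out_file" 2>"$err_file"
    status=$?
    echo "exit status: $status"
    if [ -s "$out_file" ]; then
        echo "stdout: $(wc -c <"$out_file" | tr -d ' ') bytes"
    else
        echo "stdout: empty"
    fi
    echo "stderr: $(cat "$err_file")"
    rm -f "$out_file" "$err_file"
    return "$status"
}

# Harmless demonstration; on the host, pass the real ocp_agentctl command instead.
run_and_report true
```

If the real command also reports an empty stdout with exit status 0, that would match what OCP saw over SSH.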
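We are working with the OCP team to analyze this issue and will reply as soon as there is progress.
What version is the agent on this host? Is it the same as on the other hosts? Also, were they all deployed as root, or as the admin user with sudo privileges?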
sudo /home/admin/ocp_agent/bin/ocp_agentctl status
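This password is the authentication password for the API exposed by the agent. It is randomly generated and cannot be viewed.
So starting the agent from OCP reports an error, starting it manually on the server succeeds, and after that OCP still shows the host as offline. Is that right?
Yes, it looks like the OCP takeover itself is failing.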
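Since the agent runs locally but the host stays offline in OCP, one thing that may be worth ruling out is plain TCP reachability from the OCP server to the agent. A rough sketch, assuming bash and coreutils `timeout` are available on the OCP server; `check_port` is a hypothetical helper, and the agent port is deliberately a required argument because the thread does not state it.

```shell
# Report whether a TCP connection to host:port can be opened within 3 seconds.
# Uses bash's built-in /dev/tcp redirection, so no extra tools are needed.
check_port() {
    if [ $# -ne 2 ]; then
        echo "usage: check_port HOST PORT" >&2
        return 2
    fi
    if timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

# Example (run on the OCP server, with the port your agent actually listens on):
#   check_port 192.168.10.172 PORT
```

An "unreachable" result would point at a firewall or network issue rather than at the takeover task; "reachable" only shows the port is open, not that the agent API is healthy.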
Does taking over this host fail consistently?
Try upgrading OCP to 433bp1 and see whether that helps.