[Product name] OCP
[Product version]
[Problem description]
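1. Please tell us which OCP version you are currently using.
2. Please fix the clock synchronization between the OCP host and the machine you use to access the OCP console.
OCP version: 3.1.1-ce
The OCP host is already synchronized via NTP.
Could you post the exact error? There is too little information to go on right now.
Clicking Tenant -> User Management shows this error. I have confirmed that the root@tpcc_tenant password (the password is empty) and the whitelist are both fine.
Can you connect to the cluster manually? It looks like the connection fails because the username or password is wrong.
Yes, I can. The performance monitoring page shows live charts, and I have been running stress tests the whole time.
Could I uninstall OCP and then redeploy it to try again? Are there steps for uninstalling OCP?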
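OCP is essentially a Docker container, so you can uninstall it directly with docker rm ${container}. Before redeploying, we recommend dropping the metadata databases that OCP depends on (meta and monitor), and then redeploying.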
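The uninstall procedure described above can be sketched as an ordered plan. This is only an illustration: the helper name `build_ocp_uninstall_plan`, the container name, and the database names are hypothetical placeholders, since the thread does not give the real ones. The `docker stop` step is an addition of this sketch, because `docker rm` refuses to remove a running container unless it is stopped first or `-f` is used. The function only builds the commands; it does not run them.

```python
import shlex


def build_ocp_uninstall_plan(container, metadb_names):
    """Return the uninstall steps in order, without executing anything.

    container    -- name or ID of the OCP container (placeholder, not from the thread)
    metadb_names -- names of the meta and monitor databases (placeholders)
    """
    steps = [
        # Stop first: `docker rm` fails on a running container without -f.
        f"docker stop {shlex.quote(container)}",
        f"docker rm {shlex.quote(container)}",
    ]
    for name in metadb_names:
        if "`" in name:
            raise ValueError(f"unsafe database name: {name!r}")
        # SQL to run against the cluster that hosts the OCP metadata.
        steps.append(f"DROP DATABASE IF EXISTS `{name}`;")
    return steps


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for step in build_ocp_uninstall_plan("ocp", ["meta_database", "monitor_database"]):
        print(step)
```

Review the printed steps and substitute the actual container and database names from your deployment before running anything, because dropping the databases is irreversible.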
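OK. I suggest adding the uninstall steps to the documentation as well.
Thanks for the suggestion. Please keep reporting any problems you run into.
OK, thanks a lot.
You're welcome, and thank you for the feedback.
Could you avoid using an empty password? Has this tenant's password been added to the password vault? During takeover, probably only the sys tenant's password was entered.
After redeploying, I took over the cluster and ran into two problems.
Please answer the two questions above. Overall, the takeover task does not feel well designed; I suggest making the task editable so that it can be modified at any time.
I don't quite follow. Does "redeploy" mean deploying with OBD or deploying directly with OCP? And was the OB cluster being taken over deployed with OBD or with OCP?
"Redeploy" means uninstalling OCP and deploying OCP again, and then taking over the OB cluster that was deployed with OBD.
Below is the corresponding error log. It looks like a port check problem: the check looked for port 2883, but the observer port should actually be 2881.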
############{EXECUTE}{2022-05-10T09:38:54.010+08:00}############
2022-05-10 09:38:54.018 INFO 57 --- [pool-subtask-executor-thread-58,49160d6a7acd43bb,833014262b70] c.a.o.c.m.j.model.SubtaskInstanceEntity : Run subtask, id=80, context=Context(parallelIdx=0, stringMap={cluster_version=3.1.2, cluster_name=obtest, target_server_status=RUNNING, ssh_port=22, service_name=obtest:1, target_zone_status=RUNNING, task_instance_id=61, ob_connect_address=xx.209:2883, task_operation=execute, cluster_type=PRIMARY, service_version=3.1.2, ob_cluster_id=1, cluster_id=1, service_type=OB_CLUSTER, ob_data_dir=/ssddata1/ob/data, connection_mode=proxy, target_cluster_status=RUNNING, latest_execution_start_time=2022-05-10T09:38:54.005+08:00, sub_task_instance_id=80, credential_id=1}, listMap={add_region_ids=[1], server_ids=[1, 2, 3, 4, 5, 6], add_idc_ids=[1], all_host_ids=[1, 2, 3, 4, 5, 6], add_host_ids=[1, 2, 3, 4, 5, 6], host_ids=[1, 2, 3, 4, 5, 6], zone_names=[zone2, zone1, zone3]}), executor=192.168.149.209
2022-05-10 09:38:54.059 INFO 57 --- [pool-subtask-executor-thread-58,49160d6a7acd43bb,833014262b70] c.a.ocp.core.task.util.OcpAgentUtils : [OcpAgentUtils.runCmd] svrIp=192.168.149.213, port=62888, user=root, cmd=netstat -tunlp | grep 2883 | grep observer | awk '{print $7}' | awk -F/ '{print $1}'
2022-05-10 09:38:54.103 INFO 57 --- [pool-subtask-executor-thread-58,49160d6a7acd43bb,833014262b70] c.a.ocp.core.task.util.OcpAgentUtils : [OcpAgentUtils.runCmd] result=
2022-05-10 09:38:54.107 ERROR 57 --- [pool-subtask-executor-thread-58,49160d6a7acd43bb,833014262b70] com.alipay.ocp.core.util.ExceptionUtils : Checked Exception: com.alipay.ocp.core.exception.UnexpectedException occurred with code error.ob.cluster.takeover.pid.not.found, and args [2883]
2022-05-10 09:38:54.109 INFO 57 --- [pool-subtask-executor-thread-58,49160d6a7acd43bb,833014262b70] c.a.o.c.m.j.model.SubtaskInstanceEntity : Set state for subtask: 80, current state: RUNNING, new state: FAILED
2022-05-10 09:38:54.112 WARN 57 --- [pool-subtask-executor-thread-58,49160d6a7acd43bb,833014262b70] c.a.ocp.core.job.runner.RunnerFactory : Execute task failed, subtask=SubtaskInstanceEntity{id=80, name=Check observer process user, state=FAILED, operation=EXECUTE, className=com.alipay.ocp.service.task.business.host.CheckObserverProcessUserTask, seriesId=37, startTime=2022-05-10T09:38:54.005+08:00, endTime=2022-05-10T09:38:54.111+08:00}, failedMessage=Can not find observer process with port 2883
com.alipay.ocp.core.exception.UnexpectedException: [OCP UnexpectedException]: status=500 INTERNAL_SERVER_ERROR, errorCode=OB_CLUSTER_TAKEOVER_OBSERVER_PID_NOT_FOUND, args=2883
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[na:1.8.0_312]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[na:1.8.0_312]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[na:1.8.0_312]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[na:1.8.0_312]
at com.alipay.ocp.core.util.ExceptionUtils.newException(ExceptionUtils.java:96) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.core.util.ExceptionUtils.throwException(ExceptionUtils.java:90) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.core.util.ExceptionUtils.unExpected(ExceptionUtils.java:77) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.service.task.business.host.CheckObserverProcessUserTask.run(CheckObserverProcessUserTask.java:40) ~[ocp-service-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.core.metadb.job.model.SubtaskInstanceEntity.run(SubtaskInstanceEntity.java:216) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.core.job.runner.JavaTaskRunner.doExecute(JavaTaskRunner.java:26) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.core.job.runner.JavaTaskRunner.run(JavaTaskRunner.java:20) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.core.job.runner.RunnerFactory.doRun(RunnerFactory.java:103) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.core.job.runner.RunnerFactory.redirectOutputIfNotSysSchedule(RunnerFactory.java:147) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.core.job.runner.RunnerFactory.run(RunnerFactory.java:92) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at com.alipay.ocp.core.job.coordinator.worker.subtask.ReadySubtaskWorker.lambda$submitTask$2(ReadySubtaskWorker.java:123) ~[ocp-core-3.1.1-20210916.jar!/:3.1.1-20210916]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_312]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_312]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_312]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_312]
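One way to read the EXECUTE block above: the subtask context carries connection_mode=proxy and ob_connect_address=xx.209:2883, and the failing check is then run against port 2883. That suggests, though the log alone does not prove it, that the checked port is taken from the connect address entered at takeover. A minimal sketch for pulling those fields out of such a log line follows; `parse_string_map` and `connect_port` are hypothetical helper names, not part of OCP.

```python
import re


def parse_string_map(log_line):
    """Extract the key=value pairs of the `stringMap={...}` section of a subtask log line."""
    match = re.search(r"stringMap=\{(.*?)\}", log_line)
    if match is None:
        return {}
    result = {}
    for item in match.group(1).split(", "):
        if "=" in item:
            key, value = item.split("=", 1)
            result[key] = value
    return result


def connect_port(context):
    """Return the port of ob_connect_address as an int, or None if it is absent."""
    address = context.get("ob_connect_address", "")
    _host, sep, port = address.rpartition(":")
    return int(port) if sep and port.isdigit() else None


if __name__ == "__main__":
    # Shortened copy of the context from the log above.
    line = ("Run subtask, id=80, context=Context(parallelIdx=0, stringMap={"
            "cluster_name=obtest, ob_connect_address=xx.209:2883, "
            "connection_mode=proxy, credential_id=1}, listMap={host_ids=[1, 2]})")
    ctx = parse_string_map(line)
    print(ctx["connection_mode"], connect_port(ctx))
```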
############{RETRY}{2022-05-10T09:39:42.992+08:00}############
2022-05-10 09:39:43.000 INFO 57 --- [pool-subtask-executor-thread-17,49160d6a7acd43bb,5a2e00ca06c9] c.a.o.c.m.j.model.SubtaskInstanceEntity : Retry subtask, id=80, context=Context(parallelIdx=0, stringMap={cluster_version=3.1.2, cluster_name=obtest, target_server_status=RUNNING, ssh_port=22, service_name=obtest:1, target_zone_status=RUNNING, task_instance_id=61, ob_connect_address=xx.209:2883, task_operation=retry, cluster_type=PRIMARY, service_version=3.1.2, ob_cluster_id=1, cluster_id=1, service_type=OB_CLUSTER, ob_data_dir=/ssddata1/ob/data, connection_mode=proxy, target_cluster_status=RUNNING, latest_execution_start_time=2022-05-10T09:39:42.986+08:00, sub_task_instance_id=80, credential_id=1}, listMap={add_region_ids=[1], server_ids=[1, 2, 3, 4, 5, 6], add_idc_ids=[1], all_host_ids=[1, 2, 3, 4, 5, 6], add_host_ids=[1, 2, 3, 4, 5, 6], host_ids=[1, 2, 3, 4, 5, 6], zone_names=[zone2, zone1, zone3]}), executor=192.168.149.209
2022-05-10 09:39:43.039 INFO 57 --- [pool-subtask-executor-thread-17,49160d6a7acd43bb,5a2e00ca06c9] c.a.ocp.core.task.util.OcpAgentUtils : [OcpAgentUtils.runCmd] svrIp=192.168.149.213, port=62888, user=root, cmd=netstat -tunlp | grep 2883 | grep observer | awk '{print $7}' | awk -F/ '{print $1}'
2022-05-10 09:39:43.082 INFO 57 --- [pool-subtask-executor-thread-17,49160d6a7acd43bb,5a2e00ca06c9] c.a.ocp.core.task.util.OcpAgentUtils : [OcpAgentUtils.runCmd] result=
2022-05-10 09:39:43.086 ERROR 57 --- [pool-subtask-executor-thread-17,49160d6a7acd43bb,5a2e00ca06c9] com.alipay.ocp.core.util.ExceptionUtils : Checked Exception: com.alipay.ocp.core.exception.UnexpectedException occurred with code error.ob.cluster.takeover.pid.not.found, and args [2883]
2022-05-10 09:39:43.087 INFO 57 --- [pool-subtask-executor-thread-17,49160d6a7acd43bb,5a2e00ca06c9] c.a.o.c.m.j.model.SubtaskInstanceEntity : Set state for subtask: 80, current state: RUNNING, new state: FAILED
2022-05-10 09:39:43.090 WARN 57 --- [pool-subtask-executor-thread-17,49160d6a7acd43bb,5a2e00ca06c9] c.a.ocp.core.job.runner.RunnerFactory : Execute task failed, subtask=SubtaskInstanceEntity{id=80, name=Check observer process user, state=FAILED, operation=RETRY, className=com.alipay.ocp.service.task.business.host.CheckObserverProcessUserTask, seriesId=37, startTime=2022-05-10T09:39:42.987+08:00, endTime=2022-05-10T09:39:43.089+08:00}, failedMessage=Can not find observer process with port 2883
com.alipay.ocp.core.exception.UnexpectedException: [OCP UnexpectedException]: status=500 INTERNAL_SERVER_ERROR, errorCode=OB_CLUSTER_TAKEOVER_OBSERVER_PID_NOT_FOUND, args=2883
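To see why the check in the log returns an empty result, the pipeline `netstat -tunlp | grep 2883 | grep observer | awk '{print $7}' | awk -F/ '{print $1}'` can be mimicked in a few lines. The sketch below is an approximation, and the sample netstat output is invented for illustration (the PIDs and the obproxy line are assumptions, not taken from the reporter's hosts). It assumes the common layout in which observer listens on 2881 and 2882 while 2883 belongs to a proxy process.

```python
def find_pid_by_port(netstat_output, port, process_name="observer"):
    """Approximate: netstat -tunlp | grep <port> | grep <name> | awk '{print $7}' | awk -F/ '{print $1}'."""
    pids = []
    for line in netstat_output.splitlines():
        # Both greps are plain substring matches on the whole line.
        if str(port) not in line or process_name not in line:
            continue
        fields = line.split()
        if len(fields) >= 7:
            # Field 7 looks like "PID/program name"; keep only the PID.
            pids.append(fields[6].split("/")[0])
    return pids


# Invented sample output, for illustration only.
SAMPLE = """\
tcp        0      0 0.0.0.0:2881            0.0.0.0:*               LISTEN      12345/observer
tcp        0      0 0.0.0.0:2882            0.0.0.0:*               LISTEN      12345/observer
tcp        0      0 0.0.0.0:2883            0.0.0.0:*               LISTEN      23456/obproxy
"""

if __name__ == "__main__":
    print(find_pid_by_port(SAMPLE, 2883))  # no observer line matches this port
    print(find_pid_by_port(SAMPLE, 2881))
```

With this sample, the lookup on 2883 finds nothing because the only listener on that port is not an observer process, which matches the empty `result=` in the log, while the lookup on 2881 finds the observer PID.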
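Before taking over the cluster with OCP, could you please provide a screenshot of the obd cluster check4ocp verification?
That command does not exist.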