[Creative Workshop] Trying Out Dify x OceanBase to Build an "Application-Grade" AI Assistant

A few days ago I wrote a blog post, "Trying Out OceanBase 4.3.3 to Build a Black Myth: Wukong Intelligent Game Assistant", which briefly explained the relationship between vector databases and large AI models and walked through the AI Workshop experiment.

After that experiment was published, many users tried it out and shared their experiences on the forum and in blog posts. A couple of days ago I saw a user write that it still wasn't enough and ask us to keep polishing the lab.

So my colleagues put together a "Dify x OceanBase" lab plan: building on OceanBase's vector capabilities and using Dify, an open-source LLM application development platform, users can build and operate generative AI applications simply by dragging and dropping in a web UI. Compared with the last experiment, a few things change (a small OceanBase-side prep sketch follows the list):

  • The personal test environment only needs Docker, so setup is easy.
  • The RAG bot can be customized, and multiple knowledge bases can be managed through the web UI.
  • Dify x OceanBase can be used to build many kinds of AI applications, far beyond a RAG chatbot, so there is plenty more to explore.
  • In short: it's ready to open for business (not just for personal use; it can even go to "production" and serve a whole team).
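
As a reference, here is a minimal sketch of the OceanBase-side preparation, assuming obclient is already connected to a MySQL-mode tenant. The database name dify_meta_db matches what I use later in this thread; the lab guide remains the authoritative source for the exact steps.

-- Hypothetical prep sketch: create an empty database for Dify's metadata.
-- On first startup, Dify's Docker migration creates its metadata tables
-- (72 of them in the version used in this thread) inside this database.
CREATE DATABASE IF NOT EXISTS dify_meta_db DEFAULT CHARACTER SET utf8mb4;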

Today I found some time to try the new setup and used Dify x OceanBase to build an "application-grade" AI assistant.

For details, see "Trying Out Dify x OceanBase to Build an Application-Grade AI Assistant". Everyone is welcome to read it, try it out, and complain about it!

7 Likes

The lab guide is here: the dify@oceanbase-workshop lab doc. Give it a try~

Finally, let me recommend two more good things:


  • The other is OBCloud, which you can use free for a year. No real-name verification is required, and as long as the promotion is still running you can sign up again with a family member's phone number for another year when yours expires. I'll be using it as my feature-testing environment for the next year anyway; it would honestly be a waste not to take advantage of it~


6 Likes

That was fast! Support!!

6 Likes

Awesome, I'll study this.

7 Likes

Not sure I fully understand it, but it sounds impressive! Support!

6 Likes

I'll study this.

5 Likes

Really impressive, very nicely done!

5 Likes

Hi, a quick question. Last night, at the very last step, the bot's reply failed with an application error. Why would that be? Everything before that worked fine. I went through the lab three times last night. The first time it reported a missing table, so I redid it and then checked in OceanBase Cloud how many tables there were and whether the table from the error had been created: 69 tables in total, and that table did exist. After that, the bot's replies still failed with an application error or "server overloaded", error code 500, on a machine with 16 GB of RAM. Dify's log view also reported that the messages table was missing, so the platform log messages couldn't be displayed either, which would mean there should be 70 tables. Which step did I get wrong?

6 Likes

Ha, the metadata database I created in OceanBase is called dify_meta_db, and show tables lists 72 metadata tables there. @与义, could you help take a look at this?

obclient> use dify_meta_db
Database changed

obclient> show tables;
+-----------------------------------+
| Tables_in_dify_meta_db            |
+-----------------------------------+
| account_integrates                |
| accounts                          |
| alembic_version                   |
| api_based_extensions              |
| api_requests                      |
| api_tokens                        |
| app_annotation_hit_histories      |
| app_annotation_settings           |
| app_dataset_joins                 |
| app_model_configs                 |
| apps                              |
| celery_taskmeta                   |
| celery_tasksetmeta                |
| conversations                     |
| data_source_api_key_auth_bindings |
| data_source_oauth_bindings        |
| dataset_collection_bindings       |
| dataset_keyword_tables            |
| dataset_permissions               |
| dataset_process_rules             |
| dataset_queries                   |
| dataset_retriever_resources       |
| datasets                          |
| dify_setups                       |
| document_segments                 |
| documents                         |
| embeddings                        |
| end_users                         |
| external_knowledge_apis           |
| external_knowledge_bindings       |
| installed_apps                    |
| invitation_codes                  |
| load_balancing_model_configs      |
| message_agent_thoughts            |
| message_annotations               |
| message_chains                    |
| message_feedbacks                 |
| message_files                     |
| messages                          |
| operation_logs                    |
| pinned_conversations              |
| provider_model_settings           |
| provider_models                   |
| provider_orders                   |
| providers                         |
| recommended_apps                  |
| saved_messages                    |
| sites                             |
| tag_bindings                      |
| tags                              |
| tenant_account_joins              |
| tenant_default_models             |
| tenant_preferred_model_providers  |
| tenants                           |
| tidb_auth_bindings                |
| tool_api_providers                |
| tool_builtin_providers            |
| tool_conversation_variables       |
| tool_files                        |
| tool_label_bindings               |
| tool_model_invokes                |
| tool_providers                    |
| tool_published_apps               |
| tool_workflow_providers           |
| trace_app_config                  |
| upload_files                      |
| whitelists                        |
| workflow_app_logs                 |
| workflow_conversation_variables   |
| workflow_node_executions          |
| workflow_runs                     |
| workflows                         |
+-----------------------------------+
72 rows in set (0.00 sec)

5 Likes

What exactly is the error? Under normal circumstances 72 tables are created. When the migration finishes you may see Redis-related errors; those come from Redis lock timeouts, and as long as the migration is reported as successful, the Redis errors can be safely ignored.
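
For a quick sanity check, a count against information_schema should come back as 72 once the migration has completed successfully (replace the database name with your own metadata database):

-- Should return 72 after a successful Dify migration
SELECT COUNT(*) AS table_count
  FROM information_schema.tables
 WHERE table_schema = 'dify_meta_db';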

5 Likes

Impressive

3 Likes

The database account is a superuser. I reproduced the issue this morning in a temporary environment (partial logs below), following the steps exactly.
16 GB of RAM; the migration reports a failure and some tables are missing.

mysql> select user();
+---------------------------+
| user()                    |
+---------------------------+
| centenary2@xxxxxx         |
+---------------------------+
1 row in set (0.07 sec)

mysql> use gz;
Database changed
mysql> select database();
+------------+
| database() |
+------------+
| gz         |
+------------+
1 row in set (0.14 sec)

mysql> show tables;
+-----------------------------------+
| Tables_in_gz                      |
+-----------------------------------+
| account_integrates                |
| accounts                          |
| alembic_version                   |
| api_based_extensions              |
| api_requests                      |
| api_tokens                        |
| app_annotation_hit_histories      |
| app_annotation_settings           |
| app_dataset_joins                 |
| app_model_configs                 |
| apps                              |
| celery_taskmeta                   |
| celery_tasksetmeta                |
| conversations                     |
| data_source_api_key_auth_bindings |
| data_source_oauth_bindings        |
| dataset_collection_bindings       |
| dataset_keyword_tables            |
| dataset_permissions               |
| dataset_process_rules             |
| dataset_queries                   |
| dataset_retriever_resources       |
| datasets                          |
| dify_setups                       |
| document_segments                 |
| documents                         |
| embeddings                        |
| end_users                         |
| external_knowledge_apis           |
| external_knowledge_bindings       |
| installed_apps                    |
| invitation_codes                  |
| load_balancing_model_configs      |
| message_agent_thoughts            |
| message_chains                    |
| message_feedbacks                 |
| message_files                     |
| operation_logs                    |
| pinned_conversations              |
| provider_model_settings           |
| provider_models                   |
| provider_orders                   |
| providers                         |
| recommended_apps                  |
| saved_messages                    |
| sites                             |
| tag_bindings                      |
| tags                              |
| tenant_account_joins              |
| tenant_default_models             |
| tenant_preferred_model_providers  |
| tenants                           |
| tidb_auth_bindings                |
| tool_api_providers                |
| tool_builtin_providers            |
| tool_conversation_variables       |
| tool_files                        |
| tool_label_bindings               |
| tool_model_invokes                |
| tool_providers                    |
| tool_workflow_providers           |
| trace_app_config                  |
| upload_files                      |
| whitelists                        |
| workflow_app_logs                 |
| workflow_conversation_variables   |
| workflow_node_executions          |
| workflow_runs                     |
| workflows                         |
+-----------------------------------+
69 rows in set (0.14 sec)

mysql> show grants;
+------------------------------------------------------------------------------------------------------------------------------------------------+
| Grants for centenary2@%                                                                                                                        |
+------------------------------------------------------------------------------------------------------------------------------------------------+
| GRANT ALTER, CREATE, DELETE, DROP, INSERT, UPDATE, SELECT, INDEX, CREATE VIEW, SHOW VIEW, ALTER ROUTINE, CREATE ROUTINE ON *.* TO 'centenary2' |
+------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.07 sec)

docker logs -f docker-api-1 output:

nohup: 忽略输入
Running migrations
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
/app/api/.venv/lib/python3.12/site-packages/tencentcloud/hunyuan/v20230901/models.py:5585: SyntaxWarning: invalid escape sequence '\_'
  """function名称,只能包含a-z,A-Z,0-9,\_或-
/app/api/.venv/lib/python3.12/site-packages/jieba/__init__.py:44: SyntaxWarning: invalid escape sequence '\.'
  re_han_default = re.compile("([\u4E00-\u9FD5a-zA-Z0-9+#&\._%\-]+)", re.U)
/app/api/.venv/lib/python3.12/site-packages/jieba/__init__.py:46: SyntaxWarning: invalid escape sequence '\s'
  re_skip_default = re.compile("(\r\n|\s)", re.U)
/app/api/.venv/lib/python3.12/site-packages/jieba/finalseg/__init__.py:78: SyntaxWarning: invalid escape sequence '\.'
  re_skip = re.compile("([a-zA-Z0-9]+(?:\.\d+)?%?)")
/app/api/.venv/lib/python3.12/site-packages/jieba/posseg/__init__.py:16: SyntaxWarning: invalid escape sequence '\.'
  re_skip_detail = re.compile("([\.0-9]+|[a-zA-Z0-9]+)")
/app/api/.venv/lib/python3.12/site-packages/jieba/posseg/__init__.py:17: SyntaxWarning: invalid escape sequence '\.'
  re_han_internal = re.compile("([\u4E00-\u9FD5a-zA-Z0-9+#&\._]+)")
/app/api/.venv/lib/python3.12/site-packages/jieba/posseg/__init__.py:18: SyntaxWarning: invalid escape sequence '\s'
  re_skip_internal = re.compile("(\r\n|\s)")
/app/api/.venv/lib/python3.12/site-packages/jieba/posseg/__init__.py:21: SyntaxWarning: invalid escape sequence '\.'
  re_num = re.compile("[\.0-9]+")
2024-12-04 09:18:45.964 INFO [pre_load_builtin_providers_cache] [font_manager.py:1578] - generated new fontManager
Preparing database migration...
Database migration skipped
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
[2024-12-04 09:19:19 +0000] [1] [INFO] Starting gunicorn 22.0.0
[2024-12-04 09:19:19 +0000] [1] [INFO] Listening at: http://0.0.0.0:5001 (1)
[2024-12-04 09:19:19 +0000] [1] [INFO] Using worker: gevent
[2024-12-04 09:19:19 +0000] [40] [INFO] Booting worker with pid: 40

New log entries after running the bot:

2024-12-04 09:54:08.188 ERROR [Dummy-2] [completion.py:147] - internal server error.
Traceback (most recent call last):
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 941, in do_execute
    cursor.execute(statement, parameters)
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/cursors.py", line 153, in execute
    result = self._query(query)
             ^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/connections.py", line 563, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/connections.py", line 825, in _read_query_result
    result.read()
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/connections.py", line 1199, in read
    first_packet = self.connection._read_packet()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/connections.py", line 775, in _read_packet
    packet.raise_for_error()
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/protocol.py", line 219, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/err.py", line 150, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.ProgrammingError: (1146, "Table 'gz.messages' doesn't exist")


...


sqlalchemy.exc.ProgrammingError: (pymysql.err.ProgrammingError) (1146, "Table 'gz.messages' doesn't exist")
[SQL: INSERT INTO messages (id, app_id, model_provider, model_id, override_model_configs, conversation_id, inputs, query, message, message_tokens, message_uni
t_price, message_price_unit, answer, answer_tokens, answer_unit_price, answer_price_unit, parent_message_id, provider_response_latency, total_price, currency,
 status, error, message_metadata, invoke_from, from_source, from_end_user_id, from_account_id, workflow_run_id) VALUES (%(id)s, %(app_id)s, %(model_provider)s
, %(model_id)s, %(override_model_configs)s, %(conversation_id)s, %(inputs)s, %(query)s, %(message)s, %(message_tokens)s, %(message_unit_price)s, %(message_pri
ce_unit)s, %(answer)s, %(answer_tokens)s, %(answer_unit_price)s, %(answer_price_unit)s, %(parent_message_id)s, %(provider_response_latency)s, %(total_price)s,
 %(currency)s, %(status)s, %(error)s, %(message_metadata)s, %(invoke_from)s, %(from_source)s, %(from_end_user_id)s, %(from_account_id)s, %(workflow_run_id)s)]
[parameters: {'id': '373861f7-b668-489e-95cc-79c1b1f35b49', 'app_id': 'cfb99d2e-4db0-4f1c-aea6-82208eae5537', 'model_provider': 'tongyi', 'model_id': 'qwen-tu
rbo-2024-11-01', 'override_model_configs': '{"pre_prompt": "", "retriever_resource": {"enabled": true}, "completion_prompt_config": {}, "opening_statement": "
", "user_input_form": [], "agent_mo ... (1203 characters truncated) ... ble": true, "retrieval_model": "multiple", "datasets": {"datasets": [{"dataset": {"ena
bled": true, "id": "11c4d3db-9c82-44d9-8d6a-c6fba4bb36c1"}}]}}}', 'conversation_id': '60f534be-5851-4b26-9d5d-a82be36b0b2c', 'inputs': '{}', 'query': '请介绍
一下 OceanBase 的向量功能', 'message': '""', 'message_tokens': 0, 'message_unit_price': 0, 'message_price_unit': 0, 'answer': '', 'answer_tokens': 0, 'answer_unit_price': 0, 'answer_price_unit': 0, 'parent_message_id': None, 'provider_response_latency': 0, 'total_price': 0, 'currency': 'USD', 'status': 'normal', 'error': None, 'message_metadata': None, 'invoke_from': 'debugger', 'from_source': 'console', 'from_end_user_id': None, 'from_account_id': '7e79f1bf-fdb6-443c-9ed6-0a0ab06aa957', 'workflow_run_id': None}]


...
werkzeug.exceptions.InternalServerError: 500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

docker logs -f docker-worker-1 output:

nohup: 忽略输入
Running migrations
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
/app/api/.venv/lib/python3.12/site-packages/tencentcloud/hunyuan/v20230901/models.py:5585: SyntaxWarning: invalid escape sequence '\_'
  """function名称,只能包含a-z,A-Z,0-9,\_或-
/app/api/.venv/lib/python3.12/site-packages/jieba/__init__.py:44: SyntaxWarning: invalid escape sequence '\.'
  re_han_default = re.compile("([\u4E00-\u9FD5a-zA-Z0-9+#&\._%\-]+)", re.U)
/app/api/.venv/lib/python3.12/site-packages/jieba/__init__.py:46: SyntaxWarning: invalid escape sequence '\s'
  re_skip_default = re.compile("(\r\n|\s)", re.U)
/app/api/.venv/lib/python3.12/site-packages/jieba/finalseg/__init__.py:78: SyntaxWarning: invalid escape sequence '\.'
  re_skip = re.compile("([a-zA-Z0-9]+(?:\.\d+)?%?)")
/app/api/.venv/lib/python3.12/site-packages/jieba/posseg/__init__.py:16: SyntaxWarning: invalid escape sequence '\.'
  re_skip_detail = re.compile("([\.0-9]+|[a-zA-Z0-9]+)")
/app/api/.venv/lib/python3.12/site-packages/jieba/posseg/__init__.py:17: SyntaxWarning: invalid escape sequence '\.'
  re_han_internal = re.compile("([\u4E00-\u9FD5a-zA-Z0-9+#&\._]+)")
/app/api/.venv/lib/python3.12/site-packages/jieba/posseg/__init__.py:18: SyntaxWarning: invalid escape sequence '\s'
  re_skip_internal = re.compile("(\r\n|\s)")
/app/api/.venv/lib/python3.12/site-packages/jieba/posseg/__init__.py:21: SyntaxWarning: invalid escape sequence '\.'
  re_num = re.compile("[\.0-9]+")
2024-12-04 09:18:45.962 INFO [pre_load_builtin_providers_cache] [font_manager.py:1578] - generated new fontManager
Preparing database migration...
Starting database migration.
INFO  [alembic.runtime.migration] Context impl MySQLImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 01d6889832f7, snapshot
ERROR [root] Failed to execute database migration
Traceback (most recent call last):
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 941, in do_execute
    cursor.execute(statement, parameters)
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/cursors.py", line 153, in execute
    result = self._query(query)
             ^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/connections.py", line 563, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/connections.py", line 825, in _read_query_result
    result.read()
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/connections.py", line 1199, in read
    first_packet = self.connection._read_packet()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/connections.py", line 775, in _read_packet
    packet.raise_for_error()
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/protocol.py", line 219, in raise_for_error
    err.raise_mysql_exception(self._data)
  File "/app/api/.venv/lib/python3.12/site-packages/pymysql/err.py", line 150, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.OperationalError: (1142, "REFERENCES command denied to user 'centenary2'@'%' for table 'conversations'")
...

sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1050, "Table 'account_integrates' already exists")
[SQL:
CREATE TABLE account_integrates (
        id CHAR(36) NOT NULL,
        account_id CHAR(36) NOT NULL,
        provider VARCHAR(16) NOT NULL,
        open_id VARCHAR(255) NOT NULL,
        encrypted_token VARCHAR(255) NOT NULL,
        created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP(0),
        updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP(0),
        CONSTRAINT account_integrate_pkey PRIMARY KEY (id),
        CONSTRAINT unique_account_provider UNIQUE (account_id, provider),
        CONSTRAINT unique_provider_open_id UNIQUE (provider, open_id)
)

]
(Background on this error at: https://sqlalche.me/e/20/e3q8)
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
/app/api/.venv/lib/python3.12/site-packages/celery/platforms.py:829: SecurityWarning: You're running the worker with superuser privileges: this is
absolutely not recommended!


...



2 Likes

These errors are a consequence of the earlier migration not completing successfully. I'd suggest dropping the current database and creating a fresh one to run the migration against. Also, the "REFERENCES command denied to user" message in the error log shows that your user lacks the required privilege; if your OBCloud instance was created back during the previous AI Workshop, create a new superuser and connect with that instead.

Summary of the fix (a SQL sketch follows the list):

  1. Create a new superuser (different from the centenary2 user); the new user will have the REFERENCES privilege.
  2. Drop the database that contains the partially created tables, and create a new database to connect to.
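
For reference, a minimal SQL sketch of the two steps above; the user name, password placeholder, and new database name are illustrative only, and on OBCloud the new superuser is normally created through the console rather than via SQL:

-- 1. Create a new user with full privileges (which include REFERENCES),
--    if you create it via SQL instead of the OBCloud console.
CREATE USER 'dify_new'@'%' IDENTIFIED BY '<your-password>';
GRANT ALL PRIVILEGES ON *.* TO 'dify_new'@'%';

-- 2. Drop the partially migrated database (gz in the logs above) and
--    create a clean one for the migration to run against.
DROP DATABASE IF EXISTS gz;
CREATE DATABASE dify_meta_db_new DEFAULT CHARACTER SET utf8mb4;
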
5 Likes

All hail the great 与义~

OBCloud used to have an issue where previously created users did not have the privilege to create foreign keys. Since several tables in the metadata database now carry foreign keys, those tables could not be created.

Current OBCloud has already fixed this, so the problem can be solved by creating a new user~
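
If you want to double-check before re-running the migration, the new user's grants should now include REFERENCES (or ALL PRIVILEGES); the user name below is the illustrative one from the sketch earlier in this thread:

-- Unlike the centenary2 grants shown above, the output here should list
-- REFERENCES (or ALL PRIVILEGES) ON *.*
SHOW GRANTS FOR 'dify_new'@'%';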

3 Likes

Got it, thank you! It is indeed the superuser that was created on OBCloud back during the previous AI Workshop. I did see the privilege error last night, but then I figured it was a superuser created at that time, so the error seemed a bit odd to me. I'll find some time to redo the lab. Thanks again!

2 Likes

Thanks! The issue has been resolved, and I'm putting together a write-up of the pitfalls I hit.

2 Likes

Thanks for sharing...

2 Likes

Really nice, I'll come study this.

2 Likes

:+1: :+1: :+1: 666

2 Likes

Really awesome

1 Like