Spring AI 系列之三十六 - Spring AI Alibaba-nl2sql_Alibaba-nl2sql使用指南
之前做个几个大模型的应用,都是使用Python语言,后来有一个项目使用了Java,并使用了Spring AI框架。随着Spring AI不断地完善,最近它发布了1.0正式版,意味着它已经能很好的作为企业级生产环境的使用。对于Java开发者来说真是一个福音,其功能已经能满足基于大模型开发企业级应用。借着这次机会,给大家分享一下Spring AI框架。
注意:由于框架不同版本改造会有些使用的不同,因此本次系列中使用基本框架是 Spring AI-1.0.0,JDK版本使用的是19,Spring-AI-Alibaba-1.0.0.3-SNAPSHOT。
代码参考: https://github.com/forever1986/springai-study
目录
- 1 spring-ai-alibaba-starter-nl2sql
-
- 1.1 包的结构说明
- 1.2 底层原理
- 2 示例演示
-
- 2.1 前提准备
- 2.2 代码实现
- 2.3 演示效果
前几章讲了Spring AI Alibaba的Graph框架,它是构建智能体、工作流的基本框架。但是前几章都是在讲Graph一些常见的使用节点和流程,对于一个真正智能体来说,可能还需要更复杂的实现。这一章再来将一个Spring AI Alibaba基于Graph的扩展实现的框架:nl2sql 。在实际业务中,往往有些数据存储在关系型数据库中,而需要通过大模型自己将用户的问题生成SQL语句去查询数据库,这就是 nl2sql 的功能,在Spring AI Alibaba中就是通过Graph实现了该功能。
1 spring-ai-alibaba-starter-nl2sql
1.1 包的结构说明
spring-ai-alibaba-starter-nl2sql :自然语言到 SQL 转换组件。
<dependency> <groupId>com.alibaba.cloud.ai</groupId> <artifactId>spring-ai-alibaba-starter-nl2sql</artifactId></dependency>
当引入该组件时,可以看到其3个主要的jar,其中nl2sql-chat是最主要的实现逻辑,其中Nl2sqlConfiguration是最主要的配置:
1.2 底层原理
从上图中知道Nl2sqlConfiguration是配置的主要点,现在来看看Nl2sqlConfiguration的源码:
@Configurationpublic class Nl2sqlConfiguration {// 主要的nl2sql的服务类@Autowired@Qualifier(\"nl2SqlServiceImpl\")private BaseNl2SqlService nl2SqlService;// 将数据库的schema转换存入到向量数据库@Autowired@Qualifier(\"schemaServiceImpl\")private BaseSchemaService schemaService;// 关系型数据库的连接@Autowiredprivate DbAccessor dbAccessor;// 关系型数据库的配置(目前支持mysql和postgrSql)@Autowiredprivate DbConfig dbConfig;// 构建图的关键@Beanpublic StateGraph nl2sqlGraph(ChatClient.Builder chatClientBuilder) throws GraphStateException {// 这里就不贴图的构建过程}}
在Nl2sqlConfiguration的nl2sqlGraph方法中构建了一个StateGraph图,从前面知道StateGraph就是构建智能体的框架,因此nl2sqlGraph就是一个将自然语言转换为可执行SQL的智能体,其流程结构如下图:
下面说一下上面StateGraph图的各个节点作用:
1)START : 开始节点
2)QUERY_REWRITE_NODE :查询重写。问题重写与意图澄清,提升意图理解准确性。完成后需要通过condition校验:
- 校验失败:直接跳转到 END 节点,终止流程
- 校验成功:则进入KEVNORD_EXTRACT_NODE 节点
3)KEVNORD_EXTRACT_NODE :关键词提取。关键词、实体、时间等信息抽取,为后续 Schema 召回做准备。
4)SCHEMA_RECALL_NODE :模式召回。根据关键词和意图,召回相关表、字段、关系等数据库 Schema 信息。
5)TABLE_RELATION_NODE:表关系解析。推理表与字段间的关系,自动补全 Join、外键等复杂结构。
6)SQL_GENERATE_NODE :生成 SQL 语句。通过condition的校验:
- 生成成功且无需验证:直接输出结果(END)
- 生成成功: 进入 SQL_VALIDATE_NODE 节点
- 生成失败:回退至KEVNORD_EXTRACT_NODE 节点,重新进行关键词提取
7)SQL_VALIDATE_NODE:校验 SQL 语句。通过condition校验:
- 校验失败: 返回SQL_GENERATE_NODE节点,重新生成SQL
- 校验通过: 进入SEMANTIC_CONSISTENC_NODE 节点,进行语义一致性校验
8)SEMANTIC_CONSISTENC_NODE:语义一致性校验。比对SQL逻辑与原始查询意图的吻合度。经condition校验:·
- 校验不一致:返回SQL_GENERATE_NODESQL 节点,重新生成
- 校验一致:直接跳转到 END 节点,终止流程
9)END:结束节点
从上面可以知道Spring AI Alibaba-nl2sql的底层原理就是使用StateGraph构建一个智能体,下面就通过一个示例演示如何使用
2 示例演示
代码参考lesson27子模块
示例说明:通过创建用户表、商品表、订单表、订单明细表、商品分类表,然后通过问题生成对应的SQL
2.1 前提准备
1)准备mysql数据库,创建一个springai数据库,并执行一下初始化语句
-- 用户表CREATE TABLE users ( id INT PRIMARY KEY AUTO_INCREMENT COMMENT \'用户ID,主键自增\', username VARCHAR(50) NOT NULL COMMENT \'用户名\', email VARCHAR(100) NOT NULL COMMENT \'用户邮箱\', created_at DATETIME DEFAULT CURRENT_TIMESTAMP COMMENT \'用户注册时间\') COMMENT=\'用户表\';-- 商品表CREATE TABLE products ( id INT PRIMARY KEY AUTO_INCREMENT COMMENT \'商品ID,主键自增\', name VARCHAR(100) NOT NULL COMMENT \'商品名称\', price DECIMAL(10,2) NOT NULL COMMENT \'商品单价\', stock INT NOT NULL COMMENT \'商品库存数量\', created_at DATETIME DEFAULT CURRENT_TIMESTAMP COMMENT \'商品上架时间\') COMMENT=\'商品表\';-- 订单表CREATE TABLE orders ( id INT PRIMARY KEY AUTO_INCREMENT COMMENT \'订单ID,主键自增\', user_id INT NOT NULL COMMENT \'下单用户ID\', order_date DATETIME DEFAULT CURRENT_TIMESTAMP COMMENT \'下单时间\', total_amount DECIMAL(10,2) NOT NULL COMMENT \'订单总金额\', status VARCHAR(20) DEFAULT \'pending\' COMMENT \'订单状态(pending/completed/cancelled等)\', FOREIGN KEY (user_id) REFERENCES users(id)) COMMENT=\'订单表\';-- 订单明细表CREATE TABLE order_items ( id INT PRIMARY KEY AUTO_INCREMENT COMMENT \'订单明细ID,主键自增\', order_id INT NOT NULL COMMENT \'订单ID\', product_id INT NOT NULL COMMENT \'商品ID\', quantity INT NOT NULL COMMENT \'购买数量\', unit_price DECIMAL(10,2) NOT NULL COMMENT \'下单时商品单价\', FOREIGN KEY (order_id) REFERENCES orders(id), FOREIGN KEY (product_id) REFERENCES products(id)) COMMENT=\'订单明细表\';-- 商品分类表CREATE TABLE categories ( id INT PRIMARY KEY AUTO_INCREMENT COMMENT \'分类ID,主键自增\', name VARCHAR(50) NOT NULL COMMENT \'分类名称\') COMMENT=\'商品分类表\';-- 商品-分类关联表(多对多)CREATE TABLE product_categories ( product_id INT NOT NULL COMMENT \'商品ID\', category_id INT NOT NULL COMMENT \'分类ID\', PRIMARY KEY (product_id, category_id), FOREIGN KEY (product_id) REFERENCES products(id), FOREIGN KEY (category_id) REFERENCES categories(id)) COMMENT=\'商品与分类关联表\';-- 用户表插入数据INSERT INTO users (username, email) VALUES(\'alice\', \'alice@example.com\'),(\'bob\', \'bob@example.com\'),(\'cathy\', \'cathy@example.com\'),(\'daniel\', \'daniel@example.com\'),(\'emily\', \'emily@example.com\');-- 商品分类插入数据INSERT INTO categories (name) VALUES(\'电子产品\'),(\'服装\'),(\'图书\'),(\'家居用品\'),(\'食品\');-- 商品表插入数据INSERT INTO products (name, price, stock) VALUES(\'智能手机\', 2999.00, 100),(\'T恤衫\', 89.00, 500),(\'小说\', 39.00, 200),(\'咖啡机\', 599.00, 50),(\'牛奶\', 15.00, 300),(\'笔记本电脑\', 4999.00, 30),(\'沙发\', 2599.00, 10),(\'巧克力\', 25.00, 100),(\'羽绒服\', 399.00, 80),(\'历史书\', 69.00, 150);-- 商品-分类关联数据INSERT INTO product_categories (product_id, category_id) VALUES(1, 1), -- 智能手机-电子产品(2, 2), -- T恤衫-服装(3, 3), -- 小说-图书(4, 1), (4, 4), -- 咖啡机-电子产品、家居用品(5, 5), -- 牛奶-食品(6, 1), -- 笔记本电脑-电子产品(7, 4), -- 沙发-家居用品(8, 5), -- 巧克力-食品(9, 2), -- 羽绒服-服装(10, 3); -- 历史书-图书-- 订单表插入数据INSERT INTO orders (user_id, total_amount, status, order_date) VALUES(1, 3088.00, \'completed\', \'2025-06-01 10:10:00\'),(2, 39.00, \'pending\', \'2025-06-02 09:23:00\'),(3, 1204.00, \'completed\', \'2025-06-03 13:45:00\'),(4, 65.00, \'cancelled\', \'2025-06-04 16:05:00\'),(5, 5113.00, \'completed\', \'2025-06-05 20:12:00\'),(1, 814.00, \'completed\', \'2025-06-05 21:03:00\'),(2, 424.00, \'pending\', \'2025-06-06 08:10:00\'),(3, 524.00, \'completed\', \'2025-06-06 14:48:00\'),(4, 399.00, \'completed\', \'2025-06-07 10:15:00\'),(5, 129.00, \'pending\', \'2025-06-07 18:00:00\');-- 订单明细表插入数据INSERT INTO order_items (order_id, product_id, quantity, unit_price) VALUES(1, 1, 1, 2999.00),(1, 2, 1, 89.00),(2, 3, 1, 39.00),(3, 4, 2, 599.00),(3, 5, 2, 3.00),(4, 8, 2, 25.00),(4, 5, 1, 15.00),(5, 6, 1, 4999.00),(5, 2, 1, 89.00),(5, 5, 5, 5.00),(5, 8, 1, 25.00),(6, 9, 2, 399.00),(6, 3, 1, 16.00),(7, 2, 2, 89.00),(7, 3, 3, 39.00),(8, 10, 4, 69.00),(9, 9, 1, 399.00),(10, 8, 4, 25.00),(10, 5, 1, 29.00);-- 补充部分商品价格波动或促销模拟-- 注:部分订单明细使用了与商品表不同的 unit_price,模拟促销活动
2)新建lesson27子模块,其pom引入如下:
<properties> <spring-boot.version>3.4.5</spring-boot.version></properties><dependencies> <dependency> <groupId>com.alibaba.cloud.ai</groupId> <artifactId>spring-ai-alibaba-starter-nl2sql</artifactId> </dependency> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-web</artifactId> </dependency></dependencies>
注意:由于之前的示例使用spring-boot是3.3.0版本,而Spring AI Alibaba-nl2sql使用的Spring-boot版本是3.4.5以上,因此这里重新定义一下版本号
2.2 代码实现
1)新建application.properties配置文件
# 聊天模型spring.ai.openai.api-key=你的阿里百炼API KEYspring.ai.openai.model=qwen-max# embedding的api-keyspring.ai.dashscope.api-key=你的阿里百炼API KEY# 数据库配置chatbi.dbconfig.url=jdbc:mysql://127.0.0.1:3306/springai?serverTimezone=UTCchatbi.dbconfig.username=rootchatbi.dbconfig.password=root#chatbi.dbconfig.schema=truechatbi.dbconfig.connection-type=jdbcchatbi.dbconfig.dialect-type=mysql
注意:这里使用的是阿里的千问模型,由于需要将schema进行向量存储,因此还配置dashscope的api-key,是为了加载embededing模型(也是使用阿里线上模型)
2)新建SimpleChatController 演示类:
import com.alibaba.cloud.ai.constant.Constant;import com.alibaba.cloud.ai.dbconnector.DbConfig;import com.alibaba.cloud.ai.graph.CompiledGraph;import com.alibaba.cloud.ai.graph.OverAllState;import com.alibaba.cloud.ai.graph.StateGraph;import com.alibaba.cloud.ai.request.SchemaInitRequest;import com.alibaba.cloud.ai.service.simple.SimpleVectorStoreService;import org.springframework.beans.factory.annotation.Autowired;import org.springframework.web.bind.annotation.GetMapping;import org.springframework.web.bind.annotation.RequestParam;import org.springframework.web.bind.annotation.RestController;import java.util.Arrays;import java.util.Map;import java.util.Optional;@RestControllerpublic class SimpleChatController { @Autowired private StateGraph nl2sqlGraph; @Autowired private SimpleVectorStoreService simpleVectorStoreService; @Autowired private DbConfig dbConfig; @GetMapping(\"/simpleChat\") public String simpleNl2Sql(@RequestParam(value = \"message\", defaultValue = \"查询每个分类下已经成交且销量最高的商品及其销售总量,每个分类只返回销量最高的商品。\", required = true) String message) throws Exception { SchemaInitRequest schemaInitRequest = new SchemaInitRequest(); // 配置数据库 schemaInitRequest.setDbConfig(dbConfig); // 将表名和字段等信息放入向量数据库 schemaInitRequest.setTables(Arrays.asList(\"categories\",\"order_items\",\"orders\",\"products\",\"users\",\"product_categories\")); simpleVectorStoreService.schema(schemaInitRequest); // 执行图 CompiledGraph compiledGraph = nl2sqlGraph.compile(); Optional<OverAllState> invoke = compiledGraph.invoke(Map.of(Constant.INPUT_KEY, message)); // 结果返回 OverAllState overAllState = invoke.get(); return overAllState.value(Constant.RESULT).get().toString(); }}
3)新建Lesson27Application 启动类:
import org.springframework.boot.SpringApplication;import org.springframework.boot.autoconfigure.SpringBootApplication;// 注意:这里加载com.alibaba.cloud.ai包下,是因为Spring AI Alibaba-nl2sql并不是自动配置,因此需要扫描其路径,同时也要扫描自己路径下的包@SpringBootApplication(scanBasePackages = {\"com.alibaba.cloud.ai\",\"com.demo.lesson27\"})public class Lesson27Application { public static void main(String[] args) { SpringApplication.run(Lesson27Application.class, args); }}
2.3 演示效果
1)访问:http://localhost:8080/simpleChat
2)在控制台可以看到执行的日志:(执行日志中有StateGraph 的执行过程)
2025-07-09T17:07:36.106+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.QueryRewriteNode : 进入 QueryRewriteNode 节点2025-07-09T17:07:36.107+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.QueryRewriteNode : [QueryRewriteNode] 处理用户输入: 查询每个分类下已经成交且销量最高的商品及其销售总量,每个分类只返回销量最高的商品。2025-07-09T17:07:41.742+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.QueryRewriteNode : [QueryRewriteNode] 问题重写结果: 查询每个分类下已经成交且销量最高的商品及其销售总量,每个分类只返回销量最高的商品。2025-07-09T17:07:41.742+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.QueryRewriteDispatcher : [QueryRewriteDispatcher]进入KEYWORD_EXTRACT_NODE节点2025-07-09T17:07:41.742+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.KeywordExtractNode : 进入 KeywordExtractNode 节点2025-07-09T17:07:42.991+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.KeywordExtractNode : evidences:[] , keywords: [每个分类, 已经成交, 销量最高的商品, 销售总量]2025-07-09T17:07:42.991+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.KeywordExtractNode : KeywordExtractNode 节点输出 evidences:[] , keywords: [每个分类, 已经成交, 销量最高的商品, 销售总量]2025-07-09T17:07:42.992+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.SchemaRecallNode : 进入 SchemaRecallNode 节点2025-07-09T17:07:43.401+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.SchemaRecallNode : [SchemaRecallNode] Schema召回结果 - 表文档数量: 6, 关键词相关列文档组数: 42025-07-09T17:07:43.403+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.TableRelationNode : 进入 TableRelationNode 节点2025-07-09T17:07:43.404+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.TableRelationNode : [TableRelationNode] 执行常规Schema选择2025-07-09T17:07:44.737+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.cloud.ai.node.TableRelationNode : [TableRelationNode] Schema处理结果: SchemaDTO(name=springai, description=null, tableCount=null, table=[TableDTO(name=categories, description=商品分类表, column=[ColumnDTO(name=name, description=分类名称, enumeration=0, range=null, type=text, samples=null, data=null, mapping=null), ColumnDTO(name=id, description=分类ID,主键自增, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null)], primaryKeys=[id]), TableDTO(name=product_categories, description=商品与分类关联表, column=[ColumnDTO(name=product_id, description=商品ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=category_id, description=分类ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null)], primaryKeys=[product_id]), TableDTO(name=products, description=商品表, column=[ColumnDTO(name=id, description=商品ID,主键自增, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=created_at, description=商品上架时间, enumeration=0, range=null, type=datetime, samples=null, data=null, mapping=null), ColumnDTO(name=price, description=商品单价, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=stock, description=商品库存数量, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=name, description=商品名称, enumeration=0, range=null, type=text, samples=null, data=null, mapping=null)], primaryKeys=[id]), TableDTO(name=order_items, description=订单明细表, column=[ColumnDTO(name=id, description=订单明细ID,主键自增, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=quantity, description=购买数量, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=unit_price, description=下单时商品单价, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=product_id, description=商品ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=order_id, description=订单ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null)], primaryKeys=[id]), TableDTO(name=orders, description=订单表, column=[ColumnDTO(name=user_id, description=下单用户ID, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=id, description=订单ID,主键自增, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=order_date, description=下单时间, enumeration=0, range=null, type=datetime, samples=null, data=null, mapping=null), ColumnDTO(name=total_amount, description=订单总金额, enumeration=0, range=null, type=number, samples=null, data=null, mapping=null), ColumnDTO(name=status, description=订单状态(pending/completed/cancelled等), enumeration=0, range=null, type=text, samples=null, data=null, mapping=null)], primaryKeys=[id])], foreignKeys=[[order_items.order_id=orders.id, product_categories.category_id=categories.id, orders.user_id=users.id, product_categories.product_id=products.id, order_items.product_id=products.id]])2025-07-09T17:07:44.739+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 进入 SqlGenerateNode 节点2025-07-09T17:07:45.548+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 召回信息是否满足需求:是2025-07-09T17:07:45.549+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 开始生成SQL2025-07-09T17:08:07.824+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 生成的SQL为:SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_salesFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = \'completed\'GROUP BY c.id, p.idWITH ROLLUPHAVING category_id IS NOT NULL AND product_id IS NOT NULLORDER BY category_id, total_sales DESCLIMIT 18446744073709551615 OFFSET 02025-07-09T17:08:07.824+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : SqlGenerateNode 节点执行完成2025-07-09T17:08:07.824+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL 生成结果: SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_salesFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = \'completed\'GROUP BY c.id, p.idWITH ROLLUPHAVING category_id IS NOT NULL AND product_id IS NOT NULLORDER BY category_id, total_sales DESCLIMIT 18446744073709551615 OFFSET 02025-07-09T17:08:07.824+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL生成成功,进入SQL校验节点: SQL_VALIDATE_NODE2025-07-09T17:08:07.826+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : 进入 SqlValidateNode 节点2025-07-09T17:08:07.826+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] 开始验证SQL语句: SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_salesFROM categories cJOIN product_categories pc ON c.id = pc.category_idJOIN products p ON pc.product_id = p.idJOIN order_items oi ON p.id = oi.product_idJOIN orders o ON oi.order_id = o.idWHERE o.status = \'completed\'GROUP BY c.id, p.idWITH ROLLUPHAVING category_id IS NOT NULL AND product_id IS NOT NULLORDER BY category_id, total_sales DESCLIMIT 18446744073709551615 OFFSET 02025-07-09T17:08:07.844+08:00 INFO 8276 --- [nio-8080-exec-3] com.alibaba.druid.pool.DruidDataSource : {dataSource-84} inited2025-07-09T17:08:07.850+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] SQL语法验证通过2025-07-09T17:08:07.850+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : SQL语法校验是否通过: true2025-07-09T17:08:07.850+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : [SqlValidateDispatcher] SQL语法校验通过,跳转到节点: SEMANTIC_CONSISTENC_NODE2025-07-09T17:08:07.852+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 进入 SemanticConsistencNode 节点2025-07-09T17:08:28.615+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 语义一致性校验结果详情: 不通过,并附具体原因:1. **核心逻辑验证**:当前SQL在尝试找出每个分类下销量最高的商品,但使用了`GROUP BY c.id, p.id WITH ROLLUP HAVING category_id IS NOT NULL AND product_id IS NOT NULL`的方式来进行分组和过滤,这会导致每个分类下的所有商品都被列出,而不是只列出销量最高的那一个。正确的做法应该是先计算出每个商品的总销量,然后根据分类进行分区,最后筛选出每一分区中销量最高的商品。2. **致命问题**:结果集不准确。由于上述提到的问题,最终输出不会是每个类别中销量最高的单个商品信息,而是该类别中所有商品的信息列表(尽管经过了排序),这意味着未能满足“每个分类只返回销量最高的商品”的需求。为了修正这个问题,可以考虑使用窗口函数(如`ROW_NUMBER()`)来为每个类别的商品按销量排名,再从中选择排名第一的商品。例如:WITH Sales AS ( SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales, ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rank FROM categories c JOIN product_categories pc ON c.id = pc.category_id JOIN products p ON pc.product_id = p.id JOIN order_items oi ON p.id = oi.product_id JOIN orders o ON oi.order_id = o.id WHERE o.status = \'completed\' GROUP BY c.id, p.id)SELECT category_id, category_name, product_id, product_name, total_salesFROM SalesWHERE rank = 1;此修改建议属于解决核心业务逻辑错误所必需的调整,因此原SQL基于审计原则被判定为不通过。2025-07-09T17:08:28.615+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 语义一致性校验结果: false2025-07-09T17:08:28.615+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SemanticConsistenceDispatcher : 语义一致性校验结果: false,跳转节点配置2025-07-09T17:08:28.616+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SemanticConsistenceDispatcher : 语义一致性校验未通过,跳转到SQL生成节点:SQL_GENERATE_NODE2025-07-09T17:08:28.618+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 进入 SqlGenerateNode 节点2025-07-09T17:08:28.618+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 语义一致性校验未通过,开始重新生成SQL2025-07-09T17:08:37.378+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 重新生成的SQL为:WITH Sales AS ( SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales, ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rank FROM categories c JOIN product_categories pc ON c.id = pc.category_id JOIN products p ON pc.product_id = p.id JOIN order_items oi ON p.id = oi.product_id JOIN orders o ON oi.order_id = o.id WHERE o.status = \'completed\' GROUP BY c.id, p.id)SELECT category_id, category_name, product_id, product_name, total_salesFROM SalesWHERE rank = 1;2025-07-09T17:08:37.378+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL 生成结果: WITH Sales AS ( SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales, ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rank FROM categories c JOIN product_categories pc ON c.id = pc.category_id JOIN products p ON pc.product_id = p.id JOIN order_items oi ON p.id = oi.product_id JOIN orders o ON oi.order_id = o.id WHERE o.status = \'completed\' GROUP BY c.id, p.id)SELECT category_id, category_name, product_id, product_name, total_salesFROM SalesWHERE rank = 1;2025-07-09T17:08:37.378+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL生成成功,进入SQL校验节点: SQL_VALIDATE_NODE2025-07-09T17:08:37.382+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : 进入 SqlValidateNode 节点2025-07-09T17:08:37.382+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] 开始验证SQL语句: WITH Sales AS ( SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales, ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rank FROM categories c JOIN product_categories pc ON c.id = pc.category_id JOIN products p ON pc.product_id = p.id JOIN order_items oi ON p.id = oi.product_id JOIN orders o ON oi.order_id = o.id WHERE o.status = \'completed\' GROUP BY c.id, p.id)SELECT category_id, category_name, product_id, product_name, total_salesFROM SalesWHERE rank = 1;2025-07-09T17:08:37.398+08:00 INFO 8276 --- [nio-8080-exec-3] com.alibaba.druid.pool.DruidDataSource : {dataSource-85} inited2025-07-09T17:08:37.401+08:00 ERROR 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] SQL语法验证失败 - 原因: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near \'rank FROM categories c JOIN product_categories pc ON c.id = pc.category_id \' at line 82025-07-09T17:08:37.402+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : SQL语法校验是否通过: false2025-07-09T17:08:37.402+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : [SqlValidateDispatcher] SQL语法校验未通过,跳转到节点: SQL_GENERATE_NODE2025-07-09T17:08:37.404+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 进入 SqlGenerateNode 节点2025-07-09T17:08:37.404+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : SQL 语法校验未通过,开始重新生成SQL2025-07-09T17:08:49.460+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlGenerateNode : 重新生成的SQL为:WITH Sales AS ( SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales, ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rnk FROM categories c JOIN product_categories pc ON c.id = pc.category_id JOIN products p ON pc.product_id = p.id JOIN order_items oi ON p.id = oi.product_id JOIN orders o ON oi.order_id = o.id WHERE o.status = \'completed\' GROUP BY c.id, p.id)SELECT category_id, category_name, product_id, product_name, total_salesFROM SalesWHERE rnk = 1;2025-07-09T17:08:49.460+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL 生成结果: WITH Sales AS ( SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales, ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rnk FROM categories c JOIN product_categories pc ON c.id = pc.category_id JOIN products p ON pc.product_id = p.id JOIN order_items oi ON p.id = oi.product_id JOIN orders o ON oi.order_id = o.id WHERE o.status = \'completed\' GROUP BY c.id, p.id)SELECT category_id, category_name, product_id, product_name, total_salesFROM SalesWHERE rnk = 1;2025-07-09T17:08:49.460+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlGenerateDispatcher : SQL生成成功,进入SQL校验节点: SQL_VALIDATE_NODE2025-07-09T17:08:49.462+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : 进入 SqlValidateNode 节点2025-07-09T17:08:49.462+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] 开始验证SQL语句: WITH Sales AS ( SELECT c.id AS category_id, c.name AS category_name, p.id AS product_id, p.name AS product_name, SUM(oi.quantity) AS total_sales, ROW_NUMBER() OVER (PARTITION BY c.id ORDER BY SUM(oi.quantity) DESC) as rnk FROM categories c JOIN product_categories pc ON c.id = pc.category_id JOIN products p ON pc.product_id = p.id JOIN order_items oi ON p.id = oi.product_id JOIN orders o ON oi.order_id = o.id WHERE o.status = \'completed\' GROUP BY c.id, p.id)SELECT category_id, category_name, product_id, product_name, total_salesFROM SalesWHERE rnk = 1;2025-07-09T17:08:49.476+08:00 INFO 8276 --- [nio-8080-exec-3] com.alibaba.druid.pool.DruidDataSource : {dataSource-86} inited2025-07-09T17:08:49.480+08:00 INFO 8276 --- [nio-8080-exec-3] c.alibaba.cloud.ai.node.SqlValidateNode : [SqlValidateNode] SQL语法验证通过2025-07-09T17:08:49.481+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : SQL语法校验是否通过: true2025-07-09T17:08:49.481+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SqlValidateDispatcher : [SqlValidateDispatcher] SQL语法校验通过,跳转到节点: SEMANTIC_CONSISTENC_NODE2025-07-09T17:08:49.482+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 进入 SemanticConsistencNode 节点2025-07-09T17:08:55.608+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 语义一致性校验结果详情: 通过该SQL语句满足了核心需求,正确地计算了每个分类下已经成交的商品销量,并且筛选出了每个分类中销量最高的商品及其销售总量。聚合逻辑、过滤条件(仅完成订单)以及输出字段都符合业务要求。未发现任何影响结果准确性的致命问题或无法简单修复的结构缺陷。2025-07-09T17:08:55.608+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.ai.node.SemanticConsistencNode : 语义一致性校验结果: true2025-07-09T17:08:55.609+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SemanticConsistenceDispatcher : 语义一致性校验结果: true,跳转节点配置2025-07-09T17:08:55.609+08:00 INFO 8276 --- [nio-8080-exec-3] c.a.c.a.d.SemanticConsistenceDispatcher : 语义一致性校验通过,跳转到结束节点。
3)可以看到其生成的SQL:
with Sales as (selectc.id as category_id,c.name as category_name,p.id as product_id,p.name as product_name,SUM(oi.quantity) as total_sales,row_number() over (partition by c.idorder bySUM(oi.quantity) desc) as rnkfromcategories cjoin product_categories pc onc.id = pc.category_idjoin products p onpc.product_id = p.idjoin order_items oi onp.id = oi.product_idjoin orders o onoi.order_id = o.idwhereo.status = \'completed\'group byc.id,p.id )selectcategory_id,category_name,product_id,product_name,total_salesfromSaleswherernk = 1;
4)将SQL拿取Mysql执行,其结果如下,可以看到效果还是不错(注意:最终生成的效果与模型能力有关系,有些模型是专门用于SQL生成的,这些模型效果更好):
结语:本章通过讲解nl2sql使用方式,可以让用户便捷的搭建一个nl2sql智能体。但是nl2sql只能算是一个实现代码框架,还不能算是一个真正的智能体,Spring AI Alibaba在企业级应用中总结出很多实践经验,ChatBI(NL2SQL)就是一款轻量、高效、可扩展的 NL2SQL 智能体框架,让 Java 程序员可以快速构建基于自然语言的数据查询系统。同时在Spring AI Alibaba基础上,还实现了JManus(一款基于 Java 实现的,包含良好的前端 UI 交互界面的通用智能体产品)和DeepResearch(一款基于 Spring AI Alibaba Graph 实现的 DeepResearch 产品)两款真正智能体产品,有兴趣的朋友可以去了解一下。
Spring AI系列上一章:《Spring AI 系列之三十五 - Spring AI Alibaba-Graph框架之MCP》
Spring AI系列下一章:《Spring AI 系列之三十七 - Spring AI Alibaba-DocumentReader》