Streaming LLM Responses to the Frontend with Java and SSE (Complete Code and Optimizations)
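Introduction to SSE
Core problems it solves:
Comparison of applicable scenarios:
SSE is the lighter-weight option here, and LLM output is a one-way response, so the server does not need to handle client interaction while it streams.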
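To make the one-way model concrete, here is a minimal sketch of the text frame a server writes for each SSE event. The field names `id`, `retry`, and `data` come from the SSE wire format; the class `SseFrame` and its `format` method are illustrative names assumed for this example, not part of Spring.

```java
/** Illustrative helper (hypothetical, not a Spring API): builds the text frame for one SSE event. */
final class SseFrame {
    private SseFrame() {}

    /**
     * Formats one event. Each line of the payload gets its own "data:" field,
     * and a blank line terminates the event.
     */
    static String format(String id, Long retryMillis, String data) {
        StringBuilder sb = new StringBuilder();
        if (id != null) {
            sb.append("id: ").append(id).append('\n');
        }
        if (retryMillis != null) {
            sb.append("retry: ").append(retryMillis).append('\n');
        }
        // Limit -1 keeps trailing empty lines of the payload
        for (String line : data.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        sb.append('\n'); // blank line marks the end of the event
        return sb.toString();
    }
}
```

Because the client never writes on this channel, the whole protocol is this stream of text frames over one long-lived HTTP response.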
Environment Setup
Development environment:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
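Backend Implementation Steps
Core controller:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.servlet.mvc.method.annotation.SseEmitter;

@RestController
@RequestMapping("/api/stream")
public class StreamController {
    // Live connections, keyed by session ID
    private static final Map<String, SseEmitter> EMITTERS = new ConcurrentHashMap<>();

    @GetMapping(produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public SseEmitter stream(@RequestParam String sessionId) {
        SseEmitter emitter = new SseEmitter(180_000L); // 3-minute timeout
        EMITTERS.put(sessionId, emitter);
        // Two-argument remove: a stale callback cannot evict a newer emitter for the same session
        emitter.onCompletion(() -> EMITTERS.remove(sessionId, emitter));
        emitter.onTimeout(() -> EMITTERS.remove(sessionId, emitter));
        return emitter;
    }
}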
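One detail of the session map deserves a closer look. If a client reconnects with the same `sessionId`, a completion callback from the old connection can fire after the new emitter has been registered. Removing by key alone would then drop the new entry. A minimal sketch of a guard, using the two-argument `ConcurrentMap.remove(key, value)`, follows; `SessionRegistry` and its method names are hypothetical and chosen only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Illustrative registry (hypothetical name): maps a session ID to its live connection. */
final class SessionRegistry<T> {
    private final ConcurrentMap<String, T> connections = new ConcurrentHashMap<>();

    /** Registers a connection and returns the one it replaced, or null. */
    T register(String sessionId, T connection) {
        return connections.put(sessionId, connection);
    }

    /** Removes the entry only if it still points at this exact connection. */
    boolean unregister(String sessionId, T connection) {
        return connections.remove(sessionId, connection);
    }

    /** Returns the current connection for the session, or null if none is open. */
    T find(String sessionId) {
        return connections.get(sessionId);
    }
}
```

With `SseEmitter` as the type parameter, the cleanup callbacks would call `unregister(sessionId, emitter)` so that each connection can only remove itself.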
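Wrapping the LLM call:
// Shared executor; virtual threads require Java 21 or later
private final ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();

public void generateStream(String input, String sessionId) {
    executor.execute(() -> {
        SseEmitter emitter = EMITTERS.get(sessionId);
        if (emitter == null) {
            return; // no open connection for this session
        }
        try {
            // Each chunk is one piece of the model's streaming response
            for (String chunk : aiModel.streamGenerate(input)) {
                emitter.send(SseEmitter.event()
                        .data(chunk)
                        .id(UUID.randomUUID().toString())
                        .reconnectTime(3000));
            }
            emitter.complete();
        } catch (Exception e) {
            // Covers IOException from send as well as failures in the model call
            emitter.completeWithError(e);
        }
    });
}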
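Frontend Integration
// Open the SSE connection
const eventSource = new EventSource('/api/stream?sessionId=123');

// Append each chunk as it arrives; textContent avoids interpreting model output as HTML
eventSource.onmessage = (event) => {
  document.getElementById('output').textContent += event.data;
};

// Error handling: close the connection so the browser does not reconnect automatically
eventSource.onerror = () => {
  eventSource.close();
  console.log('Connection terminated abnormally');
};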
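When testing the endpoint without a browser, it helps to know roughly what `EventSource` does with the incoming text: it buffers `data:` lines and dispatches an event at each blank line, ignoring comment lines that start with a colon. The sketch below mirrors that behaviour in Java. `SseParser` is a hypothetical name, and the sketch assumes `\n` line endings and skips the `id`, `event`, and `retry` fields.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative parser (hypothetical name): extracts the data payload of each SSE event. */
final class SseParser {
    private SseParser() {}

    static List<String> parseData(String stream) {
        List<String> events = new ArrayList<>();
        List<String> dataLines = new ArrayList<>();
        for (String line : stream.split("\n", -1)) {
            if (line.isEmpty()) {
                // Blank line: dispatch the buffered event if it carried any data
                if (!dataLines.isEmpty()) {
                    events.add(String.join("\n", dataLines));
                    dataLines.clear();
                }
            } else if (line.startsWith(":")) {
                // Comment line (for example a heartbeat): ignored
            } else if (line.startsWith("data:")) {
                String value = line.substring(5);
                // One leading space after the colon is not part of the value
                dataLines.add(value.startsWith(" ") ? value.substring(1) : value);
            }
            // Other fields (id, event, retry) are skipped in this sketch
        }
        return events;
    }
}
```

Feeding it a captured response body gives the same sequence of chunks that `onmessage` would receive.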
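Performance Optimization Tips
Flow control:
import com.google.common.util.concurrent.RateLimiter; // requires the Guava dependency

// Token-bucket rate limiting
RateLimiter limiter = RateLimiter.create(50); // 50 chunks per second
for (String chunk : chunks) {
    limiter.acquire(); // blocks until a permit is available
    emitter.send(chunk);
}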
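To show the idea behind token-bucket limiting, here is a minimal standalone sketch. It is not Guava's implementation: `TokenBucket` is a hypothetical class, it returns `false` instead of blocking, and the clock is injected so the behaviour can be checked deterministically.

```java
import java.util.function.LongSupplier;

/** Illustrative token bucket (hypothetical class, not Guava's implementation). */
final class TokenBucket {
    private final double capacity;       // maximum tokens the bucket can hold
    private final double refillPerNano;  // tokens added per nanosecond
    private final LongSupplier clock;    // nanosecond time source, injectable for tests
    private double tokens;
    private long lastRefill;

    TokenBucket(double permitsPerSecond, double capacity, LongSupplier clock) {
        this.capacity = capacity;
        this.refillPerNano = permitsPerSecond / 1_000_000_000.0;
        this.clock = clock;
        this.tokens = capacity;          // start with a full bucket
        this.lastRefill = clock.getAsLong();
    }

    /** Takes one token if available; returns false instead of blocking. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        // Refill in proportion to elapsed time, never beyond capacity
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In production code `new TokenBucket(50, 50, System::nanoTime)` would approximate the 50-chunks-per-second limit; the capacity controls how large a burst is allowed after an idle period.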
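Heartbeat:
ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
scheduler.scheduleAtFixedRate(() -> {
    try {
        emitter.send(SseEmitter.event().comment("heartbeat")); // comment-only event
    } catch (IOException e) {
        emitter.completeWithError(e); // a Runnable cannot propagate the checked exception
    }
}, 0, 30, TimeUnit.SECONDS);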
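A scheduled heartbeat should also stop once the connection is gone, otherwise tasks accumulate for dead sessions. The sketch below relies on documented `scheduleAtFixedRate` behaviour: if one execution throws, later executions are suppressed. `Heartbeat` and `Ping` are hypothetical names; in the SSE case the ping would be the `emitter.send(...)` call with a comment event.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Illustrative heartbeat (hypothetical class): pings until cancelled or until a ping fails. */
final class Heartbeat {
    /** A ping that may fail, for example because the client has gone away. */
    interface Ping {
        void send() throws Exception;
    }

    private Heartbeat() {}

    static ScheduledFuture<?> start(ScheduledExecutorService scheduler, Ping ping,
                                    long period, TimeUnit unit) {
        return scheduler.scheduleAtFixedRate(() -> {
            try {
                ping.send();
            } catch (Exception e) {
                // Throwing from the task suppresses all later executions
                throw new IllegalStateException("heartbeat failed, stopping", e);
            }
        }, period, period, unit);
    }
}
```

The returned future can also be cancelled explicitly, for instance with `emitter.onCompletion(() -> future.cancel(false))`, so the heartbeat ends when the stream finishes normally.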