Reading Word content with POI and converting it to HTML for online preview
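Preface
Steps for previewing Word content:
- Read the Word data
- Preserve the formatting of the Word content
Note: this article covers only the Word 2007 docx format (the doc format is not handled), and the files are stored on a MinIO server.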
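Because only docx is handled here, it can help to reject other files before they reach the converter. The following is a minimal sketch using only the JDK, not part of the original code: the class name WordFormatSniffer and its methods are hypothetical. It relies on the fact that docx files are ZIP containers (they start with the bytes "PK\003\004"), while legacy doc files are OLE2 compound documents (they start with D0 CF 11 E0 A1 B1 1A E1). A ZIP header only shows that the file could be a docx; it does not prove it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: guesses the container format from the first bytes of a stream.
final class WordFormatSniffer {
    enum Format { ZIP_BASED, OLE2_BASED, UNKNOWN }

    // ZIP local file header, used by docx and every other OOXML file
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // OLE2 compound document header, used by legacy doc files
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    // Classifies the first len bytes of head.
    static Format classify(byte[] head, int len) {
        if (startsWith(head, len, ZIP_MAGIC)) return Format.ZIP_BASED;
        if (startsWith(head, len, OLE2_MAGIC)) return Format.OLE2_BASED;
        return Format.UNKNOWN;
    }

    // Peeks at up to 8 bytes and rewinds, so the caller can still read the whole stream.
    static Format sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("wrap the stream in a BufferedInputStream first");
        }
        byte[] head = new byte[8];
        in.mark(head.length);
        int len = 0;
        while (len < head.length) {
            int n = in.read(head, len, head.length - len);
            if (n < 0) break; // end of stream before 8 bytes were available
            len += n;
        }
        in.reset();
        return classify(head, len);
    }

    private static boolean startsWith(byte[] head, int len, int[] magic) {
        if (len < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((head[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeDocx = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00};
        System.out.println(sniff(new ByteArrayInputStream(fakeDocx)));
    }
}
```

A stream returned by an object store may not support mark/reset, so in practice it would likely need to be wrapped in a java.io.BufferedInputStream before calling sniff.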
I. Java implementation
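import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

// BizPreconditions, bucketExists and getObject are helpers from the surrounding project.
public String perviewDocx(String bucket, String objectName) throws Exception {
    BizPreconditions.checkArgumentNoStack(this.bucketExists(bucket), "Directory " + bucket + " does not exist!");
    // Read the Word file from MinIO as a stream; try-with-resources closes both streams even on failure
    try (InputStream in = this.getObject(bucket, objectName);
         ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
        // Key step: convert the docx content to HTML
        XWPFDocument document = new XWPFDocument(in);
        XHTMLOptions options = XHTMLOptions.create();
        XHTMLConverter.getInstance().convert(document, baos, options);
        return new String(baos.toByteArray(), StandardCharsets.UTF_8);

        // Alternative: read plain text only, with line breaks
        // XWPFWordExtractor extractor = new XWPFWordExtractor(document);
        // return extractor.getText();
    }
}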
pom
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>org.apache.poi.xwpf.converter.core</artifactId>
    <version>1.0.6</version>
</dependency>
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>org.apache.poi.xwpf.converter.xhtml</artifactId>
    <version>1.0.6</version>
</dependency>
<dependency>
    <groupId>org.apache.xmlbeans</groupId>
    <artifactId>xmlbeans</artifactId>
    <version>2.6.0</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>ooxml-schemas</artifactId>
    <version>1.3</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.14</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml-schemas</artifactId>
    <version>3.14</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.14</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.14</version>
</dependency>
II. Pitfalls
1. Method not found
The .convert method is flagged in red (unresolved) in the IDE:
XHTMLConverter.getInstance().convert(document, baos, options);
Fix: add the Maven dependencies org.apache.poi.xwpf.converter.core and org.apache.poi.xwpf.converter.xhtml.
2. NoSuchMethodError: getPackageRelationship
The POI version does not match what the converter expects. Change it to 3.14.
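A NoSuchMethodError like this usually means a different POI jar than expected ended up on the classpath, for example through a transitive dependency. One way to check is to print where each class was actually loaded from. The sketch below uses only the JDK; the class name ClassOriginProbe is hypothetical, and the class names listed in main are assumptions based on POI 3.x and the 1.0.x converter, so adjust them to your versions.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports which jar (or directory) a class was loaded from.
final class ClassOriginProbe {

    static String originOf(String className) {
        try {
            // Load without initializing the class, so no static initializers run
            Class<?> c = Class.forName(className, false, ClassOriginProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            URL location = (src == null) ? null : src.getLocation();
            // Classes loaded by the bootstrap loader have no code source
            return (location == null) ? "(bootstrap / JDK)" : location.toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Assumed class names for POI 3.x and the 1.0.x XHTML converter
        String[] names = {
            "org.apache.poi.xwpf.usermodel.XWPFDocument",
            "org.apache.poi.POIXMLDocumentPart",
            "org.apache.poi.xwpf.converter.xhtml.XHTMLConverter"
        };
        for (String name : names) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

If the printed jar names show a POI version other than 3.14, or POI classes coming from more than one version, that mismatch is the likely cause of the error.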
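3. The Chinese text comes out garbled. What is going on?
In the returned HTML, all Chinese characters are converted to HTML numeric character references (the &#...; form).
No need to worry: the front end displays them as Chinese when it renders the HTML.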
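To confirm on the server side that those references really are the original text, for example when logging or testing, they can be decoded by hand. This is a minimal JDK-only sketch and is not needed for the preview itself; the class name NumericEntityDecoder is hypothetical. It handles decimal (&#20013;) and hexadecimal (&#x4E2D;) references only and leaves named entities such as &amp; untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns numeric character references back into plain characters.
final class NumericEntityDecoder {
    // Group 1 captures a hex reference (&#x4E2D;), group 2 a decimal one (&#20013;)
    private static final Pattern REF =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    static String decode(String html) {
        Matcher m = REF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2));
            // Keep the reference unchanged if it is not a valid Unicode code point
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("<p>&#20013;&#25991;</p>"));
    }
}
```

Decoding also turns references such as &#60; back into markup characters, so the decoded string is best used for inspection only. The string sent to the browser should stay exactly as the converter produced it.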