[ERROR] org.apache.tika.exception.TikaException: Cannot connect to Grobid Service

김민둉 2021. 4. 22. 17:02

Error Detail

While using Tika, only PDF files fail to convert, and the following error message is logged.

 

2021-04-16 14:33:40,688  WARN [org.apache.tika.parser.journal.GrobidRESTParser] Couldn't read response
org.apache.tika.exception.TikaException: Cannot connect to Grobid Service
	at org.apache.tika.parser.journal.GrobidRESTParser.checkMode(GrobidRESTParser.java:120) ~[tika-parsers-1.26.jar:1.26]
	at org.apache.tika.parser.journal.GrobidRESTParser.parse(GrobidRESTParser.java:80) [tika-parsers-1.26.jar:1.26]
	at org.apache.tika.parser.journal.JournalParser.parse(JournalParser.java:60) [tika-parsers-1.26.jar:1.26]
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) [tika-core-1.26.jar:1.26]
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) [tika-core-1.26.jar:1.26]
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) [tika-core-1.26.jar:1.26]
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:159) [tika-core-1.26.jar:1.26]
	at com.saramin.ai.service.impl.FileCnvrServiceImpl.cnvrFileToStr(FileCnvrServiceImpl.java:34) [classes/:?]
	at com.saramin.ai.web.WebController.main(WebController.java:49) [classes/:?]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_261]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_261]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_261]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_261]
	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) [spring-web-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133) [spring-web-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97) [spring-webmvc-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:854) [spring-webmvc-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:765) [spring-webmvc-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85) [spring-webmvc-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967) [spring-webmvc-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901) [spring-webmvc-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970) [spring-webmvc-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861) [spring-webmvc-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:626) [servlet-api.jar:4.0.FR]
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846) [spring-webmvc-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:733) [servlet-api.jar:4.0.FR]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:227) [catalina.jar:9.0.44]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162) [catalina.jar:9.0.44]
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53) [tomcat-websocket.jar:9.0.44]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189) [catalina.jar:9.0.44]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162) [catalina.jar:9.0.44]
	at egovframework.rte.ptl.mvc.filter.HTMLTagFilter.doFilter(HTMLTagFilter.java:51) [egovframework.rte.ptl.mvc-3.10.0.jar:?]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189) [catalina.jar:9.0.44]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162) [catalina.jar:9.0.44]
	at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:197) [spring-web-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107) [spring-web-4.3.25.RELEASE.jar:4.3.25.RELEASE]
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:189) [catalina.jar:9.0.44]
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:162) [catalina.jar:9.0.44]
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202) [catalina.jar:9.0.44]
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:97) [catalina.jar:9.0.44]
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:542) [catalina.jar:9.0.44]
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143) [catalina.jar:9.0.44]
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92) [catalina.jar:9.0.44]
	at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:687) [catalina.jar:9.0.44]
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:78) [catalina.jar:9.0.44]
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:357) [catalina.jar:9.0.44]
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:374) [tomcat-coyote.jar:9.0.44]
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65) [tomcat-coyote.jar:9.0.44]
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:893) [tomcat-coyote.jar:9.0.44]
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1707) [tomcat-coyote.jar:9.0.44]
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) [tomcat-coyote.jar:9.0.44]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_261]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_261]
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-util.jar:9.0.44]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]

 

 

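Troubleshooting

Sigh....

No amount of searching turned up anything, and I had no idea what was going on. Then I noticed that every other file type converted fine, wondered whether the problem was on the PDF side, and started debugging step by step..

I don't know the details, but I confirmed that Grobid is a module involved in PDF handling: in the stack trace, Tika's JournalParser hands the file to GrobidRESTParser, which is where the exception is thrown.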

 

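How I fixed it

While cleaning up pom.xml I had deleted the pdfbox dependency, and that turned out to be the culprit.

I assumed Tika had its own PDF conversion module built in, so I thought it would work without pdfbox. Presumably, with pdfbox missing, Tika could not use its regular PDF parser and the file ended up in the Grobid-based journal parser instead.

After adding the pdfbox dependency back and running a Maven update, everything worked normally.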

 

<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.22</version>
</dependency>
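
Since the root cause was a class missing from the classpath rather than anything to do with Grobid itself, a quick check like the one below might have saved some debugging time. This is only a rough sketch: the class name PdfClasspathCheck and the helper isOnClasspath are made up for illustration, and the two class names being checked are what I would expect for PDFBox 2.0.x and tika-parsers 1.26, so adjust them for other versions.

```java
public class PdfClasspathCheck {

    // Hypothetical helper: returns true if the named class can be located
    // by the class loader that loaded this class.
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, PdfClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class itself is absent, or something it links against is absent
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class names for PDFBox 2.0.x and tika-parsers 1.26
        String[] classNames = {
            "org.apache.pdfbox.pdmodel.PDDocument",
            "org.apache.tika.parser.pdf.PDFParser"
        };
        for (String name : classNames) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

The check only means something when it runs with the same classpath as the application, for example from inside the deployed web app, because a standalone run sees a different set of jars.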