Hadoop-based batch ingestion cannot start the job because Jetty-related classes are loaded from the Pinot jar. #14552

Open
chrajeshbabu opened this issue Nov 27, 2024 · 2 comments

@chrajeshbabu
Contributor

When trying to run a batch ingestion job with Hadoop, the job does not start: the Jetty classes loaded from the Pinot batch ingestion shaded jar cannot find the webapp files, which leads to an NPE.

%d WARN [%t] %c: %m%n java.io.FileNotFoundException: jar:file:/spark-local2/hadoop/yarn/local/filecache/12/pinot-batch-ingestion-hadoop-1.2.0-shaded.jar!/webapps/mapreduce/mapreduce
	at org.eclipse.jetty.webapp.WebInfConfiguration.unpack(WebInfConfiguration.java:670) ~[jetty-webapp-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.webapp.WebInfConfiguration.preConfigure(WebInfConfiguration.java:143) ~[jetty-webapp-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.webapp.WebAppContext.preConfigure(WebAppContext.java:488) ~[jetty-webapp-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:523) ~[jetty-webapp-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97) ~[jetty-server-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.server.Server.start(Server.java:423) ~[jetty-server-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97) ~[jetty-server-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.server.Server.doStart(Server.java:387) ~[jetty-server-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1316) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:472) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:461) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.serviceStart(MRClientService.java:152) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1290) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712) [?:?]
	at java.base/javax.security.auth.Subject.doAs(Subject.java:439) [?:?]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) [hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
%d ERROR [%t] %c: %m%n org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server
	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:476) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:461) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.serviceStart(MRClientService.java:152) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1290) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712) [?:?]
	at java.base/javax.security.auth.Subject.doAs(Subject.java:439) [?:?]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) [hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
Caused by: java.io.IOException: Unable to initialize WebAppContext
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1343) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:472) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	... 11 more
Caused by: java.io.FileNotFoundException: jar:file:/spark-local2/hadoop/yarn/local/filecache/12/pinot-batch-ingestion-hadoop-1.2.0-shaded.jar!/webapps/mapreduce/mapreduce
	at org.eclipse.jetty.webapp.WebInfConfiguration.unpack(WebInfConfiguration.java:670) ~[jetty-webapp-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.webapp.WebInfConfiguration.preConfigure(WebInfConfiguration.java:143) ~[jetty-webapp-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.webapp.WebAppContext.preConfigure(WebAppContext.java:488) ~[jetty-webapp-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:523) ~[jetty-webapp-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97) ~[jetty-server-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.server.Server.start(Server.java:423) ~[jetty-server-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97) ~[jetty-server-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.server.Server.doStart(Server.java:387) ~[jetty-server-9.4.53.v20231009.jar:9.4.53.v20231009]
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:1316) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:472) ~[pinot-batch-ingestion-hadoop-1.2.0-shaded.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
	... 11 more
%d ERROR [%t] %c: %m%n java.lang.NullPointerException: Cannot invoke "org.apache.hadoop.yarn.webapp.WebApp.port()" because "this.webApp" is null
	at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.getHttpPort(MRClientService.java:182) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:159) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:979) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:122) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712) [?:?]
	at java.base/javax.security.auth.Subject.doAs(Subject.java:439) [?:?]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) [hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
%d INFO [%t] %c: %m%n org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NullPointerException: Cannot invoke "org.apache.hadoop.yarn.webapp.WebApp.port()" because "this.webApp" is null
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:178) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:979) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:122) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712) [?:?]
	at java.base/javax.security.auth.Subject.doAs(Subject.java:439) [?:?]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) [hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.hadoop.yarn.webapp.WebApp.port()" because "this.webApp" is null
	at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.getHttpPort(MRClientService.java:182) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:159) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	... 14 more
%d INFO [%t] %c: %m%n org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NullPointerException: Cannot invoke "org.apache.hadoop.yarn.webapp.WebApp.port()" because "this.webApp" is null
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:178) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:979) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:122) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194) ~[hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712) [?:?]
	at java.base/javax.security.auth.Subject.doAs(Subject.java:439) [?:?]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) [hadoop-common-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691) [hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.hadoop.yarn.webapp.WebApp.port()" because "this.webApp" is null
	at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.getHttpPort(MRClientService.java:182) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:159) ~[hadoop-mapreduce-client-app-3.3.6.3.3.6.4-2.jar:?]
	... 14 more
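
The `FileNotFoundException` above points at `webapps/mapreduce/mapreduce` inside the Pinot shaded jar. One way to confirm that the resource really is missing from a given jar is to scan its entries. The following is only a minimal diagnostic sketch using the JDK's `java.util.jar.JarFile`; the class name `JarEntryCheck` and its helper method are hypothetical and not part of Pinot or Hadoop.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical diagnostic helper; not part of Pinot or Hadoop.
final class JarEntryCheck {

  /** Returns true if any entry in the jar has a name starting with the given prefix. */
  static boolean hasEntryWithPrefix(Path jar, String prefix) throws IOException {
    try (JarFile jarFile = new JarFile(jar.toFile())) {
      return jarFile.stream().anyMatch(entry -> entry.getName().startsWith(prefix));
    }
  }

  public static void main(String[] args) throws IOException {
    if (args.length < 2) {
      System.out.println("usage: JarEntryCheck <path-to-jar> <entry-prefix>");
      return;
    }
    // Example arguments: the shaded jar path and "webapps/mapreduce".
    System.out.println(hasEntryWithPrefix(Paths.get(args[0]), args[1]));
  }
}
```

Running it against both the Pinot shaded jar and the cluster's own Hadoop jars would show which of them actually ships the `webapps/mapreduce` resources.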
@chrajeshbabu
Contributor Author

Jetty-related packages are not required for batch ingestion, so I will exclude the Jetty dependencies and try again.
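
Before excluding anything, it can help to confirm which jar each suspect class is loaded from, since the stack trace shows a mix of origins (for example `WebApps$Builder` from the Pinot shaded jar and `HttpServer2` from `hadoop-common`). The sketch below relies only on the standard `Class.getProtectionDomain().getCodeSource()` API; the class name `ClassOrigin` and its methods are hypothetical, and it would need to run with the same classpath as the job to be meaningful.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of Pinot or Hadoop.
final class ClassOrigin {

  /** Returns the jar or directory a class was loaded from, or a placeholder if unknown. */
  static String locationOf(Class<?> clazz) {
    CodeSource source = clazz.getProtectionDomain().getCodeSource();
    if (source == null || source.getLocation() == null) {
      // Classes from the JDK's bootstrap loader typically have no code source.
      return "<bootstrap or unknown>";
    }
    return source.getLocation().toString();
  }

  /** Looks a class up by name without initializing it, then reports its origin. */
  static String locationOfName(String className) {
    try {
      return locationOf(Class.forName(className, false, ClassOrigin.class.getClassLoader()));
    } catch (ClassNotFoundException e) {
      return "<not on classpath>";
    }
  }

  public static void main(String[] args) {
    // Example argument: org.apache.hadoop.yarn.webapp.WebApps
    for (String name : args) {
      System.out.println(name + " -> " + locationOfName(name));
    }
  }
}
```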

@chrajeshbabu
Contributor Author

I was able to get the job to start by excluding the dependency that caused org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:472) to be loaded from the Pinot jar, which is what led to this issue. I then found another problem with the way the dependencies get pulled before the mappers start.
During job initialisation, the dependencies tarball is pushed to an HDFS path:

    File pluginsTarGzFile = new File(PINOT_PLUGINS_TAR_GZ);
    try {
      File[] files = validPluginDirectories.toArray(new File[0]);
      TarCompressionUtils.createCompressedTarFile(files, pluginsTarGzFile);

      // Copy to staging directory
      Path cachedPluginsTarball = new Path(stagingDirURI.toString(), SegmentGenerationUtils.PINOT_PLUGINS_TAR_GZ);
      outputDirFS.copyFromLocalFile(pluginsTarGzFile, cachedPluginsTarball.toUri());
      job.addCacheFile(cachedPluginsTarball.toUri());
    } catch (Exception e) {
      LOGGER.error("Failed to tar plugins directories and upload to staging dir", e);
      throw new RuntimeException(e);
    } 

But the mappers expect the tarball to be available on the local file system, and it is not downloaded from HDFS before the mappers start, so the job fails with a ClassNotFoundException (CNFE).


    File localPluginsTarFile = new File(PINOT_PLUGINS_TAR_GZ);
    if (localPluginsTarFile.exists()) {
      File pluginsDirFile = Files.createTempDirectory(PINOT_PLUGINS_DIR).toFile();
      try {
        TarCompressionUtils.untar(localPluginsTarFile, pluginsDirFile);
      } catch (Exception e) {
        LOGGER.error("Failed to untar local Pinot plugins tarball file [{}]", localPluginsTarFile, e);
        throw new RuntimeException(e);
      }
      LOGGER.info("Trying to set System Property: {}={}", PLUGINS_DIR_PROPERTY_NAME, pluginsDirFile.getAbsolutePath());
      System.setProperty(PLUGINS_DIR_PROPERTY_NAME, pluginsDirFile.getAbsolutePath());
      String pluginsIncludes = _jobConf.get(PLUGINS_INCLUDE_PROPERTY_NAME);
      if (pluginsIncludes != null) {
        LOGGER.info("Trying to set System Property: {}={}", PLUGINS_INCLUDE_PROPERTY_NAME, pluginsIncludes);
        System.setProperty(PLUGINS_INCLUDE_PROPERTY_NAME, pluginsIncludes);
      }
      LOGGER.info("Pinot plugins System Properties are set at [{}], plugins includes [{}]",
          System.getProperty(PLUGINS_DIR_PROPERTY_NAME), System.getProperty(PLUGINS_INCLUDE_PROPERTY_NAME));
    } else {
      LOGGER.warn("Cannot find local Pinot plugins directory at [{}]", localPluginsTarFile.getAbsolutePath());
    }
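
For illustration only, one way a mapper could locate the tarball that was registered with `job.addCacheFile(...)` is to search the cache-file URIs by file name instead of assuming a local path. The sketch below is plain JDK code so it stays self-contained. In a real mapper the `URI[]` would come from Hadoop's `JobContext.getCacheFiles()`, and copying the file to local disk is left out. The class `CacheFileLookup` is hypothetical, the value `pinot-plugins.tar.gz` is an assumption about `PINOT_PLUGINS_TAR_GZ`, and the URIs in `main` are made up. This is not necessarily the fix that will be made for this issue.

```java
import java.net.URI;
import java.util.Optional;

// Hypothetical helper; not part of Pinot or Hadoop.
final class CacheFileLookup {

  // Assumed value of PINOT_PLUGINS_TAR_GZ; verify against the Pinot source.
  static final String PLUGINS_TARBALL = "pinot-plugins.tar.gz";

  /** Returns the first cache-file URI whose last path segment equals fileName. */
  static Optional<URI> findByFileName(URI[] cacheFiles, String fileName) {
    if (cacheFiles == null) {
      return Optional.empty();
    }
    for (URI uri : cacheFiles) {
      String path = uri.getPath();
      if (path != null && (path.equals(fileName) || path.endsWith("/" + fileName))) {
        return Optional.of(uri);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Made-up URIs standing in for the result of context.getCacheFiles().
    URI[] cacheFiles = {
        URI.create("hdfs://namenode:8020/staging/other-file.txt"),
        URI.create("hdfs://namenode:8020/staging/" + PLUGINS_TARBALL)
    };
    System.out.println(findByFileName(cacheFiles, PLUGINS_TARBALL));
  }
}
```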

I am working on a fix for this.
