在 Arm64 上编译 Hbase regexisart
如需转载,请标明原文出处以及作者
陈锐 RuiChen @kiwik
2019/06/25 15:42:14
背景
今天开始这个系列的第三篇文章,在aarch64
上编译HBase,
HBase 95%的代码由Java组成,有少量的Ruby,Perl和C++代码,本以为会比较顺利,但是还是遇到
问题。
编译环境
编译环境的配置请参考我的另一篇博客《在ARM64上编译Hadoop》, 在这里仅列出基本的编译环境信息:
- OS: CentOS 7.6
- Arch: aarch64
- Host: 华为云ARM公测实例
- HBase: git commit
15ac781057c2909dfa7867eadd5075e16d16ee65
(2019-06主干) - Java: 1.8.0
- Maven: 3.6.1
执行编译
根据HBase
的官方开发者文档,编译直接执行
Maven命令如下:
mvn package -DskipTests
[INFO] --- protobuf-maven-plugin:0.5.0:compile (compile-protoc) @ hbase-protocol ---
Downloading from alimaven: http://maven.aliyun.com/repository/central/com/google/protobuf/protoc/2.5.0/protoc-2.5.0-linux-aarch_64.exe
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary for Apache HBase 3.0.0-SNAPSHOT:
[INFO]
[INFO] Apache HBase ....................................... SUCCESS [ 5.884 s]
[INFO] Apache HBase - Checkstyle .......................... SUCCESS [ 3.614 s]
[INFO] Apache HBase - Annotations ......................... SUCCESS [ 1.053 s]
[INFO] Apache HBase - Build Configuration ................. SUCCESS [ 0.146 s]
[INFO] Apache HBase - Shaded Protocol ..................... SUCCESS [ 35.803 s]
[INFO] Apache HBase - Common .............................. SUCCESS [ 10.359 s]
[INFO] Apache HBase - Metrics API ......................... SUCCESS [ 1.553 s]
[INFO] Apache HBase - Hadoop Compatibility ................ SUCCESS [ 2.283 s]
[INFO] Apache HBase - Metrics Implementation .............. SUCCESS [ 1.470 s]
[INFO] Apache HBase - Hadoop Two Compatibility ............ SUCCESS [ 3.299 s]
[INFO] Apache HBase - Protocol ............................ FAILURE [ 0.933 s]
[INFO] Apache HBase - Client .............................. SKIPPED
[INFO] Apache HBase - Zookeeper ........................... SKIPPED
[INFO] Apache HBase - Replication ......................... SKIPPED
[INFO] Apache HBase - Resource Bundle ..................... SKIPPED
[INFO] Apache HBase - HTTP ................................ SKIPPED
[INFO] Apache HBase - Procedure ........................... SKIPPED
[INFO] Apache HBase - Server .............................. SKIPPED
[INFO] Apache HBase - MapReduce ........................... SKIPPED
[INFO] Apache HBase - Testing Util ........................ SKIPPED
[INFO] Apache HBase - Thrift .............................. SKIPPED
[INFO] Apache HBase - RSGroup ............................. SKIPPED
[INFO] Apache HBase - Shell ............................... SKIPPED
[INFO] Apache HBase - Coprocessor Endpoint ................ SKIPPED
[INFO] Apache HBase - Backup .............................. SKIPPED
[INFO] Apache HBase - Integration Tests ................... SKIPPED
[INFO] Apache HBase - Rest ................................ SKIPPED
[INFO] Apache HBase - Examples ............................ SKIPPED
[INFO] Apache HBase - Shaded .............................. SKIPPED
[INFO] Apache HBase - Shaded - Client (with Hadoop bundled) SKIPPED
[INFO] Apache HBase - Shaded - Client ..................... SKIPPED
[INFO] Apache HBase - Shaded - MapReduce .................. SKIPPED
[INFO] Apache HBase - External Block Cache ................ SKIPPED
[INFO] Apache HBase - Assembly ............................ SKIPPED
[INFO] Apache HBase Shaded Packaging Invariants ........... SKIPPED
[INFO] Apache HBase Shaded Packaging Invariants (with Hadoop bundled) SKIPPED
[INFO] Apache HBase - Archetypes .......................... SKIPPED
[INFO] Apache HBase - Exemplar for hbase-client archetype . SKIPPED
[INFO] Apache HBase - Exemplar for hbase-shaded-client archetype SKIPPED
[INFO] Apache HBase - Archetype builder ................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:08 min
[INFO] Finished at: 2019-06-25T14:20:55+08:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.xolstice.maven.plugins:protobuf-maven-plugin:0.5.0:compile (compile-protoc) on project hbase-protocol: Missing:
[ERROR] ----------
[ERROR] 1) com.google.protobuf:protoc:exe:linux-aarch_64:2.5.0
[ERROR]
[ERROR] Try downloading the file manually from the project website.
[ERROR]
[ERROR] Then, install it using the command:
[ERROR] mvn install:install-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/path/to/file
[ERROR]
[ERROR] Alternatively, if you host your own repository you can deploy the file there:
[ERROR] mvn deploy:deploy-file -DgroupId=com.google.protobuf -DartifactId=protoc -Dversion=2.5.0 -Dclassifier=linux-aarch_64 -Dpackaging=exe -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
[ERROR]
[ERROR] Path to dependency:
[ERROR] 1) org.apache.hbase:hbase-protocol:jar:3.0.0-SNAPSHOT
[ERROR] 2) com.google.protobuf:protoc:exe:linux-aarch_64:2.5.0
[ERROR]
[ERROR] ----------
[ERROR] 1 required artifact is missing.
[ERROR]
[ERROR] for artifact:
[ERROR] org.apache.hbase:hbase-protocol:jar:3.0.0-SNAPSHOT
[ERROR]
[ERROR] from the specified remote repositories:
[ERROR] alimaven (http://maven.aliyun.com/repository/central, releases=true, snapshots=false),
[ERROR] apache.snapshots (https://repository.apache.org/snapshots, releases=false, snapshots=true)
[ERROR]
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hbase-protocol
根据上面的输出可以发现仅编译了几个模块就报错了,原因是在Maven仓库中找不到对应的包com.google.protobuf:protoc:exe:linux-aarch_64:2.5.0
,
HBase依赖一个aarch64
环境下的protoc
的二进制可执行文件,这个问题类似于我在《在ARM64上编译Hadoop》
中遇到的问题,稍有不同。
分析了一下HBase的根模块的pom.xml
文件,有如下的一段声明引起上述问题。
<plugin>
<groupId>org.xolstice.maven.plugins</groupId>
<artifactId>protobuf-maven-plugin</artifactId>
<version>${protobuf.plugin.version}</version>
<configuration>
<protocArtifact>com.google.protobuf:protoc:${external.protobuf.version}:exe:${os.detected.classifier}</protocArtifact>
<protoSourceRoot>${basedir}/src/main/protobuf/</protoSourceRoot>
<clearOutputDirectory>false</clearOutputDirectory>
<checkStaleness>true</checkStaleness>
</configuration>
</plugin>
HBase也用到了protobuf,用于RPC消息
的序列化和反序列化。编译时,通过Maven插件protobuf-maven-plugin
将源代码中的*.proto
文件
转换成Java源文件,然后再通过Java编译器编译成jar包。这个Maven插件依赖于protoc
,根据声明
${os.detected.classifier}
,Maven会自动对应不同的体系架构,从Maven仓库中下载相应的二进
制执行文件。这是非常方便的Maven内建机制,用于自动支持不同平台的Java依赖,但是HBase依赖于
一个很老版本的protoc 2.5.0,
但是Google还未开始从这个版本支持aarch64
。
对比HBase的另一个模块hbase-protocol-shaded
,应用了protoc 3.5.1-1,
这个版本已经有aarch64
的支持,编译是成功的。
[INFO] Apache HBase - Shaded Protocol ..................... SUCCESS [ 35.803 s]
问题
在HBase的jira问题跟踪系统里搜索了一下,这个问题HBASE-19146 在2017年就有人提出,并给出了建议的解决方案,但是HBase社区一直没有响应。
说到底Apache社区上游的几个大数据项目都没有明确回答一个问题:“是否支持ARM平台?”
这种不确定性和风险,不应该留给下游的用户和厂商,支持或者不支持社区上游应该给出明确的答案。
现在很多云服务提供商都发布了云上的ARM实例,作为基础设施,大数据的ARM化也有很多用户提出。 但是从我们现在调研来看,至少Hadoop、Spark和HBase对于ARM的支持都存在或大或小的问题,见我的博客。
整体看来Apache社区对于ARM平台的验证是缺失的,我们希望通过ARM CI
暴露这些问题,并推动
Apache社区的这些项目在ARM平台上有充分的验证,包括:编译、UT和功能测试,甚至是对比X86平台的
性能测试。最终官方声明支持ARM平台,使ARM不再是开源社区中的二等公民。
我们OpenLab团队当前已经在Apache社区的多个项目中开展
了ARM CI
的公开讨论,
如果你有志向一起推动ARM的服务器端软件生态,请加入我们,一起创造历史。