python连接hbase 带kerberos,HBase Kerberos连接更新策略
Recently I enabled kerberos in my cluster, everything works great until my kerberos login expires, at say, 12 hours. At that point any connections I have created, any tables created with those connections etc will throw when I use them. This could potentially crash my app depending on how I handle this.
I don't mind crashing hugely because my app is managed by slider which will resurrect the app if and when it goes down, however this will only happen when HBase is "used" (i.e. I call a method on a table with a now stale connection) which will probably be caused by a user interaction and this would lead to poor UX.
I don't want authentication implementation details to pervade my application and also don't want to create connection objects more often than is necessary because it is a costly operation which makes a large number of RPC calls (zookeeper metadata location to start with).
Is there a common strategy (preferably inbuilt in HBase client) for managing kerberos authentication expiry and renewing HBase connections/tables when that happens?
解决方案
A Kerberos TGT has a lifetime (e.g. 12h) and a renewable lifetime (e.g. 7 days). As long as the ticket is still valid and is still renewable, you can request a "free" renewal -- no password required --, and the lifetime counter is reset (e.g. 12h to go, again).
The Hadoop authentication library spawns a specific Java thread for automatic renewal of the current TGT. It's kind of ugly, using a kinit -R command line instead of a JAAS library call, but it works - see HADOOP-6656
So, if you get Slider to create a renewable ticket on startup, and if you can bribe your SysAdmin to raise the default (cf. client conf) and the max (cf. KDC conf) renewable lifetime to, say, 30 days, then your app could run for 30 days straight with the initial TGT. A nice improvement.
~~~~~~~~~~
If you really crave for eternity... sorry, but you will actually have some programming to do. That means a dedicated thread/process in charge or re-creating automagically the TGT.
The Java Way: on startup, before you connect to HBase/HDFS/whatever,
create explicitly an UGI with loginUserFromKeytab() then run
checkTGTAndReloginFromKeytab() from time to time
The Shell Way: start a shell that (a) creates a TGT with kinit (b)
spawns a sub-process that periodically fires kinit again (c)
launches your Java app then kills the subprocess when/if your app ever terminates
Caveat: if some other thread happens to open, or re-open, a connection while the TGT is being re-created, that connection may fail because the cache was empty at the exact time it was accessed ("race condition"). The next attempt will be successful, but expect a few rogue warnings in your logs.
~~~~~~~~~~
Final advice: you can use a private ticket cache for your app (i.e. you can run multiple apps on the same node with the same Linux account but different Kerberos principals) by setting KRB5CCNAME environment variable, as long as it's a "FILE:" cache.
魔乐社区(Modelers.cn) 是一个中立、公益的人工智能社区,提供人工智能工具、模型、数据的托管、展示与应用协同服务,为人工智能开发及爱好者搭建开放的学习交流平台。社区通过理事会方式运作,由全产业链共同建设、共同运营、共同享有,推动国产AI生态繁荣发展。
更多推荐


所有评论(0)