This post takes notes on the following two papers:
Point Pair Features Based Object Detection and Pose Estimation Revisited
PPFNet: Global Context Aware Local Features for Robust 3D Point Matching

1. Point Pair Features Based Object Detection and Pose Estimation Revisited

1.1 Abstract

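The abstract covers what earlier work did and how, its drawbacks, and the method this paper proposes.
It first summarizes the approach of an earlier paper: 1. use self-similar point pairs to represent the 3D target object; 2. run Hough-like voting on a reduced pose parameter space; 3. match the 3D model to the 3D scene. It then points out several drawbacks: the approach is sensitive to the established 3D correspondences, and it performs poorly when the model is sparse or there are too many outliers.
The paper then proposes its own method: 1. couple object detection with a coarse-to-fine segmentation; 2. match through weighted Hough voting and an interpolated recovery of the pose.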

1.2 Introduction

The approach in this paper:

  • object: depth only; extract features relating pairs of 3D points and their normals, and store them in a hash table
  • scene: extract features, query the hash table, and run Hough-like voting; multiple instances correspond to multiple Hough peaks

1.3 Method

1.3.1 Contribution
  • an enhanced model representation
  • voting with segmentation
  • a fast hypothesis verification
1.3.2 Model representation
  • computing the surface normals and the weights
  • downsample the points
  • hash-table is created, storing the quantized pair features as well as the weights and the rotation angles to the ground plane
1.3.2.1 Surface Features

$F(m_1,m_2) = (\|d\|_2, \angle(n_1,d), \angle(n_2,d), \angle(n_1,n_2))$
Code makes this easier to follow:

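function f = ppf(point1, point2)
    % Vector between the two points and its unit direction
    d = point1.Location - point2.Location;
    d_unit = d / norm(d);
    % Angles between each normal and d, and between the two normals
    % (normals are assumed to be unit length)
    alpha1 = acos(dot(point1.Normal, d_unit));
    alpha2 = acos(dot(point2.Normal, d_unit));
    alpha3 = acos(dot(point1.Normal, point2.Normal));
    f = [norm(d), alpha1, alpha2, alpha3];
end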

The angle between two vectors can also be computed as follows:
$\angle(v_1,v_2) = \tan^{-1}\frac{\|v_1 \times v_2\|}{v_1 \cdot v_2}$
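As a minimal sketch of the feature with the arctangent form of the angle, the following Python version uses `arctan2`, which stays well conditioned when the vectors are nearly parallel or anti-parallel. The function names `angle_between` and `point_pair_feature` are illustrative and not taken from the paper.

```python
import numpy as np

def angle_between(v1, v2):
    """Angle in [0, pi] computed as atan2(||v1 x v2||, v1 . v2)."""
    cross_norm = np.linalg.norm(np.cross(v1, v2))
    return float(np.arctan2(cross_norm, np.dot(v1, v2)))

def point_pair_feature(p1, n1, p2, n2):
    """Return (||d||, angle(n1, d), angle(n2, d), angle(n1, n2))."""
    # Same sign convention as the ppf snippet above: d = point1 - point2.
    d = np.asarray(p1, dtype=float) - np.asarray(p2, dtype=float)
    return np.array([
        np.linalg.norm(d),
        angle_between(n1, d),
        angle_between(n2, d),
        angle_between(n1, n2),
    ])
```

Unlike `acos`, this form does not require the normals to be unit length, because both the cross product and the dot product scale by the same factor.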

1.3.2.2 Computing Model Normals

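The features above depend on surface normals, so the more accurate the normals, the better.
The method uses the $2^{nd}$ order term.
The goal is, given a local reference frame, to find the parameters of a second-order polynomial that approximates the height field of the neighboring points.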

Given a point $p_i$ in the set $P \subset R^3$, MLS (moving least squares) operates by fitting a surface of order $m$ in a local $K$-neighborhood $p_k$ and projecting the point onto this surface. Fitting is essentially a standard weighted least squares estimation of the polynomial surface parameters.
The closer the neighbors are, the higher their contribution. This is controlled by the weighting function: $w(p_i) = \exp(-\|p_i - p_k\| / 2\sigma^2_{mls})$
$\sigma_{mls}$ can be selected adaptively.
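The fitting step can be sketched as a weighted least squares fit of a quadratic height field, from which the normal at the query point follows. This is a sketch under assumptions rather than the paper's implementation: the local frame is taken from a PCA of the neighbors, the weights follow the formula as written in these notes, and the name `mls_normal` is hypothetical.

```python
import numpy as np

def mls_normal(p, neighbors, sigma):
    """Estimate the normal at p from a weighted quadratic height-field fit."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(neighbors, dtype=float) - p      # neighbors relative to p
    # Assumption: local frame from PCA; the last axis is the rough normal.
    _, _, vt = np.linalg.svd(q - q.mean(axis=0))
    u_axis, v_axis, w_axis = vt
    u, v, h = q @ u_axis, q @ v_axis, q @ w_axis    # h is the height over the (u, v) plane
    # Weights as written in the notes: exp(-||p_i - p_k|| / (2 sigma^2)).
    w = np.exp(-np.linalg.norm(q, axis=1) / (2.0 * sigma ** 2))
    # Height model: h = a + b*u + c*v + d*u^2 + e*u*v + f*v^2.
    A = np.column_stack([np.ones_like(u), u, v, u * u, u * v, v * v])
    sw = np.sqrt(w)
    coeffs, *_ = np.linalg.lstsq(A * sw[:, None], h * sw, rcond=None)
    b, c = coeffs[1], coeffs[2]
    # The gradient at u = v = 0 is (b, c), so the normal is (-b, -c, 1) in the local frame.
    n = w_axis - b * u_axis - c * v_axis
    return n / np.linalg.norm(n)
```

The sign of the returned normal is ambiguous and would still need to be oriented, for example towards the viewpoint.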

1.3.2.3 Weighting Model Points

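The motivation is that, according to the authors, different points matter differently for matching. The paper focuses on the visible surfaces of the object, because the normals at those points are accurate. Are these weights applied to the hash bins? Judging from the paper, yes: it weighs the hash table entries.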

  • focus on the visible surfaces of the object (accuracy)
  • base weighting strategy on ambient occlusion maps

Given a hemisphere $\Omega$, the occlusion $A_p$ at point $p$ on a surface with normal $n$ can be obtained by computing the integral of the visibility function $V$:
$A_p = \frac{1}{\pi}\int_{\Omega} V(p \cdot w)\,dw$
$V$ is a Dirac delta function, defined to be $1$ if $p$ is occluded in the direction of $w$ and $0$ otherwise. Based on $A_p$, the authors propose to weigh the entries of the hash table. Thus, given the hash table bins, the weights are nothing but a normalized geometric mean of $A_{m_r}$ and $A_{m_i}$.
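A small sketch of the pair weights described above: the geometric mean of the two per-point values, followed by a normalization. The notes do not say how the normalization is done, so dividing by the sum is an assumption here, and `pair_weights` is a hypothetical helper name.

```python
import numpy as np

def pair_weights(occlusion, pairs):
    """occlusion: per-point A_p values; pairs: (K, 2) index pairs (m_r, m_i)."""
    occlusion = np.asarray(occlusion, dtype=float)
    pairs = np.asarray(pairs, dtype=int)
    # Geometric mean of A_{m_r} and A_{m_i} for every pair.
    g = np.sqrt(occlusion[pairs[:, 0]] * occlusion[pairs[:, 1]])
    total = g.sum()
    # Assumption: normalize so that the weights sum to one.
    return g / total if total > 0 else g
```

A pair gets zero weight as soon as one of its two points has $A_p = 0$, which is the main practical difference between a geometric and an arithmetic mean.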

1.3.2.4 Global Model Description

Given the extracted PPFs, the global description is implemented as a hash table mapping the feature space to the space of point pairs.
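The mapping from feature space to point pairs can be sketched with a dictionary keyed by the quantized feature. The step sizes `dist_step` and `angle_step` and the function names are illustrative assumptions; the paper additionally stores weights and rotation angles with each entry, which this sketch leaves out.

```python
from collections import defaultdict

def quantize(feature, dist_step, angle_step):
    """Map a PPF (distance, a1, a2, a3) to a tuple of integer bin indices."""
    dist, a1, a2, a3 = feature
    return (int(dist // dist_step), int(a1 // angle_step),
            int(a2 // angle_step), int(a3 // angle_step))

def build_table(features, dist_step, angle_step):
    """features: dict mapping a model index pair (r, i) to its 4-value PPF."""
    table = defaultdict(list)
    for pair, feature in features.items():
        # Pairs with similar features land in the same bin.
        table[quantize(feature, dist_step, angle_step)].append(pair)
    return table
```

At query time a scene feature is quantized the same way, and the bin contents give the candidate model pairs.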
Distances and angles are sampled (quantized).
The points are downsampled carefully with a Poisson disk sampling algorithm,
so that all points are at least $d_{dist}$ apart.
This algorithm generates samples from a uniform random distribution such that the minimum distance between any two samples is $2r$.
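The minimum-distance constraint can be illustrated with a naive dart-throwing pass over an existing point set: visit the points in random order and keep a point only if it is far enough from everything kept so far. This is a simplified stand-in, not the paper's sampler, and `poisson_disk_downsample` is a hypothetical name. A practical version would use a grid or k-d tree instead of the quadratic distance check.

```python
import numpy as np

def poisson_disk_downsample(points, min_dist, seed=0):
    """Keep a random subset of points whose pairwise distances are >= min_dist."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    kept = []
    for idx in rng.permutation(len(points)):
        candidate = points[idx]
        # Accept the candidate only if no kept point is closer than min_dist.
        if all(np.linalg.norm(candidate - points[k]) >= min_dist for k in kept):
            kept.append(idx)
    return points[np.array(kept, dtype=int)]
```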

1.3.3 Online Matching

Input: a depth image.
The required normals are computed using the SRI method.

1.3.3.1 Hough Voting

Given a fixed scene point pair $(s_r, s_i)$, the goal is to find the best-matching model pair $(m_r, m_i)$ and compute the $6D$ pose.
Whenever a model pair corresponding to a scene pair is found, an intermediate coordinate system is established, where $m_i$ and $s_i$ are aligned by rotating the object around the normal. The planar rotation angle $\alpha_m$ for the model is precomputed, while the analogous angle $\alpha_s$ for the scene point is computed online. The resulting planar rotation angle around the x-axis is found by a simple subtraction, $\alpha = \alpha_m - \alpha_s$.
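The voting step can be sketched as a 2D accumulator over (model reference point, rotation angle bin), where every matched pair adds its weight at $\alpha = \alpha_m - \alpha_s$. The input format, the number of angle bins, and the name `vote` are assumptions made for illustration.

```python
import numpy as np

def vote(matches, n_model_points, n_angle_bins=30):
    """matches: iterable of (model_ref_index, alpha_m, alpha_s, weight)."""
    acc = np.zeros((n_model_points, n_angle_bins))
    bin_width = 2.0 * np.pi / n_angle_bins
    for m_r, alpha_m, alpha_s, weight in matches:
        # Planar rotation angle, wrapped into [0, 2*pi).
        alpha = (alpha_m - alpha_s) % (2.0 * np.pi)
        b = int(alpha / bin_width) % n_angle_bins
        acc[m_r, b] += weight
    # The strongest peak gives the model reference point and the angle bin center.
    m_best, b_best = np.unravel_index(np.argmax(acc), acc.shape)
    return int(m_best), float((b_best + 0.5) * bin_width), acc
```

Several object instances would show up as several peaks in `acc`, so a fuller version would return all peaks above a threshold instead of only the maximum.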

1.3.3.2 Matching Disjoint Segments

The scene is segmented explicitly first, and some of the segments are then filtered out.

Treat the depth image as an undirected graph $G = \{V,E\}$ with vertices $v_i \in V$ and edges $(v_i, v_j) \in E$, where each edge has a non-negative weight $w(v_i,v_j)$. We then seek a set of components $C \in S$, where $S$ is the segmentation. The component-wise similarity is measured via the weights of the graph.
A pair-wise comparison predicate $P$ is defined as:
$P(C_1,C_2) = \begin{cases} 1, & \text{if } D(C_1,C_2) > M_{int}(C_1,C_2) \\ 0, & \text{otherwise} \end{cases}$
$D(C_1,C_2)$ is the difference between components, defined as the minimum weight edge connecting them:
$D(C_1,C_2) = \min_{v_i \in C_1, v_j \in C_2, (v_i,v_j) \in E} w((v_i,v_j))$
and the minimum internal difference $M_{int}$ equals:
$M_{int}(C_1,C_2) = \min(Int(C_1) + \tau(C_1), Int(C_2) + \tau(C_2))$
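The predicate has the same form as the one used in the graph-based segmentation of Felzenszwalb and Huttenlocher, so the sketch below follows that greedy procedure: process edges by increasing weight and merge two components whenever $P$ is false. Using the absolute depth difference of 4-connected pixels as the edge weight and $\tau(C) = k/|C|$ are assumptions here, and the sketch does not handle undefined depth values.

```python
import numpy as np

def segment_depth(depth, k=1.0):
    """Label a depth image with a greedy graph-based segmentation."""
    h, w = depth.shape
    parent = list(range(h * w))
    size = [1] * (h * w)
    internal = [0.0] * (h * w)   # Int(C): largest edge weight merged into C so far

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    edges = []
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w:
                edges.append((float(abs(depth[y, x] - depth[y, x + 1])), i, i + 1))
            if y + 1 < h:
                edges.append((float(abs(depth[y, x] - depth[y + 1, x])), i, i + w))
    edges.sort(key=lambda e: e[0])

    for weight, a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        m_int = min(internal[ra] + k / size[ra], internal[rb] + k / size[rb])
        if weight <= m_int:      # P is false: no evidence of a boundary, so merge
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = weight   # edges are sorted, so this is the new maximum
    return np.array([find(i) for i in range(h * w)]).reshape(h, w)
```

Larger `k` favors larger segments, since small components are then allowed to merge across stronger depth discontinuities.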

This approach generates a decent segmentation, but not every segment needs to be processed. Three checks are used to filter the segments:

  • undefined depth values
  • not obeying the size constraints
  • evaluating the linearity of the segments (via their normals)
1.3.3.3 Pose Clustering and Averaging
1.3.3.4 Hypotheses Verification

Categorize the visible space into clutter (outliers) $S_c$, occluders $S_o$, and points on the model $S_m$ according to the following projection error function:
$E_h(p,m) = D_p - \Phi(p|M, \Theta_h, K)$
$\Phi$ selects the projection of the model points $M$ corresponding to pixel $p$, given a camera matrix $K$ and the pose parameters $\Theta_h$ for hypothesis $h$. The classification for a given valid point $p$ is conducted as:
$p \in \begin{cases} S_m, & \text{if } |E_h(p,m)| \leq \tau_m \\ S_o, & \text{if } |E_h(p,m)| \geq \tau_o \\ S_c, & \text{otherwise} \end{cases}$
The score for a given hypothesis is:
$S_h = \left(1 - \frac{|S_o|}{N_m}\right) \cdot \frac{|S_m|}{N_m - |S_o|}$
$N_m$ is the number of model points in the valid region of the projection $\Phi(p|M, \Theta_h, K)$. $\tau_m$ and $\tau_o$ depend on the sensor and are relaxed.
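The classification and the score can be sketched directly from the two formulas. The sketch reads $D_p$ as the observed depth at pixel $p$ and takes the rendered model depth $\Phi(p|M,\Theta_h,K)$ as a precomputed array with NaN where the model does not project; both are assumptions, as is the name `score_hypothesis`.

```python
import numpy as np

def score_hypothesis(observed_depth, rendered_depth, tau_m, tau_o):
    """Score a pose hypothesis from observed and rendered model depth."""
    observed_depth = np.asarray(observed_depth, dtype=float)
    rendered_depth = np.asarray(rendered_depth, dtype=float)
    # Valid region: pixels where both the sensor and the projection have depth.
    valid = np.isfinite(observed_depth) & np.isfinite(rendered_depth)
    err = np.abs(observed_depth[valid] - rendered_depth[valid])
    n_m = int(valid.sum())
    on_model = err <= tau_m                  # S_m
    occluder = ~on_model & (err >= tau_o)    # S_o; everything else is clutter S_c
    n_on, n_occ = int(on_model.sum()), int(occluder.sum())
    if n_m == 0 or n_m == n_occ:
        return 0.0
    return (1.0 - n_occ / n_m) * n_on / (n_m - n_occ)
```

The first factor penalizes hypotheses that are mostly occluded, while the second measures how well the unoccluded part of the model agrees with the scene.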

1.4 Result

The method is evaluated on synthetic and real datasets.
Real datasets: the ACCV3D dataset

2. PPFNet: Global Context Aware Local Features for Robust 3D Point Matching

To be updated.
