Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation

Xi'an Jiaotong University

Abstract

This paper introduces the point-axis representation for oriented object detection, as depicted on aerial images in the paper's introductory figure, emphasizing its flexibility and geometrically intuitive nature with two key components: points and axes. 1) Points delineate the spatial extent and contours of objects, providing detailed shape descriptions. 2) Axes define the primary directionalities of objects, providing the orientation cues needed for precise detection. The point-axis representation decouples location and rotation, addressing the loss discontinuity issues commonly encountered in traditional bounding box-based approaches. For effective optimization without introducing additional annotations, we propose the max-projection loss to supervise point set learning and the cross-axis loss for robust axis representation learning. Further, leveraging this representation, we present the Oriented DETR model, which integrates the DETR framework for precise point-axis prediction and end-to-end detection. Experimental results demonstrate significant performance improvements in oriented object detection tasks.

Keywords: Oriented Object Detection, Aerial Object Detection, Point-Axis Representation, Detection Transformer

Challenges in Contemporary Representation Techniques

Figure
  1. Angle-based methods encounter abrupt changes in angle definitions.
  2. Quadrilateral-based methods struggle with sudden shifts in offset definitions.
  3. Gaussian distribution-based methods face the "square" problem.
  4. Point set-based methods often overlook the primary directionalities.
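Items 1 and 3 above can be made concrete with a small numerical sketch. It is an illustration only, not code from the paper: the function names are hypothetical, the angle range of [-90, 90) degrees is an assumed convention, and the covariance formula is the one commonly used by Gaussian-based detectors, Σ = R diag(w²/4, h²/4) Rᵀ.

```python
import math

def box_corners(cx, cy, w, h, theta):
    """Corners of a rotated box; theta is in radians, counter-clockwise."""
    c, s = math.cos(theta), math.sin(theta)
    local = [(w / 2, h / 2), (-w / 2, h / 2), (-w / 2, -h / 2), (w / 2, -h / 2)]
    return [(cx + c * x - s * y, cy + s * x + c * y) for x, y in local]

def corner_set_gap(a, b):
    """Largest distance from a corner of `a` to its nearest corner of `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def gaussian_covariance(w, h, theta):
    """Upper triangle (xx, xy, yy) of R diag(w^2/4, h^2/4) R^T (assumed convention)."""
    c, s = math.cos(theta), math.sin(theta)
    a, b = w * w / 4.0, h * h / 4.0
    return (a * c * c + b * s * s, (a - b) * c * s, a * s * s + b * c * c)

# Item 1: two nearly identical boxes on either side of the assumed angle boundary.
box_a = (0.0, 0.0, 4.0, 2.0, math.radians(89.0))
box_b = (0.0, 0.0, 4.0, 2.0, math.radians(-89.0))
angle_gap_deg = abs(math.degrees(box_a[4] - box_b[4]))
shape_gap = corner_set_gap(box_corners(*box_a), box_corners(*box_b))
print(f"angle gap: {angle_gap_deg:.1f} deg, corner gap: {shape_gap:.3f}")

# Item 3: for a square, the covariance does not depend on the rotation angle,
# so the orientation cannot be recovered from the Gaussian.
cov_0 = gaussian_covariance(2.0, 2.0, 0.0)
cov_30 = gaussian_covariance(2.0, 2.0, math.radians(30.0))
print("square covariances:", cov_0, cov_30)
```

The two boxes differ by a 2-degree rotation, so their corners nearly coincide, yet the regressed angle values are 178 degrees apart. A loss applied directly to the angle would therefore penalize a nearly correct prediction heavily.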

A Novel Point-Axis Representation

Figure
  1. In the point-axis representation, each oriented object is defined by a set of points combined with an axis representation. The point-axis representation disentangles location and rotation, which prevents abrupt changes at boundaries.
  2. We propose the max-projection loss to supervise point set learning and the cross-axis loss for axis learning without introducing additional annotations.
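This page does not give the exact form of the max-projection loss, so the following is only a minimal sketch of one plausible reading: project the points onto the object's axis and onto its perpendicular, and compare the largest projections with the half-extents of the annotated box. The function names, the L1 penalty, and the use of an explicit center are assumptions made for illustration.

```python
import math

def max_projection_extents(points, center, axis_angle):
    """Largest absolute projection of the points onto the axis and its normal."""
    cx, cy = center
    ux, uy = math.cos(axis_angle), math.sin(axis_angle)  # axis direction
    vx, vy = -uy, ux                                      # perpendicular direction
    along = max(abs((x - cx) * ux + (y - cy) * uy) for x, y in points)
    across = max(abs((x - cx) * vx + (y - cy) * vy) for x, y in points)
    return along, across

def max_projection_l1(points, center, axis_angle, half_w, half_h):
    """Hypothetical L1 penalty between projected extents and box half-extents."""
    along, across = max_projection_extents(points, center, axis_angle)
    return abs(along - half_w) + abs(across - half_h)

# Example: a 4 x 2 box centered at (10, 5) and rotated by 30 degrees.
theta = math.radians(30.0)
center = (10.0, 5.0)
local = [(2, 1), (-2, 1), (-2, -1), (2, -1), (0.5, 0.2), (-1.0, 0.0)]
c, s = math.cos(theta), math.sin(theta)
points = [(center[0] + c * x - s * y, center[1] + s * x + c * y) for x, y in local]

half_len, half_wid = max_projection_extents(points, center, theta)
loss = max_projection_l1(points, center, theta, 2.0, 1.0)
flipped = max_projection_extents(points, center, theta + math.pi)
print(half_len, half_wid, loss, flipped)
```

In this sketch only the extreme projections matter, so interior points are free to follow the object's shape. Flipping the axis by 180 degrees leaves the extents unchanged, which is consistent with the claim that separating points from axes avoids boundary jumps.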

Oriented DETR Model

Figure
  1. We extend the Detection Transformer (DETR) framework with point-axis prediction for detecting oriented objects.
  2. Oriented DETR introduces conditioned point queries and a points detection decoder for points prediction.

Visualization

Image

Remote Sensing

Figure

Text Spotting

Figure

Video

Previous Methods Based on Rotated Bounding Box

Oriented DETR via Point-Axis Representation

BibTeX

@article{zhao2024projecting,
  title={Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation},
  author={Zhao, Zeyang and Xue, Qilong and He, Yuhang and Bai, Yifan and Wei, Xing and Gong, Yihong},
  journal={arXiv preprint arXiv:2407.08489},
  year={2024}
}