Departamento de Informática da Universidade da Beira Interior

























SOCIA Lab. - Soft Computing and Image Analysis Group 

Department of Computer Science, University of Beira Interior, 6201-001 Covilhã, Portugal





Fully Annotated Datasets for Pedestrian Detection, Tracking, Re-Identification and Search from Aerial Devices




03-12-2020: A new file is available at the "Dataset/Download" section, containing the cropped regions-of-interest (ROIs) of all subjects in the P-DESTRE dataset, in ".jpg" format.


12-07-2020: The annotation files were updated. The repeated annotations per ID/frame and the bounding boxes with negative coordinates (i.e., partially out-of-screen objects) were removed.


21-05-2020: We updated the annotation files, and added head pose information (yaw, pitch and roll angles), obtained according to the Deep Head Pose [1] method.

[1] N. Ruiz, E. Chong and J. Rehg. Fine-Grained Head Pose Estimation Without Keypoints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018. doi: 10.1109/CVPRW.2018.00281.


01-04-2020: We are pleased to announce the availability of the first (as of April 2020) fully annotated, freely available dataset supporting research on pedestrian 1) detection; 2) tracking; 3) re-identification and 4) search methods from aerial data.


A large number of applications using unmanned aerial vehicle (UAV) sensors and platforms are being developed for agricultural, logistics, recreational and military purposes. A branch of these applications uses the UAV exclusively for remote sensing (RS), acquiring either top-view or oblique data that can be further processed at a centralized node.

At the same time, pedestrian re-identification and search are at the core of video surveillance analysis, and growing research effort has been put into developing methods able to work in real-world conditions, which is seen as a grand challenge. In particular, identifying pedestrians in crowded scenes from very low resolution and partially occluded data becomes much harder in the multi-camera/multi-session mode, where the data to be matched were acquired in different places and with time lapses that rule out the use of clothing information.

To date, pedestrian identification techniques have been evaluated mostly on tracking databases (such as PETS, VIPeR, ETHZ and i-LIDS), which offer limited soft biometric information, or even on gait recognition datasets (e.g., CASIA), whose data acquisition conditions are highly dissimilar to those typically found in surveillance environments.

As a tool to support research on pedestrian detection, tracking, re-identification and search methods, P-DESTRE is a multi-session dataset of videos of pedestrians in outdoor public environments, fully annotated at the frame level for:

1) ID. Each pedestrian has a unique identifier that is kept across the data acquisition sessions, which allows the dataset to be used for pedestrian re-identification and search problems;

2) Bounding box. The position of each pedestrian in the scene is provided as a bounding box for every frame of the dataset, which also allows the data to be used for object detection/semantic segmentation purposes;

3) Head Pose. 3D head pose information is provided in terms of "yaw", "pitch" and "roll" angles for all the bounding boxes (except backside views);

4) Soft biometrics. Each subject of the dataset is fully characterised using 16 labels: gender, age, height, body volume, ethnicity, hair color, hairstyle, beard, mustache, glasses, head accessories, action, accessories and clothing information (x3), which allows the dataset to be used for evaluating soft biometrics inference techniques as well.
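
The four annotation types listed above can be pictured as one record per pedestrian per frame. The following is only a sketch: the class name, field names and the three "clothing_*" label names are hypothetical and are not taken from the P-DESTRE annotation files, whose actual layout should be checked in the "Dataset/Download" section.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple

# The 16 soft biometric labels named in the text. The three "clothing_*"
# names are placeholders (assumption): the text only says "clothing (x3)".
SOFT_BIOMETRIC_LABELS = (
    "gender", "age", "height", "body_volume", "ethnicity", "hair_color",
    "hairstyle", "beard", "mustache", "glasses", "head_accessories",
    "action", "accessories", "clothing_1", "clothing_2", "clothing_3",
)


@dataclass
class FrameAnnotation:
    """One pedestrian in one frame (hypothetical in-memory layout)."""
    frame: int
    person_id: int
    # Bounding box as [x, y, height, width], as described for Task 1.
    box: Tuple[float, float, float, float]
    # (yaw, pitch, roll); None stands for backside views without head pose.
    head_pose: Optional[Tuple[float, float, float]] = None
    soft_biometrics: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject label names that are not among the 16 listed ones.
        unknown = set(self.soft_biometrics) - set(SOFT_BIOMETRIC_LABELS)
        if unknown:
            raise ValueError(f"unknown soft biometric labels: {sorted(unknown)}")
```

Keeping the head pose optional mirrors the statement that backside views carry no head pose information.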

The P-DESTRE data were acquired on the campuses of two different universities, with students volunteering for the data acquisition sessions.

1) The University of Beira Interior, Portugal:

2) The JSS Science and Technology University, India:


Task 1: Pedestrian Detection

The P-DESTRE dataset provides a bounding box defining the region-of-interest of every pedestrian in each frame of every scene. This information is provided in the following way:

  •  [x, y, height, width], with (x, y) being the coordinates of the upper left corner of a pedestrian region-of-interest with "height" and "width" dimensions
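
As an illustration of this box format, the sketch below converts an [x, y, height, width] box into corner coordinates and into array slices for cropping a region-of-interest. It assumes the usual image convention (x grows to the right, y grows downwards, origin at the top-left pixel), which the text does not state explicitly; the function names are hypothetical.

```python
from typing import Tuple


def box_to_corners(x: float, y: float, height: float,
                   width: float) -> Tuple[float, float, float, float]:
    """Convert [x, y, height, width] to (x_min, y_min, x_max, y_max)."""
    if height <= 0 or width <= 0:
        raise ValueError("height and width must be positive")
    return (x, y, x + width, y + height)


def crop_slices(x: float, y: float, height: float, width: float,
                frame_height: int, frame_width: int) -> Tuple[slice, slice]:
    """Return (row_slice, column_slice) for the box, clipped to the frame."""
    x_min, y_min, x_max, y_max = box_to_corners(x, y, height, width)
    top = min(max(int(y_min), 0), frame_height)
    bottom = min(max(int(y_max), 0), frame_height)
    left = min(max(int(x_min), 0), frame_width)
    right = min(max(int(x_max), 0), frame_width)
    return slice(top, bottom), slice(left, right)
```

With a frame stored as a row-major array (for example a NumPy image), `rows, cols = crop_slices(...)` followed by `frame[rows, cols]` would give the pedestrian crop. Note that the height comes before the width in the annotation order, which is easy to swap by mistake.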



Task 2: Pedestrian Tracking

Tracking information is provided implicitly by the detection information of each pedestrian in the scene, together with an ID that uniquely identifies each subject.
This makes it possible to obtain the sequence of positions of each person over time.
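
A minimal sketch of how such sequences could be assembled is given below. It groups detections by ID and orders them by frame number; the record layout (frame, ID, x, y, height, width) and the function name are assumptions made for the example, not the format of the annotation files.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (frame, person_id, x, y, height, width).
Record = Tuple[int, int, float, float, float, float]
Point = Tuple[float, float]


def build_trajectories(records: Iterable[Record]) -> Dict[int, List[Tuple[int, Point]]]:
    """Group detections by ID and return, per ID, the box centres sorted by frame."""
    tracks: Dict[int, List[Tuple[int, Point]]] = defaultdict(list)
    for frame, person_id, x, y, height, width in records:
        centre = (x + width / 2, y + height / 2)
        tracks[person_id].append((frame, centre))
    # Sort each track by frame number so positions follow the time order.
    return {pid: sorted(points, key=lambda p: p[0]) for pid, points in tracks.items()}
```

Using the box centre as the position is a design choice for the example; the bottom-centre of the box is a common alternative when a ground position is wanted.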

Tracking Example 1:

Tracking Example 2:


Task 3: (Short-Term) Pedestrian Re-Identification

This is the task of associating images of the same person taken on different occasions of the same day, i.e., assuming that subjects wear the same clothes in the different images.

Example 1:

Example 2:


Task 4: (Long-Term) Pedestrian Search

In contrast to Re-Identification, this is a more challenging task that aims to associate images of the same person using data acquired on different days, where no clothing information can be reliably used.
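
The distinction between Tasks 3 and 4 can be made concrete by splitting same-identity image pairs according to their acquisition day. The sketch below assumes that each observation comes with an ID, an acquisition date and an image name; this tuple layout and the function name are hypothetical.

```python
from datetime import date
from itertools import combinations
from typing import List, Sequence, Tuple

# Hypothetical observation layout: (person_id, acquisition_date, image_name).
Observation = Tuple[int, date, str]
Pair = Tuple[str, str]


def split_positive_pairs(observations: Sequence[Observation]) -> Tuple[List[Pair], List[Pair]]:
    """Split same-identity pairs into (re-identification, search) pairs.

    Same ID and same day      -> short-term re-identification pair.
    Same ID and different day -> long-term search pair.
    """
    reid_pairs: List[Pair] = []
    search_pairs: List[Pair] = []
    for a, b in combinations(observations, 2):
        if a[0] != b[0]:
            continue  # different identities: not a positive pair
        target = reid_pairs if a[1] == b[1] else search_pairs
        target.append((a[2], b[2]))
    return reid_pairs, search_pairs
```

For example, with made-up observations of subject 1 on two days:

```python
observations = [
    (1, date(2020, 1, 10), "a.jpg"),
    (1, date(2020, 1, 10), "b.jpg"),
    (1, date(2020, 1, 17), "c.jpg"),
    (2, date(2020, 1, 10), "d.jpg"),
]
reid_pairs, search_pairs = split_positive_pairs(observations)
# reid_pairs   == [("a.jpg", "b.jpg")]
# search_pairs == [("a.jpg", "c.jpg"), ("b.jpg", "c.jpg")]
```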

Example 1:

Example 2:













DI-UBI, Bloco VI, Rua Marquês de Ávila e Bolama, P-6201-001 Covilhã, PORTUGAL