Selected Publications

CVPRW 2018, 2018

In this paper, we introduce SoccerNet, a benchmark for action spotting in soccer videos. The dataset is composed of 500 complete soccer games from six main European leagues, covering three seasons from 2014 to 2017 and a total duration of 764 hours. A total of 6,637 temporal annotations are automatically parsed from online match reports at a one-minute resolution for three main classes of events (Goal, Yellow/Red Card, and Substitution). As such, the dataset is easily scalable. These annotations are manually refined to a one-second resolution by anchoring them at a single timestamp following well-defined soccer rules. With an average of one event every 6.9 minutes, this dataset focuses on the problem of localizing very sparse events within long videos. We define the task of spotting as finding the anchors of soccer events in a video. Making use of recent developments in the realm of generic action recognition and detection in video, we provide strong baselines for detecting soccer events. We show that our best model for classifying temporal segments of length one minute reaches a mean Average Precision (mAP) of 67.8%. For the spotting task, our baseline reaches an Average-mAP of 49.7% for tolerances ranging from 5 to 60 seconds.
CVPRW’18, 2018

Despite the numerous developments in object tracking, further improvement of current tracking algorithms is limited by small and mostly saturated datasets. As a matter of fact, data-hungry trackers based on deep learning currently rely on object detection datasets due to the scarcity of dedicated large-scale tracking datasets. In this work, we present TrackingNet, the first large-scale dataset and benchmark for object tracking in the wild. We provide more than 30K videos with more than 14 million dense bounding box annotations. Our dataset covers a wide selection of object classes in broad and diverse contexts. By releasing such a large-scale dataset, we expect deep trackers to further improve and generalize. In addition, we introduce a new benchmark composed of 500 novel videos, modeled with a distribution similar to our training dataset. By sequestering the annotation of the test set and providing an online evaluation server, we provide a fair benchmark for future development of object trackers. Deep trackers fine-tuned on a fraction of our dataset improve their performance by up to 1.6% on OTB100 and up to 1.7% on TrackingNet Test. We provide an extensive benchmark on TrackingNet by evaluating more than 20 trackers. Our results suggest that object tracking in the wild is far from being solved.
ECCV’18, 2018

In this study, we evaluate the performance of the body-tracking algorithm of the Kinect V2, a low-cost time-of-flight camera, for medical rehabilitation purposes…
MobiHealth, 2016

Recent Publications

A Moving 3D Laser Scanner for Automated Underbridge Inspection. Multidisciplinary Digital Publishing Institute, Machines, 2017.

A Solution for Crime Scene Reconstruction using Time-of-Flight Cameras. ArXiv, 2017.

Recognition of Children on Age-Different Images: Facial Morphology and Age-Stable Features. Elsevier, 2017.

Accuracy of the Microsoft Kinect System in the Identification of the Body Posture. MobiHealth, 2016.


Metrological Characterisation of the Kinect V2

In this project, we investigate the performance of the Kinect V2 3D camera as a depth sensor and human body tracker.

Visione Industria

Visione Industria is an attempt to bring computer vision to Italian industry.

Computer Vision in Forensic Sciences

This collaboration is an attempt to apply computer vision to forensic applications.

Human Body Rehabilitation

In this project, we track the human body to assess its performance.

Under Bridge Reconstruction

In this project, we reconstructed the geometry under bridges for civil engineering purposes.

Robot Painter

In this project, we fitted a tool with reflective markers to track its position and orientation over time and reproduce the trajectory with an anthropomorphic robot.

La Volage

La Volage is a lamp design based on ultrasound technology that varies the light intensity as a function of its height in a room.

Academic Experience


Industrial and Information Engineering School, Politecnico di Milano, Italy:

  • 2nd year B.Sc. Mechanical Engineering, Mechanical and Thermal Measurements (2015, 2016).
  • 3rd year B.Sc. Energy Engineering, Industrial Measurement and Instruments (2014, 2016).
  • 1st year M.Sc. Mechanical Engineering, Measurements (2015, 2016).

Master's Thesis Tutoring/Collaboration