MFP3D: Monocular Food Portion Estimation Leveraging 3D Point Clouds
Saved in:
Main authors: | , , , , , |
---|---|
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Order full text |
Abstract: | Food portion estimation is crucial for monitoring health and tracking dietary intake. Image-based dietary assessment, which involves analyzing eating occasion images using computer vision techniques, is increasingly replacing traditional methods such as 24-hour recalls. However, accurately estimating the nutritional content from images remains challenging due to the loss of 3D information when projecting to the 2D image plane. Existing portion estimation methods are difficult to deploy in real-world scenarios due to their reliance on specific requirements, such as physical reference objects, high-quality depth information, or multi-view images and videos. In this paper, we introduce MFP3D, a new framework for accurate food portion estimation using only a single monocular image. Specifically, MFP3D consists of three key modules: (1) a 3D Reconstruction Module that generates a 3D point cloud representation of the food from the 2D image, (2) a Feature Extraction Module that extracts and concatenates features from both the 3D point cloud and the 2D RGB image, and (3) a Portion Regression Module that employs a deep regression model to estimate the food's volume and energy content based on the extracted features. MFP3D is evaluated on the MetaFood3D dataset, demonstrating significant improvement in portion estimation accuracy over existing methods. |
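The three-module pipeline described in the abstract can be sketched as follows. This is a minimal illustrative stand-in, not the authors' implementation: the point-cloud reconstruction, pooled statistics as features, and the single linear regression head are all placeholder assumptions; the paper's actual modules are learned deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def reconstruct_point_cloud(image: np.ndarray, n_points: int = 1024) -> np.ndarray:
    """3D Reconstruction Module (stand-in): lift a 2D RGB image to an
    (n_points, 3) point cloud. Here we simply sample pixel coordinates and
    use intensity as a fake depth; the paper uses learned monocular
    reconstruction instead."""
    flat = image.reshape(-1, image.shape[-1]).mean(axis=1)
    idx = rng.choice(flat.size, size=n_points)
    xy = np.stack(np.divmod(idx, image.shape[1]), axis=1).astype(float)
    z = flat[idx][:, None]
    return np.concatenate([xy, z], axis=1)

def extract_features(points: np.ndarray, image: np.ndarray) -> np.ndarray:
    """Feature Extraction Module (stand-in): pool simple statistics from
    both modalities and concatenate them, mirroring the abstract's
    'extract and concatenate' step."""
    pc_feat = np.concatenate([points.mean(axis=0), points.std(axis=0)])
    rgb_feat = np.concatenate([image.mean(axis=(0, 1)), image.std(axis=(0, 1))])
    return np.concatenate([pc_feat, rgb_feat])

class PortionRegressor:
    """Portion Regression Module (stand-in): a single linear layer mapping
    the fused features to two outputs (volume, energy)."""
    def __init__(self, in_dim: int):
        self.W = rng.normal(scale=0.01, size=(2, in_dim))
        self.b = np.zeros(2)

    def __call__(self, feat: np.ndarray) -> np.ndarray:
        return self.W @ feat + self.b

# End-to-end pass on a dummy image: reconstruct -> fuse features -> regress.
image = rng.random((64, 64, 3))
points = reconstruct_point_cloud(image)
feat = extract_features(points, image)
pred = PortionRegressor(feat.size)(feat)
print(pred.shape)  # two scalars: predicted volume and energy
```

In the real framework each stand-in would be replaced by a trained network (e.g., a point-cloud encoder and an image CNN feeding a deep regression head), but the data flow between the three modules is the same.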
DOI: | 10.48550/arxiv.2411.10492 |