The Hamlyn Centre
The Institute of Global Health Innovation
Benny Lo, PhD, The Hamlyn Centre, Department of Surgery and Cancer

Passive Dietary Monitoring - the use of wearable cameras and AI to quantify dietary intake

An innovative passive dietary monitoring system
The Bill and Melinda Gates Foundation funded project "An Innovative Passive Dietary Monitoring System" aims to develop a passive dietary monitoring system for people living in low- or middle-income countries (LMICs) that does not rely on individuals actively recording their own intake. The project covers both urban and rural areas in two African countries, Uganda and Ghana. To capture individual dietary intake, wearable cameras and fixed cameras are integrated into the system to record food preparation and eating activities in kitchens and dining areas. Extensive studies and field trials are being carried out in home settings in Uganda and Ghana.

Nutrition intake estimation
The pipeline runs from volume estimation (~200 g), to food recognition (here, jackfruit), to a nutrient lookup in the USDA National Nutrient Database:

Nutrient            Amount
Water (g)           146.92
Energy (kcal)       190
Protein (g)         3.44
Fat (g)             1.28
Carbohydrate (g)    46.5
Fiber (g)           3
Sugars (g)          38.16
Calcium (mg)        48
Iron (mg)           0.46
Magnesium (mg)      58
Phosphorus (mg)     42
Potassium (mg)      896
Sodium (mg)         4
Vitamin C (mg)      27.4

Food consumption
A wearable camera was mounted on the subject's shoulder, on the same side as the subject's dominant hand, to capture the entire eating episode.

Qiu, J., Lo, F.P.W., Jiang, S., Tsai, C., Sun, Y. and Lo, B., 2020. Counting Bites and Recognizing Consumed Food from Videos for Passive Dietary Monitoring. IEEE Journal of Biomedical and Health Informatics.

Egocentric Video Clips
Snapshots of captured egocentric videos.
• A new dataset was constructed, containing 1,022 egocentric video clips of dietary intake; 66 unique and visible food items were identified in the dataset.
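The lookup step above can be sketched as scaling per-100 g values from a USDA-style table by the estimated portion weight. This is a minimal illustration: the `PER_100G` table below is hypothetical (the jackfruit figures are simply the slide's ~200 g values halved), and the database interface is an assumption, not the project's actual implementation.

```python
# Hypothetical per-100 g nutrient table (the slide's ~200 g jackfruit
# values divided by two), standing in for a USDA database lookup.
PER_100G = {
    "Water (g)": 73.46,
    "Energy (kcal)": 95.0,
    "Protein (g)": 1.72,
    "Fat (g)": 0.64,
    "Carbohydrate (g)": 23.25,
    "Potassium (mg)": 448.0,
}

def estimate_nutrients(per_100g: dict[str, float], grams: float) -> dict[str, float]:
    """Scale per-100 g nutrient values to an estimated portion weight."""
    factor = grams / 100.0
    return {nutrient: round(value * factor, 2) for nutrient, value in per_100g.items()}

# A ~200 g estimated portion reproduces the slide's figures,
# e.g. Energy (kcal): 190.0, Water (g): 146.92.
print(estimate_nutrients(PER_100G, 200))
```

In the actual system the portion weight would come from the volume-estimation step and the food label from the recognition step; here both are supplied by hand.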
Dataset
Meal statistics:
- 8 meal classes
- 18 fine-grained meal classes
- 66 unique food items (food ingredients and drinks)

Results of Bite Counting
• 64.89% accuracy in counting bites directly from videos.

Results of General Food Recognition
• 97.55% accuracy in classifying a meal into 8 classes; 54.77% accuracy in classifying it into 18 fine-grained classes; 65% accuracy in recognizing visible food items.

Results of Consumed Food Recognition
• 40.5% accuracy in recognizing the food items actually consumed by the subjects.
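As a rough illustration of how a bite-counting accuracy like the one above could be scored, the sketch below counts a clip as correct when the predicted bite count exactly matches the ground truth. The exact-match criterion is an assumption for illustration, not necessarily the metric used in the paper.

```python
def bite_count_accuracy(true_counts: list[int], predicted_counts: list[int]) -> float:
    """Fraction of clips whose predicted bite count exactly matches the
    ground truth. (Exact-match scoring is an illustrative assumption.)"""
    correct = sum(t == p for t, p in zip(true_counts, predicted_counts))
    return correct / len(true_counts)

# Three of four hypothetical clips are counted correctly.
print(bite_count_accuracy([12, 8, 15, 10], [12, 9, 15, 10]))  # 0.75
```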
Results
[Figure: recognized food items (ingredients and drinks) per clip, comparing the ground truth against TSM and SlowFast predictions over sample frames. The top four rows are samples of recognizing visible food items (TSM F1 59.5%; SlowFast F1 65.0%); the bottom four rows are samples of recognizing consumed food items (TSM F1 37.8%; SlowFast Two-Head F1 40.5%). True positives are shown in green and false positives in red. For example, for a meal with ground truth {chicken, water, rice, takuan, celery, green_bean}, TSM predicts {chicken, rice, takuan, celery, green_bean} while SlowFast predicts {water, rice, takuan, celery, green_bean, pork_ribs}.]
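The F1 scores above can be illustrated by treating the predicted and ground-truth food items of each clip as sets and micro-averaging over clips. How the paper aggregates F1 is an assumption here; this sketch only shows the set-based idea.

```python
def food_item_f1(ground_truth: list[set[str]], predicted: list[set[str]]) -> float:
    """Micro-averaged F1 over clips, with each clip's food items as a set.
    (The aggregation scheme is an illustrative assumption.)"""
    tp = sum(len(gt & pred) for gt, pred in zip(ground_truth, predicted))
    fp = sum(len(pred - gt) for gt, pred in zip(ground_truth, predicted))
    fn = sum(len(gt - pred) for gt, pred in zip(ground_truth, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# One clip: the prediction misses "water" (5 true positives, 1 false negative).
gt = [{"chicken", "water", "rice", "takuan", "celery", "green_bean"}]
pred = [{"chicken", "rice", "takuan", "celery", "green_bean"}]
print(round(food_item_f1(gt, pred), 3))  # 0.909
```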
Studies
• Study 1: Laboratory validation of food intake estimation devices
• Study 2: Acceptability and feasibility in the field
  • Phase 1: Household food behaviour
  • Phase 2: Pre-field-test data gathering prior to the preliminary field test:
    • Acceptability of the devices
    • Preliminary field test for acceptability, reliability and performance of the recording devices
• Study 3: Field validation studies in Uganda and Ghana
  • Phase 1: Preliminary field data (~4 households at each site, ~16 in total, lasting one day)
  • Phase 2: System validation in target populations (~22 households at each site, ~88 in total, lasting three consecutive days)

Large datasets – Ghana study
• Study 1: 700k images
• Study 2: 2.9M images
• Study 3: ~7M images
Food images captured by eButton.

Clustering Egocentric Images in Passive Dietary Monitoring with Self-Supervised Learning
In passive dietary monitoring, wearable cameras continuously capture the subjects' activities, which yields a massive amount of data that must be cleaned and annotated before any analysis can be conducted.

Peng et al., Clustering Egocentric Images in Passive Dietary Monitoring with Self-Supervised Learning, in IEEE International Conference on Biomedical and Health Informatics (BHI22), Ioannina, Greece, Sep 27-30 2022.

Objective
We propose a novel self-supervised learning framework, named CM-Net, to:
• Cluster the large volume of egocentric images into separate events
• Ease the data post-processing and annotation tasks for annotators and dietitians

The proposed pipeline for clustering raw egocentric images into separate events.
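The core idea of grouping a continuous egocentric image stream into separate events can be sketched very simply: embed each image and start a new event whenever consecutive embeddings stop looking alike. The threshold-based segmentation below is an illustrative stand-in, not CM-Net's actual clustering method, and the toy vectors stand in for features from a self-supervised encoder.

```python
import numpy as np

def segment_events(embeddings: np.ndarray, threshold: float = 0.8) -> list[int]:
    """Assign an event id to each image embedding (one row per image),
    opening a new event when consecutive cosine similarity drops below
    `threshold`. (A simple stand-in for learned event clustering.)"""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    labels, event = [0], 0
    for prev, cur in zip(normed, normed[1:]):
        if float(prev @ cur) < threshold:  # appearance change -> new event
            event += 1
        labels.append(event)
    return labels

# Two toy "scenes": three near-identical frames of each.
a = np.zeros(8); a[0] = 1.0
b = np.zeros(8); b[1] = 1.0
frames = np.stack([a, a + 0.01, a - 0.01, b, b + 0.01, b - 0.01])
print(segment_events(frames))  # [0, 0, 0, 1, 1, 1]
```

In the real pipeline the embeddings would come from the self-supervised pre-trained network, and annotators would then label whole events rather than individual images.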
Datasets
Dataset-L (Large): 256,872 unprocessed, unlabelled egocentric images taken from various individuals, households and areas; used for self-supervised pre-training.
Dataset-S (Small): 4,954 images covering 199 different dietary events; each image is assigned a label indicating which event it belongs to; used for testing the performance of self-supervised learning frameworks.

[Figure 3: statistics of Dataset-L (a) and Dataset-S (b). Recoverable counts: Mother 55,206 (21%), Father 21,153 (8%), Child 36,297 (14%); Mother 84,402 (33%), Father 37,244 (15%), Child 22,570 (9%); Urban 112,656 (44%), Rural 144,216 (56%).]

Results - Clustering
[Figure: example clusters produced by CM-Net and MAE, including events 38, 44, 96 & 97 and their outliers.]
Our CM-Net is able to merge images from the same event into a single cluster where MAE fails, and better separates events whose images look similar. For example, event 44 resembles event 38, as both contain similar actions and objects: each depicts eating from a bowl (yellow and orange, respectively). CM-Net recognizes this difference and separates the two events, whereas MAE clusters them together.

The Hamlyn Centre
The Institute of Global Health Innovation
Benny Lo, PhD
benny.lo@imperial.ac.uk