You are what you eat? Feeding foundation models a regionally diverse food dataset of World Wide Dishes

Foundation models are increasingly ubiquitous in our daily lives, used in everyday tasks such as text-image searches, interactions with chatbots, and content generation. As use increases, so does concern over the disparities in performance and fairness of these models for different people in differe...

Bibliographic details
Main authors:	Magomere, Jabez; Ishida, Shu; Afonja, Tejumade; Salama, Aya; Kochin, Daniel; Yuehgoh, Foutse; Hamzaoui, Imane; Sefala, Raesetje; Alaagib, Aisha; Semenova, Elizaveta; Crais, Lauren; Hall, Siobhan Mackenzie
Format:	Article
Language:	eng
Subjects:	Computer Science - Artificial Intelligence; Computer Science - Computers and Society

creator	Magomere, Jabez; Ishida, Shu; Afonja, Tejumade; Salama, Aya; Kochin, Daniel; Yuehgoh, Foutse; Hamzaoui, Imane; Sefala, Raesetje; Alaagib, Aisha; Semenova, Elizaveta; Crais, Lauren; Hall, Siobhan Mackenzie
description	Foundation models are increasingly ubiquitous in our daily lives, used in everyday tasks such as text-image searches, interactions with chatbots, and content generation. As use increases, so does concern over the disparities in performance and fairness of these models for different people in different parts of the world. To assess these growing regional disparities, we present World Wide Dishes, a mixed text and image dataset consisting of 765 dishes, with dish names collected in 131 local languages. World Wide Dishes has been collected purely through human contribution and decentralised means, by creating a website widely distributed through social networks. Using the dataset, we demonstrate a novel means of operationalising capability and representational biases in foundation models such as language models and text-to-image generative models. We enrich these studies with a pilot community review to understand, from a first-person perspective, how these models generate images for people in five African countries and the United States. We find that these models generally do not produce quality text and image outputs of dishes specific to different regions. This is true even for the US, which is typically considered to be more well-resourced in training data - though the generation of US dishes does outperform that of the investigated African countries. The models demonstrate a propensity to produce outputs that are inaccurate as well as culturally misrepresentative, flattening, and insensitive. These failures in capability and representational bias have the potential to further reinforce stereotypes and disproportionately contribute to erasure based on region. The dataset and code are available at https://github.com/oxai/world-wide-dishes/.
doi_str_mv	10.48550/arxiv.2406.09496
format	Article
fullrecord	<record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2406_09496</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2406_09496</sourcerecordid><originalsourceid>FETCH-LOGICAL-a676-58c85a6419fa2611ce39206cc03e546b9470a1a4a124319873df69c1c70c3b963</originalsourceid><addsrcrecordid>eNotj8FKAzEURbNxIdUPcOX7gRmTSSYzWYlUq0LBTaG4Gl6TlzYwnUgyrfbvHWtXlwOHC4exO8FL1dY1f8D0E45lpbguuVFGXzP_GQ-AieB7hyOcJiAcH2FB5MKwBR8Pg8MxxAH20VGfASHRdmLs-xO4cKSUadKig8nDTCNED-uYegfr4AieQ95RvmFXHvtMt5edsdXiZTV_K5Yfr-_zp2WButFF3dq2Rq2E8VhpISxJU3FtLZdUK70xquEoUKGolBSmbaTz2lhhG27lxmg5Y_f_t-fQ7iuFPaZT9xfcnYPlL4gJUFQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>You are what you eat? Feeding foundation models a regionally diverse food dataset of World Wide Dishes</title><source>arXiv.org</source><creator>Magomere, Jabez ; Ishida, Shu ; Afonja, Tejumade ; Salama, Aya ; Kochin, Daniel ; Yuehgoh, Foutse ; Hamzaoui, Imane ; Sefala, Raesetje ; Alaagib, Aisha ; Semenova, Elizaveta ; Crais, Lauren ; Hall, Siobhan Mackenzie</creator><creatorcontrib>Magomere, Jabez ; Ishida, Shu ; Afonja, Tejumade ; Salama, Aya ; Kochin, Daniel ; Yuehgoh, Foutse ; Hamzaoui, Imane ; Sefala, Raesetje ; Alaagib, Aisha ; Semenova, Elizaveta ; Crais, Lauren ; Hall, Siobhan Mackenzie</creatorcontrib><description>Foundation models are increasingly ubiquitous in our daily lives, used in everyday tasks such as text-image searches, interactions with chatbots, and content generation. As use increases, so does concern over the disparities in performance and fairness of these models for different people in different parts of the world. To assess these growing regional disparities, we present World Wide Dishes, a mixed text and image dataset consisting of 765 dishes, with dish names collected in 131 local languages. 
World Wide Dishes has been collected purely through human contribution and decentralised means, by creating a website widely distributed through social networks. Using the dataset, we demonstrate a novel means of operationalising capability and representational biases in foundation models such as language models and text-to-image generative models. We enrich these studies with a pilot community review to understand, from a first-person perspective, how these models generate images for people in five African countries and the United States. We find that these models generally do not produce quality text and image outputs of dishes specific to different regions. This is true even for the US, which is typically considered to be more well-resourced in training data - though the generation of US dishes does outperform that of the investigated African countries. The models demonstrate a propensity to produce outputs that are inaccurate as well as culturally misrepresentative, flattening, and insensitive. These failures in capability and representational bias have the potential to further reinforce stereotypes and disproportionately contribute to erasure based on region. 
The dataset and code are available at https://github.com/oxai/world-wide-dishes/.</description><identifier>DOI: 10.48550/arxiv.2406.09496</identifier><language>eng</language><subject>Computer Science - Artificial Intelligence ; Computer Science - Computers and Society</subject><creationdate>2024-06</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2406.09496$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2406.09496$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Magomere, Jabez</creatorcontrib><creatorcontrib>Ishida, Shu</creatorcontrib><creatorcontrib>Afonja, Tejumade</creatorcontrib><creatorcontrib>Salama, Aya</creatorcontrib><creatorcontrib>Kochin, Daniel</creatorcontrib><creatorcontrib>Yuehgoh, Foutse</creatorcontrib><creatorcontrib>Hamzaoui, Imane</creatorcontrib><creatorcontrib>Sefala, Raesetje</creatorcontrib><creatorcontrib>Alaagib, Aisha</creatorcontrib><creatorcontrib>Semenova, Elizaveta</creatorcontrib><creatorcontrib>Crais, Lauren</creatorcontrib><creatorcontrib>Hall, Siobhan Mackenzie</creatorcontrib><title>You are what you eat? Feeding foundation models a regionally diverse food dataset of World Wide Dishes</title><description>Foundation models are increasingly ubiquitous in our daily lives, used in everyday tasks such as text-image searches, interactions with chatbots, and content generation. As use increases, so does concern over the disparities in performance and fairness of these models for different people in different parts of the world. 
To assess these growing regional disparities, we present World Wide Dishes, a mixed text and image dataset consisting of 765 dishes, with dish names collected in 131 local languages. World Wide Dishes has been collected purely through human contribution and decentralised means, by creating a website widely distributed through social networks. Using the dataset, we demonstrate a novel means of operationalising capability and representational biases in foundation models such as language models and text-to-image generative models. We enrich these studies with a pilot community review to understand, from a first-person perspective, how these models generate images for people in five African countries and the United States. We find that these models generally do not produce quality text and image outputs of dishes specific to different regions. This is true even for the US, which is typically considered to be more well-resourced in training data - though the generation of US dishes does outperform that of the investigated African countries. The models demonstrate a propensity to produce outputs that are inaccurate as well as culturally misrepresentative, flattening, and insensitive. These failures in capability and representational bias have the potential to further reinforce stereotypes and disproportionately contribute to erasure based on region. 
The dataset and code are available at https://github.com/oxai/world-wide-dishes/.</description><subject>Computer Science - Artificial Intelligence</subject><subject>Computer Science - Computers and Society</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNotj8FKAzEURbNxIdUPcOX7gRmTSSYzWYlUq0LBTaG4Gl6TlzYwnUgyrfbvHWtXlwOHC4exO8FL1dY1f8D0E45lpbguuVFGXzP_GQ-AieB7hyOcJiAcH2FB5MKwBR8Pg8MxxAH20VGfASHRdmLs-xO4cKSUadKig8nDTCNED-uYegfr4AieQ95RvmFXHvtMt5edsdXiZTV_K5Yfr-_zp2WButFF3dq2Rq2E8VhpISxJU3FtLZdUK70xquEoUKGolBSmbaTz2lhhG27lxmg5Y_f_t-fQ7iuFPaZT9xfcnYPlL4gJUFQ</recordid><startdate>20240613</startdate><enddate>20240613</enddate><creator>Magomere, Jabez</creator><creator>Ishida, Shu</creator><creator>Afonja, Tejumade</creator><creator>Salama, Aya</creator><creator>Kochin, Daniel</creator><creator>Yuehgoh, Foutse</creator><creator>Hamzaoui, Imane</creator><creator>Sefala, Raesetje</creator><creator>Alaagib, Aisha</creator><creator>Semenova, Elizaveta</creator><creator>Crais, Lauren</creator><creator>Hall, Siobhan Mackenzie</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240613</creationdate><title>You are what you eat? 
Feeding foundation models a regionally diverse food dataset of World Wide Dishes</title><author>Magomere, Jabez ; Ishida, Shu ; Afonja, Tejumade ; Salama, Aya ; Kochin, Daniel ; Yuehgoh, Foutse ; Hamzaoui, Imane ; Sefala, Raesetje ; Alaagib, Aisha ; Semenova, Elizaveta ; Crais, Lauren ; Hall, Siobhan Mackenzie</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a676-58c85a6419fa2611ce39206cc03e546b9470a1a4a124319873df69c1c70c3b963</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Artificial Intelligence</topic><topic>Computer Science - Computers and Society</topic><toplevel>online_resources</toplevel><creatorcontrib>Magomere, Jabez</creatorcontrib><creatorcontrib>Ishida, Shu</creatorcontrib><creatorcontrib>Afonja, Tejumade</creatorcontrib><creatorcontrib>Salama, Aya</creatorcontrib><creatorcontrib>Kochin, Daniel</creatorcontrib><creatorcontrib>Yuehgoh, Foutse</creatorcontrib><creatorcontrib>Hamzaoui, Imane</creatorcontrib><creatorcontrib>Sefala, Raesetje</creatorcontrib><creatorcontrib>Alaagib, Aisha</creatorcontrib><creatorcontrib>Semenova, Elizaveta</creatorcontrib><creatorcontrib>Crais, Lauren</creatorcontrib><creatorcontrib>Hall, Siobhan Mackenzie</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Magomere, Jabez</au><au>Ishida, Shu</au><au>Afonja, Tejumade</au><au>Salama, Aya</au><au>Kochin, Daniel</au><au>Yuehgoh, Foutse</au><au>Hamzaoui, Imane</au><au>Sefala, Raesetje</au><au>Alaagib, Aisha</au><au>Semenova, Elizaveta</au><au>Crais, Lauren</au><au>Hall, Siobhan Mackenzie</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>You are what you eat? 
Feeding foundation models a regionally diverse food dataset of World Wide Dishes</atitle><date>2024-06-13</date><risdate>2024</risdate><abstract>Foundation models are increasingly ubiquitous in our daily lives, used in everyday tasks such as text-image searches, interactions with chatbots, and content generation. As use increases, so does concern over the disparities in performance and fairness of these models for different people in different parts of the world. To assess these growing regional disparities, we present World Wide Dishes, a mixed text and image dataset consisting of 765 dishes, with dish names collected in 131 local languages. World Wide Dishes has been collected purely through human contribution and decentralised means, by creating a website widely distributed through social networks. Using the dataset, we demonstrate a novel means of operationalising capability and representational biases in foundation models such as language models and text-to-image generative models. We enrich these studies with a pilot community review to understand, from a first-person perspective, how these models generate images for people in five African countries and the United States. We find that these models generally do not produce quality text and image outputs of dishes specific to different regions. This is true even for the US, which is typically considered to be more well-resourced in training data - though the generation of US dishes does outperform that of the investigated African countries. The models demonstrate a propensity to produce outputs that are inaccurate as well as culturally misrepresentative, flattening, and insensitive. These failures in capability and representational bias have the potential to further reinforce stereotypes and disproportionately contribute to erasure based on region. 
The dataset and code are available at https://github.com/oxai/world-wide-dishes/.</abstract><doi>10.48550/arxiv.2406.09496</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	DOI: 10.48550/arxiv.2406.09496
language	eng
recordid	cdi_arxiv_primary_2406_09496
source	arXiv.org
subjects	Computer Science - Artificial Intelligence; Computer Science - Computers and Society
title	You are what you eat? Feeding foundation models a regionally diverse food dataset of World Wide Dishes
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-20T00%3A58%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=You%20are%20what%20you%20eat?%20Feeding%20foundation%20models%20a%20regionally%20diverse%20food%20dataset%20of%20World%20Wide%20Dishes&rft.au=Magomere,%20Jabez&rft.date=2024-06-13&rft_id=info:doi/10.48550/arxiv.2406.09496&rft_dat=%3Carxiv_GOX%3E2406_09496%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true