Algebraic Positional Encodings

Detailed Description

We introduce a novel positional encoding strategy for Transformer-style models, addressing the shortcomings of existing, often ad hoc, approaches. Our framework provides a flexible mapping from the algebraic specification of a domain to an interpretation as orthogonal operators. This design preserves the algebraic characteristics of the source domain, ensuring that the model upholds its desired structural properties. Our scheme can accommodate various structures, including sequences, grids and trees, as well as their compositions. We conduct a series of experiments to demonstrate the practical applicability of our approach. Results suggest performance on par with or surpassing the current state-of-the-art, without hyper-parameter optimizations or "task search" of any kind. Code is available at https://github.com/konstantinosKokos/ape.
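To make the abstract's central idea concrete, here is a minimal sketch of the sequence case, assuming a rotary-style setup in which position n is interpreted as the n-th power of a learned orthogonal operator acting on queries and keys, so that attention scores depend only on relative offsets. This is an illustrative reconstruction under that assumption, not the authors' implementation (see the linked repository for that); all names are hypothetical.

```python
# Illustrative sketch: positions as powers of a learned orthogonal operator.
# Hypothetical names; the authors' code lives at
# https://github.com/konstantinosKokos/ape.
import torch

def orthogonal_from_skew(params: torch.Tensor) -> torch.Tensor:
    """Build an orthogonal matrix as the exponential of a skew-symmetric matrix."""
    skew = params - params.T           # P - P^T is skew-symmetric
    return torch.matrix_exp(skew)      # exp of a skew matrix is orthogonal

d, seq_len = 8, 5                      # head dimension and length (illustrative)
W = orthogonal_from_skew(torch.randn(d, d))  # one generator for "next position"

# Position n is interpreted as the operator W^n; precompute W^0 .. W^{L-1}.
ops = [torch.eye(d)]
for _ in range(seq_len - 1):
    ops.append(ops[-1] @ W)
ops = torch.stack(ops)                 # (seq_len, d, d)

q = torch.randn(seq_len, d)
k = torch.randn(seq_len, d)
q_pos = torch.einsum('nij,nj->ni', ops, q)   # rotate each query by its position
k_pos = torch.einsum('nij,nj->ni', ops, k)   # rotate each key by its position

# Orthogonality gives (W^m q) . (W^n k) = q . W^(n-m) k:
# scores depend only on the relative offset n - m.
scores = q_pos @ k_pos.T
m, n = 1, 3
rel = q[m] @ torch.matrix_power(W, n - m) @ k[n]
assert torch.allclose(scores[m, n], rel, atol=1e-4)
```

Under the paper's framing, richer domains (grids, trees, and their compositions) would replace the single generator W with one orthogonal generator per primitive step of the domain's algebraic specification.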

Bibliographic Details
Main Authors: Kogkalidis, Konstantinos; Bernardy, Jean-Philippe; Garg, Vikas
Format: Article
Language: English
Published: 2023-12-26
Subjects: Computer Science - Artificial Intelligence; Computer Science - Learning
DOI: 10.48550/arxiv.2312.16045
Online Access: Order full text (https://arxiv.org/abs/2312.16045)