Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models
The imitation of cursive handwriting is mainly limited to generating handwritten words or lines. Multiple synthetic outputs must be stitched together to create paragraphs or whole pages, whereby consistency and layout information are lost. To close this gap, we propose a method for imitating handwri...
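Main authors: | Mayr, Martin; Dreier, Marcel; Kordon, Florian; Seuret, Mathias; Zöllner, Jochen; Wu, Fei; Maier, Andreas; Christlein, Vincent |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |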
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | |
container_volume | |
creator | Mayr, Martin; Dreier, Marcel; Kordon, Florian; Seuret, Mathias; Zöllner, Jochen; Wu, Fei; Maier, Andreas; Christlein, Vincent |
description | The imitation of cursive handwriting is mainly limited to generating
handwritten words or lines. Multiple synthetic outputs must be stitched
together to create paragraphs or whole pages, whereby consistency and layout
information are lost. To close this gap, we propose a method for imitating
handwriting at the paragraph level that also works for unseen writing styles.
Therefore, we introduce a modified latent diffusion model that enriches the
encoder-decoder mechanism with specialized loss functions that explicitly
preserve the style and content. We enhance the attention mechanism of the
diffusion model with adaptive 2D positional encoding and the conditioning
mechanism to work with two modalities simultaneously: a style image and the
target text. This significantly improves the realism of the generated
handwriting. Our approach sets a new benchmark in our comprehensive evaluation.
It outperforms all existing imitation methods at both line and paragraph
levels, considering combined style and content preservation. |
doi_str_mv | 10.48550/arxiv.2409.00786 |
format | Article |
fullrecord | <record><control><sourceid>arxiv_GOX</sourceid><recordid>TN_cdi_arxiv_primary_2409_00786</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2409_00786</sourcerecordid><originalsourceid>FETCH-arxiv_primary_2409_007863</originalsourceid><addsrcrecordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjGw1DMwMLcw42TwiUotytcNzsgvUQhILEpML0osyNDNSS1LzVHwSMxLKS_KLMnMS1fwzM0sSSzJzM9TKM8syVDwSSxJzStRcMlMSystBon65qek5hTzMLCmJeYUp_JCaW4GeTfXEGcPXbC98QVFmbmJRZXxIPvjwfYbE1YBAA-OO24</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models</title><source>arXiv.org</source><creator>Mayr, Martin ; Dreier, Marcel ; Kordon, Florian ; Seuret, Mathias ; Zöllner, Jochen ; Wu, Fei ; Maier, Andreas ; Christlein, Vincent</creator><creatorcontrib>Mayr, Martin ; Dreier, Marcel ; Kordon, Florian ; Seuret, Mathias ; Zöllner, Jochen ; Wu, Fei ; Maier, Andreas ; Christlein, Vincent</creatorcontrib><description>The imitation of cursive handwriting is mainly limited to generating
handwritten words or lines. Multiple synthetic outputs must be stitched
together to create paragraphs or whole pages, whereby consistency and layout
information are lost. To close this gap, we propose a method for imitating
handwriting at the paragraph level that also works for unseen writing styles.
Therefore, we introduce a modified latent diffusion model that enriches the
encoder-decoder mechanism with specialized loss functions that explicitly
preserve the style and content. We enhance the attention mechanism of the
diffusion model with adaptive 2D positional encoding and the conditioning
mechanism to work with two modalities simultaneously: a style image and the
target text. This significantly improves the realism of the generated
handwriting. Our approach sets a new benchmark in our comprehensive evaluation.
It outperforms all existing imitation methods at both line and paragraph
levels, considering combined style and content preservation.</description><identifier>DOI: 10.48550/arxiv.2409.00786</identifier><language>eng</language><subject>Computer Science - Computer Vision and Pattern Recognition</subject><creationdate>2024-09</creationdate><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,780,885</link.rule.ids><linktorsrc>$$Uhttps://arxiv.org/abs/2409.00786$$EView_record_in_Cornell_University$$FView_record_in_$$GCornell_University$$Hfree_for_read</linktorsrc><backlink>$$Uhttps://doi.org/10.48550/arXiv.2409.00786$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Mayr, Martin</creatorcontrib><creatorcontrib>Dreier, Marcel</creatorcontrib><creatorcontrib>Kordon, Florian</creatorcontrib><creatorcontrib>Seuret, Mathias</creatorcontrib><creatorcontrib>Zöllner, Jochen</creatorcontrib><creatorcontrib>Wu, Fei</creatorcontrib><creatorcontrib>Maier, Andreas</creatorcontrib><creatorcontrib>Christlein, Vincent</creatorcontrib><title>Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models</title><description>The imitation of cursive handwriting is mainly limited to generating
handwritten words or lines. Multiple synthetic outputs must be stitched
together to create paragraphs or whole pages, whereby consistency and layout
information are lost. To close this gap, we propose a method for imitating
handwriting at the paragraph level that also works for unseen writing styles.
Therefore, we introduce a modified latent diffusion model that enriches the
encoder-decoder mechanism with specialized loss functions that explicitly
preserve the style and content. We enhance the attention mechanism of the
diffusion model with adaptive 2D positional encoding and the conditioning
mechanism to work with two modalities simultaneously: a style image and the
target text. This significantly improves the realism of the generated
handwriting. Our approach sets a new benchmark in our comprehensive evaluation.
It outperforms all existing imitation methods at both line and paragraph
levels, considering combined style and content preservation.</description><subject>Computer Science - Computer Vision and Pattern Recognition</subject><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>GOX</sourceid><recordid>eNpjYJA0NNAzsTA1NdBPLKrILNMzMjGw1DMwMLcw42TwiUotytcNzsgvUQhILEpML0osyNDNSS1LzVHwSMxLKS_KLMnMS1fwzM0sSSzJzM9TKM8syVDwSSxJzStRcMlMSystBon65qek5hTzMLCmJeYUp_JCaW4GeTfXEGcPXbC98QVFmbmJRZXxIPvjwfYbE1YBAA-OO24</recordid><startdate>20240901</startdate><enddate>20240901</enddate><creator>Mayr, Martin</creator><creator>Dreier, Marcel</creator><creator>Kordon, Florian</creator><creator>Seuret, Mathias</creator><creator>Zöllner, Jochen</creator><creator>Wu, Fei</creator><creator>Maier, Andreas</creator><creator>Christlein, Vincent</creator><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20240901</creationdate><title>Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models</title><author>Mayr, Martin ; Dreier, Marcel ; Kordon, Florian ; Seuret, Mathias ; Zöllner, Jochen ; Wu, Fei ; Maier, Andreas ; Christlein, Vincent</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-arxiv_primary_2409_007863</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Computer Science - Computer Vision and Pattern Recognition</topic><toplevel>online_resources</toplevel><creatorcontrib>Mayr, Martin</creatorcontrib><creatorcontrib>Dreier, Marcel</creatorcontrib><creatorcontrib>Kordon, Florian</creatorcontrib><creatorcontrib>Seuret, Mathias</creatorcontrib><creatorcontrib>Zöllner, Jochen</creatorcontrib><creatorcontrib>Wu, Fei</creatorcontrib><creatorcontrib>Maier, Andreas</creatorcontrib><creatorcontrib>Christlein, Vincent</creatorcontrib><collection>arXiv Computer Science</collection><collection>arXiv.org</collection></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Mayr, Martin</au><au>Dreier, Marcel</au><au>Kordon, Florian</au><au>Seuret, Mathias</au><au>Zöllner, Jochen</au><au>Wu, Fei</au><au>Maier, Andreas</au><au>Christlein, Vincent</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models</atitle><date>2024-09-01</date><risdate>2024</risdate><abstract>The imitation of cursive handwriting is mainly limited to generating
handwritten words or lines. Multiple synthetic outputs must be stitched
together to create paragraphs or whole pages, whereby consistency and layout
information are lost. To close this gap, we propose a method for imitating
handwriting at the paragraph level that also works for unseen writing styles.
Therefore, we introduce a modified latent diffusion model that enriches the
encoder-decoder mechanism with specialized loss functions that explicitly
preserve the style and content. We enhance the attention mechanism of the
diffusion model with adaptive 2D positional encoding and the conditioning
mechanism to work with two modalities simultaneously: a style image and the
target text. This significantly improves the realism of the generated
handwriting. Our approach sets a new benchmark in our comprehensive evaluation.
It outperforms all existing imitation methods at both line and paragraph
levels, considering combined style and content preservation.</abstract><doi>10.48550/arxiv.2409.00786</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | DOI: 10.48550/arxiv.2409.00786 |
ispartof | |
issn | |
language | eng |
recordid | cdi_arxiv_primary_2409_00786 |
source | arXiv.org |
subjects | Computer Science - Computer Vision and Pattern Recognition |
title | Zero-Shot Paragraph-level Handwriting Imitation with Latent Diffusion Models |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-05T21%3A44%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-arxiv_GOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Zero-Shot%20Paragraph-level%20Handwriting%20Imitation%20with%20Latent%20Diffusion%20Models&rft.au=Mayr,%20Martin&rft.date=2024-09-01&rft_id=info:doi/10.48550/arxiv.2409.00786&rft_dat=%3Carxiv_GOX%3E2409_00786%3C/arxiv_GOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |