MeronymNet: A Hierarchical Approach for Unified and Controllable Multi-Category Object Generation
We introduce MeronymNet, a novel hierarchical approach for controllable, part-based generation of multi-category objects using a single unified model. We adopt a guided coarse-to-fine strategy involving semantically conditioned generation of bounding box layouts, pixel-level part layouts and ultimat...
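Published in: | arXiv.org 2021-10 |
---|---|
Main authors: | Baghel, Rishabh; Trivedi, Abhishek; Ravichandran, Tejas; Sarvadevabhatla, Ravi Kiran |
Format: | Article |
Language: | eng |
Subjects: | Ablation; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Graphics; Computer Science - Multimedia; Interactive control; Layouts; Object generation; Stability |
Online access: | Full text |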
container_end_page | |
---|---|
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Baghel, Rishabh; Trivedi, Abhishek; Ravichandran, Tejas; Sarvadevabhatla, Ravi Kiran |
description | We introduce MeronymNet, a novel hierarchical approach for controllable, part-based generation of multi-category objects using a single unified model. We adopt a guided coarse-to-fine strategy involving semantically conditioned generation of bounding box layouts, pixel-level part layouts and ultimately, the object depictions themselves. We use Graph Convolutional Networks, Deep Recurrent Networks along with custom-designed Conditional Variational Autoencoders to enable flexible, diverse and category-aware generation of 2-D objects in a controlled manner. The performance scores for generated objects reflect MeronymNet's superior performance compared to multiple strong baselines and ablative variants. We also showcase MeronymNet's suitability for controllable object generation and interactive object editing at various levels of structural and semantic granularity. |
doi_str_mv | 10.48550/arxiv.2110.08818 |
format | Article |
fullrecord | <record><control><sourceid>proquest_arxiv</sourceid><recordid>TN_cdi_arxiv_primary_2110_08818</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2583227389</sourcerecordid><originalsourceid>FETCH-LOGICAL-a529-ec1d1cf186df58b9f910141bf9c2a0e33338d260931b397507b3ba76cad399de3</originalsourceid><addsrcrecordid>eNotUEtLAzEYDIJgqf0Bngx43ppHs5t4K4u2Qmsv9bx8edmUbVKzW7H_3rU6l4FhGGYGoTtKpjMpBHmE_B2-powOApGSyis0YpzTQs4Yu0GTrtsTQlhZMSH4CMHa5RTPhzfXP-E5XgaXIZtdMNDi-fGYE5gd9inj9xh8cBZDtLhOsc-pbUG3Dq9PbR-KGnr3kfIZb_TemR4vXByS-pDiLbr20HZu8s9jtH153tbLYrVZvNbzVQGCqcIZaqnxVJbWC6mVV5TQGdVeGQbE8QHSspIoTjVXlSCV5hqq0oDlSlnHx-j-L_ayvznmcIB8bn5_aC4_DI6HP8ew6vPkur7Zp1OOQ6eGCckZq7hU_AdD9WB8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2583227389</pqid></control><display><type>article</type><title>MeronymNet: A Hierarchical Approach for Unified and Controllable Multi-Category Object Generation</title><source>arXiv.org</source><source>Free E- Journals</source><creator>Baghel, Rishabh ; Trivedi, Abhishek ; Ravichandran, Tejas ; Sarvadevabhatla, Ravi Kiran</creator><creatorcontrib>Baghel, Rishabh ; Trivedi, Abhishek ; Ravichandran, Tejas ; Sarvadevabhatla, Ravi Kiran</creatorcontrib><description>We introduce MeronymNet, a novel hierarchical approach for controllable, part-based generation of multi-category objects using a single unified model. We adopt a guided coarse-to-fine strategy involving semantically conditioned generation of bounding box layouts, pixel-level part layouts and ultimately, the object depictions themselves. We use Graph Convolutional Networks, Deep Recurrent Networks along with custom-designed Conditional Variational Autoencoders to enable flexible, diverse and category-aware generation of 2-D objects in a controlled manner. 
The performance scores for generated objects reflect MeronymNet's superior performance compared to multiple strong baselines and ablative variants. We also showcase MeronymNet's suitability for controllable object generation and interactive object editing at various levels of structural and semantic granularity.</description><identifier>EISSN: 2331-8422</identifier><identifier>DOI: 10.48550/arxiv.2110.08818</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Ablation ; Computer Science - Computer Vision and Pattern Recognition ; Computer Science - Graphics ; Computer Science - Multimedia ; Interactive control ; Layouts ; Object generation ; Stability</subject><ispartof>arXiv.org, 2021-10</ispartof><rights>2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>http://arxiv.org/licenses/nonexclusive-distrib/1.0</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>228,230,777,781,882,27906</link.rule.ids><backlink>$$Uhttps://doi.org/10.1145/3474085.3475521$$DView published paper (Access to full text may be restricted)$$Hfree_for_read</backlink><backlink>$$Uhttps://doi.org/10.48550/arXiv.2110.08818$$DView paper in arXiv$$Hfree_for_read</backlink></links><search><creatorcontrib>Baghel, Rishabh</creatorcontrib><creatorcontrib>Trivedi, Abhishek</creatorcontrib><creatorcontrib>Ravichandran, Tejas</creatorcontrib><creatorcontrib>Sarvadevabhatla, Ravi Kiran</creatorcontrib><title>MeronymNet: A Hierarchical Approach for Unified and Controllable Multi-Category Object Generation</title><title>arXiv.org</title><description>We 
introduce MeronymNet, a novel hierarchical approach for controllable, part-based generation of multi-category objects using a single unified model. We adopt a guided coarse-to-fine strategy involving semantically conditioned generation of bounding box layouts, pixel-level part layouts and ultimately, the object depictions themselves. We use Graph Convolutional Networks, Deep Recurrent Networks along with custom-designed Conditional Variational Autoencoders to enable flexible, diverse and category-aware generation of 2-D objects in a controlled manner. The performance scores for generated objects reflect MeronymNet's superior performance compared to multiple strong baselines and ablative variants. We also showcase MeronymNet's suitability for controllable object generation and interactive object editing at various levels of structural and semantic granularity.</description><subject>Ablation</subject><subject>Computer Science - Computer Vision and Pattern Recognition</subject><subject>Computer Science - Graphics</subject><subject>Computer Science - Multimedia</subject><subject>Interactive control</subject><subject>Layouts</subject><subject>Object generation</subject><subject>Stability</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GOX</sourceid><recordid>eNotUEtLAzEYDIJgqf0Bngx43ppHs5t4K4u2Qmsv9bx8edmUbVKzW7H_3rU6l4FhGGYGoTtKpjMpBHmE_B2-powOApGSyis0YpzTQs4Yu0GTrtsTQlhZMSH4CMHa5RTPhzfXP-E5XgaXIZtdMNDi-fGYE5gd9inj9xh8cBZDtLhOsc-pbUG3Dq9PbR-KGnr3kfIZb_TemR4vXByS-pDiLbr20HZu8s9jtH153tbLYrVZvNbzVQGCqcIZaqnxVJbWC6mVV5TQGdVeGQbE8QHSspIoTjVXlSCV5hqq0oDlSlnHx-j-L_ayvznmcIB8bn5_aC4_DI6HP8ew6vPkur7Zp1OOQ6eGCckZq7hU_AdD9WB8</recordid><startdate>20211017</startdate><enddate>20211017</enddate><creator>Baghel, 
Rishabh</creator><creator>Trivedi, Abhishek</creator><creator>Ravichandran, Tejas</creator><creator>Sarvadevabhatla, Ravi Kiran</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>AKY</scope><scope>GOX</scope></search><sort><creationdate>20211017</creationdate><title>MeronymNet: A Hierarchical Approach for Unified and Controllable Multi-Category Object Generation</title><author>Baghel, Rishabh ; Trivedi, Abhishek ; Ravichandran, Tejas ; Sarvadevabhatla, Ravi Kiran</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a529-ec1d1cf186df58b9f910141bf9c2a0e33338d260931b397507b3ba76cad399de3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Ablation</topic><topic>Computer Science - Computer Vision and Pattern Recognition</topic><topic>Computer Science - Graphics</topic><topic>Computer Science - Multimedia</topic><topic>Interactive control</topic><topic>Layouts</topic><topic>Object generation</topic><topic>Stability</topic><toplevel>online_resources</toplevel><creatorcontrib>Baghel, Rishabh</creatorcontrib><creatorcontrib>Trivedi, Abhishek</creatorcontrib><creatorcontrib>Ravichandran, Tejas</creatorcontrib><creatorcontrib>Sarvadevabhatla, Ravi Kiran</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central 
Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>arXiv Computer Science</collection><collection>arXiv.org</collection><jtitle>arXiv.org</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Baghel, Rishabh</au><au>Trivedi, Abhishek</au><au>Ravichandran, Tejas</au><au>Sarvadevabhatla, Ravi Kiran</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>MeronymNet: A Hierarchical Approach for Unified and Controllable Multi-Category Object Generation</atitle><jtitle>arXiv.org</jtitle><date>2021-10-17</date><risdate>2021</risdate><eissn>2331-8422</eissn><abstract>We introduce MeronymNet, a novel hierarchical approach for controllable, part-based generation of multi-category objects using a single unified model. We adopt a guided coarse-to-fine strategy involving semantically conditioned generation of bounding box layouts, pixel-level part layouts and ultimately, the object depictions themselves. We use Graph Convolutional Networks, Deep Recurrent Networks along with custom-designed Conditional Variational Autoencoders to enable flexible, diverse and category-aware generation of 2-D objects in a controlled manner. 
The performance scores for generated objects reflect MeronymNet's superior performance compared to multiple strong baselines and ablative variants. We also showcase MeronymNet's suitability for controllable object generation and interactive object editing at various levels of structural and semantic granularity.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><doi>10.48550/arxiv.2110.08818</doi><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2021-10 |
issn | 2331-8422 |
language | eng |
recordid | cdi_arxiv_primary_2110_08818 |
source | arXiv.org; Free E- Journals |
subjects | Ablation; Computer Science - Computer Vision and Pattern Recognition; Computer Science - Graphics; Computer Science - Multimedia; Interactive control; Layouts; Object generation; Stability |
title | MeronymNet: A Hierarchical Approach for Unified and Controllable Multi-Category Object Generation |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-20T22%3A15%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_arxiv&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MeronymNet:%20A%20Hierarchical%20Approach%20for%20Unified%20and%20Controllable%20Multi-Category%20Object%20Generation&rft.jtitle=arXiv.org&rft.au=Baghel,%20Rishabh&rft.date=2021-10-17&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2110.08818&rft_dat=%3Cproquest_arxiv%3E2583227389%3C/proquest_arxiv%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2583227389&rft_id=info:pmid/&rfr_iscdi=true |