A 2.5-GFLOPS, 6.5 million polygons per second, four-way VLIW geometry processor with SIMD instructions and a software bypass mechanism
A four-way very long instruction word (VLIW), 312-MHz geometry processor with peripheral component interconnect/accelerated graphic port bus bridge was implemented in a 0.21-μm, 2.5-V, three-layer-metal CMOS process. We adopted (1) a software bypass mechanism, (2) single-instruction multiple-...
Published in: | IEEE journal of solid-state circuits 1999-11, Vol.34 (11), p.1619-1626 |
---|---|
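Main authors: | Kubosawa, H.; Higaki, N.; Ando, S.; Takahashi, H.; Asada, Y.; Anbutsu, H.; Sato, T.; Sakate, M.; Suga, A.; Kimura, M.; Miyake, H.; Okano, H.; Asato, A.; Kimura, Y.; Nakayama, H.; Kimoto, M.; Hirochi, K.; Saito, H.; Kaido, N.; Nakagawa, Y.; Shimada, T. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |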
container_end_page | 1626 |
---|---|
container_issue | 11 |
container_start_page | 1619 |
container_title | IEEE journal of solid-state circuits |
container_volume | 34 |
creator | Kubosawa, H.; Higaki, N.; Ando, S.; Takahashi, H.; Asada, Y.; Anbutsu, H.; Sato, T.; Sakate, M.; Suga, A.; Kimura, M.; Miyake, H.; Okano, H.; Asato, A.; Kimura, Y.; Nakayama, H.; Kimoto, M.; Hirochi, K.; Saito, H.; Kaido, N.; Nakagawa, Y.; Shimada, T. |
description | A four-way very long instruction word (VLIW), 312-MHz geometry processor with peripheral component interconnect/accelerated graphic port bus bridge was implemented in a 0.21-μm, 2.5-V, three-layer-metal CMOS process. We adopted (1) a software bypass mechanism, (2) single-instruction multiple-data stream instructions, (3) four sets of floating-point multiply add and accumulate execution units, (4) special condition code registers and a branch condition generator for a clipping operation, and (5) automatic clock delay tuning methodology. As a result of these features, we achieved a performance of 2.5 GFLOPS and 6.5 million polygons per second for a three-dimensional geometry processor, which is the highest published performance as a single geometry processor. The processor is applicable to computer-aided-design systems that require very high graphics performance. |
doi_str_mv | 10.1109/4.799871 |
format | Article |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 0018-9200 |
ispartof | IEEE journal of solid-state circuits, 1999-11, Vol.34 (11), p.1619-1626 |
issn | 0018-9200 1558-173X |
language | eng |
recordid | cdi_proquest_miscellaneous_28171904 |
source | IEEE Electronic Library (IEL) |
subjects | Acceleration; Bridges; Buses (vehicles); Bypasses; Clocks; CMOS process; Computer programs; Delay; Floating point arithmetic; Geometry; Graphics; High performance computing; Microprocessors; Polygons; Registers; Software; Tuning; VLIW |
title | A 2.5-GFLOPS, 6.5 million polygons per second, four-way VLIW geometry processor with SIMD instructions and a software bypass mechanism |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T21%3A35%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%202.5-GFLOPS,%206.5%20million%20polygons%20per%20second,%20four-way%20VLIW%20geometry%20processor%20with%20SIMD%20instructions%20and%20a%20software%20bypass%20mechanism&rft.jtitle=IEEE%20journal%20of%20solid-state%20circuits&rft.au=Kubosawa,%20H.&rft.date=1999-11-01&rft.volume=34&rft.issue=11&rft.spage=1619&rft.epage=1626&rft.pages=1619-1626&rft.issn=0018-9200&rft.eissn=1558-173X&rft.coden=IJSCBC&rft_id=info:doi/10.1109/4.799871&rft_dat=%3Cproquest_RIE%3E28171904%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=28171904&rft_id=info:pmid/&rft_ieee_id=799871&rfr_iscdi=true |
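
The headline figures in the description (312 MHz, four floating-point multiply-add-and-accumulate units, 2.5 GFLOPS, 6.5 million polygons per second) can be cross-checked with simple arithmetic. The sketch below is only a consistency check, not the authors' own derivation: it assumes that each unit is counted as two floating-point operations per cycle (one multiply plus one add), which the record does not state. The function names are illustrative.

```python
# Back-of-the-envelope check of the figures quoted in the description.
CLOCK_HZ = 312e6          # clock frequency stated in the description
FMAC_UNITS = 4            # four sets of multiply-add-and-accumulate units
FLOPS_PER_UNIT = 2        # ASSUMPTION: multiply + add counted per unit per cycle
POLYGONS_PER_SEC = 6.5e6  # polygon rate stated in the description


def peak_gflops(clock_hz: float, units: int, flops_per_unit: int) -> float:
    """Peak floating-point rate in GFLOPS under the stated assumption."""
    return clock_hz * units * flops_per_unit / 1e9


def cycles_per_polygon(clock_hz: float, polygons_per_sec: float) -> float:
    """Average clock cycles available per polygon at the quoted rate."""
    return clock_hz / polygons_per_sec


if __name__ == "__main__":
    print(f"peak rate: {peak_gflops(CLOCK_HZ, FMAC_UNITS, FLOPS_PER_UNIT):.3f} GFLOPS")
    print(f"cycle budget: {cycles_per_polygon(CLOCK_HZ, POLYGONS_PER_SEC):.1f} cycles/polygon")
```

Under that assumption the peak rate comes to 2.496 GFLOPS, which rounds to the quoted 2.5 GFLOPS, and the quoted polygon rate corresponds to a budget of roughly 48 clock cycles per polygon.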