Talk:General-purpose computing on graphics processing units
This is the talk page for discussing improvements to the General-purpose computing on graphics processing units article. This is not a forum for general discussion of the article's subject.
This level-5 vital article is rated C-class on Wikipedia's content assessment scale.
Updated
Updated the page to include a basic overview of GPGPU - history and techniques, as well as some more applications. More detail could be added to all the techniques, and the data structures need to be expanded. More applications could also be listed, with extra detail about important/common ones. Echeslack 19:01, 13 December 2005 (UTC)
I noticed there's plenty of detail on stream processing. I believe some of it should actually be moved to the dedicated article. Streams, kernels, flow control, map, scatter/gather and such seem to be good candidates. I'll make a note of this. MaxDZ8 talk 09:07, 9 May 2006 (UTC)
Data types
8 bits per pixel - Palette mode, where each value is an index into a table with the real color value specified in one of the other formats. Possibly 2 bits for red, 3 bits for green, and 3 bits for blue.
This bit allocation does not make sense for palette mode. Should we remove that phrase? --Diego 14:38, 13 June 2006 (UTC)
I rephrased it. Joe 09:22 June 29 2011
- I'd also contend that the bit allocation is inaccurate. Given the sensitivity of the human eye to different colours, it's far more likely that any fixed bit-per-channel allocation would be 3-3-2, with blue only having 4 levels (2 bits) and the other two components having 8. Though in practice, isn't a 216-colour evenly-spread "web safe"-type scheme (6 levels per channel with no direct correlation between channels and bits) more common in the fairly rare cases of a 3D accelerator outputting at less than 15 bpp but not using a completely custom palette (e.g. as found in Quake et al.)? 193.63.174.211 (talk) 13:37, 11 January 2012 (UTC)
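For concreteness, a minimal C sketch of how a 3-3-2 allocation packs into one byte (purely illustrative, not taken from any particular driver or from the article):

```c
#include <stdint.h>

/* Pack an 8-bit-per-channel RGB triple into a single 3-3-2 byte:
 * 3 bits of red, 3 bits of green, 2 bits of blue, matching the
 * allocation argued for above. Channel values are truncated to
 * their top bits. */
static uint8_t pack_rgb332(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((r & 0xE0) | ((g & 0xE0) >> 3) | (b >> 6));
}
```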
Languages
What about creating a section covering the OpenGL Shading Language and related languages?
- There's a page on shading languages which would need some work. I think it would be better to put that material there.
- MaxDZ8 talk 21:53, 13 June 2006 (UTC)
- Well, GPGPU has been using GLSL much more than HLSL, so it's a bit strange that it isn't mentioned at all. —Preceding unsigned comment added by 128.39.210.208 (talk) 12:32, 23 April 2010 (UTC)
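For readers wondering what GLSL-based GPGPU actually looks like in code, here is a minimal sketch assuming a current OpenGL 2.0+ context; the shader source and function below are made up for illustration, GL 2.0 entry points may need an extension loader such as GLEW on some platforms, and error checking is omitted:

```c
#include <GL/gl.h>

/* Legacy-style GLSL fragment shader acting as a GPGPU "kernel":
 * one output value is computed per texel of the render target. */
static const char *kernel_src =
    "uniform sampler2D data;\n"
    "void main() {\n"
    "    vec4 v = texture2D(data, gl_TexCoord[0].xy);\n"
    "    gl_FragColor = v * v;   /* square every element */\n"
    "}\n";

static GLuint build_kernel(void)
{
    GLuint shader = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(shader, 1, &kernel_src, NULL);
    glCompileShader(shader);

    GLuint program = glCreateProgram();
    glAttachShader(program, shader);
    glLinkProgram(program);
    /* Bind with glUseProgram(program) and draw a full-screen quad
     * into a texture-backed FBO to "run" the kernel over the data. */
    return program;
}
```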
How does this all relate to OpenCL etc.? The article seems overly full of references to Microsoft's DirectX when there are other competing technologies out there. — Preceding unsigned comment added by 78.150.136.155 (talk) 11:58, 15 July 2011 (UTC)
Programmability
edit"The programmability of the pipelines have trended according the Microsoft’s DirectX specification, with DirectX8 introducing Shader Model 1.1, DirectX8.1 Pixel Shader Models 1.2, 1.3 and 1.4, and DirectX9 defining Shader Model 2.x and 3.0. " This sentence make no sense what so ever, 'trended' is a strange verb to use here (or indeed anywhere) and the rest is non grammatical anyhow. Did the author want to say something like "have followed the DirectX specification?".
How does this all relate to OpenCL etc.? The article seems overly full of references to Microsoft's DirectX when there are other competing technologies out there.
Long story short: CL is the newcomer to all of this, and there's no chance (or very little chance) to rephrase the above without referring to D3D or some part of the DirectX ecosystem. Direct3D is the only API I know of with clear and (often) explicit definitions of the concepts referred to in the above statements. Writing the same in terms of OpenGL would be possible but extremely verbose in the best case and just plain impossible in some other cases (such as single-float blending, of which GL had no notion at all last time I checked). When it comes to OpenCL the situation is slightly better (I have been told so), but in the end its approach is very similar to GL's and has no chance to scale back to anything that is not at least Shader Model 4 (maybe 3) in functionality. — Preceding unsigned comment added by MaxDZ8 (talk • contribs) 14:37, 25 July 2011 (UTC)
- What I take from that phrase is that graphics chip/card manufacturers have concentrated on adding features to their GPUs - both those useful for GPGPU and those for "normal" graphics - that move them further up the DirectX version ladder, sometimes at the expense of the features which don't. It improves their saleability, compatibility and longevity. So to get an idea of what features next year's GPUs may have, take a look at Microsoft's DirectX roadmap... 193.63.174.211 (talk) 13:40, 11 January 2012 (UTC)
APIs
I'm brand new to GPGPU, but I think frameworks such as Lib Sh, BrookGPU and the Microsoft Research Accelerator Project[1] should be mentioned somehow... —Disavian (talk/contribs) 17:07, 27 October 2006 (UTC)
Move
I disagree with the move proposal; GPGPU is more than just a subset of stream processing. Also, the stream processing article is too long, and should probably have pages broken out of it, instead of having more content merged into it. —Disavian (talk/contribs) 01:57, 6 November 2006 (UTC)
Conditional support. Not the whole article should be merged. The section on kernels, for example, should; the section on data types should not.
MaxDZ8 talk 08:15, 6 November 2006 (UTC)
I concur. However, some of this article should be moved out. A lot of this article is talking about various normal functions of the GPU which have little or nothing to do with GPGPU. This requires a more general treatment, and ought to include more information and citations on actual examples of its use. This article could also use a good deal of organization =\. --24.98.124.237 12:34, 9 December 2006 (UTC)Haplo
And who is doing it? Takomat 2nd of March 2007
- This article is poorly organized but it certainly is separate from stream processing. It could benefit from summarizing concepts in stream processing, GPU, etc and referencing appropriately. Potatoswatter 19:35, 7 April 2007 (UTC)
"a recent trend in computer science"
The phrase in the intro "a recent trend in computer science": it's not really anything new in computer science at all, is it? It's much more computer engineering, though not just software engineering. (Though "computer science" itself is closer to engineering than science a lot of the time.) The Computer engineering article is about hardware engineering and I suspect wouldn't be the right term to put in the intro. Computer programming isn't the perfect choice either. Ideas? - David Gerard 09:47, 7 September 2007 (UTC)
I agree this would need way more attention... I support your idea, since CS usually doesn't even speak about the hardware. I'm not even sure it's right to call it engineering: that would also be controversial; it's more an application design choice than anything else...
For the time being, I am modifying the page by removing the reference.
MaxDZ8 talk 06:35, 8 September 2007 (UTC)
Suitable decoration?
Is there a suitable image for this article? A photo, some block diagrams? I just thought it could do with some decoration :-) - David Gerard 09:49, 7 September 2007 (UTC)
Based on my awful experience with the shaders page, there's not. Furthermore, it looks like editors run away when they have to deal with diagrams. Someone should really do that; Inkscape is your friend.
MaxDZ8 talk 06:35, 8 September 2007 (UTC)
Missing link
This article is definitely remiss for not mentioning or linking to NVIDIA Tesla. Raul654 05:44, 6 November 2007 (UTC)
Software portability
A few questions that I was left wondering after reading the article:
- Is code written for CPU processors (e.g. i386) easily ported to GPUs, or is major redesign necessary? (CPU->GPU)
- Is code written for one GPU portable to another? (GPU1->GPU2)
- Are there any GPU programming standards?
Pgr94 11:21, 14 November 2007 (UTC)
Note: I've changed the comment to a numbered list for better tracking.
- Generally redesign is required. Some simple pieces can be ported pretty easily, possibly at sub-optimal performance. In general, re-targeting requires changing not just the implementation but the algorithm itself and the surrounding application. This is a typical weak point of vector-based engines.
- Yes, if you mean for example GLSL, compiled HLSL bytecode or another intermediate language. Unlike CPUs, where you can copy the real executable byte-for-byte with a high chance of getting it to work on another model, GPU code must generally be recompiled. Compatibility is not guaranteed even between different models of the same family, let alone across vendors.
- Yes, you may have heard of shading languages. They're the backbone of all the other languages but a few.
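To make answer 2 concrete, here is a minimal OpenCL sketch (illustrative names, no error handling): the same kernel source string is handed to the driver and compiled at run time for whatever device is actually present, which is why portability works at the source or intermediate-language level rather than at the binary level.

```c
#include <CL/cl.h>

/* A trivial OpenCL C kernel kept as plain text in the host program. */
static const char *src =
    "__kernel void scale(__global float *a, float k) {"
    "    size_t i = get_global_id(0);"
    "    a[i] = a[i] * k;"
    "}";

/* The driver compiles the source for this specific device; a different
 * GPU model or vendor gets its own machine code from the same string. */
cl_program build_for_device(cl_context ctx, cl_device_id dev)
{
    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    return prog;
}
```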
Only interesting for floating-point / arithmetic applications?
The article says it is important for applications to have high arithmetic intensity - does this mean that non-arithmetic applications are not well suited to GPGPUs? Pgr94 11:25, 14 November 2007 (UTC)
No, it means that the more compute-heavy the algorithm is, the more it will benefit. Current (and previous) GPUs, however, also have far higher bandwidth than CPUs on typical mainboards, so the gain is substantial even if the algorithm is bandwidth-bound.
In both cases, however, it is important for each algorithm "step" to be long enough. Some algorithms do not do enough work per step to benefit, because going from step n to step n+1 is typically an expensive operation.
MaxDZ8 talk 17:02, 19 November 2007 (UTC)
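A worked example of what "arithmetic intensity" means here, with illustrative figures (single precision assumed):

```c
#include <stddef.h>

/* SAXPY does 2 flops per element (one multiply, one add) but moves
 * 12 bytes per element (read x[i], read y[i], write y[i], 4 bytes each):
 *   arithmetic intensity = 2 / 12 ≈ 0.17 flop/byte -> bandwidth-bound.
 * A blocked dense matrix multiply reuses each loaded value many times,
 * so its intensity grows with block size and can keep the ALUs busy. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```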
GL Shader
All this text, and not a single mention of OpenGL Shading Language? Insensitive Clods! -Anonymous Coward —Preceding unsigned comment added by 65.24.75.114 (talk) 04:45, 5 February 2008 (UTC)
Yeah. What is this? An article on DirectX? DirectX gets mentioned over and over, but OpenGL doesn't. And it also turns into a version number fest listing each shader model added in different DirectX revisions... Supertin (talk) 08:47, 3 August 2008 (UTC)
Real-time
Removed a vague mention in the introduction of a (technically questionable) connection with real-time computing. Neither article expands on this connection. If it's legitimate, it should be explained in the body of the article. Bryan Silverthorn (talk) 14:41, 30 March 2008 (UTC)
Proposed deletion of Lib Sh
A proposed deletion template has been added to the article Lib Sh, suggesting that it be deleted according to the proposed deletion process. All contributions are appreciated, but this article may not satisfy Wikipedia's criteria for inclusion, and the deletion notice should explain why (see also "What Wikipedia is not" and Wikipedia's deletion policy). You may prevent the proposed deletion by removing the {{dated prod}}
notice, but please explain why you disagree with the proposed deletion in your edit summary or on its talk page.
Please consider improving the article to address the issues raised because even though removing the deletion notice will prevent deletion through the proposed deletion process, the article may still be deleted if it matches any of the speedy deletion criteria or it can be sent to Articles for Deletion, where it may be deleted if consensus to delete is reached. Do you want to opt out of receiving this notice? —Disavian (talk/contribs) 21:15, 10 June 2008 (UTC)
Graphics Specific Applications
It seems to me that Digital image processing, Video Processing, Raytracing, Global illumination and Geometric computing all fall under the normal use of a graphics processing unit and thus do not qualify to be listed on this article's page as general-purpose computing applications. I suggest that they be removed from the list. Bkessler (talk) 02:32, 28 July 2008 (UTC)
In normal operation, GPUs are used for displaying arbitrary 2D or 3D images on the user's screen.
In 3D rendering those operations are performed, but they are not the end goal: the result of the computation is lost, save for an image on screen. In general-purpose processing, the result of (for example) a ray trace may be used in other ways. Nidomedia (talk) 12:25, 26 March 2009 (UTC)
Relative performance
Given that people who go to the effort of implementing algorithms on a GPU are doing so because of potential speed improvements, some comments on the relative performance of specific algorithms would be useful -- for example the following evaluation showing up to 30x speedup for large problems (the paper concludes that the large problem is still limited by overheads).
Intel Core 2 Duo E6750 vs. GeForce GTX 280 (G200 chip).[1]
Number of variables | Mflops on Intel Core 2 Duo E6750 | Mflops on Nvidia GTX 280 | Speedup
---|---|---|---
16,641 | 1405 | 2788 | 2.3x
66,049 | 1114 | 8086 | 7.8x
263,169 | 886 | 15179 | 17.4x
1,050,625 | 805 | 21406 | 26.9x
They also estimate a performance-per-watt improvement of 11.5x and a performance-per-euro increase of about 16x.
For other applications, speedups of >60x have been reported.[2]
Andy t roo (talk) 01:27, 16 September 2008 (UTC) (Andrew Hill)
Performance speedups depend mostly on the available parallelism and on the implementation. Schiwietz et al. report in "MR image reconstruction using the GPU" a 128x improvement from using a GPU instead of a CPU for CT reconstruction. However, Kachelreiss et al. have a version of the same reconstruction algorithm which runs twice as fast on an Intel CPU as Schiwietz et al.'s GPU version, even though he was optimising for the CBE ("Hyperfast parallel-beam and cone-beam backprojection using the cell general purpose hardware"). Nidomedia (talk) 12:43, 26 March 2009 (UTC)
Integer processing
Is GPGPU applicable to integer processing tasks as well as floating point? The article doesn't mention it, but it seems like a rather important point. Or perhaps it does explain it, but not to someone who doesn't already understand the topic thoroughly. - Taxman Talk 13:29, 27 August 2009 (UTC)
- Yes, GPUs have had the capability to work with integers, with all the normal operators, since DirectX 10 (Shader Model 4.0) arrived. See The Direct3D 10 System by David Blythe for example.
- Also, GPGPU has been used for accelerating encryption/decryption as well, which is exclusively integer math (lots of modulo and XOR operations, for example). - 85.30.174.234 (talk) 14:44, 21 July 2010 (UTC) Christian
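For illustration, purely integer work of exactly that kind looks like this in OpenCL C (the kernel name and parameters are made up for the example); it is built and launched like any floating-point kernel:

```c
/* Hypothetical kernel mixing each word with a key: XOR plus modulo,
 * all integer operations that Shader Model 4.0-class hardware and
 * later expose natively. */
static const char *int_kernel_src =
    "__kernel void mix_words(__global uint *data, uint key, uint modulus) {"
    "    size_t i = get_global_id(0);"
    "    data[i] = (data[i] ^ key) % modulus;"
    "}";
```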
GPU computing
Who originated this term? Whose baby was it? The lede does not say. As acronyms go, this is a particularly crummy one, invented by a dyslexic in training of the GPG/PGP or MediaWiki/WikiMedia school of obfuscation (at least GPG had humour).
Why can't this article be called instead GPU computing? This is a clearer term and less likely to confuse the uninitiated. Changing the name of the article might help alleviate the technology bundle gridlock which this article suffers from. I think of GPGPU as a hardware design movement (or banner thereof), and GPU computing as the reality that has now come to pass. — MaxEnt 04:56, 1 May 2010 (UTC)
- Looked around some more and found this: The term GPGPU was coined and GPGPU.org was founded by Mark Harris in 2002 when he recognized an early trend of using GPUs for non-graphics applications.
- My opinion remains that we should hold a competition for the best backronym of PGPGU, have a laugh, call it a day, and rename the article to something more comprehensible to the average lay-person. — MaxEnt 05:12, 1 May 2010 (UTC)
- The book Programming Massively Parallel Processors by Kirk and Hwu suggests that GPU computing is the general term and GPGPU refers to a special type. I preferred the non-jargonistic "GPU computing" to the acronym GPGPU. I kind of wish I had moved the article to "GPU computing" rather than creating a redirect from "GPU computing". Perhaps the article should still be renamed. Jason Quinn (talk) 00:25, 14 July 2011 (UTC)
GPU programming libraries/layers section is missing the Kappa Library
The Kappa Library is missing from "GPU programming libraries/layers". It is an object-oriented Producer/Consumer scheduler library on top of CUDA and OpenMP from Psi Lambda LLC (my company--full disclosure).
—Preceding unsigned comment added by Psilambda (talk • contribs) 20:20, 8 July 2010 (UTC)
I don't think it should be mentioned if it's on top of CUDA. Those are layers for interfacing with hardware. Serg3d2 (talk) 08:40, 31 January 2012 (UTC)
binary format
If you disassemble a typical CPU program you get a list of CPU opcodes starting from an entry point such as a main() function. What exactly happens when the program wants to transfer control to the GPU (whether the GPU code is stored in the program itself or in a library like a DLL)? I think this is important to add to the article, but I can't find it elsewhere on WP. —Preceding unsigned comment added by 58.152.239.68 (talk) 07:34, 1 August 2010 (UTC)
It does not work even remotely like you expect. The program does not 'jump' into a kernel. The explanations you are searching for are to be inferred from the various articles covering the API of choice. As an extreme approximation, you could think of the GPU as another processor churning through the kernel you give it. You can change the kernel using the appropriate API infrastructure, and it will run when certain conditions are met. Because opcodes are just data, they can be stored either in the program itself as data blobs or in an external resource, which has nothing to do with DLLs. Opcodes sent to the GPU do not connect with CPU code in any way (at best, the results can be connected using the appropriate calls).
Also see "Software Portability".
MaxDZ8 talk 13:12, 2 August 2010 (UTC)
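A minimal sketch of the dispatch model described above, with OpenCL as the example API (names illustrative, error handling omitted): the host never jumps into GPU code; it binds arguments and enqueues the kernel through API calls, and the driver runs it on the device.

```c
#include <CL/cl.h>

/* 'kernel' was created earlier from source or a binary blob via the API;
 * to the CPU it is just an object handle, not a branch target. */
void run_kernel(cl_command_queue queue, cl_kernel kernel,
                cl_mem buffer, size_t n)
{
    clSetKernelArg(kernel, 0, sizeof(cl_mem), &buffer);

    /* Enqueue n work-items; the call returns immediately and the GPU
     * executes the kernel asynchronously. */
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &n, NULL, 0, NULL, NULL);

    /* Results only come back through explicit API calls (here we just
     * wait for completion; a clEnqueueReadBuffer would fetch the data). */
    clFinish(queue);
}
```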
Ref 22 has dead link
I'm new to editing wiki pages and not sure how to fix this, so I'm mentioning it here until I can figure it out or someone else does it.
Ref 22 has a dead/broken link. — Preceding unsigned comment added by Abdd0e77 (talk • contribs) 14:58, 28 May 2013 (UTC)
Computing any computable value
I just removed this from the lede:
- Any GPU providing a functionally complete set of operations performed on arbitrary bits can compute any computable value.
For a suitable definition of "computable value" (and some stretching of the notion of functional completeness), this is a tautology. It's not saying anything about GPUs. QVVERTYVS (hm?) 11:50, 25 August 2014 (UTC)
External links modified
Hello fellow Wikipedians,
I have just added archive links to one external link on General-purpose computing on graphics processing units. Please take a moment to review my edit. If necessary, add {{cbignore}}
after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}}
to keep me off the page altogether. I made the following changes:
- Added archive https://web.archive.org/20130616205308/http://openhmpp.org/ to http://www.openhmpp.org/
When you have finished reviewing my changes, please set the checked parameter below to true to let others know.
This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}}
(last update: 5 June 2024).
- If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
- If you found an error with any archives or the URLs themselves, you can fix them with this tool.
Cheers.—cyberbot II (Talk to my owner: Online) 23:35, 11 February 2016 (UTC)
External links modified
Hello fellow Wikipedians,
I have just modified one external link on General-purpose computing on graphics processing units. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
- Added archive https://web.archive.org/web/20100927155948/http://www.hpcwire.com/features/MATLAB-Adds-GPGPU-Support-103307084.html to http://www.hpcwire.com/features/MATLAB-Adds-GPGPU-Support-103307084.html
When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.
This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}}
(last update: 5 June 2024).
- If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
- If you found an error with any archives or the URLs themselves, you can fix them with this tool.
Cheers.—InternetArchiveBot (Report bug) 18:01, 9 September 2017 (UTC)
External links modified
Hello fellow Wikipedians,
I have just modified 3 external links on General-purpose computing on graphics processing units. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
- Added archive https://web.archive.org/web/20111101184143/http://www.clusterchimps.org/ocltools.html to http://www.clusterchimps.org/ocltools.html
- Added archive https://web.archive.org/web/20051129010626/http://www.gpgpu.org/w/index.php/Main_Page to http://www.gpgpu.org/w/index.php/Main_Page
- Added archive https://web.archive.org/web/20091223141933/http://developer.amd.com/GPU/Pages/default.aspx to http://developer.amd.com/GPU/Pages/default.aspx
When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.
This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}}
(last update: 5 June 2024).
- If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
- If you found an error with any archives or the URLs themselves, you can fix them with this tool.
Cheers.—InternetArchiveBot (Report bug) 12:39, 12 October 2017 (UTC)
External links modified
Hello fellow Wikipedians,
I have just modified one external link on General-purpose computing on graphics processing units. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
- Added {{dead link}} tag to http://www.futurechips.org/tips-for-power-coders/basic-technique-to-help-branch-prediction.html
- Added archive https://web.archive.org/web/20111105120646/http://www.futurechips.org/chip-design-for-all/cpu-vs-gpgpu.html to http://www.futurechips.org/chip-design-for-all/cpu-vs-gpgpu.html
When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.
This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}}
(last update: 5 June 2024).
- If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
- If you found an error with any archives or the URLs themselves, you can fix them with this tool.
Cheers.—InternetArchiveBot (Report bug) 04:48, 26 December 2017 (UTC)
The "emerging technologies" category
Does this fall under that category any more? I mean, is it still possible to buy a new GPU, integrated or otherwise, that doesn't have some level of OpenCL (or CUDA) support?
Also, in the molecular modelling section:
† Expected speedups are highly dependent on system configuration. GPU performance compared against multi-core x86 CPU socket. GPU performance benchmarked on GPU supported features and may be a kernel to kernel performance comparison. For details on configuration used, view application website. Speedups as per Nvidia in-house testing or ISV's documentation.
Is that a direct copy-paste from NVidia's website, or does someone just really like writing disclaimers? A Shortfall Of Gravitas (talk) 06:17, 27 July 2018 (UTC)
Updates for MATLAB
Hi, I work at MathWorks on the Parallel Computing team. Thank you for including MathWorks in this page about GPGPU. There are some updates from 2012, 2017, and 2019 that affect the accuracy of the content.
1. There was a product name change in 2019: MATLAB Distributed Computing Server is now MATLAB Parallel Server (https://www.mathworks.com/products/matlab-parallel-server.html)
2. MathWorks launched a new product for generating CUDA code in 2017: GPU Coder (https://www.mathworks.com/products/gpu-coder.html)
3. ArrayFire discontinued sales of Jacket in 2012: (https://arrayfire.com/blog/tag/jacket/)
We suggest that in the "Implementations" section you change "MATLAB supports GPGPU acceleration using the Parallel Computing Toolbox and MATLAB Distributed Computing Server, and third-party packages like Jacket." to "MATLAB supports GPGPU acceleration using Parallel Computing Toolbox, MATLAB Parallel Server, and GPU Coder."
Thank you for your consideration. (@MaxDZ8 would you be able to help?) OwlCarbonant54 (talk) 21:00, 24 January 2023 (UTC)