Talk:CUDA/Archive 1

Initial text

In fact, the CUDA SDK has been released to many registered developers. — Preceding unsigned comment added by 24.80.123.40 (talk) 22:09, 11 December 2006 (UTC)

references

The references section badly needs updating and reformatting. —The preceding unsigned comment was added by 151.37.169.186 (talk) 17:28, 19 February 2007 (UTC).

Bilinear filtering?

Currently, the article states:

[Disadvantage]: Only bilinear texture filtering is supported.

And why is this of importance for GPGPU? --Abdull (talk) 14:52, 23 May 2008 (UTC)

"Only bilinear texture filtering is supported – mipmapped textures and anisotropic filtering are not supported at this time."

I thought this claim was bizarre as well, and I think it boils down to the conceptual distinction between filtering and interpolation. Bilinear interpolation is supported, and in that spirit it would seem desirable to have higher-order interpolation methods as well. Mipmapping and anisotropic filtering, on the other hand, would seem fairly useless and out of place for GPGPU work. —Preceding unsigned comment added by 128.253.249.227 (talk) 18:06, 20 June 2008 (UTC)

CUDA to Run on ATi Cards

Eran Badit, editor-in-chief of ngohq.com, posted: "I can confirm that our CUDA Radeon library is almost done and everything is going as planned on this side. There are some issues that need to be addressed, since adding Radeon support in CUDA isn’t a big deal - but it’s not enough! We also need to add CUDA support on AMD’s driver level and its being addressed as we speak." For more information, visit http://www.ngohq.com/news/14254-physx-gpu-acceleration-radeon-update.html
http://www.tomshardware.com/news/nvidia-ati-physx,5841.html — Preceding unsigned comment added by Serag4000 (talkcontribs) 20:25, 8 July 2008 (UTC)

What are they trying to say?

First paragraph, last line: "By opening up the architecture, CUDA provides developers both with the low-level, deterministic, and for repeatable access to hardware that is necessary API to develop essential high-level programming tools such as compilers, debuggers, math libraries, and application platforms." It is so awkward, and the grammar so bad, that I am not sure what it is trying to say. If I could understand what they were getting at, I would fix it. It seems that they are talking about how CUDA gives programmers an interface to program the GPU. The main bits that are grammatically iffy are "both with the...", which seems to be followed by three things rather than the two that "both" implies, and "that is necessary API to". The sentence is a little too long, and the placement of clauses could be improved. Can someone translate? 122.107.110.187 (talk) 23:05, 18 April 2008 (UTC)

I almost left this sentence, but I agree it needs to be truncated. This sounds like marketing speak from someone with English as their second language. —Preceding unsigned comment added by 74.219.122.138 (talk) 12:44, 26 August 2008 (UTC)

The last edit removed information about support for single- v. double-precision floating-point.

The edit titled "Removed duplicated, too complicated text" removed all information about which nVIDIA GPUs and products support double-precision floating-point and which support only single-precision. This information is necessary for people to make purchase decisions, and should be included.

This could be done by bringing back the "Hardware" section which this edit deleted, or by adding footnotes to the "Supported GPUs" section later in the page.

I would make this change myself, but I do not have information which I believe to be 100% correct.

Bezenek (talk) 19:26, 29 October 2008 (UTC)

Please add comments to source code examples.

Could someone please comment source code examples? Source code without comments is useless as an example. I can only guess where the parallelism is hidden from the Python source and I have no clue about the C++ example. Thanks. 85.222.103.104 (talk) 11:45, 29 November 2008 (UTC)

Reads like an advertisement

"CUDA gives developers access to the native instruction set and memory of the massively parallel computational elements in CUDA GPUs." This is blatant advertising. I'm fixing it now.~-F.S-~(Talk,Contribs,Online?) 16:49, 3 December 2008 (UTC)

Uh, no it's not; it's an accurate description. See, e.g., Massive parallel processing. 68.73.84.231 (talk) 07:13, 11 February 2009 (UTC)

Supported GPU section

The article does not cite sources for the listed GPUs. The list given in the article does not agree with the obvious source, the official NVidia list (for example, the wiki lists the GeForce 9400M G as supported, but the NVidia website does not). If other sources are used, their location should be given. By the way: is the GeForce Go 6600 supported or not?

The list of supported cards is usually reasonably up-to-date in the CUDA programming manual. — Kristleifur (talk) 09:56, 25 September 2009 (UTC)

Compute Compatibility

There should be a section on Compute Compatibility, and what it entails. Then each card can have the compute compatibility next to it - this is very important for purchasing decisions and deepens the topic. borandi (talk) 08:39, 12 July 2010 (UTC)

bad example ... BOINC

Footnote 8 is a broken URL, but the reference in the body of the text is factually incorrect. BOINC does not use CUDA any more than CUDA uses BOINC. Both are tools for enabling certain aspects of challenging computing projects.

You can say that Folding@home uses CUDA (without BOINC) or that Milkyway@home uses CUDA (with BOINC) but BOINC, itself, does not use CUDA. 68.183.61.32 (talk) 17:37, 21 October 2010 (UTC)

Bug in example

I don't think that this line in the C++ example is correct:
dim3 gridDim((width + blockDim.x - 1)/ blockDim.x, (height + blockDim.y - 1) / blockDim.y, 1);

Why is blockDim.x/y added to the width/height, 1 subtracted, and then the whole thing divided by blockDim.x/y? I think it's pointless.
Example:
width = 800
blockDim.x = 16

((width + blockDim.x - 1) / blockDim.x) = 815 / 16 = 50.9375
With integer division truncating the result: 50

So, what's the point? 153.96.171.70 (talk) 11:53, 8 December 2010 (UTC)

Forget about it... the example works fine. 153.96.171.70 (talk) 08:33, 26 January 2011 (UTC)
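
For later readers, here is a minimal sketch of why the expression in question works. Adding blockDim.x - 1 before the integer division rounds the quotient up, so a width that is not a multiple of the block size still gets a final, partially filled block. With width = 800 the two forms agree, which is why the example above looked pointless. The sketch is in Python rather than the article's C++, and the helper name blocks_needed is hypothetical (it does not appear in the article's example).

```python
def blocks_needed(extent: int, block_size: int) -> int:
    """Blocks of block_size threads needed to cover extent elements (rounds up)."""
    # Adding block_size - 1 pushes any non-zero remainder over the next multiple,
    # so the floor division below behaves like a ceiling division.
    return (extent + block_size - 1) // block_size


if __name__ == "__main__":
    block = 16
    for width in (800, 801, 815, 816, 817):
        truncated = width // block            # plain integer division drops the remainder
        rounded_up = blocks_needed(width, block)
        print(f"width={width}: truncated={truncated}, rounded up={rounded_up}")
    # width=800 gives 50 either way; width=801 gives 50 truncated but 51 rounded up,
    # so without the adjustment the last column of pixels would get no thread.
```

Because the grid can now cover slightly more than width x height threads, a kernel launched this way would normally also check its coordinates against the real width and height before touching memory.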

Language links.

I made an update to reflect that CUDA version 3.1 is now the latest stable version.

There are now bindings for Perl, Ruby, and Lua as well as a new binding for Python. These bindings are provided by my company--if these bindings were not unique and these languages were not significant I would not have added them to this page. I leave it to the editors whether these link edits should be reverted. There is another binding for Ruby (Baracuda) that should also be added.


—Preceding unsigned comment added by Psilambda (talkcontribs) 02:27, 8 July 2010 (UTC)

Suggested change: the current production version (stable release) of CUDA is version 3.2 (ref). A release candidate of the CUDA Toolkit version 4.0 was announced in March 2011, and is currently available to registered CUDA developers, but the production release is not yet available. (See: http://www.ddj.com/high-performance-computing/229219474 ; http://drdobbs.com/high-performance-computing/229300467 ; and http://www.anandtech.com/show/4198/nvidia-announces-cuda-40) — Preceding unsigned comment added by Gmillington (talkcontribs) 23:30, 28 March 2011 (UTC)

Outdated info in 4th bullet under "Limitations" section

I’d like to suggest editing or removing the 4th bullet point in this section, as the information in it is outdated (Full disclosure – I work for NVIDIA, but my only objective is to make sure the CUDA information is factually accurate). At present, the bullet notes that CUDA does not support all of the IEEE rounding modes, which is no longer the case.

To be specific, with the launch of our Fermi-based GPU architecture in March of 2010 (see the “Current CUDA architectures” section for details), support for 4 rounding modes was enabled. In addition, on Fermi and later hardware, single-precision is IEEE 754 accurate by default.

My suggestion is to either remove this as a “Limitation,” or if preferred, to update the bullet point to reflect the above note about the Fermi architecture. I would be happy to supply supporting documentation as required. Thanks!

216.228.112.21 (talk) 21:54, 29 March 2011 (UTC)

Adding references to support the above changes: http://www.behardware.com/articles/772-7/nvidia-fermi-the-gpu-computing-revolution.html http://www.xbitlabs.com/news/video/display/20090930175307_Nvidia_Gives_a_Glimpse_on_Next_Generation_Fermi_Graphics_Processors.html http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIA_Fermi_Compute_Architecture_Whitepaper.pdf

Gmillington (talk) 23:03, 13 April 2011 (UTC)
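
As an illustration of what "4 rounding modes" means here, the sketch below assumes the comment refers to the four IEEE 754 rounding directions: to nearest (ties to even), toward zero, toward positive infinity, and toward negative infinity. It uses Python's decimal module purely as an analogy. It rounds decimal digits on the CPU, not binary floats on a GPU, and the helper name round_all is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_DOWN, ROUND_CEILING, ROUND_FLOOR

# Assumed mapping from the four IEEE 754 rounding directions to decimal-module modes.
MODES = {
    "nearest, ties to even": ROUND_HALF_EVEN,
    "toward zero": ROUND_DOWN,
    "toward +infinity": ROUND_CEILING,
    "toward -infinity": ROUND_FLOOR,
}


def round_all(value: str, places: str = "0.1") -> dict:
    """Round value to the given decimal precision under each of the four modes."""
    return {
        name: Decimal(value).quantize(Decimal(places), rounding=mode)
        for name, mode in MODES.items()
    }


if __name__ == "__main__":
    # A tie (x.x5) with both signs shows where the four modes disagree.
    for value in ("1.25", "-1.25"):
        for name, result in round_all(value).items():
            print(f"{value:>6} rounded {name}: {result}")
```

For 1.25 only the toward-positive-infinity mode yields 1.3, and for -1.25 only the toward-negative-infinity mode yields -1.3. The other modes give 1.2 and -1.2 respectively.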

Texture rendering is not supported?

I'm pretty sure that it IS supported. At least in the 2.2 version I'm studying now. 217.67.117.64 (talk) 08:33, 17 May 2009 (UTC)

I (Gernot Ziegler, NVIDIA UK employee, gziegler@nvidia.com) have just extended the line as follows: Texture rendering is not supported (CUDA 3.2 and up addresses this by introducing "surface writes" to cudaArrays, the underlying opaque data structure). — Preceding unsigned comment added by 80.108.22.119 (talk) 09:13, 12 July 2011 (UTC)

DMA transfers

Regarding: "Copying between host and device memory may incur a performance hit due to system bus bandwidth and latency"

Gernot Ziegler (NVIDIA employee, gziegler@nvidia.com) has just added

"(this can be partly alleviated with asynchronous memory transfers, handled by the GPU's DMA engine)"  — Preceding unsigned comment added by 80.108.22.119 (talk) 09:16, 12 July 2011 (UTC) 

Most Games Require CUDA Enabled Cards?

This is from the first section, third paragraph: "Nowadays most of the 3-D games that are released require a graphics card to run and CUDA is used in all of these graphics cards." This doesn't sound right to me... Enisbayramoglu (talk) 15:09, 21 October 2011 (UTC)

CUDA became opensource

http://developer.nvidia.com/content/cuda-platform-source-release — Preceding unsigned comment added by 177.16.247.85 (talk) 22:56, 16 December 2011 (UTC)

Max Register per thread

Please update the table with the maximum number of registers allowed per thread (63 on Fermi and Kepler 3.0, 255 on 3.5). — Preceding unsigned comment added by 92.79.135.41 (talk) 09:53, 5 July 2012 (UTC)

So is it based on open64 or pathscale?

The ref says pathscale, unless somebody can provide a ref to open64 as a foundation... 144.223.173.22 (talk) 21:29, 11 December 2008 (UTC)

The current compiler uses LLVM (one of the potential sources). I cannot update it due to the conflict of interest rule. 86.31.2.91 (talk) 19:35, 7 October 2012 (UTC)

Python Evangelism?

I know Python is *like the greatest programming language ever* but I think it's extra on this page, as the examples take up a lot of space and don't add anything. --Frozenport (talk) 01:20, 4 December 2012 (UTC)

Version 6

The specs need to be updated for version 6. Roger (talk) 06:19, 24 May 2014 (UTC)

Compute Unified Device Architecture

@Pateljay43: Since you have ignored my comments on your talk page, let's bring the discussion here. You have repeatedly removed properly sourced information and reverted multiple editors without a valid explanation (See 1, 2, 3, 4, 5, 6,7). The reliable sources in question are from AnandTech and Tom's Hardware, well-known and widely published resources. They clearly define the acronym CUDA. In addition to the reliable, secondary sources I've provided, here are two more straight from Nvidia: 1, 2. These are weaker primary sources, but apparently that's what you wanted to see, so there they are. I see no reason to use these as references in the article, since secondary sources are preferred, but I have provided them for your convenience.

Continuing to reinstate the changes you've made without first discussing them here can be considered a form of edit warring, which is not permitted. Engaging in an edit war can lead to the loss of editing privileges. Instead of reverting edits, participate in this discussion to change consensus. There are several editors who disagree with the changes you've made, so the burden is on you to justify the changes you are trying to make. --GoneIn60 (talk) 14:08, 19 May 2015 (UTC)

As a result of this edit war, the page has been editprotected for the time being. The protection will, of course, come off as soon as the matter is resolved, but I was left with little choice as the editor in question has proven remarkably stubborn and unwilling to participate in any discussion about the matter. In addition to their actions here, they have also been trying to replace the redirect from "Compute Unified Device Architecture" with a metacommentary essay about how stupid people are for thinking that's what CUDA stands for, which is even more inappropriate than their editwarring on this page (Wikipedia articles are never allowed to contain metacommentary about Wikipedia). The editor has also been advised that they may be cruising for a temporary or permanent editblock if they don't start cooperating instead of being disruptive, so hopefully this can be resolved soon. Bearcat (talk) 07:36, 20 May 2015 (UTC)
The user has now been temporarily editblocked by another administrator for continued inappropriate behaviour, so I've bumped the page protection back down to "pending changes". I'm prepared and willing to bump it back up again if the edit warring restarts, however, so please feel free to @ping me if necessary. Bearcat (talk) 19:04, 21 May 2015 (UTC)

Here's the earliest primary source I could find for the acronym, from 2006, in case it helps convince anybody: [1] Pjrich (talk) 13:44, 8 July 2015 (UTC)

Open-source CUDA implementations

Currently, there seems to be no mention of any open-source CUDA compilers in this article. GPUCC is the only one that I am aware of, but it was created very recently. Is it noteworthy enough to be mentioned in this article? Jarble (talk) 20:52, 10 August 2016 (UTC)

Define Compute Capability early on

It seems to relate to a version of something, but it is not clearly explained. ★NealMcB★ (talk) 14:26, 9 May 2017 (UTC)

Device list

Do we really need a list of all GPUs supporting CUDA in this article? It is really long, constantly growing, and does not seem to add much value. Is not a link sufficient? I am however hesitant to go ahead and delete it from the article without bringing this up in the discussion first. Cmpxchg8b (talk) 00:55, 25 February 2016 (UTC)

I agree the list is distractingly long and unnecessary. An external link and/or ref should suffice. --GoneIn60 (talk) 14:34, 25 February 2016 (UTC)
Alright, I went ahead and deleted that long list. Note that the exact same information is available right above that list in another list which is IMO more readable. There's a link to the manufacturer's site, too. So we're down from two lists and a link to one list and a link. Further comments are welcome. Cmpxchg8b (talk) 17:36, 4 March 2016 (UTC)
Am I right in thinking the "GPUs supported" chart contains (as of 2017-12-24) the same info? That chart is very handy and shouldn't be removed. BMJ-pdx (talk) 08:41, 24 December 2017 (UTC)

"Version features and specifications" chart seems fubar

It appears that the No/Yes indications under the "Compute capability" heading of the "Version features and specifications" are reversed. As it is, capabilities decrease with higher versions -- e.g., 6.x is "Yes" only for 64-bit floating point atomic addition, and 1.0 appears to have everything except the first two features. BMJ-pdx (talk) 08:53, 24 December 2017 (UTC)

Cuda 10 announced

CUDA 10 was announced at SIGGRAPH. — Preceding unsigned comment added by 79.214.190.227 (talk) 07:52, 16 August 2018 (UTC)