Wikipedia:Reference desk/Archives/Computing/2012 October 19




October 19


DDoS attack


When people are DDoSing a website, how does the site (server admin) stop the attack? Thanks, 70.114.254.43 (talk) 02:08, 19 October 2012 (UTC)[reply]

The section "Prevention and Response" in the article DoS lists some options. RudolfRed (talk) 04:16, 19 October 2012 (UTC)[reply]

Zero- or one-based indexing


A previous comment on this ref desk said: "Any programming language that isn't a toy defaults to arrays starting at zero. Using one-based arrays is a foolish concession to foolish programmers which fills anything more than trivial uses of array indexing with endless minor adjustments." As most of my programming is in C and Java, zero indexing seems natural to me. However, there are a number of languages that are not toys, such as Lua. There are also languages with no default, like Ada and Pascal. The primary advantage of zero-based arrays seems to me to be that you can treat elements as offsets (0 is this house, 1 is one down the road, and so on). It is not "obviously" an advantage where you just want to label a position in a record. This seems to be borne out by Erlang, which uses a zero base for arrays but a one base for lists.

What I can't see is why Erlang doesn't just index everything from 0 - what are the advantages of indexing lists from 1? If there were no advantages, I would have thought that zero would have the advantage of being consistent with arrays. -- Q Chris (talk) 09:49, 19 October 2012 (UTC)[reply]

The relevant article is Zero-based numbering. I don't think there are any disadvantages except for people whose brains can't wrap around the idea. There are a number of technical advantages. --Mr.98 (talk) 11:40, 19 October 2012 (UTC)[reply]
Sorry, I was asking the other way round: what is the advantage of one-based numbering? I assume that if there weren't any, then Erlang would use zero-based numbering for lists as it does for arrays. -- Q Chris (talk) 12:28, 19 October 2012 (UTC)[reply]
Erlang uses linked lists for the list data type. 1 is used to represent the first element, because it is an intuitive way of dealing with linked lists. They aren't array-based, so the address+offset (and intuitively 0-based) method doesn't apply. You don't normally want to look up a linked-list element by index anyway - if you are, then you should probably re-think how you are solving the problem. 209.131.76.183 (talk) 12:51, 19 October 2012 (UTC)[reply]
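(A minimal C sketch of the point above, added for illustration: the node layout and the helper name nth are made up, but it shows why counting nodes from 1 feels natural when there is no base-address-plus-offset to exploit.)

#include <stdio.h>
#include <stddef.h>

struct node { int value; struct node *next; };

/* Return the value of the nth node, counting the head as position 1. */
static int nth(const struct node *list, int n)
{
    int pos = 1;
    for (const struct node *p = list; p != NULL; p = p->next, pos++)
        if (pos == n)
            return p->value;
    return -1;   /* not found; error handling kept trivial for this sketch */
}

int main(void)
{
    struct node c = { 30, NULL }, b = { 20, &c }, a = { 10, &b };
    printf("%d\n", nth(&a, 1));   /* prints 10, "the first element" */
    return 0;
}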
Count-from-one gives similarity to how mathematics is written (for example, the very typical sigma-notation examples in the summation article, where i=1 means "the first one"). If you're taking a big paper about aerodynamics and want to turn it into code, count-from-one in the language means the code resembles the paper slightly more than count-from-zero would. Matlab works this way (matrix example). Fortran 90 lets you define the "extent" of an array, so arrays can run from any integer index (including negatives) to any other. Python will let you make things that syntactically look like arrays (but probably aren't really) that you can index with any darn thing you like. -- Finlay McWalter | Talk 13:01, 19 October 2012 (UTC)[reply]
Actually, Python has arrays/lists (that probably really aren't) that are indexed from 0, and dicts, which can be indexed by arbitrary immutable keys (but not, e.g., by a list). And you can probably create your own classes and do whatever you can imagine ;-). --Stephan Schulz (talk) 13:50, 19 October 2012 (UTC)[reply]
For the record there's also tons of mathematics written using zero-based indexing, so the split extends to math too.--83.84.137.22 (talk) 16:32, 19 October 2012 (UTC)[reply]
The advantage of 1-based indexing is that it relates better to the real world. Even a programmer able to think of everything as 0-based will likely still need to communicate with customers, who will get confused if he says "the zeroth item on the list returned will be...". Since the conversion between the 1-based real world and 0-based CPU has to occur somewhere, I'd argue that it should occur as deep inside the computer as possible. That is, you specify array item 1, and it figures out that this means memory offset 0. This allows programmers to more effectively relate to customers, customer specs, non-technical managers, etc. Essentially, the issue is whether computers should be adapted to fit us, or we should be adapted to fit computers. StuRat (talk) 17:58, 19 October 2012 (UTC)[reply]
I have to maintain a 10-year-old code base that was based pretty directly on an older embedded system, and there is constant 0-to-1 based translation going on, and it drives me crazy. In any new development I just let the array be one item larger than it needs to be and ignore the 0 element. Even in most embedded systems nowadays there is more than enough memory to lose a few bytes here and there for convenience. Obviously, the story is different when dealing with larger array elements, but in 95% of cases it is no problem to waste the bit of space. 209.131.76.183 (talk) 19:01, 19 October 2012 (UTC)[reply]
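(A minimal C sketch of the over-allocation trick described above; the array name and size are invented for illustration.)

#include <stdio.h>

#define NUM_ITEMS 10

int main(void)
{
    int item[NUM_ITEMS + 1];               /* one element larger than needed */

    for (int i = 1; i <= NUM_ITEMS; i++)   /* element 0 is simply never used */
        item[i] = i * i;

    printf("item[1] = %d, item[%d] = %d\n", item[1], NUM_ITEMS, item[NUM_ITEMS]);
    return 0;
}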
I often use 0 to mean something special. For example, with a color map, the color 0 might be used to mean transparent. StuRat (talk) 20:23, 19 October 2012 (UTC)[reply]
If you think of an array of n elements as a row of boxes of width 1, each of which holds a value, then the width of the array is n, and you can naturally put it on the number line with the leftmost point at 0 and the rightmost point at n. The elements of the array are 0:1, 1:2, ..., (n-1):n, where the a:b notation means the box that extends from a to b on the number line. There's no 0-vs-1 ambiguity there. The ambiguity shows up when you want to refer to a box by a single number instead of a pair of numbers. You can use the leftmost point, in which case the indices are 0, ..., n-1, or the rightmost point, in which case the indices are 1, ..., n. I think the latter works as well as the former as long as you're consistent about it. In Python, for example, if arrays started at 1, the first element of the array would still be [0:1], because the slice notation would exclude the low index and include the high, instead of the other way around. Even in C, arrays could start at 1 if you take the view that pointers point between bytes, and *p means the value that ends at the point designated by p, instead of the value beginning there. Likewise, in the C++ STL, iterators point between elements for many purposes (whenever you pass beginning and ending iterators to a function), and arrays could be indexed by 1 if you used the convention that *p, where p is now an iterator, means the element before p instead of the element after. The definition of v[i] in this system is still *(v.begin() + i). You don't need any ±1 fixups as long as you're consistent.
I think the only reason to use 1 rather than 0 is historical: human languages do it because counting words predate the discovery of zero. The best reason to use 0 is also historical: the currently most popular programming languages do it. Either way, to avoid off-by-one errors, you have to understand the difference between points (which lie between elements and have no width) and elements (which are bounded by points). This issue also comes up when dealing with times. 12:00 might mean a moment—the stroke of midnight—or it might mean the minute that starts at midnight (since digital clocks use the 0-based convention). A good time library distinguishes the two. -- BenRG (talk) 19:08, 19 October 2012 (UTC)[reply]
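(A short C sketch of the "points lie between elements" view described above, using the usual half-open [begin, end) convention; this is only an illustration of the idea, not anyone's production code.)

#include <stdio.h>

int main(void)
{
    int v[] = { 10, 20, 30, 40 };
    int *begin = v;
    int *end = v + sizeof v / sizeof v[0];   /* points one past the last element */

    /* end is never dereferenced, and no +1/-1 fixups are needed */
    for (int *p = begin; p != end; p++)
        printf("%d\n", *p);
    return 0;
}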
Also consider some of the implications of 1-based or 0-based counting. Here's a simple decrement print in FORTRAN:
     do I = 12,10,-1
       print *,I
     enddo
and the results:
         12
         11
         10
Here's a simple decrement print in Python:
for I in range(12,10,-1):
    print I
and the results:
12
11
So, which is more intuitive ? StuRat (talk) 19:38, 19 October 2012 (UTC)[reply]
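(As a further illustration, not part of the original posts: an idiomatic C loop uses the same half-open convention as Python's range(), so a descending loop written the usual way also prints only 12 and 11.)

#include <stdio.h>

int main(void)
{
    for (int i = 12; i > 10; i--)   /* stops before 10, like range(12, 10, -1) */
        printf("%d\n", i);
    return 0;
}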

Google Books


Hello,

How can I view pages of books in Google Books that are usually not visible? I really need one page from the Dostoevsky Encyclopedia, either 187 or 188. Is there any trick? Regards.--Tomcat (7) 09:55, 19 October 2012 (UTC)[reply]

Buy it! That's the reason it's not visible. -- Q Chris (talk) 09:59, 19 October 2012 (UTC)[reply]
It's on Amazon — do a "search inside" for 187 and the page comes up. --Mr.98 (talk) 11:43, 19 October 2012 (UTC)[reply]
I cannot view the page there. Regards.--Tomcat (7) 15:17, 19 October 2012 (UTC)[reply]
I can see page 187 here, not 188 (if you meant that book). You can try WP:REX.--Tito Dutta (talk) 20:09, 19 October 2012 (UTC)[reply]
I suppose if, somehow, you know what content might be on that page, you could type that content in the search bar along with the book name. If the book is preview-able you might be able to read that page, but only if it comes up as a suggestion. dci | TALK 02:22, 20 October 2012 (UTC)[reply]
Let me note that the set of visible pages in a Google Books preview varies from visit to visit, so pages that one person can see might not be visible to somebody else. That's been a frequent cause of confusion. Looie496 (talk) 16:40, 20 October 2012 (UTC)[reply]

Doubts about R


If you can use R just as a statistics package, why does it have its own programming language? Do you necessarily have to learn this programming language, or can you use all its capabilities through the corresponding libraries for Java or Python? OsmanRF34 (talk) 12:52, 19 October 2012 (UTC)[reply]

I don't think most people understand a "statistics package" to be anything more than software for statistical analysis, which R is. I'm sure there are ways to call R functions from Java and Python, but I doubt there's much advantage in that if you are starting from scratch.--83.84.137.22 (talk) 16:27, 19 October 2012 (UTC)[reply]

Do you have to understand pointers?


If you are programming in a high-level language, you know that a = 10 stores this 10 somewhere called 0x034873497 or something like that, but do you have to know that for anything useful? OsmanRF34 (talk) 14:14, 19 October 2012 (UTC)[reply]

The actual value (0x034...) doesn't normally come into play, but pointers can be useful in lots of higher-level applications. For example, if you're working with a 200MB piece of data in a class, you'll probably want to refer to it with pointers (or by reference, which just hides the fact that you are using a pointer) so that you aren't copying the entire object every time you pass it to a function. Pointer arithmetic also comes in handy in some situations, but you can probably get by without ever using it for most applications. I'm currently working on a project that involves multiple processes using a shared memory space. Properly using that space without introducing concurrency bugs involves allocators and queues built from scratch and is very reliant on pointers. 209.131.76.183 (talk) 15:02, 19 October 2012 (UTC)[reply]
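(A minimal C sketch of the pass-by-pointer point above; the struct name and size are made up for illustration.)

#include <stdio.h>

struct big_blob { char data[100000]; };   /* stand-in for a large object */

/* The whole struct is copied for every call. */
static char first_by_value(struct big_blob b) { return b.data[0]; }

/* Only a pointer (a few bytes) is copied. */
static char first_by_pointer(const struct big_blob *b) { return b->data[0]; }

int main(void)
{
    static struct big_blob blob = { "hello" };   /* static: keep it off the stack */
    printf("%c %c\n", first_by_value(blob), first_by_pointer(&blob));
    return 0;
}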
I've also found pointers quite useful when sorting long strings. If you move 4 byte pointers around, versus 400 byte (character) strings, that takes 1% as long, all other things being equal. StuRat (talk) 17:43, 19 October 2012 (UTC)[reply]
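(A rough C sketch of the idea above: sorting an array of pointers to strings moves only the pointers, not the string bytes themselves. The names here are invented for illustration.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: each array element is a char *, so a and b point to char * */
static int cmp(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

int main(void)
{
    const char *names[] = { "carol", "alice", "bob" };

    qsort(names, 3, sizeof names[0], cmp);   /* only three pointers are moved */
    for (int i = 0; i < 3; i++)
        puts(names[i]);
    return 0;
}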
Strictly answering the question: you don't have to understand anything. As far as I know, basic competency in computer science topics is not required by any officiating organization. However, even if you are not programming in a low-level language, you will improve your productivity as a programmer, and your program's efficiency, if you know a few things about computers. First, understand your computer's memory system. Peter Norvig's infamous article actually lists some common performance metrics for reasonable machines in this decade. If you're programming in a high-level language, these performance details affect you - even when you don't manage them directly through numeric access to machine memory or managed/virtual memory. And secondly, understand the difference between a piece of data, and a reference to that data. Whether your programming-language encapsulates that concept in the machine-specific representation of a memory-pointer to a hardware address is irrelevant: the distinction is real (albeit, a little bit abstract); and this distinction is very important when discussing algorithmic efficiency. Nimur (talk) 18:28, 19 October 2012 (UTC)[reply]
You are not "strictly answering the question" BTW. I asked if given a condition (high-level programming language), it would be useful to know something. In general, obviously you won't need basic competency in anything. OsmanRF34 (talk) 19:23, 19 October 2012 (UTC)[reply]
And to answer your Q, it certainly is important to understand the concept of pointers, even though you may not need to use them, in many cases. StuRat (talk) 20:21, 19 October 2012 (UTC)[reply]
I think the main point is that you need to grok the difference between a reference to a thing and the thing itself. This is probably the hardest part of understanding pointers, so in a way you do have to understand pointers. I think what makes learning pointers painful in C is not really the concept—it's that popular C compilers don't produce useful error messages when you do illegal things such as using a pointer after the pointed-to thing has ceased to exist, or moving a pointer off the end of an array, so it takes forever to figure out what went wrong when you make a mistake. There should be a learner's C implementation that does do all of these checks, but there isn't that I know of (at least, not free and actively supported). --BenRG (talk) 21:00, 19 October 2012 (UTC)[reply]
BenRG, you might find a managed C or C++ environment a lot more to your liking (or at least, more suitable when you train new programmers). Another option, for example, is static analysis with strongly enforced type- and use-warnings: if you use the llvm compiler and static analysis tools, you can compile C code in a way that will warn you about uninitialized or invalid pointers. However, permitting "abuse" of pointers is a feature of the C programming language (rather than any specific implementation on any particular machine). The language itself does not differentiate between "good" and "bad" data access. It is "assumed" that the programmer is providing correct and valid instructions - for example, even when he/she accesses a pointer to unallocated memory, the programmer may be intentionally accessing memory-mapped hardware. In managed C environments, much stricter rules are in force, and a static analyzer can therefore provide helpful messages about mis-used pointers. Or, consider Java: it is a language where the safety is designed into the programming language (rather than into any specific tool or compiler). It is not possible - in the strict, pure mathematically-provable sense - to construct a legal Java program that corrupts memory. (See, for example, the section on Language Safety in the LLVM compiler guide.) Nimur (talk) 22:40, 19 October 2012 (UTC)[reply]
"you might find a managed C or C++ environment a lot more to your liking"—no, the point is to learn standard C, not a garbage collected language with a C-like syntax. Not that there's anything wrong with that, but you might as well use a more popular one like Java. "The language itself does not differentiate between 'good' and 'bad' data access"—yes, it does. If you write int a[10]; int *p = a; p += 11; a conforming implementation is allowed to abort execution with a message like "pointer p was incremented past the end of array a" and a stack traceback. You may be used to writing things like *(char *)0x1234 = 0x56 to write a hardware register, but that's a feature of whatever embedded system compiler you're using, not a feature of standard C. -- BenRG (talk) 06:41, 20 October 2012 (UTC)[reply]
BenRG, on careful inspection of my K&R text, The C Programming Language, I concede the point: certain pointer access is explicitly declared illegal. For example, (Section 5.3), referring to an object outside an array bound is explicitly illegal. So, you are correct: a good C compiler could verbosely log that error at compile or even at runtime. (However, I interpret this to mean "p += 11; " is legal, while "a[11]" is not, in the example you gave). There are other cases, too. Other pointer access is explicitly syntactically legal, but its value is undefined per the C language specification. Your example, *(char *)0x1234 = 0x56, is just frightening, and I hope I don't come across as the kind of person who would author that style of code. I was thinking more about uninitialized or unallocated arrays accessed by constant pointers: in some embedded systems, it's common to treat an entire address range of memory-mapped I/O as one giant array, even though it's never been explicitly allocated or initialized. Such code is very easy to read and maintain, when written in C; and is "syntactically legal," per K&R's use of that phrase; but is clearly taking advantage of machine-specific behavior that is "undefined" by the language. Nimur (talk) 07:48, 20 October 2012 (UTC)[reply]
Many, many years ago, I was reviewing code checked in by my team lead and discovered that in some unlikely situation it did *((char*)0)=0, apparently because he wanted to stop execution at that point. I complained vociferously. He couldn't understand the problem. Marnanel (talk) 07:54, 20 October 2012 (UTC)[reply]
The ANSI C (draft) standard says of pointer arithmetic "Unless both the pointer operand and the result point to a member of the same array object, or one past the last member of the array object, the behavior is undefined", and the most recent draft ISO standard has similar language in section 6.5.6, so it's definitely permitted to abort at that point even if the pointer is never dereferenced. -- BenRG (talk) 23:09, 21 October 2012 (UTC)[reply]
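(A small C sketch of the out-of-bounds pointer arithmetic discussed above; nothing here is ever dereferenced, yet per the wording just quoted the second increment already makes the behavior undefined.)

#include <stdio.h>

int main(void)
{
    int a[10];
    int *p = a;

    p += 10;   /* allowed: p now points one past the end of a */
    p += 1;    /* undefined behavior: more than one past the end,
                  so a conforming implementation may abort here */

    printf("%p\n", (void *)p);   /* if we get this far, prints some address */
    return 0;
}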
No you don't have to understand pointers. I do a lot of programming in Matlab, which is a language that doesn't even have a pointer class and provides no simple way to directly access memory. One could get very deep into Matlab and do many useful things while having no idea about pointers or memory architecture in general. That said, understanding the internals is useful. Knowing when Matlab is going to pass a reference to data and when it is going to duplicate the data can often help diagnose performance bottlenecks. In practice, one could probably do a lot of code optimization simply by trial and error, but knowing how the internals work does provide an advantage. That said, I've seen a lot of academic research done by people who probably couldn't give you a good definition of what a pointer is, so it certainly is possible to program in some high level languages without ever learning about pointers. Dragons flight (talk) 20:41, 19 October 2012 (UTC)[reply]
In fact, MATLAB (the programming language) has pointers. These can be explicitly defined, using the MATLAB special superclass, handle; or they can be implicitly used via the MATLAB class system; or machine-specific memory access can be implemented by exercising an interface to any of the numerous external programming language interfaces. Or, you can implicitly access offset data, through the use of MATLAB's powerful matrix and vector syntax. It has been my experience that bad MATLAB code - the kind of code that causes people to invalidly assert that "MATLAB is slower than FORTRAN" - is almost entirely due to programmer error: a tiny misunderstanding of the subtleties of MATLAB's programming language syntax means the difference between copying a large matrix versus copying a tiny pointer to a large matrix. In fact, one of the things that makes MATLAB so convenient for writing mathematical expressions is that you can express data access using array and matrix notation - which is a special case of using pointers. Just because you exercise a small subset of the language - or you don't understand the way that the code you write is actually working - doesn't mean that MATLAB lacks pointers. In fact, one can write many complex C programs that use no pointer syntax; but that doesn't mean C lacks pointers. Nimur (talk) 22:44, 19 October 2012 (UTC)[reply]
I use handles, they are as close as Matlab comes to exposing a pointer class, but they aren't pointers as people who use low level languages would usually understand the term. Many languages have a kind of pointer which is a memory address you can manipulate, reposition, recast, etc. Matlab never exposes something that looks like a memory address, and imposes strong limitations on how and when data structures can be typecast. Matlab simply doesn't have any primitive that looks like a memory address pointer. I also wouldn't say that having Matlab call an external language, such as C or Java, to accomplish something is at all the same as saying that Matlab has that ability built into it. I do sometimes write scripts in other languages and interface them with Matlab to deal with Matlab's limitations, but just because such workarounds are available doesn't mean there isn't a limitation. Of course the underlying behavior of Matlab includes pointer style pass by reference and many other aspects where having an understanding of pointers and memory architecture is useful, but the programmer will not be manipulating something that actually looks like a memory reference. Lastly, please avoid the personal attacks. Dragons flight (talk) 00:01, 20 October 2012 (UTC)[reply]
No personal attack was intended. Anyway, the point is that pointers are not always C-style pointers, where the pointer is represented as a numeric type corresponding to a machine address. (In fact, K&R make clear that even in C, a pointer is not an integer: it is a pointer, and follows different rules, particularly with respect to arithmetic). A pointer is an abstraction of the representation of a reference to data, distinct from the data itself; it need not have anything to do with the data's layout in memory or its memory address. This abstraction feature is built in to the MATLAB programming language: in many ways, these pointers are more "pure" than C pointers; enforcement of strongly typed pointers comes with some advantages. Finally, I simply wanted to point out that MATLAB is not a sandboxed environment: the language specification explicitly allows for external access to system resources, including memory. This is unlike some other languages, which intentionally make raw machine access impossible for the sake of sandboxing. Nimur (talk) 08:13, 20 October 2012 (UTC)[reply]
To quote pointer (computer programming): "More generally, a pointer is a kind of reference, and it is said that a pointer references a datum stored somewhere in memory; to obtain that datum is to dereference the pointer. The feature that separates pointers from other kinds of reference is that a pointer's value is meant to be interpreted as a memory address, which is a rather low-level concept." (emphasis added). Matlab has operations that qualify as references, but the language of our articles, as well as my general understanding, is that what separates "pointers" from the broader class of "references" is that they can be interpreted as memory addresses. Matlab abstracts the memory operations in a way that generally makes such interpretations impossible. For example, a matrix in Matlab does not necessarily map to a contiguous block of RAM, and even within a contiguous block they don't guarantee that the fourth element of a list is always the fourth element in the allocated memory block. I could give many other examples. You seem to be defining "pointers" to include all forms of reference (computer science), which is an incorrect usage of the terminology in my opinion. They aren't "pure" pointers, because they aren't "pointers" at all. Dragons flight (talk) 18:44, 20 October 2012 (UTC)[reply]
Well, that (the bolded phrase) is clearly wrong. Look at pointers in (standard) Pascal, for example. Java calls them references most of the time but it has NullPointerException. Even in "modern" C (starting with ANSI C, which is over 20 years old now), thinking of pointers as memory addresses will lead you to write dangerously wrong code. -- BenRG (talk) 23:09, 21 October 2012 (UTC)[reply]

What is the first image posted to the Internet (not to the WWW)?


The image of the singing group "Les Horribles Cernettes" is the first image posted to the world wide web. The coffee pot is the first web-cam image. But what is the first image posted to the Internet? It's hard to search because many hits return, incorrectly, the Cernettes image. Feel free to define "posted to" and "Internet" in any sensible helpful way. I guess tcp/ip and publicly available are required. THANK YOU! :-) --31.126.189.205 (talk) 21:23, 19 October 2012 (UTC)[reply]

You'd also need to define "image" to come up with a sensible answer. Does ASCII art count as an image? --Carnildo (talk) 01:40, 20 October 2012 (UTC)[reply]
And what does "posted to the Internet" mean? Stored in some accessible place? Announced (how?) as being available? Doesnews:alt.binaries.pictures.* count as "on the Internet"? (Usenet did not always propagate on Internet, or so I understand.)—Tamfang (talk) 03:24, 20 October 2012 (UTC)[reply]
yeah, "posting to the internet" most often means posted on a website. Emailing or putting in a ftpable directory isn't usually considered posting, is it? Gzuckier (talk) 15:26, 25 October 2012 (UTC)[reply]


There were ASCII art versions of a number of Playboy centerfolds making their way around the networks in the mid-to-late 1970's (prior to TCP/IP). Zoonoses (talk) 03:37, 20 October 2012 (UTC)[reply]
"OMG, you can see her entire exclamation mark !" StuRat (talk) 06:58, 20 October 2012 (UTC) [reply]