Wikipedia:Reference desk/Archives/Computing/2012 December 21

Computing desk


December 21


What was/is the original Linux OS?


What was/is the original Linux OS? 13:02, 21 December 2012 (UTC) - Preceding unsigned comment added by 189.115.209.8 (talk)

Start by reading History of Linux. When first announced, Linux included a kernel, the bash shell, and gcc (a C compiler). This was the operating system: the kernel and the user interface (i.e., a command-line interpreter). Later, the user-interface software now named X.org was ported; many utilities like desktop managers and commonplace software packages became bundled as "Linux distributions," and many software forks split off. There is still one canonical repository of "the" source code, hosted at http://kernel.org - though this contains source for the Linux kernel only, and not for full distributions of GNU/Linux or alternative operating systems. Nimur (talk) 17:22, 21 December 2012 (UTC)
If I understand the question correctly, the OP is asking which Linux distribution (i.e. packaged bundle of the Linux kernel and tools) was the original one. Have a look at Linux_distribution, and especially at the nice timeline plot in that article. The answer is probably that (apart from a few short-lived efforts) Slackware is the oldest distribution that is still around. The other major distributions, such as Red Hat, Debian and their derivatives, are independent of Slackware though, so it is doubtful you can call Slackware the 'original' distribution. 81.156.176.219 (talk) 22:15, 21 December 2012 (UTC)
I will be sincere here. Linux is a kernel, not an OS. But I saw some people in some places saying that the kernel is an OS, or something like that. So I decided to post the question here to see what answers I would get. 177.98.122.31 (talk) 10:19, 22 December 2012 (UTC)
As you read more about operating systems, you will learn that the term is not very clearly defined, especially around the edges of its use-cases. For very simple systems, like the very earliest incarnations of Linux, there was no significant distinction between "the kernel" and "the system." This was especially true for the earliest Linux, because its scheduler was deeply architected to take advantage of the Intel 386 system. So, the kernel and its platform-driver were the same pieces of code. And, in the very beginning (during development), the kernel booted and did nothing; perhaps it ran Linus's test hello world program. Eventually, it booted and ran the bash shell. That was the operating system. Now, in modern use, we distinguish kernels from system software as a whole, because (especially in the last few years) the code has become clean and standard enough that you can easily separate these entities. You can replace the Linux kernel with GNU Hurd, and the "operating system" appears identical - not just the user interface, but even things OS programmers think about, like your USB keyboard driver, and your X.org server. So, on modern GNU/Linux distributions, we can "cleanly" say that the kernel and the OS are "separate parts." That clean separation is not always true for other systems, even some Linux systems. Nimur (talk) 18:35, 22 December 2012 (UTC)

Hash that allows for similarity comparison


Disclaimer: This is not for security. It is not for true hashing. It is for a very specific task that has absolutely nothing to do with common hashing tasks.

Is there a one-way function (like a hash) that allows for similarity comparison? For example, I want to hash names. I hash John Lennon and John W. Lennon. I want the result to allow me to identify that those two things are much more similar to each other than either is to a hash of Paul McCartney. So, this function requires two things. 1: It is one-way. I don't want to know the original information. 2: It retains similarity. 128.23.113.249 (talk) 13:24, 21 December 2012 (UTC)

I think you might need to be a bit more specific about your requirements. What kind of data are you hashing (presumably not names of musicians), and what does "similarity" mean in this context? Do you actually require the hash to be one-way, or is it just that it can be one-way? How much smaller do you want the hash to be than the original data? If you actually are interested in names, then are you aware of soundex and similar algorithms to group words with similar pronunciations together? 130.88.99.231 (talk) 16:01, 21 December 2012 (UTC)
The data will be names. But, the names cannot be known. Therefore, there must be a one-way hash-like function. So, assume the function spits out 8-digit numbers. I give you 12345678 and 12245678 and you can see that the original names, which you don't know, are about 90% similar. The end goal is to identify similarity of names without having an easy means of knowing the original names. 128.23.113.249 (talk) 18:27, 21 December 2012 (UTC)
If similar inputs give similar outputs, and if (as you apparently do) you also want to be able to reliably conclude the reverse (given similar hashes, you want to know that the names are similar), then it's hard to see how you can prevent someone from inverting the function just by searching. What you're looking for seems (almost?) provably nonexistent. --Trovatore (talk) 18:37, 21 December 2012 (UTC)
Perhaps Soundex would work for your purposes? --Chowbok 20:20, 21 December 2012 (UTC)
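For reference, a minimal Python sketch of classic (American) Soundex, which maps similar-sounding names to the same four-character code. It only groups names rather than giving a graded similarity score, and the code keeps the first letter in the clear. The names `soundex` and `CODES` are illustrative choices, not taken from any library.

```python
# Map each consonant to its Soundex digit; vowels, H, W and Y have no code.
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the 4-character Soundex code of a single name."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        # Emit a digit only when it differs from the previous coded letter.
        if code and code != prev:
            result += code
        # H and W do not separate equal codes; vowels do.
        if c not in "HW":
            prev = code
    # Pad with zeros or truncate to exactly four characters.
    return (result + "000")[:4]


print(soundex("Lennon"), soundex("Lenon"), soundex("McCartney"))
```

Misspellings such as "Lenon" collapse to the same code as "Lennon", while "McCartney" gets a different one.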


I suppose I could be a little more explicit about what I mean by "just searching". Suppose I have one of your hashes, and I want to know the input that generated it. I start generating inputs and evaluating the distance between their hashes and the given hash. I change a little bit and see if it improves or degrades the cost function (that's the distance between the hash of my chosen input, and the target hash). Now if I always accept a change that improves that, and always reject one that degrades it, that's called greedy descent, and it might not work because I could get stuck in a local minimum.
But there are lots of strategies to avoid this fate. A simple one is simulated annealing.
This might not work all the time, but it's probably going to work a significant fraction of the time, and it's hard to see how any scheme fitting your requirements can be made resistant to it.
On the other hand, since you say you don't care about "security", maybe you're not interested in preventing someone from recovering the input if he's willing to work that hard; you just don't want it to be trivial. That's probably possible. I'd start with a search on "obfuscation". --Trovatore (talk) 18:58, 21 December 2012 (UTC)
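The search argument above can be sketched in Python. Everything here is a hypothetical stand-in: `toy_hash` is a deliberately simple similarity-preserving hash (one salted digest per character position), `distance` counts differing positions, and `greedy_invert` is plain greedy descent using only those two functions. This toy cost function has no local minima, so greedy search is enough; a less regular scheme might need something like simulated annealing instead.

```python
import hashlib
import string

ALPHABET = string.ascii_uppercase + " "


def toy_hash(name, salt=b"hypothetical-salt"):
    """Toy similarity-preserving hash: one short digest per character.

    Assumes names shorter than 256 characters (position fits in a byte).
    """
    return [
        hashlib.sha256(salt + bytes([i]) + ch.encode()).hexdigest()[:8]
        for i, ch in enumerate(name)
    ]


def distance(h1, h2):
    """Number of differing positions, plus any difference in length."""
    return sum(a != b for a, b in zip(h1, h2)) + abs(len(h1) - len(h2))


def greedy_invert(target):
    """Recover an input for `target` by accepting only improving changes."""
    guess = ["A"] * len(target)
    best = distance(toy_hash("".join(guess)), target)
    for pos in range(len(guess)):
        for ch in ALPHABET:
            trial = guess[:pos] + [ch] + guess[pos + 1:]
            cost = distance(toy_hash("".join(trial)), target)
            if cost < best:  # keep a change only if it lowers the cost
                guess, best = trial, cost
    return "".join(guess)


secret = toy_hash("JOHN LENNON")
print(greedy_invert(secret))
```

The attacker never sees the name, only the hash and the distance function, yet needs at most 27 trials per character position.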
It sounds like you want fuzzy hashing (context-triggered piecewise hashes) (which we don't have an article on?). ssdeep does a version of that, although I'm sure there's something better for text. Soundex is a step in the right direction, but I don't think it directly does what you want. Does anyone know if there are any better wiki articles on fuzzy hashing? I don't know the names of any other algorithms that implement it, but that's what I'd start to investigate. Shadowjams (talk) 07:53, 22 December 2012 (UTC)
And, even ignoring the hashing issue, which names are most similar is still difficult to define. I'd assume the family name is most important, but this is sometimes the last name and sometimes the first name, depending on the culture. Then the given name may be next, or perhaps a nickname. Middle names are less important, but there can be any number of those. And what do we do about aliases, noms de plume, etc.? StuRat (talk) 08:23, 22 December 2012 (UTC)
Is a hash of length 27x26 bits ok?
(OR) If it is, you zero all bits first. Then you process all pairs of letters (JO, OH, HN, LE, etc.) as follows:
-Treat A as 0, B as 1, ... , Z as 25.
-Compute N = 26 * (first letter) + (second letter). If there is only one letter (W for example in JOHN W LENNON), make that the second letter and make the first letter 26.
-Set the Nth bit of the hash.
That way, the bit pattern will resemble the letter pattern. The number of set bits in the XOR of two hashes (their Hamming distance) is the "distance."
If 27x26 is too large, take the first letter modulo 9 and the second modulo 13. Voilà, a 9x13-bit hash at the price of some accuracy.
Close to trivial, and probably inferior to soundex, but maybe my hashing strategy is closer to the thing you had in mind. Hope that helps. - ¡Ouch! (hurt me / more pain) 18:05, 22 December 2012 (UTC)
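The letter-pair scheme described above can be sketched in Python as follows. The names `bigram_hash` and `distance` are made up for illustration. The sketch assumes that pairs are taken within each word (as in JO, OH, HN, LE), that non-letters such as the period in "W." are dropped, and that the "distance" is the number of set bits in the XOR of the two hashes.

```python
def bigram_hash(name):
    """Return a 27x26-bit hash as a Python int, one bit per letter pair."""
    bits = 0
    for word in name.upper().split():
        # Treat A as 0, B as 1, ..., Z as 25; ignore anything else.
        letters = [ord(c) - ord("A") for c in word if "A" <= c <= "Z"]
        if len(letters) == 1:
            # Lone letter: first letter is 26, the letter itself is second.
            bits |= 1 << (26 * 26 + letters[0])
        for first, second in zip(letters, letters[1:]):
            bits |= 1 << (26 * first + second)
    return bits


def distance(h1, h2):
    """Count the bits that differ between two hashes."""
    return bin(h1 ^ h2).count("1")


john = bigram_hash("John Lennon")
john_w = bigram_hash("John W. Lennon")
paul = bigram_hash("Paul McCartney")
print(distance(john, john_w), distance(john, paul))  # prints "1 19"
```

The two Lennon variants differ only in the bit for the lone W, while the McCartney hash shares no letter pairs with either of them.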

Is it possible to display 2 Excel 2007 spreadsheets on screen at the same time?

  Resolved

I'm using Excel 2007 and have 2 spreadsheets open. Is there a way I can display both on the screen at the same time? What I would love to do is to snap one spreadsheet to the left side of my monitor and the other spreadsheet to the right side of my monitor. But Excel won't let me do this and I don't see an option to turn it off. I'm using Excel 2007 and Windows 7. I have one monitor. A Quest For Knowledge (talk) 14:21, 21 December 2012 (UTC)

Never mind. I figured it out. You go to the View tab on the ribbon and click Arrange All. A Quest For Knowledge (talk) 14:46, 21 December 2012 (UTC)
For older versions of Excel, you can just open the spreadsheets in different instances of Excel, then resize as required. Dbfirs 21:01, 22 December 2012 (UTC)