Wikipedia:Reference desk/Archives/Computing/2011 March 20
March 20
reliability of UDP
Yeah, I know, it's the unreliable datagram protocol. My question is, suppose I send a UDP packet (100 payload bytes, say) to some arbitrary node on the internet that is online and listening for it. What is the practical likelihood of it actually getting there? The application is a "push" subscription service with occasional notifications. I don't want the overhead of a full TCP setup, don't want the subscribers polling all the time, and the messages are too infrequent to keep a TCP connection open, and it is deemed ok if a small percentage of messages never get delivered. Thanks. 75.57.242.120 (talk) 03:23, 20 March 2011 (UTC)
- The "probability" of successful packet reception depends on a lot of factors - which routers you need to use to connect from source to destination node; amount of traffic those routers are currently experiencing; what queuing or QoS mechanism they use; and so on. There's no substitute for benchmarking by monitoring a realistic, statistically-significant trial period. The percentage of dropped packets can practically range from 0 to 100 %. Nimur (talk) 06:16, 20 March 2011 (UTC)
- Hmm, thanks. Well, should the probability be fairly independent in the short run, from one packet to another? And should it be the same for UDP packets as for the raw IP packets that make up a TCP session? I think the Linux kernel may keep stats on TCP retry rates, so maybe checking those numbers is basically my answer. 75.57.242.120 (talk) 06:35, 20 March 2011 (UTC)
- Unless you have really peculiar requirements, I would suggest TCP. The rumored overhead of TCP is much exaggerated: there is an initial handshake of three small packets, and each packet header is 12 bytes larger than a UDP packet header - neither of which is likely to be a problem. A TCP connection does not need to carry traffic to stay open; it can sit idle forever with no packets passing in either direction. However, some firewalls time out idle TCP connections, so you may want to send a heartbeat packet every minute or so. As a bonus, a heartbeat lets you detect a defunct client - which you won't get with UDP; you'll never know if a UDP client has gone away. 88.112.59.31 (talk) 10:15, 20 March 2011 (UTC)
- Thanks. I thought TCP connections always time out after some number of minutes. A 1 minute heartbeat would increase traffic by orders of magnitude, if messages are only going out a few times a day (the message rate will vary by large factors). Also, TCP packets result in ack packets, so that doubles the number of packets by itself. I guess that's a reasonable point about detecting a disconnected client. I could have the clients poll ~1x a day to pick up any messages they might have missed, and stop sending to clients not heard from in a long enough while. Also I'm not sure how many TCP connections a standard Linux box can keep active at once. It would be great (probably not possible even with UDP) if I could serve ~1M clients from one box. 75.57.242.120 (talk) 10:40, 20 March 2011 (UTC)
- Personally, I would expect that, all other things being equal, all your UDP packets would get through. They will only be dropped under conditions of significant overload on the routers, which is by no means a common situation. I think you are likely to lose them as often as you lose a PING attempt. That said, it might be worth putting a little reliability into your own protocol: "This is the current message. The last one was this number. Please ask for it if you didn't receive it".--Phil Holmes (talk) 11:10, 20 March 2011 (UTC)
- ...at which point you are on your way to implementing your own, most likely poorly designed, copy of TCP. Retransmit timers, "did you get that message?", "did you get my message asking did you get that?", timer-controlled queue of messages in case a client asks for retransmission, etc. That is the complex slippery slope down which I imagine many attempts at using UDP descend. I simply use TCP; it gives me all that for free. 88.112.59.31 (talk) 18:09, 20 March 2011 (UTC)
Good points by everyone and I hadn't thought of the PING comparison. Yes the messages will have sequence numbers, so the clients could notice if there is a gap, but I wouldn't have any UDP retransmission of any sort. The application (not Wikipedia-related) is something like a Wikipedia watchlist, where you can view the status of articles that interest you by hitting a web page. The UDP service is so you can also have realtime alerts of when a watched article changes, without having to reload your watchlist page all the time. You can always get the same info by viewing the watchlist by HTTP, and that's what you'd do if you know you missed an alert. The 1-day-timeout thing means all subscriptions expire after 24h unless renewed, so to keep getting notices you'd have to refresh the subscription (by hitting a URL) once a day. 75.57.242.120 (talk) 00:02, 21 March 2011 (UTC)
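The sequence-number idea discussed in this thread (clients notice a gap and then fall back to the HTTP watchlist) could be sketched roughly as follows. This is only an illustration under stated assumptions: the 4-byte sequence-number header, and the names `pack_alert`, `unpack_alert`, `send_alert` and `GapDetector`, are hypothetical and not part of any protocol described above.

```python
import socket
import struct
from typing import Optional, Tuple

# Hypothetical wire format: 4-byte big-endian sequence number, then the payload.
HEADER = struct.Struct("!I")


def pack_alert(seq: int, payload: bytes) -> bytes:
    """Build one datagram: sequence number followed by the payload."""
    return HEADER.pack(seq) + payload


def unpack_alert(datagram: bytes) -> Tuple[int, bytes]:
    """Split a datagram back into (sequence number, payload)."""
    (seq,) = HEADER.unpack_from(datagram)
    return seq, datagram[HEADER.size:]


def send_alert(sock: socket.socket, addr: Tuple[str, int], seq: int, payload: bytes) -> None:
    """Fire-and-forget send: no ack, no retransmission."""
    sock.sendto(pack_alert(seq, payload), addr)


class GapDetector:
    """Client-side bookkeeping: reports how many alerts were skipped."""

    def __init__(self) -> None:
        self.last_seq: Optional[int] = None

    def observe(self, seq: int) -> int:
        """Return the number of alerts missed just before this one."""
        if self.last_seq is None:
            self.last_seq = seq
            return 0
        if seq <= self.last_seq:
            # Duplicate or reordered datagram: ignore it, do not move backwards.
            return 0
        missed = seq - self.last_seq - 1
        self.last_seq = seq
        return missed
```

A client that sees `observe()` return a non-zero value would simply reload the watchlist over HTTP rather than ask for a retransmission, which keeps the UDP side stateless.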
Has .gov.ly been hacked?
I've been trying to find information about the current situation in Libya directly from the horse's mouth, so I searched for http://www.google.com/search?q=site%3A.gov.ly&lr=lang_en. However, that doesn't give the desired result. The first link says "HACKED By SecurityPort.org [...] Turk tarihinde bir askeri ve siyasi basari olmaktan [...] www.mof.gov.ly/". The other websites are either 404 or "under construction". Have they all been hacked? Surely, .gov.ly must have an interest in making their side of the story known to the world? Justcurious2011 (talk) 06:51, 20 March 2011 (UTC)
- All that has happened is that one single website (http://www.mof.gov.ly - the Libyan Ministry of Finance), which uses an RSS feed aggregator, has read a feed from an untrusted source. As a result of what appears to be a malicious attack, somebody has injected an internet news item whose "title" is:
<td class="news_title"><meta http-equiv="refresh" content="0;URL=http://securityport.org/islam/"></td>
- And, because the MOF website does not appear to properly clean the input to its RSS aggregator, it simply includes that HTML in the page, and the injected meta refresh tag makes the browser redirect to the offender's website.
- How did this happen? 1) the MOF website uses an RSS aggregator script that doesn't check and clean its input; 2) somebody (who has access to a website that the RSS aggregator sources from) noticed this vulnerability, probably by recognizing the specific RSS system, whose exploit may be published; 3) the attacker created a new news-item on their own website, whose title was the code injection.
- See our article - code injection - preventing code injection with data sanitization - and secure input and output handling, for best practices that should be followed to eliminate this sort of vulnerability. Nimur (talk) 20:19, 20 March 2011 (UTC)
- Incidentally, you can override the URL redirect, if you want to see the original page. In Firefox, check the box for "warn me when web sites try to redirect or reload the page". Then, visit the website at www.mof.gov.ly - which is fully intact. And if you do not read Arabic, Google's machine-translator still works (it sanitizes and therefore ignores the code-injection)! Nimur (talk) 20:34, 20 March 2011 (UTC)
- Thank you, that seems to make sense for the first page. Unfortunately, I couldn't try it out since that page takes too long to respond now. Which leads me to the second question: I still don't understand why there are so few pages under .gov.ly that seem to work. Is that negligence by the domain owner, or a targeted effort by others? Justcurious2011 (talk) 20:48, 20 March 2011 (UTC)
- Keep in mind the difference between the domain name system and the individual websites that are underneath the domain. Generally speaking, each web site is operated independently - it has its own DNS name, its own server hardware, and probably has its own staff responsible for system maintenance and web content.
- Now, I am not sure which Libyan government website would be the authoritative news website. Our article, Media of Libya, lists several state-owned newspapers, including, for example, The Republic. Their website is updated daily, and does not appear to have been compromised in any way. So you can read the Government story at that website. (Today's top headline - The people of Libya call for the killing to end. If you do not read Arabic, you may be out of luck - but of course, that newspaper is not targeting your demographic). It does not appear that any foreign agent, state-sponsored cyber-warfare, or "internet vandal" has yet managed to pull the plug on those Libyan Government newspapers.
- You asked why so few Libyan government websites seem to be working. Having lived through a few rough spots in my day, I can tell you that trying to get professional workers to show up at the office and "fix computers", "write articles," and so on, while the city streets outside are unsafe, is not an easy job. If I were a Libyan government minister, and I wanted to make sure my side of the story was coming across, I would worry more about making sure my staff felt safe in their work environment, and keeping basic infrastructure (such as the plumbing, electricity, telephone, and roadways) operational. When the airstrikes get a little closer to Tripoli, those problems will be a lot harder to address. This "cyber war" nonsense is really laughable; some script kiddie in Turkey can shut down a website, but who cares? Bombs are literally falling. There are bigger problems for the Libyan government agencies than playing around with their websites. Nimur (talk) 21:54, 20 March 2011 (UTC)
- Good points by Nimur. I heard that during the Egyptian street protests last month, the Egyptian middle and professional classes were quite interested in what was going on, and they followed developments mostly by sitting around at home reading the internet. They might have felt sympathetic but they basically did nothing, leaving actual protesting to the plebes. Then the Egyptian govt shut down the internet in Egypt, hoping that stopping online communications between the protesters would make the protests go away. What happened instead was that the "armchair" observers now had no way of finding out what was going on other than getting out on the streets themselves, which they did. So the internet shutoff made the protest get bigger and more respectable, rather than smaller and more marginalized. 75.57.242.120 (talk) 23:47, 20 March 2011 (UTC)
- Thank you both, and especially thanks for the link to Media of Libya! BTW, as of now, no English state owned website listed in that article seems to be working. jamahiriyanews.com contains only sponsored listings (one of which claims it's from Al-Jazeera, although that seems fishy), and azzahfalakhder.com is "forbidden". But the Tripoli Post is alive and well and outspoken, with headlines such as "Despite Declaration, Pro-Al Qathafi Forces Reported Pushing into Benghazi". Justcurious2011 (talk) 07:15, 21 March 2011 (UTC)
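To illustrate the sanitization point made earlier in this thread: the injection works only because the feed title is pasted into the page as raw HTML. A minimal sketch of the usual fix, escaping the title on output, might look like this. The function name `render_title_cell` and the example URL are made up for illustration; nothing is known about how the MOF site is actually implemented.

```python
from html import escape


def render_title_cell(title: str) -> str:
    """Render an untrusted feed title as text inside a table cell.

    html.escape turns <, >, & and quotes into entities, so any markup
    in the title is displayed literally instead of being interpreted.
    """
    return '<td class="news_title">' + escape(title) + "</td>"


# A title shaped like the one in the attack (placeholder URL).
malicious_title = '<meta http-equiv="refresh" content="0;URL=http://example.invalid/">'

if __name__ == "__main__":
    print(render_title_cell(malicious_title))
```

With the title escaped, the browser shows the tag as visible text and never follows the refresh.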
Tracking Flickr images
Is it possible to deduce the "profile page" of an image on flickr from the file's URL? Trying to find licensing details etc. for http://farm3.static.flickr.com/2068/1791811909_acc666f314.jpg, and TinEye can't find it. Thanks! 83.70.250.202 (talk) 17:33, 20 March 2011 (UTC)
- Yes, use the API. Go to http://www.flickr.com/services/api/explore/ and take the values from the URL you've given: the photo ID is 1791811909 and the secret is acc666f314. The XML output generated gives all sorts of information about the image and the uploader. Nanonic (talk) 17:57, 20 March 2011 (UTC)
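The step of pulling the photo ID and secret out of the file URL can be done mechanically. The sketch below assumes the file name always has the form `<id>_<secret>[_size].<ext>`, as in the URL above; the function name `flickr_id_and_secret` is hypothetical, and the code only parses the URL, it does not call the Flickr API.

```python
import posixpath
from typing import Tuple
from urllib.parse import urlparse


def flickr_id_and_secret(url: str) -> Tuple[str, str]:
    """Extract (photo_id, secret) from a static Flickr image URL.

    Assumes the last path segment looks like '<id>_<secret>.jpg',
    optionally with a further '_<size>' suffix before the extension.
    """
    filename = posixpath.basename(urlparse(url).path)
    stem, _ext = posixpath.splitext(filename)
    parts = stem.split("_")
    if len(parts) < 2 or not parts[0].isdigit():
        raise ValueError("not a recognisable Flickr image URL: %r" % url)
    return parts[0], parts[1]


if __name__ == "__main__":
    url = "http://farm3.static.flickr.com/2068/1791811909_acc666f314.jpg"
    print(flickr_id_and_secret(url))
```

The two values can then be pasted into the API explorer mentioned above.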
Evolutionary algorithm for music
Some years ago I heard on the radio about a computer program that would play a simple tune or rhythm. If you told it that you liked that tune, it would offer variations of it to you. Thus the tune evolved into something you liked.
The interesting thing was that just telling it you liked something - yes or no - was sufficient, even though there must have been many variables involved to specify the music.
I would like to play with or program something like this myself. Does anyone have any more details about it, of how it worked etc? Thanks 92.28.241.202 (talk) 17:54, 20 March 2011 (UTC)
- Evolutionary music seems to be relevant. 92.28.241.202 (talk) 18:06, 20 March 2011 (UTC)
Maybe this? It's the first answer over at Wikipedia:Reference desk/Miscellaneous#Internet_song_generators and seems to be exactly what you're looking for 82.43.92.41 (talk) 18:35, 20 March 2011 (UTC)
- I worked on a paper from the early stages of evolutionary music. As with much artwork, what humans find pleasing is Zipf distributed frequency of items and events. You can read the whole paper here. Notice that this paper is about writing a program that decides if music is pleasing or not. The next step (which I had no part in) was writing music that would be pleasing. -- kainaw™ 21:33, 20 March 2011 (UTC)
- Awesome paper, Kainaw. I wonder if you could tune the aesthetic evaluation algorithm with a supervised machine-learning algorithm, and then run the music-synthesis step to numerically optimize for "most pleasant" music. Nimur (talk) 22:08, 20 March 2011 (UTC)
- That was done later. That is an old paper - from undergrad school. I also tried it with artwork. The problem is that measuring how pleasing certain metrics are is the easy part. Defining the metrics is hard. If you miss an important metric, you'll never get anything that sounds (or looks) pleasing. -- kainaw™ 22:29, 20 March 2011 (UTC)
I think 82.43.92.41 meant WP:RDM, the miscellaneous-topic reference desk. David Cope has done some nice work in automated composition that might be thought of as a fancier version of what you're asking for. He can configure his software to generate music in the style of classical composers like Bach, Brahms, etc. He had it generate 5000 fake Bach cantatas that you can download, and I've listened to a few of them. They are pretty convincing imitations if you listen to 15 or 30 seconds at a time: they use the rhythms, harmonies, etc. that a Bach listener would expect. The problem is that they are like a computer-generated novel with good sentences but no plot. They don't really go anywhere, and after a minute or two they stop being interesting. 75.57.242.120 (talk) 21:50, 20 March 2011 (UTC)
- I fixed the link 82.43.92.41 (talk) 22:16, 20 March 2011 (UTC)
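For anyone who wants to experiment with the yes/no idea from the original question, here is a very small sketch of interactive evolution. It is not a reconstruction of the radio program or of any system mentioned in this thread; the scale, the mutation rate and the names `random_tune`, `mutate` and `evolve` are all assumptions chosen for illustration. A tune is a list of MIDI note numbers; each round a variation is offered, and it replaces the current tune only if the listener says yes.

```python
import random
from typing import Callable, List

# One octave of C major as MIDI note numbers (an arbitrary choice).
SCALE = [60, 62, 64, 65, 67, 69, 71, 72]


def random_tune(length: int, rng: random.Random) -> List[int]:
    """Start from a random sequence of notes from the scale."""
    return [rng.choice(SCALE) for _ in range(length)]


def mutate(tune: List[int], rng: random.Random, rate: float = 0.25) -> List[int]:
    """Return a variation: one note always changes, others change with probability `rate`."""
    child = list(tune)
    forced = rng.randrange(len(child))
    child[forced] = rng.choice([n for n in SCALE if n != child[forced]])
    for i in range(len(child)):
        if i != forced and rng.random() < rate:
            child[i] = rng.choice(SCALE)
    return child


def evolve(likes: Callable[[List[int]], bool], rounds: int = 20,
           length: int = 8, seed: int = 0) -> List[int]:
    """Keep a single current tune; adopt a variation only when the listener likes it."""
    rng = random.Random(seed)
    current = random_tune(length, rng)
    for _ in range(rounds):
        candidate = mutate(current, rng)
        if likes(candidate):  # the only feedback is a yes/no answer
            current = candidate
    return current
```

In real use `likes` would play the candidate and ask the listener, for example with `input("Like it? [y/n] ")`; playing the notes is left out of the sketch.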
Computer self restores
I'm running an older computer (Pentium D) with Windows 7 and it's not on a UPS. Earlier today I had a Wikipedia page open for editing along with a text editor and AWB both running and none of them had been saved. I got called away and after 15 minutes the computer went to sleep. I was going to go back to it when the power went out. I decided not to restart for a couple of hours and when I did I was again called away just as I pressed the power button. This time when I got back I found that the computer was in exactly the same state as before the power outage. Firefox was still open as was the text editor and AWB all with the stuff I had been working on. Why would it being in sleep mode cause that to happen? I wouldn't have thought that the CMOS battery (see below for a question about that) would have been able to keep the information for a couple of hours. Of course I have now taken the panels off the computer to see if there is anything else that would supply power like that and there appears to be nothing. CambridgeBayWeather (talk) 23:02, 20 March 2011 (UTC)
- The CMOS battery is a tiny, weak, watch battery. It barely keeps alive a little quartz clock and a few tens of bytes of BIOS settings. It doesn't keep (and doesn't try to keep) the main system DRAM alive. What happened is that your computer first went into sleep mode, which means it was really still "on", but with some peripheral parts of the system powered down (but with the DRAM fully powered). At some later point it switched to hibernate, which means it wrote the state of the DRAM to a file on the disk and then fully powered the machine off. At some point after that your power died. Later you returned, fixed the power outage, and pushed the "power" button. The system booted, noticed that it had shut down into hibernation, and so restored the hibernate file into DRAM. -- Finlay McWalter ☻ Talk 23:28, 20 March 2011 (UTC)
- A slight correction: for a machine as new as yours, the BIOS status is probably written to Flash memory rather than NVRAM, so the battery only has to keep the clock ticking. -- Finlay McWalter ☻ Talk 23:30, 20 March 2011 (UTC)
- Firefox saves its entire state (as best it can) when it shuts down irregularly, and automatically tries to restore that state when it restarts. Looie496 (talk) 23:41, 20 March 2011 (UTC)
- It's worth bearing in mind that the "power button" isn't really what it used to be. In times of yore it was a real power button: pushing it toggled whether electricity flowed to the transformer, and pushing it when the machine was on was much the same as just yanking the power cord out of the wall. But now it's just a (albeit special) button; it sends a signal that's received by the OS, which decides how to handle it. Typically the OS will push the PC to a different ACPI state, depending on how it's configured. So it means "please shut down nicely, or something", rather than being a hard kill. Incidentally the OS isn't the only recipient of this message; the ACPI microcontroller (which is in the southbridge) is also listening; that's what turns a PC "on", and generally if you hold the power button in for 5 seconds the ACPI controller figures that the OS must be misbehaving and does a hard poweroff for you. You shouldn't do that unless things have really broken, as that leads to a full, unannounced poweroff with no work saved. -- Finlay McWalter ☻ Talk 23:46, 20 March 2011 (UTC)
- Looie496's explanation is the correct one, I'm pretty sure: Firefox cached what you were doing on disk while you were working, and restored the stuff from cache when you restarted the browser after the power outage. 75.57.242.120 (talk) 23:50, 20 March 2011 (UTC)
Windows 7 has a feature called Hybrid Sleep, which means when it goes to sleep it also hibernates (saves session to disk) in case of a power cut. The situation you describe is exactly what Hybrid Sleep was intended for. 82.43.92.41 (talk) 00:37, 21 March 2011 (UTC)
- Thanks all for the excellent answers. CambridgeBayWeather (talk) 02:12, 21 March 2011 (UTC)
CMOS battery time of day
When I was looking for an answer to the above I looked at Nonvolatile BIOS memory#CMOS battery which says that, "These cells last two to ten years, depending on the type of motherboard, ambient temperature and the time that the system is powered off". So why does the time of day have anything to do with battery life? CambridgeBayWeather (talk) 23:02, 20 March 2011 (UTC)
- Maybe it means "the amount of time the system is powered off". Meaning the total amount of time over the lifetime of the battery that the system is powered off, as opposed to the time of day (each day) that the system is powered off. Maybe it is just not written clearly. Bus stop (talk) 23:17, 20 March 2011 (UTC)
- Yes. The assumption is that when the PC is switched on, the clock is powered from the PSU rather than the battery, so the drain on the battery is proportional to the proportion of the day that the PC is off. Building a circuit that's powered by two unrelated and dissimilar power sources is slightly tricky, so in practice the battery is probably still partially loaded even when the PC is powered. -- Finlay McWalter ☻ Talk 23:35, 20 March 2011 (UTC)
- Thanks folks. I'm going to clarify that sentence. CambridgeBayWeather (talk) 02:13, 21 March 2011 (UTC)
Microsoft Excel 2007 linking automatic updating
I have several finance worksheets (sheet2, sheet3, etc.) in the same workbook (e.g., banking sheet2, saving sheet3, credit card sheet4, etc.). In the first worksheet, sheet1, I am trying to create a balance sheet that links to the other worksheets. In sheet1 cell B5, I have the linking string ='Sheet3'!A4. When I move Sheet3, cell A4 around within the same worksheet, I would like the contents of sheet1 B5 to automatically change to follow the move. ='Sheet3'!A4 doesn't change when I move Sheet3, cell A4 around within the same worksheet (Sheet3). Is there a linking code that I could use in sheet1 cell B5 that will automatically change when I move Sheet3, cell A4 around within the same worksheet? Thanks. -- Uzma Gamal (talk) 23:19, 20 March 2011 (UTC)
- Hum, I think adding dollar signs -- ='Sheet3'!$A$4 -- might do the trick. -- Uzma Gamal (talk) 23:35, 20 March 2011 (UTC)
- Actually, I meant "move" Sheet3, cell A4 by sorting Sheet3 by dates. Sometimes, the cells in Sheet3 do not return to where they were as new information is added and Sheet3 is sorted. Using ='Sheet3'!$A$4 for sheet1 cell B5 and inserting a line above Sheet3, cell A4 moved Sheet3, cell A4 to Sheet3, cell A5 and automatically revised ='Sheet3'!$A$4 to read ='Sheet3'!$A$5. When I sorted Sheet3, which resulted in Sheet3, cell A4 being located in Sheet3, cell A23, ='Sheet3'!$A$4 did not change. Is there a linking code that I could use in sheet1 cell B5 that will automatically change when I sort Sheet3, cell A4 within the same worksheet? -- Uzma Gamal (talk) 23:44, 20 March 2011 (UTC)
- You could try using the INDEX and MATCH functions. You need some kind of identifier in each row. Let's insert a column at the beginning of sheet3, so the identifiers are in column A and the data you want is now in column B. Let's say the identifier for the row you want is "foo". Then in Sheet1!B5 you put a formula like "=INDEX(Sheet3!B2:B100, MATCH("foo", Sheet3!A2:A100,0))". Look the functions up in the Excel help files for more details of how to use them. --Tango (talk) 23:52, 20 March 2011 (UTC)
- Wow! Thanks. That helped out. -- Uzma Gamal (talk) 02:58, 21 March 2011 (UTC)
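Why the INDEX/MATCH approach survives sorting may be easier to see outside Excel. The Python sketch below imitates the two functions on a toy list of rows; it is a simplified model, not Excel's actual behaviour (for instance, it compares text case-sensitively), and the helper names `match_exact`, `index` and `lookup` and the sample data are invented for illustration.

```python
from typing import List, Sequence, Tuple


def match_exact(key: str, lookup_range: Sequence[str]) -> int:
    """Rough analogue of MATCH(key, range, 0): 1-based position of the first exact match."""
    for position, value in enumerate(lookup_range, start=1):
        if value == key:
            return position
    raise LookupError("no exact match for %r" % key)


def index(values: Sequence[int], position: int) -> int:
    """Rough analogue of INDEX(range, position) for a single column (1-based)."""
    return values[position - 1]


def lookup(rows: List[Tuple[str, int]], key: str) -> int:
    """Find the data value in the row whose identifier equals `key`."""
    identifiers = [row[0] for row in rows]  # plays the role of column A
    data = [row[1] for row in rows]         # plays the role of column B
    return index(data, match_exact(key, identifiers))


if __name__ == "__main__":
    rows = [("bar", 10), ("foo", 42), ("baz", 7)]
    print(lookup(rows, "foo"))
    print(lookup(sorted(rows), "foo"))  # same answer after the rows are reordered
```

Because the lookup goes by identifier rather than by position, reordering the rows changes where "foo" sits but not the value returned, which is the property a fixed reference such as ='Sheet3'!$A$4 lacks.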