Talk:Google Books Library Project


University of Mysore


They are not part of the project. They are not listed on the partners page. See Partners. Bwwm (talk) 01:40, 6 February 2008 (UTC)

Bavarian State Library


Does this seem like a plausible strategy? Suggesting an implied invitation? --Ooperhoofd (talk) 02:55, 9 February 2008 (UTC)

a very good article


I found this article cohesive. The paragraphs are tied together and the article itself is no-nonsense. It is an impressive work. Maybe someday it will be a "Good article". Kushal 02:44, 18 February 2008 (UTC)

In contrast to the above user, I found this article to be quite lacking. It doesn't discuss any of the controversy surrounding the Library Project, or any of the relevant legal cases. --76.182.85.249 (talk) 15:49, 12 May 2008 (UTC)

I agree totally with the above user who found the article "quite lacking" for similar reasons.

Neutrality


The current version does not satisfy the requirements of WP:NPOV. It has already been remarked that criticism of the project is not described. What is more, statements such as the following are too opinionated, and not in the factual writing style expected from an encyclopedia (cf. WP:PEACOCK, WP:WEASEL):

adding a unique contribution not only from the wealth of books in its world-famous collection, but also ensuring that a significant voice will be heard in the discussions about the project's unfolding future.

Finally, the article consists mostly of quotes from press releases, whose advertising language adds little information. Examples:

  • Through this landmark partnership with Google, Wisconsin is taking a leading role in preserving public domain works for future generations ...
  • At the University of Texas at Austin, we hold a deep commitment to each of these objectives ...
  • The opportunities for education are phenomenal and we are delighted to be working with Google on this project.

Regards, HaeB (talk) 01:30, 5 July 2008 (UTC)

I agree, this is far too much press-release and sales talk. Also, it doesn't make clear how much of the collections "presented" would actually be thrown open in full text. Most of the current Google Books is nowhere near full-text access; what you get (unlike e.g. Gutenberg) is a random dozen pages of the book or just the title page and table of contents. Many people who are enthusing about Google Library plainly think all these hundreds of millions of books will be available in full text for free. Are Harvard, U Texas, the Bavarian Library etc. really planning to do this, or is it more about showing that they *have* the books? The article provides no real clarity about this.
The University of Texas at Austin, on their homepage, say this: "In the course of the multi-year project, Google will digitize at least one million volumes from the University of Texas Libraries' collections, working from selection lists prepared by the Libraries. (---) The digitized books will all be fully searchable through Google Book Search. Google pays particular attention to copyright law and has specifically designed Book Search to comply with it. Anyone will be able to freely view, browse and read the university's public domain books, including a number of unique treasures in the Libraries' historic collections.
For books protected by copyright, users will only be provided the basic background information (such as the book's title and the author's name), at most a few lines of text related to their search and information about where they can borrow or buy the book. Publishers or authors who wish not to have their books digitized can be omitted from inclusion in the project"
Notice that they are not talking about indexing the full extent of their libraries for Google - one million volumes can't be even a tenth of the total number of books of the U-Texas libraries. And that for books protected by copyright - which means the lion's share of 20th century books - only bibliographical data will be put online. So the number of books they actually scan full-text could well be below fifty thousand and those will mostly be a hundred years old or more. That's nowhere near the hype generated by the overall project /Strausszek (talk) 02:30, 15 November 2009 (UTC)

I find that all of the quotes from the individual libraries add little to this article. Instead I would rather see some sort of summary showing each institution's participation with the project. -- Preceding unsigned comment added by 68.197.180.177 (talk) 00:44, 23 April 2013 (UTC)

Agreed, so I have removed them. WP:BOLD. Lawsonstu (talk) 12:46, 26 August 2013 (UTC)

removing POV tag with no active discussion per Template:POV


I've removed an old neutrality tag from this page that appears to have no active discussion per the instructions at Template:POV:

This template is not meant to be a permanent resident on any article. Remove this template whenever:
  1. There is consensus on the talkpage or the NPOV Noticeboard that the issue has been resolved
  2. It is not clear what the neutrality issue is, and no satisfactory explanation has been given
  3. In the absence of any discussion, or if the discussion has become dormant.

Since there's no evidence of ongoing discussion, I'm removing the tag for now. If discussion is continuing and I've failed to see it, however, please feel free to restore the template and continue to address the issues. Thanks to everybody working on this one! -- Khazar2 (talk) 13:21, 14 June 2013 (UTC)

This is a very misleading article without any deeper knowledge!!!!


Lacking in Detail and Critical POV


In addition to the legal controversies, there certainly should be some mention of the academic criticisms of this project. For example: the Google Books project has made hundreds of scientific papers available for free download on eBook readers; these include original publications by people like James Watt, Heinrich Hertz, Carl Friedrich Gauss, Albert Einstein, and many others. On the face of it this is a noble effort to preserve the original publications by these great names in science. However, the editions Google has made available, while alleged by Google to be 'exact reproductions' of the original books, are, in fact, raw, unedited OCR scans. They are riddled with OCR artifacts to the extent that some of them are essentially unreadable.

In one case I checked the Google download of "The principle of relativity; original papers" of Albert Einstein against copies of the same papers taken from issues of the journals in which they were originally published, and found that virtually none of the equations had been reproduced correctly. This effectively renders Einstein's work gibberish.

I am not the only one to have noticed this: many academicians have questioned the usefulness of "preserving" imperfectly reproduced copies of historic documents, which contain reams of newly-introduced errors. In many of these texts, not even the "Digitized by Google" notice is consistently correctly reproduced, frequently appearing as odd combinations such as "Digitized by vj0O!^l".

http://www.historians.org/publications-and-directories/perspectives-on-history/september-2007/google-books-is-it-good-for-history

It's as if one attempted to "preserve" a forest for posterity by stripping the leaves from all the trees, scattering random shrubbery around, and coating the whole scene with varnish.

BTW: I would consider this article of high importance in software, website, and library categories, as the eventual resolution (or lack thereof) of these issues of the Google project stands to significantly impact virtually all future research:

http://www.guild2910.org/Pelopponesian%20War%20June%2013%202007.pdf -- Preceding unsigned comment added by 74.95.43.249 (talk) 00:53, 10 January 2014 (UTC)

Actually, Google calls those works exact reproductions of the original books because users can view the original scanned pages. Google never edits the OCR versions and they can contain errors. It is quite understandable that none of Einstein's equations would have rendered correctly through the OCR process. But the scanned pages cannot contain any errors at all and I think they should be quite readable. SD0001 (talk) 07:04, 31 January 2015 (UTC)
There is more than one book where pages have been missed or double-scanned. There are examples where parts of the original page are off the page, or obscured by a clip or other object. I think one of the problems was that Google wanted the data to mine for ngrams and so forth. Another is that such a large project needs a good QA system, which was clearly lacking. The type of people who can check every page of a book against a scan, then do another one, then another one.... and still pay close attention, are few and far between. All the best: Rich Farmbrough 22:24, 31 January 2015 (UTC).

Removed text


There are concerns that libraries participating in the project may be destroying old texts once they have been scanned. Many of these texts are old and scarce, and converting them from physical library books to unreadable scans may effectively be destroying some of these works forever, rather than preserving them.

The text above has been removed from the article because it was backed up with this as the source.

  1. The source is a discussion forum and does not qualify as a reliable source.
  2. The source seems to make no mention of the given concern.

Besides, this does not seem to have any factual basis. Google's scanning process does not harm the books. (See this and this.) Moreover, Google itself states on a help page that "Google has developed innovative technology to scan the contents of books without harming them. In addition, we won't scan any book that we or our library partners deem too fragile, and once we've scanned a book, it is promptly returned to the library collection." SD0001 (talk) 07:30, 5 February 2015 (UTC)