Wikipedia:Size comparisons
This article compares the size of Wikipedia with other encyclopedias and information collections.
Source material from which Wikipedia statistics in this article are derived is available from the Wikipedia statistics pages; the Footnote on Wikipedia statistics section at the end of this page provides technical discussion of the figures used in this article.
As of March 2006, the English Wikipedia alone had over 1,000,000 articles of any length, and the combined Wikipedias for all other languages greatly exceeded the English Wikipedia in size, giving a combined total of more than 856 million words in 3.1 million articles in over 200 languages. The English Wikipedia alone has over 340 million words, more than six times as many as the next largest English-language encyclopedia, Encyclopædia Britannica. It is also larger than the enormous 119-volume Spanish-language Enciclopedia universal ilustrada europeo-americana, about 70% of whose volumes have not been updated since 1933.
In 2005 the English-language Wikipedia more than doubled in size, and many smaller Wikipedias grew by a higher multiple.
There have been 89,921 contributors to all Wikipedia language editions (43,531 to the English-language edition), with 3.7 million edits in October (1.7 million of which were in the English version).
Not bad for an encyclopedia that is only five years old. But we must do better! Many of the articles are still of poor quality, and the average article length is only a little over half that of Encyclopædia Britannica. As Wikipedia grows more comprehensive, efforts are expected to move more towards increasing the quality, scope, classification and interlinkage of existing articles, rather than the creation of new articles, though see Wikipedia:Words per article for further discussion.
See Wikipedia:Modelling Wikipedia's growth for more educated guesses about the potential growth of Wikipedia.
Comparison of encyclopedias
Numbers regarding total characters are based on an estimated average word length of five, plus a space, or six characters per word.
On November 1, 2005, the English Wikipedia had 748,000 articles (see the Footnote on Wikipedia statistics at the end of this page) and 292 million words, giving a mean article length of about 382 words and about 1.75 billion total characters. It also had 390,000 photographs and illustrations, 696,000 redirect pages, over a million links to other websites and a staggering 16.8 million cross-reference links between articles.
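The character estimate can be reproduced with simple arithmetic. The following is a minimal Python sketch under the six-characters-per-word assumption stated above; the names `CHARS_PER_WORD` and `estimated_characters` are illustrative and do not come from the source statistics.

```python
# Assumption from the text: an average word is five letters plus one space.
CHARS_PER_WORD = 6


def estimated_characters(words: int) -> int:
    """Estimate total characters from a word count."""
    return words * CHARS_PER_WORD


# English Wikipedia word count on November 1, 2005 (figure quoted in the text).
english_words = 292_000_000
english_chars = estimated_characters(english_words)

# Express the result in millions, the unit used for the table's character column.
print(f"{english_chars / 1_000_000:,.0f} million characters")
```

The result is an order-of-magnitude estimate only, since real average word length varies by language and subject.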
Encyclopedia | Edition | Articles (thousands) | Words (millions) | Est. characters (millions) | Average words per article |
---|---|---|---|---|---|
Wikipedia | English | >1,000 | 320 | >1,752 | 382 |
Enciclopedia universal ilustrada europeo-americana | — | — | 200 | 1,000 | — |
Nationalencyklopedin | — | *183 | — | — | — |
Encyclopædia Britannica | 2002 | 85 | 44 | 300 | 650 |
Encyclopædia Britannica | Online | 120 | 55 | 300 | 370 |
Encyclopédie | — | 75 | — | — | — |
Microsoft Encarta | Encarta Deluxe 2002 | **70 | 40 | 200 | 600 |
Microsoft Encarta | Encarta Deluxe 2005*** | 63 | 40 | 200 | 200 |
Microsoft Encarta | 2002 Encarta Encyclopedia | 40 | 26 | 200 | 200 |
Encyclopedia Americana | 2004 | 45 | 25 | — | 556 |
Grolier Multimedia Encyclopedia Online | — | 39 | 11 | 70 | 280 |
Columbia Encyclopedia | Sixth | 51 | 6.5 | 40 | 130 |
* Number of encyclopedic articles. The Nationalencyklopedin contains a total of 356,000 entries.
** Includes 10,000 historical archives.
*** Advertised as containing "over 63,000 articles...with 36,000-plus map locations, and over 29,000 editor-approved Web site links."
Sizes of information collections which are not encyclopedias
Note that Wikipedia is neither a dictionary nor a web index: these figures are just for order-of-magnitude comparison.
Astronomy
- The Guide Star Catalog II has entries on 998,402,801 distinct astronomical objects searchable online.
- 5.5 TB of astronomical images (covering the whole night sky in several colours) are available online from Aladin.
Biology
- The World Resources Institute claims that approximately 1.4 million species have been named, out of an unknown number of total species (estimates range between 2 and 100 million species).
Chemistry
- As of 2005, over 23 million CAS registry numbers have been allocated for chemical compounds.
- The Beilstein database claims entries on "8 million organic and 1.4 million inorganic and organometallic compounds".
- The Merck Index Subscription Edition has over 10,000 monographs on chemical compounds.
Film and television
- As of 2004, the Internet Movie Database claims to have records on "380,000 titles and 1.6 million names", for a total of "over 6.3 million individual film/TV credits".
Genetics
- Each human being is estimated to have 20,000 to 25,000 genes, each of which probably deserves an article.
- Online Mendelian Inheritance in Man had 15,255 entries, each describing a known gene, as of April 8, 2004, according to its site statistics.
- Genbank, an online database of DNA sequences from over 165,000 species ([1]), has (as of August 2005) over 46 million entries covering over 100 gigabases.
Geography
- The NIMA (http://www.nima.mil/) GEOnet Names Server contains approximately 3.88 million named geographical features outside the United States, with 5.34 million names.
- As of March 2004, the USGS Geographic Names Information System claims to have almost 2 million physical and cultural geographic features within the United States.
Internet
- 8,168,684,336 web pages were known to Google on August 6, 2005, including all other web sites mentioned and Wikipedia itself.
- Netcraft logged roughly 46 million distinct websites in January 2004.
- As of January 2006, the Open Directory Project web index claims to have over 590,000 categories for 5.2 million websites.
Language
- The New Oxford Dictionary of English claims 350,000 definitions, and four million words.
- The Oxford English Dictionary, Second Edition claims 615,164 definitions, and 59 million words.[2]
Law
- American Jurisprudence, Second Edition, is a 231-volume collection of American common law.
- Black's Law Dictionary, Seventh Edition, has 24,500 common law legal terms.
Libraries
- The British Library claims that it holds over 150 million items.
- The Library of Congress claims that it holds approximately 119 million items, 12 million of which are electronically searchable.
- Copac is a searchable electronic catalogue of over 31 million books held in libraries in the United Kingdom and Ireland (including all electronic records from the British Library).
Music
- The freeDB database holds information for 1,579,205 compact discs. Many of the discs are duplicates, however, so the number of unique CDs is unclear.
- The All Music Guide database contains entries for 834,069 unique albums, and 14,642,322 credits (as of June 2005).
- The New Grove Dictionary of Music and Musicians, Second Edition, claims "25 million words with over 29,000 articles" about the subject of music alone.
People
- The Oxford Dictionary of National Biography has over 50,000 articles on famous Britons, in 50 million words (implying an average article size of about 1,000 words).
- The old British Dictionary of National Biography had 36,500 articles in 33 million words.
Larger numbers
- As of 2005, there are about six and a half billion human beings, each with his or her own life story. Between 25 and 100 billion more have lived and died in the past, although most of their lives are lost to history. As Arthur C. Clarke put it in his preface to 2001: A Space Odyssey (in 1968, when the world population was only about 3.5 billion [3]):
- Behind every man now alive stand thirty ghosts, for that is the ratio by which the dead outnumber the living. Since the dawn of time, roughly a hundred billion human beings have walked the planet Earth. — Now this is an interesting number, for by a curious coincidence there are approximately a hundred billion stars in our local universe, the Milky Way. So for every man who has ever lived, in this universe, there shines a star.
- There are, as indicated above, around 100,000,000,000 (100 billion) stars in the Milky Way galaxy. [4]
- There are approximately 70,000,000,000,000,000,000,000 (<math>7\times 10^{22}</math>) stars in the observable universe.
- Estimates by astrophysicists of the number of particles in the observable universe are currently on the order of <math>10^{85}</math>.
Footnote on Wikipedia statistics
Very detailed statistics for almost all aspects of Wikipedia are available from http://www.wikipedia.org/wikistats/EN/Sitemap.htm.
Statistics for this page are taken from the Article count (alternate) table and from the Words table.
Excluding redirect pages, there are roughly (using figures from November 1, 2005):
- 816,000 articles that have at least a single link.
- 748,000 articles that have at least a single link and 200 readable characters (roughly equivalent to at least 33 words).
Taking the difference of these two figures, there are about:
- 68,000 articles that have at least a single link but fewer than 200 characters.
There is also an uncounted number of articles which have no links. The current statistics provide no indication of the size of this last category. The upshot is that the 292 million words in fact span the 748,000 bona fide articles, the remaining 68,000 linked articles, and the unknown number of articles without links. A rough estimate of the word count in the latter two categories is six million words. Dividing the remaining 286 million words by 748,000 gives a mean article length of about 382 words.
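The arithmetic in this footnote can be checked with a short Python sketch. All figures are those quoted above for November 1, 2005; the variable names are illustrative, and the six-million-word figure is the footnote's own rough estimate rather than a measured value.

```python
# Figures quoted in the footnote (November 1, 2005, redirects excluded).
linked_articles = 816_000        # at least one link
bona_fide_articles = 748_000     # at least one link and 200 readable characters
total_words = 292_000_000        # all words in the English Wikipedia

# Rough estimate from the footnote: words in short linked articles
# plus words in articles without any links.
short_and_unlinked_words = 6_000_000

# Linked articles with fewer than 200 readable characters.
short_linked_articles = linked_articles - bona_fide_articles

# Words attributed to the bona fide articles, and their mean length.
bona_fide_words = total_words - short_and_unlinked_words
mean_article_length = bona_fide_words / bona_fide_articles

print(f"Short linked articles: {short_linked_articles:,}")
print(f"Mean article length: {mean_article_length:.0f} words")
```

Dividing the unadjusted 292 million words by 748,000 would instead give roughly 390 words, which is why the six-million-word adjustment matters for the quoted mean.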
Further, of the articles on the English Wikipedia, perhaps 36,000 are "data dumped" gazetteer entries about towns and cities in the United States. It is controversial whether gazetteer entries should count towards the number of "real" encyclopedia articles; however, their statistical significance is very much less now than in October 2002 when they were added. Very many have been colonised by Wikipedians who have transformed them to varying extents, including to an unimpeachably encyclopedic status.