File:Zipf-semi-1 Arabic, Geez, Hebraic.svg

Original file (SVG file, nominally 512 × 504 pixels, file size: 3.93 MB)

This is a file from the Wikimedia Commons. The description on its description page there is shown below.

Summary

Description	English: Zipf's law plot (frequency as a function of frequency rank) for the words in three Afro-Asiatic ("Semitic") languages: Ge'ez, Hebrew, and Arabic. The languages, texts, and frequency files are:

Ge'ez (Classical Ethiopian). Text of the Glory of the Kings (Kebra Nagast), a 14th-century chronicle of Ethiopian kings, part of the Coptic Bible. Published by Michal Jerabek. In the SERA encoding, with numerals excluded. Sample: be'akWetEtu le'Igzi'AbHEr 'ab 'a`hazE kWulu webeweldu 'iyesus krstos [...] Syon baHr seged Hzbe 'ar`ad qdme seged Zan seged wdm 'ar`ad `amde Syon. File geez/gok/tot.1/gud.wfr (34291 words, N = 12272 distinct).

Hebrew. The first five books (Torah, Pentateuch) of the Hebrew Bible (Tanakh). From the 10th-century version (the Masoretic text) of the original, probably composed mainly around 500 BCE from earlier texts. Obtained from the Sacred Texts site, maintained by John B. Hare. In an ad-hoc single-byte encoding designed to look vaguely phonetic under an ISO Latin-1 font: '¡' = alef, 'b' = bet, 'g' = gimel, '°' = sheva, 'ï' = hiriq, '¤' = dagesh/mapiq, etc. With vowel points but without cantillation marks. Sample: b¤°rë¡s¹ïy± b¤ârâ¡ ¡°êlöhïym ¡ë± häs¤¹âmäyïm w°¡ë± hâ¡ârêþ w°hâ¡ârêþ [...] k¤âlhäy¤âmïym. File hebr/tav/tot.1/gud.wfr (original 66311 words, truncated/filtered to 35027 words, N = 12487 distinct).

Arabic. The Quran (~650 CE). Based on the Unicode Quran document from the Sacred Texts site, maintained by John B. Hare, with several corrections. Arabic Unicode characters were mapped into ISO Latin-1 characters in a vaguely phonetic way. With short vowel marks, hamza, and madda, but without sukuns. Sample: bîsmî alllâhî alrrâµmânî alrrâµîymî alµâmdû lîllâhî râbbî al¿âlâmîynâ [...] tâttâ©î£ûwnâ mînhû sâkâräa wârîzqäa µâsânäa a¡înnâ fîy. File arab/quv/tot.1/gud.wfr (original 77411 words, truncated/filtered to 35027 words, N = 10762 distinct).

The word frequency files '///gud.wfr' are available at the UNICAMP website. The original annotated full texts, before truncation/filtering, are in the companion files //org/main.src. The truncated/filtered texts (one word per line, without punctuation) are in ///gud.tlw.
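
The rank-frequency data behind a plot like this can be derived from a one-word-per-line text. The following is only a minimal sketch, not the author's actual procedure: the function names rank_frequency and read_words are hypothetical, the Latin-1 file encoding is an assumption, and the layout of the real gud.wfr files is not described here, so none is implied. The sample words are the closing fragment of the Ge'ez sample quoted above.

```python
from collections import Counter


def rank_frequency(words):
    """Return (rank, frequency, word) triples, most frequent word first."""
    counts = Counter(words)
    # Sort by descending frequency; break ties alphabetically so the
    # ordering is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, freq, word)
            for rank, (word, freq) in enumerate(ordered, start=1)]


def read_words(path, encoding="latin-1"):
    """Read a one-word-per-line file, skipping blank lines.

    The default encoding is an assumption, not something the
    description states for the files themselves.
    """
    with open(path, encoding=encoding) as f:
        return [line.strip() for line in f if line.strip()]


# Closing fragment of the Ge'ez sample from the description (SERA encoding).
sample = ("Syon baHr seged Hzbe 'ar`ad qdme seged Zan "
          "seged wdm 'ar`ad `amde Syon").split()

table = rank_frequency(sample)
total_words = sum(freq for _, freq, _ in table)  # word tokens
distinct_words = len(table)                      # N, the distinct words

for rank, freq, word in table[:3]:
    print(rank, freq, word)
```

Plotting frequency against rank on logarithmic axes then gives the kind of curve shown in the figure; total_words and distinct_words correspond to the word count and N quoted for each text.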
Date	9 May 2023
Source	Own work
Author	Jorge Stolfi

Licensing

I, the copyright holder of this work, hereby publish it under the following license:

This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.

You are free:

to share – to copy, distribute and transmit the work
to remix – to adapt the work

Under the following conditions:

attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.

File history

	Date/Time	Dimensions	User	Comment
current	13:37, 15 May 2023	512 × 504 (3.93 MB)	Jorge Stolfi	Rebuilt the file with small changes in dataset, colors

File usage

The following page uses this file:

Zipf's law

Metadata

Short title	Gnuplot
Image title	Produced by GNUPLOT 5.4 patchlevel 2
Width	100%
Height	100%

Items portrayed in this file

creator	some value
copyright status	copyrighted
copyright license	Creative Commons Attribution-ShareAlike 4.0 International
source of file	original creation by uploader
inception	9 May 2023
MIME type	image/svg+xml