[95524] in tlhIngan-Hol
Re: [Tlhingan-hol] Fwd: RE: Klingon Scrabble
daemon@ATHENA.MIT.EDU (Rohan Fenwick - QeS 'utlh)
Fri Jan 11 09:51:44 2013
From: Rohan Fenwick - QeS 'utlh <qeslagh@hotmail.com>
To: <tlhingan-hol@kli.org>
Date: Sat, 12 Jan 2013 00:51:06 +1000
In-Reply-To: <6.2.5.6.2.20130110215607.064ed278@flyingstart.ca>
Errors-To: tlhingan-hol-bounces@stodi.digitalkingdom.org
--===============6472653827267460958==
Content-Type: multipart/alternative;
boundary="_e8a4e6ca-61bc-46a2-920b-9185108f3840_"
--_e8a4e6ca-61bc-46a2-920b-9185108f3840_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
ghItlhpu' Qov, jatlh:
> Now does anyone want to suggest which letter frequencies we diminish in order to
> bring the qaghwI' frequency up close to the 'at frequency?
Especially since building on words already on the board is a big part of
Scrabble: a couple of extra qaghwI'mey would open up opportunities for that,
given how many affixes contain one.
I know that Noah has put together a frequency distribution, which is pretty
good, but I was halfway through my own and so I'll post it as well. It's
based on four main texts in substantially different genres: ghIlghameS, the
existing portion of mIl'oD veDDIr SuvwI', nuq bop bom, and the Tao Te Ching.
nuq bop bom is by far the largest (350431 characters, vs. 122250 for mIl'oD
veDDIr SuvwI', 34901 for ghIlghameS, and 22589 for the Tao), so in order to
take this into account I multiplied the other three texts' results up so
that the populations from each text matched in size. I'm happy to send on
the Excel file with the stats in it so that the numbers can be checked.
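A minimal sketch of that scaling step, assuming it means multiplying each smaller text's letter counts by the ratio of the largest text's size to its own before pooling. The character totals are the ones quoted above; the function names and the input shape are hypothetical, and none of this is taken from the Excel file.

```python
# Character totals quoted in the message above.
TEXT_SIZES = {
    "nuq bop bom": 350431,
    "mIl'oD veDDIr SuvwI'": 122250,
    "ghIlghameS": 34901,
    "Tao Te Ching": 22589,
}


def scale_factors(sizes):
    """Return the multiplier that brings each text up to the largest text's size."""
    largest = max(sizes.values())
    return {name: largest / size for name, size in sizes.items()}


def pooled_frequencies(counts_by_text, sizes):
    """Pool per-text letter counts after scaling; return percentages per letter.

    counts_by_text maps text name -> {letter: raw count} (assumed input shape).
    """
    factors = scale_factors(sizes)
    pooled = {}
    for name, counts in counts_by_text.items():
        for letter, n in counts.items():
            pooled[letter] = pooled.get(letter, 0.0) + n * factors[name]
    total = sum(pooled.values())
    return {letter: 100.0 * n / total for letter, n in pooled.items()}


FACTORS = scale_factors(TEXT_SIZES)
for name, factor in FACTORS.items():
    print(f"{name}: x{factor:.2f}")
```

With this weighting each text contributes equally to the pooled percentages, so the largest text cannot dominate the result.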
0 points: chIm (2)
1 point: 'at (10), qaghwI' (10), 'It (8), 'et (8), 'ot (6), 'ut (6), Hay (5)
2 points: jay (5), may (5), Day (4), vay (4)
3 points: lay (3), ghay (2), bay (2), chay (2), Say (2), qay (2), nay (2)
4 points: tay (2), pay (2)
5 points: yay (2), way (2)
6 points: Qay (1), ray (1)
8 points: tlhay (1)
10 points: ngay (1)
As in English, the total is 100 tiles; the point values sum to 200.
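For anyone who wants to check those totals without the spreadsheet, here is a small sketch that simply re-adds the list above. The tuple layout and variable names are my own; the numbers are copied from the list.

```python
# (points per tile, letter name, number of tiles), copied from the list above.
TILES = [
    (0, "chIm", 2),
    (1, "'at", 10), (1, "qaghwI'", 10), (1, "'It", 8), (1, "'et", 8),
    (1, "'ot", 6), (1, "'ut", 6), (1, "Hay", 5),
    (2, "jay", 5), (2, "may", 5), (2, "Day", 4), (2, "vay", 4),
    (3, "lay", 3), (3, "ghay", 2), (3, "bay", 2), (3, "chay", 2),
    (3, "Say", 2), (3, "qay", 2), (3, "nay", 2),
    (4, "tay", 2), (4, "pay", 2),
    (5, "yay", 2), (5, "way", 2),
    (6, "Qay", 1), (6, "ray", 1),
    (8, "tlhay", 1),
    (10, "ngay", 1),
]

# Total number of tiles, including the two blank (chIm) tiles.
TOTAL_TILES = sum(count for _, _, count in TILES)

# Sum of point values over every tile in the bag.
TOTAL_POINTS = sum(points * count for points, _, count in TILES)

print(TOTAL_TILES, TOTAL_POINTS)  # prints: 100 200
```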
I've reduced the point value on tlhay, because as Qov points out, it's
somewhat overvalued at ten points: in the all-texts percentage, ngay is by
far the rarest, with a frequency of 0.86% (compared to tlhay, which has
1.49%, only just behind ray). The current standard distribution has two
raymey, and it'd be nice to keep two for the sake of playing -rgh codas, but
the frequency really doesn't justify it: ray is ranked 24th of 26 in the
all-texts percentage. yay and way each get two tiles because of the
-y/-y' and -w/-w' codas. Noah, in your distribution I would argue that of
the two, it should be yay, not way, that has two tiles: way can't appear in
the syllable coda for 40% of potential syllable shapes.
Unfortunately there's also a need for a relatively high proportion of
vowels, so that playing verb prefixes won't deplete the vowel-consonant
ratio too much; the total is therefore 38 vowels among the 98 letter tiles.
(The current distribution has 42 vowels.)
QeS
--_e8a4e6ca-61bc-46a2-920b-9185108f3840_--
--===============6472653827267460958==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
_______________________________________________
Tlhingan-hol mailing list
Tlhingan-hol@stodi.digitalkingdom.org
http://stodi.digitalkingdom.org/mailman/listinfo/tlhingan-hol
--===============6472653827267460958==--