[86912] in tlhIngan-Hol

Re: The topic marker -'e'

daemon@ATHENA.MIT.EDU (Tracy Canfield)
Sun Nov 22 12:02:32 2009

In-Reply-To: <249d5b950911220828x176ad975s8f0decb2caedcc2c@mail.gmail.com>
Date: Sun, 22 Nov 2009 12:00:04 -0500
From: Tracy Canfield <toastrix@gmail.com>
To: tlhingan-hol@kli.org
Errors-to: tlhingan-hol-bounce@kli.org
Reply-to: tlhingan-hol@kli.org

2009/11/22 Steven Lytle <lytlesw@gmail.com>:
> It seems that your (or any) MT program should at least attempt to translate
> even ungrammatical utterances.

I actually do take a pass at them after marking them as ungrammatical.

It's still important to distinguish the two: first, because you can
be much more confident about the intended overall meaning of the
grammatical ones, and second, because the grammatical ones are a lot
less ambiguous. You don't have to consider the possibility that a
noun ending in -vaD or -Daq could be the subject.

On the current build, if you take a sentence like

mapum Sor

which I think we can all agree is awful, you get

* fall tree

The * marks it as ungrammatical, but the program makes a try at the
individual words without trying to establish any relationship between
them.

In contrast

ngemDaq pum Sor

returns

The tree falls in the forest

with re-ordering, insertion of appropriate articles and prepositions,
etc.  (Plus a gentle reminder on a different line that there are other
legitimate parses because "ngem" and "Sor" could be plural.)
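
The contrast between those two outputs can be illustrated with a toy
sketch. This is not the actual program: the three-word lexicon, the
function names, and the single sentence pattern (optional -Daq nouns,
then a prefixless verb, then a bare subject noun) are all assumptions
made up for the illustration, and the plural-reading reminder is left
out.

```python
# Toy illustration only. LEXICON, analyze() and translate() are
# hypothetical names; the real program's design is not shown here.
LEXICON = {
    "pum": ("verb", "fall"),
    "Sor": ("noun", "tree"),
    "ngem": ("noun", "forest"),
}
VERB_PREFIXES = {"ma": "we"}    # ma- marks a "we" subject with no object
NOUN_SUFFIXES = {"Daq": "in"}   # -Daq marks a location


def analyze(word):
    """Return (category, gloss, prefix, suffix), or None if unknown."""
    if word in LEXICON:
        category, gloss = LEXICON[word]
        return category, gloss, None, None
    for prefix in VERB_PREFIXES:
        stem = word[len(prefix):]
        if word.startswith(prefix) and LEXICON.get(stem, ("",))[0] == "verb":
            return "verb", LEXICON[stem][1], prefix, None
    for suffix in NOUN_SUFFIXES:
        stem = word[:-len(suffix)]
        if word.endswith(suffix) and LEXICON.get(stem, ("",))[0] == "noun":
            return "noun", LEXICON[stem][1], None, suffix
    return None


def translate(sentence):
    parts = [analyze(w) for w in sentence.split()]
    if any(p is None for p in parts):
        return "* " + sentence  # unknown word: give up and echo the input

    # The one pattern this toy accepts as grammatical:
    #   [noun-Daq ...] bare-verb bare-noun
    if len(parts) >= 2:
        *locatives, verb, subject = parts
        verb_ok = verb[0] == "verb" and verb[2] is None
        subject_ok = subject[0] == "noun" and subject[3] is None
        locatives_ok = all(p[0] == "noun" and p[3] == "Daq" for p in locatives)
        if verb_ok and subject_ok and locatives_ok:
            # Re-order to subject-verb, add articles and a preposition.
            english = f"The {subject[1]} {verb[1]}s"
            for loc in locatives:
                english += f" {NOUN_SUFFIXES[loc[3]]} the {loc[1]}"
            return english

    # Ungrammatical: mark with * and gloss word by word, in the original
    # order, without relating the words to each other.
    return "* " + " ".join(p[1] for p in parts)


print(translate("mapum Sor"))
print(translate("ngemDaq pum Sor"))
```

In this sketch "mapum Sor" falls through to the word-by-word branch
because the ma- prefix clashes with an explicit subject noun, while
"ngemDaq pum Sor" matches the pattern and gets re-ordered.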

While it might well be worth doing more re-ordering of the
ungrammatical sentences, it's a lower priority than trying to ensure
that if a sentence *is* grammatical, the program can handle it.