Date: Fri, 14 Nov 2025 12:34:11 -0800
From: Lauren Weinstein <lauren@vortex.com>
To: privacy-dist@vortex.com
Coding with Gemini: Cheerful, cooperative, and usually, wrong.
https://lauren.vortex.com/2025/11/14/coding-with-gemini
Another experiment in #AI coding with #Google Gemini. I try to be
fair. When I call generative AI mostly slop, I don't do so blindly; I
attempt to conduct reasonable tests in various contexts.
Yesterday I needed a couple of routines -- one in Bash, the other in
Python. I tried the Python one first. This required code to
asynchronously access a remote site API, authenticate, send and
receive various data and process what was returned, relying on a
well-documented Python library on GitHub written specifically to deal
with that site's API.
After almost two hours, I gave up. Gemini was consistently cheerful
and cooperative -- almost to a creepy extent. It generated code that
looked reasonable, was very well commented, and even provided helpful
examples of how to configure, install, and run the code.
Unfortunately, none of it actually worked.
When I noted the problems, Gemini got oddly enthusiastic, with
comments like "Wow, that's a great explanation of the problems, and a
very useful error message! Let's figure out what's wrong! Here is
another version with more diagnostics that accesses the library more
directly!"
Sort of made me feel like I was dealing with an earnest but
incompetent TA in an undergraduate CS course at UCLA long ago. Which
was not something I enjoyed back then!
After a bunch of iterations, I gave up. Even starting over didn't
help. Gemini never seemed to produce the same code twice, no matter
how I worded the prompts. The code would follow completely different
models each time: sometimes embedded configuration values, sometimes
external files, sometimes command-line args. And the way it tried to
use the Python library in question also varied enormously. It almost
seemed random. Or at least pseudorandom.
I spent half an hour and wrote and tested the code I needed from
scratch. It worked on the second try, was about half the length of any
of the code Gemini generated, and was much simpler, for whatever
that's worth. By comparison, Gemini's code was bloated and definitely
unnecessarily complex (as well as wrong).
I did give Gemini another chance. I also needed a simple Bash script
to do some date conversions. I offered that task to Gemini since I
didn't want to bother digging through the various date format
parameters required. Gemini came up with something reasonable for this
in about four tries. Whether it's completely bug free I dunno for
sure; I haven't dug into the code deeply since it's not a critical
application. But it seems to be working for now.
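The post doesn't include the script itself, but for readers curious about the "various date format parameters" involved, here's a minimal sketch of the kind of Bash date conversion being described. This is my own illustration, not the actual code; it assumes GNU coreutils date (the -d flag), so it won't run as-is on BSD/macOS, where date uses -j -f instead.

```shell
#!/bin/bash
# Sketch of simple date conversions using GNU date's -d (parse) and
# +FORMAT (print) options. Assumes GNU coreutils date (Linux).

# Convert an ISO date (YYYY-MM-DD) to epoch seconds at UTC midnight.
iso_to_epoch() {
    date -u -d "$1" +%s
}

# Convert epoch seconds back to a human-readable UTC timestamp.
epoch_to_human() {
    date -u -d "@$1" '+%a, %d %b %Y %H:%M:%S %Z'
}

iso_to_epoch "2025-11-14"       # epoch seconds for 2025-11-14 00:00 UTC
epoch_to_human "1763078400"     # human-readable form of that same instant
```

The @N syntax tells GNU date to interpret the argument as epoch seconds, and -u keeps both directions pinned to UTC so the conversion round-trips regardless of the local timezone.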
So really, I haven't seen a significant improvement in this area.
There are probably some reasonable sets of problems where AI-coding
can reduce some of the grunt work, but once you get into anything more
complex, the opportunities for errors -- especially in larger chunks
of code where detecting those errors might not be straightforward --
seem to rise dramatically.
- - -
--Lauren--
Lauren Weinstein
lauren@vortex.com (https://www.vortex.com/lauren)
Lauren's Blog: https://lauren.vortex.com
Mastodon: https://mastodon.laurenweinstein.org/@lauren
Signal: By request on need to know basis
Founder: Network Neutrality Squad: https://www.nnsquad.org
PRIVACY Forum: https://www.vortex.com/privacy-info
Co-Founder: People For Internet Responsibility
_______________________________________________
privacy mailing list
https://lists.vortex.com/mailman/listinfo/privacy