Gemini AI responds with nonsense answers? Here's possibly why

Prompt engineers are reportedly forced to rate answers outside their expertise.

Key notes

  • Google reportedly forces Gemini AI’s prompt engineers to rate responses on topics outside their expertise due to new guidelines.
  • Contractors at GlobalLogic can no longer skip prompts outside their knowledge and must evaluate them.
  • “Hallucinations,” where AI makes up answers, occur because models rely on training data patterns rather than true understanding.

Google reportedly forces Gemini AI’s prompt engineers to rate its answers even on topics they have no knowledge about.

An exclusive report by TechCrunch reveals that Google’s new guidelines for contractors working on Gemini AI prohibit skipping prompts outside their expertise.

Contractors working at GlobalLogic, a Hitachi-owned outsourcing firm, have seen recent changes in the guidelines. Previously, it was acceptable to skip prompts outside their expertise.

The original guideline read, “If you do not have critical expertise (e.g. coding, math) to rate this prompt, please skip this task.” It has since been amended to, “You should not skip prompts that require specialized domain knowledge.”

Under the new guidelines, contractors are instead told to “rate the parts of the prompt you understand” and leave a note. Skipping is now only allowed when a prompt is missing critical information or contains harmful content that requires special consent forms to evaluate.

Prompt engineers play an important role in determining whether a generative AI, powered by large language models (LLMs), responds correctly. After you ask an AI chatbot something, a portion of the answers that are processed in the cloud rather than locally are reviewed by humans who evaluate their accuracy.

That human review is the backbone of the operation, especially for reducing and eliminating hallucinations in AI, where a model generates incorrect or nonsensical information while presenting it as fact. Hallucinations happen mostly because the model relies on patterns in its training data rather than true understanding, and getting that right is a lot of work.

For example, if you ask Gemini or ChatGPT about topics beyond their training cutoff dates, the chatbot will often still generate an answer rather than simply saying the topic is out of its reach.

And that happens a lot. Gemini has previously responded to users’ queries with ridiculous, almost meme-able answers, and OpenAI launched the GPT-4-powered CriticGPT to assist its own human reviewers.

Mikhail Parakhin, a former head of Microsoft’s Bing Search, explained: “If you prevent Bing from searching and yet want the answer, it is forced to make things up sometimes. That’s why we have Search on by default. This is the property of LLMs called ‘hallucination.’”
