This is a question-answering demo app using a Large Language Model hosted on Google Cloud Run. It showcases a retrieval system I designed: it retrieves the best-matching answer from a pre-made FAQ about me.
The FAQ contains information about who I am, what I do for work, my areas of expertise, and the frameworks I use. Try asking, for instance, "What kind of work do you do?".
With this method you can forget about hallucinations entirely: an LLM, by default, will often provide answers that are plainly wrong. It WILL make up facts, even when shown the correct data. This method takes a step back, takes some control away from the LLM, and forces it to provide a pre-written answer.
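The core idea can be sketched in a few lines: the system never generates free text, it only selects the closest pre-written FAQ entry. This is a minimal illustration, not the actual implementation; a real system would score similarity with an LLM or embedding model, and the FAQ entries and function names here are invented for the example. Plain bag-of-words cosine similarity stands in for the model:

```python
# Minimal sketch of constrained retrieval: the answer is always one of
# the pre-written FAQ entries, so the model cannot hallucinate new text.
# FAQ contents and similarity scoring are illustrative placeholders.
from collections import Counter
from math import sqrt

FAQ = {
    "What kind of work do you do?": "I build machine-learning systems.",
    "What frameworks do you use?": "Mostly PyTorch and scikit-learn.",
}

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bags of words.
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer(question: str) -> str:
    # Pick the FAQ question most similar to the user's question,
    # then return its pre-written answer verbatim.
    q = Counter(question.lower().split())
    best = max(FAQ, key=lambda k: cosine(q, Counter(k.lower().split())))
    return FAQ[best]
```

In the deployed demo the similarity step is handled by the LLM, but the constraint is the same: the output is always looked up, never generated.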
It also eliminates prompt injection attacks. In a normal chatbot, a user could inject a prompt that makes the LLM say anything, or even target other modules connected to the chatbot. By constraining the chatbot's response to a list of pre-written answers, this attack vector is closed entirely.
The model used for this demo is a 1.5-billion-parameter model, roughly 300× smaller than the largest models available. As such, it is not as accurate as the larger models. According to my experiments, a 9B-parameter model would be ideal (but it is too slow for this demo). If you have an OpenAI API key, you can try it with GPT-4o mini by toggling the option on the right.
This can be integrated into a chatbot to accurately answer specific questions the user may have in the middle of a conversation. For example, it could be used in a chatbot on a business website to make sure questions about the business are not answered incorrectly.
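One way such an integration could work, sketched under assumptions of my own (the threshold value and function names are invented): route a question to the FAQ only when retrieval confidence is high, and hand everything else to the regular chatbot.

```python
# Hypothetical routing sketch: answer from the FAQ when the retrieval
# score is confident enough, otherwise fall back to the normal chatbot.
CONFIDENCE_THRESHOLD = 0.5  # illustrative value, would need tuning

def route(question: str, retrieve, fallback) -> str:
    # retrieve(question) is assumed to return (pre-written answer, score);
    # fallback(question) is the unconstrained chatbot response.
    answer, score = retrieve(question)
    if score >= CONFIDENCE_THRESHOLD:
        return answer            # safe, pre-written FAQ answer
    return fallback(question)    # general conversation path
```

The design choice here is that the constrained path only fires when the question clearly matches the FAQ, so general chit-chat still flows through the normal model.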
Question Answering Demo
Question:
Answer:
Processing...
0%
Success! The answer is ready.
Error:
Advanced Options
Your API key will not be stored or saved in any way. It will only be used to answer this question.