After weeks of problem-solving and rigorous testing, Fingoweb interns have successfully launched the first version of a system based on a large language model (LLM). Wayfinder AI is now operational in a test environment and is being evaluated by our staff. What insights do the interns share about their experience?
In this article:
- Implementation and testing of the LLM.
- Functionality and limitations of the model.
- The art of prompt engineering.
- Interns’ perspectives on their experience.
In the initial publication about internships at Fingoweb, we outlined the AI project that students from Kraków universities are developing from scratch. During the early stages of building a model designed to answer shopping mall customers’ queries, the most significant challenges were:
- Providing sufficient data for training the model (the interns utilized synthetic data generated by GPT).
- Overcoming hardware limitations, which were resolved by employing a lightweight open-source language model (flan-t5) and powerful GPU-equipped computers.
Testing and Implementation
Following successful data preparation and model training, our interns were finally able to showcase the results of their work, initially among Fingoweb employees. The objective was to assess how the model would handle unexpected questions (a process similar to the way the popular GPT models were trained).
Even a small flan-t5 model requires suitable infrastructure. Supported by our specialists, the intern team decided to set up a server and build a user-friendly UI for conducting tests. Thus, Wayfinder AI was born.
Model Functionality and Limitations
The first version of the solution can answer questions like:
– Where can I buy a cap with a visor?
– Adidas, Cropp, Reserved.
– Where can I find shoes?
– ['House', ' New Yorker', ' Diverse', ' Bershka', ' Big Star', ' 4']
– Where can I buy hat?
– ['Lee Wrangler', ' New Yorker', ' Diverse', ' Bershka', ' Big']
– Hi, my name is Kuba. I am looking for something to eat
– ['KFC']
The system works well at classifying products and assigning them to specific stores, which was the primary goal of Wayfinder AI’s first iteration. It can also extract product information from questions phrased in different ways. Currently, the system can handle three types of questions:
- Normal – a question returning a list of stores.
- IDK – a question unrelated to the model’s task.
- Missing – a question about an item that no store in the mall sells.
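The three question types can be illustrated with a small post-processing step. This is a minimal sketch, not the interns’ code: it assumes the model returns either a Python-style list of stores, as in the example outputs shown earlier, or one of two fixed phrases. The names `KNOWN_STORES`, `clean_store_list` and `interpret_answer` are hypothetical, and the two phrases are borrowed from the example prompt in this article.

```python
import ast

# Hypothetical allowlist; a real deployment would load the mall's store directory.
KNOWN_STORES = {"House", "New Yorker", "Diverse", "Bershka", "Big Star",
                "Lee Wrangler", "KFC"}

# Assumed sentinel phrases, borrowed from the example prompt in this article.
IDK_PHRASE = "i do not know the answer"
MISSING_PHRASE = "does not have shop with such products"


def clean_store_list(raw_output, known_stores=KNOWN_STORES):
    """Parse a list-formatted answer and keep only recognised store names."""
    try:
        items = ast.literal_eval(raw_output.strip())
    except (ValueError, SyntaxError):
        return []
    if not isinstance(items, list):
        return []
    cleaned = []
    for item in items:
        name = str(item).strip()
        # Truncated fragments such as '4' or 'Big' are dropped here.
        if name in known_stores and name not in cleaned:
            cleaned.append(name)
    return cleaned


def interpret_answer(raw_output):
    """Return (question_type, stores) for one raw model answer."""
    lowered = raw_output.lower()
    if IDK_PHRASE in lowered:
        return "IDK", []
    if MISSING_PHRASE in lowered:
        return "Missing", []
    stores = clean_store_list(raw_output)
    # An answer with no recognisable store is treated as Missing.
    return ("Normal", stores) if stores else ("Missing", [])


print(interpret_answer("['House', ' New Yorker', ' Big Star', ' 4']"))
print(interpret_answer("I do not know the answer to this question"))
```

Checking each name against an allowlist is one simple way to keep cut-off entries out of the answer shown to the customer. Treating an answer with no recognisable store as Missing is a design choice made for this sketch.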
The Greatest Challenge and Future Goal
Ultimately, the system aims to search for specific items, a feature not present in the current version. For instance:
– Where can I buy the latest Harry Potter book?
Such a question would not be classified. While it wasn’t crucial at this stage, the interns were already testing solutions that would classify specific products and provide directions to the store.
Building such a system is not only about choosing a model, providing data, tuning, and setting up infrastructure. Preparing appropriate prompts is an interesting part of the work as well. Here’s an example of a prompt used in the system developed by the students:
'<s>[INST] <<SYS>>\nYou will be asked questions about Galeria Krakowska. If you are asked where you can buy something. return a list of stores where you are sure the product can be purchased. If there are no shops selling given product, return "The shopping mall does not have shop with such products". If question is not related to Galeria Krakowska, return \'I do not know the answer to this question\'\nYou can return only shops from this list: Ania Kruk, Cropp. Medicine, Bershka, New Yorker, House, Pull&Bear, Stradivarius, Reserved, Marc O'Polo, Greenpoint, Tatuum, Taranko. SWISS. Empik, Rossmann, Douglas, HEBE, Hugo Boss. Apart. Big Star, C&A, SEPHORA, Bytom, Vistula, Lavard, Ochnik, TOUS\n<</SYS>>\n\nGive me the correct answer to the question about the shopping mall, Galeria Krakowska. Q: Where can I buy gimbal? If there are no shops with this products, say that there are no shops with this item. Return it as comma separated list. [/INST]"
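A prompt like the one above can be assembled from a template, so that the question and the store list are filled in per request. The sketch below is an assumption about how this could be done, not the students’ implementation. The function name `build_prompt` and its parameters are hypothetical. The `[INST]` and `<<SYS>>` markers match the Llama 2 chat format, and the instruction wording follows the example prompt.

```python
def build_prompt(question, stores, mall="Galeria Krakowska"):
    """Assemble a Llama 2 chat-style prompt for one customer question."""
    store_list = ", ".join(stores)
    system = (
        f"You will be asked questions about {mall}. "
        "If you are asked where you can buy something, return a list of stores "
        "where you are sure the product can be purchased. "
        "If there are no shops selling given product, return "
        '"The shopping mall does not have shop with such products". '
        f"If question is not related to {mall}, return "
        "'I do not know the answer to this question'\n"
        f"You can return only shops from this list: {store_list}"
    )
    user = (
        f"Give me the correct answer to the question about the shopping mall, {mall}. "
        f"Q: {question} "
        "If there are no shops with this products, say that there are no shops "
        "with this item. Return it as comma separated list."
    )
    # [INST] wraps the user turn; <<SYS>> wraps the system instructions.
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


# Example call with a shortened, illustrative store list.
print(build_prompt("Where can I buy gimbal?", ["Empik", "Rossmann", "Douglas"]))
```

Keeping the store list as a parameter means the same template can be reused when the mall’s directory changes, without editing the prompt text by hand.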
Interns’ Perspectives on Their Experience
We asked Kuba Mieszczak, one of the interns in the Wayfinder AI creation team, about his project experience.
What are the biggest advantages and challenges of the current solution?
Wayfinder AI is undoubtedly a robust alternative to conventional search engines in shopping malls. The assistant can find specific stores, and eventually it will be able to locate products and guide customers to them.
The most significant challenge is that language models offer users considerable freedom. If we could trust users to use the model as expected, the work would be much simpler. In reality, users ask one or two task-related questions and then proceed to “test” the model’s limits, which can be problematic.
Examples of testing include asking complex questions or logical puzzles, entering random words or characters, or asking about sensitive topics. The model must be prepared for such situations.
Despite these challenges, we’re delighted that the small-sized open-source model could handle our initial requirements, even with the large number of products in the dataset.
Did you encounter any other unusual problems?
While it’s hard to call it an “unusual” issue, using LLMs in commercial projects like this is still relatively new. Articles, blog posts, and white papers appear almost weekly, but the techniques they describe for building your own AI-based solution don’t solve every problem. Sometimes the only way to check how the model behaves in a given situation is to ask it questions and proceed by trial and error, which means a lot of time spent on fine-tuning and testing.
On the other hand, that’s precisely what internships are about. We learn a lot from this process, and the solution already yields initial results at this stage.
Speaking of the next stage, what are you currently working on?
The next stages involve searching for new solutions and improving current ones. Our progress might be limited by the model itself. Therefore, we’re ensuring it can perform basic functions while also testing more advanced models. So now, we’re focusing on testing, improvements, and expanding functionality.
What have you learned, and what was the most interesting aspect?
As an intern, a student, and someone who knew little about LLMs before starting the project, I gained a wealth of knowledge from practical work with the model. We worked with text2text models (flan-t5) and text generation models (LLAMA 2), and I believe we as a team understand them quite well now. Additionally, we used models for translation and speech processing, worked with LangChain, created a synthetic dataset, and managed and processed it using Python libraries. The whole project offers practical knowledge and is incredibly fascinating.