AI Chatbot with Voice Sensor for Smart Customer Service Kiosk Using Raspberry Pi

ABSTRACT:
The AI Voice Banking System is an advanced, intelligent banking application that leverages Artificial Intelligence (AI), Natural Language Processing (NLP), and Speech Technologies to facilitate seamless and intuitive financial interactions through voice and text interfaces. The primary objective of this system is to enhance accessibility and usability of digital banking services, particularly for users with limited technical literacy, by enabling natural language-based communication in multiple Indian languages.
The system integrates a robust speech processing pipeline using the SpeechRecognition library in conjunction with the Google Speech Recognition API to capture and transcribe real-time voice input. To ensure reliability and improved accuracy under varying acoustic conditions, an offline fallback mechanism is implemented using the OpenAI Whisper model. Additionally, the system employs langdetect for automatic language identification and Googletrans for real-time translation of user input into English, enabling a unified processing pipeline for multilingual commands including Tamil, Telugu, Kannada, Hindi, Malayalam, Bengali, Marathi, Gujarati, and English.
The Natural Language Understanding (NLU) module is designed using a hybrid approach that combines rule-based intent detection, JSON-driven pattern matching, and a supervised Machine Learning model. The ML model is trained using TF-IDF (Term Frequency–Inverse Document Frequency) vectorization on both word-level and character-level features, and implemented using scikit-learn classifiers. Preprocessing techniques such as tokenization and lemmatization are performed using the NLTK (Natural Language Toolkit) to improve prediction accuracy. This multi-layered approach ensures robust intent classification and minimizes failure cases.
A specialized numeric processing module is incorporated to accurately interpret the Indian numbering system, allowing the system to understand and process spoken financial values such as “five thousand,” “two lakh,” and “ten crore.” This module utilizes custom algorithms for word-to-number conversion and pattern extraction from natural language input.
For output generation, the system uses gTTS (Google Text-to-Speech) as the primary speech synthesis engine, delivering high-quality, natural-sounding voice responses with support for Indian accents and multiple languages. In scenarios where internet connectivity is limited, an offline fallback is provided using pyttsx3, ensuring uninterrupted user interaction. Additionally, espeak is used as a lightweight backup option for constrained environments such as Raspberry Pi.
The application authenticates users with an email-based One-Time Password (OTP) sent over the SMTP protocol, adding a verification step that guards against unauthorized access. The user interface is built with CustomTkinter, which provides a modern, themed graphical interface. The backend uses an SQLite database to store user credentials, account information, transaction records, and session data.
The system is deployed on a Raspberry Pi 3B+, demonstrating the feasibility of implementing AI-powered voice banking solutions on low-cost, resource-constrained edge devices. This makes the solution highly suitable for rural and underserved regions where access to advanced computing infrastructure is limited.
In conclusion, the AI Voice Banking System presents a comprehensive and scalable solution that combines voice intelligence, multilingual processing, secure authentication, and real-time transaction handling to redefine the user experience in digital banking. The system highlights the potential of integrating AI and edge computing to create inclusive, accessible, and efficient financial services.

OBJECTIVES:
The primary objectives of the AI Voice Banking System are as follows:
1. To develop an AI-based voice-enabled banking system that allows users to perform financial transactions using natural language voice and text commands.
2. To implement multilingual support by integrating language detection and translation techniques, enabling interaction in multiple Indian languages such as Tamil, Telugu, Kannada, Hindi, Malayalam, Bengali, Marathi, Gujarati, and English.
3. To design an efficient speech recognition pipeline using the SpeechRecognition library and the Google Speech Recognition API, with Whisper as a fallback mechanism to ensure accuracy and reliability in diverse environments.
4. To build a robust Natural Language Processing (NLP) framework that combines rule-based intent detection, JSON-based pattern matching, and Machine Learning models using TF-IDF vectorization and scikit-learn classifiers.
5. To incorporate intelligent number processing capable of understanding the Indian numbering system, including terms such as thousand, lakh, and crore for accurate financial transactions.
6. To implement secure user authentication using email-based One-Time Password (OTP) verification through the SMTP protocol to ensure data privacy and system security.
7. To develop an interactive and user-friendly graphical interface using CustomTkinter, enabling easy navigation and accessibility for users with varying levels of technical expertise.
8. To integrate Text-to-Speech (TTS) functionality using gTTS for high-quality voice output, with pyttsx3 as an offline fallback to maintain system usability without internet connectivity.
9. To design and manage a reliable database system using SQLite for storing user credentials, account details, and transaction history efficiently.
10. To deploy the system on Raspberry Pi 3B+ as an edge computing solution, demonstrating the feasibility of implementing AI-driven applications on low-cost, resource-constrained hardware.
11. To enhance accessibility and inclusivity in digital banking, particularly for elderly users, visually impaired individuals, and people in rural or underserved areas.
12. To create a scalable and modular system architecture that can be extended in the future with features such as biometric authentication, mobile integration, and real-time banking APIs.
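
Objective 5 (number processing for the Indian numbering system) can be illustrated with a minimal sketch. The function name `words_to_number` and the word tables below are assumptions made for illustration and are not taken from the project source. The sketch handles only English number words (after translation) and sums every number it finds, so it assumes one amount per command.

```python
import re

# English number words up to ninety (illustrative table).
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19, "twenty": 20, "thirty": 30, "forty": 40,
    "fifty": 50, "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
# Scale words of the Indian numbering system.
SCALES = {"crore": 10_000_000, "lakh": 100_000, "thousand": 1_000}


def words_to_number(text):
    """Convert e.g. 'two lakh fifty thousand' to 250000; None if no number."""
    tokens = re.findall(r"[a-z]+|\d+", text.lower().replace(",", ""))
    total = 0      # sum of completed scale groups
    current = 0    # value built up since the last scale word
    seen = False   # True once any numeric token has been read
    for tok in tokens:
        if tok.endswith("s") and tok[:-1] in SCALES:
            tok = tok[:-1]  # accept plurals such as "lakhs"
        if tok.isdigit():
            current += int(tok)
        elif tok in UNITS:
            current += UNITS[tok]
        elif tok == "hundred":
            current = (current or 1) * 100
        elif tok in SCALES:
            total += (current or 1) * SCALES[tok]
            current = 0
        else:
            continue  # ignore filler words such as "deposit" or "rupees"
        seen = True
    return total + current if seen else None
```

For example, `words_to_number("two lakh fifty thousand")` yields 250000, and a command without any amount, such as "check balance", yields `None`.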

Block Diagram

• Demo Video
• Complete project
• Full project report
• Source code
• Complete online project support
• Lifetime access
• Execution Guidelines
• Immediate download

HARDWARE REQUIREMENTS:

1. Raspberry Pi 3B+
The Raspberry Pi 3B+ acts as the main processing unit of the system. It runs the entire application including the GUI, voice processing, AI modules, and database operations. It supports Python and is suitable for running lightweight AI applications. It also provides built-in Wi-Fi and Bluetooth for connectivity.

Fig: Raspberry Pi

2. USB Microphone
A USB microphone is used to capture the user’s voice input. It allows users to give voice commands to the system. The microphone plays an important role in speech recognition, and better audio quality improves the accuracy of voice processing.

Fig: USB Microphone

3. Speaker / Audio Output
The speaker provides voice output from the system. Text responses are converted into speech by the Text-to-Speech engine (gTTS or pyttsx3) so that users can hear them, which is especially useful for visually impaired users.

4. Power Supply (5V, 2.5A Adapter)
A stable power supply is required to run the Raspberry Pi and its connected devices. A 5V, 2.5A (or higher) adapter is recommended. With an underpowered supply, the Raspberry Pi may restart unexpectedly or malfunction.

Fig: Power Supply

SOFTWARE REQUIREMENTS:
The AI Voice Banking System is developed using a combination of software tools, libraries, and frameworks that enable speech processing, natural language understanding, machine learning, and user interface design. The following are the detailed software requirements:

1. Operating System (Raspberry Pi OS / Linux)
The system is deployed on Raspberry Pi OS, a Linux-based operating system optimized for Raspberry Pi hardware. It provides a stable environment for running Python applications and supports all required libraries for AI, speech processing, and GUI development. Linux-based systems are preferred due to their lightweight nature, open-source support, and efficient resource management.

2. Python Programming Language
Python is used as the primary programming language for developing the entire system. It is widely used in AI and machine learning applications due to its simplicity, readability, and extensive library support. Python enables seamless integration of different modules such as GUI, database, NLP, and voice processing into a single system.

3. CustomTkinter (Graphical User Interface)
CustomTkinter is used to design the graphical user interface of the application. It enhances the traditional Tkinter library by providing modern UI components, dark/light themes, and better styling options. It is used to create all screens such as login, registration, OTP verification, dashboard, and transaction panels, ensuring a user-friendly and interactive experience.

4. SpeechRecognition Library
The SpeechRecognition library is used for capturing audio input from the microphone and converting it into text. It integrates with the Google Speech Recognition API to provide real-time transcription with high accuracy. It includes features such as noise adjustment, energy threshold tuning, and timeout handling to improve recognition performance.

5. Whisper Model (Speech Recognition Fallback)
The Whisper model is an advanced deep learning-based speech recognition system used as a fallback when the primary speech recognition fails. It is capable of handling noisy audio, different accents, and multilingual speech. Whisper ensures robustness and reliability of the system, especially in real-world environments where audio quality may vary.
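
The primary/fallback arrangement described above can be sketched as an ordered chain of recognizers. This is a minimal sketch, not the project's code: `transcribe_with_fallback` and the two stub engines are hypothetical names. In a real deployment the callables would wrap the Google recognizer of the SpeechRecognition library and a local Whisper model; those wrappers are not shown here.

```python
def transcribe_with_fallback(audio, engines):
    """Try each (name, recognize) pair in order and return (name, text).

    `recognize` is any callable that takes audio and returns a string,
    or raises an exception when it cannot produce a transcript.
    """
    errors = []
    for name, recognize in engines:
        try:
            text = recognize(audio)
        except Exception as exc:  # e.g. network failure or unintelligible audio
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text.strip()
        errors.append(f"{name}: empty transcript")
    raise RuntimeError("all speech engines failed: " + "; ".join(errors))


# Stand-in engines for demonstration only (hypothetical).
def online_stub(audio):
    raise ConnectionError("no internet connection")


def offline_stub(audio):
    return " check my balance "
```

With `engines = [("google", online_stub), ("whisper", offline_stub)]`, the first engine fails and the call returns the transcript of the second one together with its name, which can be logged to see how often the fallback is used.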

6. Natural Language Processing (NLTK)
The Natural Language Toolkit (NLTK) is used for preprocessing user input text. It performs operations such as tokenization (splitting text into words), lemmatization (reducing words to their base form), and filtering irrelevant tokens. These preprocessing steps improve the accuracy of intent classification and enable better understanding of user commands.

7. Machine Learning (Scikit-learn)
Scikit-learn is used to build the intent classification model. The system uses TF-IDF (Term Frequency–Inverse Document Frequency) vectorization to convert text into numerical features. These features are then used by classification algorithms to predict user intent. The trained model helps the system understand commands like “check balance,” “deposit money,” and “withdraw amount.”
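
To show what TF-IDF features over words and characters look like, here is a pure-Python stand-in that runs without scikit-learn. It is a sketch under stated assumptions: the training phrases, the `threshold` value, and all function names are illustrative, and a nearest-example cosine match is swapped in for the scikit-learn classifier the project uses.

```python
import math
from collections import Counter

# Illustrative training phrases (not the project's dataset).
TRAINING = [
    ("check my balance", "balance"),
    ("what is my account balance", "balance"),
    ("deposit money into my account", "deposit"),
    ("i want to deposit cash", "deposit"),
    ("withdraw amount from my account", "withdraw"),
    ("i want to withdraw cash", "withdraw"),
]


def features(text):
    """Word tokens plus character trigrams (word-level + char-level features)."""
    text = text.lower()
    padded = f" {text} "
    trigrams = [padded[i:i + 3] for i in range(len(padded) - 2)]
    return text.split() + trigrams


def fit_idf(docs):
    """Smoothed inverse document frequency for every feature seen in docs."""
    n = len(docs)
    df = Counter(f for doc in docs for f in set(features(doc)))
    return {f: math.log((1 + n) / (1 + c)) + 1 for f, c in df.items()}


def vectorize(text, idf):
    """L2-normalised sparse TF-IDF vector; features unseen in training are dropped."""
    tf = Counter(f for f in features(text) if f in idf)
    vec = {f: count * idf[f] for f, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {f: v / norm for f, v in vec.items()}


def cosine(a, b):
    """Dot product of two normalised sparse vectors."""
    return sum(v * b.get(f, 0.0) for f, v in a.items())


IDF = fit_idf([text for text, _ in TRAINING])
EXAMPLES = [(vectorize(text, IDF), label) for text, label in TRAINING]


def predict_intent(text, threshold=0.2):
    """Label of the most similar training example, or 'unknown' if too dissimilar."""
    query = vectorize(text, IDF)
    score, label = max((cosine(query, vec), lab) for vec, lab in EXAMPLES)
    return label if score >= threshold else "unknown"
```

The character trigrams make the match tolerant of small transcription errors, which is one plausible reason for combining word-level and character-level features. Returning "unknown" below the threshold leaves room for the rule-based and pattern-matching layers to handle the command instead.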

8. Googletrans (Translation Library)
Googletrans is used for translating user input from multiple languages into English. This allows the system to process commands in a unified format regardless of the input language. It plays a key role in enabling multilingual support for Indian languages.

9. Langdetect (Language Detection)
Langdetect is used to automatically identify the language of the user’s input. It detects whether the input is in English, Hindi, Tamil, Telugu, Kannada, or other supported languages. This information is used to select appropriate processing and response mechanisms.
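
The detect-then-translate step can be sketched as a small routing function. The names `SUPPORTED` and `to_english` are hypothetical, and the detector and translator are passed in as plain callables so the sketch runs without network access. In the real system they would wrap langdetect and Googletrans; those wrappers are not shown. Falling back to English when detection fails is an assumption, not documented behaviour.

```python
# ISO 639-1 codes for the languages listed in this document.
SUPPORTED = {
    "ta": "Tamil", "te": "Telugu", "kn": "Kannada", "hi": "Hindi",
    "ml": "Malayalam", "bn": "Bengali", "mr": "Marathi",
    "gu": "Gujarati", "en": "English",
}


def to_english(text, detect, translate, default="en"):
    """Return (language_code, english_text) for one user utterance.

    detect(text) -> language code; may raise on very short or empty input.
    translate(text, source_code) -> English text.
    """
    try:
        code = detect(text)
    except Exception:
        code = default  # assumed policy: treat undetectable input as English
    if code not in SUPPORTED:
        code = default
    if code == "en":
        return code, text  # already English, skip the translation call
    return code, translate(text, code)
```

Keeping the detected code alongside the English text lets the response stage choose the same language for speech output.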

10. gTTS (Google Text-to-Speech)
gTTS is used as the primary Text-to-Speech engine to convert system responses into natural-sounding audio. It supports multiple languages and provides high-quality voice output using Google’s speech synthesis service. It enhances user interaction by providing clear and understandable responses.

11. pyttsx3 (Offline Text-to-Speech Engine)
pyttsx3 is used as an offline fallback for speech synthesis. Unlike gTTS, it does not require internet connectivity and works locally on the system. This ensures that the application continues to provide voice feedback even in offline conditions.

12. SQLite Database
SQLite is used as the backend database to store user information, account details, and transaction history. It is lightweight, serverless, and efficient, making it ideal for embedded systems like Raspberry Pi. It ensures fast data access and reliable storage without requiring complex database setup.
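
A minimal sketch of such a database, using Python's built-in `sqlite3` module, is shown below. The table names, column names, and the choice to store amounts as integer paise are assumptions for illustration; the project's actual schema is not given here.

```python
import sqlite3

# Hypothetical schema: users, their accounts, and a transaction log.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id            INTEGER PRIMARY KEY,
    email         TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS accounts (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id),
    balance INTEGER NOT NULL DEFAULT 0 CHECK (balance >= 0)  -- in paise
);
CREATE TABLE IF NOT EXISTS transactions (
    id         INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES accounts(id),
    kind       TEXT NOT NULL CHECK (kind IN ('deposit', 'withdraw')),
    amount     INTEGER NOT NULL CHECK (amount > 0),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def apply_transaction(conn, account_id, kind, amount):
    """Update the balance and log the transaction as one atomic unit."""
    delta = amount if kind == "deposit" else -amount
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (delta, account_id),
        )
        conn.execute(
            "INSERT INTO transactions (account_id, kind, amount) VALUES (?, ?, ?)",
            (account_id, kind, amount),
        )
```

The `CHECK (balance >= 0)` constraint makes an overdraft raise `sqlite3.IntegrityError`, and the surrounding `with conn:` block rolls the whole operation back, so the balance and the transaction log cannot drift apart.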

13. SMTP Protocol (Email Service for OTP)
The Simple Mail Transfer Protocol (SMTP) is used to send a One-Time Password (OTP) to the user’s registered email address during login. Access is granted only after the user enters the correct OTP, which confirms that the user controls the registered email account.
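
The OTP flow can be sketched with the Python standard library alone. Everything named here is an assumption for illustration: the six-digit length, the five-minute validity window, the function names, and the placeholder host `smtp.example.com`. The sketch stores only a hash of the OTP and compares it in constant time.

```python
import hashlib
import hmac
import secrets
import smtplib
import time
from email.message import EmailMessage

OTP_TTL_SECONDS = 300  # assumed validity window of five minutes


def generate_otp(digits=6):
    """Return a zero-padded numeric OTP drawn from a cryptographic RNG."""
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


def make_record(otp, now=None):
    """Keep only a hash of the OTP together with its expiry time."""
    now = time.time() if now is None else now
    digest = hashlib.sha256(otp.encode()).hexdigest()
    return {"digest": digest, "expires_at": now + OTP_TTL_SECONDS}


def verify_otp(record, candidate, now=None):
    """True only if the candidate matches and the record has not expired."""
    now = time.time() if now is None else now
    if now > record["expires_at"]:
        return False
    digest = hashlib.sha256(candidate.encode()).hexdigest()
    return hmac.compare_digest(digest, record["digest"])


def send_otp_email(otp, recipient, sender, password,
                   host="smtp.example.com", port=587):
    """Send the OTP over SMTP with STARTTLS (host and port are placeholders)."""
    msg = EmailMessage()
    msg["Subject"] = "Your login OTP"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(f"Your one-time password is {otp}. It expires in 5 minutes.")
    with smtplib.SMTP(host, port) as server:
        server.starttls()  # encrypt the session before sending credentials
        server.login(sender, password)
        server.send_message(msg)
```

A login handler would call `generate_otp()`, keep the result of `make_record()` for the session, send the code with `send_otp_email()`, and later pass the user's entry to `verify_otp()`.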

Immediate Download:
1. Synopsis
2. Rough Report
3. Software code
4. Technical support

Hardware Kit Delivery:
1. The hardware kit will be delivered within 4-10 working days (depending on state and city)
2. Packing and shipping charges are applicable (depending on kit size, state, and city)