Chat Data Cookbook

Learn how to build your own high-accuracy chatbots and deploy them across multiple platforms.

Chatbot Creation

Comprehensive guides for building basic chatbots using three distinct frameworks.

Chatbot Training Guide

Strategies to enhance your chatbots' accuracy through optimal configuration.

Advanced Features

Enhance your embedded chatbots with advanced features to bolster your website's functionality.

Multi-platform Integrations

Guidelines for integrating chatbots across multiple platforms seamlessly and without any coding requirements.

Welcome to the Chat Data Guide.

Chat Data is a platform that allows you to create AI chatbots using your chosen data or data provided by the service. It offers the following features:

Multi-platform Integration: Seamlessly integrate your chatbots with your website using scripts, iframes, or third-party platforms including Discord, Slack, WhatsApp, WordPress, and Shopify.
Great Flexibility: Chat Data offers extensive flexibility in both backend and frontend configurations. Backend options include uploading your data, utilizing your own API endpoint, or employing our pre-built models. On the frontend, personalize the chatbot’s style, and rebrand it with your company’s name and embedding domain URLs.
Webhooks Available: Chat Data provides webhooks to enable real-time retrieval of chat conversations from your chatbot.
APIs Available: Utilize the APIs offered by Chat Data to facilitate interactions with your chatbots.
High Accuracy: Chat Data supports both structured and unstructured data for training, enhancing the chatbot's ability to focus on your specific data and deliver precise answers.
Language Compatibility: Designed for multilingual use, Chat Data allows you to interact with sources and pose questions in any language.
User Analysis: Gain insights into user interactions through comprehensive analytics, including geographic distribution and activity tracking.
Customer Information Gathering: Capture valuable customer data through interactions with the chatbot, streamlining data collection for marketing and customer relationship management.
Daily Notifications: Receive daily email updates with summaries of all chatbot conversations and collected leads.

This doc serves as a comprehensive usage guide to the Chat Data website. We greatly value your feedback. If you notice any missing or inaccurate information, please don't hesitate to email us at: [email protected].

What is Chat Data?

Watch this 3-minute video to quickly understand what Chat Data is and how it works.

Workflow

The chatbot created by our platform can function as an AI agent on your behalf. However, its primary use case is to serve as a customer support representative on your website. Here is an overview of the user experience you can achieve with our AI chatbot:

[Figure: Chat Data workflow]

Here is how the workflow operates:

The customer interacts with the AI chatbot to receive responses on the website.
The AI chatbot can instantly answer most queries (say, 80%). However, some corner cases (say, 20%) may require human intervention.
In such cases, the customer can click the "live chat escalation" button to request human support.
The human agent receives an email notification with a unique handling URL. This URL is whitelabeled, allowing the agent to handle the query without logging into Chat Data.
If the customer is still on the chat widget, the agent and the customer can communicate in real-time, similar to a chat room.
If the customer has left the chat widget interface, an email with a unique handling URL will be sent to remind the customer of the new message from the human agent. This email can be sent from a whitelabeled email address.
When the customer responds to the chat message, the human agent receives an email notification about the new messages.
All emails and handling URLs can be whitelabeled and do not require a login to manage the messages.

This workflow enables your AI chatbot to answer the majority of queries 24/7. For tasks that the AI chatbot cannot handle, such as queries requiring the most up-to-date information or handling refunds that require special permissions, you can respond directly through your chatbot admin panel or chat with the customer in real time. If the customer has left the chat widget, we will send an email notification if they have provided their contact information in the lead submission form.
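The notification rules in the workflow above can be read as a small routing function. The following is only an illustrative sketch: `next_notification`, `Sender`, and the returned channel names are hypothetical and are not part of the Chat Data API. It encodes who is notified, and through which channel, when a new message arrives in an escalated conversation.

```python
from enum import Enum
from typing import Optional


class Sender(Enum):
    CUSTOMER = "customer"
    AGENT = "agent"


def next_notification(sender: Sender, customer_on_widget: bool,
                      customer_email: Optional[str]) -> str:
    """Return the delivery channel for a new message in an escalated chat.

    Hypothetical helper that mirrors the workflow described above.
    """
    if sender is Sender.CUSTOMER:
        # The human agent is notified by email with a unique handling URL.
        return "email_agent"
    if customer_on_widget:
        # The customer is still on the widget: deliver in real time.
        return "realtime_widget"
    if customer_email:
        # The customer left but gave contact details in the lead form.
        return "email_customer"
    # The customer left without contact details: no reminder can be sent.
    return "none"
```

For instance, `next_notification(Sender.AGENT, False, None)` falls through to `"none"`, which is why collecting contact details in the lead submission form matters for escalated conversations.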

Quick Start

This is the quickest way to create a basic chatbot, without yet configuring it for high accuracy.

Create Your Account

To start building a chatbot, first register for a Chat Data account. Registration requires only a valid email address and a password; alternatively, sign up with your Google account for a password-free experience.

Create Your Chatbot

Log into your Chat Data account, and navigate to the Create Chatbot section found under the Product tab or click the Build Your Chatbot button on the homepage. This will redirect you to the My Chatbots page. Here, start by clicking the New Chatbot button to begin the chatbot creation process.

You have four main backend options for creating your chatbot:

custom-data-upload: Train your chatbot using your own data.
custom-model: Utilize your own LLM backend endpoint to power the chatbot.
medical-chat-human: Employ our human medical model for the chatbot.
medical-chat-vet: Use our veterinary medical model for the chatbot.

For more information on selecting the appropriate backend for your needs, refer to Chatbot Creation.

Configuring the Chatbot

To meet your expectations, a chatbot must be well-configured. The three critical parameters to set for your newly created chatbot are the base prompt, the OpenAI model, and the temperature.

Base Prompt
If you can configure only one aspect of your chatbot, prioritize the base prompt. It sets the overarching tone and communicates your expectations to the chatbot. Without a properly configured base prompt, your chatbot will revert to the default setting, potentially leading to responses that are unexpected or off-target. In the base prompt, include CONTEXT INFORMATION to represent the uploaded data if you are using the custom-data-upload backend, as this is the term we employ in our backend.
OpenAI Model
You can select either the GPT-4o mini or the GPT-4.0 model to process the uploaded data if you use the custom-data-upload, medical-chat-human, or medical-chat-vet backend. GPT-4.0 follows the base prompt more faithfully and hallucinates less, but it is slower and more expensive than GPT-4o mini. If accuracy is critical, especially in reflecting the uploaded data in responses, upgrading to the Standard plan and selecting the GPT-4.0 model is advisable.
Temperature
The temperature setting affects the chatbot’s response style, determining whether it is more deterministic or creative. For high accuracy, particularly when the chatbot must adhere closely to the uploaded data, it is recommended to keep the temperature at 0. This setting prevents the chatbot from deviating too far from the provided information with creative interpolations.
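As a rough illustration of the three settings above, the sketch below collects them in one place and flags the two pitfalls this guide mentions: a base prompt that never refers to CONTEXT INFORMATION, and a temperature above 0. The `ChatbotSettings` class, the `check_settings` function, and the sample prompt are hypothetical and exist only for this example; the actual settings are configured in the Chat Data dashboard.

```python
from dataclasses import dataclass
from typing import List

CONTEXT_MARKER = "CONTEXT INFORMATION"


@dataclass
class ChatbotSettings:
    """Hypothetical container for the three settings discussed above."""
    base_prompt: str
    model: str
    temperature: float = 0.0  # 0 keeps answers close to the uploaded data


def check_settings(settings: ChatbotSettings, backend: str) -> List[str]:
    """Return human-readable warnings; an empty list means no issues found."""
    warnings = []
    if backend == "custom-data-upload" and CONTEXT_MARKER not in settings.base_prompt:
        warnings.append("base prompt does not mention CONTEXT INFORMATION")
    if settings.temperature > 0:
        warnings.append("temperature above 0 may reduce fidelity to uploaded data")
    return warnings


settings = ChatbotSettings(
    base_prompt=(
        "You are a support assistant. Answer only from the CONTEXT INFORMATION. "
        "If the answer is not there, say you do not know."
    ),
    model="GPT-4o mini",
)
print(check_settings(settings, backend="custom-data-upload"))  # prints []
```

A prompt such as "Use the given info." with a temperature of 0.7 would produce both warnings for the custom-data-upload backend.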

Tips for a Good Base Prompt!

A good base prompt should convey the logic clearly and concisely. DO NOT use verbose and complex logic in your base prompts as this can confuse the chatbot, especially when using GPT-4o mini.

When to Use Base Prompt vs. Training Data

When to Use Base Prompt

The base prompt provides context or sets up overarching instructions that apply to the entire conversation or chatbot behavior.

Use Cases

General Instructions: Information that should influence every conversation regardless of user queries.
- Example: Asking about medical history at the end of every conversation. The chatbot should always remember to ask this, irrespective of the user's specific query.
Consistent Logic Across Conversations: When a particular logic or behavior should apply universally in the conversation flow.
- Example: General instructions for how the chatbot should handle sensitive information or maintain a polite tone.

Example

Base Prompt Information: At the end of every conversation, ask the user about their medical history to ensure it's captured for future reference.

When to Use Training Data

Training data contains specific examples and responses that help the chatbot understand how to handle particular queries or topics.

Use Cases

Query-Specific Information: Data that is used to generate responses based on specific user queries.
- Example: Information needed to respond to queries like "What is my medical history?" should be included in the training data so the chatbot can generate accurate responses.
Topics and Responses: When dealing with specific topics or questions that users might ask about, and the responses need to be tailored to those queries.
- Example: If a user asks, "Can I order a blood test?", include details about blood tests in the training data to provide an informed response.

Example

Training Data Information:
- Question: "Can I order a blood test?"
- Response: "Yes, you can order a blood test through our online portal. Here’s how you can do it..."
Training Data for Specific Queries:
- Question: "Can I retrieve my medical history?"
- Response: "To retrieve your medical history, please follow these steps..."
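Q&A pairs like the ones above are easier to maintain when kept in a structured file. The sketch below serializes them to CSV using only the Python standard library. The column names `question` and `answer` and the helper `qa_to_csv` are assumptions made for this example, not a documented upload format; check the Chatbot Training Guide for the formats the platform accepts.

```python
import csv
import io

# Q&A pairs taken from the examples above; answers are truncated as in the text.
qa_pairs = [
    ("Can I order a blood test?",
     "Yes, you can order a blood test through our online portal. Here's how you can do it..."),
    ("Can I retrieve my medical history?",
     "To retrieve your medical history, please follow these steps..."),
]


def qa_to_csv(pairs) -> str:
    """Serialize (question, answer) pairs to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["question", "answer"])  # assumed column names
    writer.writerows(pairs)
    return buffer.getvalue()


csv_text = qa_to_csv(qa_pairs)
```

Keeping one question per row makes it straightforward to add or correct a single answer when a conversation log reveals a gap.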

Testing the Chatbot and Optimizing for Best Performance

If you have uploaded your own data for training with the custom-data-upload backend, there's a possibility that your data might not cover all possible scenarios. Consequently, ongoing enhancements are essential to improve your chatbot's response quality:

Identify Issues: Should you encounter any unsatisfactory responses in the chatbot’s conversation logs, simply replicate the issue by copying and pasting the problematic query into the chatbot on your site.
Debugging Issues: Once you have replicated an inadequate response, you can examine the sources that informed the response by clicking to view them.
Fix Issues: If the current training data inadequately covers the query, or if relevant information has been lost during the segmentation process in the RAG workflow, update the dataset with the correct Q&A to correct the response. For detailed instructions on debugging, please refer to the Debug And Optimization section.

What You Should Know!

Please wait for the training process to finish, which usually takes about 5 minutes, before testing your chatbot. If you test too soon, the chatbot won’t have learned all the uploaded data, and the response might be inadequate.

Important Insight

Structured data is prioritized due to its curated nature. We assign greater weight to it when calculating cosine similarities. Thus, if there is a discrepancy between answers from structured and unstructured data, the chatbot will favor the structured data to generate the response.
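To make the effect of this weighting concrete, here is a minimal sketch of a ranking in which structured sources receive a boost on top of plain cosine similarity. The weight of 1.2, the two-dimensional vectors, and the names `STRUCTURED_WEIGHT` and `rank_sources` are illustrative assumptions; the actual weighting used by the platform is not specified in this guide.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


STRUCTURED_WEIGHT = 1.2  # assumed value, for illustration only


def rank_sources(query_vec, sources):
    """Sort sources by weighted similarity, best first.

    Each source is a dict with 'text', 'vector', and 'structured' keys.
    """
    def score(source):
        base = cosine_similarity(query_vec, source["vector"])
        return base * STRUCTURED_WEIGHT if source["structured"] else base

    return sorted(sources, key=score, reverse=True)


query = [1.0, 0.0]
sources = [
    {"text": "unstructured paragraph", "vector": [1.0, 0.1], "structured": False},
    {"text": "curated Q&A", "vector": [1.0, 0.3], "structured": True},
]
best = rank_sources(query, sources)[0]
```

In this toy example the unstructured paragraph has the higher raw similarity, yet the curated Q&A ranks first once the weight is applied, which is the behavior described above.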

Integrating the Chatbot

Customize your chatbot integration across various platforms based on your specific requirements. We currently support integration with the following platforms, with more to come:

Websites: Embed the chatbot into your website either as a floating widget using our script or directly within a page using an iframe.
Discord: Use the chatbot as the backend for your Discord bot. Follow the Discord Integration Guide for details.
Slack: The chatbot can also function as the backend for your Slack App. Refer to the Slack App Guide for setup instructions.
WhatsApp: Integrate the chatbot as the backend for a WhatsApp application. Please consult the WhatsApp Guide for more information.
Shopify: Embed the chatbot in your Shopify store to respond to Shopify webhooks, synchronizing product updates in real-time and providing current product information to customers. See the Shopify Guide for implementation details.
WooCommerce: Incorporate the chatbot in your WooCommerce site to handle WooCommerce webhooks, ensuring that product updates are synchronized in real time and accurate information is available to your customers. Follow the WooCommerce Guide for setup.
WordPress: Integrate your chatbot seamlessly into your WordPress site without any coding. Refer to the WordPress Guide for how to do this.

Common Reasons for Poor Performance

Typically, chatbots exhibit poor or unexpected performance for the following four reasons. Please check if your chatbot exhibits any of these issues:

Bad Base Prompt: Some poorly performing chatbots do not use CONTEXT INFORMATION in their base prompts, opting instead for phrases like "given info" to reference uploaded data. If CONTEXT INFORMATION is not used in the base prompt, our chatbot will disregard any CONTEXT INFORMATION provided in the backend.
Lack of Structured Data: Ineffective chatbots often rely on large amounts of unstructured data filled with irrelevant and verbose text. This data can provide conflicting information, which confuses the chatbot when it tries to respond to queries. If the responses are unsatisfactory, make sure you upload structured data such as Q&A pairs. Refer to Structured Data vs. Unstructured Data for details.
Inadequate Wait Time Before Testing: Always allow the training process to complete, which typically takes about 5 minutes, before testing your chatbot. Testing too soon can result in responses that do not fully incorporate all uploaded data, as the system has not yet saved the training information to the database.
Hallucinations: Although the GPT-4o mini model is more cost-effective and faster, it is prone to 'hallucinations' or errors that occur as it interprets the base prompt. For instance, it might produce a nearly correct answer but overlook some conditions of the prompt, or it may neglect subtle context details, such as not responding precisely to your Q&As but instead summarizing the main points from them. We suggest upgrading to the GPT-4.0 model, especially if your base prompt involves complex logic or if accuracy is critically important to you.

Limits Of Our Chatbot

Because we use a Retrieval-Augmented Generation (RAG) process to handle your uploaded data, our chatbot has several limitations:

Incomplete view of the whole picture: The uploaded data is segmented into text chunks, and only the most relevant chunks are provided to the chatbot as the context. This approach means the chatbot lacks a comprehensive understanding of the entire document, and will respond based on the context of the most relevant text chunks labeled as CONTEXT INFORMATION.
Inability to count: The LLM (Large Language Model) cannot count the words in your uploaded document because it prioritizes semantic interpretation over quantitative analysis.
Original document not displayable: Since the original document is divided into text chunks, displaying the full original document is not possible. Instead, we can only present the most relevant text chunks that the chatbot uses as CONTEXT INFORMATION, along with the confidence score derived from cosine similarities.
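The limitations above follow from how retrieval works: the document is cut into chunks, and only the best-matching chunks reach the model. The sketch below illustrates the idea with a toy word-count vector in place of a real embedding model. The chunk size, the function names, and the sample document are all assumptions made for this example and do not describe the platform's actual implementation.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, max_words=8):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def bag_of_words(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(document, query, k=1, max_words=8):
    """Return the k chunks most similar to the query as (score, chunk) pairs."""
    query_vec = bag_of_words(query)
    scored = [(cosine(bag_of_words(c), query_vec), c)
              for c in split_into_chunks(document, max_words)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


document = (
    "Refunds are processed within five business days. "
    "Shipping is free for orders above fifty dollars. "
    "Support is available by email around the clock."
)
results = top_chunks(document, "How long do refunds take?", k=1)
```

With `k=1`, only the chunk about refunds is returned, so a model given this context would know nothing about shipping or support. Note also that the fixed-size split cuts across sentence boundaries, which is one way relevant information can be lost during segmentation.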

Getting Help

Should you encounter any issues while setting up your chatbot, do not hesitate to schedule a meeting with the Chat Data team. We are eager to assist you in optimizing your chatbot for maximum accuracy tailored to your needs.

VIP Support

Customers on the Professional plan may request our WhatsApp number to receive priority support directly through WhatsApp. Please email us at: [email protected] to get our WhatsApp phone number.

Enterprise Plan

For inquiries about our Enterprise plan, please contact us at [email protected]. Enterprise solutions include building larger, more powerful chatbots, providing isolated servers and database instances, and customizing your chatbot management website page with your company logo, among other features.