03 Mar How to Create a Conversational Chatbot in Python
A chatbot is an AI-based software application designed to interact with humans in their natural languages. These chatbots usually converse via auditory or textual methods, and they can mimic human language to communicate with people in a human-like manner. In the last few years, chatbots in Python have gained a huge following in both the business and tech sectors, because these bots have become remarkably proficient at mimicking human conversation.
Bots are responsible for a large share of internet traffic; on e-commerce sites this share can be significantly higher, accounting for up to 90% of total traffic. They can communicate with people on websites as well as through social media accounts.
Below, we will demonstrate how you can create your own chatbot.
Import libraries
We will start by importing the NLTK library, along with the other libraries we will use in this tutorial.
import nltk
import io
import random
import string  # to process standard Python strings
import warnings
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from nltk.stem import WordNetLemmatizer

warnings.filterwarnings('ignore')
nltk.download('popular')
Tokenization
Tokenization is the process by which a large quantity of text is divided into smaller parts called tokens. These tokens are very useful for finding patterns and are considered a base step for stemming and lemmatization, techniques used by search engines and chatbots to analyze the meaning behind a word.
Tokenization also helps to substitute sensitive data elements with non-sensitive data elements. We will pass our data through both tokenization and lemmatization and remove punctuation marks from the data.
f = open('data.txt', 'r', errors='ignore')
raw = f.read()
raw = raw.lower()
sent_tokens = nltk.sent_tokenize(raw)  # converts the corpus to a list of sentences
word_tokens = nltk.word_tokenize(raw)  # converts the corpus to a list of words

lemmer = nltk.stem.WordNetLemmatizer()
# WordNet is a semantically-oriented dictionary of English included in NLTK.

def LemTokens(tokens):
    return [lemmer.lemmatize(token) for token in tokens]

remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)

def LemNormalize(text):
    return LemTokens(nltk.word_tokenize(text.lower().translate(remove_punct_dict)))
Keywords Matching
We create two lists, which we refer to as greeting inputs and greeting responses. If any keyword in the user's message matches one of the greeting inputs, the bot picks one of the greeting responses at random.
# Keyword matching
GREETING_INPUTS = ("hello", "hi", "greetings", "sup", "what's up", "hey")
GREETING_RESPONSES = ["hi", "hey", "*nods*", "hi there", "hello", "I am glad! You are talking to me"]

def greeting(sentence):
    for word in sentence.split():
        if word.lower() in GREETING_INPUTS:
            return random.choice(GREETING_RESPONSES)
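Calling the function looks like this (the definitions are repeated so the snippet is self-contained):

```python
import random

GREETING_INPUTS = ("hello", "hi", "greetings", "sup", "what's up", "hey")
GREETING_RESPONSES = ["hi", "hey", "*nods*", "hi there", "hello",
                      "I am glad! You are talking to me"]

def greeting(sentence):
    # return a random canned response if any word in the sentence is a greeting
    for word in sentence.split():
        if word.lower() in GREETING_INPUTS:
            return random.choice(GREETING_RESPONSES)

print(greeting("Hey there!"))      # a random pick from GREETING_RESPONSES
print(greeting("Tell me a joke"))  # None - no greeting keyword matched
```

When no keyword matches, the function falls through and returns None, which the main loop uses to decide whether to call the full response function instead.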
Chatbot Implementation
Here we write a function that takes the user's input, passes it together with our corpus through the TfidfVectorizer, and uses cosine similarity to find the sentence in our data that best matches the question.
def response(user_response):
    robo_response = ''
    sent_tokens.append(user_response)
    TfidfVec = TfidfVectorizer(tokenizer=LemNormalize, stop_words='english')
    tfidf = TfidfVec.fit_transform(sent_tokens)
    vals = cosine_similarity(tfidf[-1], tfidf)
    idx = vals.argsort()[0][-2]
    flat = vals.flatten()
    flat.sort()
    req_tfidf = flat[-2]
    if req_tfidf == 0:
        robo_response = robo_response + "Oops, I can't understand you!"
        return robo_response
    else:
        robo_response = robo_response + sent_tokens[idx]
        return robo_response
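The key trick in response() is that the user's question is appended to the corpus before vectorizing, so the highest similarity score is always the question matched against itself; that is why the code takes the second-highest score. A toy, self-contained version (the mini corpus is invented for illustration):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# toy corpus, with the user's question appended at the end as in response()
sent_tokens = [
    "a chatbot is a software application",
    "python is a programming language",
    "chatbots talk to humans",
]
sent_tokens.append("what is a chatbot")

tfidf = TfidfVectorizer().fit_transform(sent_tokens)
vals = cosine_similarity(tfidf[-1], tfidf)  # question vs. every sentence

# vals[0][-1] compares the question with itself (score 1.0),
# so the best real answer is the index with the second-highest score
idx = vals.argsort()[0][-2]
print(sent_tokens[idx])  # "a chatbot is a software application"
```

If the second-highest score is zero, none of the corpus sentences share any vocabulary with the question, which is exactly the case where response() apologizes instead of answering.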
Testing
This is where we call our functions and read the user's input as the code runs.
flag = True
print("John: Hi!! My name is John. I'll answer all your queries about Chatbots. If you want to exit, just type - bye!")
while flag == True:
    user_response = input()
    user_response = user_response.lower()
    if user_response == 'bye':  # exit keyword promised in the welcome message
        flag = False
        print("John: Bye! Take care.")
    elif greeting(user_response) != None:
        print("John: " + greeting(user_response))
    else:
        print("John: ", end="")
        print(response(user_response))
        sent_tokens.remove(user_response)
Output
After running the code, we had a short conversation with the bot. It was able to answer most of the questions we asked, which shows that it performed well, though there is still room for improvement.
Once you have a good understanding of the structure of the chatbot built using Python, you can then play around with it using various techniques or commands that will make the bot more efficient.