The Future of Interaction: Design Your AI Chatbot with LangChain

Omkar Gawade
7 min readApr 15, 2024
Image from Unspash.com by Volodymyr Hryshchenko.

As a first generation immigrant myself who came here on F-1 visa for my masters, I can empathize the pain international students go through while applying for jobs — couple of common questions all of us have are “ Will they sponsor h1b visa ?” “ what would be approximate salary for the role”?

Drawing from my own journey of job seeking, coupled with a deep-rooted passion for data science, I was inspired to initiate this project: the development of a LangChain chatbot designed to interact seamlessly with H1B and LCA datasets from USCIS.

Here is short demo of how the h1b bot works:

Youtube video by author

Now that you’ve seen what the H1B bot does, lets go about how you can build your own AI bot with any data you desire in python:

Database: The database used in this project is sqlite database.

  1. H1B data : Downloaded from USCIS website for h1b petitions. The descriptions of columns used in the data are here in the link.
  2. LCA data: All LCA data for this project was downloaded from USCIS PERFORMANCE DATA — (fyi LCA data is not consistent with column naming over years.)

Both these databases were written into a sqlite database as two separate tables. Below is the code to convert csv files into sqlite .db file.

Image by Author

This how two tables we uploaded look inside sqlite db:

Image by Author

Note: You can query other databases such Postgres SQL, Mysql, Sql-server, Snowflake.

LangChain agent: Let’s create a custom LangChain agent with a custom prompt!

Libraries to install & import for LangChain:

Image by Author
Image by Author

After importing all these libraries we are going to create a custom prompt and store it in a local variable.

Below is an prompt example from LangChain documentation:

Image by Author

We will use LangChain’s ChatPromptTemplate to create a dynamic dialogue framework that includes the system's messages, optional chat history for context, the user's input, and an internal scratchpad for the agent's use.

Image by Author

Define the LLM instance:

For my use case I have used Open-AI’s Chat GPT 3.5 ; but LangChain does support multitude of language models, you can find documentation here.

Image by Author

Create custom LangChain agent:

agent_type =”openai-tools” indicates that we are using custom prompt to create llm agent, here is the documentation.

Image by Author

This below is how LangChain architecture works:

Question from user in natural language → llm to create sql query in db dialect → run that query generated on sqlite db → get the response and pass it to user as Answer.

Now that we have created our very own custom LangChain agent — let’s invoke the instance to get a response!!!

Image by Author

And just like that in a couple of lines of code you have your own chat bot!

To make this project end-to-end solution, it was essential to make the Python script accessible through an API endpoint. For this purpose, the Flask framework was utilized to encapsulate the script within a Flask API.

Here is tutorial on youtube about how to create a simple api and host it on python anywhere

The frontend for this project was hosted on google app script!

Frontend Architecture:

Image by Author

Below are couple of examples of my h1b chat bot responses and sample questions in website I built on Google app script:

Example 1: What is the average decision approval time for Amazon? provide me the underlying sql query

Image by Author

Example 2: Provide me the best job title to have to improve my chances of getting h1b?

Image by Author

Example 3: Provide me the top 3 companies based on their highest approval rate and top 3 companies based on their lowest approval rate? provide me the underlying sql query (Response too long to fit in a screenshot)

The top 3 companies based on their highest approval rate are:
1. QUALCOMM TECHNOLOGIES INC with an approval rate of 100% (429 approvals, 0 denials)
2. ORACLE USA INC with an approval rate of 100% (420 approvals, 0 denials)
3. UNIVERSITY OF CA SAN FRANCISCO with an approval rate of 100% (345 approvals, 0 denials)


The top 3 companies based on their lowest approval rate are:
1. COVEY INFOTECH LLC with an approval rate of 0% (0 approvals, 39 denials)
2. SRIT INC with an approval rate of 0% (0 approvals, 38 denials)
3. BPM LABS INC with an approval rate of 0% (0 approvals, 31 denials)

The underlying SQL query used to find the top 3 companies with the highest approval rate is:
```sql
SELECT
Employer,
SUM(Initial_Approvals) + SUM(Continuing_Approvals) AS Total_Approvals,
SUM(Initial_Denials) + SUM(Continuing_Denials) AS Total_Denials,
(SUM(Initial_Approvals) + SUM(Continuing_Approvals)) * 1.0 / (SUM(Initial_Approvals) + SUM(Continuing_Approvals) + SUM(Initial_Denials) + SUM(Continuing_Denials)) AS Approval_Rate
FROM h1b_data GROUP BY Employer HAVING Total_Approvals + Total_Denials > 0 ORDER BY Approval_Rate DESC, Total_Approvals DESC LIMIT 3
```
And the query used to find the top 3 companies with the lowest approval rate is:
```sql
SELECT Employer,
SUM(Initial_Approvals) + SUM(Continuing_Approvals) AS Total_Approvals,
SUM(Initial_Denials) + SUM(Continuing_Denials) AS Total_Denials,
(SUM(Initial_Approvals) + SUM(Continuing_Approvals)) * 1.0 / (SUM(Initial_Approvals) + SUM(Continuing_Approvals) + SUM(Initial_Denials) + SUM(Continuing_Denials)) AS Approval_Rate
FROM h1b_data
GROUP BY Employer
HAVING Total_Approvals + Total_Denials > 0
ORDER BY Approval_Rate ASC, Total_Denials
DESC LIMIT 3
```

Example 4: lets be creative! How can one lead a happy life? (Response too long to fit in a screenshot)

Leading a happy life is a quest that many embark on, and while I'm just a virtual assistant without personal experiences, I can offer some universally acknowledged tips:

1. **Find Purpose**: Engage in activities that give you a sense of meaning and purpose.
2. **Build Relationships**: Cultivate strong connections with family and friends.
3. **Stay Active**: Regular physical activity can boost your mood and reduce stress.
4. **Practice Gratitude**: Focus on what you have, rather than what you lack.
5. **Learn and Grow**: Keep your mind active with continuous learning and new experiences.
6. **Help Others**: Altruism and volunteer work can increase your own happiness.
7. **Manage Stress**: Develop healthy coping mechanisms for life's inevitable stresses.
8. **Live in the Moment**: Practice mindfulness and savor the present.
9. **Balance**: Strive for a good balance between work, rest, and play.
10. **Laugh Often**: Don't take life too seriously; find humor in everyday situations.

Remember, happiness is not a destination, but a way of traveling through life.
Now, if you have any questions related to H-1B or LCA data, I'm your go-to assistant for that!

Idea Validation:

April 11, 2024 — Snowflake announces Snowflake co-pilot, a similar feature for natural language to SQL script which will be coming out in two weeks for below regions:

  • AWS us-east-1
  • AWS us-west-2

which will available for no charge until may 31st 2024 — Snowflake will probably release a pricing model soon.

“Craft it yourself — free of charge, or pay the price!”

This proves validation that this idea has real world application!

That’s all, folks 🥕

Happy coding!

Citations:

  1. “LangChain Documentation: Your Comprehensive Guide”: Perfect for introducing readers to the foundational resource for LangChain, emphasizing its role as a comprehensive guide.
  2. “Synthesizing Data with LangChain: A Step-by-Step Tutorial”: Highlights the practical tutorial aspect of using LangChain for synthetic data generation and querying.
  3. “LangChain GitHub Repository: Dive into the Code”: Encourages readers interested in the technical details or contributing to the project to explore the LangChain GitHub repository.
  4. “”Exploring LangChain’s Capabilities: A Deep Dive into Our Latest Blog Post””: Offers readers insights into common issues users are facing with LangChain, suggesting the post provides valuable techniques and examples.
  5. “Launch Your First Python Web App with Flask: A Beginner’s Guide”:This comprehensive tutorial on Medium by Bhavani Ravi serves as an ideal starting point for beginners aiming to develop their first web application using Flask, a popular Python web framework.
  6. “Hosting Your Flask App for Free on PythonAnywhere: A Step-by-Step Guide” This insightful article on Medium’s The Startup explains how to host a Flask application on PythonAnywhere at no cost. Ideal for developers looking to deploy their Flask projects online without incurring hosting fees, the guide walks readers through the entire process, from setting up a PythonAnywhere account to configuring and deploying a Flask application.

--

--

Omkar Gawade

Sr. Data Scientist @ Sourceday | Data professional with 5 plus years experience in analytics, visualizations and modeling.