Data Science Interview Question Series: Let's Discuss ChatGPT and OpenAI Intelligently!

Gradient Valley
Keith Bourne

May 1, 2023

Welcome to a new series on our blog where we explore various data science interview questions, delving into not just the questions and answers themselves, but all related topics you would want to be familiar with if asked this question in a real data science interview. So, without further ado...

Imagine you're in an interview for your dream data science position, and the interviewer asks you:

OpenAI has received a lot of attention recently with the launch of ChatGPT. Explain the technology behind ChatGPT and its relevance to the field of Natural Language Processing (NLP).

This was an actual question from a recent interview that was shared with me. As I understand it, some probability questions dealing with cards and/or dice were mixed in as well (the "classics!"). If I were a candidate, a question like this one would impress me, as it shows the interviewers were putting extra thought into the process and keeping things current.

So, could you discuss this intelligently in an interview setting? Gradient Valley has added this question to the long list of interview questions in our "DS Interviews" app, with full coverage of all the topics you would need to know to talk in-depth about this subject. You can try it out yourself; it's even one of the free interview questions provided in the oversized Unit 1 of the app. Here are the main topics we felt you should be familiar with to have an in-depth discussion about this subject with a potential employer:

  • A brief background of OpenAI as a company and why they are suddenly so famous.
  • What does the acronym GPT stand for, and what does Generative Pre-trained Transformer really mean (i.e., break down each word and discuss what each one represents)?
  • What does LLM stand for, and what is the meaning behind those words as well?
  • What is a transformer-based model? How do they compare to other models?
  • What are the nuanced differences between the three big terms you hear thrown around in the press, and when should you use each term: ChatGPT, LLMs, and Transformers?
  • Alternatives to OpenAI that are also transformer-based and, oddly enough, references to Sesame Street characters.
  • What is meant by a self-attention mechanism? Bonus if you know who Ashish Vaswani is; more about him below!
  • Details about how the self-attention mechanism works, including how it assigns scores/weights to words, what those weights mean, and how exactly they are used in NLP tasks.
  • What are input embeddings in a natural language model, and how does the self-attention mechanism use them?
  • What is a softmax function, and how is it used to normalize the self-attention scores?
  • Why does the transformer architecture use multiple self-attention "heads" in parallel, and why does that play such a critical role in ChatGPT's success as a generative model?

We are also tempted to dive deeper and cover some more basic, but fundamental, concepts that build up to the final data science interview question. For example, the softmax function converts a vector of real numbers into a probability distribution, and the self-attention mechanism uses input embeddings that are multiplied by learnable weight matrices to calculate the self-attention scores. These are two basic but fundamental concepts that help build up to the higher-level concepts of transformers and self-attention mechanisms. So, yes, there are a lot of linear algebra concepts forming key aspects of the foundation of these algorithms that we could cover as well! It is tempting to include them because it's interesting to see how you can tie some of the most fundamental elements of a branch of mathematics that has been around since the Babylonians in 300 BCE to one of the most advanced technologies of our modern era.
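To make the softmax and self-attention ideas above a little more concrete, here is a minimal sketch in plain Python. It rests on several assumptions of ours rather than anything specific to ChatGPT: a single attention head, toy 2-dimensional embeddings, identity matrices standing in for the learned weight matrices, and illustrative function names (`softmax`, `matmul`, `self_attention`). The scaling by the square root of the key dimension follows the scaled dot-product attention described in the original transformer paper.

```python
import math


def softmax(scores):
    """Convert a list of real numbers into a probability distribution."""
    shift = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(embeddings, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (toy version)."""
    # Project the input embeddings with the (normally learned) weight matrices.
    queries = matmul(embeddings, w_q)
    keys = matmul(embeddings, w_k)
    values = matmul(embeddings, w_v)
    scale = math.sqrt(len(keys[0]))

    outputs, all_weights = [], []
    for q in queries:
        # Score this token against every token, then normalize with softmax.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        all_weights.append(weights)
        # Each output is a weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs, all_weights


# Hypothetical inputs: three "tokens" with made-up 2-dimensional embeddings.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned weights

outputs, weights = self_attention(embeddings, identity, identity, identity)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row holds the attention weights one token assigns to every token in the sequence, and each row sums to 1 because of the softmax. In a real model the weight matrices are learned during training, and several such heads run in parallel.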

But that is what we decided to cover. Did we get it right? Did we miss something? Please, let us know your thoughts!

In the meantime, did you know that it was only in 2017 that Ashish Vaswani and his co-authors published a paper called “Attention Is All You Need”? It mapped out the transformer-based architecture and its revolutionary self-attention mechanism, which has been the foundation for the significant advances we have seen lately in this field. That is not that long ago! The field is moving very fast!

What's New with Gradient Valley?

We just updated our app, ‘DS Interviews’, with what we feel is our most exciting feature yet: a full integration of our AI Tutor with all of the sections. You have to try it out! It is extremely engaging, and ultimately we believe it will help you learn better and faster than more traditional flashcard-like approaches.

Try out the new ‘DS Interviews’ app and let us know if you have any other questions to add! We are adding more all the time!

DS Interviews - AppStore (iOS) - Focused on data science interview questions
DS Interviews - Google Play (Android) - Focused on data science interview questions

Gradient Valley is on a mission to expand access to education through the application of Artificial Intelligence (AI) and mobile/web apps. We believe the right combination of AI and mobile will allow us to bring real improvements to the lives of mobile learners everywhere. Gradient Valley is developing a suite of education-based apps utilizing AI to its fullest extent, starting with the 'DS Interviews' mobile apps, and aiming to provide an unmatched comprehensive and always-on tutoring and learning experience.

