What’s next after ChatGPT?

A Large Language Model as a Van Gogh artwork (DALL-E 2)

The hype around Generative AI is still going strong. Everyone is looking for applications of GenAI and investing in AI-based interventions to gain a competitive advantage for their businesses. Here are a few questions I've come across recently, along with my views on LLMs and their future.

Can we achieve everything with LLMs now?

The straightforward answer is NO! LLMs can't handle every task; they are best suited as tools for natural language processing tasks, particularly information retrieval and conversational applications. Despite their strengths, plenty of simpler approaches still find practical use in real-world scenarios. In simpler terms, LLMs have their unique use cases, but they can't do everything like a wizard!

Are LLMs taking over the tech world?

Is this the end of traditional ML? Not at all. As mentioned earlier, LLMs don't cover all machine learning tasks. Most data analytics and machine learning use cases involve numerical data, often organized in relational structures, where traditional machine learning algorithms excel. Traditional machine learning techniques are expected to remain relevant for the foreseeable future.

Artificial General Intelligence (AGI)?

Have we reached it? Artificial General Intelligence (AGI) envisions AI systems with human-like abilities across a wide range of tasks. However, we are not there yet. While some level of AGI may eventually be achievable, current LLMs, including applications like ChatGPT, should not be confused with AGI. LLMs are proficient at predicting text using transformer architectures but struggle with complex analytical tasks where human expertise is crucial.

Are enterprises ready for the AI hype?

Having worked with numerous enterprises, I’ve observed a willingness to invest in AI projects for streamlining business processes. However, many struggle to identify suitable use cases with a considerable Return on Investment (RoI). Some organizations, even if prepared for advanced analytics, face extensive groundwork in their IT and data infrastructure. Despite these challenges, the AI hype has prompted businesses to recognize the potential of leveraging organizational data resources effectively. In the coming months, we anticipate a significant boost not only in LLM-based applications but also in traditional machine learning and deep learning applications across industries.

Ethical AI? What’s happening there?

With the public's increasing adoption of ChatGPT and large language models, conversations about responsible AI use have gained traction. The European Union has passed pioneering AI legislation, while Australia is actively working on regulating AI systems, introducing an AI ethics framework and establishing a National AI Centre to promote responsible AI practices and innovation.

Leading companies like Microsoft are contributing to responsible AI by introducing guidelines and toolboxes for transparent machine learning application development. Governments and corporations are moving towards regulating and controlling AI applications, a positive development in ensuring responsible AI use.

There’s no turning back now. We must all adapt to the next wave of AI and prepare to harness its full potential.

Empowering Conversations with ChatGPT: The Art of Prompt Engineering

The release of ChatGPT in November 2022 gave a mega boost to the world of AI. With that, everyone started playing around with Large Language Models (LLMs) and exploring their possibilities.

While opening up interesting possibilities, it also brought new approaches and methodologies for interacting with LLMs into play. Prompt engineering is one of the major areas that even the general public is interested in right now.

To familiarize myself with the topic and not feel left behind, I followed several online resources, including the Prompt Engineering for Developers course from DeepLearning.AI. Here are some interesting points that caught my attention. This brief should be helpful for anyone using ChatGPT or any LLM, not only in development scenarios but also for general daily chaos.

What’s actually Prompt Engineering?

Without any hesitation, I asked ChatGPT this question 😀

Prompt engineering refers to the process of designing and refining prompts to elicit desired responses from a language model like ChatGPT. It involves crafting the initial instruction or query in a way that effectively communicates the desired task or information to the model. Prompt engineering can significantly influence the quality and relevance of the model’s responses.

ChatGPT

In simple terms,

Prompt engineering is the methodology of giving clear instructions to LLMs. It helps the language model understand the instructions clearly and provide a more precise output.

LLMs in the GPT family (GPT-3, GPT-3.5, GPT-4, etc.) are trained to predict the next word in a given sequence. Given that, the instructions we provide to the model should be specific and understandable. There are no hard-and-fast rules for writing prompts, but the more precise the prompt, the better the result.

There are two key principles we should keep in mind when prompting.

  1. Write clear and specific instructions.
  2. Give the model time to “think”.

Be clear! Be Precise!

There are many tactics we can follow to make our instructions clear and easy for an LLM to understand.

Use delimiters to indicate distinct parts of the input

Let's take the example of using a GPT model to summarize a particular text. It's always better to clearly indicate which part of the prompt is the instruction and which is the actual text to be summarized. You can use any delimiter you feel comfortable with; here I'm using double quotes to mark the text to be summarized, as in the sketch below.
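
Since the original screenshot isn't reproduced here, below is a minimal sketch of what such a prompt could look like in Python, using the openai package (pre-1.0 interface) and the gpt-3.5-turbo chat model; the sample text and model choice are my own assumptions for illustration.

import openai  # pip install openai (pre-1.0 interface); reads OPENAI_API_KEY from the environment

text = ("Sunny Sands Hotel sits right on the beach in Galle. "
        "The rooms are clean, the staff are friendly and the food is great.")

# The double quotes act as a delimiter, separating the instruction
# from the text that should actually be summarized.
prompt = f'''
Summarize the text delimited by double quotes into a single sentence.
"{text}"
'''

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # keep the output deterministic
)
print(response.choices[0].message["content"])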

Ask for structured outputs

When it comes to LLM-aided application development, we use the OpenAI APIs to perform various natural language processing tasks. For example, we can use the GPT models to extract key entities and the sentiment from a set of product reviews. When using the model's output in a software system, it's much easier to consume it in a structured format like a JSON object or HTML.

Here's an example of a prompt that takes hotel feedback as input and returns a structured JSON output (sketched below). This comes in handy in many analytics scenarios and when integrating the OpenAI APIs into production environments.
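
The original prompt was shown as an image; here's a hedged sketch of what it could look like, with field names and a sample review that are illustrative assumptions of mine.

import openai  # assumes OPENAI_API_KEY is set in the environment

feedback = ("The rooms at Sunny Sands Hotel were spotless and the staff were lovely, "
            "but breakfast felt a little overpriced.")

# Asking explicitly for a JSON object with named keys makes the response
# easy to parse in downstream application code.
prompt = f'''
From the hotel feedback delimited by double quotes, identify:
- the overall sentiment (positive, negative or mixed)
- a list of the aspects mentioned (e.g. rooms, staff, food)

Return the result as a JSON object with the keys "sentiment" and "aspects".
"{feedback}"
'''

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
print(response.choices[0].message["content"])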

It’s always good to go step by step

Even for humans, it's easier to complete a particular task when the instructions are given in steps. It works well with LLMs too. Here's an example of performing a summarization and two translations of a customer review through a single prompt (sketched below). Observe the structure of the prompt and how it guides the LLM towards the desired output. Make sure you construct your prompt as procedural instructions.
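
The original prompt appeared as a screenshot, so here's a hedged reconstruction of a procedural prompt consistent with the JSON output shown below; the exact wording is my own assumption.

import openai  # assumes OPENAI_API_KEY is set in the environment

review = ("I recently stayed at Sunny Sands Hotel in Galle, Sri Lanka, and it was amazing! "
          "The hotel is right by the beautiful beach, and the rooms were clean and comfy. "
          "The staff was friendly, and the food was delicious. I also loved the pool and the "
          "convenient location near tourist attractions. Highly recommend it for a memorable stay in Sri Lanka!")

# Numbered, procedural instructions walk the model through each step
# and pin down the exact format of the final output.
prompt = f'''
Perform the following actions on the customer review delimited by double quotes:
1 - Summarize the review in one sentence.
2 - Translate the summary into French.
3 - Translate the summary into Spanish.
4 - Output a JSON object with the keys: review, English_summary, French_summary, Spanish_summary.

"{review}"
'''

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
print(response.choices[0].message["content"])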

Here’s the output in JSON.

{
  "review": "I recently stayed at Sunny Sands Hotel in Galle, Sri Lanka, and it was amazing! The hotel is right by the beautiful beach, and the rooms were clean and comfy. The staff was friendly, and the food was delicious. I also loved the pool and the convenient location near tourist attractions. Highly recommend it for a memorable stay in Sri Lanka!",
  "English_summary": "A delightful stay at Sunny Sands Hotel in Galle, Sri Lanka, with its beautiful beachfront location, clean and comfortable rooms, friendly staff, delicious food, lovely pool, and convenient proximity to tourist attractions.",
  "French_summary": "Un séjour enchanté à l'hôtel Sunny Sands à Galle, Sri Lanka, avec son emplacement magnifique en bord de plage, ses chambres propres et confortables, son personnel amical, sa délicieuse cuisine, sa belle piscine et sa proximité pratique des attractions touristiques.",
  "Spanish_summary": "Una estancia encantadora en el hotel Sunny Sands en Galle, Sri Lanka, con su hermosa ubicación frente a la playa, habitaciones limpias y cómodas, personal amable, comida deliciosa, hermosa piscina y ubicación conveniente cerca de atracciones turísticas."
}

Hallucination is one of the major limitations of LLMs. It occurs when the model produces coherent-sounding, verbose but inaccurate information due to a lack of understanding of cause and effect. Following proper prompting methodologies and tactics can prevent hallucinations to some extent, but not 100%.

It's always the developer's responsibility to build AI systems that follow responsible AI principles and to stay accountable for the process.

Feel free to share your experiences with prompt engineering and how you are using LLMs in your development scenarios.

Happy coding!

AI Summer is Here!

Undoubtedly, we are going through another AI summer, with rapid growth in Artificial Intelligence based interventions. Everyone is keen on using applications like ChatGPT and wants to know what the next big thing is. In my last post, I discussed the underlying technicalities of large language models (LLMs), the foundation of ChatGPT.

In this post, I'm sharing my opinions on some of the questions raised recently by my colleagues and peers. These are my own views, and there can certainly be different interpretations.

What’s happening with AI right now?

Yes. AI is having a good time! 😀 Recently, almost all the mainstream media started talking about AI and its good and bad sides. Even people who are not very interested in technology started researching AI. In my view, this happened because major tech companies and research institutes released AI-based products that the general public can use, not only for business operations but also for entertainment. For example, AI-based image generation tools (mostly built on neural models such as DALL-E) became popular on social media, and neural-network-based image filters and chat applications took over social media.

All this fun stuff didn't appear overnight. These products have been under research and development for years. Even more interesting things have been happening in the research world; the problem was making those capabilities easily accessible to the general public.

Right now, we can see a clear motivation from enterprises to make AI more accessible and user-friendly. The barrier of needing an extensive understanding of mathematics and related theory to use AI and machine learning is shrinking day by day.

Yes. I see that as a good trend.

ChatGPT is the only big thing that happened, right?

No, it's not! ChatGPT is an application built upon the capabilities of GPT-3, which is an LLM. OpenAI has announced that GPT-4 is coming soon, and it is going to be much more powerful at understanding human language and images.

With the large-scale compute capabilities opened up by technological advancements like the cloud, research is moving towards training massive neural networks to understand non-linear data such as natural language, images and video. Recent architectures like the Transformer (introduced in 2017) are driving big changes in computer vision too.

Applications such as GitHub Copilot, an AI pair-programming tool, give software developers the ability to code faster and produce more reliable outcomes efficiently.

Microsoft has announced that Copilot is going to be part of its Office 365 suite, which is widely used across industries. It's going to be a big change in the way people work with computers today.

I'm a tech person. How is AI going to affect my career?

I work in the AI domain. The advancement of AI directly affects my career, and I always keep myself busy reading and catching up on new trends as much as I can (which feels impossible at this rate!).

AI developers and machine learning engineers are not the only groups directly affected by these advancements. Remember how cloud technologies changed the way enterprises operate, eliminating much of the excessive usage of cumbersome networking devices and in-house servers. I feel something similar is going to happen with AI. IT folks may have to adapt to using AI as a Service in the application development life cycle. Don't be a dinosaur. Let's get updated!

I'm a non-tech person. How is AI going to affect me?

It's not only for tech-savvy folks. Since the trend is all about making AI more accessible, AI-based interventions are going to appear in almost every industry. From retail to knowledge-focused industries such as legal and education, ways of working are going to be affected.

If you are a non-tech person, you should definitely research ways of using AI and machine learning in your career.

Can AI displace people from jobs?

Yes and no! According to a McKinsey study, artificial intelligence could displace roughly 15% of workers, or 400 million people, worldwide between 2016 and 2030! Most of the monotonous jobs will be taken over by robots and AI applications. For example, the manual labour needed for heavy lifting in factories will be replaced by computer-vision-assisted robots, and some monotonous tasks in domains like accounting will be handled by AI-powered automation.

In the meantime, new jobs will be created to support AI-powered operations. The term "data scientist" was not widely used even a decade ago; now it has become one of the most in-demand professions. That's just one example.

Isn't excessive use of AI harmful?

Excessive use of anything is harmful! It's the same with AI. There are many ongoing discussions about regulating human-competitive machine intelligence. Enterprises are working on concepts such as responsible AI to make sure AI systems do not harm mankind or nature.

To be frank, one can argue that AI-based systems reduce humans' critical thinking ability and creative power. That can be true in some contexts. We should find ways to keep AI as just a tool, not something we rely on entirely.

Recently, an open letter titled “Pause Giant AI Experiments” was published, stressing the need for proper rules and legislation around AI systems. It was signed by prominent scientists, including Yoshua Bengio, and well-known tech figures such as Elon Musk.

Personally, I strongly believe there should be a framework for regulating the use of AI interventions, and the know-how around such applications should be made widely available to the general public.

That's not something that can be done overnight. Global organisations and governments should start working together as soon as possible to make it happen.

What would the future look like with AI?

It's something we can't really predict. There will be more AI applications with human-like abilities, and in some domains even beyond that. The way people work will change drastically with the wide accessibility of AI. Just as we use electricity today to make our lives easier, well-managed AI systems will assist us with our daily chaos. We definitely have to keep our eyes open!

Unlocking the Power of Language with GPT-3

Yes, this is all about the hype around ChatGPT. It's obvious that most of us are obsessed with it and spend a lot of time with this amazing tool, even using it as a regular search engine! (Is that bye-bye, Google? 😀)

I thought of discussing the underlying mechanics of ChatGPT, Large Language Models (LLMs), and the applicability of these giants in intelligent application development.

What actually is ChatGPT?

ChatGPT is a conversational AI model developed by OpenAI. It uses the GPT-3 architecture, which is based on Transformer neural networks. GPT-3 is a large language model with about 175 billion parameters. The model has been trained on a huge corpus of text data (about 45 TB) to generate human-like responses to text inputs. Most of the training data was harvested from the public internet. ChatGPT can perform a variety of language tasks such as answering questions, generating text, translating languages, and more.

ChatGPT is only a single use case of a massive body of research. The underlying power is the neural network architecture GPT-3. Let's dig in step by step through the following points.

What are Large Language Models (LLMs)?

LLMs are deep learning models that can recognize, summarize, translate, predict and generate text and other content based on knowledge gained from massive datasets. As the name suggests, these language models are trained on massive amounts of textual data using unsupervised learning (yes, there's no data labelling involved). BLOOM from Hugging Face, ESMFold from Meta AI, Gato from DeepMind, BERT from Google, MT-NLG from Nvidia & Microsoft and GPT-3 from OpenAI are some of the LLMs in the AI space.

Large language models are among the most successful applications of transformer models. They aren’t just for teaching machines human languages, but for understanding proteins, writing software code and much more.

What are Transformers?

Figure: The encoder-decoder structure of the Transformer architecture (taken from “Attention Is All You Need”).

Transformers? Are we going to talk about bumblebee here? Actually not!

Transformers are a type of neural network architecture (similar as Convolutional Neural Networks, Recurrent Neural Networks etc.) designed for processing sequential data such as text, speech, or time-series data. They were introduced in the 2017 research paper “Attention is All You Need“. Transformers use self-attention mechanisms to process the input sequence and compute a weighted sum of the features at each position, allowing the model to efficiently process sequences of varying length and capture long-range dependencies. They have been successful in many natural language processing tasks such as machine translation and have become a popular choice in recent years.

For a deep learning enthusiast, this may sound similar to the RNN architectures mostly used for sequential learning tasks. Unlike RNNs, transformers are capable of capturing long-term dependencies, which makes them so capable at complex natural language processing tasks.
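
To make the "weighted sum" idea concrete, here's a minimal NumPy sketch of scaled dot-product self-attention, the core operation of the Transformer. It's a bare-bones illustration of the formula from the paper, not production code.

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # how relevant each position is to every other position
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the sequence
    return weights @ V                               # weighted sum of the value vectors

# A toy "sentence" of 4 tokens, each embedded in 8 dimensions.
# In self-attention, queries, keys and values all come from the same sequence.
x = np.random.rand(4, 8)
print(scaled_dot_product_attention(x, x, x).shape)   # (4, 8): one context-aware vector per token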

GPT stands for “Generative Pre-trained Transformer”. As the name implies, it's built with the blessing of transformers.

Alright… now GPT-3 is the hero here! So, what’s cool about GPT-3?

  • GPT-3 is one successful innovation among LLMs (it's not the only LLM in the world).
  • GPT-3 model itself has no knowledge; its strength lies in its ability to predict the subsequent word(s) in a sequence. It is not intended to store or recall factual information.
  • It's a pre-trained machine learning model. You cannot download or retrain the model yourself since it's massive (though fine-tuning with your own data is possible).
  • GPT-3 is available only through a closed API, which requires an API key to access.
  • GPT-3 is mostly good for English-language tasks.
  • A bit of a downside: the outputs can be biased and abusive, since it learns from data fetched from the public internet.

If you are really interested in learning the science behind GPT-3, I would recommend taking a look at the paper “Language Models are Few-Shot Learners”.

What’s OpenAI?

OpenAI, a research organisation founded in 2015, is the creator of the GPT-3 architecture. GPT-3 is not the only interesting innovation from OpenAI. If you have seen AI-generated art created from natural language phrases as input, it most probably came from the DALL-E 2 neural network, which is also from OpenAI.

OpenAI has its own set of APIs, which developers can easily adopt in their intelligent application development tasks.

Check the OpenAI APIs here: https://beta.openai.com/overview    
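
As a taste of how simple the integration is, here's a minimal sketch using the openai Python package (pre-1.0 interface) and the Completion endpoint; the model name and prompt are just illustrative choices on my part.

import openai  # pip install openai; reads OPENAI_API_KEY from the environment

response = openai.Completion.create(
    model="text-davinci-003",   # an instruction-tuned model from the GPT-3 family
    prompt="Explain what a large language model is in one sentence.",
    max_tokens=60,
    temperature=0.2,
)
print(response.choices[0].text.strip())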

What can be the use cases of GPT-3?

We all know ChatGPT is ground-breaking. Our focus should be on exploring ways we can use its underlying architecture (GPT-3) in application development.

Since the beginning of deep neural networks, there has been a lot of research and innovation in the computer vision space. Networks like ResNet were ground-breaking and even surpassed human-level accuracy in tasks like image classification on the ImageNet dataset. We have enjoyed the advantage of pre-trained, state-of-the-art networks for computer vision tasks without worrying about large training datasets.

LLMs like GPT-3 address the lack of such pre-trained networks for natural language tasks. Put simply, GPT-3 is a massive pre-trained model that can understand language.

There are many interesting use cases for GPT-3 as a language model, including but not limited to:

  • Dynamic chatbots for customer service use cases which provide more human-like interaction with users.
  • Intelligent document management with smart tagging, paraphrasing and summarization of textual documents.
  • Content generation for websites, news articles, educational materials, etc.
  • Advanced text classification tasks
  • Sentiment analysis
  • Semantic search that provides natural language query capability.
  • Text translation, keyword identification etc.
  • Programming code generation and code optimisation

Since GPT-3 can be fine-tuned with a given set of training data, the possibilities are limitless with the natural language understanding capability it has. You can get creative and come up with the next big idea to improve the productivity of your business.
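
As a tiny illustration of the sentiment analysis use case listed above, here's a hedged sketch using the same Completion endpoint; the sample reviews are made up for the example.

import openai  # assumes OPENAI_API_KEY is set in the environment

reviews = [
    "Checkout was quick and the delivery arrived a day early.",
    "The parcel was damaged and support never replied to my emails.",
]

for review in reviews:
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=f'Classify the sentiment of this product review as Positive or Negative.\n\nReview: "{review}"\nSentiment:',
        max_tokens=3,
        temperature=0,   # deterministic, single-label answer
    )
    print(review, "->", response.choices[0].text.strip())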

What is Azure OpenAI?

Azure OpenAI is a collaboration between Microsoft’s Azure cloud platform and OpenAI, aimed at providing cloud-based access to OpenAI’s cutting-edge AI models and tools. The partnership provides a seamless platform for developers and organizations to build, deploy, and scale AI applications and services, leveraging the computing resources and technology of the Azure cloud.

Users can access the service through REST APIs, the Python SDK, or the Azure OpenAI Studio, a web-based interface dedicated to the OpenAI services.

In enterprise application development scenarios, using the OpenAI services through Azure makes integration much easier.
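
For a flavour of that integration, here's a minimal sketch pointing the pre-1.0 openai Python package at an Azure OpenAI resource; the resource name, deployment name and API version are placeholders you would replace with your own.

import os
import openai

# Azure OpenAI uses your own resource endpoint and key instead of api.openai.com.
openai.api_type = "azure"
openai.api_base = "https://<your-resource-name>.openai.azure.com/"  # placeholder endpoint
openai.api_version = "2023-05-15"                                   # example API version
openai.api_key = os.getenv("AZURE_OPENAI_KEY")

response = openai.ChatCompletion.create(
    engine="my-gpt-35-deployment",  # the deployment name you chose in Azure OpenAI Studio
    messages=[{"role": "user", "content": "Summarize the benefits of Azure OpenAI in one sentence."}],
    temperature=0,
)
print(response.choices[0].message["content"])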

Azure OpenAI became generally available very recently, and I'm pretty sure there will be vast improvements to the product in the coming days.

Let's keep our eyes open and start innovating on ways we can use this superpower wisely.