Recent advances in artificial intelligence (AI) have shown that the technology can significantly impact industries globally in the near to medium term. With rapid advances in the ability to process and generate complex data, most recently around language and vision, organisations will be able to unlock new levels of efficiency and productivity in their business operations.
This has been driven by a combination of improvements in model architectures, developments in supporting tools and services, and increases in both compute capacity and the volume of data available to process.
The technology has also moved away from being a field purely accessible to specialists, to one accessible to people with varying degrees of technical capability, thanks to the abundance of products, libraries and services now available.
However, due to the broad range of methods, models and approaches available, many organisations are struggling to match a technology solution to a real-world use case for improvement.
This guide aims to demystify AI and machine learning and equip organisations with the knowledge needed to navigate this evolving landscape. This understanding will empower business leaders to make informed decisions and capitalise on the potential of artificial intelligence.
AI can be broadly understood as any system that exhibits behaviour or performs tasks that typically require human intelligence. It encompasses various approaches, including machine learning, expert systems, rule-based systems and symbolic reasoning. Machine learning, a subset of AI, uses trained models to interpret and analyse complex data sets.
Trained models are the result of a learning process where the model is exposed to data and adjusts its internal parameters to capture patterns and relationships within the data. This learning process can be categorised into three types: supervised learning, unsupervised learning and reinforcement learning.
Supervised learning involves training models on labelled examples: pairs of inputs and known outputs. Learning from these examples, the model is then able to generalise and make predictions on unseen data.
Unsupervised learning, alternatively, learns from unlabelled data. This enables algorithms to learn autonomously and uncover patterns and structures in data without predefined labels or explicit guidance.
Reinforcement learning involves an intelligent agent being trained to make decisions based on feedback. The agent receives feedback through either rewards or penalties based on its actions. This feedback can then be used to improve the model's decision-making ability.
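To make the distinction between the first two types concrete, here is a minimal sketch in Python using scikit-learn; the bundled Iris dataset is used purely for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: learn a mapping from inputs to known labels,
# then predict labels for new, unseen inputs
classifier = LogisticRegression(max_iter=1000).fit(X, y)
print(classifier.predict(X[:3]))

# Unsupervised: uncover structure in the same data without using the labels
clusters = KMeans(n_clusters=3, n_init=10).fit(X)
print(clusters.labels_[:10])
```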
Early advances in artificial intelligence were based on logic-based reasoning. This includes expert systems and heuristic models, which rely on explicitly encoded rules and domain knowledge to solve complex problems in specific domains. Where machine learning focuses on extracting information from data sets, these rule engines rely on the rules that are input.
Over time, these approaches have been complemented, and in some cases replaced, by more advanced techniques. Machine learning algorithms have proven impressive in their capacity to learn from data and make predictions by identifying patterns. What makes systems powered by machine learning so powerful is their ability to learn without depending as heavily on human intervention.
As machine learning has advanced, so too has this ability to learn independently. Artificial neural networks loosely mimic the structure of the human brain to process and transmit information. Consisting of interconnected nodes, these networks use activation functions to determine the output of each neuron. By propagating information forward and backward through the network, they learn to recognise patterns, classify data and make sophisticated predictions. This process loosely parallels the multifaceted cognitive processes of the human brain.
Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have played a significant role in how systems can process data related to image and speech, respectively. CNNs are mainly used for processing grid-like data, such as the pixels in an image. RNNs, on the other hand, are ideal for processing sequential data, where how elements are ordered is important.
These multi-layered neural networks fall under deep learning, an advanced form of machine learning that enables systems to learn increasingly complex representations of data. This subset of machine learning has led to breakthroughs in the way that models process images, speech and text.
Today, AI can perform a wide range of complex tasks that were once considered exclusive to human intelligence, with proficiency in natural language processing, image recognition and speech recognition. At the peak of these advancements are transformers, which were initially proposed in Google's seminal research paper “Attention is All You Need”. This research introduced a novel architecture distinguished by its ability to process input sequences in parallel.
Unlike previous approaches, transformers do not rely on sequential processing. These models, instead, utilise self-attention mechanisms. Self-attention allows the model to capture relationships between different elements within a sequence by assigning importance weights to each element based on its relevance to other elements. This mechanism enables transformers to process the entire sequence in parallel, which makes them more efficient and effective in capturing long-range dependencies and contextual information.
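The core calculation can be sketched in a few lines of Python. This is a simplified illustration of scaled dot-product self-attention; real transformers add learned projections, multiple attention heads and masking:

```python
import numpy as np

def self_attention(X):
    # X: (sequence_length, model_dim), one row per token embedding.
    # In a real transformer, queries (Q), keys (K) and values (V) come from
    # learned linear projections of X; here we use X directly for simplicity.
    Q, K, V = X, X, X
    scores = Q @ K.T / np.sqrt(X.shape[-1])          # pairwise relevance scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # weighted mix of all positions

sequence = np.random.rand(5, 8)        # 5 tokens, 8-dimensional embeddings
print(self_attention(sequence).shape)  # (5, 8)
```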
Transformers have been particularly successful in tasks like machine translation, natural language understanding and text generation. They have enabled the development of large-scale language models like OpenAI's ChatGPT and Google's Bard, natural language processing tools that demonstrate impressive capabilities in generating coherent and contextually relevant text.
With different ways to leverage these algorithms and technologies, it can be difficult to know which is the best option and how you can get started. In the following sections we look at some of the key considerations for getting started with your AI projects.
Organisations have various factors to consider when beginning AI and machine learning projects, from defining the processes, people and data that fall within the scope to choosing the methods and technology to implement. In this section we will outline three key considerations in detail.
Are you working with financial data, user activity, volumes of text, images or something else? Is your data structured or unstructured? For example, your organisation may want to analyse online customer behaviour to inform marketing strategies. The data involved would consist of structured data such as user demographics, browsing preferences and purchase records. In this scenario, a model could be used to learn these preferences and predict future behaviour.
Alternatively, if you want to visually identify stock, then your data will be images. Many image classifiers are available pre-trained; that is, the model has already been trained on a dataset. Using pre-trained models can allow organisations to begin leveraging AI technology quickly without having to invest in training data and models from scratch. Services like Azure Custom Vision and Amazon Rekognition provide a strong foundation for these scenarios, with pre-trained models for image classification and object detection specifically.
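As an open-source illustration of the same idea (rather than the cloud services named above), the sketch below loads an image classifier pre-trained on ImageNet via torchvision; the image file name is hypothetical:

```python
import torch
from PIL import Image
from torchvision import models, transforms

# Load a ResNet-18 classifier with weights pre-trained on ImageNet
model = models.resnet18(weights="IMAGENET1K_V1")
model.eval()

# Standard ImageNet preprocessing
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = preprocess(Image.open("stock_item.jpg")).unsqueeze(0)  # hypothetical image
with torch.no_grad():
    logits = model(image)
print(logits.argmax(dim=1))  # index of the most likely ImageNet class
```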
Also consider the data that you would receive from your solution; how will you evaluate the output? If you decide to use a language model to process and generate text (e.g. a chatbot), then it is important to consider the challenges that come with evaluating its responses. Large language models can be difficult to test because their outputs are subjective; how would you define an ideal response?
There are different strategies for evaluating generative language models, and each will likely suit a different use case. You may want to evaluate the truthfulness of the model's responses (i.e. how accurate they are when compared against real-world facts) or how grammatically correct they are. For translation solutions, you are more likely to measure metrics such as the Translation Edit Rate (TER): the number of edits that must be made to bring the generated output in line with a reference translation.
Language libraries like LangChain provide features for evaluating the responses according to relevance, accuracy, fluency and specificity, as well as giving you the flexibility to define your own criteria for evaluation via the LangChain API.
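As a rough sketch of what this can look like (LangChain's evaluation API has evolved across versions, so treat the details as indicative), a criteria evaluator judges a model output against a named criterion using an LLM:

```python
from langchain.evaluation import load_evaluator

# Criteria evaluators use an LLM as the judge; by default LangChain will
# fall back to an OpenAI chat model if credentials are configured.
evaluator = load_evaluator("criteria", criteria="relevance")

result = evaluator.evaluate_strings(
    prediction="Our store opens at 9am on weekdays.",  # the model's response
    input="What time do you open on Mondays?",         # the original prompt
)
print(result)  # typically includes the judge's reasoning and a score
```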
Clarify whether your intended solution would process and analyse existing data or generate new content. For cases where you want to identify patterns or predict future behaviour, a model that processes data will be well suited. Examples could include a solution that analyses existing customer data, from which trends can be identified and predictions formed.
Data generation solutions, on the other hand, are used to create data that did not previously exist. This new data could take the form of synthetic data that can then be used to train and test machine learning models, or even new creative content, such as text or images.
There is also the option of using a solution that is capable of both processing and generating data. This type of solution can be advantageous in cases where you want your model to learn from its experiences and the data that it is processing. An e-commerce organisation may train a model on a large data set of user behaviour to learn about customers' interests. Once this training is completed, the model could then be used to generate new recommendations for users.
The core component at the centre of a machine learning project is a trained model, which in the simplest terms is a software program that, once given sufficient training data, can identify patterns and make predictions. Your final consideration, therefore, should be how you will access a model for your AI/ML project. In the following sections we will look at two popular approaches for accessing a machine learning model.
With a better understanding of the key considerations for getting started with AI projects, your organisation will be able to evaluate these approaches in line with your intended data area and output.
AI cloud services enable organisations to rapidly adopt and leverage AI technology by providing pre-built models, APIs and infrastructure. Because of the wide range of pre-built models that cloud services offer, it can be useful for organisations to first consider whether they can achieve their objectives using a cloud service that already exists.
Azure, Google Cloud and AWS provide pre-built, pre-trained models for use cases such as sentiment analysis, image detection and anomaly detection, plus many others. These offerings allow organisations to accelerate their time to market and validate prototypes without an expensive business case.
Where previously machine learning projects have required specialised expertise and substantial resources, AI cloud services enable organisations to quickly develop AI solutions for a range of applications.
In cases where you want more control over the development and training of your own model, it can be useful to leverage a machine learning framework like TensorFlow or PyTorch. These frameworks offer libraries and tools to help develop machine learning models.
Building a machine learning model generally refers to the entire process of creating a model from scratch, including selecting an appropriate algorithm or architecture, defining the model's structure and implementing it.
Defining a model, alternatively, will more likely involve working with a model from a library or using a framework that provides predefined architectures. Which approach you take will be determined by your organisation's use case, resources and the granularity with which you want to create a model. Building from scratch affords even greater customisation and control over your model but will come with higher financial and computational costs.
Decide between using an existing model or developing your own: Consider whether an existing model already addresses your problem. PyTorch, TensorFlow and Scikit-learn offer functionality that can be leveraged for everything from data pre-processing and feature engineering to model training and evaluation. The versatility and power of these frameworks makes them a very viable option if you are choosing to configure or develop your own model.
Select a framework: Scikit-learn is a powerful framework for accessing pre-built models or developing custom models across a range of algorithms including classification, regression and clustering algorithms.
Working specifically within the area of neural networks, it is possible to develop custom deep learning algorithms using frameworks such as PyTorch or TensorFlow, developed and used by Meta and Google respectively. These deep learning models can then be used to power solutions such as virtual assistants and speech recognition systems. Both of these frameworks are built upon the concept of tensors, which can simply be thought of as multi-dimensional arrays.
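To make the tensor idea concrete, here is a minimal PyTorch example; the same concepts apply in TensorFlow:

```python
import torch

scalar = torch.tensor(3.0)              # 0-dimensional tensor
vector = torch.tensor([1.0, 2.0, 3.0])  # 1-dimensional tensor
matrix = torch.rand(3, 4)               # 2-dimensional tensor
images = torch.rand(32, 3, 224, 224)    # 4-D: batch, channels, height, width

print(matrix.shape)       # torch.Size([3, 4])
print(matrix @ matrix.T)  # tensor operations such as matrix multiplication
```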
Both are mature and stable frameworks, each with their own strengths and weaknesses. For example, being heavily used in research, PyTorch can provide earlier access to state-of-the-art models, whereas TensorFlow in certain scenarios can provide increased performance through its ability to take advantage of GPUs and other specialised processors.
For organisations approaching this with experience in the Microsoft technology stack, ML.NET is also an option with seamless integration capabilities. However, compared to other development frameworks, ML.NET has a more limited set of pre-built models and algorithms available.
Aggregating, cleansing and preparing data: This involves collecting all the data that you will use for training your model. Once this data is collected, it will need to be prepared for training, with processes like cleansing being important in getting the data into a format which the model can understand and learn from.
Defining your model: Developing and tuning your model is a crucial step in this process and goes beyond simply defining the structure and design of the model. This will require choosing the appropriate algorithms and layers to make your model as effective as possible.
When it comes to selecting hyperparameters, be sure to carefully consider their impact on the model's performance and ability to generalise. Experimentation and iteration are key in finding the optimal configuration for your specific problem.
In this aspect, PyTorch and TensorFlow are very useful frameworks in that they give you access to a variety of libraries and tools that make it easier to, for example, define neural networks and apply optimisation techniques. Frameworks like Scikit-learn also offer a diverse set of algorithms for traditional machine learning tasks.
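For instance, a small feed-forward network for a binary classification task might be defined in PyTorch as follows; the layer sizes here are purely illustrative:

```python
import torch
from torch import nn

class SimpleClassifier(nn.Module):
    def __init__(self, n_features: int):
        super().__init__()
        # Two hidden layers with ReLU activations; sizes are illustrative
        self.layers = nn.Sequential(
            nn.Linear(n_features, 64),
            nn.ReLU(),
            nn.Linear(64, 32),
            nn.ReLU(),
            nn.Linear(32, 1),  # a single logit for binary classification
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)

model = SimpleClassifier(n_features=10)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)  # optimisation technique
loss_fn = nn.BCEWithLogitsLoss()  # the error function guiding training
```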
Training your model: This is where you provide your model with the data it needs to learn. Keep in mind that how long this takes can vary greatly, ranging from minutes to months, depending on the complexity of your model and the size of your dataset. This step can be the most time and resource intensive, so it is a good idea to capture the usage metrics of this stage before deploying your model to any production environments.
Evaluate your model: Your model will need to be evaluated on a held-out dataset after it has been trained. By doing this, you can determine how well the model generalises to unseen data.
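In scikit-learn, for example, holding out a test set and scoring the trained model takes only a few lines; the estimator and the generated data here are placeholders for your own:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=42)  # placeholder data

# Hold out 20% of the data; the model never sees it during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
print(accuracy_score(y_test, model.predict(X_test)))  # score on unseen data
```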
Deploy your model: When you are satisfied with the model's performance, you can deploy it in production, which could be anything from hosting the model with API access to embedding the model within a cloud-based web application.
At this stage it is important to consider the type of inference required for your specific use case: real-time or batch. Batch inference processes large batches of data periodically. This approach can support more complex models but produces results with higher latency.
These factors make it suitable for situations where you need to process large volumes of data without requiring immediate results. If you are working in scenarios such as data analysis or report generation, where the focus is on comprehensive analysis rather than real-time decision-making, then batch inference can be a useful solution.
Real-time inference, alternatively, delivers a small number of inferences instantly. Fraud detection and recommendation systems are two well-suited use cases for real-time inference because they require instant predictions to respond to dynamic situations. One caveat when taking this approach is the latency constraint: real-time inference is less suitable for deploying complex models that require extensive computational resources or have longer processing times.
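To make the "hosting the model with API access" option above concrete, here is a minimal real-time inference sketch using Flask; the model file name and request format are assumptions for illustration:

```python
import pickle
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load a previously trained and serialised model (hypothetical file name)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON such as {"features": [1.2, 3.4, 5.6]}
    features = request.get_json()["features"]
    prediction = model.predict([features])
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```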
Once your machine learning model has been built and trained, it can be deployed to an environment. Here we will outline a few of the different options available for hosting your model. Which option is best for your organisation will depend on specific budget, needs and overall requirements.
Cloud hosting is a popular choice for hosting machine learning models because of the scalability and security it provides. Resources are accessed online, which allows you to allocate and adjust compute based on the demands of your model.
If you’ve developed a model using an AWS or Azure AI service, then your model will be seamlessly integrated with the cloud infrastructure. These providers offer specialised machine learning services that handle the underlying infrastructure and provide built-in scalability.
This scalability makes it easier to host both real-time and batch inference models in the cloud, whether your model requires immediate responses or periodic processing of large data batches.
Scalability is matched by pay-as-you-go pricing. This means that you only pay for the resources that you need and will effectively have access to unlimited storage, as your budget allows.
One thing to be mindful of is that creating and running a machine learning model can be compute intensive. For this reason, it is advisable to separate out the infrastructure so that a dedicated resource runs your model. This will prevent the model from competing with other services, like your website or database.
This option involves hosting your model on-premises on physical servers. A big drawback here is the maintenance these physical servers require, which incurs large costs and can lead to diminished returns in the long run.
Also consider the infrastructure requirements and maintenance challenges when hosting a real-time inference model on-premises. Unlike other hosting options, real-time models demand continuous availability and low-latency processing. This means you'll need robust hardware, reliable network connectivity and dedicated resources to handle the high volume of incoming data and to be able to provide real-time responses.
Hosting your machine learning model on-premises comes with upfront costs for hardware infrastructure, but it does provide a major advantage if your model is meant for internal use. If you keep the model within your own infrastructure, you will have complete control and ownership over your data. This is crucial when dealing with sensitive information that should remain on-site. This approach will also enable faster data access and reduced latency, in turn, leading to a more responsive system where teams can quickly retrieve data.
Using containers allows you to package your model and its dependencies into a single unit that can be run on any compatible infrastructure. This could be an App Service or a Kubernetes cluster, depending on your specific requirements.
Your model can be packaged along with its associated software stack and deployed seamlessly across various platforms. This portability means that containers make it easy to deploy models to the cloud or on-premises. Containerisation offers many of the same benefits as hosting in the cloud, such as scalability and flexibility. These benefits also make it suitable for hosting both batch inference and real-time inference models.
However, if an organisation has limited hardware infrastructure or a limited budget for hosting containers, the cloud may offer a more cost-effective solution.
With your model deployed, it is important to consider how you can maintain and potentially improve its performance through retraining.
Data changes over time, and what was valid or representative a few years ago may no longer hold true today. If you have a model that predicts user behaviour, six months of user behaviour data from three years ago may no longer accurately reflect current patterns.
It is, therefore, important that you effectively maintain and retrain your model to ensure accuracy. Here we outline six key areas to consider during these processes:
1. Decide a plan to feed in new data
Determine the schedule and approach for feeding in new data and retraining your model. This could be on a time basis (weekly, monthly, etc.), per deployment or via event-driven triggers. Setting this plan early ensures that your model stays up to date and can adapt to evolving patterns.
2. Investigate failures
Investigating serious failures or inaccurate results may identify parameters that you had not previously considered. For example, in a database of vehicles, these results may highlight attributes, like engine size or maintenance history, that had not previously been factored into the model. You can then add these previously unconsidered factors as parameters in your model and retrain it to see their impact.
3. Leverage tools to help improve your algorithm
There are various tools that you can use to improve your algorithm by fine tuning parameters and optimising performance. One example is Ray Tune, a Python library that provides capabilities for tuning hyperparameters. This allows you to automate the process of exploring different hyperparameter configurations and finding the optimal settings for your model.
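A minimal sketch of a Ray Tune search is shown below; it uses the classic tune.run interface (newer Ray releases favour tune.Tuner) and a placeholder training function in place of your own model code:

```python
from ray import tune

def train_model(config):
    # Placeholder for your own training loop; report the metric to Tune
    loss = (config["lr"] - 0.01) ** 2 + 1.0 / config["batch_size"]
    tune.report(loss=loss)

analysis = tune.run(
    train_model,
    config={
        "lr": tune.loguniform(1e-4, 1e-1),        # sample learning rates on a log scale
        "batch_size": tune.choice([16, 32, 64]),  # try discrete batch sizes
    },
    num_samples=20,  # number of hyperparameter configurations to evaluate
)
print(analysis.get_best_config(metric="loss", mode="min"))
```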
Running tools like these periodically gives organisations insights into how they can improve data collection and overall business processes, in turn leading to a better model. The objective here is to seek out opportunities for getting more accurate results from your machine learning solution, so that it can respond to the latest market and customer data.
4. Retraining does not guarantee better performance
Without proper evaluation, retraining might give you a worse model. Be sure not to save a model without first ensuring that it performs better than older models. It is recommended that you define your own criteria for what constitutes a good model and archive previous models to maintain access to them.
For example, an outlying piece of data might cause your retrained model to perform badly. In this case, it is important that you can still access your last model for comparison and fallback purposes. Archiving older models will ensure that you always have a reference point to determine how effective your retraining process is and avoid a regression in performance. This way you won’t be replacing an older model that is performing better than your retrained model.
5. You might get better data over time
Over time, training data can become less relevant or redundant. The likelihood is that there will always be ways that you can get better data for your model. This does not necessarily mean that your existing data is ‘bad’, but rather that there may be opportunities to enhance the quality, diversity or completeness of the data.
An important step here is to establish a learning curve. This is a graphical representation of how your model performs relative to the amount of training data that it receives. Analysing the learning curve can help you gain insight into how the model’s accuracy or other performance metrics change as you increase the volume or variety of training data.
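Scikit-learn can generate the numbers behind such a curve directly; in this sketch the estimator and generated data are placeholders for your own:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=2000, random_state=0)  # placeholder data

# Score the model at increasing fractions of the training set
train_sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for size, score in zip(train_sizes, val_scores.mean(axis=1)):
    print(f"{size} training examples -> mean validation accuracy {score:.3f}")
```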
6. Perform actions periodically
Measuring the performance of your machine learning model periodically ensures that you are consistently monitoring its effectiveness and identifying any potential areas for improvement. Revisit your learning curve at regular intervals, perhaps every quarter depending on how quickly your data changes, to assess the model’s performance over time and identify trends that may require your attention. You may discover that your model would benefit from additional training data to enhance its performance.
Cloud service providers including Google Cloud, AWS and Azure provide a range of services that enable organisations to get started developing AI solutions quickly. These services include pre-built and pre-trained models, APIs and other important tools for solving real business problems.
Focusing on Azure, here we break down the three main categories of Azure AI cloud services:
Azure Machine Learning is a fully managed cloud service for building, training and deploying machine learning models. It provides a variety of tools to help you with every step of the machine learning process, from data preparation to model training and deployment. With its robust set of tools, this service can be leveraged by organisations to solve a wide variety of problems.
Azure Cognitive Services is a set of pre-built APIs and SDKs that enable you to add features like natural language processing, speech recognition and computer vision to your applications. These services provide the foundation for more advanced Azure AI services, such as Azure Applied AI Services.
Azure Applied AI Services is a specialised set of services that can be used for practical applications of AI. Leveraging the core components of Azure Cognitive Services, including speech, language and image capabilities, this service enables organisations to enhance their applications with advanced AI functionalities in a seamless and user-friendly way.
With an understanding of the machine learning cycle established, this section will look at how leading organisations have created powerful machine learning solutions by leveraging cloud services or defining their own model using a machine learning framework.
Approach: Configuring a model
Technology: ML.NET
The client for this project is a global provider of medical product sterilisation. The main objective of the project was to create an application that could accurately forecast the optimal efficiency of the sterilisation process.
An application was created using ML.NET to accurately predict the dose range for products undergoing sterilisation. The prototype, trained on the provided data, leveraged machine learning algorithms within ML.NET to predict the level of sterilisation required for products prior to product loading.
This machine learning framework was chosen because it aligned with the organisation’s technology stack, meaning that it could be seamlessly integrated with their other .NET applications.
The first step in building the model was to define the scenario that we wanted to solve. A key feature that helped with this process is the ML.NET Model Builder, which selects the algorithm that will perform best on a given data set. This feature helps developers get started on building their model without the need for extensive algorithm selection and evaluation.
Historical data that could be used to train the model was provided and imported. The range of file types supported by ML.NET, including CSV files and SQL Server databases, made this a seamless and efficient process. The historical data could then be used to build a customised regression model in ML.NET.
A LightGBM (LGBM) algorithm was selected for training and testing the model. This is a gradient-boosted regression algorithm that is useful for predicting a single value based on a set of input parameters. The parameters for the model were density, totes, surrounding totes' density and processing speeds. This model was trained locally, although ML.NET also offers the ability to train models on Azure. Trained using approximately 6,000 runs, the platform quickly learned and adapted to the data.
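The project itself used ML.NET's implementation, but the same idea can be sketched in Python with the standalone lightgbm package; the data and weights below are invented purely to show the shape of the workflow:

```python
import numpy as np
from lightgbm import LGBMRegressor

# Invented data: density, totes, surrounding totes' density, processing speed
rng = np.random.default_rng(0)
X = rng.random((6000, 4))  # roughly 6,000 historical runs
y = X @ np.array([2.0, 0.5, 1.5, -1.0]) + rng.normal(scale=0.1, size=6000)

model = LGBMRegressor(n_estimators=200)
model.fit(X, y)

# Predict the dose value for a new, unseen run
print(model.predict([[0.7, 0.2, 0.5, 0.9]]))
```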
Functions like Test and Evaluate helped ensure that the model was accurate and performing as expected. These functions enabled the model to be tested on unseen data and helped evaluate its performance by providing metrics related to accuracy and precision.
This tool can calculate the probability of achieving the desired sterilisation range for a given set of processing speeds. This flexibility helps optimise scheduling and dosage processes while ensuring compliance with contractual obligations.
All results provided by the predictor are made available to scheduling administrators who can then make informed decisions based on the predicted range. The tool empowers users to assess the probability of failure, for instance, by indicating that processing the solution at a certain speed had a 90% chance of failure. Users have the final say in processing decisions and can infer the likelihood of failure by processing the product under different conditions.
The predictor not only enhanced existing processes but also provided scheduling administrators with data-driven insights to optimise their decision-making. The project highlighted the power of ML.NET to create custom machine learning solutions quickly and with great accuracy, and it delivered further successful outcomes for the client.
Approach: Defining a model
Technology: Scikit-learn and pandas
The client for this project is a nationwide energy provider that specialises in providing gas to organisations. The main objective was to better predict incorrect or overinflated estimates for energy bills.
By accurately predicting these bills, the organisation could improve billing transparency, in turn ensuring that customers could avoid unnecessary expenses. A machine learning model would provide a data-driven approach to the billing process and help improve customer service and trust in the long term.
The scikit-learn library and the pandas open-source package for Python were used for this project, as they provided the necessary tools and resources to preprocess and analyse the data.
A linear support vector machine (SVM) model was specifically chosen for its ability to handle complex patterns and relationships in data effectively. SVMs are particularly powerful for identifying outliers and classifying data into different categories, which made them well-suited for distinguishing potentially inaccurate bills in the data.
Historical data was provided by the organisation relating to customer data, billing details and energy consumption metrics. Most useful was the data revolving around what an accurate bill should look like. This subset would serve as a reference point for distinguishing between correct and incorrect or overinflated estimates.
This data then underwent thorough preprocessing, including cleansing and transforming the dataset, to ensure that inputs were meaningful and could be effectively used for training the model.
Scikit-learn provided a comprehensive implementation of linear SVMs which helped ensure a seamless process for training the model.
The model was thoroughly trained using supervised learning methods and labelled data. Each data point had input features and a corresponding label indicating whether the estimate was incorrect or overinflated.
This allowed the model to learn the underlying patterns and relationships between the input features and the billing errors. The model’s parameters were fine-tuned throughout this process, with a focus on optimising its performance to ensure the highest possible accuracy.
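A simplified sketch of such a pipeline with pandas and scikit-learn is shown below; the file name and column names are hypothetical stand-ins for the client's data:

```python
import pandas as pd
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Hypothetical pre-processed billing data with a binary label
data = pd.read_csv("billing_history.csv")
X = data[["estimated_usage", "actual_usage", "billing_period_days"]]
y = data["is_overinflated"]  # 1 = incorrect or overinflated estimate

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Scale the features, then fit a linear SVM classifier
model = make_pipeline(StandardScaler(), LinearSVC())
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```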
Two days of training resulted in strong levels of accuracy for the model. Key to this success was defining what an ideal bill looked like in the data. This meant establishing the characteristics of what was an accurate bill, so that the model could gain a deep understanding of what constituted an incorrect or overinflated estimate.
The model was retrained periodically to adapt to evolving data patterns and changes in energy billing practices. Using updated data for this retraining helped to improve the accuracy of the model and ensure its effectiveness in predicting incorrect bills.
The solution developed predicts incorrect or overinflated estimates for energy bills to a high level of accuracy by analysing input features and identifying patterns indicative of such errors. With these predictions, the organisation can take corrective measures and provide more accurate billing information to customers.
Scikit-learn proved ideal for this use case, with features that enabled seamless machine learning development and brought a number of benefits to the project.
Approach: AI Cloud service
Technology: Azure Custom Vision
This organisation faced the challenge of monitoring the placement of their products in supermarkets to ensure optimal visibility for their brand. An ideal solution would provide a more streamlined, automated way to capture product images and compare their shelf presence with competitor products.
Azure Custom Vision, a part of Azure Cognitive Services, was used for this project because it provides built-in functionality for identifying food products on shelves. Azure Custom Vision provides granular functionality for choosing the kind of machine learning model you want to create, categorised into image classification and object detection.
The project began by collecting photographs of the client’s products on supermarket shelves. While there was the option to use pre-trained models within Custom Vision, in this case the model was manually trained with a wide selection of images taken from different angles. This decision was made to ensure that the model could recognise specific characteristics and variations of the product. The training data also served as test and validation data and provided a starting point for the model to learn and improve.
With this data collected, each image was then tagged with relevant labels and classifications that could differentiate the products. Custom Vision ensured an efficient labelling process by automatically detecting potential products within the image that could then be labelled with our created tags.
Custom Vision provides granular control over how you want to train your model. This includes the training type, whether you want to carry out quick training or advanced training on your model, and how long you want to train your model for. Azure provides indicators to show how the chosen training duration corresponds to budget.
Custom Vision provided the flexibility required for refining this solution. With a Quick Test, you upload an image that the model then processes. Results from this test are split into three metrics: Precision, Recall and mean Average Precision (mAP).
The service also allows you to improve your model by conducting a quick test and querying the detections made by the model, e.g. correcting the model if it wrongly identifies a tub of Greek yoghurt as a pint of milk. This evaluation allowed for continuous improvement by identifying misclassifications and providing feedback to the model, gradually enhancing its accuracy.
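Programmatically, running such a test against a published model might look like the following with the Azure Custom Vision Python SDK; the endpoint, key, project ID, iteration name and image file are all placeholders:

```python
from msrest.authentication import ApiKeyCredentials
from azure.cognitiveservices.vision.customvision.prediction import (
    CustomVisionPredictionClient,
)

# Placeholder credentials and identifiers for your Custom Vision resource
credentials = ApiKeyCredentials(in_headers={"Prediction-key": "<prediction-key>"})
predictor = CustomVisionPredictionClient("<endpoint>", credentials)

with open("shelf_photo.jpg", "rb") as image:
    results = predictor.detect_image(
        "<project-id>", "<published-iteration-name>", image)

for prediction in results.predictions:
    print(f"{prediction.tag_name}: {prediction.probability:.2%}")
```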
By creating, training and refining an image detection model in Custom Vision, the project gave the client a clear proof of concept for how an AI solution for product image detection would behave, alongside a number of other benefits.
Approach: AI cloud services
Technology: Azure OpenAI Service
Low- and no-code platforms allow non-technical users to build powerful projects quickly and efficiently. However, the traditional process of building a project within these platforms often involves manually adding data to tables and their associated rows. This manual approach not only consumes significant time and effort but, for more complex projects, also requires some knowledge of data modelling. The client for this project is a no-code platform that was looking to streamline this traditional process for building projects.
The solution streamlines the onboarding process for the client by giving users a way to quickly generate projects based on text inputs. This eliminates the need for manual data entry and reduces the time and effort required to get started with a new project.
Microsoft’s Azure OpenAI Service was chosen for this project because it provides access to OpenAI’s pre-trained large language models, including GPT-3 and Codex, via its REST API. This API can then be leveraged to create generative language processing tools. Azure OpenAI Service is also compatible with open-source framework LangChain to allow users more granular control over the training of these large language models.
Prompt engineering is a key part of how Azure OpenAI Service functions. This process requires users to input queries to the machine learning model to elicit desired responses. Prompts should be detailed enough to guide the model towards generating an accurate and contextually appropriate response. The completions endpoint is the key area of the API for submitting prompts: users provide an input and the model generates a text completion. A prompt can be as short as a brief piece of text that provides context for the completion, and requests can also specify parameters such as a maximum number of tokens, which defines how large the completion can be.
Azure OpenAI Service provides a playground to experiment with these capabilities. Here users can interact with the API and adjust various configuration settings, such as the temperature and length of the generated text. To familiarise the API with the no-code platform, detailed information about the platform, its capabilities and its use cases were provided to the completions endpoint. This information gives the model an understanding of the platform and the project creation process. Key information included context about what features the platform offers and data relationships that can be created on the platform.
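A minimal sketch of calling the completions endpoint from Python is shown below; it uses the pre-1.0 openai package style that matches these models, and the resource name, deployment name and prompt are placeholders:

```python
import openai

# Point the openai package at an Azure OpenAI resource (placeholder values)
openai.api_type = "azure"
openai.api_base = "https://<your-resource>.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "<your-api-key>"

response = openai.Completion.create(
    engine="<your-deployment-name>",  # the name of your deployed model
    prompt="Suggest the tables and fields for a simple CRM project.",
    max_tokens=500,   # upper bound on the size of the completion
    temperature=0.2,  # lower values give more deterministic output
)
print(response.choices[0].text)
```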
Azure OpenAI Service is particularly powerful because of its ability to quickly gain an understanding of the context that is provided. Leveraging OpenAI's generative language models, the completions endpoint responded to text inputs with relevant data types and relationships.
The API was also able to return an accurate JSON array based on the project database, name and description. This response contained all the data types for each table, as well as the necessary data relationships suggested by the model. The JSON can then be parsed and used to dynamically create the tables and fields required for the CRM platform.
The API also made it easy to integrate the developed solution with the client’s platform, ensuring a seamless end-to-end user experience. Once the prompt is executed, the API provides a JSON array that can be linked through as part of an interactive UI.
Azure OpenAI Service gives users the ability to build custom software solutions by interpreting and processing natural language text inputs. For example, by interacting with the AI and providing a request like “I want a CRM platform”, users will receive a custom CRM platform to fit their needs. This approach revolutionises the traditional process of building these platforms, as it eliminates the need for extensive business and process analysis, and it brought a number of other benefits to the client.
Term | Definition |
---|---|
Artificial intelligence | The development of computer systems that can perform tasks that would usually require human intelligence. |
Clustering | A method used in unsupervised learning to group data points together based on their similarities and characteristics. |
Deep learning | A branch of machine learning that uses artificial neural networks with multiple layers to extract meaningful patterns. |
Error functions | Metrics that measure the difference between predicted and actual values, used to guide model optimisation. |
Gradient descent | An optimisation algorithm used to minimise the error of a model by adjusting its parameters iteratively. |
Human feedback | Inputs provided by humans to evaluate, correct or guide machine learning models. |
Lemmatisation | The process of grouping together different inflected forms of a word for analysis as a single item. |
Linear regression | A statistical approach for modelling the relationship between dependent and independent variables using a linear equation. |
Logistic regression | A statistical technique used to model the probability of a binary outcome based on independent variables. |
Machine learning | An application of AI that enables machines to learn from data and improve their performance without explicit programming. |
Model | A representation of a real-world system or process created by a machine learning algorithm to make predictions or decisions. |
Natural language processing | The field of AI focused on enabling computers to understand, interpret and generate human language. |
Neural networks | Computational models that consist of interconnected nodes or neurons that can learn and make decisions. |
Overfitting | A phenomenon in machine learning where a model becomes too specialised to the training data and performs poorly on new data. |
Pre-trained model | A model that has already been trained on a dataset. These models are built to perform specific tasks, such as image recognition. |
Recurrent neural networks | Neural networks that can process sequential data by including loops within their architecture to retain and utilise past information. |
Reward functions | Functions that define the measure of success or desirability in reinforcement learning, guiding the learning process. |
Sequence transduction | The process of transforming one sequence of data into another sequence (e.g., translating one language into another). |
Stemming | A text processing technique that reduces words to their base or root form to simplify analysis and improve efficiency. |
Supervised learning | A category of machine learning where models are trained using labelled data to make predictions or classifications. |
Unsupervised learning | A category of machine learning that trains models on unlabelled data to identify patterns or structures. |