Mimir & OpenAI Implementation Tutorial

Introduction

This tutorial will walk you through the process of creating a custom action in Mimir and integrating it with a corresponding flow in qibb. The objective is to extract a transcript and use ChatGPT to generate short rap rhymes from it, leveraging gpt-4-1106-preview, the latest and most advanced OpenAI model available at the time of writing this tutorial.

Tutorial Flow Source Code

To explore the nodes featured in this tutorial, simply download the attached JSON file below and import it into your flow app.

rapGPT.json

You can do this either by pressing Ctrl + I or by navigating to the main settings and selecting the Import option.

Tutorial

This guide and the rapGPT.json flow expect you to have access to:

  1. a qibb flow app with an HTTP In node accepting POST requests on a defined path (e.g. /webhooks/rap-gpt)

  2. Mimir admin rights

  3. qibb Space admin rights (needed for the Secrets management)

  4. active OpenAI API Key

Mimir Configuration

Custom Action Creation

To get started, configure the custom action in Mimir by navigating to System Settings → Integrations → Custom Actions.

Next, click the Add new button. A dialog box will appear; fill out the necessary details about the custom action:

Adding a Custom Action in Mimir

Make sure you give the action a Label; the Tooltip and Icon are optional but nice to have, as they improve the user experience. For URL, use the base URL of your designated qibb flow followed by the path you assigned to the HTTP In node in the qibb flow app.

After creating the Custom Action, navigate to the Video assets, select some assets, and press the right mouse button; you should see your Custom Action in the contextual menu.

RapGPT Custom Action

Metadata Field Creation

After creating a custom action, we need to create a custom metadata field. In Mimir, navigate to System Settings → Metadata → Model and press the plus (+) sign (Add a new field to the schema). This automatically adds a new field at the very bottom of the list, where you can define the field and its key. Select Field Type → Text. Optionally, you can mark the field as Read only, which ensures that its value can only be changed via an API call.

rapSummary Metadata Field’s Model Creation

When done, open the Item Page section, give the field a label, select Visible and Multiline, and optionally pick an icon for it from the drop-down list.

Metadata Field Creation - Item Page

Multiline Metadata Field

By default, string fields are single-line. However, if you enable the multiline checkbox, the field will automatically expand to accommodate multiple lines of text.

qibb Flow App Configuration

Install Mimir, OpenAI, and Secret Manager Nodes

The flow expects you to have the following three qibb nodes installed:

  1. Secret Manager (for populating the global context of your flow)

  2. Mimir (to be able to communicate with Mimir’s backend API)

  3. OpenAI (to send API requests to OpenAI)

Installing OpenAI qibb Node From the Flow Catalog

More information can be found here.

Set Secrets in qibb's Space

You can refer to this tutorial on how to add Mimir and OpenAI credentials to your Space.

The tutorial qibb flow (rapGPT.json) expects the following credentials to be configured in your Space:

  1. MIMIR_USERNAME - Mimir’s username

  2. MIMIR_PASSWORD - Mimir’s password

  3. MIMIR_BASE_URL - Mimir’s base URL (usually it is https://mimir.mjoll.no/)

  4. OPEN_AI_API_KEY - API Key for OpenAI (reference: https://platform.openai.com/docs/quickstart?context=python)

Global Variables Prefix

Please be aware that the Secret Manager adds a SECRETS. prefix in front of every key, so if you set a key named MIMIR_USERNAME in the Space, the global variable in your flow will be SECRETS.MIMIR_USERNAME.
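
As a minimal sketch (assuming qibb flows behave like Node-RED flows, where the global context is read with global.get in function nodes and $globalContext in JSONata fields), the secrets can be referenced like this:

JS
// Reading the Space secrets from the flow's global context (sketch; keys as set in the Space)
const mimirUser = global.get("SECRETS.MIMIR_USERNAME");
const mimirPassword = global.get("SECRETS.MIMIR_PASSWORD");
// In JSONata-capable node fields, the same value can be read with $globalContext("SECRETS.MIMIR_USERNAME").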

qibb Flow Explanation

Mimir sends a webhook POST request to the configured URL (/webhooks/rap-gpt) with the following payload:

JSON
{
   "items":[
      {
         "id":"c0b36642-4698-4669-ab2c-8a29a88964cf",
         "itemType":"video",
         "metadata":{
            "formId":"03b447e3-7876-46c8-b882-294378fdbc8b",
            "formData":{
               "default_title":"sample_video.mp4",
               "default_description":"Test description",
               "default_createdOn":"2023-10-30T12:34:10.822Z"
            }
         }
      }
   ]
}

where:

  • id - the unique item ID that Mimir assigns to each item

  • itemType - type of the item

  • formId - Mimir assigns a formId to each item; each formId can have different metadata fields, but one item cannot have more than one formId

  • default_title - the title of the asset that gets automatically generated by Mimir

  • default_description - the description of the item

  • default_createdOn - the date when the item was first created in Mimir

From this payload, we only care about the item ID, which is why we use the Fetch ItemId & Set Auth change node to save its value into msg.mimir.itemId. The same node also sets Mimir’s username and password, which are needed to retrieve a JWT (JSON Web Token) for the subsequent calls to Mimir via the Mimir Auth qibb node.
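
Expressed as plain JavaScript, the change node does roughly the following (a sketch; the exact credential properties the Mimir Auth node expects may differ):

JS
// Keep the Mimir item ID from the webhook payload for all subsequent calls
msg.mimir = { itemId: msg.payload.items[0].id };
// Provide the Mimir credentials (taken from the Space secrets) for the Mimir Auth node
msg.username = global.get("SECRETS.MIMIR_USERNAME");
msg.password = global.get("SECRETS.MIMIR_PASSWORD");
return msg;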

qibb nodes Error Handling

In almost every qibb node, you get to decide how errors are managed. You can select Standard mode, in which both successful calls and error messages are consolidated into a single output, or Separate, in which case successful calls are routed through the first output and error messages through the second.

Within the qibb node configuration, specifically in the General section, you can explicitly define the desired behavior, choosing between Standard or Separate error-handling mechanisms.

Error Handling

Upon successful authentication, the generated JWT will be set to msg.headers["x-mimir-cognito-id-token"]. This token is needed every time we want to fetch or send data to Mimir, which is why we save it under msg.mimir.token in the Save Auth Token node. By storing this token, we can reuse it seamlessly in subsequent operations within its one-hour expiration window.
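
As JavaScript, the Save Auth Token step is essentially just (a sketch):

JS
// Store the JWT so later nodes can put it back into the request headers
msg.mimir.token = msg.headers["x-mimir-cognito-id-token"];
return msg;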

The next node is Fetch Item Data, which fetches all data associated with the Mimir itemId we previously saved under msg.mimir.itemId. This returns a long JSON object containing all information about the item.

The Check Transcript switch node is configured to assess the presence of a timed transcript for a specific item. Upon fetching the item's data, you'll typically find a pre-signed URL under the timedTranscriptUrl property.
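
A possible switch rule, assuming the pre-signed URL is exposed directly on the fetched item data, is a JSONata existence check:

JS
/* JSONata rule for the first output: route here when a timed transcript exists */
$exists(payload.timedTranscriptUrl)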

If the item doesn’t have a transcript, we set the msg.payload to the following JSON object:

JSON
{
    "message": "Transcript not found!",
    "status": "error"
}

Upon receiving this payload, Mimir will briefly display the error message “Transcript not found!” in the upper right corner.

Mimir Error Message

If the timedTranscriptUrl property exists for the item, we will set the payload to:

JSON
{
    "message": "Rap this Video",
    "status": "success"
}

Similarly, Mimir will display a success message in the UI:

Mimir Success Message

This URL serves as a link to a JSON file containing all transcript data for the item, including start/end time and transcribed words, and has the following example format:

JSON
[
    {
        "content": "Alright,",
        "startTime": 220,
        "endTime": 690
    },
    {
        "content": "let's",
        "startTime": 740,
        "endTime": 1370
    },
    ...
]

The Fetch Transcript node is an HTTP request node that fetches this transcript and writes its data to msg.payload.

Once the transcript is retrieved, we need to concatenate the values of all content keys in the Concatenate the Transcript node:

JS
msg.mimir.transcript = msg.payload.map(item => item.content).join(' ');

This creates a msg.mimir.transcript property containing all the transcript words separated by spaces.

The Set Prompts node sets the following properties (see the sketch after this list):

  • openai.model - The OpenAI model to use; at the time of writing, gpt-4-1106-preview is the latest model

  • openai.userPrompt - The user prompt is the input provided by the end-user, typically in the form of a message or instruction.

  • openai.systemPrompt - The system prompt is an instruction or message that sets the behavior or context for the assistant's response

  • openai.inputTokenPrice - The price per input (prompt) token (reference: https://openai.com/pricing)

  • openai.outputTokenPrice - The price per output token (reference: https://openai.com/pricing)
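
A sketch of what this change node might set; the prompts and per-token prices below are placeholders, not necessarily the values shipped in rapGPT.json:

JS
// Example values only - adjust the prompts and per-token prices to your needs
msg.openai = {
    model: "gpt-4-1106-preview",
    systemPrompt: "You are a rapper. Turn the transcript you receive into short, catchy rap rhymes.",
    userPrompt: msg.mimir.transcript,
    inputTokenPrice: 0.00001,    // USD per prompt token (see https://openai.com/pricing)
    outputTokenPrice: 0.00003    // USD per completion token
};
return msg;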

Tokens

For NLP (Natural Language Processing) models and LLMs (Large Language Models) such as BERT (Bidirectional Encoder Representations from Transformers) or GPT (Generative Pre-trained Transformer) to comprehend human language, a crucial initial step is converting written words into numerical representations, since computers operate on binary data.

This initial step, known as tokenization, forms the foundation of NLP. Tokenization segments a given text into discrete units referred to as tokens, which can encompass both words and punctuation marks. These tokens are then transformed into numerical vectors that serve as mathematical representations of the words.
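
As a purely illustrative example (the exact split depends on the tokenizer used by the model):

JS
// "Alright, let's rap"  ->  ["Alright", ",", "let", "'s", "rap"]   (tokens: words and punctuation)
// Each token is then mapped to an integer ID and ultimately to a numerical vector the model can process.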

To make sense of these numerical values, we use a special type of computer program called a deep learning model, often a transformer. This model is trained using the numerical vectors obtained through tokenization, enabling it to understand the complexities of word meanings and their contextual relationships.

The ultimate objective is to equip NLP models with the capability to comprehend the semantics and connotations of various words and their contextual placement within sentences or texts. This, in turn, enhances the model's proficiency in understanding and processing human language.

That’s why many LLMs (Large Language Models) like ChatGPT have a strict limit on the number of tokens consumed by the prompt and by the model's output, as well as fixed pricing per token.

The next node is “Rap the Rhymes”, the OpenAI node that sends the defined body to OpenAI to request the generation of rap rhymes. The request body has the following shape:

JSON
{
   "model": openai.model,
   "messages": [
       {
           "role": "system",
           "content": openai.systemPrompt
       },
       {
           "role": "user",
           "content": openai.userPrompt
       }
   ]
}
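
For reference, the Chat Completions response that comes back (and that the flow later reads from openai.output.payload) roughly has this shape; the values below are placeholders:

JSON
{
   "id": "chatcmpl-...",
   "object": "chat.completion",
   "model": "gpt-4-1106-preview",
   "choices": [
       {
           "index": 0,
           "message": {
               "role": "assistant",
               "content": "Yo, check the mic, one two..."
           },
           "finish_reason": "stop"
       }
   ],
   "usage": {
       "prompt_tokens": 1000,
       "completion_tokens": 500,
       "total_tokens": 1500
   }
}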

In the Advanced section of the OpenAI node, we set the OpenAI API Key, as it is required. Note that the value points to a global variable called SECRETS.OPENAI_API_KEY, as described in the Set Secrets in qibb's Space section.

Chat Completions API
We use the /chat/completions endpoint to query OpenAI via the qibb OpenAI node, but if you are interested in the Chat Completions API itself, you can check this article.

OpenAI Node Body

Please note that the body of the POST /chat/completions call supports JSONata expressions, which is why, for example, openai.model there is replaced with the value of msg.openai.model, and so on.

The next node is a change node called Save the Rap Rhymes & Set Auth, which assigns the OpenAI output to msg.mimir.rapSummary and sets the X-Mimir-Cognito-ID-Token header needed for authentication against Mimir.

In the same node, we are also calculating the price using the following JSONata expression:

JS
"$" & (openai.inputTokenPrice * openai.output.payload.usage.prompt_tokens + openai.output.payload.usage.completion_tokens * openai.outputTokenPrice)

The same node also saves the usage information reported by OpenAI to msg.openai.usage.
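
As a worked example, assuming gpt-4-1106-preview pricing of $0.01 per 1K prompt tokens and $0.03 per 1K completion tokens (i.e. inputTokenPrice = 0.00001 and outputTokenPrice = 0.00003), a call that consumed 1,000 prompt tokens and 500 completion tokens evaluates to:

JS
/* 0.00001 * 1000 + 500 * 0.00003 = 0.01 + 0.015 = 0.025 */
"$" & (0.00001 * 1000 + 500 * 0.00003)    /* => "$0.025" */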

Next, we need to update the designated metadata field in Mimir with the rhymes generated by OpenAI. This is done using the Update Item node. Again, if you open the node, you can see that the body uses JSONata to dynamically populate the value of msg.mimir.rapSummary:

JSON
{
   "metadataDelta": {
       "formData": {
           "rapSummary": mimir.rapSummary
       }
   }
}

Finally, we display the complete message in the debug node and use a JSONata expression to format the node status (an emoji and the price in USD).

JS
"👍 " & price

OpenAI Price

The final result in Mimir looks like this:

OpenAI Generated Rap Rhymes
