Mimir & OpenAI Implementation Tutorial

Introduction

This tutorial will walk you through the process of creating a custom action in Mimir and integrating it with a corresponding flow in qibb. The objective is to extract a transcript and use ChatGPT to generate short rap rhymes from it, leveraging gpt-4-1106-preview, the latest and most advanced OpenAI model available at the time of writing this tutorial.

Tutorial Flow Source Code

To explore the nodes featured in this tutorial, simply download the attached JSON file below and import it into your flow app.

rapGPT.json

You can do this either by pressing Ctrl + I or by navigating to the main settings and selecting the Import option.

Tutorial

This guide and the rapGPT.json flow expect you to have access to:

  1. a qibb flow app with an HTTP In node accepting POST requests on a defined path (e.g. /webhooks/rap-gpt)

  2. Mimir admin rights

  3. qibb Space admin rights (needed for the Secrets management)

  4. active OpenAI API Key

Mimir Configuration

Custom Action Creation

To get started, configure the custom action in Mimir by navigating to System Settings → Integrations → Custom Actions.

Next, click the Add new button. A dialog box will appear; fill out the necessary details about the custom action:

Adding a Custom Action in Mimir

Make sure you give the action a Label; the Tooltip and Icon are optional but nice to have, as they improve the user experience. For URL, use the base URL of your designated qibb flow followed by the path you assigned to the HTTP In node in the qibb flow app.

After creating the Custom Action, navigate to the Video assets, select some assets, and press the right mouse button; you should see your Custom Action in the contextual menu.

RapGPT Custom Action

Metadata Field Creation

After creating a custom action, we need to create a custom metadata field. In Mimir, navigate to System Settings → Metadata → Model and press the plus (+) sign (Add a new field to the schema). This automatically adds a new field at the very bottom of the list, where you can define the field and its key. Select Field Type → Text. Optionally, you can mark the field as Read only, which ensures that its value can only be changed via an API call.

rapSummary Metadata Field’s Model Creation

When done, open the Item Page section, give the field a label, select Visible and Multiline, and optionally pick an icon for it from the drop-down list.

Metadata Field Creation - Item Page

Multiline Metadata Field

By default, string fields are single-line. However, if you enable the multiline checkbox, the field will automatically expand to accommodate multiple lines of text.

qibb Flow App Configuration

Install Mimir, OpenAI, and Secret Manager Nodes

The flow expects you to have the following three qibb nodes installed:

  1. Secret Manager (for populating the global context of your flow)

  2. Mimir (to be able to communicate with Mimir’s backend API)

  3. OpenAI (to send API requests to OpenAI)

Installing OpenAI qibb Node From the Flow Catalog

More information can be found here.

Set Secrets in qibb's Space

You can refer to this tutorial on how to add Mimir and OpenAI credentials to your Space.

The tutorial qibb flow (rapGPT.json) expects the following credentials to be configured in your Space:

  1. MIMIR_USERNAME - Mimir’s username

  2. MIMIR_PASSWORD - Mimir’s password

  3. MIMIR_BASE_URL - Mimir’s base URL (usually it is https://mimir.mjoll.no/)

  4. OPEN_AI_API_KEY - API Key for OpenAI (reference: https://platform.openai.com/docs/quickstart?context=python)

Global Variables Prefix

Please be aware that the Secret Manager adds a SECRETS. prefix in front of every key, so if you set a key named MIMIR_USERNAME in the Space, the global variable in your flow will be SECRETS.MIMIR_USERNAME.
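
As a minimal sketch (assuming qibb flows behave like Node-RED flows, where the global context is read with global.get in function nodes and $globalContext in JSONata fields), the secrets can be referenced like this:

JS
// Reading the Space secrets from the flow's global context (sketch; keys as set in the Space)
const mimirUser = global.get("SECRETS.MIMIR_USERNAME");
const mimirPassword = global.get("SECRETS.MIMIR_PASSWORD");
// In JSONata-capable node fields, the same value can be read with $globalContext("SECRETS.MIMIR_USERNAME").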

qibb Flow Explanation

Mimir sends a webhook POST request to the configured URL (/webhooks/rap-gpt) with the following payload:

JSON
{
   "items":[
      {
         "id":"c0b36642-4698-4669-ab2c-8a29a88964cf",
         "itemType":"video",
         "metadata":{
            "formId":"03b447e3-7876-46c8-b882-294378fdbc8b",
            "formData":{
               "default_title":"sample_video.mp4",
               "default_description":"Test description",
               "default_createdOn":"2023-10-30T12:34:10.822Z"
            }
         }
      }
   ]
}

where:

  • id - the unique item ID that Mimir assigns to each item

  • itemType - type of the item

  • formId - Mimir assigns a formId to each item; each formId can have different metadata fields, but one item cannot have more than one formId

  • default_title - the title of the asset that gets automatically generated by Mimir

  • default_description - the description of the item

  • default_createdOn - the date when the item was first created in Mimir

From this payload, we only care about the item ID, which is why we use the Fetch ItemId & Set Auth change node to save its value into msg.mimir.itemId. The same node also sets Mimir’s username and password, which are needed to retrieve a JWT (JSON Web Token) for the subsequent calls to Mimir via the Mimir Auth qibb node.
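
Expressed as plain JavaScript, the change node does roughly the following (a sketch; the exact credential properties the Mimir Auth node expects may differ):

JS
// Keep the Mimir item ID from the webhook payload for all subsequent calls
msg.mimir = { itemId: msg.payload.items[0].id };
// Provide the Mimir credentials (taken from the Space secrets) for the Mimir Auth node
msg.username = global.get("SECRETS.MIMIR_USERNAME");
msg.password = global.get("SECRETS.MIMIR_PASSWORD");
return msg;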

qibb nodes Error Handling

In almost every qibb node, you get to decide how errors are managed. You can select Standard mode, in which both successful calls and error messages are consolidated into a single output, or Separate, in which case successful calls are routed through the first output and error messages through the second.

Within the qibb node configuration, specifically in the General section, you can explicitly define the desired behavior, choosing between Standard or Separate error-handling mechanisms.

Error Handling

Upon successful authentication, the generated JWT will be set to msg.headers["x-mimir-cognito-id-token"]. This token is needed every time we want to fetch or send data to Mimir, which is why we save it under msg.mimir.token in the Save Auth Token node. By storing this token, we can reuse it seamlessly in subsequent operations within its one-hour expiration window.
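
As JavaScript, the Save Auth Token step is essentially just (a sketch):

JS
// Store the JWT so later nodes can put it back into the request headers
msg.mimir.token = msg.headers["x-mimir-cognito-id-token"];
return msg;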

The next node is Fetch Item Data, which fetches all data associated with the Mimir itemId we previously saved under msg.mimir.itemId. This returns a long JSON object containing all information about the item.

The Check Transcript switch node is configured to assess the presence of a timed transcript for a specific item. Upon fetching the item's data, you'll typically find a pre-signed URL under the timedTranscriptUrl property.
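
A possible switch rule, assuming the pre-signed URL is exposed directly on the fetched item data, is a JSONata existence check:

JS
/* JSONata rule for the first output: route here when a timed transcript exists */
$exists(payload.timedTranscriptUrl)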

If the item doesn’t have a transcript, we set the msg.payload to the following JSON object:

JSON
{
    "message": "Transcript not found!",
    "status": "error"
}

Upon receiving this payload, Mimir will briefly display the error message “Transcript not found!” in the upper right corner.

Mimir Error Message

If the timedTranscriptUrl property exists for the item, we will set the payload to:

JSON
{
    "message": "Rap this Video",
    "status": "success"
}

Similarly, Mimir will display a success message in the UI:

Mimir Success Message

This URL serves as a link to a JSON file containing all transcript data for the item, including start/end time and transcribed words, and has the following example format:

JSON
[
    {
        "content": "Alright,",
        "startTime": 220,
        "endTime": 690
    },
    {
        "content": "let's",
        "startTime": 740,
        "endTime": 1370
    },
    ...
]

The Fetch Transcript node is an HTTP request node that fetches this transcript and writes its data to msg.payload.

Once the transcript is retrieved, we need to concatenate the values of all content keys in the Concatenate the Transcript node:

JS
msg.mimir.transcript = msg.payload.map(item => item.content).join(' ');

This creates a msg.mimir.transcript property containing all the transcript words separated by spaces.

The Set Prompts node sets the following properties (see the sketch after this list):

  • openai.model - The OpenAI model to use; at the time of writing, gpt-4-1106-preview is the latest model

  • openai.userPrompt - The user prompt is the input provided by the end-user, typically in the form of a message or instruction.

  • openai.systemPrompt - The system prompt is an instruction or message that sets the behavior or context for the assistant's response

  • openai.inputTokenPrice - The price per input (prompt) token (reference: https://openai.com/pricing)

  • openai.outputTokenPrice - The price per output token (reference: https://openai.com/pricing)
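
A sketch of what this change node might set; the prompts and per-token prices below are placeholders, not necessarily the values shipped in rapGPT.json:

JS
// Example values only - adjust the prompts and per-token prices to your needs
msg.openai = {
    model: "gpt-4-1106-preview",
    systemPrompt: "You are a rapper. Turn the transcript you receive into short, catchy rap rhymes.",
    userPrompt: msg.mimir.transcript,
    inputTokenPrice: 0.00001,    // USD per prompt token (see https://openai.com/pricing)
    outputTokenPrice: 0.00003    // USD per completion token
};
return msg;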

Tokens

For NLP (Natural Language Processing) models and LLMs (Large Language Models) such as BERT (Bidirectional Encoder Representations from Transformers) or GPT (Generative Pre-trained Transformer) to comprehend human language, a crucial initial step is converting written words into numerical representations, since computers operate on binary data.

This initial step, known as tokenization, forms the foundation of NLP. Tokenization segments a given text into discrete units referred to as tokens, which can encompass both words and punctuation marks. These tokens are then transformed into numerical vectors that serve as mathematical representations of the words.
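
As a purely illustrative example (the exact split depends on the tokenizer used by the model):

JS
// "Alright, let's rap"  ->  ["Alright", ",", "let", "'s", "rap"]   (tokens: words and punctuation)
// Each token is then mapped to an integer ID and ultimately to a numerical vector the model can process.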

To make sense of these numerical values, we use a special type of computer program called a deep learning model, often a transformer. This model is trained using the numerical vectors obtained through tokenization, enabling it to understand the complexities of word meanings and their contextual relationships.

The ultimate objective is to equip NLP models with the capability to comprehend the semantics and connotations of various words and their contextual placement within sentences or texts. This, in turn, enhances the model's proficiency in understanding and processing human language.

That’s why many LLMs (Large Language Models) like ChatGPT have a strict limit on the number of tokens consumed by the prompt and by the model's output, as well as fixed pricing per token.

The next node is “Rap the Rhymes”, the OpenAI node that sends the defined body to OpenAI to request the generation of rap rhymes. The request body has the following shape:

JSON
{
   "model": openai.model,
   "messages": [
       {
           "role": "system",
           "content": openai.systemPrompt
       },
       {
           "role": "user",
           "content": openai.userPrompt
       }
   ]
}
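
For reference, the Chat Completions response that comes back (and that the flow later reads from openai.output.payload) roughly has this shape; the values below are placeholders:

JSON
{
   "id": "chatcmpl-...",
   "object": "chat.completion",
   "model": "gpt-4-1106-preview",
   "choices": [
       {
           "index": 0,
           "message": {
               "role": "assistant",
               "content": "Yo, check the mic, one two..."
           },
           "finish_reason": "stop"
       }
   ],
   "usage": {
       "prompt_tokens": 1000,
       "completion_tokens": 500,
       "total_tokens": 1500
   }
}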

In the Advanced section of the OpenAI node, we set the OpenAI API Key, as it is required. Note that the value points to a global variable called SECRETS.OPENAI_API_KEY, as described in the Set Secrets in qibb's Space section.

Chat Completions API
We use the /chat/completions endpoint to query OpenAI via the qibb OpenAI node, but if you are interested in the Chat Completions API itself, you can check this article.

OpenAI Node Body

Please note that the body of the POST /chat/completions call supports JSONata expressions, which is why, for example, openai.model there is replaced with the value of msg.openai.model, and so on.

The next node is a change node called Save the Rap Rhymes & Set Auth, which assigns the OpenAI output to msg.mimir.rapSummary and sets the X-Mimir-Cognito-ID-Token header needed for authentication against Mimir.

In the same node, we are also calculating the price using the following JSONata expression:

JS
"$" & (openai.inputTokenPrice * openai.output.payload.usage.prompt_tokens + openai.output.payload.usage.completion_tokens * openai.outputTokenPrice)

The same node also saves the usage information reported by OpenAI to msg.openai.usage.
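
As a worked example, assuming gpt-4-1106-preview pricing of $0.01 per 1K prompt tokens and $0.03 per 1K completion tokens (i.e. inputTokenPrice = 0.00001 and outputTokenPrice = 0.00003), a call that consumed 1,000 prompt tokens and 500 completion tokens evaluates to:

JS
/* 0.00001 * 1000 + 500 * 0.00003 = 0.01 + 0.015 = 0.025 */
"$" & (0.00001 * 1000 + 500 * 0.00003)    /* => "$0.025" */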

Next, we need to update the designated metadata field in Mimir with the rhymes generated by OpenAI. This is done using the Update Item node. Again, if you open the node, you can see that the body uses JSONata to dynamically populate the value of msg.mimir.rapSummary:

JSON
{
   "metadataDelta": {
       "formData": {
           "rapSummary": mimir.rapSummary
       }
   }
}

Finally, we display the complete message in the debug node and use a JSONata expression to format the node status (an emoji and the price in USD).

JS
"👍 " & price

OpenAI Price

The final result in Mimir looks like this:

OpenAI Generated Rap Rhymes
