Deploying Deep Learning models using Knowru

Trying to deploy Deep Learning models but cannot find a good tool for the job? In this post, see how machine learning engineers used Knowru to deploy their chatbot and video analysis Deep Learning models, with sample code included.

Thanks to increasingly easy-to-use interfaces, Deep Learning has become prevalent for analyzing irregular and complex data such as text, audio, images and video. When it comes to deploying Deep Learning models, however, machine learning (ML) engineers find themselves with few choices, because the models' complexity demands high-end hardware such as GPUs. Creating and maintaining a powerful environment where Deep Learning models can respond in real time is a daunting project. One would hope for a service similar to AWS Lambda to quickly turn models into APIs, but most such services come with many restrictions (e.g. no GPU support, very limited memory, limits on package size and processing time). Knowru is built for precisely this situation. With Knowru, ML engineers can choose the memory size and get a special environment with high-performance GPUs and the relevant packages built in. Furthermore, there is no restriction on long-running tasks. Here, we would like to share two use cases in which ML engineers used Knowru to deploy their models.

Deploying Chatbots With Knowru

Visa Yang is an ML engineer at a Chinese chatbot company that provides chatbot solutions to international corporations in China, such as Adidas and Hilton. Using Knowru, the team was able to deploy complex Seq2Seq (a Deep Learning algorithm) models in minutes:

In developing our chatbots, we used Seq2Seq algorithms based on RNN. We turned our models into APIs using Knowru. The APIs receive users’ chats and then return appropriate answers.

- Jaemin Kim, Data Scientist of Knowru

Chatbot whose Deep Learning models are deployed in Knowru

Visa gave three reasons for choosing Knowru as their deployment tool:

First of all, the deployment process is very straightforward and quick. Turning our models into APIs takes us less than 5 minutes. Second, Knowru comes with a full suite of management tools, from searching specific requests and responses to automated alarms, dashboards and tests. Using Knowru, we did not have to develop any tool to find our bugs or create separate graphs to understand our current situation. Lastly, despite our models’ complexity, their response times in Knowru are amazingly fast, usually less than 200 milliseconds per request. This is critical to us because our customers won’t wait longer than a second.

- Jaemin Kim, Data Scientist of Knowru

Deploying Emotion Analysis Models With Knowru

The team of Jaemin Kim, Knowru’s own data scientist, also used Knowru to deploy models built with TensorFlow, a popular Deep Learning package. Jaemin’s team has been developing an emotion analysis tool for Knowru’s AI Interview product, which allows companies to save the cost of on-boarding interviews by conducting remote, video-recorded interviews with candidates beforehand and to evaluate candidates’ technical skills more precisely, especially in the data science field. The emotion analysis tool his team is responsible for helps recruiters quickly understand many candidates’ attitudes and stress levels at a glance.

Detect emotion

I originally thought of using AWS Lambda to deploy our models. However, I later found out that the service restricts package size to 250MB. Our ML packages (TensorFlow and Keras) were still more than 360MB even when zipped. We had to look for other methods and realized that, fortunately, our existing product could satisfy our needs.

- Jaemin Kim, Data Scientist of Knowru

Error message from AWS Lambda on package size

TensorFlow package size exceeds the size AWS Lambda allows
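
A quick way to catch this kind of limit before uploading is to measure the package directory locally. The sketch below is only an illustration, not part of Jaemin's workflow: the helper names are hypothetical, the 250 figure is simply the limit quoted above, and 1 MB is assumed to mean 1024 × 1024 bytes.

```python
import os
import tempfile

PACKAGE_LIMIT_MB = 250  # the limit quoted in the article; adjust for your service


def directory_size_mb(path):
    """Sum the sizes of all regular files under path, in MB (1024 * 1024 bytes)."""
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):  # skip symlinks to avoid double counting
                total_bytes += os.path.getsize(file_path)
    return total_bytes / (1024 * 1024)


def fits_within_limit(path, limit_mb=PACKAGE_LIMIT_MB):
    """Return True if the directory's total size does not exceed limit_mb."""
    return directory_size_mb(path) <= limit_mb


# Demo on a throwaway directory holding one 2 MB dummy file.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, 'weights.bin'), 'wb') as f:
        f.write(b'\0' * (2 * 1024 * 1024))
    demo_size = directory_size_mb(demo_dir)
    demo_fits = fits_within_limit(demo_dir)

print(f'{demo_size:.1f} MB, fits: {demo_fits}')
```

Pointing `directory_size_mb` at the folder where your packages are installed gives a rough idea of whether a size-restricted service is an option at all.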

The final architecture his team adopted is interesting in that it uses both AWS Lambda and Knowru.

AWS Lambda could not deploy our models, but it could detect that a new video file had been saved in AWS S3. Knowru could deploy our models, but it could not detect newly stored files. In the end, we deployed our models in Knowru and used Lambda to call the Knowru APIs whenever a new file is stored in S3.

- Jaemin Kim, Data Scientist of Knowru

Initial Architecture

Final Architecture

For readers’ reference, Jaemin has shared some of his steps, written for Python 3.7, below:

Step 1. Deploying to Knowru

Here we assume that you have already created an ML model. To deploy your model to Knowru, you need to write a file named and prepare a list of packages in a file named requirements.txt, which can be easily generated with package management software such as pip.

import boto3
import json
import os

s3_client = boto3.client('s3')


# This function, run, is executed when a request comes in. The request's data is passed in as the input argument.
def run(data):
    bucket_name = data.get('bucket_name')
    key_name = data.get('key_name')
    video_filename = key_name.split('/')[-1]
    video_path = f'/tmp/{video_filename}'
    # Download the video from S3 to local storage.
    with open(video_path, 'wb') as f:
        s3_client.download_fileobj(bucket_name, key_name, f)
    # The analysis function is hidden on purpose. This is where your logic goes.
    output = analysis(video_path)
    output_path = '/tmp/output.json'
    # Assumption: the result is JSON-serializable and is written with json.dump.
    with open(output_path, 'w') as f:
        json.dump(output, f)
    # Upload the result next to the input, under the output/ prefix.
    s3_client.upload_file(output_path, bucket_name, f'output/{video_filename}.json')
    return {'output': output}
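
Before deploying, it can help to exercise a run-style function locally without touching AWS. The following is a minimal sketch under stated assumptions, not Knowru's or Jaemin's actual test setup: `FakeS3Client`, the stub `analysis` function and the extra `s3_client` and `workdir` parameters are all hypothetical and exist only so the example is self-contained.

```python
import json
import os
import tempfile


class FakeS3Client:
    """Hypothetical in-memory stand-in for the two boto3 calls used by run()."""

    def __init__(self, objects):
        self.objects = dict(objects)  # maps (bucket, key) to bytes

    def download_fileobj(self, bucket, key, fileobj):
        fileobj.write(self.objects[(bucket, key)])

    def upload_file(self, filename, bucket, key):
        with open(filename, 'rb') as f:
            self.objects[(bucket, key)] = f.read()


def analysis(video_path):
    # Stub for the real model: it only reports the file size in bytes.
    return {'size_bytes': os.path.getsize(video_path)}


def run(data, s3_client, workdir):
    # Same flow as the deployed function, with the client and directory injected.
    bucket_name = data['bucket_name']
    key_name = data['key_name']
    video_filename = key_name.split('/')[-1]
    video_path = os.path.join(workdir, video_filename)
    with open(video_path, 'wb') as f:
        s3_client.download_fileobj(bucket_name, key_name, f)
    output = analysis(video_path)
    output_path = os.path.join(workdir, 'output.json')
    with open(output_path, 'w') as f:
        json.dump(output, f)
    s3_client.upload_file(output_path, bucket_name, f'output/{video_filename}.json')
    return {'output': output}


def smoke_test():
    fake = FakeS3Client({('my-bucket', 'videos/interview.mp4'): b'12345'})
    request = {'bucket_name': 'my-bucket', 'key_name': 'videos/interview.mp4'}
    with tempfile.TemporaryDirectory() as workdir:
        result = run(request, fake, workdir)
    saved = json.loads(fake.objects[('my-bucket', 'output/interview.mp4.json')])
    return result, saved


print(smoke_test())
```

Injecting the client keeps the download, analyze and upload flow identical to the deployed version while letting the test run offline.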

Now let us deploy our model, which in Knowru terms means “creating a runnable”.

Creating a runnable in Knowru can be done in a single form

After creating the runnable, make sure to note its API endpoint URL and your own token on the Account page, which you can open by clicking the “Account” button at the top right.
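
To see how the endpoint URL and token fit together, the sketch below builds (but does not send) a request using the same URL suffix and header scheme as the Lambda function in Step 2. The function name, the placeholder URL and token, and the payload shape are assumptions for illustration only; check your runnable's page for the exact request format.

```python
import json
import urllib.request


def build_run_request(runnable_url, token, payload):
    """Build, without sending, a POST request to a runnable's run/ endpoint."""
    body = json.dumps(payload).encode('utf-8')
    return urllib.request.Request(
        url=f'{runnable_url}run/',
        data=body,
        headers={
            'Authorization': f'Token {token}',
            'Content-Type': 'application/json',
            'Accept': 'application/json',
        },
        method='POST',
    )


req = build_run_request(
    'https://example.invalid/runnable/1/',  # placeholder, not a real endpoint
    'my-secret-token',                      # placeholder token
    {'bucket_name': 'my-bucket', 'key_name': 'videos/interview.mp4'},
)
print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes it easy to inspect the URL and headers before any real call is made.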

Step 2. Preparing a Lambda function

This Lambda function is triggered when a new file is uploaded to the target S3 bucket, and it calls the Knowru API.

import json
import os

import boto3
import requests

s3_client = boto3.client('s3')

# Enter your runnable URL as an environment variable when creating this Lambda function
runnable_url = os.environ.get('MY_RUNNABLE_URL')
run_url = f'{runnable_url}run/'
# Enter your Knowru token as an environment variable when creating this Lambda function
token = 'Token {}'.format(os.environ.get('MY_KNOWRU_TOKEN'))
headers = {
    "Authorization": token,
    "Content-Type": "application/json",
    "Accept": "application/json"
}


def lambda_handler(event, context):
    # Read the bucket and object key from the S3 event that triggered this function
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    key_name = event['Records'][0]['s3']['object']['key']
    request_input = {
        "bucket_name": bucket_name,
        "key_name": key_name
    }
    # Assumption: a JSON POST to run_url; the exact payload shape may differ for your runnable
    r = requests.post(run_url, headers=headers, data=json.dumps(request_input))
    print('Request sent')
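
One detail worth knowing when reading the event: S3 URL-encodes object keys in its notifications, so a file name containing spaces or parentheses will not match the real key until it is decoded. The sketch below shows one way to handle this and to cover events with more than one record. The function name and sample event are hypothetical, and the event is trimmed to the fields used here.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return a (bucket, key) pair for every record in an S3 notification event."""
    pairs = []
    for record in event.get('Records', []):
        bucket = record['s3']['bucket']['name']
        # Keys arrive URL-encoded, with spaces sent as '+', so decode them first.
        key = unquote_plus(record['s3']['object']['key'])
        pairs.append((bucket, key))
    return pairs


# Trimmed-down sample event containing only the fields read above.
sample_event = {
    'Records': [
        {
            's3': {
                'bucket': {'name': 'my-bucket'},
                'object': {'key': 'videos/my+interview%281%29.mp4'},
            }
        }
    ]
}

print(extract_s3_objects(sample_event))
```

The decoded key is the one to pass on in the request so that the download inside the runnable finds the object.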

Step 3. Check AWS CloudWatch, Knowru Run List and AWS S3

Now, let us upload a file to the S3 bucket to trigger this Lambda function. You can see that the “Request sent” message appears in CloudWatch as a log message and that the expected output file is saved in S3.

Knowru also keeps a record of each request and response, which you can search at the key level.

Model request and response record in Knowru

So far, we have looked at how ML engineers use Knowru to deploy their models. They value Knowru for its strong support for Deep Learning models, its intuitive process and its quick response times. Looking for a way to deploy your Deep Learning models easily and quickly? Contact us to see how we can help you.