Markitdown-as-a-Service: from AI to production on Clever Cloud

Every day, new tools are released, AI brings new perspectives, and you have new ideas. One of Clever Cloud's missions is to help you develop and test them in real-life conditions, effortlessly, before making them available to everyone.

In this post, we’ll explore how our services can be used alongside AI-assisted code creation tools to kick-start an idea and bring it to life online. As an example, we use Markitdown, an MIT-licensed Python tool that converts a wide range of document formats into Markdown.

Our ambition is to build a conversion service available on Clever Cloud, with a website and a simple API. To get a first prototype up and running quickly, we’ll be using an AI-assisted editor called Cursor, but it could equally be JetBrains, VS Code, one of its other derivatives (Pear, Void, Windsurf) or even Zed.

How does Markitdown work?

Markitdown is a Python tool, so we’ll be using it in a Python application with the uv package manager. Built in Rust, uv is highly efficient and easy to use. In addition, it is natively available within the Clever Cloud Python image. To follow this example, you’ll need a Clever Cloud account, Clever Tools, git and uv.

Let’s start by initializing a project with Markitdown as its first dependency. uv automatically creates a git repository, along with a few additional files such as .gitignore (files and folders not to be tracked in the repository) and pyproject.toml for managing the Python project:

uv init markitdown-converter --no-readme
cd markitdown-converter
mv hello.py app.py # uv creates a hello.py by default, we rename it
uv add markitdown

The Markitdown documentation states that it’s easy to use:

from markitdown import MarkItDown

markitdown = MarkItDown()
result = markitdown.convert("test.xlsx")
print(result.text_content)

A mode also allows you to use an OpenAI model to convert an image into a Markdown description:

from markitdown import MarkItDown
from openai import OpenAI

client = OpenAI()
md = MarkItDown(mlm_client=client, mlm_model="gpt-4o")
result = md.convert("example.jpg")
print(result.text_content)

We add this information and the list of supported formats to a usage_instructions.md file, which we place at the root of our project. This will be used to give context to the AI, which won’t be familiar with Markitdown, as it has just been made available in open source. It’s now time to launch our editor in the current folder and run some initial tests.

AI helps you get started…

Our primary objective is to have a website that we can host on Clever Cloud. As our service follows most standards and requires very little customization, we don’t have any rules to provide to the AI at this first stage. We use the Cursor Composer feature, which makes it easy to pass instructions as context through a file, with a prompt to initiate a first version of the site:

Help me develop a website using Markitdown.
It converts documents into Markdown format. 
Its usage instructions are in the file 'usage_instructions.md'.
 
I need to be able to pass the file in the form of:
- A URL
- A file selected or dragged and dropped into the interface

Then get the result in a block of raw code.

Depending on your model (claude-3.5-sonnet in our tests), you’ll get a more or less convincing first result. In our case, it was a little too basic and purple, but successful overall:

… but must be guided

However, it soon becomes clear that the AI has made a number of small errors. For instance, in Markdown a single line break is not interpreted: the text continues on the same line as if nothing had happened. Markitdown, however, tends to break long lines after a certain number of characters. Our assistant didn’t anticipate this behavior, so we had to take it into account in the result display block.
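One possible way to handle such hard-wrapped output on the server side is to rejoin wrapped lines before display. The sketch below is a heuristic written for illustration, not the fix used in the project: the function name `unwrap_paragraphs` is hypothetical, and the rules are deliberately conservative (headings, list items, quotes, table rows, rules and fenced code are left untouched; indented code blocks and hard breaks marked by trailing spaces are not handled).

```python
import re

# Lines that start with these markers are structural in Markdown and are never merged.
_STRUCTURAL = re.compile(
    r"^\s*(#{1,6}\s|[-*+]\s|\d+[.)]\s|>|\||```|~~~|[-=*_]{3,}\s*$)"
)


def unwrap_paragraphs(markdown: str) -> str:
    """Join consecutive plain-text lines so each paragraph sits on one line."""
    out = []
    in_fence = False
    for line in markdown.splitlines():
        stripped = line.strip()
        # Toggle the fence state and copy fence delimiters as-is.
        if stripped.startswith(("```", "~~~")):
            in_fence = not in_fence
            out.append(line)
            continue
        mergeable = bool(
            not in_fence
            and stripped                       # current line is not blank
            and not _STRUCTURAL.match(line)    # current line is plain text
            and out
            and out[-1].strip()                # previous line is not blank
            and not _STRUCTURAL.match(out[-1]) # previous line is plain text
        )
        if mergeable:
            out[-1] = out[-1].rstrip() + " " + stripped
        else:
            out.append(line)
    return "\n".join(out)
```

A simpler alternative is to leave the text untouched and let the display block wrap it visually; which option fits best depends on how the result is meant to be reused.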

Once the result is convincing enough, we can start improving this “PoC” (Proof of Concept). To do so, we first added a button to copy the result to the clipboard, and then handled the OPENAI_API_KEY environment variable, which is required to use the OpenAI API. If it is set on the server, it is used. Otherwise, the user sees a field for entering a key.
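The key-selection rule described above can be sketched as follows. This is a minimal illustration rather than the project’s actual code: the function names `resolve_openai_key` and `needs_key_field`, and the `form_value` parameter, are hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_openai_key(
    form_value: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the OpenAI API key to use, or None if no key is available.

    The server-side OPENAI_API_KEY variable takes precedence; the value typed
    by the user (hypothetical `form_value`) is only a fallback.
    """
    env = os.environ if environ is None else environ
    server_key = env.get("OPENAI_API_KEY", "").strip()
    if server_key:
        return server_key
    user_key = (form_value or "").strip()
    return user_key or None


def needs_key_field(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Tell the page whether the key input field should be displayed."""
    env = os.environ if environ is None else environ
    return not env.get("OPENAI_API_KEY", "").strip()
```

Passing `environ` explicitly is only there to make the function easy to test; in the application it would simply read `os.environ`.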

We also asked our assistant to revise the site’s initial design, using dark blue tones, and then to add META tags for the page’s SEO. Here, it added links to icons and images… without providing them. So we turned to a third-party AI for that.

After all these improvements and fixes, some small bugs appeared, which we detected and corrected one by one. This is a good stage at which to generate and add unit tests, which help you check that a change hasn’t broken anything. Don’t forget to commit to your git repository after each major modification, so that you can return to a working state if a later change goes wrong.
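As an illustration of the kind of test worth adding at this stage, here is a minimal sketch using Python’s built-in unittest module. The helper `is_http_url` is hypothetical (it is not taken from the project); it stands in for any small function of the application, such as a check on the URL submitted by the user.

```python
import unittest
from urllib.parse import urlparse


def is_http_url(value: str) -> bool:
    """Hypothetical helper: accept only absolute http(s) URLs with a host."""
    parsed = urlparse(value.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


class IsHttpUrlTest(unittest.TestCase):
    def test_accepts_http_and_https(self):
        self.assertTrue(is_http_url("https://example.com/report.pdf"))
        self.assertTrue(is_http_url("http://example.com/report.pdf"))

    def test_rejects_other_schemes(self):
        self.assertFalse(is_http_url("ftp://example.com/report.pdf"))

    def test_rejects_relative_or_empty_values(self):
        self.assertFalse(is_http_url("example.com/report.pdf"))
        self.assertFalse(is_http_url(""))
```

Saved in a file such as test_app.py (the name is only a suggestion), these tests can be run with `python -m unittest` before each commit.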

On the way to our “final” site, we asked the AI to generate a few other features that may or may not have been included in the initial specifications: a loading message during conversion, submitting the URL by pressing the Enter key, and API capabilities with instructions added to the website. Think of it as an iterative process.

Then it’s time to get ready for “production”. Of course, in the context of a professional service, we would push error management and testing further, as well as the reliability of AI-generated code. For this demonstration, we simply asked it to use the ASGI uvicorn server, which led us to replace Flask with Quart, a web framework geared towards asynchronous use.
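For context, a command such as `uvicorn app:app` loads the module `app` and serves the ASGI callable named `app` found inside it. Quart exposes such a callable, whereas Flask is a WSGI framework. The sketch below shows the bare ASGI interface using only the standard library. It illustrates the protocol and is not the project’s code; the `/health` path and the response bodies are made up for the example.

```python
import asyncio


async def app(scope, receive, send):
    """Tiny ASGI application: answers 200 on /health and 404 elsewhere."""
    if scope["type"] != "http":
        return  # this sketch only deals with plain HTTP requests
    status = 200 if scope["path"] == "/health" else 404
    body = b"ok" if status == 200 else b"not found"
    await send({
        "type": "http.response.start",
        "status": status,
        "headers": [
            (b"content-type", b"text/plain"),
            (b"content-length", str(len(body)).encode()),
        ],
    })
    await send({"type": "http.response.body", "body": body})


async def call(path):
    """Drive the app by hand, as an ASGI server would, and collect its messages."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": path}, receive, send)
    return sent
```

In a real application the framework builds this callable for you; the point here is only to show what the server expects to find behind `app:app`.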

Here again, the AI’s work can be improved: it occasionally needs to be asked to clean up its code or organize it better, and it would sometimes be quicker to carry out certain steps directly. On the other hand, it is very useful for adding comments to existing code at scale, detecting and explaining a bug, or making changes across different files.

As usual, such tools complement developers’ work and enable more people to produce a functional result, but they remain no more than an aid. Our last test confirmed this, when we asked the AI how to deploy on Clever Cloud.

Clever deploy

Here, the model used provided pretty decent basic instructions. But Clever Tools have evolved considerably this year, with numerous improvements and simplifications. What’s more, we’ve natively integrated uv into the platform, which simplifies its use.

We’ve therefore manually reworked the instructions in the AI-generated README.md, which mainly cover the application’s configuration:

clever create -t python
clever env set QUART_ENV production
clever env set CC_PRE_BUILD_HOOK "uv sync"
# Python apps on Clever Cloud should listen on port 9000, not 8080
clever env set CC_RUN_COMMAND "uvicorn app:app --host 0.0.0.0 --port 9000 --workers 4"
clever env set OPENAI_API_KEY "<your_api_key>"  # Optional: if you want to provide your own OpenAI API key

Once everything is ready and your final code has been committed, simply push it onto Clever Cloud. It will then be deployed within our infrastructure, and automatically made available within one of our virtual machines. It will be accessible via a dedicated domain assigned by default with a certificate for HTTPS access, but you can also configure the domain of your choice.

clever deploy
clever open

How to test Markitdown-converter on your Clever Cloud account?

If you just want to test Markitdown-converter on your Clever Cloud account, you can fork it in your GitHub account and deploy it from the Console (remember to set environment variables), or in a terminal with a few commands via Clever Tools:

git clone https://github.com/CleverCloud/markitdown-converter
cd markitdown-converter

clever create -t python
clever env set QUART_ENV production
clever env set CC_PRE_BUILD_HOOK "uv sync"
clever env set CC_RUN_COMMAND "uvicorn app:app --host 0.0.0.0 --port 9000 --workers 4"

clever deploy
clever open
