Google T5 Translation as a Service with Just 7 Lines of Code¶

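What is T5? The Text-To-Text Transfer Transformer (T5) from Google treats every NLP task as text in, text out. Translation is one of the tasks it handles: you prefix the input with an instruction such as "translate English to German:" and the model returns the translated text.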

In this article, we will deploy the Google T5 model as a REST API service. Sounds difficult? What if I told you that you only need to write 7 lines of code?

Install Dependencies¶

HuggingFace¶

pip install "transformers[pytorch]"

If that doesn’t work, please visit the Installation page and check the official HuggingFace documentation.

Pinferencia¶

pip install "pinferencia[uvicorn]"

Define the Service¶

First, let’s create app.py to define the service:

app.py
from transformers import pipeline

from pinferencia import Server

t5 = pipeline(model="t5-base", tokenizer="t5-base")

def translate(text):
    return t5(text)

service = Server()
service.register(model_name="t5", model=translate)
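The service above passes the text to the pipeline unchanged, so the caller has to supply T5's task prefix, as the test request further down does with "translate English to German:". Here is a minimal sketch of a helper that builds that prefix. The names `build_prompt` and `SUPPORTED_TARGETS` are hypothetical and not part of transformers or Pinferencia, and the language list is an assumption based on the translation pairs the original T5 checkpoints were trained on; check the model card before relying on it.

```python
# Hypothetical helper: builds the task prefix T5 expects for translation.
# Assumption: the checkpoint translates from English to these languages only.
SUPPORTED_TARGETS = ("German", "French", "Romanian")


def build_prompt(text: str, target: str = "German", source: str = "English") -> str:
    """Return `text` prefixed with a T5-style translation instruction."""
    if source != "English" or target not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported language pair: {source} -> {target}")
    return f"translate {source} to {target}: {text.strip()}"


print(build_prompt("Good morning, my love."))
# prints: translate English to German: Good morning, my love.
```

You could call such a helper inside `translate` before handing the text to `t5`, at the cost of a few more lines than the 7 shown above.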

Start the Service¶

$ uvicorn app:service --reload
INFO:     Started server process [xxxxx]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)

Test the Service¶

Curl

curl -X 'POST' \
    'http://localhost:8000/v1/models/t5/predict' \
    -H 'accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
    "parameters": {},
    "data": "translate English to German: Good morning, my love."
}'

Result:

{
    "model_name": "t5",
    "data": [
        {
            "translation_text": "Guten Morgen, liebe Liebe."
        }
    ]
}
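The response wraps the pipeline output in a `data` list, with one dictionary per translation. As a rough sketch of how a client might pull out just the translated strings, the snippet below parses a body copied from the result above. The helper name `extract_translations` is made up for illustration.

```python
import json

# Sample body copied from the response shown above.
raw = """{
    "model_name": "t5",
    "data": [{"translation_text": "Guten Morgen, liebe Liebe."}]
}"""


def extract_translations(body: dict) -> list:
    """Collect every `translation_text` from the `data` list (hypothetical helper)."""
    return [item["translation_text"] for item in body.get("data", [])]


translations = extract_translations(json.loads(raw))
print(translations)
# prints: ['Guten Morgen, liebe Liebe.']
```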

Python requests

test.py
import requests

response = requests.post(
    url="http://localhost:8000/v1/models/t5/predict",
    json={
        "data": "translate English to German: Good morning, my love."
    },
)
print("Prediction:", response.json()["data"])

Run python test.py and it prints the result:

Prediction: [{'translation_text': 'Guten Morgen, liebe Liebe.'}]
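The model name in the URL has to match the name passed to `service.register`, so it can help to build the URL from that name instead of typing it by hand. The following is a sketch only, using the standard library rather than requests so it has no extra dependencies. `BASE_URL`, `predict_url`, `build_request`, and `translate` are hypothetical names, and the address assumes the uvicorn default shown earlier.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: the uvicorn default address


def predict_url(model_name: str, base_url: str = BASE_URL) -> str:
    """Build the predict endpoint for a registered model name."""
    return f"{base_url.rstrip('/')}/v1/models/{model_name}/predict"


def build_request(text: str, model_name: str = "t5") -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    payload = json.dumps({"data": text}).encode("utf-8")
    return urllib.request.Request(
        predict_url(model_name),
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text: str, timeout: float = 30.0) -> list:
    """Send the request and return the `data` field; needs the service running."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        return json.load(resp)["data"]
```

With the service running, `translate("translate English to German: Good morning, my love.")` should return the same `data` list as the curl example. The timeout is there because the first request can be slow while the model warms up.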

Even cooler, go to http://127.0.0.1:8000 and you will find an interactive UI, where you can send predict requests directly.