Create a Telegram Bot to get info from PyPi Server: Part I

Iuliia Volkova
8 min read · Jul 25, 2020

In this article:

BigQuery, pyTelegramBotAPI(telebot), Python 3.7, Poetry, Telegram(https://telegram.org/)

Link to the source code at the end of the article.

Intro

Hi, I love investigating new tools and looking at the different libraries that are created every day in the Python community.

I thought it would be cool to create a Telegram Bot that would collect this new information for me on a daily basis.

Also, pypistats.org has not been working well lately (the stats were not updated for several days), so I also want a tool that will quickly give me package stats: downloads for the last day, week, month, etc.

Here is what I want in the end. For the first task, I want a bot that takes my interests as themes (for example, data science, tests, web development, etc.) and sends me 5 new packages every day that it has not sent me previously. For each package I want to see: Description, Authors, Homepage. For the second task, I want to send the bot the command ‘/stats’ with a package name, and it must answer with the number of downloads for that package.

In this Part I we will build a simple synchronous Telegram Bot service that works with the API and obtains PyPi stats; in the next Parts I will improve it.

Let’s start with a simple version that will just publish short info about 5 random packages from PyPi.

To obtain information about packages we will use the public PyPI BigQuery dataset: https://packaging.python.org/guides/analyzing-pypi-package-downloads/#getting-set-up

Create new project

Let’s init a new project with Poetry (https://python-poetry.org/). Note: $ means the command is run in the console:

$ poetry new pypi_observer_bot

Enter our new project:

$ cd pypi_observer_bot

Next, add a dependency. To start, I will use the synchronous pyTelegramBotAPI (https://github.com/eternnoir/pyTelegramBotAPI) to create the Bot Server in Python:

$ poetry add pyTelegramBotAPI

Create Telegram bot & get authorization token

Before you start writing code, you need to create a bot in Telegram with BotFather (https://core.telegram.org/bots#6-botfather). Once you have your authorization token, we can continue.

I got the token:

create new Telegram Bot & got the auth token

And we can go ahead.

Create bot.py

Let’s first create a simple example as described in https://github.com/eternnoir/pyTelegramBotAPI#a-simple-echo-bot and test that everything works.

I created a ‘bot.py’ file with this content:

import telebot

bot = telebot.TeleBot("mytoken", parse_mode=None)


@bot.message_handler(commands=['start', 'help'])
def send_welcome(message):
    bot.reply_to(message, "Howdy, how are you doing?")


@bot.message_handler(func=lambda m: True)
def echo_all(message):
    bot.reply_to(message, message.text)


bot.polling()

Now run bot.py:

$ python bot.py

And open the chat with your bot (by the username that you chose when you created the bot with BotFather). Check that everything works:

check Bot works & answered well

Nice. Move on.

Send message to Telegram Chat without User Actions

Now I want the Bot to send me messages without any action on my side. For this, I need to use the method:

bot.send_message(chat_id=5421727806, text="hi, I'm a message from update")

How to get the Telegram chat id

But to use it, I need the chat_id: the unique id of the chat between me (or another user) and the Bot.

To do this, we just need to read the chat id in the handler that catches the first action, ‘start’ (by the way, this action is always used to start talking to a Bot, so it is the main entry point for a dialog with your Telegram Bot).

Let’s modify our send_welcome(message) to print chat.id. Let’s also print message.chat.__dict__ and message.__dict__ to see what other information we can obtain from the Message object:

@bot.message_handler(commands=['start', 'help'])
def send_welcome(message):
    print(message.chat.id)
    print(message.chat.__dict__)
    print(message.__dict__)
    bot.reply_to(message, "Howdy, how are you doing?")

And restart bot.py:

$ python bot.py

Send a ‘/start’ message to the bot, and in the console you will see the chat id, which you can use to send messages directly to your dialog with the bot. The output looks something like this:

5421727806
{'id': 5421727806, 'type': 'private', 'title': None, 'username': 'xnuinside', 'first_name': 'Iuliia', 'last_name': 'Volkova', 'all_members_are_administrators': None, 'photo': None, 'description': None, 'invite_link': None, 'pinned_message': None, 'permissions': None, 'slow_mode_delay': None, 'sticker_set_name': None, 'can_set_sticker_set': None}
{'content_type': 'text', 'message_id': 24, 'from_user': <telebot.types.User object at 0x10d327b90>, 'date': 1595680726, 'chat': <telebot.types.Chat object at 0x10d327f50>, 'forward_from': None, 'forward_from_chat': None, ... , 'json': {'message_id': 24, 'from':{'id': 5421727806, 'is_bot': False, 'first_name': 'Iuliia', 'last_name': 'Volkova', 'username': 'xnuinside', 'language_code': 'en'}, 'chat': {'id': 5421727806, 'first_name': 'Iuliia', 'last_name': 'Volkova', 'username': 'xnuinside', 'type': 'private'}, 'date': 1595680726, 'text': '/start', 'entities': [{'offset': 0, 'length': 6, 'type': 'bot_command'}]}}

That is a lot of information; you can investigate it later to see what else can be useful for you.

Okay, now we want to send messages to the chat without any action from the user, but bot.py calls bot.polling(), which means this script blocks while waiting for user actions.

Let’s rename ‘bot.py’ to ‘listener.py’ and use it for all logic that must react to user requests.

Create the informer.py

And let’s create ‘informer.py’, which will contain the logic for sending users messages with our PyPi packages every day.

In informer.py, let’s add a test action that sends a message to our chat. Use the chat id that you extracted from your message in the previous step.

import telebot

bot = telebot.TeleBot("api_token", parse_mode=None)
bot.send_message(chat_id=547123227806, text="hi, I'm a message from informer")

Now run script:

$ python informer.py

And check your chat with bot:

message from bot without user’s input

Great, all works.

Now we need to implement logic that will query the PyPi BigQuery dataset and:

1) send 5 random packages from it

2) send information about how many packages were downloaded from PyPi on the last day (distinct packages that have at least one download).

Let’s start with the simplest task, number 2.

The PyPi dataset is split into one table per date, and the query looks like this:

SELECT count(distinct(file.project)) as packages_number FROM `the-psf.pypi.downloads20200724`;

Here 20200724 is the date.

You can open BigQuery at https://console.cloud.google.com/bigquery?p=the-psf&d=pypi&page=dataset and test it.

To query BigQuery from Python we will use google-cloud-bigquery; let’s add it to the project:

$ poetry add google-cloud-bigquery

To start working with BigQuery, we also need to set the path to the credentials used for authentication. Let’s set the GOOGLE_APPLICATION_CREDENTIALS variable to that path; for me it will be:

import os
from datetime import datetime, timedelta  # used below to build the table date

import telebot
from google.cloud import bigquery

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "../pypi_observer_gcp_key.json"

If you don’t have a service account key file, see https://cloud.google.com/docs/authentication/getting-started to get one.

Now let’s add a function to run our query:

client = bigquery.Client()

def bq_get_unique_packages_downloaded_for_yesterday():
    query_job = client.query(
        f"SELECT count(distinct(file.project)) as packages_number FROM "
        f"`the-psf.pypi.downloads{(datetime.now().date() - timedelta(days=1)).isoformat().replace('-', '')}`;")

    results = query_job.result()
    results = [row for row in results]
    return results[-1].packages_number

(datetime.now().date() - timedelta(days=1)).isoformat().replace('-', '') returns yesterday’s date in the format 20200724.

And change bot.send_message to send this information to the chat:
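The same date suffix can also be produced with strftime, which avoids the isoformat/replace step. This is only a small sketch: the helper name yesterday_table_suffix and its optional today argument are my own additions for illustration and are not part of the article's code.

```python
from datetime import date, timedelta


def yesterday_table_suffix(today=None):
    """Return yesterday's date as YYYYMMDD, the suffix of the daily table name."""
    if today is None:
        today = date.today()
    return (today - timedelta(days=1)).strftime("%Y%m%d")


# A fixed date keeps the example reproducible.
print(yesterday_table_suffix(date(2020, 7, 25)))  # 20200724
```

Passing the date in explicitly also makes the helper easy to check without waiting for a particular day.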

bot.send_message(chat_id=547123227806,
                 text=f"Total unique packages from PyPi, that was downloaded yesterday: "
                      f"{bq_get_unique_packages_downloaded_for_yesterday()}")

Run informer:

$ python informer.py

And check chat with bot:

bot prints count of packages downloaded yesterday

Cool.

Now, let’s create a query that gets random packages from the packages downloaded that day.

This query will return the names of 7–20 random packages:

SELECT distinct(file.project) as package_name FROM `the-psf.pypi.downloads20200724` WHERE RAND() < 10/164656895;
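A note on the filter: RAND() < 10/N keeps each row with probability 10/N, so if N is close to the number of rows in the table, about 10 rows survive on average, and DISTINCT may then collapse duplicates. The article does not say where 164656895 comes from; the sketch below assumes it is roughly the row count of that day's table. The function name and parameters are hypothetical.

```python
def build_random_packages_query(table_suffix, total_rows, expected_rows=10):
    """Build a query that keeps each row with probability expected_rows / total_rows."""
    if not (len(table_suffix) == 8 and table_suffix.isdigit()):
        raise ValueError("table_suffix must look like YYYYMMDD")
    if total_rows <= 0 or expected_rows <= 0:
        raise ValueError("total_rows and expected_rows must be positive")
    return (
        "SELECT DISTINCT file.project AS package_name "
        f"FROM `the-psf.pypi.downloads{table_suffix}` "
        f"WHERE RAND() < {expected_rows}/{total_rows}"
    )


# Assumption: 164656895 is approximately the number of rows in the daily table.
print(build_random_packages_query("20200724", 164656895))
```

Because the sampling is random, the number of returned names varies from run to run, which matches the 7–20 range mentioned above.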

We will use this query only to get the package id (its name), and then call the PyPi API endpoint:

https://pypi.org/pypi/{package_id}/json

to obtain full information about the package.

Use, for example, GET https://pypi.org/pypi/pyyaml/json to get a sample of the PyPi API answer:

https://pypi.org/pypi/pyyaml/json answer example
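From that answer we only need a few fields. The sketch below pulls the name, summary, author and home page out of the "info" object of the JSON. The helper names are hypothetical, and the sample payload is hand-written for illustration, not a captured response.

```python
def pypi_json_url(package_id):
    """Return the PyPi JSON API url for a package."""
    return f"https://pypi.org/pypi/{package_id}/json"


def extract_package_info(payload):
    """Pick the fields we want to show from a decoded PyPi JSON answer."""
    info = payload.get("info", {})
    return {
        "name": info.get("name"),
        "summary": info.get("summary"),
        "author": info.get("author"),
        "home_page": info.get("home_page"),
    }


# Hand-written stand-in for a decoded API answer (illustrative values only).
sample_payload = {
    "info": {
        "name": "example-package",
        "summary": "An example package",
        "author": "Jane Doe",
        "home_page": "https://example.org/",
        "version": "0.1.0",
    }
}
print(extract_package_info(sample_payload))
```

Missing fields come back as None, so the caller can decide what to show when a package does not fill them in.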

Great, next create the logic:

  1. Call the query to get a list of random packages
  2. Choose one package from the list and send it to the chat with some time delta (we don’t want to get all 5 packages one right after another; I want to receive them during the day: in the morning, at lunch time, in the evening)
  3. Call the PyPi API to get more information about the package

Call a query with a list of random packages

Let’s add to the imports:

from random import choice

And create a function to run the query and return random package names:

def bq_get_random_packages_downloaded_for_yesterday():
    # Yesterday's date as YYYYMMDD selects the daily table.
    yesterday = (datetime.now().date() - timedelta(days=1)).isoformat().replace('-', '')
    query_job = client.query(
        "SELECT distinct(file.project) as package_name FROM "
        f"`the-psf.pypi.downloads{yesterday}` "
        "WHERE RAND() < 10/164656895;")
    results = query_job.result()
    results = [row.package_name for row in results]
    return results

And let’s test it by adding one more send_message call with a random package name:

bot.send_message(chat_id=547123227806,
                 text=f"Random Package from PyPi: \n"
                      f"{choice(bq_get_random_packages_downloaded_for_yesterday())}")

Run script again and check it:

random package name from pypi

Great. Let’s modify the message a little to also include the url of the package page on pypi.org: https://pypi.org/project/{package_id}/

bot.send_message(chat_id=54727806,
                 text=f"Random Package from PyPi: \n"
                      f"https://pypi.org/project/{choice(bq_get_random_packages_downloaded_for_yesterday())}/")

And now I got:

send package page to Telegram Bot

Choose one package from the list and send it to the chat with some timedelta

In a primitive way we can do something like this:

from time import sleep  # add this line to imports

date_ = datetime.now().date()
while True:
    if date_ <= datetime.now().date():
        for i in range(5):
            bot.send_message(chat_id=54727806,
                             text=f"Random Package from PyPi: \n"
                                  f"https://pypi.org/project/{choice(bq_get_random_packages_downloaded_for_yesterday())}/")
            sleep(3)
            if i == 4:
                date_ = datetime.now().date() + timedelta(days=1)

What am I doing here? I set the variable date_ at the start equal to the current date, and with a 3-second pause between messages I send a new message with a new package to Telegram.

Once 5 messages have been sent, we change date_ to tomorrow, so new packages will be sent only tomorrow.

“while True” means the loop runs forever, until our script is killed from outside or stopped manually.

We will leave this part of the code as it is and add a proper implementation of scheduled tasks later.
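Until then, here is one way the "morning, lunch time, evening" idea could be sketched: compute the next send time from a fixed list of slots and sleep until it arrives. The slot times (9:00, 13:00, 19:00) and the function name are assumptions chosen for illustration; the article does not specify them.

```python
from datetime import datetime, time, timedelta

# Hypothetical send slots: morning, lunch time, evening.
SEND_SLOTS = (time(9, 0), time(13, 0), time(19, 0))


def next_send_time(now, slots=SEND_SLOTS):
    """Return the first slot strictly after `now`, rolling over to tomorrow."""
    for slot in sorted(slots):
        candidate = datetime.combine(now.date(), slot)
        if candidate > now:
            return candidate
    # All of today's slots have passed: use the earliest slot tomorrow.
    return datetime.combine(now.date() + timedelta(days=1), min(slots))


now = datetime(2020, 7, 25, 10, 30)
print(next_send_time(now))  # 2020-07-25 13:00:00
```

In the loop above, sleep((next_send_time(now) - now).total_seconds()) could replace the fixed 3-second pause, so the script sleeps instead of spinning while it waits for the next slot.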

Populate data about package from PyPI

We will use requests to make the API call:

$ poetry add requests

Docs about text formatting in Telegram: https://core.telegram.org/api/entities

And let’s change our bot’s send_message call. We also need to add parse_mode='html' so that the text formatting works in the message.
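As an illustration of the formatting step, the sketch below builds an HTML message from a dict of package fields. The function name and the dict keys (name, summary, author, home_page) are assumptions for this example, and this is not necessarily the code used in the article's repository. Values are escaped because Telegram's HTML mode treats <, > and & as markup.

```python
from html import escape


def format_package_message(info):
    """Build a Telegram HTML message from a dict of package fields."""
    name = escape(info.get("name") or "unknown")
    lines = [f"<b>{name}</b>"]
    if info.get("summary"):
        lines.append(f"<i>{escape(info['summary'])}</i>")
    if info.get("author"):
        lines.append(f"Author: {escape(info['author'])}")
    if info.get("home_page"):
        # escape() also escapes quotes, so the url is safe inside href="...".
        lines.append(f'<a href="{escape(info["home_page"])}">Homepage</a>')
    return "\n".join(lines)


# Illustrative values only.
demo = {"name": "demo<pkg>", "summary": "A & B", "author": None,
        "home_page": "https://example.org/"}
print(format_package_message(demo))
```

The resulting string could then be passed as text to bot.send_message together with parse_mode='html'.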

Rerun informer.py and check the result:

formatted bot message

Not ideal, but I hope you got the idea.

Add ‘/stats’ command to listener.py

Now let’s return to our ‘listener.py’ file and add a handler for the ‘/stats’ command. We will extract the package name from the text message and request stats for the last 3 days from PyPi.

First, create a function that will query info from BigQuery (in listener.py):

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "../path/to/your/key"
client = bigquery.Client()


def bq_get_downloads_stats_for_package(package_name, date_):
    # package_name is inserted into the SQL text, so validate it before calling.
    query_job = client.query(f"SELECT count(timestamp) as downloads FROM `the-psf.pypi.downloads{date_}` "
                             f"WHERE file.project='{package_name}'")

    results = query_job.result()
    result = [row for row in results][-1]

    return result.downloads

Second, add a handler for the bot’s ‘/stats’ command.
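The parsing part of such a handler can be sketched without any Telegram code. The helpers below are hypothetical: parse_stats_command extracts the package name from the message text and rejects anything that does not match the package name pattern from PEP 508 (which also keeps quotes out of the SQL string), and last_days_suffixes returns table suffixes for the three days ending yesterday, which is my reading of "last 3 days".

```python
import re
from datetime import date, timedelta

# Package name pattern from PEP 508 (letters, digits, '.', '_', '-').
NAME_RE = re.compile(r"^([A-Z0-9]|[A-Z0-9][A-Z0-9._-]*[A-Z0-9])$", re.IGNORECASE)


def parse_stats_command(text):
    """Return the package name from '/stats <name>', or None if it is invalid."""
    parts = text.split()
    if len(parts) != 2 or not NAME_RE.match(parts[1]):
        return None
    return parts[1]


def last_days_suffixes(today, days=3):
    """Return YYYYMMDD suffixes for the `days` days before `today`."""
    return [(today - timedelta(days=n)).strftime("%Y%m%d")
            for n in range(1, days + 1)]


print(parse_stats_command("/stats pyyaml"))    # pyyaml
print(last_days_suffixes(date(2020, 7, 25)))   # ['20200724', '20200723', '20200722']
```

Inside a function registered with @bot.message_handler(commands=['stats']), the name from message.text and each suffix could be passed to bq_get_downloads_stats_for_package, with the results sent back through bot.reply_to.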

Re-run listener:

$ python listener.py

Check results:

pypi stats for package downloads

Cool, now you can get info about the downloads of your Python package from the PyPi stats.

You can find the source code here: https://github.com/xnuinside/pypi_observer_bot/tree/v0.0.1

In the next Part: moving ‘informer’ and ‘listener’ to asynchronous rails, saving information about Users in a DB (to avoid hardcoding chat_id), and other improvements.
