Getting Started with Cloud Computing – A COVID-19 Data Map

1. Abstract

Are you searching for country-specific, up-to-date numbers and rates for the global pandemic caused by COVID-19? Well, then I have some bad news for you: you won’t find any in this blog post, at least not directly. If, on the other hand, you are looking for in-depth information about public APIs, location-based data visualization or cloud-based Node.js web applications, I might just be the right guy to help you out.

After reading this post, you will not only know more about these topics, but you will also learn about the challenges and problems I faced while working on this project, and what to look out for when starting a web application from scratch all by yourself.

2. Introduction

Final product

This project is my examination submission for the lecture “Software Development for Cloud Computing”. The lecture covers modern cloud computing technologies and cloud services like AWS, Microsoft Azure and IBM Cloud, how to develop software using these technologies, and how to create a cloud-based project from scratch.

At first, I wasn’t quite sure what I was going to work on, but I really wanted to build a web application that uses location-based data in the form of a map to visualize something else. This “something” was yet to be determined when I started brainstorming for my project, so I started to do some research. I then came across a collection of free and public APIs, which was perfect for my undertaking. I was almost set on creating a weather app when I found a free API that would provide me with global data on the coronavirus.

Now that I knew what I was going to visualize, I came up with a personal scope for this project. I decided to create a web application that would deliver COVID-19 data for a specific country when the user hovers over or clicks on this country, as well as a search function, so that the user could jump to a country of choice by entering the name of a city or a country. Since I had only very limited knowledge about web applications and cloud computing as such (I had worked a bit with Microsoft Azure during my six-month internship at Daimler, but never really worked with Node.js or a map library), I did some research first, but I was very confident that I could reach this goal.

3. More Research

Now that I had determined what I was planning to do, I had to figure out which tools and cloud technologies I was going to use. Since I already had a little experience with Microsoft Azure, it seemed obvious to settle on Azure and the Azure Maps service for my project. But there were a couple of problems with that:

Problem 1: In order to create a private Azure Account, even an education account, one has to provide a credit card, which I do not own.

Problem 2: There is no map material in Azure Maps for the regions China and South Korea. That isn’t technically a knockout criterion, but I would prefer to use a service that supports all regions to avoid limitations.

Problem 3: Again, this isn’t a huge problem, but I would rather learn something new and not go with something I already had prior experience in.

So I decided to go with AWS, Amazon’s cloud service, instead. Even though in retrospect the documentation for AWS is not as good as for Microsoft Azure (at least in my personal opinion), AWS offers a wide range of services, and on top of that you can create a free education account with $100 worth of credits. Unfortunately, from what I could figure out, AWS does not have a location data service, so I had to decide on an external service.

For map data services I decided to go with Mapbox. Mapbox GL JS is an open-source JavaScript library that uses WebGL to render interactive maps for websites and mobile apps. The advantage Mapbox has over Azure Maps is that it offers all the services I require for my project for free and also covers every region without restriction. Upon creating a free account, a user gets an access token that grants access to all Mapbox services, including Mapbox Studio and the Mapbox Geocoding API, which I will cover in more detail later on.

4. But how do I get access to data from the internet?

https://www.wrike.com/de/blog/programmierschnittstelle-api-erklaert/

As I mentioned earlier, I stumbled upon a public web API called covid19api, which offers all sorts of corona-related, up-to-date data for free. In the abstract I promised in-depth information about public APIs, so I might as well say a few words about how Application Programming Interfaces work while we’re at it. An API is a software-to-software interface, not a user-to-software interface.

HTTP-Request for COVID-19 data for Germany between the 20.09 and 21.09
Response to the HTTP-Request above

A good metaphor for understanding the concept of an API is a waiter in a restaurant. The waiter (API) takes an order (HTTP request) from a guest (user) and delivers it to the kitchen (backend system), where the order is acknowledged and prepared. When everything is ready, the waiter (API) serves the food (HTTP response) to the guest (user). Some companies publish their APIs so that programmers can use their service to create different products. While some companies provide their APIs for free, others charge a subscription fee. In the case of the COVID-19-API there is a free tier as well as $10, $30 and $100 subscription options. Subscribers get access to additional data routes and are not subject to a request-rate limit. The latter is what led me to subscribe, because my application requires several requests per second.

5. Architecture

Basic architecture of my web application hosted in AWS

Let’s take a step back and focus on the solution I came up with for my project. The architecture of my web application is pretty straightforward. Clients can access a frontend via their browser. If a client hovers over, clicks on or searches for a country, an HTTP request is sent to the backend server, which evaluates that request and sends another HTTP request to either the COVID-19-API or the Mapbox search API, depending on what the client requested. Upon receiving an HTTP response from one of these APIs, my backend server evaluates the data for the respective user request and sends it back to the frontend, where it is visualized. I will go into more depth on these topics later on, but first I want to explain why having a separate frontend and backend makes sense.

Pros for having a separate front and backend:

  1. It’s far easier to distinguish between a frontend or backend issue, in case of a bug
  2. Possibility to upgrade either one without touching the other as they run on different instances (modularity)
  3. Allows use of different languages for front- and backend without problem
  4. Two developers could work on each end individually without causing deployment conflicts, etc.
  5. Adds security, because the server is not tightly bound to the client
  6. Adds level of abstraction to the application

Cons for having a separate front and backend:

  1. Have to pay for two cloud instances instead of just one
  2. Independent testing, building and deployment strategies required
  3. Can’t use templating languages anymore, instead the backend is basically an API for the frontend

6. Frontend

More detailed architecture for the frontend of my application (Note: the Node Server is not part of the frontend, it just receives requests)
How to implement mapbox to your HTML-website

The frontend of my application consists of a static HTML website that is hosted on an AWS EC2 Linux instance. The EC2 instance gets its data from an S3 bucket that is also hosted in AWS and contains up-to-date code for the website. The implementation of Mapbox is very straightforward. All you have to do is include the Mapbox GL JS script and stylesheet from the Mapbox CDN (Content Delivery Network) in the head and add the code shown above, with a valid access token, to the body of your HTML. The “style” option lets you select from different map styles, such as streets, satellite, etc. Users can create custom map styles, tilesets and datasets using Mapbox Studio. The big benefit of this is that you do not have to store the data on your own server and load it manually. Instead you can simply upload a style, tileset or dataset to Mapbox Studio and access it from the HTML by creating a new data source with the respective URL.
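The embedding step described above could look roughly like the following sketch. It assumes the Mapbox GL JS script and stylesheet are already loaded from the CDN, so that a global `mapboxgl` object exists. The container id `map`, the start position and the chosen style URL are illustrative assumptions, not values taken from the project.

```javascript
// Minimal sketch: create a Mapbox GL JS map inside <div id="map"></div>.
// `mapboxgl` is passed in as a parameter so the function can be tried outside
// a browser; on a real page you would pass the global object from the CDN.
function createMap(mapboxgl, accessToken) {
  mapboxgl.accessToken = accessToken; // access token from your Mapbox account
  return new mapboxgl.Map({
    container: "map", // assumed id of the HTML element that holds the map
    style: "mapbox://styles/mapbox/streets-v11", // one of the predefined styles
    center: [10.45, 51.16], // [longitude, latitude], roughly Germany (assumption)
    zoom: 3,
  });
}
```

On a page this would be called once as `createMap(mapboxgl, "<your access token>")`; swapping the `style` value switches between streets, satellite or a custom style from Mapbox Studio.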

Tileset made from GeoJSON in Mapbox Studio

In my case I created a custom tileset from a GeoJSON file of every country in the world. You can find geographical GeoJSON data for free online. I personally found a handy web tool that lets the user create a fairly accurate GeoJSON from countries of choice. But I encountered a problem in doing so. Even though I had fairly accurate geographical data for each country, the COVID-19-API does not support every single country. By sending a request to the COVID-19-API I got a list of all supported countries with their respective country slug and ISO2 country code. Since those country codes are unique, I wrote a basic algorithm that would craft a custom GeoJSON from all matching country codes of both the GeoJSON and the country JSON response.

How to get list of supported countries from COVID-19-API

Unfortunately, not everything was that easy, because for some reason not every object in the GeoJSON had a valid ISO2 code. So I had to manually go through all countries of both files and figure out which ones were missing, which was a real pain in the backside. Eventually I had a simple GeoJSON with a FeatureCollection containing a name, a unique slug, an ISO2 code and a geometry for each country, which I then uploaded to Mapbox Studio as a custom tileset.
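The matching step can be sketched as follows. This is not the original algorithm, just one way to do it under a few assumptions: the world GeoJSON stores its country code in a property called `iso2` (a hypothetical name), and the API's country list consists of objects with `Country`, `Slug` and `ISO2` fields. Countries without a match are collected separately so they can be reviewed by hand.

```javascript
// Sketch: keep only the countries that the API supports, matched by ISO2 code.
function buildSupportedGeoJson(worldGeoJson, apiCountries) {
  // Index the API's countries by ISO2 code for quick lookups.
  const byIso2 = new Map(apiCountries.map((country) => [country.ISO2, country]));
  const features = [];
  const unmatched = []; // names of GeoJSON countries without a matching code

  for (const feature of worldGeoJson.features) {
    const match = byIso2.get(feature.properties.iso2); // assumed property name
    if (!match) {
      unmatched.push(feature.properties.name);
      continue;
    }
    features.push({
      type: "Feature",
      properties: { name: match.Country, slug: match.Slug, iso2: match.ISO2 },
      geometry: feature.geometry,
    });
  }

  return { geojson: { type: "FeatureCollection", features }, unmatched };
}
```

The `unmatched` list is what makes the manual clean-up manageable: it names exactly the entries whose ISO2 code is missing or invalid.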

How to implement and visualize Mapbox Studio tileset in frontend JavaScript

Now that my tileset was uploaded to Mapbox Studio, I was able to create a data source and a style layer from it. This allowed me to customize the appearance of the tileset’s polygons to my liking. By using Mapbox’s map.on() function, I could add hover and click events that fire when the client hovers over or clicks on a country, and retrieve information from the tileset for this specific country (feature). In my case I get the slug of the country the user has clicked or is currently hovering over and send an HTTP request to the backend server with this information and the current and previous date. Hovering returns basic COVID-19 data for a country, while clicking returns premium data.
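A minimal sketch of the event wiring described above is shown below. It relies on the `map.on(type, layerId, listener)` form of Mapbox GL JS, where the event object carries the features under the pointer. The layer id and the `onCountry` callback are hypothetical names, and the sketch assumes each feature has a `slug` property as in the custom tileset.

```javascript
// Sketch: forward hover and click events on the country layer to a callback.
function registerCountryEvents(map, layerId, onCountry) {
  // Fires while the pointer moves over a polygon of the given layer.
  map.on("mousemove", layerId, (event) => {
    if (event.features && event.features.length > 0) {
      onCountry("hover", event.features[0].properties.slug);
    }
  });

  // Fires when the user clicks a polygon of the given layer.
  map.on("click", layerId, (event) => {
    if (event.features && event.features.length > 0) {
      onCountry("click", event.features[0].properties.slug);
    }
  });
}
```

The callback would then decide which backend route to call, for example the basic route for `"hover"` and the premium route for `"click"`.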

6.1 COVID-19 Data Request (Frontend)

The request is sent using fetch(), which is part of the browser’s Fetch API. The body of the POST request contains the slug of the country you want to get COVID-19 data for, the current date and the date of the day before. This information is needed for the backend’s request to the COVID-19-API in order to get the latest corona-related data.
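The request described above could be assembled like this. The sketch is an illustration only: the body field names `slug`, `from` and `to`, the date format and the backend URL passed to `requestCovidData` are assumptions, since the post does not show them.

```javascript
// Sketch: build the fetch options for the POST request to the backend.
function buildCovidRequest(slug, now = new Date()) {
  const dayInMs = 24 * 60 * 60 * 1000;
  const toIsoDay = (date) => date.toISOString().slice(0, 10); // "YYYY-MM-DD" (UTC)
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      slug, // country slug taken from the tileset feature
      from: toIsoDay(new Date(now.getTime() - dayInMs)), // the day before
      to: toIsoDay(now), // the current day
    }),
  };
}

// Sends the request and returns the parsed JSON answer of the backend.
// `fetchImpl` defaults to the global fetch and can be replaced for testing.
async function requestCovidData(backendUrl, slug, fetchImpl = fetch) {
  const response = await fetchImpl(backendUrl, buildCovidRequest(slug));
  if (!response.ok) {
    throw new Error(`Backend responded with status ${response.status}`);
  }
  return response.json();
}
```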

After receiving a response from my backend in the form of a JSON object, the data is added to an empty <ul> element in the HTML, where it is then visible to the client.

Client searched for Berlin and Mapbox flew to the exact location

6.2 Search Request (Frontend)

The search function works very similarly to the COVID-19 data request described above, but instead of sending dates and a country slug from the tileset, we send a query. This query is the text that the client enters into the search bar. Upon starting a search, a fetch POST request containing the query in its body is sent to the backend. The backend responds with information about the first point of interest (POI) the Mapbox geocoder could find, and as long as the query was valid, we jump to the location of that POI. This “jump” is handled by the Mapbox fitBounds() function, which tries to fit a POI’s bounding box perfectly on the user’s screen.
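The jump itself can be sketched as below. It assumes the backend passes through a Mapbox geocoding feature, which carries a `center` of `[longitude, latitude]` and, for areas such as countries or cities, a `bbox` of `[west, south, east, north]`. The fallback to `flyTo()` for results without a bounding box, the padding and the zoom level are my own additions.

```javascript
// Sketch: move the map to a geocoding result; returns false for invalid queries.
function jumpToResult(map, feature) {
  if (!feature) return false; // the geocoder found nothing for this query

  if (feature.bbox) {
    const [west, south, east, north] = feature.bbox;
    // fitBounds expects the south-west and north-east corners.
    map.fitBounds([[west, south], [east, north]], { padding: 20 });
  } else {
    // Point-like results have no bounding box, so fly to the center instead.
    map.flyTo({ center: feature.center, zoom: 10 });
  }
  return true;
}
```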

7. Backend

More detailed architecture for the backend of my application (Note: the Amazon EC2 instance is not part of the backend, it just sends requests)

The backend consists of a single Node.js Express server that is hosted in an Elastic Beanstalk instance on AWS. I also added a CI/CD pipeline with AWS CodePipeline that connects the instance to a GitHub repository, so that updates are deployed continuously. Since I decided to separate my frontend and backend, the backend server behaves much more like an API for my frontend.

7.1 COVID-19 Data Request (Backend)

Express route for basic COVID-19 data

Whenever an HTTP request for one of the corona-related server routes arrives, the server passes the request body to a function and executes this function. Upon execution, the backend sends another HTTP request to the COVID-19-API with the country slug and the current and previous date as parameters and the API access token as a header. This request is sent using the request-promise npm dependency.
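The outgoing request could be put together as in the following sketch. The base URL, the `/country/{slug}` route with `from` and `to` query parameters and the `X-Access-Token` header name are assumptions about the COVID-19-API and should be checked against its documentation. The helper only builds the request description, which could then be handed to request-promise or any other HTTP client.

```javascript
// Sketch: describe the request that the backend sends to the COVID-19-API.
function buildCovidApiRequest(slug, from, to, accessToken) {
  const baseUrl = "https://api.covid19api.com"; // assumed base URL
  const params = new URLSearchParams({ from, to }); // previous and current date
  return {
    url: `${baseUrl}/country/${encodeURIComponent(slug)}?${params}`,
    headers: { "X-Access-Token": accessToken }, // assumed header name
  };
}
```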

The COVID-19-API’s response contains specific, corona-related data for the requested country. This data has to be evaluated and adapted to make sure the backend only responds with data that is needed and correctly formatted. Formatting matters because large integers are difficult to read without a dot after every three digits. After evaluation, the data is sent back to the frontend, where it is displayed.
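The formatting step might look like this. The sketch inserts a dot as thousands separator with a regular expression and keeps only a few fields; the field names `Confirmed`, `Deaths`, `Recovered` and `Active` are assumptions about the API's response and may differ from what the project actually used.

```javascript
// Sketch: insert a dot after every three digits, e.g. 1234567 becomes "1.234.567".
function formatThousands(value) {
  return String(Math.trunc(value)).replace(/\B(?=(\d{3})+(?!\d))/g, ".");
}

// Keep only the fields the frontend needs and format them for display.
function formatCovidEntry(entry) {
  return {
    confirmed: formatThousands(entry.Confirmed),
    deaths: formatThousands(entry.Deaths),
    recovered: formatThousands(entry.Recovered),
    active: formatThousands(entry.Active),
  };
}
```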

Backend function that sends a request to the COVID-19-API with the respective parameters. (Note: the use of async and await makes sure the response is not empty)

A problem that I stumbled upon while working on the backend was that the requested data was only usable within the scope of the callback function. To fix that and prevent an empty string from being sent to the frontend as a response, I had to learn about promises and the async and await keywords.

Let’s go back to the restaurant example. Code in a JavaScript function runs synchronously by default. In a synchronous restaurant, the waiter would take an order from a table (client), give it to the kitchen (server) and then wait there until the chef has finished preparing it. He would not serve any other table until he has brought the finished food to the table that ordered it. As you can see, this would be very inefficient, which is why asynchronous execution exists. In the asynchronous version of the same scenario, the waiter takes an order and gives it to the kitchen, but instead of waiting there he serves other tables and brings out the food as soon as it is ready.

In my application it is important to handle requests asynchronously, because there are multiple requests per second when a client hovers over many countries in a short period of time. This is where the JavaScript keywords async and await come into play. async marks a function as asynchronous, and await can be used inside an async function to wait until an HTTP request has finished and the response has arrived. This makes sure that the COVID-19-API’s response, and not an empty body, is sent to the frontend.
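The difference can be reproduced in a few lines. The sketch below uses a timer as a stand-in for the real HTTP request, so the function names and the returned object are made up for illustration. The first variant shows the empty-response problem, the second one the fix with async and await.

```javascript
// Stand-in for the HTTP request to the external API: resolves after 10 ms.
function fakeApiRequest(slug) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ country: slug, confirmed: 100 }), 10);
  });
}

// Problem: the function returns before the promise has resolved,
// so the caller still receives the initial empty string.
function getDataTooEarly(slug) {
  let result = "";
  fakeApiRequest(slug).then((data) => {
    result = data; // runs later, after the function has already returned
  });
  return result;
}

// Fix: await pauses only this function until the data has arrived;
// the server stays free to handle other requests in the meantime.
async function getData(slug) {
  const data = await fakeApiRequest(slug);
  return data;
}
```

Calling `getDataTooEarly("germany")` yields the empty string, while `await getData("germany")` yields the object from the simulated API.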

7.2 Search Request (Backend)

If there is an HTTP request for a query search, the server simply sends a request to the Mapbox Geocoding API with the request body’s query and the Mapbox access token as parameters. The result is a list of POIs that fit the query, but for the sake of simplicity the server always sends the very first result back to the frontend.
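As a rough sketch, the search route needs two small helpers: one that builds the geocoding URL and one that picks the first result. The URL follows the forward geocoding endpoint of the Mapbox Geocoding API (v5, `mapbox.places`) as I understand it; the helper names are hypothetical.

```javascript
// Sketch: build the forward geocoding URL for a free-text query.
function buildGeocodingUrl(query, accessToken) {
  const baseUrl = "https://api.mapbox.com/geocoding/v5/mapbox.places";
  const params = new URLSearchParams({ access_token: accessToken });
  return `${baseUrl}/${encodeURIComponent(query)}.json?${params}`;
}

// The API answers with a FeatureCollection; only the first feature is used.
function firstResult(geocodingResponse) {
  const features = geocodingResponse.features || [];
  return features.length > 0 ? features[0] : null;
}
```

Returning `null` for an empty list gives the frontend a simple way to detect an invalid query.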

8. Other Challenges

Another challenge that occurred during my work on the project was that I sometimes struggled to find a solution for a problem, because the documentation for an API or a service wasn’t clear or simply did not exist. Sometimes I would spend multiple hours reading documentation and community contributions, just to figure out that a single line of code would fix the problem. I probably had the biggest issues with the AWS and COVID-19-API documentation. While I could fix the issues I had with AWS by following YouTube and StackOverflow tutorials, there wasn’t really such a thing for the COVID-19-API. I then joined the official Slack server for the API and reached out to the creator and developer, who was very supportive and helpful.

9. Conclusion

Cloud computing is versatile and complex. During my time working on the project I gained a far better understanding of web applications, APIs and cloud computing as such. I became more confident in working with JavaScript as a frontend and backend language and took my first steps into the world of web and cloud development. I learned a lot about location-based data and server architecture as well as how to do research on these topics. When I look back on what I achieved with this project, I am very happy with the result. I managed to reach all the goals I set for myself. I’m also happy that I decided to go with AWS over Azure for this project, because I got to work with a new cloud environment. For my next cloud-based web application I will probably go back to Azure though, or try a new cloud service, as I am not a big fan of the AWS documentation and management console.

But now it is up to you what you do with this information. Are you about to close your browser in disappointment after not learning about the latest coronavirus numbers, or are you going to work on your own cloud-based web application tomorrow? No matter what you decide, I hope you learned something from reading this blog post that will help you on your journey to become a cloud developer.

Thanks for reading!

Generating audio from an article with Amazon Polly

Author: Silas Krause (sk295)

Project

Reading multiple and detailed articles can become a little bit tiring. Listening to the same content, on the other hand, is more comfortable, can be done while driving, and is less straining for the eyes.
Therefore I decided to use this lecture to create a service that converts an article to an audio file using a Text-to-Speech service.

Technical Architecture

The input for the application is quite simple. The user only needs to provide a URL to the article. Then the main application fetches the contents of that URL and cuts out the unwanted markup. Then an audio file needs to be created. I chose the Amazon Polly TTS API and S3 as a file storage solution to try out Amazon Web Services.
To avoid generating audio for the same article more than once and to reduce load time, I intended to add a database that records whether an audio file already exists.
To interact with this application, I also needed a frontend that has an input field and dynamically renders the elements once the API endpoints send a response.

I built the app using NodeJS with Express because, even though I do not have a lot of experience building backend applications, I know JavaScript well and am therefore familiar with Node.
I decided to create three routes for my application. The index should serve the frontend. Additionally, I need two API endpoints, the first one to scrape the content from the URL, and the second one to generate the audio file.


Getting the content

Initially, I thought I could simply fetch the HTML from the source. I quickly discovered that some pages render the content on the client side or have some kind of confirmation screen. That is why I needed a way to prerender the page. The best solution I found was Puppeteer. Puppeteer is a Node.js library that runs Chromium in headless mode and gives access to the rendered DOM. To reduce the load time, I blocked all third-party JavaScript.
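Blocking third-party scripts can be done with Puppeteer's request interception. The sketch below is one possible approach rather than the project's actual code: it treats every script whose hostname differs from the article's hostname as third-party, which is my own simplification. The `page` object is expected to be a Puppeteer page.

```javascript
// Decide whether a request is a script served from another host than the page.
function isThirdPartyScript(resourceType, requestUrl, pageUrl) {
  if (resourceType !== "script") return false;
  return new URL(requestUrl).hostname !== new URL(pageUrl).hostname;
}

// Abort third-party scripts and let every other request continue.
// Call this before page.goto(pageUrl).
async function blockThirdPartyScripts(page, pageUrl) {
  await page.setRequestInterception(true);
  page.on("request", (request) => {
    if (isThirdPartyScript(request.resourceType(), request.url(), pageUrl)) {
      request.abort();
    } else {
      request.continue();
    }
  });
}
```

Keeping the decision in a separate pure function makes it easy to tighten or loosen the rule later, for example to allow scripts from a known CDN of the site.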
Pruning the response to exclude everything but the content turned out to be a tedious task because every website structures their content differently. I ended up using unfluff, which is fine for most cases.


TTS

After the extraction, the text can be sent to the Polly API. At first, I was using the synthesizeSpeech method from the SDK. Aside from the parameters, this method accepts a callback function that can handle the response audio stream. That stream can be stored in a file on the disk. While looking for a way to upload the audio file to S3, I found that there is a much simpler solution, which also eliminates the 3000 character limit of the synthesizeSpeech method. The Polly SDK also has an option to start a task using the method startSpeechSynthesisTask. This method accepts an additional parameter called ‘OutputS3BucketName’. After the task is completed, the output file is placed into the specified S3 bucket.
I really enjoyed seeing how this integration of different platform services simplifies the development.
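The task-based approach described above can be sketched as follows. The parameter names and the `SynthesisTask` fields follow the AWS SDK for JavaScript (v2) as far as I know them; the voice and the output format are arbitrary choices for illustration. `polly` is expected to be an already configured `AWS.Polly` client.

```javascript
// Parameters for an asynchronous synthesis task that writes straight to S3.
function buildSpeechTaskParams(text, bucketName) {
  return {
    OutputFormat: "mp3", // assumed output format
    OutputS3BucketName: bucketName, // bucket that receives the audio file
    Text: text,
    VoiceId: "Joanna", // assumed voice
  };
}

// Starts the task and resolves with the task id and the future file location.
function startSpeechTask(polly, text, bucketName) {
  return new Promise((resolve, reject) => {
    polly.startSpeechSynthesisTask(buildSpeechTaskParams(text, bucketName), (error, data) => {
      if (error) {
        reject(error);
        return;
      }
      resolve({
        taskId: data.SynthesisTask.TaskId,
        outputUri: data.SynthesisTask.OutputUri,
      });
    });
  });
}
```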

In hindsight, a real consumer application might want to synthesize small snippets and stream them subsequently. That would almost eliminate the wait time, since generating an audio file and loading it can take up a lot of time for impatient users. However, I did not choose this path because I intended to create a cache with my database.

The Response object from the startSpeechSynthesisTask method contains a link to the file, but there are two issues.
The first problem is that S3 files are not public by default. You need to complete three different steps to make them publicly available.
First, you need to unblock all public access in the permissions. Then you need to enable public access for ‘list objects’ for everyone. After that, a bucket policy needs to be created. The policy generator luckily makes that quite easy.
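For orientation, a bucket policy that allows anyone to read the objects in a bucket typically has the shape below. This is a generic public-read policy and not necessarily the one the policy generator produced for this project; the bucket name and the statement id are placeholders.

```javascript
// Sketch: a bucket policy that lets everyone read (download) objects.
function buildPublicReadPolicy(bucketName) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "PublicReadGetObject", // free-text label for the statement
        Effect: "Allow",
        Principal: "*", // everyone
        Action: "s3:GetObject",
        Resource: `arn:aws:s3:::${bucketName}/*`, // all objects in the bucket
      },
    ],
  };
}
```

`JSON.stringify(buildPublicReadPolicy("my-audio-bucket"), null, 2)` produces the JSON text that can be pasted into the bucket policy editor.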

Even when public access is enabled, the asset cannot be loaded immediately because the generation takes a couple of seconds. I needed to notify and update the frontend. Eventually, I solved this by starting an interval once the audio is requested. The interval checks if the task has been completed and renders an audio element after it is completed.
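The interval-based check could be written like this. The sketch is independent of how the status is obtained: `checkStatus` is a hypothetical function that asks the backend (or Polly's getSpeechSynthesisTask) for the task status and resolves with a string. The status values `completed` and `failed` match the ones Polly reports; the interval length is an assumption.

```javascript
// Sketch: poll until the task is done, then call onDone exactly once.
function waitForTask(checkStatus, onDone, intervalMs = 2000) {
  let finished = false;

  const finish = (error) => {
    if (finished) return; // guard against overlapping checks
    finished = true;
    clearInterval(timer);
    onDone(error);
  };

  const timer = setInterval(async () => {
    if (finished) return;
    try {
      const status = await checkStatus();
      if (status === "completed") finish(null);
      else if (status === "failed") finish(new Error("Speech synthesis failed"));
      // any other status (e.g. "scheduled", "inProgress"): keep waiting
    } catch (error) {
      finish(error);
    }
  }, intervalMs);

  return timer;
}
```

In the frontend, `onDone` would be the place to render the audio element once the file is available.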

The authentication for AWS had to be done using the Cognito service by creating an identity pool.

Deployment

After the application was running successfully on my local machine, I had to deploy it. I chose the Platform-as-a-Service offering on the IBM Cloud because I wanted to try out Cloud Foundry and I thought my simple Express application was a good use case for this abstraction layer. I could have solved some parts of the app with a cloud function, and I did not need the level of control of a virtual machine. Because Cloud Foundry requires a lot less configuration than a VM, it should be easy to deploy.
That is what I thought.
I quickly ran into restrictions anyway. Apart from the things I had to figure out due to my lack of knowledge of this platform, I had to spend a lot of time troubleshooting.
The biggest issue I faced was caused by Puppeteer. At install time, the puppeteer package includes three versions of Chromium for macOS, Linux and Windows, which are all 150-250 MB in size. The size exceeds the free tier limit and I had to upgrade. After that, I could not get Puppeteer running on the server, because the Ubuntu instance does not include all the Debian packages that are necessary for running Chromium.
This really set me back. There is no way to install packages via sudo apt-get on PaaS, and doing anything manually would eliminate the benefits of the simple deployment. I really thought I had reached the limits of Platform-as-a-Service until I discovered that you can use multiple buildpacks with Cloud Foundry, even ones that are not included on the IBM Cloud, by adding their GitHub repository.

buildpacks: 
    - https://github.com/cloudfoundry/apt-buildpack
    - nodejs_buildpack

This allows you to add an apt.yml file to specify the packages you want to install.
Afterward, I was able to run my application.


Tests

For tests, I chose to use Mocha and Chai. Apart from a few modifications for the experimental modules I am using, this integration was straightforward. It uncovered a few error cases I had not considered before.


Conclusion

To sum up, I learned a lot during this project, especially because a lot of things were completely new to me. Now I feel more confident working with those tools and I want to continue to work on this project.
I can also recommend using Cloud Foundry. If you know how to deal with the restrictions and know your true environment conditions, it is pretty flexible and enjoyable to use.

Repo: https://github.com/krsilas/article2audio

A beginner’s approach to a cloud-backed browser game

Foreword:

This article reflects my experiences while developing a real-time browser-based game. The game of choice was Tic-Tac-Toe, as it is straightforward to implement and does not have complex game mechanics. The following paragraphs describe what I experienced while developing this game with a cloud-based infrastructure in mind. The article is not so much a manual on how to create a game in the cloud as a diary showcasing all the pitfalls and impressions I collected. It is aimed at beginner developers and people working on their first project, as I share the pitfalls of my first bigger project, which you should definitely avoid.

To try out the prototype that I have created, check out the GitHub repository. There is a complete manual on how to start the application as well.

A simple game of Tic Tac Toe.

Initial project goal:

The initial goal of this cloud project was to create a simple browser game which automatically scales with an increasing number of concurrent players. The key part of any game are the game servers which players use to play against each other. Having no available game server means that no additional player can join in and have fun playing your game. The seamless integration of additional game servers is a key point: no one wants to shut down the whole backend and bring it back up just to increase capacity. So, one goal was to integrate game servers seamlessly and, when they are no longer needed, to remove them without any hassle.

The whole structure of the app is designed to cope with load in every possible part. For example, the frontend, which is built with ReactJS, should be relatively easy to scale: a load balancer would just redirect each request for the frontend to one of the available servers. The next server to be requested would be the matchmaking server. Here, too, there should be several matchmaking servers to choose from. However, it is important to keep the connection to the same server every time, because these are socket connections, which make it possible to transfer changes from the servers, which the frontend cannot access by default, in real time.

Technology stack

The technology used in this project is simple and easy to use. It mostly consists of technology I used in the past and I am quite familiar with. It saves a good amount of time not needing to be actively learning a new technology and using technology you are familiar with and which meets the requirements.

Frontend technology stack

For the frontend part I chose ReactJS. It is more a personal preference to use ReactJS instead of Vue.js or plain HTML with JavaScript. ReactJS makes it easy to turn changes in data into rendered HTML without ever writing a function that changes the DOM yourself. Changes to the DOM are cheap, which pays off when the DOM changes frequently. For my use case, a browser game, it was the perfect solution: get the data from the game server, push it into the fitting variable in the frontend, and ReactJS adjusts the page according to the given data. ReactJS also profits from huge community support, and there are many packages that you can integrate into your project. In this project I integrated two rather famous packages, React-Router and React-Redux.

React-Router makes routing between different pages possible without reloading the whole page. In my use case, the page consists of several components. Traditionally there is a header, a navigation bar and then the content of the page you are on. If you are on the home page, it displays the home page; if you are on the about page, it displays the about page. With React-Router, only the components that change are loaded. So, when going from the home page to the about page, only the component holding the about page re-renders. The header and navigation bar stay the same, as nothing changed there. Re-rendering components which have not changed and are still used by the page would be a huge waste of resources.

React-Redux is used to achieve a global state. Each React component has a state in which you store information, for example the value of an input field in your form. The problem with multiple components is that you cannot pass this state to sibling components. You can pass the state down to your child components, but that is it. React-Redux introduces a global state that you can freely declare and use wherever you want. In this project it is used to save the information about the game you want to enter. From the lobby component you get the room name and the server name, then you are redirected to the play component, which reads the information about the game you want to join from the global state.

Talking about the play component, sockets are used to achieve real-time communication between the client and the server. Socket.IO is used to establish a connection between the client and the game server. The game server holds a connection to both players. Each player’s interaction is sent to the game server, validated if needed, and then both players get the resulting game state back from the game server. Socket.IO is a proven framework with good community support and has great features such as rooms, which make it easy to use in a game project. Socket.IO’s rooms are used to create the different game rooms each server has. When a player joins a server, the game server’s Socket.IO instance puts the player’s socket into the matching room. All communication between the players in this room can then be emitted to just that room, and not to all connected sockets.
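The room mechanism can be illustrated with a small server-side sketch. It uses the Socket.IO calls `socket.join(room)` and `io.to(room).emit(event, data)`; the event names `joinRoom`, `move` and `gameState`, the payload shape and the very basic move validation are hypothetical and not taken from the project. `io` is expected to be a Socket.IO server instance.

```javascript
// Sketch: one Tic-Tac-Toe board per room, updates are sent to that room only.
function registerGameEvents(io) {
  const boards = new Map(); // room name -> array of 9 cells (null, "X" or "O")

  io.on("connection", (socket) => {
    socket.on("joinRoom", (roomName) => {
      socket.join(roomName); // put this player's socket into the room
      if (!boards.has(roomName)) boards.set(roomName, Array(9).fill(null));
      io.to(roomName).emit("gameState", boards.get(roomName));
    });

    socket.on("move", ({ roomName, cellIndex, mark }) => {
      const board = boards.get(roomName);
      // Validate the move: the room must exist and the cell must be free.
      if (!board || board[cellIndex] !== null) return;
      board[cellIndex] = mark;
      io.to(roomName).emit("gameState", board); // only players in this room
    });
  });

  return boards;
}
```

A full game server would also check whose turn it is and whether someone has won; the sketch leaves that out to keep the focus on rooms.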

The application’s home page

Backend technology stack

The backend uses NodeJS servers with Express to provide an easy way to handle requests. Each server has its own API endpoints, which are used either by other servers, for debugging purposes or for general information. Additionally, the game servers and matchmaking servers have Socket.IO connections to communicate with the game server, the matchmaking server, or the frontend. With Socket.IO it is easy to listen to connects, disconnects and user-defined room events, which keeps managing the sockets from becoming a total nightmare. Listening to disconnects is important for the matchmaking server: when a game server disconnects, the matchmaking server removes it from its list of available game servers and sends a request to the master server to check the game server’s health. If the game server does not respond, it is removed from the master server’s server list as well, because it is not reachable anymore and therefore cannot be used to play matches on.
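The disconnect handling on the matchmaking server might be structured like the sketch below. Only the `connection` and `disconnect` events are standard Socket.IO; the `register` event, the address format and the `requestHealthCheck` callback (which would call the master server) are hypothetical names for illustration.

```javascript
// Sketch: keep a list of connected game servers and react to disconnects.
function trackGameServers(io, requestHealthCheck) {
  const availableServers = new Map(); // socket id -> game server address

  io.on("connection", (socket) => {
    // A game server announces itself after connecting.
    socket.on("register", (address) => {
      availableServers.set(socket.id, address);
    });

    socket.on("disconnect", () => {
      const address = availableServers.get(socket.id);
      availableServers.delete(socket.id); // no new matches on this server
      if (address) requestHealthCheck(address); // let the master server verify
    });
  });

  return availableServers;
}
```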

Two npm packages have proven to be a great help in setting up these servers and making requests to other APIs. The first package is node-fetch which, just like fetch() in the browser, lets you asynchronously fetch information from an API. Unlike in the JavaScript you use on your frontend, the fetch() method was not natively included in NodeJS at the time of writing. The other package is called minimist. It makes it very convenient to read the parameters the servers get started with. To run multiple servers locally, each server needs an adjustable port, so most of the servers I created have a parameter to set the port number.

Testing-wise, Mocha and Chai are widely used for testing NodeJS applications. Mocha is a very common JavaScript test framework and Chai is a fitting assertion library that extends Mocha’s assertion capabilities. Chai’s syntax is fairly easy to learn and easy to read as well.

Due to poor structural choices during development, most of the servers I created cannot be tested without the others actually running. For example, the test case for the game server requires the master server to run, as a game server’s first step is to register itself with a master server. The testing is set up so that all servers required for a test are running before the tests start.

Current state of the project

As of writing this article, the project is in a prototype state. All the servers work like they are meant to, and game servers can be seamlessly integrated into the running application. The whole application was deployed to Azure Virtual Machines and proved to work.

When I tried out a different Azure service, App Service, the application did not deploy as intended and would not work out of the box. When deciding which Azure service to use, you need to check the candidates for “compatibility” with your application. For example, the game server uses two ports for sockets: one for the socket that connects to the player and another for the socket that connects to the matchmaking server. Azure App Service, however, only allows your application to use port 8080, so you either change your application to use that port or switch to a different Azure service completely, a virtual machine for example.

The biggest problem I have encountered so far is finding a reliable way to deploy my application to Azure Virtual Machines. Originally, I wanted to use Azure DevOps Pipelines to deploy the whole application to different virtual machines after a successful build, but that did not work out of the box as I had thought. More on that in the ‘Cloud Integration’ chapter.

Application structure

The optimal structure I was aiming for looks like this:

A draft of the intended structure of the application

Frontend, matchmaking and game servers can be turned on and off depending on the current number of players and the current load. Unfortunately, in the current state, there is no implemented and tested way to choose one of several matchmaking servers when a player connects for the first time. It might work, but the frontend needs a couple of changes to dynamically change the address of the matchmaking server. At the moment it’s hard-coded. The current structure looks more like this:

Current state of the structure.

Cloud Integration

Out of the several well-known cloud service providers, I chose Microsoft Azure in order to get to know this service. During the cloud development lectures I had already tinkered with AWS, IBM Cloud and Google Cloud, so to further expand my basic knowledge about cloud services, I went with Azure. On top of that, creating an Azure account gets you 170€ (200$) of free credit for the first 30 days, but you must verify yourself with a credit card. Payments only start if you switch your account from the free tier to a subscription-based tier.

Cloud Structure

Azure offers a variety of cloud services like virtual machines, load balancing, clustered databases and Azure DevOps. Azure DevOps is basically a cloud-enabled Jenkins instance that lets you connect to your GitHub repository and automatically run pipelines depending on the actions you take in your repository. For instance, when you push to the master branch of your repository, your DevOps pipeline automatically builds your project, runs the unit tests, and can then deploy your application to the Azure service of your choice. It is highly customizable and offers a variety of template applications that help you understand how these pipelines work and how they are set up.

Cloud Pipelines

The development process should move seamlessly from local development to deployment. This means that every server can be set up locally and used for development and testing; once the work is finished, the changes are pushed to the repository and a current build with all features gets set up automatically. The “dream pipeline” would look like this:

A deployed and running application is just a push away, without ever setting anything up by hand afterwards. Having such a powerful pipeline brings several improvements while developing:

  • Automatic project building and running tests
  • Deployment happens automatically
  • All deployments are handled the same way and are consistent
  • Less time is spent fiddling with deployments done by hand

Choosing fitting cloud services is a key requirement before you actually start developing your application. I already mentioned the problem I got myself into because I did not research the fitting cloud technologies beforehand. I’m not saying that the Azure services I chose were the right and only fitting choice; the problem was that I did not spend enough time researching the different approaches I could take with Azure’s cloud services and what requirements these services have. After a good amount of fiddling around, which got me to know Azure App Service better, I understood that the current structure of my application simply could not use this service. The benefits of using Azure App Service would have been huge, as it can scale automatically depending on the load. It does, however, limit your ability to directly debug and manage your application. It is not really possible to just log in to your service via SSH, look at the logs or start/stop the application. The Azure documentation provides a detailed comparison of the different services here: Azure Technology Choices

Project challenges

This chapter is split into two parts: the challenges in developing the application itself, and the challenges of working on this project.

The biggest problem I encountered while developing the application was socket management in the frontend. The problem arose because two different components needed information from the incoming game event data of the active game. The ideal solution would be to share the socket across the application as global state, so that each component could set its own listeners for the information it needs. But keeping the socket in the React-Redux global state did not work out. The solution was to receive all the information in the game board component and then push it into the global state. The other component, the game status, then retrieves it from the global state and updates its values accordingly. This worked in the end and is sufficient for the prototype, but a real-world, production-ready application would need some sort of “socket manager” or “socket controller”.
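What such a “socket manager” might look like is sketched below. It is not part of the project: `createSocketManager`, the event name `gameState` and the `turn` field are hypothetical, and a Node.js `EventEmitter` stands in for the Socket.IO client socket, which offers a comparable `on` method for incoming events.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical socket manager: one shared socket, many component subscribers.
function createSocketManager(socket) {
  const listeners = new Map(); // event name -> Set of callbacks

  function subscribe(event, callback) {
    if (!listeners.has(event)) {
      listeners.set(event, new Set());
      // Attach exactly one handler per event to the underlying socket.
      socket.on(event, (data) => {
        for (const cb of listeners.get(event)) cb(data);
      });
    }
    listeners.get(event).add(callback);
    // Return an unsubscribe function, handy for component cleanup.
    return () => listeners.get(event).delete(callback);
  }

  return { subscribe };
}

// Stand-in for the client socket; emitting here simulates an incoming event.
const fakeSocket = new EventEmitter();
const manager = createSocketManager(fakeSocket);

const seen = [];
const stopBoard = manager.subscribe('gameState', (s) => seen.push(`board:${s.turn}`));
manager.subscribe('gameState', (s) => seen.push(`status:${s.turn}`));

fakeSocket.emit('gameState', { turn: 1 }); // both components are notified
stopBoard();
fakeSocket.emit('gameState', { turn: 2 }); // only the status component remains
console.log(seen);
```

With a module like this, the game board and the game status component could each subscribe to the events they care about without one of them having to relay data through the global state.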

Another problem I encountered with the current prototype was testing. The socket connections in particular sometimes make it hard to create reliable tests, as each test needs its own sockets set up and ready to emit and receive data. The straightforward solution is to create “before” and “after” functions that run before and after each test to set up the sockets and close them again afterwards. In the test itself, only the listeners are set, and data can be emitted through the prepared sockets. The really tricky part is determining when to stop the test. A normal test calling a REST API is finished when the response has been received and the data has been evaluated. With sockets, especially when testing two-player operations such as joining and emitting a player move, you have to watch carefully when to stop the test. Stopping the test is done by calling “done()”. In Mocha, this is simply a parameter of the test function. When “done()” is called, the test ends. Sockets, however, can continue to receive events they are subscribed to. If two sockets have to receive the same event, one socket gets the information first and the other one last. The order in which the sockets receive the information can get mixed up, for example when packets are delayed by packet loss, so that the first socket receives its packet after the second socket. If the test ends as soon as one socket has received the data, the other socket still has to receive and evaluate its data. When running these tests locally, nothing like this occurred, but it is still a possible problem that can cause the tests to fail.
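One possible way around the ordering problem is to call “done()” only after every socket has reported back, regardless of the order. The helper `doneAfter` below is a hypothetical sketch of that idea and not code from the project; two `EventEmitter` instances stand in for the two player sockets, and a plain function stands in for the `done` callback that Mocha would pass to the test.

```javascript
const { EventEmitter } = require('node:events');

// Returns a function that calls done() only after it was invoked `count` times.
function doneAfter(count, done) {
  let remaining = count;
  return () => {
    remaining -= 1;
    if (remaining === 0) done();
  };
}

// Two EventEmitters stand in for the two player sockets.
const playerOne = new EventEmitter();
const playerTwo = new EventEmitter();

let finished = false;
const done = () => { finished = true; }; // Mocha would pass this into the test
const received = doneAfter(2, done);

playerOne.once('move', received);
playerTwo.once('move', received);

playerTwo.emit('move', { cell: 4 }); // the second socket happens to be first
const finishedAfterFirst = finished;
playerOne.emit('move', { cell: 4 }); // now both have received the event
console.log(finishedAfterFirst, finished);
```

Inside a real test, the assertions for each socket would run in its own listener before `received()` is called.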

Most of the problems I encountered were on the more formal side of this project. A huge problem, which I only realized when there were two to three weeks left until the presentation, was my time management concerning the development and deployment of the application. Development went slower than I expected because of a slow August and a packed September, in which my practical semester started. That meant that after sitting in front of a computer for more than 8 hours doing development tasks, I had to spend all of my remaining free time working on my project. I had never expected it to be that hard to get things done after work, but after 8 hours, no matter what I had worked on, I simply wasn’t as concentrated, focused and quick when developing and driving the project forward. I was exhausted, and that caused the project’s progress to slow down.

As this is the first bigger project I decided to do on my own, I got to know the difficulties of planning and managing a project alone, which led to quite a few problems along the way. Time management has already been mentioned; in addition, the architecture would need some serious refactoring if the application were ever to go into a production environment. This is the result of my poor knowledge about handling all these servers and components and of just “coding away”.

The whole idea of this project was to develop something for the cloud. Unfortunately, I set my expectations far too high for what a single person, especially a beginner, can achieve. I did however manage to create some kind of overview of my expectations. I already mentioned the pipeline that would get triggered by an action in the GitHub repository. The draft of this pipeline was made to capture everything I would need to research in order to build it.

Without proper architectural knowledge it is quite hard to keep the code clean and the structure inside each server application reasonable. For prototyping this is somewhat sufficient, but to actively develop and maintain a project, a clean structure and clean code are a must.

Learning for future projects

Since this was my first bigger project that resulted in actual software with a real use case, many different things came up, both good and bad. In the end the whole project taught me very valuable things about how I should approach the next project during my studies. There are several key points that are worth pointing out.

The first one is a clear project scope that, once defined, should not undergo huge changes. The project scope, especially for a timed project, needs to be adjusted just right to match the available manpower and the available knowledge. Using new and unfamiliar technology is great, no arguing there, but getting started with new technology takes a lot of time, especially when going beyond the “tutorial” stuff. In my next project, I will make sure to allow enough time for learning new things. This goes hand in hand with proper architectural planning. Having no structure and no plan to follow makes it very hard to maintain and expand code, and other people may have a very hard time understanding the project at all.

Cloud architecture and cloud services come with the benefit of having huge resources on demand. It is definitely a topic that is going to stay relevant for quite some time, so I’ll continue using them. The benefits of cloud computing over traditional computing, like load balancing and creating resources with one call or click, are especially promising and make resources easy to manage. In combination with DevOps, automatic deployment can save a huge amount of time over the course of developing an application.

 

Realization after finishing the project

During this project, I learned a lot about developing an application that makes, or should make, use of the cloud as a distributed platform enabling my application to scale and run however and wherever I want.

The key realization about project management is that such a rather complex and feature-rich application needs more time and more developers to get done in time with a releasable build. It is surely doable, but you really need to know your stuff. There is only little time to get to know additional technologies if you want to have enough time to focus on releasing a finished build that meets the requirements. It’s more a matter of knowing things and how they work than of being a high-tier developer. More time was spent on researching and trying things out than on actually working with them.

Azure’s cloud services have shown me several possibilities to publish my application, each with totally different needs and benefits. Understanding what you need and how to implement it is something I have to dig deeper into in my own research time. There is huge potential to be discovered, but you need time to integrate and get comfortable with the cloud as your infrastructure provider.

The whole project was a lot of fun. Even though I only got around to building a working prototype and only caught a glimpse of cloud computing, I realized the huge potential for further projects and the need to work on more cloud-backed projects.


podcrawler

1. Introduction

If you listen to podcasts frequently, you have probably been in the situation at least once where you could not remember where you heard that one line or sentence. You would love to just google it, only to find out that this exact podcast doesn’t offer a transcription, because transcribing is time-consuming and expensive and most podcast producers decide to skip it. There are already some online services that offer automatic transcription of audio files with the option of a manual review, done by a real person at extra cost. The only problem is that these services are pretty expensive most of the time and rarely offer a search functionality for everyone.

That’s why I thought to myself: How hard can it be to build an app that transcribes and analyses podcasts automatically and offers a search endpoint to everyone?

2. Goals and Scope

So here’s the thing. I was really excited about the opportunity to dive into cloud development. The hype around it was really big and everyone was talking about how fast the development of prototypes and even production-ready applications has become, and how you don’t need to think about server infrastructure, stacks etc. anymore. This was really exciting for me personally, because I get at least 1 ~million dollar~ idea for an application per day and typically the process looks something like this:
I get really excited and motivated and start developing right away. Of course I start with defining the stack for the project and the server infrastructure requirements. By the time I am done with the stack configuration and the server infrastructure is ready for the actual development, I have lost interest and sometimes even forgotten the core of the idea I had…

That’s why I set a couple of goals that I wanted to achieve during the development of podcrawler and also defined a certain scope, so that I could finish the project in time:

  1. Most importantly I wanted to see how fast I can develop a working prototype in comparison to the “usual” app development. I didn’t aim to build a production ready application.
  2. I wanted to get an overview of the 3 main cloud platforms – Google Cloud, AWS and IBM Cloud – and see if there are any big differences or drawbacks. (I left Microsoft’s Azure out on purpose… I had a terrible experience with it during a past project and knew I didn’t want to go back there.)
  3. I also wanted to decouple the different services of the application as much as possible and try to make it as dynamic as I can.
  4. See what’s the hype around Docker and Kubernetes all about!

3. Research

After I had a scope for the project ready, I started with the research and planning. At that time I had very little idea how exactly everything would work together. I was only aware of the basic services I would need for the application to work, but had no idea how these services work, how they are paid for or which one would be the best solution to my problems.
With that in mind I broke the application down into 6 main categories:

  1. Base – Programming Language, Stack etc.
  2. Speech-to-Text Service
  3. NLU or Natural Language Understanding Service
  4. Database and Search Service
  5. File Storage
  6. Authentication

and started to search for the “best” solution for each category.


* Spoiler alert: I basically scrapped the whole project 3 weeks before the presentation and deleted everything. That’s why some of the planning had to be done twice…

The Base

As with every project, I had to decide which programming language and framework would allow me to develop as fast and as easily as possible. Since one of my goals was to try out Docker and K8s, I didn’t think much about that part. I just assumed that it would work well, because everyone is transitioning to a Docker + K8s stack (which later proved to be a big problem for me). Most of the time I spent deciding whether to go with Node.js, because in the end it would be a browser-based app, or with Python, because it is just easier to do NLU and data processing with. Initially I decided to go with Python, because I had good experience with it in data mining and structuring data. As the framework I therefore picked Django, because on paper it had everything I needed and even more in the form of Django packages. It seemed really well documented and mature enough that you can consider it for a project without worrying about hitting its limits soon.

Speech-to-Text Service


You would think there is one obvious choice here, considering that Google is essentially the nervous system of the Internet at this point, and normally I would agree. But the accuracy of the recognition and the API options weren’t the only factors I had to consider. The main drawback, as with Amazon Transcribe, was the cost. Both Google and AWS allowed for only 60 free minutes of transcription per month. This may sound like enough for prototyping, but just while testing the accuracy I used up 14 minutes on each of the platforms, and I hadn’t even begun with the development of the application…
IBM Watson, on the other hand, was offering 500 minutes per month at no cost, which is a significant difference! And the difference in accuracy compared to Google wasn’t that big! Sure, at times there were some wrongly recognised words, but the correct word was always in the alternatives array, which is part of the return object for each transcribed word. More on that in the section with the description of the service itself!
Both Google’s and IBM’s services offered great documentation and SDKs with support for most of the popular programming languages. Amazon’s documentation was a bit confusing at times and offered only Python support.
That’s why I went with IBM Watson for the Speech-to-Text service!

NLU Service

The situation with the natural language understanding service was similar. Google probably offers the best models, but it is just too expensive, and the free tier only covers around 5K units per month. 1 unit here is basically a document consisting of less than 1,000 Unicode characters. This was really confusing for me at first, but just consider that one test audio file of 3 minutes can contain up to 500 words, which means that you can reach your monthly limit with just a 30-40 min. long podcast. IBM, on the other hand, calculates the costs a bit differently, which makes it a lot more suitable for development. There, extracting Entities and Sentiment from 15,000 characters of text is (2 Data Units * 2 Enrichment Features) = 4 Items and you have 30,000 Items free each month. And again, the price to quality ratio is probably even better.
That’s why I went with IBM again for this service.
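The IBM calculation from the example above can be written down as a small function. The size of one data unit (10,000 characters) is an assumption inferred from that example, where 15,000 characters count as 2 data units; the function name `nluItems` is made up for this sketch.

```javascript
// Assumption inferred from the example above: one data unit covers up to
// 10,000 characters, and every enrichment feature is billed once per data unit.
const CHARS_PER_DATA_UNIT = 10000;
const FREE_ITEMS_PER_MONTH = 30000;

function nluItems(characterCount, featureCount) {
  const dataUnits = Math.ceil(characterCount / CHARS_PER_DATA_UNIT);
  return dataUnits * featureCount;
}

// Entities + Sentiment (2 features) on 15,000 characters of text.
const items = nluItems(15000, 2);
const freeRequests = Math.floor(FREE_ITEMS_PER_MONTH / items);
console.log(items, freeRequests); // 4 7500
```

Under these assumptions the free tier would cover 7,500 analyses of that size per month.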

Database and Search Service

Here is where things got a bit complicated. I had to consider that the average podcast is about 40-50 min long, which means that the transcriptions would get pretty long. This was pretty important, because Django ships with SQLite by default and officially supports only relational databases. So I was worried that the performance of text search in a relational database wouldn’t be satisfying. And yes, I could have used a different vendor like MongoDB for a document-based solution, but the problem was that I basically had to decide that when setting up the project. Even the Django documentation warns about this:

When starting your first real project, however, you may want to use a more scalable database like PostgreSQL, to avoid database-switching headaches down the road.

At that point I also researched what cloud solutions were out there and besides Google Cloud Search I also found IBM Cloud Databases for Elasticsearch. Both sounded really promising but again, there were 2 big problems:

  1. Both seemed pretty hard, for me personally, to actually implement.
  2. Neither one of them had a free Tier plan and at this point I had blown almost all my credits from the student accounts, because I forgot to terminate different services in the process of testing them.

In the end the plan was to just worry about this when I got there.

File Storage

Storing files was another consideration I had to make. It wasn’t a big problem during the prototyping phase, but if the app was ever to be tested in production, I needed a plan for how and where to store big audio files. There were a couple of options, but the most notable ones were Cloudinary, an AWS S3 bucket or one of the Google solutions, e.g. Google Drive or Google Cloud Storage.
In the end, when the application is ready to be tested in a production environment, I will probably go with Cloudinary, because it will let me apply file transformations through REST requests. This can prove handy if I want to reduce the quality of the uploaded podcast to save some space, or if I need to change audio formats on the fly etc.
For right now I will be storing the raw uploaded files in the same cloud environment where the application is deployed. Hopefully that will be enough for now.


Authentication

Probably the easiest decision I had to make in the planning phase. As I already mentioned, Django is a pretty mature framework and has a really big community behind it. Therefore most of the common services, e.g. registering and authenticating users, are already implemented either in the framework itself or as a separate Django package, ready to just be installed and used. That’s why I decided to go with django-allauth – an integrated set of Django applications addressing authentication, registration, account management.
Because I was interested in cloud authentication, I also wanted to give users the opportunity to log in with their Google account using the OAuth 2.0 service provided by Google. And let’s face it: everyone has a Google account and would probably prefer to just log in with it instead of taking the time to create a new one.

4. Development

With my motivation going through the roof I started development.
The plan was to separate the core services (Speech-to-Text, NLU, Frontend) into Docker containers and manage them with Kubernetes. The reason for that was that I wanted to do some extra data manipulation in Python after I got the transcribed text from Watson and make the whole app architecture as dynamic as possible, in case I wanted to exchange the Speech-to-Text service down the line. At that point I already had a bit of experience with Docker from work, but I had never had to build containers or do any initial configuration. That’s why I felt like I was starting from scratch and had to go through the documentation of both Docker and Kubernetes. Even though both are pretty well written, I felt like the abstraction both systems try to achieve is a bit too much, making simple tasks like adjusting the timeouts or memory of a simple PHP container a bit tedious… (I know this because at one point I was considering switching from Django to Laravel – a PHP-based web framework, because I found a pretty well configured container for it.) And of course it wasn’t long before I started facing the first big problems.

First I had issues with the allocation of resources, because of the free tier plan. As I mentioned, I didn’t have any credits left, so the only option was to create a free tier cluster in IBM Cloud, which gave me 2 vCPUs and 4GB RAM. For some reason I was frequently getting errors about missing resources.

AddContainer error: not enough cpus available to satisfy request

Another issue I faced during development had to do with the way Kubernetes handles the state of the application. Every time a new deployment occurs, the old containers are shut down and replaced by newly created ones. This led to the problem that all of the previously saved audio files were gone after every deployment, because I had a configuration error with the shared volumes in the Docker containers and the persistent volumes in Kubernetes. At some point I found a configuration that worked for some reason and then, just like that, didn’t work anymore… This was especially frustrating because it happened at a point where I thought I had finally begun to understand Kubernetes and Docker.

Then I struggled with networking. I couldn’t make the cluster publicly available, and it took me a really long time to even get the different Docker containers to talk to each other.

Then I also had problems with Docker… Every time I tried to customise a standard container to my needs, I ran into problems when building the container. I remember that I needed an extra Linux package for file manipulation in the Speech-to-Text container, and it took me almost 2 weeks to get the configuration right and to be able to build the container without dependency errors.

Don’t get me wrong, I know that most of the problems I had were the result of my lack of experience with Docker and Kubernetes, but I also found it really frustrating to get to the point where it all makes sense. And I just wanted to be able to speed up the development of prototypes by skipping the server-side work.

I also get that there are a lot of benefits to Docker and Kubernetes. I kind of saw that in all of the trouble I had to go through just to have some half-working setup. I mean, you can really make your application platform independent and test it everywhere, have the automatic scaling and management of your Docker containers done by Kubernetes, have a load balancer etc. But in my opinion the pain you have to go through if you are not an expert is only worth it if you are building a big application that needs to scale automatically and be deployed everywhere AND you have a really specific plan for how to achieve that. It certainly didn’t make any sense for my prototype, because in the end I felt like I was back to provisioning and maintaining a server the whole time. And this was exactly what I was trying to avoid in the first place…
On the other hand, the only real work I had done with Django was creating the boilerplate code the framework needs to render a simple HTML page… and it was, in my opinion, a lot of boilerplate code. This is also good, because it lets you tweak every little detail, but it is not really something you want to do if you just want to build a prototype.

So after around 2 months of work I wasn’t getting anywhere with the core functions of the application, and the fun and motivation were gone… In my rage I deleted everything (that’s why there are no screenshots or code from the initial plan), took a day off to cool down and basically started from scratch 3 weeks before the final presentation.
After I revisited my goals and started to search for a different solution for the base of the project, I stumbled upon Cloud Foundry. Cloud Foundry is basically a PaaS, and its container-based architecture runs apps in any programming language on a variety of cloud service providers. At first it all sounded magical, because you just need to select a runtime, e.g. Node, Python etc., and Cloud Foundry automatically scans your application, takes care of all dependencies and deploys your application to a publicly available domain. After all the hassle I went through with Kubernetes and Docker, this sounded like a dream… Just look at the configuration file necessary to deploy an application:

---
applications:
 - name: podcrawler
   instances: 1
   host: podcrawler
   memory: 256M

and compare it to a Docker Compose configuration file. So I decided to test the service first, before making the same mistakes as with K8s. I replaced the K8s cluster service with the Cloud Foundry one and deployed just a simple Express.js app. For that I first needed to install the ibmcloud CLI and do some basic configuration from the terminal, like choosing a region and an API target endpoint. In the end it all worked fine and the deployment really was as easy as described. This is where my motivation came back, and I sat down and restructured podcrawler around Cloud Foundry. The architecture right now looks like this:

podcrawler architecture

As mentioned above, I ditched Django and replaced it with Express, because I found out I didn’t really need a full-fledged framework. I just needed basic routing and a REST interface to talk to my cloud services. And so every core component of the application became a cloud service.

5. How everything works


So after 3 weeks of developing I had a working prototype and here is how everything works right now:

 

Database


In the end I decided to create a separate MongoDB cluster on AWS because of the following reasons:

1. Document-based databases can be more performant than traditional relational databases, especially when it comes to full-text search.
2. Because MongoDB uses JSON syntax for the Schema it was really easy to populate the database with the JSON responses I got from Watson
3. MongoDB provides text indexes to support text search queries on string content. This was optimal for me, because I didn’t have to implement some sophisticated search service.
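To give a rough idea of what a search via a text index looks like, the sketch below only builds the plain query objects, so it runs without a database. The function name `buildTextSearch` and the field name `transcript` are assumptions for illustration; `$text`, `$search` and the `textScore` metadata are standard MongoDB query syntax.

```javascript
// Builds the arguments for a MongoDB text search. The collection needs a text
// index on the searched field, e.g. { transcript: 'text' }
// (the field name "transcript" is an assumption for this sketch).
function buildTextSearch(term) {
  return {
    filter: { $text: { $search: term } },
    // Ask MongoDB to include the relevance score of each match...
    projection: { score: { $meta: 'textScore' } },
    // ...and to return the best matches first.
    sort: { score: { $meta: 'textScore' } },
  };
}

const query = buildTextSearch('cloud computing');
console.log(JSON.stringify(query.filter));
```

With Mongoose, these objects could then be passed on along the lines of `Podcast.find(query.filter, query.projection).sort(query.sort)`, where `Podcast` is a hypothetical model.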

The integration of MongoDB with Express was also really easy thanks to mongoose:

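const mongoose = require('mongoose');

// Connects to the MongoDB cluster given in the MONGO_URI environment variable
const mongodb = async () => {
  try {
    // These options target Mongoose 5; Mongoose 6 and later no longer support them
    const conn = await mongoose.connect(process.env.MONGO_URI, {
      useNewUrlParser: true,
      useUnifiedTopology: true,
      useFindAndModify: false,
    });

    console.log(`MongoDB Database connected: ${conn.connection.host}`);
  } catch (e) {
    // Without a database the app cannot work, so log the error and exit
    console.error(e);
    process.exit(1);
  }
};

// Exported so that app.js can require and call it
module.exports = mongodb;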

Then I only had to require and call the mongodb() function in app.js:

const mongodb = require('./config/database')
mongodb()

and I had a scalable database that I didn’t need to manage or allocate resources for.

Authentication


I decided to leave out the sign-up option for now and only offer authentication with a Google account.
When you visit the application you are first greeted with a login screen.

Here the user can authenticate directly with Google:

The authentication is done with Google’s OAuth 2.0 service. I created an instance on Google Cloud and used a middleware called passport to bind it to the Express framework. The passport middleware has so-called strategies for all kinds of different authentication methods. Here is a link if someone is interested in the details. Basically, every time someone calls a protected route of the application, Express checks whether the user making the request is already authenticated. This is done by the ensureAuth function, which calls passport’s req.isAuthenticated() method. If the user is already signed in, the request is passed on. Otherwise the user is redirected to `/`, where the login screen is rendered:

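ensureAuth: function (req, res, next) {
  // Let authenticated users through; send everyone else to the login page.
  if (req.isAuthenticated()) {
    return next()
  } else {
    res.redirect('/')
  }
}
...
// The login route itself must stay unguarded: protecting it with ensureAuth
// would redirect unauthenticated users to `/` in an endless loop.
router.get('/', (req, res) => {
  res.render('login', {
    layout: 'auth',
  })
})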

ensureAuth is then passed as the second argument to every GET route that requires a signed-in user. This way, if the user tries to call `/podcasts/upload/`, for example, we always make sure that they are authenticated.

router.get('/upload', ensureAuth, (req, res) => {
  res.render("podcasts/upload", {
    name: req.user.firstName,
    lastName: req.user.lastName,
    image: req.user.image,
  })
})
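
To see the guard logic in isolation, here is a minimal sketch that runs an ensureAuth-style function against hand-written stand-ins for the request and response objects. The `runGuard` helper and the fake `req`/`res` objects are assumptions made for illustration; in the real application Express and Passport provide them.

```javascript
// Sketch of an ensureAuth-style guard, exercised with hand-written stand-ins
// for the Express/Passport request and response objects (assumptions).
function ensureAuth(req, res, next) {
  if (req.isAuthenticated()) {
    return next()
  }
  res.redirect('/')
}

// Hypothetical helper: runs the guard against a fake request and reports
// what happened instead of talking to a real server.
function runGuard(isLoggedIn) {
  const outcome = { nextCalled: false, redirectedTo: null }
  const req = { isAuthenticated: () => isLoggedIn }
  const res = { redirect: (path) => { outcome.redirectedTo = path } }
  ensureAuth(req, res, () => { outcome.nextCalled = true })
  return outcome
}

console.log(runGuard(true))  // { nextCalled: true, redirectedTo: null }
console.log(runGuard(false)) // { nextCalled: false, redirectedTo: '/' }
```

Because the guard only depends on `req.isAuthenticated()`, it can be checked like this without starting a server.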

The authentication with OAuth 2.0 is handled entirely by Passport. I just call the passport.use method and set the strategy to GoogleStrategy:

passport.use(new GoogleStrategy({
        clientID: process.env.GOOGLE_CLIENT_ID,
        clientSecret: process.env.GOOGLE_CLIENT_SECRET,
        callbackURL: '/auth/google/callback'
    },

I also decided to save the session in the Mongo database, because I sometimes closed the tab by accident and had to log in again every time this happened.

File Storage


For now I opted to just store the files on the server, but I am in the process of integrating Cloudinary. Maybe I will update the post in the future when this is done and go over the process.

Load balancing


After ditching k8s and Docker, I didn’t implement a load balancer, simply because it is not necessary right now. If the application ever goes to production and grows a big enough user base, I will probably have to consider Docker and k8s again.

Application Flow


After a successful authentication the user is greeted with an overview page, where they can see their uploaded and analysed podcasts and search through their transcriptions (the search functionality is described in more detail in the sections below):

The core of the application is the automated transcription of the uploaded podcast and the extraction of categories and concepts from it. When uploading a new podcast, the user has the option to add a custom title and description, because everyone has their own way of sorting information and I didn’t want to force automatically generated descriptions on anyone.

After the user fills out the fields and clicks the upload and analyse button, a POST request is sent to `/podcasts/upload`, which triggers the following process:
First I make sure that the user has uploaded a file:

if (!req.files) {
  // Stop here so the rest of the handler does not run without a file.
  return res.send({
    status: false,
    message: "No file uploaded",
  })
}

Then I get the file from the POST request and save it on the server:

let fileUpload = req.files.fileUpload
fileUpload.mv(filePath, (err) => {..}

While saving, I send the audio file to Watson for transcription. Here I define the threshold for word alternatives and tell Watson to also include the timestamps for every word in the response. The high threshold for word alternatives is important, because I plan to implement a feature where the application automatically saves an array of word alternatives for any transcribed word with a score below 0.8. This way, even if Watson doesn’t recognise a word correctly and the user doesn’t notice it, the correct word could be in the saved array and the search results should still be accurate.

const recognizeParams = {
  audio: fs.createReadStream(filePath),
  contentType: "audio/mp3",
  wordAlternativesThreshold: 0.9,
  timestamps: true,
}

speechToText
  .recognize(recognizeParams)
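
The planned feature of saving alternatives for uncertain words could look roughly like the sketch below. It is only an illustration: the response shape (`results[].word_alternatives[].alternatives[]` with `word` and `confidence`) is an assumption about what Watson returns, the sample data is made up, and `collectUncertainWords` is a hypothetical helper name.

```javascript
// Sketch: collect word alternatives for time slots where even the best
// hypothesis is below a confidence cut-off. The response shape used here
// (results[].word_alternatives[].alternatives[] with `word`/`confidence`)
// is an assumption about the Watson response; the data below is made up.
function collectUncertainWords(recognitionResult, cutoff = 0.8) {
  const uncertain = []
  for (const result of recognitionResult.results || []) {
    for (const slot of result.word_alternatives || []) {
      // Sort a copy so the most confident hypothesis comes first.
      const ranked = [...slot.alternatives].sort((a, b) => b.confidence - a.confidence)
      if (ranked.length > 0 && ranked[0].confidence < cutoff) {
        uncertain.push({
          start: slot.start_time,
          end: slot.end_time,
          words: ranked.map((alt) => alt.word),
        })
      }
    }
  }
  return uncertain
}

const sampleResult = {
  results: [{
    word_alternatives: [
      { start_time: 0.1, end_time: 0.5, alternatives: [{ word: 'cloud', confidence: 0.97 }] },
      { start_time: 0.5, end_time: 0.9, alternatives: [
        { word: 'cast', confidence: 0.3 },
        { word: 'podcast', confidence: 0.6 },
      ] },
    ],
  }],
}

console.log(collectUncertainWords(sampleResult))
```

Only the second slot is kept here, because its best hypothesis stays below the 0.8 cut-off; the stored `words` array could then be indexed alongside the transcript.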

Once the transcription is ready and the results are available, I send the transcribed text to the Watson NLU service for analysis:

.then((speechRecognitionResults) => {
            const transcript =
              speechRecognitionResults.result.results[0].alternatives[0]
                .transcript;

            const analyzeParams = {
              features: {
                categories: {
                  limit: 3,
                },
                concepts: {
                  limit: 3,
                },
              },
              text: transcript,
            };

Here I request the top 3 Categories and Concepts for the transcribed text. I plan to use the Categories as taxonomies and automatically create filter options for the frontend. With the Concepts, on the other hand, I plan to generate higher-level connections between the podcasts themselves. This way I can fairly easily build a recommendation system for when the user is searching for a specific phrase. The difference between Categories and Concepts is best understood with an example: a research paper about deep learning might return the concept “Artificial Intelligence” even though the term is never mentioned.
After the transcription and analysis are done, I forward the user to a review page, where they can manually check the generated data and edit it if necessary.
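
The recommendation idea based on shared Concepts is not implemented yet, but a first version might be as simple as the following sketch. The podcast titles and `concepts` arrays are made-up sample data, and `relatedPodcasts` is a hypothetical helper, not part of the application.

```javascript
// Sketch: rank other podcasts by how many Concepts they share with a given one.
// The `concepts` arrays and titles are made-up sample data, not real output.
function relatedPodcasts(target, allPodcasts) {
  const targetConcepts = new Set(target.concepts)
  return allPodcasts
    .filter((p) => p.title !== target.title)
    .map((p) => ({
      title: p.title,
      shared: p.concepts.filter((c) => targetConcepts.has(c)),
    }))
    .filter((p) => p.shared.length > 0)
    .sort((a, b) => b.shared.length - a.shared.length)
}

const podcasts = [
  { title: 'Deep Learning 101', concepts: ['Artificial Intelligence', 'Neural network', 'Statistics'] },
  { title: 'Chess Engines', concepts: ['Artificial Intelligence', 'Chess'] },
  { title: 'Bread Baking', concepts: ['Yeast', 'Flour'] },
  { title: 'Data Science Weekly', concepts: ['Statistics', 'Artificial Intelligence'] },
]

console.log(relatedPodcasts(podcasts[0], podcasts).map((p) => p.title))
// [ 'Data Science Weekly', 'Chess Engines' ]
```

With only three Concepts per podcast the overlap is coarse, so this would be a starting point rather than a finished recommender.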

If everything is alright the user can then save the podcast in the database! And the best part here is that the saving is done with just one line of code:

const Podcast = require("../models/Podcast")
await Podcast.create(req.body)

The most important part here is to make sure that the names of the input fields correspond to the field names in the Schema of the model. For example, the Schema for the Podcast model looks like this:

const PodcastSchema = new mongoose.Schema({
  title: {
    type: String,
    required: true,
    trim: true,
  },
  description: {
    type: String,
    required: true,
  },
  transcript: {
    type: String,
    required: true,
  },
...})

And the input fields of the review form have the same name tags:

<input id="title" type="text" class="validate" value="{{title}}" name="title">
<label for="title">Title</label>

<textarea id="description" class="materialize-textarea" name="description">{{description}}</textarea>
<label for="description">Description</label>

<textarea id="transcript" class="materialize-textarea" name="transcript">{{transcript}}</textarea>
<label for="transcript">Transcript</label>
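
Since a mismatch between a form field name and a Schema field is easy to miss, a small check like the following can make it visible. This is a sketch under assumptions: `podcastFields` just mirrors the three fields shown above, and `checkFormBody` is a hypothetical helper that is not part of Mongoose.

```javascript
// Sketch: keep only the form fields that the schema knows about and report
// required ones that are missing. The field list mirrors the three fields
// shown above; the helper itself is hypothetical and not part of Mongoose.
const podcastFields = {
  title: { required: true },
  description: { required: true },
  transcript: { required: true },
}

function checkFormBody(body, fields) {
  const accepted = {}
  const missing = []
  for (const [name, rules] of Object.entries(fields)) {
    const value = body[name]
    if (typeof value === 'string' && value.trim() !== '') {
      accepted[name] = value
    } else if (rules.required) {
      missing.push(name)
    }
  }
  // Fields the form sent but the schema does not define (e.g. a misspelled name).
  const ignored = Object.keys(body).filter((name) => !(name in fields))
  return { accepted, missing, ignored }
}

console.log(checkFormBody(
  { title: 'Episode 1', Description: 'wrong capitalisation', transcript: 'hello' },
  podcastFields
))
```

In this example the form sends `Description` instead of `description`, so the field shows up under `ignored` and the required `description` is reported as `missing`.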

Search


Another important function of the application is the ability to search the transcribed text. As I already mentioned above, MongoDB provides text indexes to support text search queries on string content. In order to perform text search queries I had to set a text index on my collection:

PodcastSchema.index({ transcript: 'text'})

With the index set, I could use the `$text` query operator to perform text searches. The $text operator tokenizes the search string using whitespace and most punctuation as delimiters and performs a logical OR of all such tokens. So when the user posts a search request, I just call the find method on the Podcast model and search for the phrase they entered. I also sort the results by the relevance score that MongoDB provides for each search query:

const podcasts = await Podcast.find(
      { $text: { $search: `${searchString}`, $language: "none" } },
      { score: { $meta: "textScore" } }
    ).sort({ score: { $meta: "textScore" } }).lean()
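
To get a feeling for what “tokenize and OR the tokens” means, here is a deliberately simplified, in-memory illustration. It is not how MongoDB implements `$text` (there is no stemming, no stop-word handling and no real textScore); the score below just counts matching tokens, and the documents are made up.

```javascript
// Simplified illustration of OR-style token matching. This is NOT MongoDB's
// implementation (no stemming, stop words or real textScore); the scoring
// below simply counts how many search tokens occur in a transcript.
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean)
}

function orSearch(docs, searchString) {
  const tokens = [...new Set(tokenize(searchString))]
  return docs
    .map((doc) => {
      const words = new Set(tokenize(doc.transcript))
      const score = tokens.filter((t) => words.has(t)).length
      return { title: doc.title, score }
    })
    .filter((hit) => hit.score > 0)    // OR: one matching token is enough
    .sort((a, b) => b.score - a.score) // most relevant first
}

const docs = [
  { title: 'A', transcript: 'Cloud computing with Node.js' },
  { title: 'B', transcript: 'Baking bread at home' },
  { title: 'C', transcript: 'Node.js, cloud hosting and cloud costs' },
]

console.log(orSearch(docs, 'cloud hosting'))
```

Document A still matches although it only contains one of the two search words, which is the OR behaviour described above; C ranks first because it contains both.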

CI/CD


Because this is only a prototype (and because I had no time left), I didn’t really bother with CI/CD. For now I just set up some basic rules in GitLab to make sure that no changes are merged into master without an approved pull request, and I deploy the application to Cloud Foundry manually. As I already said, after the initial configuration of the ibmcloud CLI I just need to run ibmcloud cf push. If I decide to scale the project in the future, I will definitely implement a CI/CD solution (maybe with Jenkins).

Testing


Again, because my main goal was to just build a working prototype I didn’t really spend much time on testing either. After the decision to go with Node.js I installed Mocha and Chai for assertion testing.


6. TODOs

Of course I couldn’t achieve everything I planned, partly because of the problems with k8s and docker, but also because of my poor time management.

I still have to implement a file storage solution that can scale with the project. This is something that can take a lot of time again, because I have no prior experience with it.
The lack of integration and functional tests is also something I will have to fix in the future.
And if I still have the motivation, maybe I can upgrade the search functionality as I initially planned.


7. Conclusion


Despite the troublesome start and the months I lost because of my frustration with Docker and k8s, I was still able to achieve the goals I set for myself. This speaks volumes about the speed of cloud development (if you have the experience, of course :D). In the end I think I achieved all the main goals I set at the beginning:
I was able to build a working prototype in a couple of weeks (keep in mind that I didn’t work on it all of the time).
I got familiar with the 3 main cloud platforms and gained a deeper understanding of how the cloud operates.

Admin Panel (Web App) in the AWS Cloud

1. Introduction

As part of the lecture “Software Development for Cloud Computing”, we decided as a group to continue working on a startup project, building on a partly existing code base. Our main focus was on expanding the DevOps aspects and on building a stable and secure system that can also be used in a production environment. In an extensive project like this one, considerations about scalability and costs naturally play a fairly large role as well. We will take a closer look at points and goals like these later in this post.

2. About the Project

As part of a startup idea, the goal was to create a kind of admin panel in which restaurants can, among other things, enter and manage their weekly changing dishes. Nutritional information, legally required labels such as allergens and further information about the food are then added and managed automatically. Later, customers are supposed to automatically receive a menu that is tailored specifically to their diet plan. The healthy, individually matched food is then delivered fresh to the customers.

The admin panel was implemented as an SPA with Vue.js in the frontend and a GraphQL API based on Go. We use PostgreSQL for data storage. In addition, we use AWS services for user authentication and for storing user uploads. The following graphic shows a simplified view of the individual services and their relationships to one another.

3. Cloud Architecture

In this chapter we explain which AWS services we used to bring the basic architecture of the application into the cloud. We chose AWS as our cloud provider because it has by far the largest market share and was one of the very first providers of cloud computing services. As a consequence, AWS has the most resources available online or in the literature, which we hoped would make getting started with the topic easier. AWS also has the largest and most globally distributed infrastructure. Its range of cloud services is by far the largest as well, and very diversified on top of that, so we expected to find everything we would ever need in this ecosystem. Comparable services from other cloud providers are also often not as mature and well developed as those of AWS.

We provisioned the entire cloud architecture with Terraform. Terraform makes it possible to provision cloud resources using a templating syntax, which is commonly referred to as Infrastructure as Code (IaC). The advantages are manifold:

  • The overview of the resources is preserved and does not get lost in the jumble of the AWS console.
  • The infrastructure can be versioned like program code and thus brings the usual advantages of a version control system.
  • Resources can be brought up and torn down again via the Terraform CLI, which makes transferring them to other cloud accounts very easy.
  • Abstracting Terraform code into modules, together with various syntactic aids, allowed us to comfortably maintain a separation between the development, staging and production environments.

The following code shows the main.tf file for our staging environment. From here we control the provisioning of the services we need via Terraform modules we wrote ourselves. The abstraction into modules allows us to configure each environment separately without repeating code unnecessarily.

Brief overview of the cloud approach

3.1 Frontend (Client)

The frontend is a compiled Vue.js project, so nothing more than static files. Deploying it to the AWS cloud is therefore very simple and direct: the static files are uploaded to a public S3 bucket. That alone is already enough to run a website. However, it is highly recommended to put a CDN service in front of it, which in the case of AWS is CloudFront.

Source: https://aws.amazon.com/de/blogs/networking-and-content-delivery/amazon-s3-amazon-cloudfront-a-match-made-in-the-cloud/

CloudFront serves cached static files and has a globally distributed network of servers. This reduces latency, because the frontend is fetched from the nearest CloudFront location instead of from the location of the S3 bucket. CloudFront also takes care of SSL, DDoS protection and more for us.

3.2 Backend (API & Database)

Choosing the cloud stack for the backend of a web application is much more complicated than for the frontend. The options range from EC2 instances, Kubernetes and Elastic Beanstalk to serverless functions.

3.2.1 The Right Service for the API

The question of whether to go with a serverless stack should be asked at the very beginning of development. Lambda functions offer many advantages, especially in terms of cost, as well as a high degree of confidence when it comes to scaling. We decided against them. One reason was that, for an admin API, we did not expect unexpectedly high request volumes but rather a fairly constant load. The main reason, however, was probably the insufficiently developed ecosystem of developer tools for an increasingly complex software project.

EC2 instances, which are basically just virtual machines, seemed to us not specialized enough for a service that is simply supposed to run our Docker images, and too powerful on top of that. This option was therefore ruled out right away.

AWS itself promotes Elastic Beanstalk (EB) very heavily as the choice for deploying web applications. We cannot give a precise technical argument for why we decided against this solution. During our research, however, we came across a lot of criticism of this service. Professional DevOps engineers consistently advised against EB and in favour of a container service instead.

There are two options here: the industry standard Kubernetes and its little brother, the Elastic Container Service (ECS) developed specifically by AWS. None of us had much experience with Kubernetes; we had only heard that it is more powerful and more flexible to configure, but also more expensive. We had no use for any of that. ECS, on the other hand, does exactly what we want: it runs instances of our API image, can scale them horizontally in combination with a load balancer, and can even be linked to auto-scaling policies.

3.2.2 Database

To run our Postgres database we use a dedicated, specialized service, which we also highly recommend. In the case of AWS this is the Relational Database Service (RDS), at least for relational databases. The hardware is tuned specifically for database access, and system updates and backup automation are included. Aspects such as scaling and fault tolerance (high availability through replicas) can also be implemented easily with this service.

With Aurora, RDS also offers a special type of engine that enables auto-scaling of databases. The offering is fairly new and definitely very interesting. It is also said to offer higher performance (3x faster for Postgres databases). We decided against it, or rather have not decided for it yet, for two reasons:

  1. RDS Aurora is significantly more expensive than a simple RDS instance, even at its base scaling.
  2. So far Aurora supports Postgres only in version 10, and we would lose the features that Postgres 12 offers us.

3.2.3 VPC

For security reasons we do not want to connect the database to the internet. It should only be reachable by our API services. This means we have to set up a VPC, a Virtual Private Cloud.

The VPC consists of a private subnet, where our ECS and our database live, and a public subnet, where a NAT gateway enables outbound connections. A hard requirement of the whole setup is the distribution across at least two Availability Zones, so to be precise there are two private subnets, two public subnets and two NAT gateways.

Source: https://user-images.githubusercontent.com/884507/34551896-a3a838a6-f0d2-11e7-8858-c4de887fb225.png

3.4 User Uploads CDN

One sub-architecture in our infrastructure deserves a separate look: the so-called “Serverless Image Handler”, which we use to handle how image files uploaded by users are retrieved.

When users upload their own images, the request goes through our API and the file is then stored in an S3 bucket. These images are displayed in various places, e.g. in the detail and list views of recipes. Since the images are uploaded by users, their dimensions and file sizes vary and are not optimized. In the list view we then noticed that we load 15 images at once and actually only need them at a size of 80×60 pixels for the thumbnail. That is potentially several megabytes of data traffic just to load the list view.

While looking for a solution, we fairly quickly came across the “Serverless Image Handler”, a complete solution from AWS in the form of a CloudFormation stack.

This solution involves a CloudFront endpoint, an API Gateway, a Lambda function and of course the S3 bucket in which the image files are stored. It works as follows: the image is requested from the CloudFront endpoint. The request contains the name of the S3 object and the desired scaling. CloudFront forwards the request to the API Gateway, which triggers a Lambda function. In this Lambda function the object is fetched from the S3 bucket, scaled and compressed. Finally, the optimized image file is returned to the client. From then on it is also in the CloudFront cache and can be retrieved considerably faster on the next request.

Source: https://aws.amazon.com/de/solutions/implementations/serverless-image-handler/
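
To illustrate how a request can carry both the S3 object name and the desired scaling, here is a minimal sketch. It assumes the endpoint accepts a base64-encoded JSON document with `bucket`, `key` and `edits.resize`; the exact fields, the domain `images.example.com`, the bucket name and both helper functions are assumptions for illustration and should be checked against the documentation of the deployed solution.

```javascript
// Sketch: build a thumbnail URL for an image-resizing endpoint.
// Assumptions: the endpoint accepts a base64-encoded JSON document with
// `bucket`, `key` and `edits.resize`; the domain and bucket name are made up.
function buildImageUrl(endpoint, bucket, key, width, height) {
  const request = {
    bucket,
    key,
    edits: { resize: { width, height, fit: 'cover' } },
  }
  const encoded = Buffer.from(JSON.stringify(request)).toString('base64')
  return `${endpoint}/${encoded}`
}

// Reverse step, roughly what the Lambda side has to do before fetching from S3.
function parseImageRequest(path) {
  const encoded = path.replace(/^\//, '')
  return JSON.parse(Buffer.from(encoded, 'base64').toString('utf8'))
}

const url = buildImageUrl('https://images.example.com', 'user-uploads', 'recipes/42.jpg', 80, 60)
console.log(url)
console.log(parseImageRequest(new URL(url).pathname))
```

Because the full request is part of the URL, every size variant of an image gets its own cache entry in the CDN, which is what makes the second request for the same thumbnail fast.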

4. CI/CD

Setting up a CI/CD pipeline is usually quite complex and challenging. At the start you can only guess how to build it so that it is solid and “long-lived” from the beginning. You also have to keep in mind, and think carefully about, how everything is supposed to work together once several people collaborate later on. In principle there is no single right solution for CI/CD, because the requirements also depend on the project at hand. A version control system such as Git or SVN is of course the basis for a CI/CD pipeline in the first place. For our pipeline it was important to us, on the one hand, to be able to run pre-written tests (CI) and, on the other hand, to automate the deployment to both our staging environment and our production environment (CD).

Before dealing with the topic, you should consider what the project needs in order to run tests, for example. In our case the integration tests use a database of their own.

For automated testing you should create conditions that are as close to production as possible; otherwise tedious bugs can creep in that you only notice in production, even though your own test suite would already have covered the case. Finding these errors is then very laborious, because you normally assume that something that was tested successfully also works. So you want to keep the environment (operating system, dependencies, configuration, …) identical.

For testing in our CI pipeline we therefore use Docker and Docker Compose. This allows us to run the tests on the project’s image instead of in the runtime environment of the CI runner. The price for this consistency is higher costs and longer waiting times when running the CI.

The CD part of our CI, i.e. the continuous deployment, runs after the tests for commits on the develop and master branches. The Docker image is tagged accordingly and uploaded to the AWS registry. Finally, the ECS cluster is restarted using the AWS CLI, so that the new image takes effect. For our Vue.js project, in turn, the compiled files are simply synchronized with the S3 bucket and the CloudFront cache is invalidated to realize CD.

Everything is set up so that our staging environment corresponds to the state of the develop branch and our production environment to that of master.

For illustration, below you can see the CircleCI configuration for our API repository. We have also configured further pipelines for our cloud repository (consisting of Terraform files) and for the repository containing our Vue.js frontend.

We chose CircleCI partly because of positive personal experience. The service has excellent documentation and support, and it is very powerful in its configuration. Before we used CircleCI, we tried to set up CI with GitHub Actions. The service was still very new at the time, and it still is. We had great difficulty setting up CI with Docker Compose and had a generally bad experience. The only problem with CircleCI, which we only became aware of later on, is that it is very expensive, even compared to similarly capable CI services. In the future we may switch to GitLab CI and would then also consider hosting the CI runner ourselves on the AWS platform.

5. Further Services

The AWS cloud offers a multitude of further services, including at the Software-as-a-Service (SaaS) level, which we make use of.

For the authentication of admins, employees and customers we use Cognito user pools. Cognito is the complete AWS solution for authentication. It can also be used to configure verification emails, 2FA procedures, forgot-password dialogs and more. For example, we connected Cognito’s email delivery to another AWS service: Simple Email Service (SES).

Another important AWS service we use is Route 53. We use it to set our DNS rules and to set up subdomains for various CloudFront endpoints and the API backend.

We also use the Amazon Translate API to internationalize texts on the website, and Lambda functions as connecting pieces between various AWS services.

6. Conclusion

We naturally learned a great deal in this project. Cloud computing, the AWS cloud and the Infrastructure as Code approach were uncharted territory for all of us before this project. Accordingly, we had to familiarize ourselves with a lot of topics, and the project also kept changing over time. Today, however, we are very satisfied with our decisions about the infrastructure and above all with our DevOps processes and the integrations with team software such as Jira or Slack, which make development much more pleasant.

As far as the actual infrastructure is concerned, we think we will be able to reflect on it better once the application actually goes into production. At that point we will surely be confronted more intensely with other topics such as logging and monitoring.

Finally, we would like to motivate you to engage more intensively with the topic of cloud and DevOps as well. It cost us a lot of time and effort to understand the concepts of cloud computing and to find our way around the many services. By now, however, we feel very comfortable in this world. With the knowledge we have accumulated, we see ourselves in a position to publish and manage scalable applications at enterprise level. The considerable work we put into our DevOps strategy is also paying off in development and helps us write good, reliable software in a sustainable way.

By Marcel Gregoriadis, Julian Fritzmann, Sandro Schaier & Jan Groenhoff.

TS3 Voice Channel Manager – Create and push a Bot to its Limits

by Jan Kaupe (jk206)

Figure 1: Web Configuration Panel for the Bot

Introduction

TeamSpeak³ is a Voice-over-IP application allowing users to connect to a server where they can join Voice Channels to communicate with each other.

Anyone can download and host their own TS³ server, and huge community servers have been established. However, these servers usually have far more Voice Channels than users, which degrades the user experience.

To resolve this issue, I created a proof-of-concept Bot which is able to create and delete Voice Channels on demand. This Bot can be managed via a Web Configuration Panel.

Figure 2: TS3 Voice Channel Manager Showcase

GeoDarts

by Jannik Igney [ji016] & Timothy Geiger [tg079]

1. Introduction

For our course “Software development for cloud computing” we developed a little multiplayer browser game named “GeoDarts”. Goal of the game: guess where cities are located in Germany on a map and be closer than your opponents. Goal of our project: we wanted to learn what it takes to bring an application like that to the cloud and to test different solutions for this task. We also wanted to get some first hands-on experience with different technologies including socket.io, Vue.js and Mapbox. There were many highs and lows and many lessons learned along the way. In the following we describe some of the problems we encountered in different areas, along with the solutions we found for them and the new problems arising from these solutions…

To get a rough impression of what our game looks like:

The blue markers are the ones set by the players, the red one is the solution. The player that is the furthest away gets one point, the next one two points and so on. If two players have the same number of points in the end, their total distance will decide.
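
The scoring rule described above can be sketched in a few lines. This is an illustration only: using the haversine formula for the distance is an assumption (the post does not say how distances are measured), the helper names are made up, and guesses with exactly equal distances are not treated specially here.

```javascript
// Sketch of the scoring rule: the farthest guess gets 1 point, the next one 2, etc.
// Distances use the haversine formula (an assumption); helper names are made up.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180
  const R = 6371 // mean Earth radius in km
  const dLat = toRad(b.lat - a.lat)
  const dLng = toRad(b.lng - a.lng)
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2
  return 2 * R * Math.asin(Math.sqrt(h))
}

function scoreRound(guesses, solution) {
  return guesses
    .map((g) => ({ player: g.player, distanceKm: haversineKm(g, solution) }))
    .sort((a, b) => b.distanceKm - a.distanceKm) // farthest first
    .map((entry, index) => ({ ...entry, points: index + 1 }))
}

// Final ranking: more points win; on a tie the smaller total distance wins.
function rankPlayers(totals) {
  return [...totals].sort((a, b) => b.points - a.points || a.totalKm - b.totalKm)
}

const solution = { lat: 48.78, lng: 9.18 } // roughly Stuttgart
const round = scoreRound([
  { player: 'anna', lat: 48.8, lng: 9.2 },
  { player: 'ben', lat: 52.5, lng: 13.4 },
  { player: 'cem', lat: 50.1, lng: 8.7 },
], solution)

console.log(round.map((r) => `${r.player}: ${r.points}`))
```

With three players the closest guess ends up with three points, and `rankPlayers` applies the tie-break by total distance once all rounds are summed up.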

2. Technologies

For the application itself we decided to work with Node.js, Socket.io and Vue.js.

2.1 Why NodeJS

We decided to use NodeJS for our backend because we had already worked with it in the lecture Web Development 2. We were therefore able to start working on the game right away without having to deal with a new programming language and framework. Popular alternatives would have been Django or Flask. However, both frameworks require advanced knowledge of Python that neither of us has. Another alternative that is becoming more and more popular is Deno. Deno was created by the same developer who originally developed NodeJS, and he wanted to improve some things in Deno that he didn’t like about NodeJS. You can write code for Deno in JavaScript or TypeScript. Even though we have no experience with TypeScript, we have heard almost only good things about the language. Unfortunately, Deno has two major downsides: it has a relatively small community and you cannot use npm. NodeJS, in contrast, has a huge community. You will hardly find any question about NodeJS that has not been asked and answered before, which saves you a lot of time and headaches when debugging.

2.2 Why socket.io

In the beginning, we didn’t know how to implement a real-time game. After some research we came across a technology called WebSockets. WebSockets allow communication in both directions: not only can the client send data to the server, the server can send data to the client as well. They work like this: you open a TCP connection and leave it open until you don’t need it anymore. In NodeJS you can implement WebSockets in different ways. We chose one of the most prominent solutions: Socket.IO. It is very easy to get started with; with only a few lines of code you already have a functional application running. It is also widely used and you can find plenty of learning material. And that is not even all of its features, unfortunately: at a very advanced stage of development we learned about features we hadn’t known about before, and they turned out not to be advantageous for us. We will see later why.

2.3 Why Vue.js

Similar to Node.js we had some first basic experience with Vue.js but never did a major project with it. That’s why we wanted to use this opportunity to deepen our knowledge of Vue.js. Many concepts of Vue.js are quicker to learn than with competitors like React. Also there’s great support from a large community – perfect conditions for some experimenting.

3. Architecture and program flow

In the beginning our application just consisted of a single Node.js Express server, communicating with clients via socket.io. At some point we decided that we would like to be able to scale the app horizontally and have multiple instances available. The reason for that decision was of course not that we feared our server might collapse under the traffic generated by hundreds of thousands of excited users of our game. We simply wanted to gain some first experience with scaling and find out what solutions there are and what problems they come with.
The graphic below shows the architecture we ended up with. Redis and Node.js are deployed in separate Kubernetes Services. In order to make use of multiple Node.js instances we first had to overcome quite a few obstacles and still did not achieve a perfect solution, as described further below. However, we learned a lot in this fight, and in the end that was our most important goal.

Once the client has loaded the page all further communication with the server happens via the websocket. The following chart illustrates the flow of events from creating a new game to transmitting the final results:

4. How to get a map?

One of the core requirements for a geography game like ours is of course embedding an interactive map. Since most developers will probably come across the big topic of maps sooner or later, we were glad to use this as an opportunity to take a first look at some of the related technologies. The first thing we learned was that maps integrated in websites are usually not one big piece of data but a grid of single tiles, put together like pieces of a puzzle. That way only the required areas of a map need to be loaded, e.g. when zooming in. Traditionally these tiles were served as images (“raster tiles”), but considering that you need new tiles for every zoom level and that every image is a lot of data, you might end up with a very slow map that is not much fun to use. Also, although you can add custom features on top of raster tiles later, you cannot style single layers of the map, because there is just that one layer. That’s why most map service providers support another technique by now: vector tiles. With this approach, the tiles are not images made of pixels but vector data, an exact geometric description of every element in the map, which is then rendered in the browser. The main advantages are shorter loading times (smaller data size), smoother zooming (no need for multiple tilesets for different zoom levels) and easy customization of specific layers. Though this technique places higher demands on the client’s browser and hardware, it is considered the superior approach for many use cases nowadays. That’s why we decided to go with vector tiles, though if we had to start over, we would probably try to do it differently.
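
To make the idea of a tile grid more concrete, the following sketch computes which tile contains a given coordinate at a given zoom level. It uses the common Web Mercator “slippy map” scheme (z/x/y); whether a particular provider addresses its tiles exactly this way is an assumption here, and `lngLatToTile` is a made-up helper name.

```javascript
// Sketch: which tile contains a given coordinate at a given zoom level?
// Uses the common Web Mercator "slippy map" tiling scheme (z/x/y); whether a
// particular provider uses exactly this scheme is an assumption here.
function lngLatToTile(lng, lat, zoom) {
  const n = 2 ** zoom // number of tiles per axis at this zoom level
  const latRad = (lat * Math.PI) / 180
  const x = Math.floor(((lng + 180) / 360) * n)
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  )
  return { zoom, x, y }
}

// The whole world fits into a single tile at zoom 0 ...
console.log(lngLatToTile(9.18, 48.78, 0)) // { zoom: 0, x: 0, y: 0 }
// ... while every further zoom level quadruples the number of tiles.
console.log(lngLatToTile(9.18, 48.78, 6))
```

This also shows why raster tiles get expensive: each zoom level needs its own, four times larger set of images, whereas vector data can be re-rendered at any scale.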

When it comes to the question of which technologies to use, there are many alternatives for both client-side libraries (e.g. Leaflet, Mapbox GL JS) and map tile providers (Mapbox, Google, OpenStreetMap). In our case it was important to be able to style the map, so that we could remove all labels and features except country borders and give single countries their own colour. Mapbox is a big platform for all kinds of map services, including an extensive map style editor called Mapbox Studio. We decided to try it, since Mapbox also offers a large community and good tutorials. Mapbox hosts vector tiles for free as long as you stay under 50,000 requests per month, so there was no need to worry about costs for our project.

We were happy with Mapbox, since it is fairly easy to use and fulfilled all our demands. However, with the knowledge we now have about how displaying maps works, we would probably try to solve the problem completely differently. The most important advantage of vector tiles is good performance even with heavy zooming and jumping around on a multilayer map. The map that we needed for our game, however, neither has many layers nor involves any zooming or changing of position. That’s why we probably could have rendered the map from a simple GeoJSON file, e.g. with the library d3.js. A GeoJSON file describes all features of a map in JSON format and can be used as a starting point for vector tiles. Given our low requirements on the map, we probably wouldn’t have needed vector tiles and could have done without the dependency on Mapbox.
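As a rough illustration of that alternative, the following sketch turns a GeoJSON polygon into the path data of an SVG element without any library. The feature “Squareland”, the function names and the simple equirectangular projection are assumptions made for this example; d3.js offers far more capable projections.

```javascript
// A hypothetical, heavily simplified country outline in GeoJSON format.
const country = {
  type: 'Feature',
  properties: { name: 'Squareland' },
  geometry: {
    type: 'Polygon',
    // One outer ring of [longitude, latitude] pairs; first and last point match.
    coordinates: [[[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]],
  },
};

// Equirectangular projection: map lon/lat linearly onto a width x height canvas.
function project([lon, lat], width, height) {
  const x = ((lon + 180) * width) / 360;
  const y = ((90 - lat) * height) / 180;
  return [x, y];
}

// Turn a Polygon feature into the "d" attribute of an SVG <path>.
function polygonToSvgPath(feature, width, height) {
  return feature.geometry.coordinates
    .map(
      (ring) =>
        ring
          .map((point, i) => {
            const [x, y] = project(point, width, height);
            return `${i === 0 ? 'M' : 'L'}${x} ${y}`;
          })
          .join(' ') + ' Z'
    )
    .join(' ');
}

console.log(polygonToSvgPath(country, 360, 180));
```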

5. Frontend development – SPA = Single Page Application or Socket Problems Ahead

Writing the frontend of our application with Vue.js was a new experience for us in several respects. Even though it is really easy to create a first outline of a page with Vue.js, it still takes quite some time to get behind concepts like the lifecycle hooks and communication between components, and to get used to the development server and UI libraries.

All these things go beyond the scope of this post. A more general aspect of frontend frameworks like Vue.js is that what you get is an SPA, a Single Page Application. This opens up new possibilities and can improve performance, but it also changes some very basic assumptions that you might be used to if, like us, you come from developing old-school static web pages. The most important one is probably that your components are reused on the client side without being reloaded from the server. That sounds like no big deal, but it can cause some quite confusing errors if you forget about it. The fact that your entire frontend is now stateful has some big advantages, like being able to store data and objects and to keep a permanent socket connection open. You just shouldn’t forget that, unlike in a static frontend, jumping between different URIs won’t reset that state, because the page is not reloaded from the server. For instance, if you use a timer in your component and don’t stop it when leaving the component, it might not start anew but just keep running when you revisit that page.

At one point it took us a long time to figure out why certain actions happened twice: every chat message we sent appeared twice in the chat. Eventually we found out that our websocket listener functions were registered again every time a component was mounted, because we never removed the old ones. We handled this by writing a wrapper function for socket.addListener() which first removes all listeners for the specified event and then adds the new one. Another approach would be to use vue-router’s beforeRouteLeave hook to remove listeners when leaving the current route.
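The wrapper idea can be sketched as follows. Node’s EventEmitter stands in for the client socket here, and the name addUniqueListener is made up for this example; we assume that the real socket offers removeAllListeners() in the same way.

```javascript
const { EventEmitter } = require('events');

// Register a handler for an event after removing any handlers that an
// earlier mount of the component may have left behind.
function addUniqueListener(socket, event, handler) {
  socket.removeAllListeners(event);
  socket.on(event, handler);
}

// Stand-in for the client socket, which also behaves like an event emitter.
const socket = new EventEmitter();
const received = [];

// Simulate the same component being mounted twice.
addUniqueListener(socket, 'receiveMsg', (msg) => received.push(msg));
addUniqueListener(socket, 'receiveMsg', (msg) => received.push(msg));

socket.emit('receiveMsg', 'hello');
console.log(received.length);
```

The drawback of this approach is that only one listener per event can exist at a time, which was acceptable for our use case.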

Another point where we needed to adapt our stateful application to the stateless web platform is the game flow. We needed a purely frontend mechanism that makes sure that a user can only access those subpages that represent the phase of the game that he is currently in. For example, when the results are displayed we don’t want players to be able to navigate back into the game view. Nor do we want them to see the game view of a game they have not joined. For that purpose we created a room token that is stored in a Vuex store when the player successfully joins a game and deleted when the game is over. Using this token, the components can verify that a user is actually allowed to enter the component he navigated to.
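A minimal sketch of such a check, written in the style of a vue-router navigation guard with the (to, from, next) signature. The plain store object is only a stand-in for Vuex, and we assume for this example that the token simply equals the room id.

```javascript
// Hypothetical stand-in for the Vuex store that holds the room token.
const store = {
  state: { roomToken: null },
  setToken(token) { this.state.roomToken = token; },
  clearToken() { this.state.roomToken = null; },
};

// Players without a matching token are sent back to the start page.
function requireRoomToken(to, from, next) {
  const token = store.state.roomToken;
  if (token && token === to.params.roomId) {
    next();
  } else {
    next('/');
  }
}

// Record where each navigation attempt ends up.
const visited = [];
const next = (target) => visited.push(target === undefined ? 'allowed' : target);

requireRoomToken({ params: { roomId: '12345' } }, {}, next); // not joined yet
store.setToken('12345');                                      // player joins
requireRoomToken({ params: { roomId: '12345' } }, {}, next);
store.clearToken();                                           // game is over
requireRoomToken({ params: { roomId: '12345' } }, {}, next);

console.log(visited);
```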

6. Backend Introduction

This chapter is about roughly explaining how our backend works. We will take a closer look at the backend later when it comes to scaling because many decisions for certain technologies and techniques only came up through scaling. Therefore, it would not make much sense if we were to look at it right now.

Before we started the development, we thought about how to make it possible for players to play the game at the same time in different rooms. With the techniques we knew up to that point, we couldn’t find a solution, so we did some research. After a while we came across Socket.IO. Besides making it possible to develop real-time applications, socket.io also offers exactly the kind of rooms that we need for our game. So how does it work? If a player wants to create a new game, we just create a new room in socket.io with a unique id in order to distinguish the rooms. Now a second player wants to join the game. To join the room from before, all he needs is the unique id. That’s it. Now we can create rooms and join them.

But how can we address only the players in a specific room? With socket.io this is done via events. We can emit events from the server to the client and the other way around. An example: player ‘A’ wants to send a message to all other players in the same room. So he emits an event called ‘sendMsg’ with the message as a parameter. The server receives this event and determines the room the socket is currently in. Afterwards, the server itself emits an event named ‘receiveMsg’ with the same message as a parameter and the determined room as the destination. Every client connected to the server listens for this event, but since the event is only sent to the sockets in that room, only these sockets receive it. The received message can then be displayed in the chat. Once you understand the logic behind it, it is actually quite simple. Our entire client-server communication works with these events. Socket.io also saves us some work: it automatically deletes rooms that are no longer needed, which is the case when no player is left in a room.
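The event flow described above could look roughly like this on the server. This is a sketch and not our actual code: the event name ‘joinRoom’ and the roomOfSocket map are assumptions, while socket.join() and io.to(room).emit() are the Socket.IO calls we rely on.

```javascript
// Wire up the room and chat events for one connected socket.
// `io` and `socket` are expected to behave like a Socket.IO server and socket;
// `roomOfSocket` is a Map that remembers which room each socket joined.
function registerRoomHandlers(io, socket, roomOfSocket) {
  socket.on('joinRoom', (roomId) => {
    socket.join(roomId);
    roomOfSocket.set(socket.id, roomId);
  });

  socket.on('sendMsg', (message) => {
    const roomId = roomOfSocket.get(socket.id);
    if (roomId !== undefined) {
      // Only sockets that joined this room receive the event.
      io.to(roomId).emit('receiveMsg', message);
    }
  });

  socket.on('disconnect', () => roomOfSocket.delete(socket.id));
}
```

With a real server this function would be called once per connection, for example inside io.on('connection', (socket) => registerRoomHandlers(io, socket, rooms)).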

However, one problem remains. How do we store data? For example, what about the points you win during the game? If we were to store the scores on the client side, you could manipulate your score. And what about the cities that are randomly selected by the game at the beginning and then queried during the course of the game? If we saved the cities on the client side, one could simply read the coordinates. That’s not what we want, so client-side storage is out of the question for our game. This means we have to store the data on the server side. But how? In the end we decided to use Redis. This was not always the case, but more details will follow later when we talk about scaling.

What is Redis? Redis is an in-memory data structure store. We can easily store data structures like strings, lists and sets. Every value is stored under an associated key, so to access a value we simply use the matching key. However, there is a catch. Unlike rooms in socket.io, the data in Redis is not deleted when all sockets in a room are disconnected. If we did nothing, our database would fill up with unnecessary data. When we first noticed the problem, there were about 1,000 entries in the database without a single game running. The number of entries in Redis is easy to determine: you open the Redis console and enter ‘keys *’, which lists all keys stored in Redis. So how did we solve the problem? Our solution is to make the keys dependent on the room or the player. Each socket in socket.io has a unique socket id and each room has a unique room id, as mentioned before. We use these IDs as prefixes for our keys. The key that stores the name of the player with socketId=12345 would then be ‘12345:playername’. Now you just have to get all keys starting with ‘12345’ in the disconnect event and delete them. In JavaScript it looks like this:
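A minimal sketch of this cleanup, assuming a client with the callback-style keys() and del() commands as offered by the node_redis package. The function name deleteKeysByPrefix is made up for this example.

```javascript
// Delete every key that starts with "<id>:". `client` is assumed to offer
// callback-style keys() and del() commands.
function deleteKeysByPrefix(client, id, callback) {
  client.keys(`${id}:*`, (err, keys) => {
    if (err) return callback(err);
    if (keys.length === 0) return callback(null, 0);
    // del() accepts a list of keys and reports how many were removed.
    client.del(keys, callback);
  });
}
```

In the disconnect handler this would be called with the socket id, e.g. deleteKeysByPrefix(client, socket.id, callback). Note that the KEYS command walks the entire keyspace, so for larger data sets the incremental SCAN command is usually the better choice.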

This way we delete all player data, so no unnecessary data remains in Redis. Furthermore, we check in the disconnect event whether the player was the last player in the room. If so, we also delete all room data with the same method. But now we use the room id instead of the socket id. This way we can avoid memory leaks.

What also cost us a lot of time was programming with Redis in general. The code became very complex very quickly. You had to read the code again and again to understand what individual parts of it did. Of course that’s something nobody wants to have. The readability of the code worsened so quickly because the Redis client for Node.js works with callbacks. If you want to get one entry, it looks like this:
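A sketch of such a single read, assuming a client with a callback-style get() command. The function name and the key layout are only examples.

```javascript
// Read one value. The result arrives in a callback, not as a return value.
function getPlayerName(client, socketId, callback) {
  client.get(`${socketId}:playername`, (err, name) => {
    if (err) return callback(err);
    callback(null, name);
  });
}
```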

This does not look very complicated yet. But if you have a lot of queries that depend on each other, as we do, the whole thing looks a lot more complicated:
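The following sketch shows how quickly dependent queries pile up. The keys and the loadPlayer function are hypothetical; the point is only that the last query needs the result of an earlier one, so every step has to be nested inside the previous callback.

```javascript
// Load name, score and room of a player, then the members of that room.
function loadPlayer(client, socketId, callback) {
  client.get(`${socketId}:playername`, (err, name) => {
    if (err) return callback(err);
    client.get(`${socketId}:score`, (err, score) => {
      if (err) return callback(err);
      client.get(`${socketId}:room`, (err, roomId) => {
        if (err) return callback(err);
        // This query depends on the room id that was just loaded.
        client.smembers(`${roomId}:players`, (err, players) => {
          if (err) return callback(err);
          callback(null, { name, score: Number(score), roomId, players });
        });
      });
    });
  });
}
```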

In order to understand what’s going on here, you really have to pay attention. After a while we found out that this pattern is called ‘Callback Hell’. It is caused by deeply nested callbacks in which each callback works with the result of the previous one. Fortunately, you can easily solve this in JavaScript with promises, or more conveniently with async and await. To save ourselves some work, we used the npm package “async-redis”. You just have to keep in mind that await can only be used inside async functions. Now we can easily rewrite the above example to make the code more readable:
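A sketch of what several dependent queries look like with async and await. We assume a client whose commands return promises, as a promise-based wrapper does; the keys and the function name are hypothetical.

```javascript
// Load name, score and room of a player, then the members of that room.
// Every await waits for the previous result, but the code stays flat.
async function loadPlayer(client, socketId) {
  const name = await client.get(`${socketId}:playername`);
  const score = await client.get(`${socketId}:score`);
  const roomId = await client.get(`${socketId}:room`);
  const players = await client.smembers(`${roomId}:players`);
  return { name, score: Number(score), roomId, players };
}
```

Errors no longer have to be checked after every single query. A single try/catch around the call, or a .catch() on the returned promise, is enough.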

7. Infrastructure vs. Platform as a service

When our application had reached a certain level, we started thinking about ways to deploy it to the cloud. Using AWS EC2 instances seemed to be a quite straightforward approach to us, since we could operate on ordinary Linux VMs and wouldn’t have to give too much control to some black box. So we created an Ubuntu Linux instance, added some rules to its security group (the AWS implementation of a virtual firewall) and installed the software we needed, like our Node.js runtime. By that time we already knew that we would need a separate Redis server that allows us to have multiple instances of the app available. Therefore we created another EC2 instance running Redis and connected our app instance to it via the internal IP address. This worked fine, but we saw some problems. First of all, hardcoding the IP of our Redis server for the connection did not seem like a great idea with respect to flexibility and exchangeability. We also realized that having to connect to our instances via SSH in order to monitor and operate the application is quite annoying. We came to the conclusion that maybe we should try some solutions that are more in the field of orchestration and platform as a service. That’s why we took a look at Cloud Foundry and Kubernetes in the IBM Cloud environment.
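One common way to avoid a hardcoded address is to read the connection settings from the environment. The following is a sketch under that assumption; the variable names REDIS_HOST and REDIS_PORT are our own choice.

```javascript
// Read the Redis connection settings from environment variables so that no
// address is baked into the code. Falls back to a local default.
function redisConfigFromEnv(env = process.env) {
  return {
    host: env.REDIS_HOST || '127.0.0.1',
    port: Number(env.REDIS_PORT) || 6379,
  };
}

console.log(redisConfigFromEnv({ REDIS_HOST: 'redis', REDIS_PORT: '6380' }));
console.log(redisConfigFromEnv({}));
```

In an orchestrated environment the host can then be a service name that is resolved via DNS instead of a fixed IP address.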

Cloud Foundry seemed to be a good choice at first, since deploying our app itself was done very quickly after choosing our version of Node.js as a runtime and then pushing our source code to the platform with the help of a simple manifest.yml file. However, we couldn’t figure out how to set up a Redis server as a separate service which our app could connect to. The longer we tried to make sense of the CF documentation, the more confusing and frustrating it became, so eventually we decided to focus on Kubernetes, which was a good decision. Though Docker and Kubernetes turned out to be very complex for beginners as well and required a lot of research, we found the documentation and tutorials to be really helpful and got a clearer understanding of how things actually work over time. So our summary for Kubernetes is “powerful and complex, but well explained and consistent in itself”.

Apart from the different technical purposes of AWS and Kubernetes and apart from all their pros and cons, we noticed huge differences in the way things are explained and presented to first-time users. With AWS we constantly felt that it’s mostly about advertising and selling a product. Kubernetes, on the other hand, as an open source platform really seems to want its users to understand the underlying technical concepts. As developers we liked that spirit much better than the commercial one, and that’s also why we chose not to work with Amazon’s PaaS solution Elastic Beanstalk for a start.

8. Scaling with Kubernetes

So what is this chapter about? As already mentioned in the chapter ‘Backend Introduction’, many decisions we made in the backend are based on problems we encountered during scaling. These difficulties occurred due to the fact that we weren’t sure at the beginning whether we wanted to scale the game. We then decided to scale the game after all. But the backend version at that time was not designed for scaling. That’s how all these problems came up. For example Redis. In the beginning we had a completely different way of storing data. We only decided to use redis later when we experienced difficulties with the old method. And this is exactly what this chapter is about. We will discuss the problems that arose while scaling a version that was not designed for scaling. So let’s start with the Kubernetes deployment, because without a deployment there are no bugs to talk about.

8.1 Docker

In order to deploy our game in Kubernetes, we first have to create an image of our game. This is done via Docker. In order to write our own Dockerfile, we first had to take a closer look at how Docker works. Each time we build our image, Docker steps through the instructions in the Dockerfile and executes them in the specified order. Every instruction in that file creates a new image layer. This mechanism allows image layers to be cached. When Docker steps through the instructions one after the other, it checks whether a layer has changed. If nothing has changed, Docker uses the cached image layer. Otherwise the instruction gets executed and all subsequent layers are no longer taken from the cache, because something could have changed. The best practice is therefore to order your image layers from the least frequently changed to the most frequently changed.

That’s why our first instruction is the node image. It is rarely changed and rather big. We use the alpine image from Node because it is faster, smaller and more secure compared to the other versions. Furthermore, we don’t need the advantages of the other images, such as ‘apt’. Here is a little size comparison from the latest version (14.11):

stretch        345.2 MB
buster         322.65 MB
stretch-slim   57.8 MB
buster-slim    51.54 MB
slim           47.33 MB
alpine         28.33 MB
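The layer ordering described above can be sketched with a Dockerfile like the following. This is an illustration and not our actual file: the port 3000 and the entry point server.js are assumptions.

```dockerfile
# Base image first: large and rarely changed, so its layers stay cached.
FROM node:14-alpine

WORKDIR /app

# The dependency manifests change less often than the source code,
# so dependencies are installed before the rest is copied.
COPY package*.json ./
RUN npm install --production

# The application source changes most often and therefore comes last.
COPY . .

# Hypothetical port and entry point of the app.
EXPOSE 3000
CMD ["node", "server.js"]
```

With this order, a change in the source code only invalidates the last COPY layer, and the dependency installation is served from the cache.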

However, the image is only available locally on our computer, so we still have to publish it somehow. This can be done with the help of a registry. There are many different registries. One of the best known is the one from Docker itself, which we decided to use: Docker Hub. Alternatively, cloud providers like AWS and IBM also offer such registries.

8.2 Kubernetes

Since we now have an image of our game, we can deploy our application to Kubernetes. Kubernetes is a popular container orchestrator. It was originally developed and released by Google and is now maintained as an open source project by the Cloud Native Computing Foundation.

In the beginning it was very difficult to get into Kubernetes, because there were many new concepts. But once we understood the basics, we were able to apply what we had learned very quickly.

A Pod is the smallest unit of deployment in Kubernetes. It can run one or more containers. Technically, a Pod can be deployed directly to Kubernetes. However, Pods are mostly deployed through controllers. There are different types of controllers: Deployment, ReplicaSet, StatefulSet, … In our case we chose a Deployment because it can manage several identical Pods. Here we can specify how many instances we want to have and which image we want to use.

Next we need a Service. A service is an endpoint to a set of pods. It is a persistent endpoint in the cluster to connect to the Pods.

Finally, Ingress. Ingress manages external access to the services in a cluster. Here you can enter additional routes that lead to other backends. But in our case there is only one backend.

In summary, we now have the following structure:

https://kubernetes.io/docs/concepts/services-networking/

8.3 Problem Solving

Now comes the really interesting part, which helped us the most in understanding scaling: determining why our app was not doing what we wanted it to do. In this blog post we want to focus on three problems. The first problem is about data storage and the other two are about websockets and socket.io.

This brings us to the first problem: data storage. We have already mentioned Redis as our final solution. But how have we stored the data before that? In the simplest way possible. We stored our data as JSON objects in an array. Here is a simplified version of a room JSON object:

As you can see, we store the RoomID. Before a player can join a room via Socket.IO, we first step through the array in which all rooms are stored. Only if the room with the specified RoomID exists, the socket is allowed to join the Socket.IO room. However, when we scaled the game, we noticed some strange logs. Apparently, a room couldn’t be found. As a result the socket wasn’t able to join a game. Yet we were sure that the room must exist. Therefore, we did a little research. After some time, we realized the mistake: Let us assume we have 2 instances. Alice visits our website. She gets forwarded to instance 1. Now Alice creates a new game, which results in a room object being stored on instance 1. Afterwards Bob wants to join the game. He gets forwarded to instance 2. The server tries to find the room Alice has just created. Since the room was saved on instance 1, the server can’t find it. As a result Bob can’t join the game. 
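The situation with Alice and Bob can be reproduced in a few lines. The shape of the room object and the function names below are assumptions for illustration; the only important point is that every instance keeps its own array in memory.

```javascript
// Each instance keeps its own in-memory list of rooms.
function createInstance(name) {
  const rooms = [];
  return {
    name,
    createRoom(roomId, hostName) {
      // Hypothetical shape of a room object.
      const room = { roomId, players: [hostName], cities: [], scores: {} };
      rooms.push(room);
      return room;
    },
    findRoom(roomId) {
      return rooms.find((room) => room.roomId === roomId);
    },
  };
}

const instance1 = createInstance('instance1');
const instance2 = createInstance('instance2');

// Alice is forwarded to instance 1 and creates a game there.
instance1.createRoom('12345', 'Alice');

// Bob is forwarded to instance 2, which has never heard of that room.
console.log(instance1.findRoom('12345') !== undefined);
console.log(instance2.findRoom('12345') !== undefined);
```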

As you can see, we made the mistake of storing our data per instance. Such an application is called a stateful application: every instance has a different state. One possible solution would have been to edit our Ingress configuration. In an Ingress you can add paths that point to different backends. In our example from above, the paths would have been ‘/instance1’ and ‘/instance2’, each pointing to a different backend. For example, a game is created on instance 1 with roomID=12345. The invitation link would then look like this: “/instance1/invitation/12345”. Every player who joins room 12345 is automatically forwarded to instance 1. However, this solution has a few drawbacks. Firstly, instance 1 must know its own name, ‘instance1’. This proved to be quite tricky, although with a little more time we would have figured it out. Secondly, suppose we add five additional instances. How does the Ingress controller know about them? Somehow the new paths have to be added, and one would have to add them manually. We could possibly have replaced the Ingress controller with Traefik, which, from our understanding, is able to detect such new paths automatically. But we didn’t pursue that any further, because Traefik became very complex very quickly. Finally, there is a third drawback: we don’t really want the players to know which instance they are on. We want this to be a black box for the players. Therefore, this solution was out of the question.

Our solution to the problem was actually quite simple. We have turned our stateful game into a stateless game. We simply connected a database to ensure that every GeoDarts instance has access to the same data. So it does not matter which instance you are forwarded to. As mentioned above, we chose Redis. MongoDB would have been an alternative. We wouldn’t have had to reprogram as much, because we already saved the data as JSON objects. We just had to change the “location” where we saved the data. However, deleting the data after a disconnect with MongoDB would have been much harder. In addition, Redis also has a significant advantage when it comes to websockets, which we will address in a moment. But first we have to create a Redis Service in Kubernetes. We do this through the following YML file:
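A minimal sketch of what such a file can contain, assuming a single Redis Pod without persistence. The names, labels and image tag are examples and not necessarily the ones we used.

```yaml
# Deployment that runs one Redis Pod.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
        - name: redis
          image: redis:6-alpine
          ports:
            - containerPort: 6379
---
# Service that gives the Redis Pod a stable name inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  selector:
    app: redis
  ports:
    - port: 6379
      targetPort: 6379
```

The game instances can then reach Redis under the Service name instead of an IP address.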

Now to the second big problem, websockets. Again, let’s assume we have 2 instances and 2 players: Alice and Bob. Alice gets forwarded to instance 1 and Bob to instance 2. Both players join the same socket.io room with the same roomID. Everything seems fine, at least at first sight. Although we did not receive any error messages, it still did not work properly. For example, if you tried to start the game, the game only started for some players. And if one of the remaining players then clicked on “start game”, the game started for all those players for whom it had not yet started.

After several frustrating debugging sessions, we finally figured it out. Websocket connections are stateful, which makes them not so easy to scale. But let’s first look at what went wrong. When Alice joined the room on instance 1, the join event was only emitted on instance 1. The same applies to Bob: when he joined the room, his join event was only emitted on instance 2. So Alice didn’t receive the event, because she is on instance 1.

This problem can be addressed by using an adapter. This socket.io mechanism allows us to pass messages between processes and to broadcast events to all clients. We use the socket.io-redis adapter, which takes advantage of the pub/sub feature of Redis. When Bob now tries to join a game, the other instances are also informed about this event with the help of Redis. As a result, the event gets emitted on all instances.
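To illustrate the idea behind the adapter, here is a small simulation. It does not use socket.io or Redis: an in-process EventEmitter plays the role of the Redis pub/sub channel, and all names are made up for this example.

```javascript
const { EventEmitter } = require('events');

// Stand-in for Redis pub/sub: one channel shared by all instances.
const bus = new EventEmitter();

function createInstance(name) {
  const localSockets = []; // sockets connected to this instance only

  // Every instance subscribes and forwards matching events to its own sockets.
  bus.on('broadcast', ({ roomId, event, payload }) => {
    localSockets
      .filter((s) => s.roomId === roomId)
      .forEach((s) => s.inbox.push({ event, payload, via: name }));
  });

  return {
    connect(playerName, roomId) {
      const socket = { playerName, roomId, inbox: [] };
      localSockets.push(socket);
      return socket;
    },
    // Instead of emitting locally, publish so that all instances hear it.
    emitToRoom(roomId, event, payload) {
      bus.emit('broadcast', { roomId, event, payload });
    },
  };
}

const instance1 = createInstance('instance1');
const instance2 = createInstance('instance2');
const alice = instance1.connect('Alice', '12345');
const bob = instance2.connect('Bob', '12345');

// Bob's join is published on instance 2 but also reaches Alice on instance 1.
instance2.emitToRoom('12345', 'playerJoined', 'Bob');
console.log(alice.inbox, bob.inbox);
```

In the real setup this wiring is done by the adapter itself, so the application code keeps emitting to rooms as before.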

Now to the last problem. From time to time we got this error message on the client side:

Error during WebSocket handshake: Unexpected response code: 400

To understand why this error occurs, we need to look at how socket.io establishes a connection. Since socket.io 1.x.x the fallback algorithm has been replaced by an upgrade approach. By default, a long-polling connection is established first and then upgraded to “better” transports like websockets. Long polling works almost everywhere, which is why the connection is established this way.

Though this feature can be quite useful, in our case it was the root of our problem. As the socket.io documentation says: 

“If you plan to distribute the load of connections among different processes or machines, you have to make sure that requests associated with a particular session id connect to the process that originated them.”

However, since we use a load balancer, this is not always the case. A brief reminder: with websocket connections, players remain on the process/instance they were first routed to. With long polling, though, it’s different. Every new request can be routed to a different instance. The player has to be lucky enough that all long-polling requests are forwarded to the same process until the connection is upgraded and the websocket is used instead of long polling.

There are two solutions. The first would be to use sticky connections/sessions. Sticky sessions are a feature that allows a load balancer to route requests to the same process they were first routed to. Though this solution is proposed by socket.io, we decided not to use it. We tried to stick to the Twelve-Factor App methodology during development, and it advises against sticky sessions. In the end, we decided to use the method socket.io doesn’t recommend: disabling long polling. Socket.io does not suggest this, because the long-polling fallback is one of its biggest advantages compared to other WebSocket implementations. For that case, the socket.io documentation proposes to consider using raw WebSockets instead. And we have to agree with it. In retrospect, it would have made more sense for us not to use socket.io.
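Two small sketches for this section. The first shows the options object that restricts Socket.IO to the WebSocket transport. The second estimates how lucky a player has to be without sticky sessions, under the simplifying assumption that the load balancer picks an instance uniformly at random for every request.

```javascript
// Options that skip the initial long-polling phase. On the client they are
// passed as io(url, websocketOnly).
const websocketOnly = { transports: ['websocket'] };

// Chance that all `requests` long-polling requests of one session happen to
// reach the same one of `instances` processes (uniform random routing).
function chanceAllRequestsHitSameInstance(instances, requests) {
  return Math.pow(1 / instances, requests - 1);
}

console.log(websocketOnly);
console.log(chanceAllRequestsHitSameInstance(2, 3));
```

The first request can land anywhere, and each of the following ones has to hit the same instance again, so the chance drops quickly with every additional instance and request.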

8.4 Target achieved?

Now that we have successfully scaled our game, we have realized something: the load doesn’t get distributed as much as we had hoped. Back when we only had one instance, everything was forwarded to this one instance. It was responsible for storing data, emitting events and calculating game logic. After scaling, this hasn’t really changed. There is still only one process responsible for almost everything. It just doesn’t happen on the GeoDarts instances anymore, but on the Redis Service. Redis now stores the data and distributes the events, and only the calculations remain within the GeoDarts instances. The problem has simply moved to the back, which is somewhat sobering.

To solve the problem we could have used Redis replication or Redis Cluster. Replication allows replica instances to be exact copies of a master instance. However, we stopped here. We noticed that as soon as we had solved one problem, the next one followed. The difficult thing is always to say when you’re done, because there are always things you can improve. But we felt that our project had come to a point where we were able to say so.

Peer2Peer Multiplayer Real-time Strategy Game “Admiral: WW2”

Admiral: WW2

1 Intro

Gaming is fun. Strategy games are fun. Multiplayer is fun. That’s the idea behind this project.

In the past I developed some games with the Unity engine – mainly 2D strategy games – and so I thought it is now time for an awesome 3D multiplayer game; or more like a prototype.

The focus of this blog post is not on the game development though, but rather on the multiplayer part with the help of Cloud Computing.

However where to start? There are many ways one can implement a multiplayer game. I chose a quite simple, yet most of the time very effective and cheap approach: Peer-to-Peer (P2P).

But first, let us dive into the gameplay of Admiral: WW2 (working title).

2 Game Demo

2.1 Gameplay

Admiral: WW2 is basically like the classic board game “Battleships”. You’ve got a fleet and the enemy player has got a fleet. Destroy the enemy’s fleet before your own fleet is sunk. The big difference is that Admiral: WW2 is a real-time strategy game. So the gameplay is more like a real-life simulation where you as the admiral can command your ships via direct orders:

  • Set speed of a ship (stop, slow ahead, full ahead, …)
  • Set course of a ship
  • Set the target of the ship (select a ship in the enemy fleet)

Currently there is only one ship class (the German cruiser Admiral Hipper), so the tactical options are limited. Other classes like battleships, destroyers or even aircraft carriers would greatly improve replayability; on the other hand they would need many other game mechanics to be implemented first.

Ships have multiple damage zones:

  • Hull (decreases the ship’s hitpoints or triggers a water ingress [water level of the ship increases and reduces the hitpoints based on the amount of water in the hull])
  • Turrets (disables the gun turrets)
  • Rudder (rudder cannot change direction anymore)
  • Engine/Propeller (ship cannot accelerate anymore)

If a ship loses all hitpoints the ship will sink and is not controllable.

2.2 The Lobby Menu

Before entering the gameplay action the player needs to connect to another player to play against. This is done via the lobby menu.

Here is the place where games are hosted and all available matches are listed.

On the right hand side is the host panel. To create a game the host must enter a unique name and a port. If the IP & Port combination of the host already exists, hosting is blocked.

After valid information has been entered, the public IP of the host is obtained via an external service (e.g. icanhazip.com). Then the match is registered on a server and the host waits for incoming connections from other players.

On the left hand side there is the join panel. The player must enter a port before viewing the match list. After clicking “Join”, a Peer-to-Peer connection to the host is established. Currently the game only supports two players, so after both peers (host and player) are connected the game will launch.

More on the connection process later.

3 Multiplayer Communication with Peer2Peer

3.1 Peer-to-Peer

P2P allows a direct connection between the peers with UDP packets – in this case the game host and player.

So in between no dedicated server handling all the game traffic data is needed, thus reducing hosting costs immensely.

Because most peers are behind a NAT and therefore connection requests between peers are blocked, one can make use of the NAT-Traversal method Hole-Punching.

3.1.1 P2P Connection with Hole-Punching

Given peer A and peer B. A direct connection between A and B is possible if:

  • A knows the public IP of B
  • A knows the UDP port B will open
  • B knows the public IP of A
  • B knows the UDP port A will open
  • A and B initiate the connection simultaneously

This works without port-forwarding, because each peer’s NAT keeps the port open for replies, just as if the peer had contacted a simple web server and were waiting for the response.
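The game itself is built with Unity, but the principle can be sketched with Node’s built-in dgram module. The function names, the packet content and the interval are assumptions for illustration only.

```javascript
const dgram = require('dgram');

// Repeatedly send small UDP packets to the other peer's public address.
// The outgoing packets create a mapping in our own NAT, so that packets the
// other peer sends at the same time are let through.
function startPunching(socket, remote, intervalMs = 500) {
  const packet = Buffer.from('punch');
  const send = () => socket.send(packet, remote.port, remote.address);
  send(); // first attempt right away
  return setInterval(send, intervalMs);
}

// Open the local UDP port and start punching towards the remote peer.
function createPeer(localPort, remote, onMessage) {
  const socket = dgram.createSocket('udp4');
  socket.on('message', (msg, rinfo) => onMessage(msg.toString(), rinfo));
  socket.bind(localPort, () => {
    const timer = startPunching(socket, remote);
    socket.on('close', () => clearInterval(timer));
  });
  return socket;
}
```

Both peers would call createPeer() with the other side’s public IP and port at roughly the same time. Once the first message arrives, the direct connection is established.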

To exchange the public IPs and ports of each peer a Rendezvous-Server behind no NAT is required.

3.1.2 Rendezvous-Server

The Rendezvous-Server needs to be hosted in the public web, so behind no NAT. Both peers now can send simple web requests as if the users would browse the internet.

If peer A tells the server he wants to host a game, the server saves the public IP and port of A.

If B now decides to join A’s game the server informs B of the IP and port of A.

A is informed of B’s public IP and port as well.

After this process A and B can now hole-punch through their NATs and establish a P2P connection to each other.
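The bookkeeping of such a Rendezvous-Server is small. The following in-memory sketch covers the exchange described above; the function names and the returned data shape are assumptions and not the actual server code. How the host learns about the guest (for example by polling) is left open here.

```javascript
// In-memory registry of hosted games, keyed by game name.
function createRendezvous() {
  const games = new Map();
  return {
    // Called when peer A hosts a game: remember A's public endpoint.
    registerGame(name, hostIP, hostPort) {
      const endpointTaken = [...games.values()].some(
        (g) => g.host.ip === hostIP && g.host.port === hostPort
      );
      if (games.has(name) || endpointTaken) return null;
      const game = { host: { ip: hostIP, port: hostPort }, guest: null };
      games.set(name, game);
      return game;
    },
    // Called when peer B joins: afterwards both endpoints are known.
    joinGame(name, guestIP, guestPort) {
      const game = games.get(name);
      if (!game || game.guest) return null;
      game.guest = { ip: guestIP, port: guestPort };
      return { host: game.host, guest: game.guest };
    },
  };
}

const server = createRendezvous();
server.registerGame('GameOne', '1.1.1.1', 4141);
console.log(server.joinGame('GameOne', '2.2.2.2', 4242));
```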

A Rendezvous-Server can be very cheap, because the workload is quite small.

But there are some cases where Hole-Punching does not succeed (“…we find that about 82% of the NATs tested support hole punching for UDP…”, https://bford.info/pub/net/p2pnat/).

In those cases a Relay-Server is needed.

3.1.3 Relay-Server

The Relay-Server is only used as a backup in case P2P fails. It has to be hosted in the public internet, so behind no NAT.

Its only task is the transfer of all game data from one origin peer to all other peers. So the game data just takes a little detour via the Relay-Server before continuing its usual way to the peers.

This comes at a price though. Since all of the game traffic is now travelling through this server, the workload can be quite high, depending on the amount of information the game needs to exchange. Naturally the ping or RTT (Round Trip Time: the time it takes for a packet to travel from one peer to another and back) is increased, resulting in lags. And finally, Relay-Servers would be required in each region (Europe, America, Asia, …). Otherwise players far away from the Relay-Server suffer heavy lags. All of this leads to high demands on the server hardware. To be clear: a proper Relay-Server architecture can be expensive in time and money.

Because of that in this project I ignored the worst-case and focused on the default Peer-to-Peer mechanism.

3.1.4 Peer2Peer Conclusion

The big advantage of this method: it’s mainly serverless, so the operating costs of the multiplayer are very low. Because of that, P2P is a very viable multiplayer solution for small projects and indie games. The only thing that is needed is a cheap Rendezvous-Server (of course only if no Relay-Server is used). P2P also does not require port-forwarding, which can be a difficult and/or time-consuming task depending on the player’s knowledge.

But there are disadvantages:

  • A home network bandwidth may not be enough to host larger games with much traffic; a server hosted at a server farm has much more bandwidth
  • The game stops if a P2P host leaves the game
  • No server authority
    • every player has a slightly different game state that needs to be synchronized often; a dedicated server has only one state and distributes it to the players; players only send inputs to the server
    • anti-cheat has to be performed by every peer and not just the server
    • randomness is easier to handle if only the server generates random values; otherwise seeds have to be used
    • game states may need to be interpolated between peers, which is not the case if only the server owns the game state

A dedicated server would solve these disadvantages but in return the hardware requirements are much higher making this approach more expensive. Also multiple servers would be needed in all regions of the world to reduce ping/RTT.

3.2 Game Connection Process

After starting the game the player sees the multiplayer games lobby. As described previously the player can host or join a game from the list.

3.2.1 Hosting a game

The host needs to input a unique game name and the port he will open for the connection. When the host button is clicked the following procedure is triggered:

  1. Obtain public IP address
    • Originally this was supposed to be handled by the Rendezvous-Server, because it is not hosted behind a NAT and can see the public IP of requests, but limitations of the chosen hosting service prevented this approach (more on that later)
    • Instead I send a web request to a free service like icanhazip.com, with bot.whatismyipaddress.com as a backup in case the first service is down; these websites respond with a plain-text body containing the IPv6 or IPv4 address of the client/request
  2. The Rendezvous-Server is notified of the new multiplayer game entry and saves the game with public IP and port, both sent to the server by the host
    • The host sends a GET-Request to the server (web server) containing all the information needed: /registermpgame?name=GameOne&hostIP=1.1.1.1&hostPort=4141
    • On success the game is registered and a token is returned to the host; the token is needed for further actions affecting the created multiplayer game
  3. The host now waits for incoming connections from other players/peers
    • The host sends another GET-Request to the Rendezvous-Server /listenforjoin?token=XYZ123
      • This is a long-polling request (websocket alternative): the connection is held open by the server until a player has joined the multiplayer game
      • If that is the case the GET-Request is resolved with the public IP and port of the joined player, so that hole-punching is possible
      • If no player joins before the timeout is reached (I’ve set the timeout to 15 seconds), the request is resolved with HTTP status code 204 No Content and no body
      • In that case the GET-Request has to be sent again and again until a player joins
  4. On player join both peers init a connection and punch through NAT
  5. If successful the game starts
  6. (Otherwise a Relay-Server is needed; explained previously)
  7. The host closes the game with another GET /startorremovempgame?token=XYZ123
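The host's side of this exchange can be sketched roughly as follows. The endpoint paths and query parameter names are the ones from the steps above; the helper names and the JSON shape of a resolved /listenforjoin response (an object with ip and port) are assumptions made for illustration, since the actual response format is not shown here.

```typescript
// Hypothetical helpers; only the endpoint paths and parameter names come from the text.

interface PeerAddress {
  ip: string;
  port: number;
}

// Outcome of one finished long-polling request against /listenforjoin.
type ListenResult =
  | { kind: "joined"; peer: PeerAddress }
  | { kind: "retry" }; // 204 No Content: nobody joined before the timeout

function buildRegisterUrl(name: string, hostIP: string, hostPort: number): string {
  return (
    "/registermpgame?name=" + encodeURIComponent(name) +
    "&hostIP=" + encodeURIComponent(hostIP) +
    "&hostPort=" + hostPort
  );
}

function buildListenUrl(token: string): string {
  return "/listenforjoin?token=" + encodeURIComponent(token);
}

// Interprets status and body of a finished long-polling request.
// Assumption: a successful response carries JSON like {"ip":"2.2.2.2","port":2222}.
function interpretListenResponse(status: number, body: string): ListenResult {
  if (status === 204) {
    return { kind: "retry" }; // the caller sends the same request again
  }
  if (status !== 200) {
    throw new Error("Unexpected status from Rendezvous-Server: " + status);
  }
  const parsed: unknown = JSON.parse(body);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Malformed response body");
  }
  const candidate = parsed as { ip?: unknown; port?: unknown };
  if (typeof candidate.ip !== "string" || typeof candidate.port !== "number") {
    throw new Error("Malformed peer address in response body");
  }
  return { kind: "joined", peer: { ip: candidate.ip, port: candidate.port } };
}
```

The host would keep requesting the listen URL for as long as the result is "retry" and start hole-punching as soon as it is "joined".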

3.2.2 Joining a game

The player first needs to input a valid port. After that he is presented with a list of multiplayer games, which is retrieved from the Rendezvous-Server with a GET-Request to the endpoint /mpgameslist. This returns a JSON list of game data objects containing the following fields:

  • name: multiplayer game name
  • hostIP: public IP of the host
  • hostPort: port the host will open for the connection

If the player clicks “Join” on a specific game list item the following process handles the connection with the host:

  1. Obtain public IP address
    • Originally this was supposed to be handled by the Rendezvous-Server, because it is not hosted behind a NAT and can see the public IP of requests, but limitations of the chosen hosting service prevented this approach (more on that later)
    • Instead I send a web request to a free service like icanhazip.com, with bot.whatismyipaddress.com as a backup in case the first service is down; these websites respond with a plain-text body containing the IPv6 or IPv4 address of the client/request
  2. Inform the Rendezvous-Server of the join
    • Send a GET-Request with all the information needed: /joinmpgame?name=GameOne&ownIP=2.2.2.2&hostPort=2222
    • Now the host is informed by the server if the host was listening
    • The server resolves the request with the public IP and port of the host
    • Now the player and the host try to establish a P2P connection with hole-punching
    • If successful the game starts
    • (Otherwise a Relay-Server is needed; explained previously)
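On the server side, the bookkeeping behind the register, listen and join requests could look roughly like this in-memory sketch. All class, method and field names are hypothetical; the real server is an Express application and will differ in its details. The long-polling request is modelled as a callback that is answered either by a join or by a timeout.

```typescript
// In-memory sketch of the Rendezvous-Server bookkeeping. All names are hypothetical.

interface Endpoint {
  ip: string;
  port: number;
}

// Answers an open /listenforjoin request: an endpoint on join, null on timeout (204).
type JoinListener = (joined: Endpoint | null) => void;

interface RegisteredGame {
  name: string;
  host: Endpoint;
  pendingListener: JoinListener | null;
}

class RendezvousRegistry {
  private readonly gamesByName = new Map<string, RegisteredGame>();
  private readonly nameByToken = new Map<string, string>();
  private nextToken = 1;

  // /registermpgame: stores the game and returns the token for later requests.
  // A real server would hand out unguessable random tokens instead of a counter.
  registerGame(name: string, host: Endpoint): string {
    if (this.gamesByName.has(name)) throw new Error("Game name already taken: " + name);
    const token = "token-" + this.nextToken;
    this.nextToken += 1;
    this.gamesByName.set(name, { name, host, pendingListener: null });
    this.nameByToken.set(token, name);
    return token;
  }

  // /listenforjoin: remembers the open request until a player joins or it expires.
  listenForJoin(token: string, respond: JoinListener): void {
    const game = this.findByToken(token);
    if (!game) throw new Error("Unknown token");
    game.pendingListener = respond;
  }

  // Called by a timer when the long-polling timeout is reached without a join.
  expireListener(token: string): void {
    const game = this.findByToken(token);
    if (game) this.answer(game, null);
  }

  // /joinmpgame: informs a listening host and returns the host's endpoint to the joiner.
  joinGame(name: string, joiner: Endpoint): Endpoint {
    const game = this.gamesByName.get(name);
    if (!game) throw new Error("Unknown game: " + name);
    this.answer(game, joiner);
    return game.host;
  }

  private findByToken(token: string): RegisteredGame | undefined {
    const name = this.nameByToken.get(token);
    return name === undefined ? undefined : this.gamesByName.get(name);
  }

  private answer(game: RegisteredGame, joined: Endpoint | null): void {
    const respond = game.pendingListener;
    game.pendingListener = null; // every open request is answered exactly once
    if (respond) respond(joined);
  }
}
```

If the host is not listening at the moment of the join, the joiner still receives the host's endpoint, but the host is not notified, which matches the "if the host was listening" condition above.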

3.3 Game Synchronization

Real-time synchronization of game states is a big challenge. Unlike in turn-based games, the game does not wait until all information has been received from the other players. The game always goes on, ideally with a minimal amount of lag.

Of course the whole game state could be serialized and sent to all players, but this would have to happen very frequently and the packet size would be very large, resulting in very high bandwidth demand.

Another approach is to only send user inputs/orders, which yields far less network traffic. I used this lightweight idea, so when the player issues an order the order is immediately transmitted to the other player. There the order is executed as well.

The following game events are synchronized:

  • GameStart: After the game scene is loaded the game is paused and the peer sends this message to the other player periodically until he receives the same message from the other peer; then the game is started
  • RandomSeed: Per game a “random seed master” (the host) periodically generates a random seed and distributes that seed to the other player; this seed is then used for all random calculations
  • All 3 ship orders:
    • ShipCourse
    • ShipSpeed
    • ShipTarget
  • GameSync: All of the previous messages still led to diverging game states, so a complete game serialization and synchronization is scheduled to happen every 30 seconds
    • Projectile positions, rotations, velocities are synched
    • The whole ship state is synched
    • Both game states (the received one and the own one) are interpolated, because I don’t use an authoritative server model and so both game states are “valid”
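The interpolation of two equally valid game states can be illustrated with a small sketch. The ShipState fields and the equal weighting of 0.5 are assumptions for illustration; the game itself is written in C# and its actual state contains more than this. Note that courses have to be interpolated along the shortest arc, otherwise 350° and 10° would meet at 180° instead of 0°.

```typescript
// Hypothetical, reduced ship state; the real state holds more fields.
interface ShipState {
  x: number;
  y: number;
  courseDeg: number; // heading in degrees, 0 <= courseDeg < 360
  speed: number;
}

function lerp(a: number, b: number, t: number): number {
  return a + (b - a) * t;
}

// Interpolates two headings along the shortest arc between them.
function lerpAngleDeg(a: number, b: number, t: number): number {
  // Signed difference from a to b, mapped into the range [-180, 180).
  const delta = ((((b - a) % 360) + 540) % 360) - 180;
  // Normalize the result back into [0, 360).
  return (((a + delta * t) % 360) + 360) % 360;
}

// weight = 0 keeps the own state, weight = 1 adopts the received state.
function blendShipStates(own: ShipState, received: ShipState, weight: number = 0.5): ShipState {
  return {
    x: lerp(own.x, received.x, weight),
    y: lerp(own.y, received.y, weight),
    courseDeg: lerpAngleDeg(own.courseDeg, received.courseDeg, weight),
    speed: lerp(own.speed, received.speed, weight),
  };
}
```

With an authoritative server this step would not be needed, because the clients would simply adopt the server's state.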

The following game events should have a positive impact on game sync, but are not implemented yet:

  • ProjectileFire: Syncs projectiles being fired
  • Waves: Because the waves have a small impact on the position from which projectiles are fired and where they hit the ship, the waves should be in sync as well

3.3.1 IDs

In game development you mostly work with references. So for example a ship has a reference to another ship as the firing target. In code this has the benefit of easy access to the target ship’s properties, fields and methods.

The problem is that these references do not work across the network. Every machine has different reference values, even though they may represent the same ship. So if we want to transfer the order “Ship1 course 180” we cannot use the local reference value to Ship1.

Ship1 needs a unique ID that is exactly the same on all machines. Now we can send “ShipWithID1234 course 180” and every machine knows which ship to address.

In code this is a bit more tedious, because the received ID has to be resolved to the appropriate ship reference.

The most difficult part is finding unique IDs for all game objects.

Ships can obtain an ID easily on game start by the host. Projectiles are a bit more tricky, because they are spawned later on. I solved this by counting the shots fired by a gun turret and combining the gun turret’s ID with the shot number to generate a guaranteed unique ID, provided the gun turret ID is unique. Gun turret IDs are combined as well: Ship ID + gun turret location (sternA, sternB, bowA, bowB, …).
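The ID scheme described above can be sketched as follows. The separators (":" and "#") and all class names are assumptions for illustration; the scheme itself (turret ID = ship ID + turret location, projectile ID = turret ID + shot number) is the one from the text.

```typescript
type TurretLocation = "sternA" | "sternB" | "bowA" | "bowB";

class GunTurret {
  readonly id: string;
  private shotsFired = 0;

  constructor(shipId: string, location: TurretLocation) {
    // Turret ID = ship ID + turret location; unique as long as the ship ID is unique.
    this.id = shipId + ":" + location;
  }

  // Projectile ID = turret ID + running shot number. The IDs match on all peers
  // as long as every peer counts the shots of this turret in the same order.
  nextProjectileId(): string {
    this.shotsFired += 1;
    return this.id + "#" + this.shotsFired;
  }
}

// Resolves IDs received over the network back to local object references.
class IdRegistry<T> {
  private readonly objects = new Map<string, T>();

  register(id: string, obj: T): void {
    if (this.objects.has(id)) throw new Error("Duplicate ID: " + id);
    this.objects.set(id, obj);
  }

  resolve(id: string): T {
    const obj = this.objects.get(id);
    if (obj === undefined) throw new Error("Unknown ID: " + id);
    return obj;
  }
}
```

An incoming order such as “ShipWithID1234 course 180” would then be applied by resolving the ID first and setting the course on the returned local reference.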

Of course with an authoritative server this gets easier as only the server generates IDs and distributes them to all clients.

3.3.2 Lockstep

Additionally there is an interesting and promising approach to discretize the continuous game time called Lockstep. It is used in prominent real-time strategy games like Age of Empires (https://www.gamasutra.com/view/feature/131503/1500_archers_on_a_288_network_.php). The basic idea is to split up time into small chunks, for example 200ms intervals. Within each interval every player can issue exactly one action, which gets transferred to all the other players. Of course this action can also be “no action”. The action is then executed in the next interval almost simultaneously for all players. This way the real-time game is transformed into a turn-based game. It is important to adjust the interval based on the connection speeds between the players, so that no player lags behind. The small input delay usually goes unnoticed by the players if the interval is small enough.
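A minimal sketch of the bookkeeping behind Lockstep, assuming actions are plain strings and every machine knows the same player list in the same order. The class and method names are hypothetical and this is not code from the game, which does not implement Lockstep.

```typescript
type PlayerId = string;
type TurnAction = string; // e.g. "ShipCourse 1234 180", or "none" for "no action"

class LockstepBuffer {
  private readonly players: PlayerId[];
  private readonly turns = new Map<number, Map<PlayerId, TurnAction>>();

  // The player order must be identical on all machines.
  constructor(players: PlayerId[]) {
    this.players = players;
  }

  // Stores the single action a player issued during the given turn.
  submit(turn: number, player: PlayerId, action: TurnAction): void {
    let actions = this.turns.get(turn);
    if (!actions) {
      actions = new Map<PlayerId, TurnAction>();
      this.turns.set(turn, actions);
    }
    if (actions.has(player)) throw new Error("Only one action per player and turn");
    actions.set(player, action);
  }

  // A turn may only be executed once the action of every player has arrived.
  isComplete(turn: number): boolean {
    const actions = this.turns.get(turn);
    if (!actions) return false;
    for (const player of this.players) {
      if (!actions.has(player)) return false;
    }
    return true;
  }

  // Returns the actions in player order, so every machine executes them identically.
  take(turn: number): TurnAction[] {
    const actions = this.turns.get(turn);
    if (!actions || !this.isComplete(turn)) {
      throw new Error("Turn " + turn + " is still waiting for input");
    }
    this.turns.delete(turn);
    const ordered: TurnAction[] = [];
    for (const player of this.players) {
      const action = actions.get(player);
      if (action !== undefined) ordered.push(action);
    }
    return ordered;
  }
}
```

At the end of each interval, a machine would wait until isComplete is true for the previous turn and then execute the result of take in the current one.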

An important requirement is that the game is deterministic and orders issued by players have the same outcome on all machines. Sure, there are ways to handle random game actions, but because AdmiralWW:2 uses randomness for many important calculations and my development time frame was limited, I unfortunately did not implement this technique.
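One common way to make random game actions deterministic is a seeded pseudo-random number generator: peers that start from the same seed and draw numbers in the same order get the same values. The sketch below uses a simple linear congruential generator with the well-known Numerical Recipes constants. It is an illustration in TypeScript, not the generator the game uses, and the class name is hypothetical.

```typescript
// Linear congruential generator (constants from Numerical Recipes).
// Not cryptographically secure; it only has to be reproducible across machines.
class SeededRandom {
  private state: number;

  constructor(seed: number) {
    this.state = seed >>> 0; // force the seed into the unsigned 32-bit range
  }

  // Returns a float in [0, 1).
  next(): number {
    // The intermediate product stays below 2^53, so the arithmetic is exact.
    this.state = (this.state * 1664525 + 1013904223) % 4294967296;
    return this.state / 4294967296;
  }

  // Returns an integer in [min, max], e.g. for damage rolls.
  nextInt(min: number, max: number): number {
    return min + Math.floor(this.next() * (max - min + 1));
  }
}
```

This only keeps the peers in sync if all of them consume random values in exactly the same order, which is why a shared seed alone does not make a game deterministic.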

4 Rendezvous-Server Hosting

There are almost unlimited hosting options on the internet. Usually the selection shrinks after a specific programming language is picked. But because I used NodeJS with TypeScript, which is transpiled to plain JavaScript, there were still plenty of hosting options. Had I decided to write the server in C# and run it as a .NET Core application (the game itself is written in C#, the language Unity uses, although some exotic programmers use JavaScript), many hosting providers would have dropped out.

4.1 Alternatives

Of course there is the option of renting your own dedicated server: very expensive for a simple Rendezvous-Server and maintenance-heavy, but powerful and flexible (.NET ok).

There’s the option of a managed server: little maintenance but very, very expensive.

Then there are VPS (Virtual Private Servers): physical servers whose hardware is shared among many customers, which makes them cheaper.

Then there are the big players like AWS, Google Cloud Platform, IBM Cloud and Microsoft Azure: they can get very expensive, but in return they offer vast opportunities and flexibility; it is easy to scale and monitor your whole infrastructure and a load-balancer can increase availability and efficiency of your server(s); on the other hand the learning-curve is steeper and setting up a project needs more time.

4.2 Heroku

Heroku is a cloud-based Platform-as-a-Service (PaaS) that hosts applications in many common programming languages like JavaScript/NodeJS (which I used), Python and Ruby. It does not offer as many possibilities as AWS and co, but it is way simpler to learn and set up.

It also has a completely free plan, which grants over 500 hours of uptime per month. This is not enough to run the whole month (30 * 24 = 720 hours), but the application sleeps after 1 hour without activity and automatically wakes up again when needed. This is perfectly fine for a Rendezvous-Server, because it is not used all the time. The wake-up time is not that bad either (around 4-8 seconds).

Of course Heroku offers scaling so that the performance is massively increased and the app will never sleep, but this comes with a price tag.

In a paid plan Heroku also has a solid monitoring page with events, up- and downtimes, traffic and so on.

Server logs are easily accessible as well.

For setup you just need to create a “Procfile” in your project folder that defines what to execute after the build is completed: web: npm run start will run the npm script called start as a web service. The application is then publicly reachable on your-app-name.herokuapp.com. The NodeJS web server can then listen on the port that is provided by Heroku in the environment variable process.env.PORT.
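Reading the port could look like the following sketch. The function name and the idea of a local fallback port are assumptions; the only facts taken from the text are that Heroku provides the port in process.env.PORT and that the Procfile contains the line web: npm run start.

```typescript
// Heroku passes the port as a string in the PORT environment variable.
// When running locally the variable is usually not set, hence the fallback.
function resolvePort(env: Record<string, string | undefined>, fallback: number): number {
  const raw = env["PORT"];
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("Invalid PORT value: " + raw);
  }
  return port;
}
```

In the server this would be called once at startup, for example as resolvePort(process.env, 3000), and the result passed to the listen call of the web server. The fallback of 3000 is an arbitrary choice for local development.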

Deployment is automated: just push to your GitHub master branch (or the branch you specified in Heroku); after that a GitHub webhook triggers the build of your app in Heroku.

But during development I discovered a big disadvantage: Heroku does not support IPv6.

This is a problem, because I wanted to use the Rendezvous-Server as a STUN-Server as well, which determines and saves the public IPs of client requests. But if a client like me only has Dual-Stack Lite (a unique IPv6 address, but an IPv4 address that is shared among multiple customers), the server only ever sees the shared IPv4 address, and Peer2Peer is not possible with that.

As a workaround the clients obtain their public IPv4 or IPv6 address via GET-Request from icanhazip.com or, as a backup, from bot.whatismyipaddress.com. These websites return a plain text body containing the public IP. After that the peers send their public IP to the Rendezvous-Server as explained previously.
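Since the client now depends on the plain-text answer of a third-party website, it makes sense to check that answer before sending it on to the Rendezvous-Server. The following is a rough plausibility check, not a full address validator. The function names are hypothetical, and it is an assumption that the body may carry surrounding whitespace such as a trailing newline.

```typescript
type IpKind = "ipv4" | "ipv6";

// Rough plausibility check for the plain-text body of a "what is my IP" service.
function classifyPublicIp(body: string): IpKind | null {
  const text = body.trim(); // assumption: the body may end with a newline
  const v4 = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(text);
  if (v4) {
    const octetsValid = v4.slice(1).every((octet) => Number(octet) <= 255);
    return octetsValid ? "ipv4" : null;
  }
  // Simplified IPv6 check: only hex digits and colons, with at least two colons.
  if (/^[0-9a-fA-F:]+$/.test(text) && text.split(":").length >= 3) {
    return "ipv6";
  }
  return null;
}

// Walks the response bodies in priority order (primary service first, backup second)
// and returns the first one that looks like an IP address.
function firstUsableIp(responses: string[]): { ip: string; kind: IpKind } | null {
  for (const body of responses) {
    const kind = classifyPublicIp(body);
    if (kind) {
      return { ip: body.trim(), kind };
    }
  }
  return null;
}
```

An error page or an empty body from the first service then simply leads to the backup service's answer being used.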

5 Architecture Overview

TypeScript usually is a very good choice for larger projects, simply because of the type safety and development-time error checking. This catches many bugs such as typos before the code even runs, which is not the case in plain JavaScript.

To realize the web server I used the very popular ExpressJS, which does not need any introduction and should be well-known by this time.

6 Conclusion

Real-time multiplayer games are tricky. The game states quickly diverge and a lot of effort is needed to counteract this. Game time differences and lag make this drastically worse. But methods such as Lockstep can help to synchronize the time across multiple players.

While developing, try to keep the game as deterministic as possible, so that player actions yield the same result on every machine. Randomness is usually problematic, but can be handled via a dedicated game server or seeds.

Peer-to-Peer is a simple and great solution for smaller indie multiplayer games, but comes with some disadvantages. For larger projects dedicated/authoritative servers are favourable.

Heroku offers a fast and simple setup for hosting cloud applications and the free plan is great for smaller projects. If demand increases, scaling is no problem and the deployment is automated. But be aware that Heroku lacks IPv6 support.

All in all: Gaming is fun. Strategy games are fun. Multiplayer is fun – for the player and an exciting challenge for developers.

Montagsmaler – Multiplayer online game running on Amazon Web Services

by Jannik Smidt (js343), Niklas Schildhauer (ns107) and Lucas Crämer (lc028)

Project idea

Montagsmaler is a multiplayer online game for web browsers. The idea is derived from the classic Pictionary game, where players have to guess what one person is painting. Basically, we have built the digital version of it, but with one big difference: it is not the players who are guessing, but the image recognition service from AWS. The game’s aim is to draw as well as possible so that the computer (an AWS service) can recognize what it is. All players paint the same thing at the same time, and after three rounds they see the paintings and the scores they got for them.

Goal

At the beginning of the course, none of us had any experience in cloud development. For this lecture we developed Montagsmaler entirely from scratch. During the project, we learned and tested new concepts and deepened our skills in software engineering and cloud computing. This article should give you a brief overview of our app, the challenges we faced during development and the corresponding solutions.

Technical architecture

Cloud-Components

Amazon Cognito

Amazon Cognito is an AWS service for user identification in the cloud. Cognito offers an API and SDKs for simple integration with popular tech stacks.
We use Cognito for saving personal user data and handling the registration and authentication of the user accounts in our app. Cognito offers email verification for user accounts and state-of-the-art token-based stateless authentication techniques.

Amazon S3

Amazon S3 is an object storage service with a REST API. It offers high scalability, availability and fine-grained access control. We use an S3 bucket to store the pictures which are saved during the games.

Amazon Rekognition

Amazon Rekognition is an AWS service for computer vision tasks. Like Cognito, Rekognition offers an API and SDKs for simple integration with popular tech stacks. We use Rekognition for labelling the pictures during a game after they have been stored in the S3 bucket. These labels are then used to calculate a score for the submitted picture.

Amazon ElastiCache (Redis)

Amazon ElastiCache is an AWS service for managed in-memory data stores such as Redis. We wanted to use it in our architecture for a Redis cluster, but since we did not have permission to start even a single ElastiCache instance, we could not use it in the end.

Amazon Elastic Container Service

Amazon Elastic Container Service is a highly scalable container management service that makes it easy to run, stop, and manage containers on a cluster. We have one cluster and this cluster runs one service, which contains the core of our application, the Montagsmaler API. Our application is continuously deployed with a Task Definition (more on that in CI-Components). This Task Definition deploys two docker containers. One container contains the Montagsmaler API. The other container contains a redis-server, since we could not use Amazon ElastiCache due to permission restrictions.

Amazon Application Load Balancer

We use an Amazon Application Load Balancer which routes all the traffic to the Elastic Container Service. We currently only have one cluster with one service instance, so right now it merely fulfills the role of a reverse proxy. We cannot use TLS encryption since we do not have permission to access the AWS Certificate Manager. That is kind of a bummer, because without HTTPS we cannot set the “Secure” attribute that browsers require for cookies with “SameSite=None”, so popular browsers refuse to store the HttpOnly cookie containing the refresh token.

Amazon Amplify

Amplify offers two products: the Amplify Framework to create serverless backends, and static web hosting. For us the static web hosting was the interesting part. We used it to host our Angular frontend. It’s a simple tool which is connected to our GitHub repository and automatically builds the master branch whenever a new commit is made (more in CI-Components).

CI-Components

Github Actions

We use GitHub Actions for “continuous integration” of our application. We have an action which automatically tests and deploys the backend to AWS. This action is triggered on every push or pull request to the master branch.

Test

The action runs on an Ubuntu machine with a Node installation. First it runs the unit tests and then the e2e tests. If even one test fails the deployment stops and we get a notification via email.

Deploy to AWS

The deployment depends on the tests succeeding. It also runs on an Ubuntu machine. It starts by configuring the AWS credentials, which are stored in the GitHub Secrets of our repository. Then it logs into the AWS Elastic Container Registry, builds the docker image and pushes it. On successful build and push, the AWS Elastic Container Service Task Definition is rendered with the new image. Here we need an extra step since we do not have proper access to AWS IAM: Usually the URI of the credentials is put into the task definition, but we do not have permission to access this URI and our credentials are only valid for about three hours. That is why we take the credentials from GitHub Secrets here as well and insert them manually into the Task Definition using the shell and in-place substitution with sed. When the Task Definition is ready it is deployed to our AWS Elastic Container Service.

We used Amplify’s static web hosting to provide the frontend. It would also have been possible to serve the frontend from an AWS S3 bucket. We chose Amplify because of its simple continuous deployment workflow. Once we connected Amplify to GitHub, all we had to do was select our project, choose the master branch and adjust the build settings. With that in place, the frontend is built and deployed with every new commit to the master branch, so the latest version is always hosted on AWS.

Montagsmaler-API

NestJS

The HTTP and Websocket API which forms the core of the application is built with NestJS. NestJS is a framework for building efficient and scalable Node.js server-side applications. It is heavily inspired by the architecture of the popular frontend framework Angular, while also taking lots of ideas from Spring. Like Angular it comes with built-in TypeScript support and it combines elements from Object Oriented Programming, Functional Programming and Functional Reactive Programming. 

It makes heavy use of metaprogramming with TypeScript Decorators to provide an advanced modular architecture with dependency injection with the focus on separation of concerns and high testability. 

NestJS is fully compatible with popular express middlewares and libraries, but can be configured to use a different HTTP server framework if desired. It also provides a very rich ecosystem with idiomatic solutions for standard problems such as configuration, pipes (e.g. validation pipes), exception filters, auth guards, websocket gateways etc.

The Game

Lobby

Before you can start the game you have to create a lobby. On creation, a UUID is assigned to each lobby, which players can use to invite their friends via an invitation link. Leaving or joining the lobby broadcasts an event to all lobby members. Joining the lobby initially returns the current state of the lobby. The lobby leader (the player who created the lobby or, in case he/she left the lobby, the player who joined next, and so on) has the permission to configure and start the game. You can configure a round duration between 30 and 300 seconds and up to 10 rounds. Starting the game broadcasts the LobbyConsumedEvent to all members, which contains data about the configured game so all players can join it. As a side effect it also deletes the lobby from the Redis storage and sets a timer to initialize the game loop. Lobbies that are never started are automatically cleaned up after two hours to prevent memory leaks.
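The leader succession and the configuration limits can be expressed in a few lines. This is a sketch with hypothetical type and function names; the limits (30 to 300 seconds, up to 10 rounds) come from the text, while the minimum of one round is an assumption.

```typescript
interface LobbyMember {
  playerId: string;
  joinedAt: number; // timestamp of joining; the creator has the smallest value
}

interface LobbyGameConfig {
  roundDurationSeconds: number;
  rounds: number;
}

// The leader is the member who has been in the lobby the longest. When the
// leader leaves and is removed from the list, the next oldest member takes over.
function currentLeader(members: LobbyMember[]): LobbyMember | null {
  let leader: LobbyMember | null = null;
  for (const member of members) {
    if (leader === null || member.joinedAt < leader.joinedAt) {
      leader = member;
    }
  }
  return leader;
}

// Returns a list of problems; an empty list means the configuration is valid.
function validateLobbyConfig(config: LobbyGameConfig): string[] {
  const errors: string[] = [];
  const duration = config.roundDurationSeconds;
  if (!Number.isInteger(duration) || duration < 30 || duration > 300) {
    errors.push("round duration must be between 30 and 300 seconds");
  }
  if (!Number.isInteger(config.rounds) || config.rounds < 1 || config.rounds > 10) {
    errors.push("number of rounds must be between 1 and 10");
  }
  return errors;
}
```

Only a request from the player returned by currentLeader would be allowed to configure or start the game.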

Games

Games are driven by the game loop. The game loop emits static events based on the given configuration of the game. Consuming the lobby initializes a game that has not started yet. After a specific time the game starts by emitting the GameStartedEvent. The following RoundStartedEvent starts the game round. After the configured time the round ends and the RoundOverEvent with the scores of all submitted images is emitted. Within that time frame each player can publish one picture: an image of the picture is uploaded to an AWS S3 bucket and then fed into the AWS Rekognition API. A score for the image is determined from the time the player needed to publish and the confidence the Rekognition API reports for the expected label. After the score is determined, the ImageAddedEvent with the corresponding score and a link to the uploaded image is emitted. This process repeats for the configured number of rounds. After all rounds have been played the GameOverEvent is emitted, which contains the high score and links to all submitted images; as a side effect the game and the saved events are deleted.
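Because the static events only depend on the configuration, their timing can be computed up front. The sketch below builds such a schedule. The event names are the ones mentioned above; the function name, the start delay and the pause between rounds are assumptions for illustration, since the exact timing is not given here.

```typescript
type LoopEventName =
  | "GameStartedEvent"
  | "RoundStartedEvent"
  | "RoundOverEvent"
  | "GameOverEvent";

interface ScheduledEvent {
  name: LoopEventName;
  round: number | null; // null for events that do not belong to a round
  atSeconds: number;    // offset from the moment the lobby was consumed
}

function buildSchedule(
  rounds: number,
  roundDurationSeconds: number,
  startDelaySeconds: number, // assumption: time the players get to join the game
  pauseSeconds: number       // assumption: break between two rounds
): ScheduledEvent[] {
  const events: ScheduledEvent[] = [];
  let t = startDelaySeconds;
  events.push({ name: "GameStartedEvent", round: null, atSeconds: t });
  for (let round = 1; round <= rounds; round++) {
    events.push({ name: "RoundStartedEvent", round, atSeconds: t });
    t += roundDurationSeconds;
    events.push({ name: "RoundOverEvent", round, atSeconds: t });
    if (round < rounds) {
      t += pauseSeconds; // no pause after the last round
    }
  }
  events.push({ name: "GameOverEvent", round: null, atSeconds: t });
  return events;
}
```

The ImageAddedEvent is not part of this schedule, because it is triggered by a player's submission rather than by the clock.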

Security

Authentication

Authentication is required for playing the game. Registration requires email verification.

All requests to the HTTP and Websocket API of the game are protected by validating the access token which is transmitted in form of a signed JWT. The refresh token which is used for refreshing the access token is an HTTP-Only Cookie. That makes it possible to store the access token only in memory on the client. These measures protect against common CSRF and XSS attacks.

The API has a middleware for request rate limiting. To protect against denial-of-service attacks, players are also only able to start one game at a time, so they have to wait until the previous game is finished before they can start a new one.

The game itself also has further security mechanisms built into the game logic. Only lobby leaders are allowed to start the game. The lobby leader is the player who created the lobby; when he leaves, the player who joined first becomes the next lobby leader, and so on. The set of players who are able to join the game is locked once the lobby is consumed, so players cannot join a random game. Players can only submit their pictures between the start and the end of a round. This is ensured by a state machine. More on that in Challenges.

Challenges

Distributed Gamestate

In single player games or game servers which only run on one instance, the question of where to store the game state does not arise. We, however, wanted a player to be able to connect to any server instance behind a load balancer and still reach any game properly. That is why the game state in the application had to be distributed. Since it is just a game and not a serious business application we do not require any specific delivery guarantees or consistency model for our distributed game state. If one in a hundred games crashes due to an irrecoverable inconsistency in the game state we can live with that.

What we are concerned about is performance. Latency can significantly impact the gaming experience, so low latency was an important requirement. The storage medium should also be able to scale out horizontally in the form of a cluster and it should support some kind of publish and subscribe mechanism which we can leverage to distribute events across the instances. With those requirements the choice fell on Redis, since it is an in-memory key-value store which focuses on performance and offers a publish and subscribe mechanism. Redis also supports scaling out with Redis Cluster.

So we had settled on Redis, but there was another elephant in the room: in which way do we save the state in Redis? The game loop emits events which are distributed using Redis’ built-in publish and subscribe mechanism, which we extended to also save all events in order in a Redis sorted set, one sorted set per game. The number of events per game cannot get very large since there is a maximum number of rounds which can be played. Saving the game state itself in Redis and editing it with every event within a lock therefore seems very expensive compared to just accumulating it from the events in the sorted set, which can be retrieved without any lock in a read-only operation whenever it is needed.

That is why we settled on pure event sourcing for the game state. For example, whenever a player tries to submit a picture, all the events of the game are retrieved from the sorted set and accumulated to the current game state using the state pattern. If the current state is RoundStarted, the player submits for this specific round and the player has not submitted for this round yet, the submission is accepted and the picture is rated, which leads to the following ImageAddedEvent. So the state is important for validating client events.
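The accumulation of events into a state and the validation of a submission can be sketched as follows. For brevity this sketch folds the events with a plain reducer function instead of the state pattern mentioned above, and the event and state shapes are simplified assumptions (the real events carry more data such as scores and image links).

```typescript
// Simplified stand-ins for the events stored in the sorted set of a game.
type GameEvent =
  | { type: "GameStarted" }
  | { type: "RoundStarted"; round: number }
  | { type: "ImageAdded"; round: number; playerId: string }
  | { type: "RoundOver"; round: number }
  | { type: "GameOver" };

interface GameState {
  phase: "NotStarted" | "Started" | "RoundStarted" | "RoundOver" | "GameOver";
  round: number;
  submittedThisRound: string[];
}

// Applies one event to a state and returns the new state (no mutation).
function applyEvent(state: GameState, event: GameEvent): GameState {
  switch (event.type) {
    case "GameStarted":
      return { ...state, phase: "Started" };
    case "RoundStarted":
      return { phase: "RoundStarted", round: event.round, submittedThisRound: [] };
    case "ImageAdded":
      return { ...state, submittedThisRound: [...state.submittedThisRound, event.playerId] };
    case "RoundOver":
      return { ...state, phase: "RoundOver" };
    case "GameOver":
      return { ...state, phase: "GameOver" };
  }
}

// Accumulates the ordered events of a game into its current state.
function currentState(events: GameEvent[]): GameState {
  const initial: GameState = { phase: "NotStarted", round: 0, submittedThisRound: [] };
  return events.reduce(applyEvent, initial);
}

// A submission is valid only during the matching round and only once per player.
function maySubmit(events: GameEvent[], playerId: string, round: number): boolean {
  const state = currentState(events);
  return (
    state.phase === "RoundStarted" &&
    state.round === round &&
    state.submittedThisRound.indexOf(playerId) === -1
  );
}
```

Since the state is recomputed from the events on every check, nothing besides the event list itself has to be kept consistent across server instances.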

Race Conditions/Locking

Although event sourcing significantly reduced the number of locks we need within the game logic, there is still logic that can lead to race conditions and therefore requires locks. For example, while players are submitting a picture we need a mechanism that prevents a player from submitting a picture twice. This would otherwise be possible within the time frame of the AWS API calls which are used for scoring the picture, since the ImageAddedEvent is only emitted after this process has succeeded. This is a common race condition which can be prevented by putting a lock around the logic from retrieving the state to emitting the event. For locking we use the popular Redlock algorithm, which has its problems though: according to an article by Martin Kleppmann it can lead to inconsistencies:

https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html

There is optimistic locking built into Redis, but only on specific key-value pairs using WATCH. This does not offer quite what we want, and because, as mentioned earlier, unlikely inconsistencies are not the end of the world for our project, we decided to stick with Redlock.
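To illustrate what such a lock has to provide, here is a single-process, in-memory stand-in. It is explicitly not Redlock and not distributed; it only mimics the two properties that matter for the submission example: a key can be held by one owner at a time, and a lock expires on its own if the holder never releases it. All names are hypothetical, and the current time is passed in explicitly to keep the sketch deterministic.

```typescript
interface LockEntry {
  owner: string;     // unique value per request, so only the holder can release
  expiresAt: number; // milliseconds
}

class LockTable {
  private readonly locks = new Map<string, LockEntry>();

  // Succeeds only if the key is free or the previous lock has already expired.
  tryAcquire(key: string, owner: string, ttlMs: number, nowMs: number): boolean {
    const existing = this.locks.get(key);
    if (existing && existing.expiresAt > nowMs) {
      return false;
    }
    this.locks.set(key, { owner, expiresAt: nowMs + ttlMs });
    return true;
  }

  // Only the current owner may release. A request whose lock has expired and
  // was taken over by someone else cannot free the new holder's lock.
  release(key: string, owner: string): boolean {
    const existing = this.locks.get(key);
    if (!existing || existing.owner !== owner) {
      return false;
    }
    this.locks.delete(key);
    return true;
  }
}
```

In the submission example the key could be a combination of game and player, acquired before the state is retrieved and released after the ImageAddedEvent has been emitted. A second, simultaneous submission by the same player would fail to acquire the lock.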

Serialization

Redis only offers very primitive basic data types in the form of byte arrays and strings. The higher data types like the Redis sorted set are just containers for those primitive data types. That means if you want to store objects in Redis you need lots of serialization and deserialization, which introduces new problems. One problem is that serialization, especially non-binary serialization, can be quite expensive. We ignored the expense of non-binary serialization in our app for the time being.

Another problem, which was more important for us, is that serialization for Redis also introduces higher complexity for the programmer: there is no (virtual) continuous chunk of memory across the network, so dealing with the data is unlike dealing with data within a single operating system process. There is no call-by-reference, only call-by-value, so if you introduce for example cyclic dependencies in your data objects you need some kind of special algorithm to deal with them.

JavaScript has built-in JSON serialization. The problem with JSON serialization is that it does not preserve the actual class instance, but only produces a plain JavaScript object with the same data properties. That means it also deserializes to a plain JavaScript object and not to the class instance it once was. And how should it? JavaScript is a prototype-based language and the object prototype is lost in the serialization process. We wanted neither constraints on the objects which are serialized nor annoying manual instantiation of an actual class instance from the object. That is why we created a small library which introduces a TypeScript class decorator @Serializable() for the classes you want to serialize and a function which deserializes and instantiates the actual class instance. This helped to increase productivity while working with Redis as an object store.

Under the hood it makes use of TypeScript Decorators, ES6 Object functions and an algorithm for dealing with cyclic dependencies.
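The core idea can be shown without the decorator syntax. The sketch below is not the library described above: it uses a plain registration function in place of @Serializable(), handles neither nested class instances nor cyclic dependencies, and all names are hypothetical. It only demonstrates how the lost prototype can be re-attached on deserialization.

```typescript
type SerializableClass = new (...args: any[]) => object;

// Maps class names to constructors, so the class can be found again later.
const serializableClasses = new Map<string, SerializableClass>();

// Roughly what a class decorator would do when it is applied to a class.
function registerSerializable(ctor: SerializableClass): void {
  serializableClasses.set(ctor.name, ctor);
}

// Stores the class name next to the data properties.
function serializeInstance(instance: object): string {
  return JSON.stringify({ __type: instance.constructor.name, data: instance });
}

function deserializeInstance(json: string): object {
  const parsed = JSON.parse(json) as { __type: string; data: object };
  const ctor = serializableClasses.get(parsed.__type);
  if (!ctor) throw new Error("Class not registered: " + parsed.__type);
  // Re-attach the prototype without running the constructor, then copy the data.
  return Object.assign(Object.create(ctor.prototype), parsed.data);
}

// Example class with a method that a plain JSON round trip would lose.
class PlayerScore {
  name: string;
  score: number;

  constructor(name: string, score: number) {
    this.name = name;
    this.score = score;
  }

  describe(): string {
    return this.name + " has " + this.score + " points";
  }
}

registerSerializable(PlayerScore);
```

After a round trip through serializeInstance and deserializeInstance the object is an instance of PlayerScore again and describe() can be called on it, whereas the result of JSON.parse(JSON.stringify(...)) is only a plain object with the same data.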

Websocket Interface

From the beginning it was clear that our application needed a websocket interface for bidirectional communication between client and server. Since NestJS provides Websocket support out of the box with a socket.io server under the hood, it was the websocket server of choice. The biggest benefit of using socket.io is backwards compatibility with browsers which do not have native websocket support: it falls back to an HTTP polling technique if no native websockets are available. In retrospect though I would choose a pure websocket API instead of socket.io, since socket.io adds a bunch of overhead, the client on the frontend is pretty old-fashioned and websocket support in browsers nowadays is really good. According to caniuse.com almost 98% of users use browsers which support native websockets (29.08.2020).

Nobody in our group had any experience with building a websocket API and even after building the application I am not really sure what is good and what is bad practice. I tried implementing it with the events of the game in mind. I found it quite challenging and I do not think the API is particularly well designed, but it works. From doing research online it seemed like there are not a lot of guidelines yet. If anyone has good resources on that feel free to message me since I am genuinely interested. Authentication was another problem. The JWT containing the Access Token is sent as a query parameter of the websocket connection and is verified with every event sent by the client. Authentication in websocket connections is still one of those big question marks in my head regarding Websocket APIs.

E2E-Testing

End-to-end tests sit at the top of the testing pyramid (https://martinfowler.com/articles/practical-test-pyramid.html). They are meant to test your entire, completely integrated system. In our case the e2e test of the websocket API sets up a full Nest application and connects to it via the Node.js socket.io client in the Jest test runner, which then initializes a lobby with two members and goes through a whole game. The problem was that the Jest tests have a timeout of 10 seconds, which means that a normal game cannot be played to the end within that time frame. To work around this, the service which initializes the game loop gets the length of a second in milliseconds injected into its constructor in the form of a provider. In the standard application this provider returns the constant value of 1000 milliseconds. All time constants within the class are then calculated from this value. In the e2e test this provider is overridden and the value is set to only 50 milliseconds. Using this trick a whole game can be played in the e2e test within the 10-second time frame. We also made some other sacrifices regarding complete integration: the AWS and Redis providers are mocked and overridden in the Nest application for the e2e test.
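The trick with the injected second can be sketched without any framework. The class name, the phase lengths and the number of rounds below are assumptions for illustration only; in a Nest application the value would presumably arrive through a custom provider that the test module overrides.

```typescript
// Hypothetical game-loop service: every duration is derived from the
// injected length of one "second" instead of a hard-coded 1000 ms.
class GameLoopService {
  readonly drawingPhaseMs: number;
  readonly resultPhaseMs: number;

  constructor(secondInMs: number) {
    this.drawingPhaseMs = 30 * secondInMs; // assumed: 30 "seconds" of drawing
    this.resultPhaseMs = 5 * secondInMs; // assumed: 5 "seconds" of results
  }

  // Total duration of a game with the given number of rounds.
  gameDurationMs(rounds: number): number {
    return rounds * (this.drawingPhaseMs + this.resultPhaseMs);
  }
}

// Standard application: a second lasts 1000 ms.
const production = new GameLoopService(1000);

// e2e test: a "second" lasts only 50 ms, so the same game logic
// runs 20 times faster.
const e2e = new GameLoopService(50);

console.log(production.gameDurationMs(5)); // 175000 ms, far beyond a 10 s timeout
console.log(e2e.gameDurationMs(5)); // 8750 ms, fits into a 10 s timeout
```

Because the game logic only ever sees the derived constants, the test exercises the same code paths as production, just on a compressed time scale.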

Rekognition Service 

A machine learning algorithm for object recognition in pictures requires the picture to include a background. This led to a problem in our drawing component, because by default the user's drawing was saved as a PNG with a transparent background. In these pictures the AWS Rekognition service identified only “black” as an object, because the algorithm only considered the inside of the black lines of the drawing instead of viewing it as a whole. To ensure that every picture has a background, our solution was to change the format from PNG to JPEG, because JPEG doesn’t support transparency. The library we used to implement the drawing canvas made it easy to change the format, but the new JPEG pictures were now completely black. After some research we realized that the canvas stores the transparent pixels as “fully black but transparent” pixels, so they turn black once the JPEG format makes them opaque. The solution to this problem was to manually change the background pixels to white instead of black. This change confronted us with another problem regarding the canvas visible to the user: changing the pixels resulted in aliasing problems or crashes in the HTML canvas. To avoid this, we copied the existing content of the canvas to a new, invisible canvas and applied the pixel change there. That way the picture visible to the user stays unchanged while the copied picture gets a white background.
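The pixel change can be illustrated on raw RGBA data. The following is a sketch under assumptions, not the code of our drawing component: it takes pixel data in the layout of a canvas ImageData buffer (four bytes per pixel) and returns a copy blended onto a white background. The function name is made up.

```typescript
// Blend RGBA pixels onto a white background and make them fully opaque.
// `rgba` uses the ImageData layout: R, G, B, A bytes for each pixel.
// A new array is returned, so the visible canvas data stays untouched.
function flattenOntoWhite(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i < rgba.length; i += 4) {
    const alpha = rgba[i + 3] / 255;
    for (let c = 0; c < 3; c++) {
      // "Source over" blend of the pixel color against white (255).
      out[i + c] = Math.round(rgba[i + c] * alpha + 255 * (1 - alpha));
    }
    out[i + 3] = 255; // no transparency left
  }
  return out;
}

const pixels = new Uint8ClampedArray([
  0, 0, 0, 0, // "fully black but transparent": empty canvas area
  0, 0, 0, 255, // opaque black: part of a drawn line
]);
const flattened = flattenOntoWhite(pixels);
console.log(Array.from(flattened)); // [255, 255, 255, 255, 0, 0, 0, 255]
```

The empty area becomes white while the drawn line stays black, which is what a JPEG export needs to show a drawing on a background.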

Word Similarity

To ensure that the users of our game don’t always receive 0 points for their drawings when the AWS Rekognition service doesn’t identify the correct word, we had the idea of calculating the similarity between the target word and the other object names recognized by the service. To calculate the contextual similarity of words, we used the continuous bag-of-words model. The idea behind this model is to calculate a vector representation of each word based on the words preceding and following it. We decided to implement the code in Python, based on the existing machine learning libraries gensim and tensorflow. The main problem of this approach was its dependency on a very big text dataset, e.g. a Wikipedia dump. Loading the model for a 4 to 20 GB dataset took too long on the AWS instances. Additionally, we would have needed an instance with a huge amount of RAM, which we couldn’t afford with the AWS student account.
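Once every word has a vector, similarity is commonly measured as the cosine of the angle between two vectors. The sketch below assumes this measure and uses tiny made-up vectors; real embeddings would come from a trained model and have hundreds of dimensions.

```typescript
// Cosine similarity of two word vectors: 1 means same direction,
// 0 means unrelated (orthogonal).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // a zero vector has no direction
  }
  return dot / Math.sqrt(normA * normB);
}

// Made-up 3-dimensional embeddings, for illustration only.
const embeddings: Record<string, number[]> = {
  cat: [0.9, 0.1, 0.0],
  dog: [0.8, 0.2, 0.1],
  car: [0.0, 0.1, 0.9],
};

const catDog = cosineSimilarity(embeddings.cat, embeddings.dog);
const catCar = cosineSimilarity(embeddings.cat, embeddings.car);
console.log(catDog > catCar); // true: "dog" is closer to "cat" than "car" is
```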

To make sure that users still receive points most of the time, we hard-coded a list of similar words for every word that has to be drawn.
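Such a hard-coded fallback can be as simple as a lookup table. The words and point values in this sketch are invented for illustration and are not the ones used in the game.

```typescript
// Hypothetical table: target word -> words that still earn some points.
const similarWords: Record<string, string[]> = {
  cat: ["kitten", "pet", "animal"],
  car: ["vehicle", "automobile"],
};

// Score the labels returned by the recognition service for a target word.
// The values 100 / 50 / 0 are assumptions for this sketch.
function scoreLabels(target: string, labels: string[]): number {
  const recognized = new Set(labels.map((label) => label.toLowerCase()));
  if (recognized.has(target)) {
    return 100; // exact match
  }
  const similar = similarWords[target] ?? [];
  return similar.some((word) => recognized.has(word)) ? 50 : 0;
}

console.log(scoreLabels("cat", ["Cat", "Sofa"])); // 100
console.log(scoreLabels("cat", ["Pet", "Sofa"])); // 50
console.log(scoreLabels("cat", ["Tree"])); // 0
```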

Presentation of the Game

Sketches

Demo

Perfect Mulled Wine at Home: A Thermometer with Raspberry Pi and AWS

Abstract

No other drink is as closely associated with Christmas markets as mulled wine (Glühwein). And so the enthusiastic Christmas market visitor drinks their way from stall to stall over the course of Advent until, by the end of the year, they have finally found their favorite stall. But the perfect mulled wine can also be made at home.

We show how to build your own cloud-connected mulled wine thermometer and thereby combine perfect mulled wine with convenience. And all of that without crowded Christmas markets and face masks.

Cheers!

Introduction

For the lecture Software Development for Cloud Computing, our three-person team set out to learn the fundamentals of developing in a cloud environment and to build a project that puts these fundamentals into practice. One aspect of the cloud we found particularly interesting is that it provides an environment reachable from anywhere, through which we can let different devices communicate with each other.

This gave us the idea of building a thermometer that is connected to a Raspberry Pi, with the data being processed in the cloud and forwarded to a smartphone. The smartphone should show the current temperature and a forecast of how long it will take to reach a configurable target temperature.

Our project consists of three logical layers. The first layer is our sensor, a Raspberry Pi with a thermometer attached. The sensor measures the temperature of the liquid, which we then process in the second layer. The second layer is an EC2 instance on AWS. It calculates the target time and provides a web server for the third layer, the data display. The display presents information and offers ways to control the system. The connections are therefore bidirectional, so that the user can configure the system.

Workflow

The basic workflow of our project was therefore meant to look like this:

In the first step, the smartphone scans a QR code on the Raspberry Pi so that the Raspberry Pi and the smartphone can later be matched correctly in the cloud. Next, the Raspberry Pi starts reading the temperature from the thermometer and sends it on to the cloud. As soon as the user has entered a target temperature and started the query, this is transmitted to the cloud in the fourth step, together with the ID of the Raspberry Pi. The cloud can then process the data of the Raspberry Pi with the matching ID, calculate the time and forward the result to the app.

Backend

In our first steps we wanted to get familiar with the cloud and get our first instances running on it. We chose the cloud of Amazon Web Services (AWS) because it is well documented and provides all the components we needed. Unlike the IBM Cloud, the AWS cloud does cost money even for students, but this was not a problem for us because the HdM provided us with enough credits.

Our first attempt to start an EC2 instance already ran into some problems. The reason was that the AWS cloud is fairly complex and offers a great many ways to customize and optimize instances. For beginners in particular, this is rather overwhelming at first. We had the most trouble configuring the security groups. They are required to make the server accessible from outside. Only after we had also opened access for the various protocols such as TCP and UDP were we able to reach the server.

Next, we had to extend our Raspberry Pi so that it could measure temperature. For this we bought a thermometer and wired it to the Raspberry Pi.

To process the temperature we needed a script on the Raspberry Pi. We chose Python, but realized in hindsight that a language that runs natively on the device would have been a better fit. The reason is that in our case the Raspberry Pi, which runs Linux, only serves as a test device. It should ultimately be possible to move the Raspberry Pi's task to an embedded system that cannot run Python. Had we considered this beforehand, the transition from the Raspberry Pi to embedded systems would be easier.

The thermometer continuously writes the current temperature to a file on the Raspberry Pi. Our script reads this file every second and sends the value, together with the ID of the Raspberry Pi, to the server.

As the project progressed, we focused more on the server side in the cloud. We chose a Node.js solution running on the EC2 instance, because with GET and POST requests Node.js provides all the means of communication we needed between the Raspberry Pi and the smartphone. In hindsight Node.js also proved to be a good choice: setting up the web server caused no problems at all, and the communication with the Python script on the Raspberry Pi worked flawlessly.
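The server-side bookkeeping behind those two request types can be sketched independently of any HTTP framework. The class, method and field names below are assumptions for illustration; an actual server would call addReading() from its POST handler and latest() from its GET handler.

```typescript
// One temperature measurement as the server might keep it.
interface Reading {
  temperatureC: number;
  receivedAt: number; // timestamp in milliseconds
}

// In-memory store that keeps the readings of each Raspberry Pi apart by ID.
class ReadingStore {
  private readonly byDevice = new Map<string, Reading[]>();

  // Corresponds to the POST request sent by the Raspberry Pi.
  addReading(deviceId: string, temperatureC: number, receivedAt: number): void {
    const readings = this.byDevice.get(deviceId) ?? [];
    readings.push({ temperatureC, receivedAt });
    this.byDevice.set(deviceId, readings);
  }

  // Corresponds to the GET request sent by the smartphone.
  latest(deviceId: string): Reading | undefined {
    const readings = this.byDevice.get(deviceId);
    return readings === undefined ? undefined : readings[readings.length - 1];
  }
}

const store = new ReadingStore();
store.addReading("raspi-1", 41.5, 1000);
store.addReading("raspi-1", 42.0, 2000);
store.addReading("raspi-2", 20.0, 2000);

const latestForPi1 = store.latest("raspi-1");
console.log(latestForPi1?.temperatureC); // 42
```

Keying everything by the device ID is what lets the cloud match a smartphone to the right Raspberry Pi after the QR code has been scanned.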

Our technical workflow and the communication between the devices now looked like this:

Frontend

At the start of the project, our frontend consisted only of a web page that was supposed to show the thermometer's temperature. Later we extended it with a time-temperature graph and a thermometer graphic to display the temperature. This was relatively simple to implement with HTML, CSS and a bit of JavaScript. Later still, this view was complemented by an Android app, which offers a mobile way to check the current state. Technically it is a WebView that displays the web page on the phone.

This approach of first building an easily maintainable web page and then loading it in a WebView on the smartphone turned out to be a good idea. It allowed us to focus on the business logic first and then use it on the phone without much extra code. However, optimization for different devices then has to be done in the website rather than in the app itself, which is somewhat more laborious than doing it in Java for Android.

Time Calculation

A central requirement was to get an estimate of when, given the current temperature trend, the target temperature will be reached.

To this end we first recorded a sample measurement of a temperature curve. An analysis of different trend lines showed that a quadratic regression fits best. When choosing, we paid particular attention to the accuracy of the time estimate after a short measuring time, so that we get a good estimate relatively early on.

The actual calculation was then done in three steps. First, the mean values of the measurements were determined, and from these the coefficients of a quadratic equation were computed using the formulas of quadratic regression. With these coefficients we could then calculate the intersection with the desired target temperature. The advantage of this approach is that we can also handle negative temperature trends as well as other target temperatures. In practice, however, the system turned out to have some weaknesses. Constant temperatures, which occur mainly in the initial phase of heating, can make the calculation fluctuate heavily, so that sometimes no target time can be calculated for an extended period. There are also sometimes strong fluctuations over the course of a measurement. These problems can, however, be solved by cleaning the data beforehand.
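The three steps can be sketched as follows. This is an illustration under assumptions rather than our server code: time is measured in seconds, temperature in °C, the fit uses the usual mean-based least-squares formulas for T(t) = a·t² + b·t + c, and the function names and sample values are made up. At least three distinct timestamps are needed.

```typescript
interface Quadratic {
  a: number;
  b: number;
  c: number;
}

const mean = (values: number[]): number =>
  values.reduce((sum, v) => sum + v, 0) / values.length;

// Steps 1 and 2: means first, then the regression coefficients.
function fitQuadratic(t: number[], temp: number[]): Quadratic {
  const t2 = t.map((v) => v * v);
  const meanT = mean(t);
  const meanT2 = mean(t2);
  const meanTemp = mean(temp);
  let sTT = 0, sTT2 = 0, sT2T2 = 0, sTY = 0, sT2Y = 0;
  for (let i = 0; i < t.length; i++) {
    const dt = t[i] - meanT;
    const dt2 = t2[i] - meanT2;
    const dy = temp[i] - meanTemp;
    sTT += dt * dt;
    sTT2 += dt * dt2;
    sT2T2 += dt2 * dt2;
    sTY += dt * dy;
    sT2Y += dt2 * dy;
  }
  const det = sTT * sT2T2 - sTT2 * sTT2;
  const a = (sT2Y * sTT - sTY * sTT2) / det;
  const b = (sTY * sT2T2 - sT2Y * sTT2) / det;
  return { a, b, c: meanTemp - b * meanT - a * meanT2 };
}

// Step 3: earliest time after `now` at which the curve hits `target`,
// or null if the fitted curve never reaches it.
function timeToTarget(q: Quadratic, target: number, now: number): number | null {
  const c = q.c - target;
  let roots: number[];
  if (Math.abs(q.a) < 1e-12) {
    // Degenerates to a line (or a constant, which has no intersection).
    roots = q.b === 0 ? [] : [-c / q.b];
  } else {
    const disc = q.b * q.b - 4 * q.a * c;
    if (disc < 0) return null;
    const root = Math.sqrt(disc);
    roots = [(-q.b - root) / (2 * q.a), (-q.b + root) / (2 * q.a)];
  }
  const future = roots.filter((r) => r > now).sort((x, y) => x - y);
  return future.length > 0 ? future[0] : null;
}

// Made-up sample lying exactly on T(t) = -0.01 t^2 + 2 t + 20.
const fit = fitQuadratic([0, 10, 20, 30, 40], [20, 39, 56, 71, 84]);
const eta = timeToTarget(fit, 95, 40); // about 50 seconds
```

A constant series such as [20, 20, 20, 20] yields a = b = 0 and therefore null, which mirrors the weakness described above: while the temperature is not changing, no target time can be computed.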

Conclusion

Over the course of the project we naturally did not just do a lot of math, we also learned a great deal about cloud computing. For a beginner who has never been in contact with AWS before, getting started is quite overwhelming. There are considerably more beginner-friendly IaaS providers, such as the IBM Cloud.

As for our server, we are quite happy with our choice of Node.js as the web backend. Node.js has the advantage that it is very easy to set up a web server that listens for requests and can serve a web page at the same time. If you need more performance and send many parallel requests to the server, it would be worth setting up a server in Go. The same applies to our Raspberry Pi. Writing the Python script was quick, but here too one could switch to a more performant solution in C++.

Thanks to a tried-and-tested target calculation, our thermometer has gained considerably in functionality and can now be used for different temperatures.

Our project was clearly aimed at beginners. Technologies we already knew and had used were combined with new cloud technologies. Naturally, we could not make full use of the feature set of AWS. But we helped ourselves to the cloud step by step and thus got a first glimpse into the world of IaaS.

Written by: Nikolai Thees, Michael Partes & Joshua Gertheiss