In this approach, the Alexa Skill is built so that no dialogs are configured in the Skill backend at all. The skill only serves to capture the user's utterance, convert it to text (speech-to-text), and forward it to the chatbot backend.
Since the Alexa Skills Kit (ASK) does not inherently provide a way to access the user's raw utterances, a small trick is necessary: a single intent is created that uses a custom slot type to capture the entire utterance and provide it as an entity (called a slot in ASK). The alexa-bridge project was used as a starting point for the Skill Mediator and extended. Details on how to set up this makeshift intent and the custom slot type can also be found in that project's documentation.
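Such a catch-all intent might look roughly like the following in the skill's interaction model (the intent name, slot name, and sample values are illustrative, not taken from the alexa-bridge project; the custom slot type merely needs a handful of sample values so that the recognizer maps arbitrary speech onto the slot):

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "my mediator skill",
      "intents": [
        {
          "name": "CatchAllIntent",
          "slots": [
            { "name": "utterance", "type": "CATCH_ALL" }
          ],
          "samples": [
            "{utterance}"
          ]
        }
      ],
      "types": [
        {
          "name": "CATCH_ALL",
          "values": [
            { "name": { "value": "hello" } },
            { "name": { "value": "what is the weather today" } },
            { "name": { "value": "i would like to order a pizza" } }
          ]
        }
      ]
    }
  }
}
```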
The application reads the utterance from the slot value and sends it as a message to the chatbot backend. Once the backend returns a reply, the Skill Mediator converts it and sends it back to the Alexa backend, which forwards it to the Amazon Echo device for output.
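The mediator logic can be sketched in a few lines of Python. This is a minimal illustration, not the actual alexa-bridge implementation: the endpoint URL, the slot name `utterance`, and the chatbot message format are all assumptions.

```python
import json
import urllib.request

# Hypothetical chatbot backend endpoint -- replace with your own.
CHATBOT_URL = "http://localhost:5005/webhooks/rest/webhook"


def extract_utterance(alexa_request: dict) -> str:
    """Pull the raw utterance out of the catch-all slot of an Alexa request."""
    slots = alexa_request["request"]["intent"]["slots"]
    return slots["utterance"]["value"]


def build_alexa_response(text: str) -> dict:
    """Wrap the chatbot's reply in the response envelope Alexa expects."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": False,
        },
    }


def handle(alexa_request: dict) -> dict:
    """Forward the utterance to the chatbot backend and translate its reply."""
    payload = json.dumps({"message": extract_utterance(alexa_request)}).encode()
    req = urllib.request.Request(
        CHATBOT_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())["text"]  # assumed reply format
    return build_alexa_response(reply)
```

The two helper functions do the actual translation work; `handle` only wires them to the (assumed) HTTP interface of the chatbot backend.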
The procedure is well suited for very simple dialogs and for Alexa Skills operated primarily in English. For more complex dialogs, or for a skill in German, variant 2 is the better choice. This is mainly because natural language understanding (NLU) of German texts still lags behind that of English: depending on the type and form of the utterances and entities, comprehension problems may occur.
The big advantage of this variant is that only a single intent has to be configured in the skill backend. If new dialogs are added, no adaptation of the skill is necessary.