The system allows a business to train its own chatbot that can answer phone calls, greet users over the phone or by SMS, provide information about the client’s services, and help a user schedule an appointment based on available time slots. The application is HIPAA-compliant and includes emergency call scripts. There is also a dashboard where clients can create, configure, and train their own conversational agent.

Technical Solution

Given the nature of chatbots and the specific requirements of the application, the project is based on the following technologies:

  • Natural Language Processing (NLP) – the application must understand natural human language and translate it into signals that a machine or program can process. In our case, this is a Natural Language Classification (NLC) problem.
  • Speech to Text – a component that translates recorded or streamed voice into text, which can then be processed by the NLC component.
  • Text to Speech – a component that converts a predefined text answer to voice so that the answer can be delivered to the user over the phone.
  • Named-entity Recognition (NER) – a component that recognizes and extracts entities such as products, pricing, locations, and especially dates and times from the user’s speech, so that the system can schedule an appointment between the practitioner and the user.
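
To make the entity-extraction step concrete, here is a minimal sketch of pulling a time of day out of a transcribed utterance. It is not how Watson performs entity recognition; the patterns, the `extract_time` function, and the sample phrases are assumptions made for illustration, and a production system would rely on a trained NER model instead of regular expressions.

```python
import re
from datetime import time
from typing import Optional

# Hypothetical patterns: "3 pm", "3:30pm", "11 a.m." and 24-hour "15:45".
TWELVE_HOUR = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*([ap])\.?m\b", re.IGNORECASE)
TWENTY_FOUR_HOUR = re.compile(r"\b([01]?\d|2[0-3]):([0-5]\d)\b")


def extract_time(utterance: str) -> Optional[time]:
    """Return the first time of day mentioned in the utterance, if any."""
    # Prefer 12-hour matches so that "3:30 pm" is not read as 03:30.
    for match in TWELVE_HOUR.finditer(utterance):
        hour = int(match.group(1))
        minute = int(match.group(2) or 0)
        if 1 <= hour <= 12 and minute <= 59:
            is_pm = match.group(3).lower() == "p"
            return time(hour % 12 + (12 if is_pm else 0), minute)
    match = TWENTY_FOUR_HOUR.search(utterance)
    if match:
        return time(int(match.group(1)), int(match.group(2)))
    return None


requested = extract_time("Can I come in at 3:30 pm tomorrow?")
```

The extracted value can then be compared against the practitioner’s available time slots before an appointment is offered.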

As the team was developing an MVP, speed to market was one of the key factors. That is why, instead of building custom models with a framework such as Google TensorFlow, we strongly suggested using IBM Watson services, which help solve the project’s tasks. In particular:

  • Watson Conversation: Quickly build and deploy chatbots and virtual agents across a variety of channels, including mobile devices, messaging platforms, and even robots.
  • Watson Speech to Text: Easily convert audio and voice into written text for the quick understanding of content.
  • Watson Text to Speech: Convert written text into natural-sounding audio in a variety of languages and voices.
  • Voice Gateway: Cognitive Self-service agent. IBM Voice Gateway connects to a telephone network and routes the calls through Watson Speech-to-Text, Conversation, and Text to Speech services.
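
The routing described in the last item can be sketched as a single conversational turn: caller audio goes to speech recognition, the transcript goes to the conversation service, and the reply is synthesized back to audio. This is an illustrative sketch only; `CallTurnPipeline` and the three `fake_*` functions are hypothetical stand-ins and do not reflect the actual Voice Gateway or Watson APIs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CallTurnPipeline:
    """One turn of a call: caller audio in, synthesized reply audio out."""

    speech_to_text: Callable[[bytes], str]
    converse: Callable[[str], str]
    text_to_speech: Callable[[str], bytes]

    def handle_turn(self, caller_audio: bytes) -> bytes:
        transcript = self.speech_to_text(caller_audio)
        reply_text = self.converse(transcript)
        return self.text_to_speech(reply_text)


# Stand-ins for the real services (assumptions for this sketch):
# "audio" is simply UTF-8 text so the example runs without any network calls.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_converse(text: str) -> str:
    if "appointment" in text.lower():
        return "Sure, what day works for you?"
    return "Sorry, could you repeat that?"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = CallTurnPipeline(fake_stt, fake_converse, fake_tts)
reply_audio = pipeline.handle_turn(b"I want an appointment")
```

Keeping each stage behind a plain callable makes it possible to swap a stand-in for a real service client without changing the turn logic.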

Voice Gateway is an orchestration tool built to help create Interactive Voice Response (IVR) systems. For that purpose it ties together IBM services such as Watson Conversation, IBM Speech to Text, and IBM Text to Speech. The most basic scenario that IBM Voice Gateway helps to solve is the following:

The following diagram describes the main system components and a possible deployment scheme:

The main functions of the Service Orchestration Engine can include the following:

  • To de-identify requests by removing personal information such as PHI, PII, and PCI before they are sent to the Conversation service
  • To personalize responses from the Conversation service, for example by using customer location information to provide a personal weather forecast
  • To enable telephony features, such as including caller ID or collecting DTMF digits for account numbers
  • To customize interactions with customers by using APIs
  • To use Voice Gateway state variables, for example to complete a long transaction
  • To integrate voice security by using DTMF or biometrics
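
As an illustration of the de-identification function listed above, the sketch below masks a few obvious identifiers in a transcript before it would be forwarded to the Conversation service. The patterns and the `deidentify` helper are assumptions for this example; real PHI, PII, and PCI handling requires far broader coverage than three regular expressions.

```python
import re

# Hypothetical redaction rules, applied in order. Each maps a placeholder
# label to a pattern for one kind of identifier (US-style formats assumed).
REDACTION_RULES = [
    ("[EMAIL]", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[PHONE]", re.compile(r"\b\d{3}[ .-]?\d{3}[ .-]?\d{4}\b")),
]


def deidentify(text: str) -> str:
    """Replace recognized identifiers with placeholder labels."""
    for label, pattern in REDACTION_RULES:
        text = pattern.sub(label, text)
    return text


safe_text = deidentify("Call me at 555-123-4567 or jane@example.com")
```

In this sketch `safe_text` is what the orchestration layer would pass on, while the original values stay inside the compliant boundary.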

Tech Stack

  • Python
  • PostgreSQL
  • IBM Watson Conversation (which utilizes IBM Watson NLC and IBM Watson Entity Extraction and/or Alchemy API)
  • IBM Watson Speech to Text
  • IBM Watson Text to Speech
  • IBM Voice Gateway as the orchestration tool for Watson cognitive services
  • iCalendar standard
  • Twilio
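
Since the stack lists the iCalendar standard, here is a minimal sketch of how a booked appointment could be serialized as an iCalendar event. The `build_vevent` function, the `PRODID` value, and the sample UID are hypothetical; the sketch expects timezone-aware datetimes and omits the line folding that the standard requires for long lines.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

UTC_FORMAT = "%Y%m%dT%H%M%SZ"  # iCalendar UTC date-time form


def escape_text(value: str) -> str:
    """Escape characters that are special in iCalendar TEXT values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def build_vevent(uid: str, start: datetime, duration: timedelta,
                 summary: str, now: Optional[datetime] = None) -> str:
    """Return a one-event calendar; start and now must be timezone-aware."""
    now = now or datetime.now(timezone.utc)
    end = start + duration
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//Example Clinic//Chatbot Scheduler//EN",  # placeholder id
        "BEGIN:VEVENT",
        f"UID:{uid}",
        f"DTSTAMP:{now.astimezone(timezone.utc).strftime(UTC_FORMAT)}",
        f"DTSTART:{start.astimezone(timezone.utc).strftime(UTC_FORMAT)}",
        f"DTEND:{end.astimezone(timezone.utc).strftime(UTC_FORMAT)}",
        f"SUMMARY:{escape_text(summary)}",
        "END:VEVENT",
        "END:VCALENDAR",
    ]
    return "\r\n".join(lines) + "\r\n"  # the standard uses CRLF line endings


ics = build_vevent(
    uid="appt-1@example.invalid",
    start=datetime(2024, 5, 1, 15, 30, tzinfo=timezone.utc),
    duration=timedelta(minutes=30),
    summary="Dental check-up",
    now=datetime(2024, 4, 30, 12, 0, tzinfo=timezone.utc),
)
```

The resulting text can be attached to a confirmation message or imported into most calendar applications.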


Team

  • 1 Tech Lead
  • 1 Full-stack developer
  • 1 BA/PM
  • 1 QA engineer
