Project Overview

Repository

The project is available at https://github.com/leon2k6/independend_study.

Description

In this project we combine Google VR, Unity and Android to build a simple scene with a character who acts as an assistant. The user should be able to look at the animated character through an Android VR headset and talk to him in a simple way.

Related work

Many earlier projects are related to ours in some way. In the following we discuss three of them.

Eviebot

Evie was made by Existor (https://www.eviebot.com/en/). She is an AI and an emotional chatbot. She has been learning several languages from human beings for ten years now. The information that Evie searches every time she needs to say something is stored in a database. Evie has become very popular on YouTube in recent years.

Siri

Siri is an AI that is part of Apple’s operating systems. The software is designed to recognize and process natural-language speech in order to perform the functions of a personal assistant. It analyses the users' individual language usage, searches and preferences to return individualized results.

Alexa

Alexa is an intelligent personal assistant developed by Amazon. Users talk to a hardware device, which has to be online. Alexa can also control some smart home devices. Its speech recognition runs on a cloud-based long short-term memory (LSTM) artificial neural network. Alexa can also be used from a mobile device such as an Android phone. However, it is not possible to give Alexa an avatar, such as a secretary, and talk to that avatar; it always remains a faceless device that can understand you.

VR

To create a VR application for Android, we use the Google VR SDK. We followed this tutorial to set up Google VR in Unity (https://developers.google.com/vr/unity/get-started). The SDK can build either a Daydream or a Cardboard application. Daydream requires a Daydream-ready smartphone (https://vr.google.com/daydream/smartphonevr/phones/), whereas Cardboard requires Android 4.4 'KitKat' (API level 19) or higher. For this project we chose Cardboard in order to support more devices and reach more people with the app.

Once a main camera is placed in the scene, VR works, as shown in the following graphic.

VR Sight

For input with VR glasses, it is possible to use a controller or the Google VR component “GvrReticlePointer”, which draws a circular reticle that dilates when the object under it is clickable. In our project, one door and the two buttons (reset/exit) are made clickable with the “Event Trigger” component. Event triggers such as “Pointer Enter” fire when you point the reticle at the game object by looking at it. The dilation is shown in the next graphic.

Control Buttons

Android with speech recognition and TextToSpeech

Interface between Android and Unity

Building an Android APK from Unity is easy. It is only necessary to link the Android SDK to Unity as described in this tutorial (https://unity3d.com/de/learn/tutorials/topics/mobile-touch/building-your-unity-game-android-device-testing). For the interaction between Unity and Android, and thus between C# and Java, there are several types of Android plug-ins, which are described in the Unity documentation:

  • AAR plug-ins and Android Libraries
  • JAR plug-ins
  • Extending the UnityPlayerActivity Code
  • Native (C++) plug-ins

We use the AAR type for more flexibility. An AAR file can package all the necessary Android classes. Unity also recommends this format (https://docs.unity3d.com/Manual/AndroidAARPlugins.html).

Calling the static “instance” method of the Android class creates an instance of that class and returns it to Unity.

using (AndroidJavaClass pluginClass = new AndroidJavaClass("com.example.independentstudy.TTS")) {
    // Ask the plug-in class for its instance so that non-static methods can be called later.
    textToSpeech = pluginClass.CallStatic<AndroidJavaObject>("instance");
}

The instance is needed to call non-static methods of your Android class. Once you have the instance, set the application context in your Android class with the following call.

textToSpeech.Call("setContext", activityContext);

The application context is required to get full Android functionality.
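The Java side of these two calls is not shown here. A minimal sketch of what it might look like follows, assuming that “instance” returns one shared object. To keep the sketch compilable without the Android SDK, the context is typed as Object instead of android.content.Context, and the “hasContext” helper is hypothetical.

```java
// Sketch only. In a real AAR this would be a public class in the package
// com.example.independentstudy, and the context would be an
// android.content.Context rather than a plain Object (assumption).
final class TTS {
    // The single shared instance that Unity receives from instance().
    private static TTS instance;

    // Stand-in for the Android application context.
    private Object context;

    private TTS() {
    }

    // Called from C# via CallStatic<AndroidJavaObject>("instance").
    public static synchronized TTS instance() {
        if (instance == null) {
            instance = new TTS();
        }
        return instance;
    }

    // Called from C# via Call("setContext", activityContext).
    public void setContext(Object context) {
        this.context = context;
    }

    // Hypothetical helper: lets other methods check that setContext ran first.
    public boolean hasContext() {
        return context != null;
    }
}
```

Keeping the instance in a static field means that every later call from Unity reaches the same object, so the context only has to be set once.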

To send messages from Android back to Unity, the following method can be used.

UnityPlayer.UnitySendMessage("Camera", "GetMessage", message);

The variable “message” is passed to the “GetMessage” method of the game object “Camera”. To use this method, it is necessary to include the UnityPlayer.jar from the Unity directory in your Android project.

Speech recognition

There are several ways to do speech recognition:
  • Unity's built-in speech recognition (available since Unity 5)
  • use/buy an asset that uses the Google Now speech recognition (most of these assets are not free)
  • Google Cloud Speech API, which uses a neural network to detect the language and speech, but is not free
  • android.speech, a Google service that is less powerful than the Cloud Speech API, but free
  • Google Now assistant

Unity's speech recognition uses the Microsoft Windows API (UnityEngine.Windows.Speech), so the built-in recognition is only usable on Windows and not on Android. Most Unity plug-ins from the Asset Store are not free. The Google Cloud Speech API is also not free, although it offers some free minutes for testing. For the Google Now assistant, we did not find any API for third-party apps like ours, so we had to implement our own interface between Android and Unity and use the android.speech service.

Important: every SpeechRecognizer function has to be called inside “runOnUiThread”:
activityContext.Call("runOnUiThread", new AndroidJavaRunnable(() => { /* SpeechRecognizer calls go here */ }));

Without this call, Android reports the error "SpeechRecognizer should be used only from the application's main thread".

Android text to speech

Instead of audio files, Android's text to speech (https://developer.android.com/reference/android/speech/tts/TextToSpeech.html) is used, because it is much more flexible. For this purpose, a Java class called TTS was implemented in Android Studio. Text to speech creates the voice of the character. The TTS class is included in the AAR file, which is the interface between Unity and Android.

Performance optimizations

To keep the frame rate high enough on Android, the scene should not contain a terrain, because terrains cost a lot of performance. A better choice is a small environment in which as many game objects as possible are marked static. Lighting is another way to improve performance: it is better to “bake” the lighting than to calculate it in real time, but this only works if the scene has few moving dynamic objects. Some small (shadow) settings recommended here (https://medium.com/ironequal/android-optimization-with-unity-3504b34f00b0) were implemented but did not have a large impact.

Character

The character was created with Fuse, a tool from Adobe. Fuse offers many options for designing a character. You can choose between different torsos, legs, arms and heads. Once you have chosen these parts, each of them can be customized; for example, it is possible to make the nose bigger. The next step is to choose clothes for the character from a predesigned set. In the last step, the texture of each part can be changed.
To import the character from Fuse into Unity, you need to upload it to https://www.mixamo.com to create a custom skeleton for the character. After that, the character is ready to use in Unity and can use animations from Mixamo. The import from Fuse into Unity causes a problem: all materials of the imported character are broken. The script and workflow from this site (https://forum.unity.com/threads/script-for-importing-adobe-fuse-character-model-into-unity-fixes-materials.482093/) deal with this problem.

Mixamo

Head Look Controller

For natural movement of the character, the Head Look Controller (https://assetstore.unity.com/packages/tools/animation/head-look-controller-4) is used. The Head Look Controller is a script provided by Unity Technologies that needs to be attached to a character. Inside the script you decide which parts of the character it should affect. Then you choose another object in the scene as the target toward which the character should turn its upper body, head or whichever parts are affected. In our scene, the Head Look Controller affects the upper body, the head and the eyes of the character, so that the character looks in the direction of the player.

Dialog:

Dialog

Future Work

Animation

The talking animation does not match the speech perfectly and the mouth does not move. In future work it would be interesting to find methods that make the character more realistic, and perhaps to consult literature on natural movement.

Conversation

Another idea to improve this project is to extend the conversation. An artificial intelligence could be used to make the character's answers more flexible, which could make the conversation much more interesting.

Functionality

For now, the character can only answer a limited set of specific questions with predefined sentences. It could be interesting to give the character the ability to use external APIs or Android functionality to execute commands from the user. For example, the user asks for the weather in Reykjavik, the program uses Android functionality to find the answer, as Google Now does, and finally the character gives the information to the user.
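How recognized questions are matched to the predefined sentences is not described here. One simple way to do it is a keyword lookup, sketched below in plain Java. The class name, the keywords and the answers are all invented for illustration and are not taken from the project.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Minimal keyword lookup: maps a recognized sentence to a predefined answer.
// All keywords and answers are hypothetical examples.
final class PredefinedAnswers {
    private static final String FALLBACK = "Sorry, I did not understand that.";

    // LinkedHashMap keeps insertion order, so earlier keywords win.
    private final Map<String, String> answers = new LinkedHashMap<>();

    PredefinedAnswers() {
        answers.put("hello", "Hello, nice to meet you.");
        answers.put("your name", "I am your assistant.");
        answers.put("bye", "Goodbye.");
    }

    // Returns the answer for the first keyword contained in the recognized text.
    public String answerFor(String recognizedText) {
        if (recognizedText == null) {
            return FALLBACK;
        }
        String text = recognizedText.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> entry : answers.entrySet()) {
            if (text.contains(entry.getKey())) {
                return entry.getValue();
            }
        }
        return FALLBACK;
    }
}
```

In such a design, the returned sentence would be handed to text to speech, and a command that uses an external API could be added as a further case before the fallback.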

VR_Sight.PNG - VR Sight (36.5 KB) Pascal Bechtoldt, 2018-01-25 19:50

Controll_Buttons.PNG - Controll Buttons (312 KB) Pascal Bechtoldt, 2018-01-25 19:52

Mixamo.PNG - Mixamo (81.2 KB) Pascal Bechtoldt, 2018-01-25 19:54

Dialog.PNG - Dialog (40.5 KB) Pascal Bechtoldt, 2018-01-25 20:33