
Post-Mortem

November 28, 2011 in Articles


Introduction

DynacTiV is an ETC project, sponsored by Microsoft, dedicated to exploring interactive TV experiences. The high concept is to use Kinect as a bridge that connects the audience and TV broadcasters in real time. Traditional live broadcasts are really just one-way communication. Call-ins, Twitter, and web comments bring some instant feedback, but they are not as involving or intriguing, and they distract the viewer. Now, with Microsoft's Kinect, a unique input device, there are opportunities to take interactive television where it has never gone before. Advanced body position and motion tracking, facial and voice recognition, and intuitive gesture control are just a few of the features Team DynacTiV explored to revolutionize interactive television.

 

Goal

The original goal in the project listing was to "explore turning live media (TV) into a more interactive experience". Just as important, the exploration "should be unique and something not tried before".

Some early directions we got from our client contact, Arnold Blinn, were:

  • Push Kinect. This is a differentiating Microsoft feature that competitors can't replicate.

  • When using companion devices, feel free to use them all (iPhone, Android), but don't ignore Windows Phone or simple laptops running Windows.

  • Services should be built in C# and run on Azure, so that we have something we can take over and run with if it is interesting.

  • Prototypes/development of the "TV viewing" experience should be built on a PC with a connected Kinect. If Microsoft likes it and wants it on the console, they will port it.

  • We want users interacting with the content, not simply controlling the pace of the content in a DVR way.

  • Consider what "interactive" means for interactive TV.

  • Regarding TV genres, LIVE broadcasts that are NOT SPORTS are an area of interest. Given the major U.S. elections in 2012, politics and speeches/debates/town hall meetings are one area with a lot of potential. If you had to work with this genre, what would you do?

 

The Team

We are a team of five people: three second-year students and two hired staff. The two programmers are Peter Kinney and Zach Cummings. Wai Kay Kong worked as the designer and 3D artist. Jue Wang was responsible for UI art. Kan Dong was the producer and also contributed to UI design.

The Client

We were really excited to collaborate with Microsoft Corp., a worldwide software leader, on the goal of using Kinect technology to create an enjoyable service around the TV-watching experience. Our client was interested in our idea of building a real-time relationship between broadcasters and TV viewers. Traditional broadcast shows have no connection with the audience they are broadcasting to; the only reaction they can capture comes from the people in the venue. DynacTiV, by contrast, tries to connect to a vastly broader audience, potentially millions of people watching TV at home. And Kinect technology opens up the possibilities for more organic input, like gesture controls and voice commands.

“Case 1: the ideal vs. the reality”

We came up with tons of ideas during the first week's brainstorming, and they happened to fall into categories at different levels of interactivity. Our client wanted us to focus on level 2 or 3 interactions.

In the second week, we consolidated our ideas into four concepts: "Augmented Politics", "Interactive Kinect Story Teller", "Ghost Town", and "Body Posture". After pitching them to our client, the concept of "Interactive Kinect Story Teller" got the most applause; it has extremely valuable potential for a large crowd of virtual audience members to provide feedback on live performances. Arnold also liked the ideas of "Augmented Politics" and "Body Posture", and said it would be great to integrate the two to crowd-source a large amount of data that gives the broadcast speaker valuable input about the success of their presentation. At this stage, we nailed down our high concept and started to flesh it out with more design details the following week.

Our primary focus shifted to the interactions directly between viewers and performers, with less focus on contact between users. By aggregating the audience's body postures, polled opinions, and background noise levels, the system uses Kinect to estimate likely levels of user interest in the program. This information can be displayed for the performer, allowing them to adjust toward whatever gains the most positive response from viewers.

Data to be detected:

  • Video Data
    Number of people in the room
    Amount of movement/fidgeting
    Composite of all data

  • Audio Data
    Background noise levels
    Audio layered together to simulate a “crowd”

  • Other
    Voting
    Sub-menus
    Gimmicks (throwing tomatoes)
    Avatars
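As a rough illustration (not the shipped implementation), the list above maps naturally onto a single per-household "interest" score that the client could compute before sending anything to the server. The field names, weights, and 0-to-1 normalization below are our own assumptions for the sketch.

```csharp
using System;

// Illustrative only: combine passively collected signals into one
// "interest" score per household. Field names and weights are assumptions,
// not the actual DynacTiV implementation.
public class PassiveSample
{
    public int PeopleInRoom;       // from skeleton tracking
    public float Movement;         // 0..1, amount of fidgeting this interval
    public float BackgroundNoise;  // 0..1, normalized microphone level
}

public static class InterestEstimator
{
    // Weighted blend: more people and audible reaction raise interest,
    // heavy fidgeting lowers it. The result is clamped to 0..1.
    public static float Estimate(PassiveSample s)
    {
        float presence = Math.Min(s.PeopleInRoom, 4) / 4f;
        float calmness = 1f - s.Movement;
        float reaction = s.BackgroundNoise;
        float score = 0.4f * presence + 0.3f * calmness + 0.3f * reaction;
        return Math.Max(0f, Math.Min(1f, score));
    }
}
```

Averaging scores like this across all connected households would give the "composite of all data" item above, which is what the performer-side display would actually show.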

What went well

The path we actually travelled stuck to the original goal of delivering an above-level-2 interactive viewing experience that had not been explored before. We used Kinect as our exclusive input device and built the whole experience around political events, even though the system can just as well be applied to any other live broadcast genre.

Taking the interactions between viewers and performers as the priority, we put most of our effort into finding the most meaningful, efficient, and intuitive ways to communicate. We also took Kinect's unique features into consideration, aiming for an experience that competitors can't replicate. So we came up with passive data collection, in which the audience doesn't have to do anything explicitly; Kinect does the collecting work in the background with its skeleton tracking and microphone array. In addition, by combining skeleton tracking with depth image processing, the audience can also actively interact with the broadcast.
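To make the passive collection concrete, here is one way a client could turn raw skeleton joint positions into the movement/fidgeting number mentioned above. This is only a sketch; the Vector3 type and the per-frame averaging are assumptions, since the wrapper we used exposes its own types.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: measure how much a tracked viewer moved between two
// skeleton frames by averaging per-joint displacement. The Vector3 type here
// is a stand-in, not the wrapper's actual API.
public struct Vector3
{
    public float X, Y, Z;
    public Vector3(float x, float y, float z) { X = x; Y = y; Z = z; }

    public static float Distance(Vector3 a, Vector3 b)
    {
        float dx = a.X - b.X, dy = a.Y - b.Y, dz = a.Z - b.Z;
        return (float)Math.Sqrt(dx * dx + dy * dy + dz * dz);
    }
}

public static class MovementTracker
{
    // Average displacement of all joints between consecutive frames.
    // Larger values mean more fidgeting; near zero means sitting still.
    public static float FrameMovement(IList<Vector3> previous, IList<Vector3> current)
    {
        if (previous.Count == 0 || previous.Count != current.Count)
            return 0f;

        float total = 0f;
        for (int i = 0; i < previous.Count; i++)
            total += Vector3.Distance(previous[i], current[i]);
        return total / previous.Count;
    }
}
```

Smoothing this value over a few seconds before reporting it would keep a single shift in posture from being counted as sustained fidgeting.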

What went wrong

During development, too much back-and-forth resulted in wasted time. For example, we initially offered thumbs up/down gestures along with throwing flowers/tomatoes, but had to drop the throwing feature because it was too distracting and defeated the purpose of giving efficient feedback.

Another part of our original design that was dropped at an early development stage is virtual audience interaction, which we added back later, along with the stage theme idea, after halves. We abandoned it at first because we were directed to focus exclusively on the interactions between audience and performers, without worrying about interactions among virtual audience members. However, even though this is not strictly in the scope of our project, we wanted to deliver a complete experience with a themed art background and a simulated shared-viewing experience. So in later development, once we placed the viewing environment on a 3D podium, we brought this feature back as a highlight that makes audience members feel more involved. In a future vision, this feature could be tied to Xbox avatars, turning these simulated interactions into a virtual viewing experience shared with your Xbox friends online.

“Case 2: playful high technology”

What went well:

Working with the Kinect was straightforward and a relatively easy process. We had a very effective wrapper that gave us access to all of the depth, video, and skeletal tracking data streams that we needed, which let us get a large portion of the program done in a relatively short period of time. The server was also relatively pain-free: previous networking experience and a simple set of requirements allowed us to create and configure the server code quickly, leaving more time to debug and make additional modifications. Overall the project was paced well, with the coding done early enough to allow time for stress testing and debugging. Constant play tests following halves gave us critical user feedback and helped reveal bugs as early in development as possible. That feedback also allowed us to remove extraneous features, which freed up more care for developing the core features most closely related to the primary goals of the project.
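For context, the "simple set of requirements" boiled down to receiving small feedback messages from many viewer clients and keeping running totals for the performer side. The sketch below shows that shape with a plain TCP listener; the port and message names are assumptions, since the real server lives inside our Unity/C# codebase.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading;

// Illustrative aggregation server: each viewer client sends one line per
// event ("THUMB_UP", "THUMB_DOWN", ...) and the server keeps totals that a
// performer-side display could poll. Port and message names are assumptions.
public static class FeedbackServer
{
    static int thumbsUp, thumbsDown;
    static readonly object gate = new object();

    public static void Main()
    {
        var listener = new TcpListener(IPAddress.Any, 9000);
        listener.Start();
        Console.WriteLine("Listening on port 9000...");
        while (true)
        {
            TcpClient client = listener.AcceptTcpClient();
            new Thread(() => Handle(client)).Start();
        }
    }

    static void Handle(TcpClient client)
    {
        using (var reader = new StreamReader(client.GetStream()))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lock (gate)
                {
                    if (line == "THUMB_UP") thumbsUp++;
                    else if (line == "THUMB_DOWN") thumbsDown++;
                    Console.WriteLine("Approval: {0} up / {1} down", thumbsUp, thumbsDown);
                }
            }
        }
        client.Close();
    }
}
```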

What went wrong:

Underestimating the difficulty of accessing the Kinect's audio feed caused a substantial amount of time to be sunk into various dead-end areas of research and testing. In the end we discovered ways to implement most of the audio features we wanted. Some features, such as voice recognition, couldn't be implemented due to conflicts between MonoDevelop and MSDN version support. Another difficulty, only semi-problematic during an early portion of the project, was version control. The Unity server was not set up until a few weeks into the project; for those first few weeks, major asset changes and parallel lines of development resulted in multiple versions of the project that eventually had to be merged together. After everyone had server access, this problem ceased to exist. One problem beyond the team's control was related to testing: while we did the best we could, it was virtually impossible to simulate a real-world "at home" experience in our testing environment. Additionally, hardware limitations, such as only being able to track two people at a time, further limited test-group sizes.

“Case 3: what makes a good interface”

What went well:

The good thing about our user interface development is that we ended up with a very smooth and complete workflow. At the beginning, since none of us had prior experience with user interface design and development, we had a relatively difficult start. For the first version of the UI, we just did what we thought would work and did not put much effort into research; as a result, the first version was not good, and a lot of functions did not work the way we wanted them to. However, we soon figured out a good way to develop the UI: research and design, then art and code, then play-test, then use the feedback from the play-test to iterate on the design again. It turned out to be a very efficient way to develop a user interface. An interface is usually an ambiguous thing, since different people have different feelings about it. After every play-test, we got feedback about what confused play-testers and which approaches they preferred. Based on that feedback, we filtered out what needed to be changed or added, and then the artists and programmers made quick changes to the interface and got ready for the next play-test. After several iterations, we arrived at a clear user interface.

What went wrong:

Since we had no prior experience with interface development, we had a very slow start. If we could have worked efficiently from the beginning of the semester, we would have had more time to polish.
Another issue concerns play-testers. Since we held all of our play tests at the ETC, most play-testers were gamers or people with some knowledge of interactive experiences. But the audience for our product is broader than that: almost everyone watches TV shows, so we have a very wide range of potential users. If we could have recruited more naive play-testers, we would have ended up with a better user interface for naive users.

To sum up, the user interface matters. At the beginning, we thought our main task in the project was to make sure the technology worked, and that the interface was just a side concern. But the interface is really the front end of our project: without a good interface, we could neither efficiently test whether our technology worked nor show people how it works.

“Case 4: we didn’t play test at all”

What went well:

The title is ironic: we actually found that play testing couldn't have been more useful. We ran 7 open play tests with more than 100 play testers drawn from ETC students, faculty, and visitors.

Oct 18, Alpha Test (passive)

Oct 19, Alpha Test (passive & active)

Nov 2, UI & Usability Test

Nov 6, Naive Test

Nov 9, Server & Audience Interaction Test

Nov 11, Open-loop Test

Nov 18, Beta Test (closed-loop)

Dec 2, Live Debate Test

For each of these tests, we documented the session with Fraps demo captures, videos and photos of play testers, surveys, and interviews. We found that segmenting the whole product into parts and testing them at different times helped the entire development process. On one hand, testing each feature separately filters out the influence of the other features and lets testers focus on commenting on that one feature, so we got really strong and useful feedback on each of them. In addition, this saved us time and guided our progress, because if we had waited until everything was finished to test, it would have been too late to change anything.

What went wrong:

As we figured out where the problems were after each test, it became really hard to follow the schedule we had made beforehand. The schedule changed several times, and a lot of unplanned work appeared, depending on how strongly testers felt about the drawbacks. We needed time to fix the things people hated; otherwise they would keep focusing on the same points next time instead of giving fresh feedback.

“Case 5: design in limitations”

What went well:

Since we had some prior experience working with the Kinect, we knew exactly what the specifications and limits of the Kinect technology were. Knowing the limitations helped the design immensely. We followed the philosophy of "small but done well", and it worked out for us. Experience with the Unity engine helped determine ahead of time what was feasible design-wise. After the initial two-week brainstorming session, the project goals were clear and well within scope. Reaching the constant-testing phase after halves proved useful for a plethora of design changes. We managed to keep feature creep in check by maintaining focus, and we reached the polish step on schedule.

What went wrong:

A good number of originally planned features had to be cut, either because of technological difficulties, because of things we found in testing, or because similar ideas were merged. Difficulty with Kinect audio led to certain sub-features being cut from the Simulated Audience feature: several of its actions relied on voice recognition (booing, cheering, key words). Thankfully, we came up with alternative actions for the Simulated Audience that do not require voice recognition. The tomato/flower throwing feature was extremely well received, but we found that it distracts the viewer from the broadcast; instead of watching, the viewer ends up spending the entire time trying to land a tomato on someone's face. In light of this, we had to kill this baby, because the project is ultimately about television, not about scoring head-shots. Voting and thumbs up/down were extremely similar ideas with two separate methods of input. The original "Voting" was more complex and involved, requiring a lot of gesture inputs in an attempt to relay how viewers felt. In the end, we felt it was too complicated; the main pieces of information broadcasters need are level of interest and approval.

Some extra assets made for the project also ended up unused, either for the reasons above or because of theme inconsistency. The first stage was designed to have an old-time feel. However, we wanted the experience to revolve around politics, so the old-time stage did not fit, and a new, modern stage was created instead.

Conclusion

One of the most important takeaways from this project is the value of research and foresight. The "audio problem" resulted in part from a gross underestimation of how difficult accessing the audio stream would be. In the future, discovering problematic areas ahead of time and finding an appropriate way to address them could greatly improve efficiency. Additionally, exploring how to introduce new features without taking away from the core goals is crucial. Most notably, the "fun" feature of throwing things at the stage/screen was interesting, yet proved to distract from the core goal of watching a broadcast. Added features are meant to enhance an experience, not change its goal. Furthermore, the value of user testing was extremely evident throughout the project. Constant tests following halves allowed the UI to continuously evolve to become more intuitive and user friendly. Since the product is intended to appeal to a large audience, it was imperative to explore every resource to get as many test subjects from as broad a spectrum as possible.

As a next step, our client Microsoft will review our work and, if they are interested, may take it over. In DynacTiV's future vision, though, people's viewing experience is refreshed dramatically, with a Kinect sitting in everyone's living room. Watching a live broadcast and anxious to join in with your opinion? Make a simple thumbs up or thumbs down gesture, and Kinect will detect it and send it to the broadcasters right away; your feedback takes effect immediately. From your Xbox friends list, you can choose companions to watch with you digitally and remotely, and interact with them. You are watching TV at home, but it really feels like you are watching it live at the venue!

What did we learn from Usability&UI Test?

November 7, 2011 in Articles

Team DynacTiV ran its second test this Wednesday. The main aim was to test usability and the user interface. We wanted to find out how people feel about the controls, including political party selection and thumbs up or down voting, and whether the UI is clear and easy to understand. We had 14 play testers, 12 of whom completed the follow-up survey.

Check out our demo videos here on YouTube!

[DynacTiV] UI Play Test on Nov.2

Here is some of the valuable feedback we got:

About UI
Average rating: 3.3
Opinions were about even: some parts are confusing and some are clear, so there is still plenty of room for improvement. Here are some notes on what we are going to put effort into changing.

  • Remove redundant UI to make the view of the actual show larger.
  • The screen looks too busy; keep it simple and efficient.
  • Voice instructions together with on-screen text would help more with the initial directions.
  • The Independent logo is hard to find at first; only highlight useful choices.
  • It took me a second to realize I had to hold for longer to confirm my selection.
  • Filling up the shadow may not be a good way to confirm a gesture.
  • People perform gestures differently under a vague instruction like "reach and hold"; we need to provide graphic instructions that show the gesture with an image.
  • The voting bar needs to be larger, with clear neutral, top, and bottom positions, so people know how it represents their input.
  • The voting bar should be right next to my shadow.
  • Both the filling shadow and the voting bar need to end with strong feedback.
  • Have 5 states of agreement/disagreement, because I only want to vote if I really like or dislike something. The five states could be: really like, agree, neutral, disagree, really hate. We are going to stress the top and bottom points (see the sketch after this list).
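Several of the notes above (a larger voting bar, a clearer neutral point, five discrete states with emphasized extremes) reduce to mapping one continuous bar position onto labelled bands. Here is a minimal sketch of that mapping; the band boundaries are placeholder values we would tune through play testing, not final design numbers.

```csharp
using System;

// Illustrative mapping from a continuous voting-bar position (0 = bottom,
// 1 = top, 0.5 = neutral) to the five states testers asked for. Band
// boundaries are placeholder values to be tuned in play tests.
public enum VoteState { ReallyHate, Disagree, Neutral, Agree, ReallyLike }

public static class VotingBar
{
    public static VoteState Classify(float barPosition)
    {
        float p = Math.Max(0f, Math.Min(1f, barPosition));
        if (p < 0.10f) return VoteState.ReallyHate;   // emphasized bottom band
        if (p < 0.45f) return VoteState.Disagree;
        if (p <= 0.55f) return VoteState.Neutral;
        if (p <= 0.90f) return VoteState.Agree;
        return VoteState.ReallyLike;                  // emphasized top band
    }
}
```

Whether the extreme bands should be wide and easy to hit, or narrow so that "really like/hate" takes deliberate effort, is exactly the kind of question the next round of play tests should answer.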

About Control Experience

  • The effective region for gesture control didn't map to the UI very well, which made people feel like they needed to stand up to reach it.
  • Once people see their silhouette, they want to see their thumb when they vote too, so they start voting beside their body. Previously, our system only registered gestures in front of the body.

Some Other Feedback

  • How do we respond to the feedback of “closing the gap”?

We'll host two closed-loop tests on Nov 16 and Nov 18. The way to close the loop is to have a moderator of a political debate, rather than the individual performers, read all of our data while the talk is being given. We'll pick two debate video clips, edit them, and control on the production side which videos play based on the feedback we get from the audience.

  • Try to target play testers in an older demographic, people who would actually watch political shows.

Half Presentation

November 1, 2011 in Articles

Team DynacTiV gave its halves presentation on Oct 26th: roughly 15 minutes of presentation plus 5 minutes of questions. We did a good job. People came away with a better understanding of what our project is, where our progress stands, and what our goal is. We also got a lot of good feedback from faculty and ETCers.

What did we learn from Alpha Test?

November 1, 2011 in Articles

We had our alpha-playtest on Oct 18th and 19th to test what we had done on our client side.

Here is a video demo of what we had done so far on the client side, along with the play testers who participated.

DynacTiV AlphaPlaytest

 

After looking into all the data and feedback from our alpha play test, we changed part of our design.

Here are some of the data results and feedback, and what we are going to do to solve those problems.

 

Passive data:

Some data results matched what we expected and some did not, since different viewers have different viewing habits.

In multi-person groups, people also affect each other. For example, some groups were really stiff in our playtest because the people were not familiar with each other, so it is hard to tell whether the movement data from multi-person groups reflects how people actually feel.

 

Feedback:

Gesture control: people are comfortable with the gestures but feel that some of them don't work well.

Our solution is to offer fewer gestures, just thumbs up and thumbs down, so viewers do not need to remember many different gestures.

Interaction: people like it, but we found that some people were distracted by our active interactions, especially throwing, and could not focus on the show.

Our solution is to remove throwing and make thumbs up and thumbs down more intuitive.

UI feedback: a lot of people feel that the feedback in the UI is too subtle.

Our solution is to make the UI feedback more obvious and to set a timer viewers must hold through to confirm their gestures (sketched below).

Multi-person groups: people get confused about who has control and get frustrated when they don't have it.

Our solution is to limit the audience to two people per Kinect and to color-code the audience shadows on the screen.
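The confirmation timer mentioned above comes down to a simple dwell-time check: a gesture only registers once it has been held continuously for some duration, and the fill-up visual tracks the progress. A minimal sketch follows; the 1.5-second hold time is an assumed placeholder, not a tuned value.

```csharp
using System;

// Illustrative hold-to-confirm timer: a gesture (e.g. thumbs up) only counts
// once it has been held continuously for HoldSeconds. The Progress value can
// drive the "filling up" visual feedback.
public class GestureConfirmer
{
    public float HoldSeconds = 1.5f;   // assumed placeholder; tune in play tests
    float heldFor;

    // Call once per frame with whether the gesture is currently detected and
    // the time elapsed since the last frame. Returns true exactly on the
    // frame the gesture becomes confirmed.
    public bool Update(bool gestureDetected, float deltaSeconds)
    {
        if (!gestureDetected)
        {
            heldFor = 0f;
            return false;
        }
        float before = heldFor;
        heldFor += deltaSeconds;
        return before < HoldSeconds && heldFor >= HoldSeconds;
    }

    // Fraction of the hold completed, 0..1, for on-screen feedback.
    public float Progress
    {
        get { return Math.Min(1f, heldFor / HoldSeconds); }
    }
}
```

The same Progress value can drive the shadow fill-up, giving the stronger end-of-gesture feedback that testers asked for in the UI test.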

Scenario for viewers

October 25, 2011 in Articles

We set up a scenario for viewers to go through our whole experience.

A day in DynacTiV Systems – Audience Side

An appointed live event is scheduled for a specific time.

  1. The viewer sets up at the appointed time.
  2. The welcome screen pops up saying “DynacTiV Systems presents: The Colbert Report.” Underneath the title slowly flashes “Wave to start.”
  3. After they wave, curtains come down, covering the welcome screen and a rope drops down with text saying “Pull.” A 3D hand appears.
  4. After the user pulls the rope, the curtains rise to reveal the stage and a giant projector screen with a countdown timer on it. The user sees silhouettes on the bottom and the UI appears on the screen. A text appears in the corner “Wave to lock User Interface.” Locking makes the hand disappear.
  5. The silhouettes are made up from a random selection of other DynacTiV users. Some of the silhouettes are bots.
  6. The countdown reaches zero and The Colbert Report intro starts.
  7. The stage audience cheers and applauds, clapping animation is played. The DynacTiV system detects noise, movement and clapping and interprets that as applause. Sends the signal to server. This signal is then fed to the other users currently connected in the same network.
  8. The show starts. The User waves to lock the interface.
  9. Stephen Colbert says a particularly funny line. People laugh. Passive data collected and sent to servers. Laughter is detected. Sent to other users.
  10. User waves to unlock the interface, it appears and the user gives a Thumbs Up. The thumbs up icon becomes bigger, flashes and shrinks back down to normal size. The 3D hand also does a Thumbs up. Thumbs Up is detected and is sent to server. Thumbs cool down is triggered. The Thumbs icon is grayed out with a bold number (the cool down) displayed on it.
  11. The Thumbs up bar on the side of the screen goes up. Other users have followed suit.
  12. The user continues to maintain a thumbs up for another 10 seconds. The flower icon appears in the corner with the text “Throw!” A flower appears in the 3D hand. The user makes a throwing motion and a flower flies onto the stage. The flower icon grays out with the text saying “5min.” Cool down initiated. Flower signal sent to servers.
  13. Other users decide to follow suit and also throw flowers. This information is fed back to all users.
  14. One of the silhouettes sneezes. The user says “Gesundheit.” The system detects it and relays the message. This also works vice versa.
  15. The user starts talking on the phone. The Kinect detects the speech and one of the bots says “Shh.”
  16. Hearing this, the User toggles mute. All data from the user stops being collected until it is un-muted. After the phone conversation is done, the user un-mutes and re-locks.
  17. Towards the end of the show, a disagreeable interviewee appears. After he says some disagreeable statements, the user unlocks and makes a thumbs down gesture. Similar to #12, #13 and #14 but with a tomato splat.
  18. Stephen Colbert closes with his ending remarks. The user stands up and claps for a standing ovation. Kinect detects loud noise, clapping and user is standing. Standing ovation. Some of the silhouettes follow suit and obscure slightly more of the screen from standing up.
  19. The digital screen says “End of Broadcast.” A rope comes down with text saying “Pull!”
  20. After the rope is pulled, the curtains fall and ends the program.

Note: As the performer side isn’t designed yet, the outline above doesn’t reflect anything performer-side.
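The cool-downs described in steps 10 and 12 of the scenario (the grayed-out thumbs icon with a countdown, and the “5min” flower timer) could be handled by one small helper on the client. The sketch below is illustrative; the 5-minute flower duration comes from the scenario text, while the 30-second thumbs duration is an assumption.

```csharp
using System;

// Illustrative per-action cool-down helper for the scenario above.
// The 5-minute flower cool-down comes from the scenario text; the
// 30-second thumbs cool-down is an assumed placeholder.
public class Cooldown
{
    readonly TimeSpan duration;
    DateTime readyAt = DateTime.MinValue;

    public Cooldown(TimeSpan duration) { this.duration = duration; }

    public bool Ready { get { return DateTime.UtcNow >= readyAt; } }

    // Remaining time, for the grayed-out icon's countdown label.
    public TimeSpan Remaining
    {
        get
        {
            TimeSpan left = readyAt - DateTime.UtcNow;
            return left > TimeSpan.Zero ? left : TimeSpan.Zero;
        }
    }

    // Returns true and starts the cool-down if the action is allowed now.
    public bool TryUse()
    {
        if (!Ready) return false;
        readyAt = DateTime.UtcNow + duration;
        return true;
    }
}

// Rough usage in the client:
//   var thumbs  = new Cooldown(TimeSpan.FromSeconds(30)); // assumed duration
//   var flowers = new Cooldown(TimeSpan.FromMinutes(5));  // from step 12
//   if (thumbs.TryUse()) { /* send the thumbs-up event to the server */ }
```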