Monday, November 10, 2014

How Eye Tracking can impact Head Tracking

I'll be visiting the Society for Neuroscience meeting this weekend in Washington, DC and will surely see some of the latest advancements in eye tracking.

People often ask why eye tracking is useful beyond the obvious applications of facilitating research and providing a user interface for people with disabilities.

One interesting application is using eye tracking to minimize head-tracking latency. Consider the following graph:

The graph shows the position of an eye (black line) and the position of the head (red line), both in degrees, over time. Let's look at some areas of this graph:

  • From about 6.5 seconds to 6.7 seconds, we see rapid eye movement from about -5 to +25 degrees. During this period, the head did not move.
  • From 6.7 to about 7 seconds, we see the head moving and, at the same time, the eye moving in the opposite direction. Notice that the sum of the head position and eye position is approximately constant throughout this period.
  • From 7 to 8 seconds, both the eye and the head are stationary.
  • From 8 to about 8.2 seconds, the eye moves in the opposite direction.
  • From 8.2 to about 8.5 seconds, the head follows and the eye reverses direction.
What is happening? It turns out that eye movements precede head movements. When the body wants to look in a certain direction, the eye jumps ahead (to a direction over +20 or under -20 degrees relative to the 'straight ahead' position). Then the head starts to follow while the eye compensates with a movement in the other direction, so that the gaze direction (head orientation + eye orientation) stays the same. During the head movement, the eye remains locked on the same target.

Why is this useful? Because significant eye movements signal the intent to move the head. If we wish to minimize the latency of sensing head movements, the eye provides an early hint that a large head movement is coming, and we can use that hint to anticipate the head movement and thus reduce latency.
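To make this concrete, here is a minimal sketch (my own illustration, not code from any shipping tracker) of how a large eye excursion could serve as an early hint; the 20-degree threshold and the blending gain are invented for the example:

```python
SACCADE_THRESHOLD_DEG = 20.0  # illustrative: a large eye excursion hints at a coming head turn

def head_turn_expected(eye_in_head_deg):
    """True when the eye has jumped far from 'straight ahead' relative to the head,
    which typically precedes a compensating head rotation."""
    return abs(eye_in_head_deg) > SACCADE_THRESHOLD_DEG

def predicted_head_yaw(head_yaw_deg, head_yaw_rate_dps, eye_in_head_deg, dt_s, gain=0.25):
    """Ordinary dead reckoning, nudged toward the eye when a turn is expected."""
    prediction = head_yaw_deg + head_yaw_rate_dps * dt_s
    if head_turn_expected(eye_in_head_deg):
        prediction += gain * eye_in_head_deg  # bias toward where the eye already points
    return prediction
```

In practice the hint would feed a proper prediction filter rather than a simple additive nudge, but the principle is the same: the eye tells us where the head is about to go.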

Sunday, October 26, 2014

The (brief) Return of Tiled Displays

I've probably told the beginning of this story hundreds of times: in the late 1990s, a team of researchers led by Prof. Bob Massof at Johns Hopkins University set out to achieve a seemingly impossible task: to design a head-mounted display that had both a super-wide field of view and super-high resolution.

The state-of-the-art display at that time was the eMagin SVGA OLED (800x600 resolution) that was just coming into production. One of the most popular mobile phones at the time was the Nokia 5110 which had an 84x47 black and white LCD display. The first HDTV broadcast in the US was in 1996 and displays were very large and heavy.

The solution was to use a tiled approach: bring together many small micro-displays, arrange them in a circular fashion so that they all point at the center of rotation of the eye, put an array of small "magnifying glasses" - one magnifier in front of each display - and carefully align them to produce a nearly seamless image. The lenses were needed both to allow the eye to focus on a screen so close to it and to eliminate the physical edges and borders between adjacent displays, magnifying each display so as to create an optical overlap. Keep in mind this optical overlap - it is going to be important later in this post.

See the images below for an illustration and a photo of an actual tiled module.

Concept of tiled HMD

Display module with 6 tiled SVGA displays

The performance was quite amazing. The original model had 16 800x600 displays per eye (4 rows of 4 displays each), so about 7.7M pixels. Commercial models of this design, the piSight and the xSight, would go on to deliver >150 degrees of field of view and 6 million pixels per eye. The image overlap between adjacent screens was about 30%.

There were a few downsides to this design: careful calibration was required to get both the geometry and colors nicely aligned across displays, and a lot of computing or FPGA power was needed to reformat an image for display inside the HMD, because each display shows a somewhat different perspective of the 3D scene and because images in adjacent screens had overlapping content. The result: some people loved it, some not so much, and these products have generally been superseded by newer ones that use either high-resolution micro-displays or high-resolution smartphone displays, both offering about 2 million pixels or more per eye from a single display.

A couple of months ago, I was invited to speak on a panel at the Siggraph 2014 show in Vancouver. While at Siggraph, I had the opportunity to visit some of the talks and technical exhibits and was delighted to see tiled display technology being used again.

The first example was a demonstration called Hapto MIRAGE from Tachilab at Keio University. See their demo video here.

Hapto MIRAGE used three active-shutter LCD displays arranged so that their optical axes intersected. Large Fresnel lenses were placed in front of each display to magnify the image and remove the seams. This was far from a wearable unit, but the concept seems identical to the original Johns Hopkins University work.

Another example was actually from 2013, though it was discussed again in an HMD session: the near-eye light field displays from NVIDIA:


In this demo, a small OLED was used and an array of very small lenses was placed in front of it. Small lenses allow for a very short focal length, which makes it easy to place them near the eye in a very lightweight package. The demo is very compelling, but there is a catch: the overlap between the images displayed behind each micro-lens is very large - approximately 80% if I remember correctly. This means that the effective resolution of the display is reduced dramatically because so many pixels are overlapping. Thus, one might start with a 1280x720 micro-display but end up with an effective resolution of about 320x240, somewhat limiting the practical use of this particular configuration.
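As a rough back-of-the-envelope check (my own simplification, not NVIDIA's actual math), one can model the loss by assuming that only the non-overlapping fraction of pixels along each axis contributes unique detail:

```python
def effective_resolution(native_w, native_h, overlap_fraction):
    # First-order model: only the non-overlapping share of pixels along each
    # axis adds unique image content. Real lenslet arrays are more subtle.
    unique = 1.0 - overlap_fraction
    return round(native_w * unique), round(native_h * unique)

print(effective_resolution(1280, 720, 0.80))  # -> (256, 144), the same ballpark
                                              # as the ~320x240 quoted above
```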

Are tiled displays making a comeback? Probably not, given high-resolution smartphone displays (including curved displays). But, it was certainly fun to see other implementations of the same fundamental principles developed at Johns Hopkins some 15 years ago.




Sunday, August 17, 2014

The Sensory Explosion

At last week's SIGGRAPH conference, I had the pleasure of contributing a "Sensory Explosion" presentation on the "Sight, Sounds and Sensors" panel.

Below are the key slides and some speaker notes. Enjoy!


The panel featured several contributors:


Sensics has been doing VR for a long time. Historically, it has mostly focused on enterprise applications (government, academia, corporate) and it is considering how to best leverage its technologies and know-how into larger markets such as medical devices and consumer VR.



Traditionally, head-mounted displays had three key components: a display (or multiple displays), adaptation optics and most often an orientation sensor. Most of the efforts were focused on increasing resolution, improving field of view and designing better optics. The orientation sensor was necessary, but not the critical component.


Recently, we have been seeing the HMD evolve into a sensory platform. On top of the traditional core, we see the emergence of new types of sensors: position trackers, eye tracking, cameras (whether for augmented reality and/or depth sensing), biometric sensors, haptic feedback, sensors that perform real-time determination of hand and finger position, and more. Increasingly, innovation is shifting to how to make these sensors deliver maximum performance, the lightest weight (after all, they are on the head), and the utmost power efficiency (both for heat dissipation and for battery life in portable systems).



Above and beyond these on-board sensors, VR applications can now access sensors that are external to the HMD platform. For instance, most users carry a phone that has its own set of sensors, such as a GPS. Some might wear a fitness tracker or be in a room where a Kinect or some other camera can provide additional information. These sensors present an opportunity for application developers to know even more about what the user is doing.



Integrating all these sensors can become complex very quickly. Above is a block diagram of the SmartGoggles(tm) prototype that Sensics built a few years ago. These days, there is a much greater variety of sensors, so what can be done about them?



I feel that getting a handle on the explosion of sensors requires a few things:
1. A way to abstract sensors, just like VRPN abstracted motion trackers.
2. A standardized way to discover which sensors are connected to the system.
3. An easy way to configure all these sensors, as well as store the configuration for quick retrieval.
4. A way to map the various sensor events into high-level application events. Just as you might change the mapping of the buttons on a gamepad, you should be able to decide what impact a particular gesture, for instance, has on the application. A minimal sketch of such an abstraction follows below.
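To illustrate the idea (a purely hypothetical sketch, not an existing API and not VRPN itself), such an abstraction layer might look roughly like this:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class SensorDescriptor:
    """What a discovery step might report about one connected sensor."""
    name: str          # e.g. "head_orientation", "eye_tracker"
    kind: str          # e.g. "orientation", "position", "biometric"
    config: dict = field(default_factory=dict)

class SensorHub:
    """Toy abstraction: discover sensors, store their configuration, and
    map low-level sensor events to named application-level events."""
    def __init__(self):
        self.sensors: Dict[str, SensorDescriptor] = {}
        self.mappings: Dict[str, List[Callable]] = {}

    def register(self, descriptor: SensorDescriptor):
        self.sensors[descriptor.name] = descriptor

    def map_event(self, event_name: str, handler: Callable):
        self.mappings.setdefault(event_name, []).append(handler)

    def dispatch(self, event_name: str, payload):
        for handler in self.mappings.get(event_name, []):
            handler(payload)

# Usage: the application binds a gesture event without caring which device produced it.
hub = SensorHub()
hub.register(SensorDescriptor("hand_tracker", "position"))
hub.map_event("gesture.figure_eight", lambda p: print("user drew an 8", p))
hub.dispatch("gesture.figure_eight", {"confidence": 0.9})
```

The point is not the specific classes but the separation: devices register themselves, configuration lives in one place, and the application binds to named high-level events rather than to particular hardware.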

But beyond this "plumbing", what is really needed is a way to figure out the context of the user, to turn data from various sensors into higher-level information. For instance: turn the motion data from two hands into the realization that the user is clapping, or determine that a user is sitting down, or is excited, or happy or exhausted.
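As a deliberately naive example of that first case (thresholds invented for illustration), a "clapping" context could be derived from nothing more than the distance between two tracked hands and how fast it is shrinking:

```python
import math

CLAP_GAP_M = 0.05          # hands nearly touching (illustrative threshold)
CLAP_CLOSING_MPS = 0.5     # hands approaching quickly (illustrative threshold)

def is_clap(prev_left, prev_right, left, right, dt):
    """Positions are (x, y, z) tuples in meters, sampled dt seconds apart."""
    gap_before = math.dist(prev_left, prev_right)
    gap_now = math.dist(left, right)
    closing_speed = (gap_before - gap_now) / dt   # positive when the hands converge
    return gap_now < CLAP_GAP_M and closing_speed > CLAP_CLOSING_MPS

# Hands 60 cm apart converge to 4 cm within 0.2 s -> detected as a clap
print(is_clap((-0.3, 1.2, 0.4), (0.3, 1.2, 0.4),
              (-0.02, 1.2, 0.4), (0.02, 1.2, 0.4), 0.2))  # True
```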

We live in exciting times, with significant developments in display technologies, goggles and sensors. I look forward to seeing what the future holds, as well as to making my own contribution to shaping it.

Monday, August 4, 2014

Positional tracking: "Outside-in" vs. "Inside-out"

Optical positional tracking for goggles uses a camera (or cameras) and a known set of markers to determine the position of the camera relative to the markers. Positional tracking can be done using the visible spectrum but is more commonly done using infra-red markers and a camera that is sensitive to IR light.

There are two main options:

  • Inside-out tracking: where the camera is mounted on the goggles and the IR markers are placed in stationary locations (e.g. on the computer monitor or on the wall)
  • Outside-in tracking: where the camera is placed in a stationary location and the IR markers are placed on the goggles.
In both cases, the targets sometimes flash in a way that is synchronized with the camera. This allows reducing power consumption for the targets and helps reduce tracking noise from IR sources that are not the targets.

Sensics dSight panoramic HMD with IR targets for "outside in" tracking
How do these approaches compare?

  • Tracking volume: in both cases, at least some of the targets need to be visible to the camera. When the user rotates the head, an "inside-out" system needs targets that are physically far apart. If the targets are, for instance, placed on the bezel of a notebook PC, it is easy to see how head rotation could take these targets out of the field of view of the camera. A wider lens could be used in the camera, but this would reduce the tracking precision because each camera pixel would now cover a greater physical space in the world. In an "outside-in" system, targets can be placed on most sides of the goggle, allowing reasonably large rotation while still having targets visible to the camera. Advantage: outside-in
  • Tracking inside an entire room: if we want to allow mobility within a room, an 'inside-out' system would require additional markers on the walls, whereas an 'outside-in' system would require additional cameras. Both systems would require room calibration to make sure the target and/or cameras are placed in a known position. Additional cameras require additional processing power. Slight advantage: inside-out
  • Where is data being processed? In "inside-out" tracking, the camera data is either processed on the goggle, or the camera is connected to a computer that is either carried by the user or stationary and connected via a wire. In "outside-in" tracking, the data is processed on a computer that can be stationary. Advantage: outside-in
  • Can this be used with a wireless goggle? If the goggle is not tethered to a computer, "inside-out" tracking requires that the data either be processed locally or that the camera signal be sent wirelessly to a base station. In contrast, an "outside-in" approach does not require wireless transmission of camera data. At most, a synchronization signal can be sent to the IR LEDs to make sure they flash in sync with the camera. Advantage: outside-in
  • Ability to combine with an augmented reality system. Sometimes, the goggle will already have an on-board camera (or cameras) for the purpose of augmented reality and/or 3D reconstruction. In that case, using the same camera for positional tracking may have some cost advantages if positional tracking can be used with visible targets or if the camera already has IR sensitivity. Advantage: inside-out
Note: tracking accuracy is also an important comparison parameter, but it is more difficult to compare generically across the two approaches. Very often, accurate tracking is achieved not just through the camera data but also by integrating ("sensor fusion") rotational data and linear acceleration data from on-board sensors. The sensors used and the quality of the sensor-fusion algorithm would determine which approach is better.
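To illustrate the sensor-fusion idea in its simplest form (a generic complementary-style blend, not the algorithm of any particular product), camera-derived position can be combined with position integrated from on-board inertial data:

```python
def integrate_imu(prev_pos, velocity, acceleration, dt):
    """Dead reckoning from linear acceleration between camera updates."""
    new_velocity = tuple(v + a * dt for v, a in zip(velocity, acceleration))
    new_pos = tuple(p + v * dt for p, v in zip(prev_pos, new_velocity))
    return new_pos, new_velocity

def fuse_position(camera_pos, imu_pos_estimate, camera_weight=0.02):
    """Complementary-style blend: the IMU covers fast motion between camera
    frames, while the slower but drift-free camera measurement keeps pulling
    the estimate back toward truth. Positions are (x, y, z) tuples in meters."""
    return tuple(camera_weight * c + (1.0 - camera_weight) * i
                 for c, i in zip(camera_pos, imu_pos_estimate))
```

Real systems use Kalman or similar filters with proper noise models, but the division of labor is the same.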

Bottom line: for most applications, an 'outside-in' approach would be better, and thus we expect to see a greater number of 'outside-in' solutions on the market.


What has been your experience?


For additional VR tutorials on this blog, click here
Expert interviews and tutorials can also be found on the Sensics Insight page here

Sunday, July 6, 2014

Wearable but Intrusive?

From: Universitat Pompeu Fabra, Barcelona
At last month's E3 show, I had the opportunity to experience an early version of ControlVR, dubbed a "wearable controller" by its creators.

ControlVR comprises a set of sensors that attach to the torso, arms and fingers. Once properly calibrated, they can provide fine-grained reporting of the position of each finger, and this information can be used in a variety of virtual reality applications.

For a prototype, the system worked very well, and I am sure it will improve both in looks and performance as it heads towards a production model.

My concern, though, is whether users will really be willing to wear such a device for extended periods of time. Such a design is relatively intrusive: it takes a while to strap on and it touches the user at many points. This is not specific to ControlVR. The same could be said of the PrioVR suit or similar devices.

To answer the question, one needs to consider the benefits of such devices vs. the alternatives. I am sold on the benefits of finger tracking, or full-body tracking for that matter. But, there are alternatives (Kinect, Leap Motion, others) that do not require so many touch points on the user and deliver, in my opinion, similar benefits in most situations.

A virtual reality goggle is also intrusive. It touches your face, potentially messes your hair and sometimes requires you to remove glasses, just to name a few. However, there is no cost-effective alternative to achieving the immersion and portability of goggles these days. Whether that is also the case for devices such as ControlVR and PrioVR - I am not sure.

What do you think?

Sunday, June 15, 2014

Playing with the Pros

Source: DontWorry.TV
They say that a common dream is to be able to fly, or sing. I'm sure virtual reality can help with the flying part.

Kids who play sports want to play - at least once - in a professional game. "Bend it like Beckham" or play with LeBron.

180 cm (5'11") in height, sometimes overweight, I am not tall enough to dunk nor light enough to fly.

So there had to be another way of playing with the pros.

For the next week, I'll be taking a short break from VR activities. My violin in hand, I'll be dedicating most of my time to practicing with like-minded individuals and members of the Baltimore Symphony Orchestra towards a concert next weekend at Symphony Hall. Conducted by the BSO's music director Marin Alsop, the concert will feature wonderful (and wonderfully difficult to play) pieces by Dvorak, Debussy, Tchaikovsky, Mahler and Berlioz. Below are recordings of two parts from these pieces.

I'll be back next week with more E3 thoughts, positional tracking information and more. But for now, let the music begin.




Tuesday, May 27, 2014

Creating VR games and software designed for cross-headset compatibility

With multiple consumer VR headsets now on the horizon, will it be difficult for developers to support multiple VR headsets from a single application, rather than creating custom distributables for each individual headset?

Interested? Read my guest blog entry on RoadToVR to find out.

Monday, May 26, 2014

An overview of positional tracking technologies for VR

Photo Credit: Saad Faruque via Compfight
Positional tracking is very important towards achieving immersion and presence in virtual reality. Whether it is the head, arms, fingers or objects (such as a weapon), positional tracking can deliver multiple benefits:

  • Change the viewpoint of the user to reflect actions such as jumping, ducking or leaning forward.
  • Show hands and other objects in the displayed image. A common complaint of users in virtual reality is that they can't see their hands.
  • Connect the physical and virtual world. For instance, by detecting hand position, a software program can implement the option to move virtual objects by touching them.
  • Detect gestures. By analyzing position over time, a gesture can be detected. For instance, a user might draw the number "8" in air and have the software detect it.
There are several methods of tracking position and I felt it is worthwhile to describe some of them. This post focuses on tracking for virtual reality applications, so we will not look at vehicle tracking, tracking of firemen in buildings and so forth. In no particular order, popular tracking methods include magnetic, inertial, optical and acoustic tracking, as well as hybrid tracking that combines multiple methods.

I've described these tracking methods as well as others such as depth map in a guest blog post at RoadToVR. Please click here to read that post on the RoadToVR site.


For additional VR tutorials on this blog, click here
Expert interviews and tutorials can also be found on the Sensics Insight page here

Friday, April 18, 2014

New iPhone utility app simplifies VR calculations

A few months ago, I realized we go back again and again to the same Excel spreadsheets for some VR calculations. For instance, if the field of view is 90 degrees diagonal and the aspect ratio is 16:9, what is the horizontal field of view? If using a goggle is like watching a 70-inch TV from 6 feet, what does that say about the field of view?
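For reference, the underlying trigonometry is simple; here is a small sketch of both conversions (my own illustration, not the app's source code):

```python
import math

def diagonal_to_horizontal_fov(diag_fov_deg, aspect_w=16, aspect_h=9):
    """Convert a diagonal field of view to a horizontal one, assuming a flat
    virtual image with the given aspect ratio."""
    half_diag = math.tan(math.radians(diag_fov_deg) / 2)
    half_horiz = half_diag * aspect_w / math.hypot(aspect_w, aspect_h)
    return 2 * math.degrees(math.atan(half_horiz))

def tv_to_diagonal_fov(tv_diagonal_in, viewing_distance_in):
    """Angle subtended (diagonally) by a TV of the given diagonal size."""
    return 2 * math.degrees(math.atan(tv_diagonal_in / 2 / viewing_distance_in))

print(round(diagonal_to_horizontal_fov(90), 1))   # ~82 degrees horizontal for a 90-degree diagonal at 16:9
print(round(tv_to_diagonal_fov(70, 72), 1))       # a 70" TV at 6 ft (72") subtends ~52 degrees
```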

To help with this, the Sensics team decided to create a simple iPhone app (Android version coming soon) that helps with these calculations. This app is now available as a free download from iTunes. Think of it as a public service for the VR community.


In its current beta form, the app provides several useful conversions:

  • TV screen size to/from goggle field of view
  • Screen size and aspect ratio to/from field of view
  • Quaternion to/from Euler angles (one direction of this conversion is sketched below)
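For instance, the quaternion-to-Euler direction (shown in one common ZYX convention; I am not claiming this is the exact formulation the app uses) boils down to a few lines:

```python
import math

def quaternion_to_euler(w, x, y, z):
    """Unit quaternion -> (roll, pitch, yaw) in degrees, ZYX convention.
    Conventions differ between trackers, so treat this as one common choice."""
    roll = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
    pitch = math.asin(max(-1.0, min(1.0, 2 * (w * y - z * x))))  # clamp for numerical safety
    yaw = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
    return tuple(map(math.degrees, (roll, pitch, yaw)))

print(quaternion_to_euler(1, 0, 0, 0))  # identity quaternion -> (0.0, 0.0, 0.0)
```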
The app also includes useful reference info as well as a way to ask questions of the VR experts at Sensics.

Some of this math already appeared on this blog such as here and here, but it is now available in your pocket.

What additional calculators would you find useful? What changes would you like to see in the app? Any bugs that need to be fixed? Let me know.


Monday, April 14, 2014

Why have two types of HMDs: those with OLED micro-displays and those with flat-panel displays?

I was recently asked why it made sense to offer HMDs based on micro displays at the same time that we offer HMDs based on flat-panel displays.

Because we can.

But seriously, here's why.

Micro-OLEDs (such as those from eMagin) have certain advantages:


  • Small physical size of the display allows for a small HMD. Compare, for instance, the size of the zSight 1920, which uses micro-OLEDs, with the size of the dSight, which uses flat-panel displays. Both have 1920x1080 pixels per eye, but the dSight is physically larger. For applications that have space constraints or where the user needs to bring objects close to the cheek (such as a gun), a small HMD has a big advantage.
  • Micro-OLEDs currently offer a higher refresh rate: 85 Hz as opposed to the typical 60 Hz for flat panels.
  • Most available flat-panel displays use some version of LCD technology. OLEDs offer superior response time and contrast. However, several vendors have announced (or are already selling) OLED flat-panel displays.
  • If you care about pixel density, it is easier to design an optical system that would provide very high pixel density - even eye-limiting resolution - using a small micro-display. High pixel density implies lower field of view for the same number of pixels. You would care about high pixel density if you need to see every bit of detail in your image, or need to detect virtual objects at far distances, such as in military training.

Flat-panel displays have different advantages:

  • Their cost is much lower since they are key components of cell phones.
  • Larger supplier diversity.
  • Much easier to create very wide field-of-view systems than with the micro-OLEDs. If you care about immersion, you can usually get more immersion with flat panels. Of course, wider field of view implies lower pixel density.
  • Resolutions are rapidly increasing. 1920x1080 seems to be the current standard for high-end phones but this will soon be displaced by 2560x1440 or other high resolutions.

Ultimately, there will be many more HMDs that are based on flat panels, but there are unique professional applications that will continue to prefer OLED micro-displays.


Sunday, April 6, 2014

IEEE VR presentation: the next technical frontiers for HMDs



The annual IEEE conference on virtual reality took place in Minneapolis last week. It was a unique opportunity to meet some of the leading VR researchers in the world, to showcase new product innovations and to exchange views on the present and future of VR.

I had the pleasure of sharing the stage in "the battle of the HMDs" panel session at the conference, together with David A. Smith, Chief Innovation Officer for Lockheed Martin; Stephen Ellis, who leads the Advanced Displays and Spatial Perception Laboratory at NASA; and Dr. Jason Jerald of NextGen Interactions.

Below are a (slightly edited) version of my slides and a free-form version of the accompanying text. The audience was primarily VR researchers, so if one thinks of "R&D" as "Research and Development", this talk was aimed more at the research side than the development side.




I believe that there are three layers to what I call the "HMD value pyramid": baseline technology, sensing and context. As one would expect, the pyramid cannot stand without its baseline technology, which we will discuss shortly, but once baseline technology exists, additional layers of value build upon it. While the baseline technologies are mandatory, the real value in my opinion is in the layers above it. This is where I am hoping the audience will focus their research: making these layers work, and then developing methods and algorithms to make these capabilities affordable and thus widespread.




There are several components that form the baseline of the VR visual experience:
  • Display(s)
  • Optics that adapt the displays to the appropriate viewing distance and provide the desired field of view, eye relief and other optical qualities.
  • Ergonomics: a way to wear these optics and displays comfortably on the head, understanding that there are different sizes and facial formations, and quickly adjust them to an optimal position
  • Wireless video, which allows disconnecting an HMD from a host computer, thus allowing freedom of motion without risk of cable entanglement
  • Processing power, whether performing the simple tasks of controlling the displays, performing calculation-intensive activities such as distortion correction or ultimately allowing applications to run completely inside the HMD without the need to connect to an external computing device.
There will clearly continue to be many improvements in these components. We will see higher-resolution and faster displays. We will continue to see innovative optical designs (as Sensics is showing in the exhibit outside). We will continue to see alternative displays such as pico projectors. But basically, we can now deliver a reasonably good visual experience at a reasonably good price. Yes, just like in cars or audio systems or airplane seats or wedding services, there are different experience levels and different price levels, but I think these topics are moving from a 'research' focus into a 'development' focus.




Once the underlying technologies of the HMD are in place, we can move to the next layer, which I think is more interesting and more valuable: the sensory layer. I've spoken and written about this before: beyond being a head-worn display, the HMD is a platform. It is a computing platform, but it is first and foremost a sensory platform that is uniquely positioned to gather real-time information about the user. Some of the key sensors:
  • Head orientation sensors (yaw/pitch/roll) that have become commonplace in HMDs
  • Head position sensors (X/Y/Z)
  • Position and orientation sensors for other body parts such as arms or legs
  • Sensors to detect hands and fingers
  • Eye tracking which provides real-time reporting of gaze direction
  • Biometric sensors - heart rate, skin conductivity, EEG
  • Outward-facing cameras that can provide real-time image of the surroundings (whether visible, IR or depth)
  • Inward-facing cameras that might provide clues with regards to facial expressions
Each of these sensors is at a different stage of technical maturity. Head orientation sensors, for instance, used to cost thousands of dollars just a few years back. Today, an orientation sensor can be had for a few dollars and is much more powerful than those of the past: tracking accuracy has improved, predictive tracking is sometimes built in, and tremor cancellation, gravity-direction sensing and magnetic-north sensing are common, while reporting rates keep getting higher.

HMD eye tracking sensors are behind in the development curve. Yes, it is possible to buy excellent HMD-based eye trackers for $10K-$20K, but at these prices, only a few can afford them. What would it take to have a "good enough" eye tracker follow the price curve of the orientation tracker?

HMD-based hand and finger sensors are probably even farther behind in terms of robustness, responsiveness, detection field and analysis capabilities.

All these sensors could bring tremendous benefits to the user experience, to the ability of the application to effectively serve the user, or even to the ability of remote users to naturally communicate with each other while wearing HMDs. I think the challenge to this audience is to advance these frontiers: make these sensors work; make them work robustly (e.g. across many users, in many different conditions and not just in the lab); and then make them in such a way that they can be mass-produced inexpensively. Whether the required breakthroughs are in new types of sensing elements or in new computational algorithms is up to you to decide, but I cannot overemphasize how important sensors are beyond the basic capabilities of HMDs.



Once sensors have been figured out, context is the next and ultimate frontier. Context takes data from one or more sensors and combines it into information. It provides the application with a high-level cue of what is going on: what the user is doing, where the user is, or what's going to happen next.
For instance, it's nice to know where my hand is, but tracking the hand over time might indicate that I am drawing a "figure 8" in air. Or maybe that my hands are positioned to signal a "time out". Or maybe, as in the Microsoft patent filing image above, that the hand is brought close to the ear to signal that I would like to increase the volume. That "louder" gesture doesn't work if the hand is 50 cm from the head. It takes an understanding of where the hand is relative to the head and thus I look at it as a higher level of information relative to just the positional data of the head and hand.
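As a toy illustration of that last point (the 15 cm threshold and the "cupped hand" flag are invented for the example), the "louder" gesture only fires when the hand is actually near the ear:

```python
import math

LOUDER_MAX_DISTANCE_M = 0.15   # hand must be within ~15 cm of the ear (illustrative)

def is_louder_gesture(hand_pos, ear_pos, hand_is_cupped):
    """hand_pos/ear_pos: (x, y, z) in meters from a positional tracker."""
    close_enough = math.dist(hand_pos, ear_pos) < LOUDER_MAX_DISTANCE_M
    return close_enough and hand_is_cupped

# A hand 50 cm away from the head never triggers the gesture:
print(is_louder_gesture((0.5, 0.0, 0.0), (0.0, 0.0, 0.0), True))  # False
```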

Additional examples of context that is derived from multiple sensors: the user is walking; or jumping; or excited (through biometric data and pupil size); or smiling; or scared. The user is about to run into the sofa. The user is next to Joe. The user is holding a toy gun and is aiming at the window.

Sometimes, there are many ways to express the same thing. Consider a "yes/no" dialog box in Windows. The user might click on "yes" using the mouse, or "tab" over to the "yes" button and hit space, or press Alt-Y, or say yes, and there are perhaps a few other ways to achieve the same result. Similarly, in VR, the user might speak "yes", or might nod her head up and down in a "yes" gesture, or might give a thumbs-up sign, or might touch a virtual "yes" button in space. Context enables the multi-modal interface that focuses on "what" you are trying to express as opposed to exactly "how" you are doing it.

Context, of course, requires a lot of research. Which sensors are available? How much can their data be trusted? How can we minimize training? How can we reduce false negatives and false positives? This is yet another great challenge for this community.

In summary, we live in exciting times for the VR world, and we hope that you can join us for the journey up the HMD value pyramid.

Monday, March 31, 2014

zSight HMD unveiling archeology secrets as part of Operation Lune

From the Nautical Archeological Society
La Lune, the Sun King’s flagship, sank off the Toulon coastline in 1664. Almost 350 years later, a one-of-a-kind underwater archaeological expedition will unveil the secrets that this wreck conceals. A team of international experts embarked on this breathtaking venture armed with up-to-the-minute technology. Join the action, and a Sensics zSight HMD, and watch history meet the virtual realm.

Watch how the zSight HMD is used in the 3D exploration of this historical site.


Monday, March 3, 2014

Here come the sensors

Google's Project Tango is the latest example of increasingly sophisticated sensors making their way into portable computing platforms such as phones, tablets and virtual reality goggles.

This mobile device includes several sensors:

  • An orientation sensor, likely providing yaw/pitch/roll, angular velocity and linear acceleration. These have been fairly standard in modern mobile devices.
  • Cameras for both visible and IR spectrum. These are color cameras that aside from the usual RGB image also have pixels that sense near-IR wavelengths.
  • A depth sensor, providing real-time mapping of the distance (i.e. depth) of various points in space - such as walls, people, hands - from the sensor. As an aside, there are several ways to sense depth: structured light, such as the Kinect, which projects a seemingly random pattern of dots into space and analyzes their reflections; time of flight, such as SoftKinetic, which measures the round-trip time it takes for light to return to the sensor; and single- or multi-camera solutions that use image processing to estimate depth (see the sketch after this list).
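To give a feel for two of these approaches, here are the textbook depth formulas (generic illustrations, not the math of any specific product):

```python
SPEED_OF_LIGHT_MPS = 299_792_458.0

def depth_from_time_of_flight(round_trip_time_s):
    """Time-of-flight: light travels to the object and back, so depth is half
    the round-trip distance."""
    return SPEED_OF_LIGHT_MPS * round_trip_time_s / 2

def depth_from_stereo(focal_length_px, baseline_m, disparity_px):
    """Two-camera solution: depth is inversely proportional to the disparity
    (pixel shift) of the same feature between the two images."""
    return focal_length_px * baseline_m / disparity_px

print(depth_from_time_of_flight(10e-9))   # ~1.5 m for a 10 ns round trip
print(depth_from_stereo(700, 0.06, 20))   # ~2.1 m for this example camera geometry
```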
Technically, what is unique about the new platform is that it has dedicated power-efficient vision processors that allow it to continuously analyze, fuse and decode the information from the various sensors. This is news because previous processors drained the mobile battery too fast for continuous use. But the real reason to be excited about Project Tango is that it provides both extra motivation and a hardware platform to many developers, so that they can build new 3D applications and increase awareness of the power of sensors.

I've written many times about the importance of sensors in goggles as a way to turn "dumb goggles" into "smart goggles", so I am a believer. It will be fun to see some of the new applications that come out of Project Tango, as well as how these types of sensors could be used in goggles in the future.

Sunday, February 9, 2014

"Your hotel room is 5.6 meters diagonal" and other VR marketing nonsense

If you are seriously interested in learning about virtual reality technologies, you might want to skip this particular post. This post is about the little marketing inventions that VR vendors use, inventions that amuse and annoy me at the same time. As such, it should not be taken too seriously.

  • "A diagonal field of view of 60 degrees". Since when did diagonal field of view become an important measure? I know that the diagonal field of view is larger than the horizontal or vertical field of view, which is probably why it is chosen, but I think humans can visualize much better horizontal or vertical fields view. For instance, if you are looking for a hotel room in the city and the hotel room says that it has a diagonal size of 5.6 meters, why is that useful? Would it be more useful to know that siad hotel room is 4 x 3 x 2.5 meters LxWxH? The diagonal field is a carry over from the television world, where you buy a 60" television, though the aspect ratio for televisions (width:height) is much more consistent than the aspect ratio for goggles. A 60" television with a 16:9 aspect ratio has a 52" horizontal and 30" vertical size (I looked it up here. If you need the largest number how about circumference? Your 60" TV just became 164" by that measure.
  • "It is like watching a 70" television from 6 feet". This may be my favorite. In an effort to illustrate how wide the image, there is the (diagonal) TV analogy? 70" from 6 feet sounds a lot but it is actually a mere 52 degrees (I looked it up here). I would have just loved to be in that marketing meeting. 70" from 6 feet does not sound that impressive? Maybe we should write 140" from 12 feet (hint: the field of view is the same). I think the cake goes here to the vendor that likened their experience to a 750" screen - though I guess the intention was to visualize a movie theater experience. Google it to find out who.
  • "We have dynamic resolution which is like the human eyes". Translation: our optics are so-so and the image outside the center is not really in focus and has quite a bit of distortion, but this is OK because humans see better in the center of the visual field relative to how they see in the peripheral vision. What happens when you turn your eye and the side of the image is now viewed by your central vision?
  • "We have a 9-axis motion tracker". So, let's see: X, Y, Z are three, Yaw, Pitch and Roll are the next three, so what's 7, 8, 9? Time travel? I'd like to think this is often more of an honest mistake than an attempt at deception. Usually, this refers to a 9-sensor motion tracker that has a gyroscope, accelerometer and magnetometer and thus reports linear and angular acceleration as well as angular position. 
  • "Our micro-display has 4 million pixels". Usually, this is triple-counting. A third of these pixels are red, a third are blue and a third are green. This statement usually means "we have a 1280x1024 pixel display but each pixel is full color and is made of 3 sub pixels"
  • "Our goggle has 1080p resolution". This sounds a lot better than SXGA (1280x1024) resolution, but often 1080p resolution in a goggle could mean 1080p across both eyes, so 960x1080 per eye and thus fewer pixels than 1280x1024 per eye.
I was thinking for a while whether I should include "retina display" (as in "my phone has a retina display but yours does not") in this list, and decided against it. Retina display is a lovely way to trademark the benefit - high resolution that approaches the eye's resolution - and, other than scholarly discussions of whether a 'retina display' is indeed 'eye limiting', I don't see it as a misleading claim. It talks about the benefit, much like Hertz Rental Car's GPS systems are called NeverLost to showcase what they do for you as opposed to how they do it.

One last note: my company is also guilty of some of the above sins - after all, we sell in the same market and cater to the same customers who have been trained to look for "diagonal field of view" and other not-so-important measures.

What have I forgotten? Let me know.




Tuesday, January 28, 2014

Volkswagen uses virtual reality in interactive exhibit at shopping malls

My company's customer, Maniak Experiencial, delivered an interactive exhibit for Volkswagen. In this exhibit, a futuristic car was placed in shopping malls, treating visitors to an exciting driving simulator using the zSight HMD.

Check out the many images and additional details here

Wednesday, January 1, 2014

"Resolutions" for the New Year

Happy New Year!

This post is not going to be about the "less coffee; more exercise" type of resolutions for the new year. Instead, let's discuss display resolutions and how they are shaping up for 2014.

A few years ago, Apple introduced the "Retina Display" as a marketing term. The thought was that the resolution of the display is so high that, when held at the typical viewing distance, the pixel density is practically as high as can be distinguished by the naked eye, and thus increasing the resolution for that viewing distance won't generate any benefits.

Let's run through the numbers. 25 cm (10 inches) is considered to be the closest comfortable viewing distance for most people. The visual acuity of the eye - assuming 20/20 vision - is considered to be 1 arcmin/pixel, or approximately 60 pixels per degree. At a distance of 25 cm, 1 degree takes up 25*tan(1 degree), or about 0.44 cm. Thus, if 60 pixels take up 0.44 cm, 1 inch would require 60*2.54/0.44 = 346 pixels/inch, sometimes referred to as 346 DPI (dots per inch) or PPI (pixels per inch) at this distance of 25 cm. Indeed, Apple has 326 DPI in the iPhone 4, so pretty close.
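The same arithmetic in a few lines of code, simply restating the calculation above:

```python
import math

def eye_limiting_dpi(viewing_distance_cm, pixels_per_degree=60):
    """Pixels per inch needed so that pixel pitch matches ~1 arcmin visual acuity
    at the given viewing distance."""
    cm_per_degree = viewing_distance_cm * math.tan(math.radians(1))
    pixels_per_cm = pixels_per_degree / cm_per_degree
    return pixels_per_cm * 2.54

print(round(eye_limiting_dpi(25)))   # ~349 DPI at 25 cm; the ~346 above differs only
                                     # because the 0.44 cm figure was rounded
```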

Some have argued that Apple's claim is misleading and that a display needs to have at least 477 DPI to truly have eye-limiting resolution. So, when new 5" 1080p flat-panel displays - such as those from Sharp - came out with 443 DPI, that should have been nearly enough, right?

Now, we are hearing about 538 DPI flat-panel screens, with 2560x1440 resolution, coming out in smartphone or small-tablet form factors. Where are all these people who find 443 DPI insufficient but can tell it apart from 538 DPI? Other than the "my display has more pixels than your display" claim, is there really a tangible benefit for phones to have higher and higher DPI?

If you make goggles that use smartphone displays, you love these higher display resolutions. Goggles magnify displays, so when a 1080p display gets magnified to - for the sake of example - 90 degrees wide, the pixel density is 1920/90 = 21.3 pixels/degree, still far from eye-limiting. A 2560x1440 display magnified the same way will produce 28.4 pixels/degree, providing a tangible improvement. To get to eye-limiting resolution at 90 degrees, you'd need a screen that is physically small enough to be worn on the head yet has 5400 pixels across. Good luck finding the GPU that can drive interactive content for that display!
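And the goggle-side arithmetic, under the simplifying assumption that pixels are spread uniformly across the field of view (real optics are not uniform):

```python
def pixels_per_degree(horizontal_pixels, horizontal_fov_deg):
    """Average angular pixel density under a uniform-magnification assumption."""
    return horizontal_pixels / horizontal_fov_deg

def pixels_needed_for_eye_limiting(horizontal_fov_deg, acuity_ppd=60):
    """Horizontal pixel count required to reach ~60 pixels/degree across the field."""
    return horizontal_fov_deg * acuity_ppd

print(round(pixels_per_degree(1920, 90), 1))   # 21.3 pixels/degree
print(round(pixels_per_degree(2560, 90), 1))   # 28.4 pixels/degree
print(pixels_needed_for_eye_limiting(90))      # 5400 pixels across
```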

Several years ago, my company developed the xSight HMD which used a unique optical tiling system to combine multiple 800x600 OLED displays into one large virtual display that had about 1920x1080 pixels. Today, similar performance can be achieved without tiling.

Thus, as long as goggle makers can ride the smartphone wave, they can get better and better resolutions, but since even 2560x1440 sounds like somewhat of an overkill for phones, what's next? Will the next-generation goggles continue to use flat-panel displays or will they gravitate towards other technologies?

It should be a fascinating 2014. Happy New Year!