Today the US Patent & Trademark Office published a patent application from Apple that relates to augmented reality and a method for displaying content and, in particular, respective representations of environments. This is a highly technical patent that engineers and developers working in the field of augmented reality will better appreciate.
However, the one theme of the patent that all Apple fans can appreciate is the new SharePlay feature that allows friends to watch movies together on their devices or TV and more. SharePlay was introduced at WWDC 2021, and below is that particular segment of the keynote.
Technically speaking, this is what today's patent is about. In fact, SharePlay is likely the first step towards an AR version that's in the works. Today's patent application dives deep into a future generation of SharePlay for AR that could also be used with Apple's future Mixed Reality headset. Apple introduces us to the "home enhanced reality (ER) environment" that uses a plurality of diorama views.
In Apple’s patent background they note that a previously available device may display content within an operating environment. In augmented reality (AR) applications, the operating environment corresponds to a physical (e.g., real-world) environment including physical objects, and the previously available device displays virtual objects overlaid on a graphical representation of the physical environment. In virtual reality (VR) applications, the operating environment corresponds to a virtual environment including purely virtual objects.
However, the device lacks a mechanism for enabling three-dimensional (3D) manipulation of the displayed content within the operating environment. The device also lacks a mechanism for displaying virtual objects in a way that prevents the virtual objects and physical objects from occluding each other within the operating environment.
A Home Enhanced Reality (ER) Environment
Apple’s invention introduces us to a new method that includes displaying, via a display device, what they brand as a “home enhanced reality (ER) environment” characterized by home ER world coordinates, including a first diorama-view representation of a first ER environment.
In some circumstances, a device may display one or more objects within an operating environment. In some applications, the device modifies display of a current operating environment. Modifying the current operating environment may include displaying different (e.g., new) virtual objects at respective predetermined locations within the current operating environment, independent of user input. Displaying a virtual object at a predetermined location within the current operating environment is problematic because the virtual object and a physical (e.g., real-world) object that is also within the current operating environment may obstruct (e.g., occlude) each other, degrading the functionality provided by the current operating environment.
By contrast, various implementations include systems, methods, and electronic devices that, in response to detecting a first input directed to a first diorama-view representation of a first ER environment, transform a subset of ER objects included in the first diorama-view representation into the home ER environment as a function of home ER world coordinates and first ER world coordinates. In some implementations, the method includes moving the subset of ER objects in response to detecting a second input. For example, in some implementations, the method includes moving the subset of ER objects relative to physical objects within the home ER environment. Accordingly, occlusions (e.g., obstructions) between physical objects and ER objects are negated, enabling a richer set of functionalities with respect to the home ER environment.
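To make the coordinate-mapping idea concrete, here is a minimal illustrative sketch of transforming ER objects from a diorama's own ER world coordinates into the home ER world coordinates. The names (`ERObject`, `to_home_coordinates`), the uniform scale, and the anchor offset are hypothetical assumptions for illustration, not Apple's actual implementation:

```python
# Hypothetical sketch: mapping ER objects from a first ER environment's
# coordinates into home ER world coordinates via a scale and a translation.
from dataclasses import dataclass

@dataclass
class ERObject:
    name: str
    position: tuple  # (x, y, z) in the source ER environment's coordinates

def to_home_coordinates(obj, scale, home_anchor):
    """Return a copy of `obj` positioned in home ER world coordinates.

    `scale` shrinks or grows the diorama's space; `home_anchor` is the
    point in the home environment where the transformed subset lands.
    """
    x, y, z = obj.position
    ax, ay, az = home_anchor
    return ERObject(obj.name, (ax + x * scale, ay + y * scale, az + z * scale))

# Transform a subset of the diorama's objects into the home environment.
diorama_objects = [ERObject("tv", (2.0, 1.0, 0.0)), ERObject("sofa", (0.0, 0.0, 1.0))]
home_objects = [to_home_coordinates(o, scale=0.5, home_anchor=(4.0, 0.0, 2.0))
                for o in diorama_objects]
```

The second input the patent describes (moving the subset relative to physical objects) would then amount to updating `home_anchor` so that transformed objects no longer overlap physical ones.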
In some circumstances, a device may display content within an operating environment. For example, in conventional video conference applications, the operating environment corresponds to a shared communication environment. The device displays remotely-located individuals associated with the shared communication environment based on respective recorded video streams of the individuals. Accordingly, the individuals may graphically interact with each other. The device displays the content within the operating environment in two dimensions (2D), such as a flat 2D video stream associated with a video conference. However, the device lacks a mechanism for enabling three-dimensional (3D) manipulation of the displayed content within the operating environment.
By contrast, various implementations include methods, systems, and electronic devices that enable changing a perspective view of a first one of a plurality of displayed diorama-view representations based on a user input. The plurality of diorama-view representations corresponds to a plurality of enhanced reality (ER) environments. Each of the plurality of diorama-view representations is associated with a respective set of ER world coordinates that characterizes a respective ER environment. Based on the user input, an electronic device changes the perspective view of the first one of the plurality of diorama-view representations while maintaining the previous arrangement of ER objects therein according to a respective set of ER world coordinates. Accordingly, the first one of the plurality of diorama-view representations may be manipulated from a 3D perspective, enabling a richer set of functionality with respect to the ER environments.
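The key point above is that the user input changes only the perspective view, while each object's arrangement within its own ER world coordinates stays fixed. A minimal sketch of that idea (hypothetical names and math, not Apple's code) applies a view rotation at display time without touching the stored coordinates:

```python
# Hypothetical sketch: change the diorama's perspective view in response
# to user input while preserving the stored ER-world arrangement.
import math

def rotate_view(point, yaw_radians):
    """Rotate a point about the diorama's vertical (y) axis for display only."""
    x, y, z = point
    c, s = math.cos(yaw_radians), math.sin(yaw_radians)
    return (c * x + s * z, y, -s * x + c * z)

# Stored arrangement in the diorama's own ER world coordinates.
diorama = {"tv": (1.0, 0.5, 0.0), "lamp": (0.0, 1.0, 1.0)}

# A user input (e.g., a drag gesture) requests a quarter-turn of the view.
view = {name: rotate_view(p, math.pi / 2) for name, p in diorama.items()}
# `diorama` itself is unchanged: the arrangement is maintained.
```

Because the rotation is applied only when rendering, any number of diorama-view representations can each keep their own set of ER world coordinates while the user freely inspects them from different 3D perspectives.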
Apple presents a series of patent FIGS. 2J-2N to illustrate functionality associated with an ER session that is associated with the first ER environment. You’ll have to view the patent to see all of them in the series.
Below is Apple’s patent FIG. 2J. Here, the ER session enables respective graphical representations of individuals to be concurrently within the first ER environment. For example, the ER session enables a particular individual that is represented by the avatar #236 to be within the first ER environment. In some implementations, the electronic device (#203 HMD, iPhone or iPad) receives a request to join an ER session and displays a corresponding indication.
For example, in some implementations, the indication corresponds to an ER join interface #254. Moreover, as illustrated, the electronic device plays (e.g., via a speaker) a first set of speech data #256 that is associated with the particular individual that is associated with (e.g., connected to) the ER session.
For example, the first set of speech data is a verbal request to join the ER session, such as “Hey, Bob. Come join my ER environment and we can virtually watch TV together.”
As another example, the first set of speech data may include ambient noise associated with the first ER environment, such as noise from the television #234.
Apple’s patent FIG. 3 below is a flow diagram of a method of changing a perspective view of a diorama-view representation of an ER environment based on a user input in accordance with some implementations. In various implementations, the method 300 or portions thereof are performed by an HMD.
The patent dives deep into HMD gaze and eye tracking as well as hand tracking. In various implementations, the electronic device utilizes the hand tracking data in order to manipulate display of a diorama-view representation of an ER environment. For example, in some implementations, the electronic device moves the diorama-view representation in order to track the current position of the hand of the user.
In addition, Apple notes that a body pose sensor obtains body pose data indicative of a position of a head or body of a user. In various implementations, the electronic device utilizes the body pose data in order to manipulate display of a diorama-view representation of an ER environment.
For example, the body pose data indicates that the user is turning their head sideways, and the electronic device accordingly manipulates the perspective view of the diorama-view representation of the ER environment.
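The hand-tracking and body-pose passages above both feed sensor data into the diorama's display. A minimal sketch of how such inputs might be conditioned before use (the smoothing factor, clamp range, and function names are assumptions for illustration, not from the patent):

```python
# Hypothetical sketch: conditioning tracking data before it drives the
# diorama-view representation.

def track_hand(prev_center, hand_position, smoothing=0.8):
    """Low-pass filter so the diorama's center follows the tracked hand
    smoothly rather than jittering with every sensor sample."""
    return tuple(smoothing * p + (1.0 - smoothing) * h
                 for p, h in zip(prev_center, hand_position))

def view_yaw_from_head(head_yaw_deg, max_yaw_deg=60.0):
    """Clamp the head yaw reported by the body pose sensor to a comfortable
    range before applying it to the diorama's perspective view."""
    return max(-max_yaw_deg, min(max_yaw_deg, head_yaw_deg))

# Each frame: nudge the diorama toward the hand, set the view from head yaw.
center = track_hand((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
yaw = view_yaw_from_head(90.0)
```

In practice a head-mounted device would run updates like these every frame, so small smoothing constants trade responsiveness against visual stability.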
There is a lot of detail to explore in Apple's patent application number 20210311608 that you can review here.
Apple’s CEO Tim Cook has been commenting on the future of Augmented Reality going back to at least 2016. Below is just one of his many quotes on AR.
(Sept 2016) AR "gives the capability for both of us to sit and be very present, talking to each other, but also have other things — visually — for both of us to see. Maybe it's something we're talking about, maybe it's someone else here who's not here present but who can be made to appear to be present." The Verge has a timeline of Cook's public comments on AR here. It would appear that, back in 2016, Tim Cook was already describing what this patent's SharePlay for AR is all about.