Today the US Patent & Trademark Office published a patent application from Apple titled “Projection Format in Multi-Directional Video.” The invention relates to coding/decoding systems for multi-directional imaging systems and, in particular, to applying coding techniques that were originally developed for flat images to multi-directional image data. Apple’s invention may end up in a developer’s tool relating to mixed reality imagery.
In multi-directional imaging, a two-dimensional image represents image content taken from multiple fields of view. Omnidirectional imaging is one type of multi-directional imaging where a single image represents content viewable from a single vantage point in all directions, 360° horizontally about the vantage point and 360° vertically about the vantage point. Other multi-directional images may capture data in fields of view that are not fully 360°.
Modern coding protocols tend to be inefficient when coding multi-directional images. Multi-directional images tend to allocate real estate within the images to the different fields of view essentially in a fixed manner.
For example, in many multi-directional imaging formats, different fields of view may be allocated space in the multi-directional image equally. Some other multi-directional imaging formats allocate space unequally but in a fixed manner. Moreover, many applications that consume multi-directional imaging tend to use only a portion of the multi-directional image during rendering, so the resources spent coding the unused portions of the image are wasted.
Apple has recognized a need to improve coding systems to increase the efficiency of coding multi-directional image data.
Without a doubt this is a patent for future tools for coders relating to creating images in 360° in a new way. Apple notes that their invention provides techniques for implementing organizational configurations for multi-directional video and for switching between them. Source images may be assigned to formats that may change during a coding session. When a change occurs between formats, video coders and decoders may transform decoded reference frames from the first configuration to the second configuration. Thereafter, new frames in the second configuration may be coded or decoded predictively using the transformed reference frame(s) as source(s) of prediction. In this manner, video coders and decoders may use inter-coding techniques and achieve high efficiency in coding.
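For developers who want a feel for the idea, here is a toy sketch of the format switch described above. It is not Apple's implementation. It assumes, purely for illustration, that a "format" is nothing more than the left-to-right order in which equally sized views are packed into a frame; the function names and the view labels ("front", "left", "right") are hypothetical. The sketch re-packs a decoded reference frame into the new format and then codes the next frame as a residual against it.

```python
# Toy sketch only: a "format" is assumed to be the order of equally sized
# views packed side by side in a frame. Real projection formats are richer.

def split_views(frame, num_views):
    """Split a frame (a list of pixel rows) into equal-width column tiles."""
    width = len(frame[0]) // num_views
    return [[row[i * width:(i + 1) * width] for row in frame]
            for i in range(num_views)]

def pack_views(views):
    """Pack tiles side by side, row by row, into a single frame."""
    return [sum((view[r] for view in views), []) for r in range(len(views[0]))]

def transform_reference(ref_frame, old_order, new_order):
    """Re-pack a decoded reference frame from old_order into new_order."""
    tiles = dict(zip(old_order, split_views(ref_frame, len(old_order))))
    return pack_views([tiles[name] for name in new_order])

def encode_residual(frame, reference):
    """Predictive coding step: keep only the difference from the reference."""
    return [[a - b for a, b in zip(f_row, r_row)]
            for f_row, r_row in zip(frame, reference)]

def decode_residual(residual, reference):
    """Decoder side: add the residual back onto the same reference."""
    return [[d + b for d, b in zip(d_row, r_row)]
            for d_row, r_row in zip(residual, reference)]

# Reference frame decoded in the first format: front | left | right.
old_order = ("front", "left", "right")
new_order = ("left", "front", "right")
reference = [[1, 1, 2, 2, 3, 3],
             [1, 1, 2, 2, 3, 3]]

# Next frame arrives in the second format, with one pixel changed.
new_frame = [[2, 2, 1, 1, 3, 4],
             [2, 2, 1, 1, 3, 3]]

# Coder and decoder both transform the reference before predicting.
transformed = transform_reference(reference, old_order, new_order)
residual = encode_residual(new_frame, transformed)
decoded = decode_residual(residual, transformed)

print(residual)              # mostly zeros, so cheap to code
print(decoded == new_frame)  # the decoder recovers the frame exactly
```

In this toy case, predicting from the untransformed reference would leave non-zero differences across most of the frame, while predicting from the transformed reference leaves a single non-zero value. That is the intuition behind transforming reference frames on a format change so that inter-coding keeps working.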
For mere mortals, Apple describes that their new methodology is for handling 360° imagery, and more specifically, “it may present the multi-directional video in a head mounted display (for example, virtual reality applications) or it may store the multi-directional video for later use.”
Apple’s patent application 20190004414 was originally filed back in June 2017. It was held back quite a long time for some reason. Whether that was at Apple’s request to the USPTO is unknown. Because this is a patent application, the timing of any such product coming to market is also unknown at this time.
Developers could check out the details of this invention here.
Some of Apple’s Inventors
Some of Apple’s engineers who are behind this invention include:
Xiaosong Zhou: Senior Engineering Manager. His expertise is in video compression and computer vision, with projects including FaceTime, iTunes, QuickTime and more; Shuangfei Zhai: Deep Learning Research Scientist who came to Apple via IBM’s Watson Research Center. His last major project was on Apple’s Face ID; Dazhong Zhang: Video Codec Engineer; Ming Chen: RF System Engineer; and Jae Hoon Kim: Video Coding and Processing Engineer who moved to Facebook 3 months ago.