If you follow the tech world closely, you have probably heard a lot about Apple’s Worldwide Developers Conference (WWDC), which took place this week. The Cupertino giant unveiled its latest products and software updates, including the highly anticipated Vision Pro mixed reality headset. But while Apple dominated the headlines, there was plenty of other AI-related news that you may have missed. In this post, we highlight some of the most interesting and important AI stories of the week and explain why you should care about them.
Apple’s WWDC Highlights
One of the biggest announcements at WWDC was Vision Pro, Apple’s first foray into the mixed reality market. The device, which combines augmented and virtual reality features, promises to deliver immersive and interactive experiences for entertainment, education, and work. The Vision Pro boasts a sleek design, high-resolution displays, advanced sensors, and spatial audio. It also integrates with Apple’s ecosystem of devices and services, such as Siri, FaceTime, Apple Music, and iCloud.
However, the Vision Pro also comes with some drawbacks. The device is priced at a whopping $3,499, putting it out of reach for many consumers. Availability will be limited at first: Apple said the headset goes on sale early next year, starting in the US, and first-year production is reportedly limited to a few hundred thousand units. Moreover, the external battery is rated for only about two hours of use, which may limit longer sessions unless the headset is plugged in.
Apple’s entry into the mixed reality market did not go unnoticed by its competitors. Meta, formerly known as Facebook, was quick to respond. CEO Mark Zuckerberg reportedly told employees that Meta is taking a different approach to mixed reality than Apple, focusing on social connection and activity rather than solitary consumption. He also argued that Meta’s products, such as the Quest 2 and the upcoming Quest 3 headset, are more affordable and accessible for users.
Machine Learning Additions in iOS 17
Another highlight of WWDC was the introduction of iOS 17, the latest version of Apple’s operating system for the iPhone. While Apple avoided the term “AI” in its presentation, many of the new features and enhancements rely on machine learning to deliver better functionality and a better user experience.
One of these features is Live Voicemail, which transcribes a voicemail on screen in real time while the caller is still leaving it. Users can read the message instead of listening to it, which helps in noisy environments and for people with hearing difficulties.
Another feature is image-based stickers, which let users create personalized stickers from their own photos. Users can lift the subject out of a photo, turn it into a sticker, and apply effects such as an outline or a comic look. They can also adjust the size, position, and rotation of a sticker when placing it on a message.
A third feature is automatic transcription of audio messages, which lets recipients read a voice note as text. This is useful when someone prefers to send voice notes but the recipient is not in a position to listen to them.
A fourth feature is improved autocorrect, which Apple says is powered by an on-device transformer language model. The new system is meant to correct spelling and grammatical mistakes more accurately, and it can suggest words or complete sentences inline, based on context, as the user types.
Notable AI Research and Development
Besides Apple’s WWDC, plenty of other AI research and products were published or showcased this week. Here are some of the most notable ones:
- StyleDrop: Generating images in specific styles
- A team of researchers at Google developed a new method for generating images in specific styles. The method, called StyleDrop, takes a reference image that shows the desired style, plus a text prompt, and generates new images of the prompted subject in that style. For example, a user can supply a single watercolor painting and ask for a portrait rendered in the same style.
- StyleDrop can handle a wide range of styles, such as painting styles (e.g., watercolor or oil), illustration and cartoon styles, and 3D renderings. The text prompt controls the content of the generated image, while the reference image controls its look.
- StyleDrop is built on Muse, Google’s text-to-image transformer. It learns a new style by fine-tuning a very small fraction of the model’s parameters (less than 1%, according to the paper) and can improve quality through iterative training with human or automated feedback. The researchers report that StyleDrop outperforms other style-tuning methods, including DreamBooth and textual inversion.
- Concerns and regulations around labeling AI-generated content in EU
- As AI becomes more capable of generating realistic and convincing content, such as images, videos, texts, and audio, there are growing concerns and debates about the ethical and social implications of such content. In particular, there are questions about how to label and disclose AI-generated content to avoid deception, manipulation, or harm.
- Some countries and organizations have already introduced or proposed rules for labeling AI-generated content. For example, China’s rules on “deep synthesis” technology require providers to clearly mark content that is produced by AI. In the EU, the draft AI Act includes transparency obligations under which AI-generated or manipulated content, such as deepfakes, would have to be disclosed as such. In the US, a bill has been introduced that would require political ads to disclose the use of AI-generated content.
- However, there are also challenges and limitations in enforcing such regulations or guidelines. For instance, there may be difficulties in defining what constitutes AI-generated content, detecting and verifying such content, and ensuring compliance and accountability. Moreover, there may be trade-offs between labeling AI-generated content and protecting the privacy, creativity, or freedom of expression of the creators or users.
- Introduction of Daz 3D’s character creation capabilities
- Daz 3D, a company that specializes in 3D modeling and animation software, announced the launch of its new character creation capabilities. The new features allow users to create realistic and diverse 3D characters using AI and machine learning.
- Users can start from a library of pre-made characters or create their own from scratch. They can customize various aspects of the characters, such as their face, body, skin, hair, clothing, and accessories. They can also adjust the age, gender, ethnicity, and emotion of the characters.
- According to the company, the new character creation capabilities are part of Daz Studio, its 3D creation software, and use deep neural networks to generate high-quality, realistic 3D models based on user inputs. Daz 3D also says it draws on a large database of human data to ensure diversity and accuracy.
- Daz 3D claims that its character creation capabilities can be used for various purposes, such as gaming, animation, education, advertising, and art. It also says that its technology can democratize 3D modeling and animation by making it accessible and affordable for anyone.
- Convey’s real-time conversation technology showcased at the Computex event
- Convey, a startup that develops real-time conversation technology using AI and natural language processing (NLP), showcased its product at the Computex event in Taiwan this week. The product, also called Convey, is a software platform that enables users to have natural and engaging conversations with anyone in any language.
- Convey uses advanced NLP techniques to understand the meaning and intent of the user’s speech or text input. It then translates it into the desired language and generates an appropriate and natural response. It also adapts to the user’s style, tone, and personality to create a personalized and human-like conversation.
- Convey supports over 100 languages and dialects, including Mandarin Chinese, English, Spanish, French, German, Japanese, Korean, Arabic, and Hindi. It can also handle a range of domains and scenarios, such as travel, business, education, entertainment, and social media.
- Convey aims to provide a seamless and immersive communication experience for users across different languages and cultures. It also hopes to empower users with new opportunities and possibilities through language learning, global collaboration, and cross-cultural understanding.
- Runway’s Gen-2: Coherent text-to-video model with creative possibilities
- Runway (formerly RunwayML), a company that provides an online platform of machine learning tools for creative work, announced the release of its Gen-2 model this week. Gen-2 is a text-to-video model that can generate coherent and realistic videos based on natural language inputs.
- Gen-2 is Runway’s own model and the successor to Gen-1, which restyled existing videos. Gen-2 adds the ability to generate video from a text prompt alone, with an emphasis on keeping frames consistent over time. For now it produces short clips, only a few seconds in length.
- Gen-2 can handle various types of text inputs, such as short phrases (e.g., “a cat playing with a ball”), sentences (e.g., “A woman is walking on the beach with her dog”), or story-like prompts (e.g., “A man wakes up in a hospital bed with no memory”). It can also generate videos in different styles, such as realistic (e.g., “A car driving on a highway”), abstract (e.g., “A kaleidoscope of colors”), or artistic (e.g., “A painting of a sunset”).
- Gen-2 offers a new way of creating and exploring video content using natural language. It can be used for various creative applications, such as storytelling, filmmaking, animation, education, and entertainment.
- Flair AI: AI-driven product photo shoots promoting inclusivity
- Flair AI, a startup that uses AI to create and optimize product photo shoots, launched its platform this week. The platform allows online retailers and brands to create high-quality and diverse product images without the need for physical models, photographers, or studios.
- Flair AI uses generative AI models to synthesize realistic and diverse human models based on user preferences. Users can select from various attributes, such as age, gender, ethnicity, body type, skin tone, hairstyle, and clothing size. They can also upload their own product images or choose from a catalog of products provided by Flair AI.
- Flair AI then generates multiple product images with different poses, angles, backgrounds, and lighting effects. Users can preview and edit the images before downloading them. They can also use Flair AI’s analytics tools to measure the performance and impact of the images on their sales and conversions.
- Flair AI aims to help online retailers and brands save time and money on product photo shoots while increasing their customer reach and satisfaction. It also hopes to promote inclusivity and diversity in the e-commerce industry by representing different types of customers and products.
- Uncrop feature by Stability AI: Expanding and enhancing images
- Stability AI, the company behind Stable Diffusion, introduced a new feature called Uncrop in its Clipdrop image editing suite this week. The feature allows users to expand their images beyond their original boundaries.
- Uncrop uses deep learning models to infer and generate the missing parts of an image based on its context and content. For example, users can uncrop a portrait photo to show more of the background or the body of the person. They can also uncrop a landscape photo to show more of the sky or the horizon.
- Uncrop can handle various types of images, such as selfies, group photos, scenery photos, or artwork. It can also preserve the quality and resolution of the original image while adding new details and textures.
- Uncrop offers a new way of improving and transforming images using AI. It can be used for various purposes, such as enhancing social media posts, creating wallpapers, or making collages.
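For readers curious about what “uncropping” (often called outpainting) involves, here is a minimal sketch of the usual first step: placing the original image on a larger canvas and marking which pixels a generative model would need to fill in. This is not Stability AI’s implementation, which has not been described in detail here. The function name `prepare_outpaint_canvas` and its parameters are hypothetical, and the generative model that would actually paint the new pixels is left out entirely.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]          # one RGB pixel
Image = List[List[Pixel]]             # rows of pixels
Mask = List[List[bool]]               # True = pixel must be generated


def prepare_outpaint_canvas(
    image: Image,
    pad_top: int,
    pad_bottom: int,
    pad_left: int,
    pad_right: int,
    fill: Pixel = (0, 0, 0),
) -> Tuple[Image, Mask]:
    """Hypothetical helper: enlarge the canvas and build a fill mask.

    The original pixels are copied into the middle of a larger canvas.
    The mask is True wherever a generative model would have to invent
    new content and False where the original image is kept.
    """
    if min(pad_top, pad_bottom, pad_left, pad_right) < 0:
        raise ValueError("padding values must be non-negative")

    height = len(image)
    width = len(image[0]) if height else 0
    new_height = height + pad_top + pad_bottom
    new_width = width + pad_left + pad_right

    # Start with an empty canvas where every pixel is marked "to generate".
    canvas: Image = [[fill for _ in range(new_width)] for _ in range(new_height)]
    mask: Mask = [[True for _ in range(new_width)] for _ in range(new_height)]

    # Copy the original image in and mark those pixels as "keep".
    for y in range(height):
        for x in range(width):
            canvas[y + pad_top][x + pad_left] = image[y][x]
            mask[y + pad_top][x + pad_left] = False

    return canvas, mask


if __name__ == "__main__":
    # A tiny 2x2 "photo", widened by one pixel on each side.
    photo = [[(10, 10, 10), (20, 20, 20)],
             [(30, 30, 30), (40, 40, 40)]]
    canvas, mask = prepare_outpaint_canvas(photo, 0, 0, 1, 1)
    to_generate = sum(cell for row in mask for cell in row)
    print(f"canvas is {len(canvas[0])}x{len(canvas)}, {to_generate} pixels to generate")
```

In a real tool, the canvas and mask would then be handed to an image generation model, which fills the masked area so that it blends with the original content.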
Conclusion
As you can see, plenty of exciting and important AI news broke this week besides Apple’s WWDC. These stories show how AI is advancing and affecting various domains and industries, such as mixed reality, communication, creativity, and e-commerce. They also demonstrate the diversity and potential of AI applications for different purposes and audiences.
We hope you enjoyed this post and learned something new about AI. If you want to stay updated on the latest AI news and trends, you can subscribe to our newsletter at FutureTools.io. We will send you daily updates and insights on AI research, products, and events.
Thank you for reading and stay tuned for more!